Commit

Merge branch 'main' into feature-support-masked-lm-1
aalok-sathe committed Nov 15, 2023
2 parents 5ef4ce4 + ee5b226 commit e893aa2
Showing 11 changed files with 561 additions and 37 deletions.
25 changes: 25 additions & 0 deletions .github/workflows/pylint.yml
@@ -0,0 +1,25 @@
name: Pylint

on: [push]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.8", "3.9", "3.10"]
    steps:
    - uses: actions/checkout@v3
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v3
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        curl -sSL https://install.python-poetry.org | python -
        poetry install # takes no arguments
        pip install pylint
    - name: Analysing the code with pylint
      run: |
        pylint $(git ls-files '*.py')
42 changes: 42 additions & 0 deletions .github/workflows/python-publish.yml
@@ -0,0 +1,42 @@
# This workflow will upload a Python Package using Twine when a release is created
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python#publishing-to-package-registries

# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.

name: Upload Python Package

on:
  release:
    types: [published]

permissions:
  contents: read

jobs:
  deploy:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v3
    - name: Set up Python
      uses: actions/setup-python@v3
      with:
        python-version: '3.x'
    - name: Install dependencies
      run: |
        sudo apt install curl
        python -m pip install --upgrade pip
        curl -sSL https://install.python-poetry.org | python -
        poetry install # no additional args
        pip install build
    - name: Build package
      run: poetry build
    - name: Publish package
      uses: pypa/gh-action-pypi-publish@27b31702a0e7fc50959f5ad993c78deac1bdfc29
      with:
        user: __token__
        password: ${{ secrets.PYPI_API_TOKEN }}
10 changes: 9 additions & 1 deletion README.md
@@ -2,7 +2,8 @@
Compute surprisal from language models!

`surprisal` supports most Causal Language Models (`GPT2`- and `GPTneo`-like models) from Huggingface or local checkpoint,
-as well as `GPT3` models from OpenAI using their API!
+as well as `GPT3` models from OpenAI using their API! We also support `KenLM` N-gram based language models using the
+KenLM Python interface.

Masked Language Models (`BERT`-like models) are in the pipeline and will be supported at a future time.
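
For readers of this diff, a minimal sketch of the quantity the README refers to. It assumes the standard definition of surprisal as the negative log probability of a token given its context, reported here in bits. The function `token_surprisals` and the probabilities are hypothetical and are not taken from the `surprisal` package, whose internals this commit does not show; the package may well use a different log base.

```python
import math
from typing import List


def token_surprisals(probabilities: List[float]) -> List[float]:
    """Return -log2(p) for each conditional token probability p."""
    surprisals = []
    for p in probabilities:
        # A probability of zero has infinite surprisal, so reject it explicitly.
        if not 0.0 < p <= 1.0:
            raise ValueError(f"probability must be in (0, 1], got {p}")
        surprisals.append(-math.log2(p))
    return surprisals


# Hypothetical conditional probabilities for a three-token sentence.
example = token_surprisals([0.5, 0.25, 0.125])
print(example)
```

Less probable tokens get larger values, which is why a per-token listing makes unexpected words easy to spot.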

@@ -12,6 +13,10 @@ The snippet below computes per-token surprisals for a list of sentences
```python
from surprisal import AutoHuggingFaceModel

+from surprisal import KenLMModel
+k = KenLMModel(model_path='./literature.arpa')
+
+
sentences = [
    "The cat is on the mat",
    "The cat is on the hat",
@@ -26,6 +31,9 @@ m.to('cuda') # optionally move your model to GPU!

for result in m.surprise(sentences):
    print(result)
+
+for result in k.surprise(sentences):
+    print(result)
```
and produces output of this sort:
```
134 changes: 127 additions & 7 deletions poetry.lock

Some generated files are not rendered by default.

5 changes: 3 additions & 2 deletions pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "surprisal"
-version = "0.1.3a"
+version = "0.1.4"
description = "A package to conveniently compute surprisals for text sequences and subsequences"
readme = "README.md"
homepage = "https://github.com/aalok-sathe/surprisal"
@@ -12,11 +12,12 @@ license = "MIT"
python = "^3.8"
transformers = "^4.20.1"
numpy = "^1.23.1"
-torch = "^1.12.0"
+torch = "^2.0.0"
plotext = "^5.0.2"
matplotlib = "^3.5.2"
pandas = "^1.4.3"
openai = "^0.23.0"
+kenlm = {version = "^0.2.0", optional = true}

[tool.poetry.dev-dependencies]
ipython = "^8.4.0"
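
Since this hunk declares `kenlm` as an optional dependency, code that uses it may run in environments where it is absent. Below is a minimal sketch of one common way to guard for that, using only the standard library. The helper names `optional_module_available` and `require_optional` are hypothetical and are not part of the `surprisal` package; how the package itself handles a missing `kenlm` is not visible in this diff. Note also that Poetry generally only installs an optional dependency when it is listed under an extra, and the truncated hunk does not show whether one is defined.

```python
import importlib.util


def optional_module_available(module_name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def require_optional(module_name: str, feature: str) -> None:
    """Raise a readable error when an optional dependency is missing."""
    if not optional_module_available(module_name):
        raise ImportError(
            f"{feature} needs the optional '{module_name}' package, "
            "which is not installed"
        )


# Branch on availability instead of failing at import time.
if optional_module_available("kenlm"):
    print("kenlm found: n-gram models can be loaded")
else:
    print("kenlm not found: n-gram models are unavailable")
```

Checking availability up front lets the rest of a package import cleanly and defers the error to the moment an n-gram model is actually requested.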