A lightweight, no-strings-attached framework for your LLM that allows applying a Chain-of-Thought prompt schema (see the related section) to massive textual collections.
- ✅ No-strings: you're free from LLM dependencies and flexible `venv` customization.
- ✅ Supports schema descriptions for the Chain-of-Thought concept.
- ✅ Provides an iterator over an infinite amount of input contexts served in `CSV`/`JSONL`.
- ✅ Progress caching [for remote LLMs]: withstands exceptions during LLM calls by using the `sqlite3` engine for caching LLM answers.
From PyPI:

```bash
pip install bulk-chain
```

or the latest version from here:

```bash
pip install git+https://github.com/nicolay-r/bulk-chain@master
```
To declare a Chain-of-Thought (CoT) schema, this project exploits the JSON format.
This format adopts the `name` field for declaring the schema name, while `schema` is a list of CoT instructions for the Large Language Model.
Each step represents a dictionary with `prompt` and `out` keys that correspond to the input prompt and the output variable name, respectively.
All the variable names are expected to be mentioned in `{}`.
Below is an example of how to declare your own schema:

```json
{
  "name": "schema-name",
  "schema": [
    {"prompt": "Given the question '{text}', let's think step-by-step.",
     "out": "steps"},
    {"prompt": "For the question '{text}' the reasoning steps are '{steps}'. What would be an answer?",
     "out": "answer"}
  ]
}
```
Other templates are available here.
Preliminary steps:
- Define your schema (example for Sentiment Analysis; see also the sketch below).
- Wrap or pick an LLM model from the list of presets.

Please take a look at the related Wiki page.
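For instance, a hypothetical two-step Sentiment Analysis schema, following exactly the format described above, might look like this:

```json
{
  "name": "sentiment-analysis",
  "schema": [
    {"prompt": "What are the sentiment clues in the text '{text}'?",
     "out": "clues"},
    {"prompt": "Given the clues '{clues}', what is the sentiment of '{text}': positive, negative, or neutral?",
     "out": "label"}
  ]
}
```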
NOTE: You have to install the `source-iter` package.
```bash
python3 -m bulk_chain.infer \
    --src "<PATH-TO-YOUR-CSV-or-JSONL>" \
    --schema "ext/schema/default.json" \
    --adapter "dynamic:ext/replicate.py:Replicate" \
    %% \
    --api_token "<REPLICATE-API-TOKEN>" \
    --temp 0.1
```
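The file passed via `--src` is expected to provide the fields referenced by the schema variables. Assuming the default schema above with its `{text}` variable, a minimal hypothetical JSONL input could look like this:

```jsonl
{"text": "What is the capital of France?"}
{"text": "How many continents are there?"}
```

The remaining variables declared via `out` (here `steps` and `answer`) are then filled in per record by the LLM.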
All you have to do is to implement the `BaseLM` class, which includes:
- `__init__` -- for setting up batching mode support and the (optional) model name;
- `ask(prompt)` -- to infer your model with the given `prompt`.

See examples with models here.
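For illustration, here is a minimal sketch of such an adapter. It assumes `BaseLM` is importable from `bulk_chain.core.llm_base` and that its constructor accepts a `name` keyword; the `EchoModel` class below is hypothetical, so check the existing presets for the exact signatures.

```python
# Minimal sketch of a custom adapter. Assumption: BaseLM is importable from
# bulk_chain.core.llm_base and its constructor accepts a `name` keyword
# (verify against the presets before relying on this).
from bulk_chain.core.llm_base import BaseLM


class EchoModel(BaseLM):
    """Toy adapter that echoes prompts back; swap `ask` with a real LLM call."""

    def __init__(self, **kwargs):
        # Register the model name (and, optionally, batching support)
        # through the base constructor.
        super().__init__(name="echo-model", **kwargs)

    def ask(self, prompt):
        # Infer your model with the given prompt; here we simply echo it.
        return f"[echo] {prompt}"
```

A class like this could then be plugged in through the same `--adapter "dynamic:<path-to-your-file>:EchoModel"` mechanism shown in the command above.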