Commit
Updated README with CI/CD instructions included
1 parent 1550c21, commit bc62782
Showing 1 changed file with 19 additions and 49 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,69 +1,39 @@ | ||
# README | ||
# PanKB LLM (PROD) | ||
|
||
## Overview | ||
|
||
A GenAI Assistant based on Langchain + Streamlit + Azure Cosmos DB for MongoDB (vCore) + Docker. | ||
|
||
Authors: | ||
- Binhuan Sun ([email protected]): data preprocessing, LLM, DEV vector DB creation (Chroma), retriever, Streamlit web app. | ||
- Pashkova Liubov ([email protected]): Changing the DEV vector DB (Chroma) to the PROD vector DB instance (Azure Cosmos DB for MongoDB (vCore)) and adjusting the choice of embeddings to the Cosmos DB limitations, the PROD DB index creation, dockerization, integration of the streamlit app with the Django framework templates, the Github repo set up. | ||
|
||
## Important considerations & limitations | ||
|
||
The DB population process took 93 minutes (<i>toedit: goddamn long!!!introduce multithreading to speed up!!!</i>). The MongoDB storage size the populated collection took is ~ 1.0 GiB, incl. the indexes. | ||
|
||
Please note the following limitations and considerations: | ||
- If we use an Azure Cosmos DB for MongoDB instance as the vector DB, we can only try embeddings with dimensionalities <= 2000, because the maximum number of dimensions supported by Azure Cosmos DB for MongoDB is 2000. This may even be for the better: large embeddings are more expensive and do not always provide a significant increase in performance. Examples: https://platform.openai.com/docs/guides/embeddings | |
- We have to create the similarity index. The dimensionality of this index must match the dimensionality of the embeddings. | ||
- The CPU (M30) on a server, where we have our Azure Cosmos DB for MongoDB instance, supports only the <i>vector-ivf</i> index type. To create the <i>vector-hnsw</i> index, we need to upgrade to the M40 tier (it will cost us 780.42 USD per month instead of 211.36 that we pay for M30 now). | ||
- Data preprocessing, LLM, DEV vector DB creation (Chroma), retriever, Streamlit web app: Binhuan Sun, [email protected] | ||
- Changing the DEV vector DB (Chroma) to the PROD vector DB instance (Azure Cosmos DB for MongoDB (vCore)) and adjusting the choice of embeddings to the Cosmos DB limitations, the PROD DB index creation, dockerization, integration of the streamlit app with the Django framework templates, the github repo maintenance: Pashkova Liubov, [email protected] | ||
|
||
## Scripts execution | ||
|
||
Create the .env file in the following format: | ||
``` | ||
OPENAI_API_KEY=<insert the API key here without quotes> | ||
COHERE_API_KEY=<insert the API key here without quotes> | ||
TOGETHER_API_KEY=<insert the API key here without quotes> | ||
GOOGLE_API_KEY=<insert the API key here without quotes> | ||
ANTHROPIC_API_KEY=<insert the API key here without quotes> | ||
REPLICATE_API_TOKEN=<insert the API key here without quotes> | ||
VOYAGE_API_KEY=<insert the API key here without quotes> | ||
## MongoDB-PROD (Azure Cosmos DB for MongoDB) Connection String | ||
# Had to multiply maxIdleTimeMS by 10 to handle | ||
# urllib3.exceptions.ProtocolError: | ||
# ("Connection broken: ConnectionResetError(104, 'Connection reset by peer')", ConnectionResetError(104, 'Connection reset by peer')) | ||
MONGODB_CONN_STRING = "<insert the connection string here with quotes>" | ||
``` | ||
The connection string and API keys can be obtained from the authors. | ||
|
||
The DB population script does not have to be executed in a docker container. It can be done with the following commands: | ||
Every time one pushes to the `prod` repo (usually from the DEV server), the changes to the AI Assistant web application are AUTOMATICALLY deployed to the PROD server. The automation (CI/CD) is implemented with GitHub Actions enabled for the repository. The respective config file is `.github/workflows/deploy-prod-to-azurevm.yml`. For the automated deployment to work, you should set the values of the following GitHub Actions secrets: | |
``` | ||
pip3 install -r requirements.txt | ||
python3 make_vectordb.py ./Paper_all pankb_vector_store | ||
PANKB_PROD_HOST - the PROD server IP address | ||
PANKB_PROD_SSH_USERNAME - the ssh user name to connect to the PROD server | ||
PANKB_PROD_PRIVATE_SSH_KEY - the ssh key that is used to connect to the PROD server | ||
OPENAI_API_KEY | ||
COHERE_API_KEY | ||
TOGETHER_API_KEY | ||
GOOGLE_API_KEY | ||
ANTHROPIC_API_KEY | ||
REPLICATE_API_TOKEN | ||
VOYAGE_API_KEY | ||
PANKB_PROD_MONGODB_CONN_STRING - MongoDB PROD (Azure CosmosDB for MongoDB) Connection String | ||
``` | ||
The first command above installs all the requirements. The second one runs the script with two command line arguments: the name of the folder containing the documents to feed to the LLM and the name of the MongoDB collection that will contain the vector DB. | ||
These secrets are encrypted and stored on GitHub in the "Settings - Secrets and Variables - Actions - Repository secrets" section. In this section, you can also add new GitHub Actions secrets and edit the existing ones. To rename a secret, however, you have to delete the existing secret and add a new one under the new name. | |
|
||
The command for building and rebuilding the docker container with the Streamlit app inside: | ||
``` | ||
docker compose up -d --build --force-recreate | ||
``` | ||
The dockerized Streamlit app does not have to be run in <i>tmux</i>. It stays up and running even after the VM is rebooted (achieved with the `restart: always` option in the docker compose file). | |
After the GitHub Actions deployment job has run successfully, the web application should be available at <a href="pankb.org/ai_assistant" target="_blank">pankb.org/ai_assistant</a>. | |
|
||
The status of the docker container can be checked with the following command: | ||
``` | ||
docker ps | ||
``` | ||
The command should produce approx. the following output among others: | ||
The command should produce approximately the following output after a successful deployment: | |
``` | ||
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES | ||
54d89d7c4fad pankb_llm:latest "streamlit run strea…" 10 minutes ago Up 10 minutes 0.0.0.0:8501->8501/tcp, :::8501->8501/tcp pankb-llm | ||
``` | ||
|
||
## Availability | ||
|
||
Currently, the Streamlit app is available as a django application: | ||
``` | ||
http://<toedit: pankb server-ip or domain name>/ai_assistant/ | ||
54d89d7c4fad pankb_llm:latest "streamlit run strea…" 23 seconds ago Up 12 seconds 0.0.0.0:8501->8501/tcp, :::8501->8501/tcp pankb-llm | ||
``` |
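
The README text in this diff lists the GitHub Actions secrets that the deployment depends on. As a minimal sketch of how a deployment step could fail fast when one of them is missing, the Python helper below checks a mapping of environment variables against that list. The function name `missing_secrets` is hypothetical, and the sketch assumes the workflow exposes each secret to the step as an environment variable of the same name; the diff does not show the workflow file, so this is not the repository's actual mechanism.

```python
from __future__ import annotations

from typing import Mapping

# Secret names copied from the list in the README diff above.
REQUIRED_SECRETS = (
    "PANKB_PROD_HOST",
    "PANKB_PROD_SSH_USERNAME",
    "PANKB_PROD_PRIVATE_SSH_KEY",
    "OPENAI_API_KEY",
    "COHERE_API_KEY",
    "TOGETHER_API_KEY",
    "GOOGLE_API_KEY",
    "ANTHROPIC_API_KEY",
    "REPLICATE_API_TOKEN",
    "VOYAGE_API_KEY",
    "PANKB_PROD_MONGODB_CONN_STRING",
)


def missing_secrets(env: Mapping[str, str]) -> list[str]:
    """Return the required names that are absent or blank in env.

    Hypothetical helper: assumes every secret is exported to the
    process environment under the same name it has on GitHub.
    """
    return [name for name in REQUIRED_SECRETS if not env.get(name, "").strip()]
```

A step could call `missing_secrets(os.environ)` and exit with an error if the returned list is non-empty. Note that GitHub Actions does not export secrets to the environment on its own; the workflow would have to map each one explicitly (for example through an `env:` block), which is why the same-name mapping is stated here as an assumption.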