From 7648e3e53db59038f22636401b558e9e0f8c3820 Mon Sep 17 00:00:00 2001
From: Christian Landgren
Date: Thu, 15 Feb 2024 15:00:34 +0100
Subject: [PATCH] Update README.md

---
 README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 5f19e4ee..2c28671c 100644
--- a/README.md
+++ b/README.md
@@ -6,7 +6,7 @@ This is the main repo for fetching data from sources and adding them to our data
 
 ## Current Status
 
-First working prototype for pipeline but doesn't work on large PDF files yet.
+Test the app in the Discord channel #rapporter-att-granska by using the /pdf command; Garbo will answer with the parsed JSON.
 
 ## Data Flow
 
@@ -19,7 +19,7 @@ Some of the following steps will be performed in parallel and most will be async
 5. Build query from prompt together with relevant embeddings
 6. Send to LLM
 7. Verify the results first automatically
-8. Verify results in Discord channel (separate PR #2)
+8. Verify results in Discord channel
 9. Save to Wikidata or other database (not done)
 
 ### Get Started
@@ -37,11 +37,11 @@ NOTE: To add a new job to the queue manually you can uncomment the lines in inde
 
 Create a .env file in the root folder and add these tokens/secrets before running the application:
 
-OPENAI_API_KEY=
-OPENAI_ORG_ID=
-DISCORD_APPLICATION_ID=
-DISCORD_TOKEN=
-DISCORD_SERVER_ID=
+    OPENAI_API_KEY=
+    OPENAI_ORG_ID=
+    DISCORD_APPLICATION_ID=
+    DISCORD_TOKEN=
+    DISCORD_SERVER_ID=
 
 ### Next steps / Tasks
 
@@ -51,7 +51,7 @@ DISCORD_SERVER_ID=
 - [x] Split PDF text into smaller chunks (maybe using langchain pdf instead of custom?)
 - [x] Add chunks to vector database (ChromaDB)
 - [x] Use vector database with langchain when doing queries to limit amount of tokens
-- [ ] DevOps/Kubernetes setup for databases and deployment
+- [x] DevOps/Kubernetes setup for databases and deployment (see the private [infra](https://github.com/Klimatbyran/infra) repo)
 - [ ] Tests etc
 
 ### Operations