
๐Ÿ›ก๏ธ Secure Offline RAG System

A Retrieval-Augmented Generation (RAG) system that runs locally on CPU, fully offline, and uses open-source large language models to perform retrieval-augmented generation.

🚀 Tech Stack

Programming Language

๐Ÿ Python

Frameworks & Libraries

🎨 Streamlit – for building the interactive, intuitive user interface.

🔗 Langchain – streamlines and orchestrates the RAG pipeline.

🧠 Ollama – efficient local LLM deployment for high-quality inference.

🤗 Hugging Face – models and tooling for natural language processing.

Vector Database

๐Ÿ” FAISS (Facebook AI Similarity Search) โ€” Fast, efficient, and scalable vector search for document retrieval.

Reranking Model

🎯 BAAI/bge-reranker-base – reranks retrieved results so that the most relevant and accurate passages are returned.
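
As a sketch of how a cross-encoder reranker like this is typically applied (this follows the standard Hugging Face usage for BAAI/bge-reranker-base; the exact wiring in this repository may differ):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Cross-encoder reranker: scores each (query, document) pair directly.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-reranker-base")
model = AutoModelForSequenceClassification.from_pretrained("BAAI/bge-reranker-base")
model.eval()

def rerank(query, docs, top_k=3):
    """Return the top_k documents ranked by relevance to the query."""
    pairs = [[query, doc] for doc in docs]
    with torch.no_grad():
        inputs = tokenizer(pairs, padding=True, truncation=True,
                           max_length=512, return_tensors="pt")
        scores = model(**inputs).logits.view(-1)
    ranked = sorted(zip(docs, scores.tolist()), key=lambda p: p[1], reverse=True)
    return [doc for doc, _ in ranked[:top_k]]
```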

✨ Features

  • Minimal CPU and RAM usage
  • Runs locally, fully offline (for PDFs and other documents)
  • Highly efficient, quantized model
  • Multilingual support across 29 languages, including Chinese
  • Fast inference with low latency
  • Intuitive UI
  • New documents can be added without a full reindex, so new knowledge is integrated dynamically (see the sketch after this list)
  • Low memory footprint: lightweight retrieval structures such as FAISS (or alternatives like inverted indices) handle large datasets without excessive memory
  • Total memory usage: 338 MB (model) + 121 MB (embeddings)
  • The 1.1 GB reranking model is loaded lazily: only when required, and only once
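
A minimal sketch of the incremental-indexing and lazy-loading ideas above, assuming LangChain's FAISS wrapper and Ollama embeddings (the names here are illustrative, not taken from app.py):

```python
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OllamaEmbeddings(model="nextfire/paraphrase-multilingual-minilm:l12-v2")

# Build the index once from an initial batch of chunks ...
vectorstore = FAISS.from_texts(["first document chunk"], embeddings)

# ... then append new chunks later without rebuilding the whole index.
vectorstore.add_texts(["a newly uploaded document chunk"])
candidates = vectorstore.similarity_search("user question", k=10)

# Lazy, load-once reranker: the 1.1 GB model is fetched on first use only.
_reranker = None

def get_reranker():
    global _reranker
    if _reranker is None:
        from transformers import AutoModelForSequenceClassification
        _reranker = AutoModelForSequenceClassification.from_pretrained(
            "BAAI/bge-reranker-base")
    return _reranker
```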

Hardware requirements

The system itself runs on CPU. For optional acceleration, Ollama supports NVIDIA GPUs with compute capability 5.0 or higher.

📂 File Structure

(Screenshot of the project file structure.)

๐Ÿ› ๏ธ Installation Steps

Clone the repository:

> git clone https://github.com/ParamThakkar123/Secure-Local-Offline-Rag-System.git

Change directory:

> cd Secure-Local-Offline-Rag-System

Install the dependencies:

> pip install -r requirements.txt

Download the Ollama app and run it.

Open a command line and pull the required models:

> ollama pull qwen2:0.5b-instruct-q3_K_S
> ollama pull nextfire/paraphrase-multilingual-minilm:l12-v2
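
Once pulled, the two models can be reached from Python through LangChain's Ollama integrations, roughly as follows (a quick smoke test; the actual wiring lives in app.py and may differ):

```python
from langchain_community.llms import Ollama
from langchain_community.embeddings import OllamaEmbeddings

# Quantized Qwen2 for generation, multilingual MiniLM for embeddings.
llm = Ollama(model="qwen2:0.5b-instruct-q3_K_S")
embeddings = OllamaEmbeddings(model="nextfire/paraphrase-multilingual-minilm:l12-v2")

print(llm.invoke("Reply with one short sentence."))
print(len(embeddings.embed_query("hello")))  # embedding dimensionality
```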

Run app.py from the command line:

> streamlit run app.py

If the above command fails with a "streamlit not recognized" error, run:

> python -m streamlit run app.py

📸 Output Screenshots

(Screenshots of the running app.)

🎥 Demo Video

WhatsApp.Video.2024-11-15.at.6.30.30.PM.1.mp4
