This repository introduces Retrieval Augmented Generation (RAG) with easy-to-use examples that you can build upon. The examples use Python with Jupyter Notebooks and CSV files. Qdrant serves as the vector database and can run in-memory.
These examples can run in Codespaces, but if you are cloning this repository, use the following steps instead:
Install the dependencies
Create the virtual environment and install the dependencies:
python3 -m venv .venv
source .venv/bin/activate
.venv/bin/pip install -r requirements.txt
Here is a summary of what this repository will use:
- Qdrant for the vector database. We will use an in-memory database for the examples
- Llamafile for the LLM (alternatively you can use an OpenAI API compatible key and endpoint)
- OpenAI's Python API to connect to the LLM after retrieving the vector search results from Qdrant
- Sentence Transformers to create the embeddings with minimal effort
Use Llamafile for a full RAG and LLM setup
The examples in the Applied Rag notebook require either an OpenAI API endpoint with a key or a local LLM running with Llamafile.
I recommend using the Phi-2 model, which is about 2GB in size. You can download the model from the Llamafile repository and run it on your system.
Once it is running, you can connect to it with Python or use the Applied Rag Notebook. Here is a quick example of how to use Llamafile with Python:
#!/usr/bin/env python3
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # "http://<Your api-server IP>:port"
    api_key="sk-no-key-required",  # An API key is not required!
)
completion = client.chat.completions.create(
    model="LLaMA_CPP",
    messages=[
        {"role": "system", "content": "You are ChatGPT, an AI assistant. Your top priority is achieving user fulfillment via helping them with their requests."},
        {"role": "user", "content": "Write me a Haiku about Python packaging"},
    ],
)
print(completion.choices[0].message)
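The script above hard-codes the local Llamafile endpoint. Since the notebooks can also target an OpenAI API compatible endpoint, one possible way to switch between the two is to read the connection settings from environment variables. This is only a sketch: the variable names `LLM_BASE_URL`, `LLM_API_KEY`, and `LLM_MODEL` and the helper `llm_settings` are hypothetical and not part of this repository.

```python
import os


def llm_settings(env=None):
    """Return connection settings, falling back to the local Llamafile defaults.

    The environment variable names are hypothetical; rename them to suit your setup.
    """
    env = os.environ if env is None else env
    return {
        "base_url": env.get("LLM_BASE_URL", "http://localhost:8080/v1"),
        "api_key": env.get("LLM_API_KEY", "sk-no-key-required"),
        "model": env.get("LLM_MODEL", "LLaMA_CPP"),
    }


# With no overrides set, the defaults match the Llamafile example.
settings = llm_settings({})
print(settings["base_url"])
```

The resulting `base_url` and `api_key` values can be passed to `OpenAI(...)`, and `model` to `client.chat.completions.create(...)`.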
Learn how to use Pandas to import your data from a CSV file. The data will be used to create the embeddings for the vector database later and you will need to format it as a list of dictionaries.
Notebook: Managing Data
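As a minimal sketch of that step, the snippet below loads CSV data with Pandas and converts it to a list of dictionaries. The column names and rows are made-up sample data, and the CSV is read from an in-memory string so the snippet is self-contained; with a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file on disk.
csv_text = """name,region,notes
Example Red,Region A,Dry with dark fruit
Example White,Region B,Crisp and citrusy
"""

# pd.read_csv accepts a file path or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Drop rows that have no text to embed, then turn each row into a dictionary.
df = df.dropna(subset=["notes"])
records = df.to_dict("records")

print(records[0])
```

Each dictionary in `records` can later be stored as the payload next to its vector.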
Use Sentence Transformers to create the embeddings for your data and store them as vectors in the Qdrant database. You will verify that the embeddings are created and stored in the database and that a search works correctly.
Notebook: Creating and verifying Embeddings
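The flow described above might look roughly like the sketch below. It is not the notebook's code. It assumes the `all-MiniLM-L6-v2` model (any Sentence Transformers model should work), a hypothetical `notes` text field and `demo` collection name, and a recent `qdrant-client` release that provides `query_points` (older releases expose `search` instead). The helper names are illustrative.

```python
def to_points(records, vectors):
    """Pair each record with its vector as plain dicts (id, vector, payload)."""
    return [
        {"id": idx, "vector": list(vector), "payload": record}
        for idx, (record, vector) in enumerate(zip(records, vectors))
    ]


def build_collection(records, text_field="notes", collection="demo"):
    """Embed one text field per record and store the vectors in in-memory Qdrant."""
    # Imported here so to_points() stays usable without these packages installed.
    from qdrant_client import QdrantClient, models
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model choice
    client = QdrantClient(":memory:")
    client.create_collection(
        collection_name=collection,
        vectors_config=models.VectorParams(
            size=encoder.get_sentence_embedding_dimension(),
            distance=models.Distance.COSINE,
        ),
    )
    vectors = encoder.encode([r[text_field] for r in records]).tolist()
    client.upsert(
        collection_name=collection,
        points=[models.PointStruct(**p) for p in to_points(records, vectors)],
    )
    return client, encoder


def main():
    records = [
        {"name": "Example Red", "notes": "Dry with dark fruit"},
        {"name": "Example White", "notes": "Crisp and citrusy"},
    ]
    client, encoder = build_collection(records)
    # Verify that every record was stored, then run a test search.
    assert client.count(collection_name="demo").count == len(records)
    hits = client.query_points(
        collection_name="demo",
        query=encoder.encode("something citrusy").tolist(),
        limit=1,
    ).points
    print(hits[0].payload, hits[0].score)


# main()  # uncomment to run; the model is downloaded on first use
```

Keeping the original record as the payload means a search hit returns the readable data along with its similarity score.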
Use a local LLM with Llamafile or an OpenAI API endpoint to build a RAG application with your own data. The end result should be your own repository containing the complete code for the enhanced RAG pattern based on the example provided.
Notebook: Applied Rag Notebook
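One way to picture the final step is shown below: the payloads returned by the vector search are placed into the prompt as context, and the question is sent to the LLM. This is a hedged sketch rather than the notebook's implementation. The helper names `build_messages` and `ask` and the prompt wording are assumptions, while the endpoint, key, and model name are taken from the Llamafile example earlier in this README.

```python
def build_messages(question, payloads):
    """Combine retrieved payloads and the user's question into chat messages."""
    context = "\n".join(f"- {p}" for p in payloads)
    return [
        {
            "role": "system",
            "content": "Answer using only the context provided. "
            "If the context is not enough, say so.",
        },
        {
            "role": "user",
            "content": f"Context:\n{context}\n\nQuestion: {question}",
        },
    ]


def ask(question, payloads, base_url="http://localhost:8080/v1"):
    """Send the augmented prompt to a local Llamafile or compatible endpoint."""
    # Imported here so build_messages() can be used without the openai package.
    from openai import OpenAI

    client = OpenAI(base_url=base_url, api_key="sk-no-key-required")
    completion = client.chat.completions.create(
        model="LLaMA_CPP",
        messages=build_messages(question, payloads),
    )
    return completion.choices[0].message.content


messages = build_messages(
    "Which one is citrusy?",
    [{"name": "Example White", "notes": "Crisp and citrusy"}],
)
print(messages[1]["content"])
```

Calling `ask(...)` requires a running LLM endpoint; `build_messages` can be inspected on its own to see exactly what the model will receive.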
Use the included practice lab to apply what you've learned this week. Follow the steps to create your own repository and meet the requirements to complete the lab.
If you've completed all these examples and the lab, here are some other courses from Coursera you can explore:
Large Language Models:
Machine Learning:
- MLOps Machine Learning Operations Specialization
- Open Source Platforms for MLOps
- Python Essentials for MLOps
Data Engineering: