LangChain Pinecone Hybrid Search Showcase

PROJECT INFO

LangChain
Pinecone for Vector Database
HuggingFace all-MiniLM-L6-v2 for embeddings
BM25 with mmh3 hashing encoder
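To illustrate the sparse side of the stack, here is a minimal sketch of how a BM25-style encoder can turn a sentence into a sparse vector of hashed token indices and saturated term-frequency weights. It is not the encoder used by this project: `token_to_index` and `encode_document` are hypothetical names, the tokenizer is a plain whitespace split, and `zlib.crc32` stands in for the mmh3 hash so that the example needs only the standard library. The `k1` and `b` values are common BM25 defaults, assumed here.

```python
import zlib
from collections import Counter


def token_to_index(token: str) -> int:
    """Map a token to an integer index (crc32 is a stand-in for mmh3)."""
    return zlib.crc32(token.encode("utf-8"))


def encode_document(text: str, avg_doc_len: float, k1: float = 1.2, b: float = 0.75) -> dict:
    """Return a sparse vector with BM25 term-frequency weights for one document."""
    tokens = text.lower().split()  # naive tokenizer, assumed for illustration
    counts = Counter(token_to_index(t) for t in tokens)
    doc_len = len(tokens)
    # Length normalization: longer-than-average documents get lower weights.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    indices = sorted(counts)
    # Saturating term frequency: tf / (tf + norm), always below 1.
    values = [counts[i] / (counts[i] + norm) for i in indices]
    return {"indices": indices, "values": values}


if __name__ == "__main__":
    print(encode_document("In 2019, I visited Hungary", avg_doc_len=5.0))
```

Hashing tokens to integers avoids keeping a vocabulary table, at the cost of rare collisions between unrelated tokens.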

Features

Hybrid search combines full-text (keyword) queries with vector queries. Both run against a single search index that holds the searchable plain-text content as well as the generated embeddings, so results can match on exact terms and on semantic similarity.
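One common way to balance the two signals is a convex combination: the dense (embedding) vector is scaled by a weight `alpha` and the sparse (keyword) vector by `1 - alpha` before the query is sent. The sketch below shows that idea only; `hybrid_scale` is an illustrative helper, and whether this project applies the same weighting is an assumption.

```python
def hybrid_scale(dense: list, sparse: dict, alpha: float):
    """Weight a dense and a sparse query vector by alpha and 1 - alpha.

    alpha = 1.0 means pure vector search, alpha = 0.0 pure keyword search.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    scaled_dense = [v * alpha for v in dense]
    scaled_sparse = {
        "indices": list(sparse["indices"]),
        "values": [v * (1.0 - alpha) for v in sparse["values"]],
    }
    return scaled_dense, scaled_sparse


if __name__ == "__main__":
    dense_vec = [0.5, 1.0]
    sparse_vec = {"indices": [3, 7], "values": [2.0, 4.0]}
    print(hybrid_scale(dense_vec, sparse_vec, alpha=0.25))
```

With a dot-product index, scaling the query vectors this way weights each side's contribution to the final score.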

Demo

Input sentences: ['In 2019, I visited Hungary', 'In 2020, I visited Czech Republic', 'In 2021, I visited Georgia']
Custom query: What country did I visit first?
100%|██████████| 3/3 [00:00<00:00, 24.12it/s]
BM25 values saved to bm25_values.json
100%|██████████| 1/1 [00:02<00:00,  2.15s/it]
Query result: [Document(metadata={'score': 0.286206543}, page_content='In 2019, I visited Hungary'), Document(metadata={'score': 0.255560637}, page_content='In 2020, I visited Czech Republic'), Document(metadata={'score': 0.225382119}, page_content='In 2021, I visited Georgia')]

The generated bm25_values.json file is included in the repo.
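A flow like the one in the demo could be sketched as follows. This is not the contents of hybrid_search_app.py, which may differ; it is a rough outline assuming the `langchain-community`, `langchain-huggingface`, `pinecone` and `pinecone-text` packages are installed. The function name `run_hybrid_search`, the index name, the `PINECONE_API_KEY` environment variable, and the cloud and region are all assumptions made for illustration.

```python
import os


def run_hybrid_search(sentences, query, index_name="hybrid-search-demo"):
    """Index `sentences` in Pinecone and run one hybrid query (sketch)."""
    # Third-party imports live inside the function so this module can be
    # loaded even when the packages are not installed.
    from langchain_community.retrievers import PineconeHybridSearchRetriever
    from langchain_huggingface import HuggingFaceEmbeddings
    from pinecone import Pinecone, ServerlessSpec
    from pinecone_text.sparse import BM25Encoder

    pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])  # assumed env var
    if index_name not in pc.list_indexes().names():
        # Hybrid (sparse + dense) queries need the dot-product metric;
        # all-MiniLM-L6-v2 produces 384-dimensional embeddings.
        # A freshly created index may take a moment to become ready.
        pc.create_index(
            name=index_name,
            dimension=384,
            metric="dotproduct",
            spec=ServerlessSpec(cloud="aws", region="us-east-1"),  # assumed
        )
    index = pc.Index(index_name)

    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )

    # Fit BM25 statistics on the corpus and persist them for later runs.
    bm25_encoder = BM25Encoder()
    bm25_encoder.fit(sentences)
    bm25_encoder.dump("bm25_values.json")

    retriever = PineconeHybridSearchRetriever(
        embeddings=embeddings, sparse_encoder=bm25_encoder, index=index
    )
    retriever.add_texts(sentences)
    return retriever.invoke(query)
```

Calling `run_hybrid_search` with the three input sentences and the custom query from the demo would return a list of `Document` objects, each with a `score` in its metadata; the exact scores depend on the index state and library versions.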

Installing:

1. Clone this repo to your folder:

git clone THIS REPO

2. Create a virtual environment

3. Install the dependencies

pip install -r requirements.txt

Extrawest.com, 2024

extrawest/pinecone_hybrid_search

Folders and files

Latest commit

History

Repository files navigation

Langchain Pinecone Hybrid Search Showcase

PROJECT INFO

Features

Demo

Installing:

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages