Update DocIndexRetriever Example to allow user passing in retriever/reranker params (opea-project#880)

Signed-off-by: minmin-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent bd32b03, commit 62e06a0
Showing 8 changed files with 188 additions and 12 deletions.
@@ -1,8 +1,22 @@
 # DocRetriever Application
 
-DocRetriever are the most widely adopted use case for leveraging the different methodologies to match user query against a set of free-text records. DocRetriever is essential to RAG system, which bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that responses generated remain factual and current. The core of this architecture are vector databases, which are instrumental in enabling efficient and semantic retrieval of information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
+DocRetriever is the most widely adopted use case for leveraging the different methodologies to match user query against a set of free-text records. DocRetriever is essential to RAG system, which bridges the knowledge gap by dynamically fetching relevant information from external sources, ensuring that responses generated remain factual and current. The core of this architecture are vector databases, which are instrumental in enabling efficient and semantic retrieval of information. These databases store data as vectors, allowing RAG to swiftly access the most pertinent documents or data points based on semantic similarity.
 
 ## We provided DocRetriever with different deployment infra
 
 - [docker xeon version](docker_compose/intel/cpu/xeon/README.md) => minimum endpoints, easy to setup
 - [docker gaudi version](docker_compose/intel/hpu/gaudi/README.md) => with extra tei_gaudi endpoint, faster
+
+## We allow users to set retriever/reranker hyperparams via requests
+
+Example usage:
+
+```python
+url = "http://{host_ip}:{port}/v1/retrievaltool".format(host_ip=host_ip, port=port)
+payload = {
+    "messages": query,
+    "k": 5,  # retriever top k
+    "top_n": 2,  # reranker top n
+}
+response = requests.post(url, json=payload)
+```
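
The README example added in this hunk leaves `host_ip`, `port`, `query`, and the `requests` import undefined. The following is a minimal self-contained sketch of the same request, not part of the commit. The helper names `build_retrieval_payload` and `retrieve` are hypothetical, the defaults `localhost` and `8889` are borrowed from the test script in this commit, and the `top_n <= k` check is a sanity assumption on the caller's side (the reranker can only keep documents the retriever returned), not a documented rule of the service.

```python
def build_retrieval_payload(query: str, k: int = 5, top_n: int = 2) -> dict:
    """Build the JSON body for /v1/retrievaltool (hypothetical helper).

    k is the retriever top-k and top_n is the reranker top-n,
    matching the keys shown in the README example.
    """
    if k < 1 or top_n < 1:
        raise ValueError("k and top_n must be positive integers")
    # Assumption: asking the reranker for more documents than were
    # retrieved is a caller mistake, so reject it early.
    if top_n > k:
        raise ValueError("top_n should not exceed k")
    return {"messages": query, "k": k, "top_n": top_n}


def retrieve(query: str, host_ip: str = "localhost", port: int = 8889,
             k: int = 5, top_n: int = 2, timeout: float = 30.0) -> dict:
    """POST the query to the retrieval endpoint and return the parsed JSON."""
    # Imported here so the payload helper stays usable without the dependency.
    import requests

    url = f"http://{host_ip}:{port}/v1/retrievaltool"
    payload = build_retrieval_payload(query, k=k, top_n=top_n)
    response = requests.post(url, json=payload, timeout=timeout)
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.json()
```

With a running service, `retrieve("OPEA", k=10, top_n=3)` would request ten retrieved documents and keep the three highest-ranked ones after reranking.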
@@ -0,0 +1,71 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+import argparse
+
+import requests
+
+
+def search_knowledge_base(query: str, url: str, request_type="chat_completion") -> str:
+    """Search the knowledge base for a specific query."""
+    print(url)
+    proxies = {"http": ""}
+    if request_type == "chat_completion":
+        print("Sending chat completion request")
+        payload = {
+            "messages": query,
+            "k": 5,
+            "top_n": 2,
+        }
+    else:
+        print("Sending text request")
+        payload = {
+            "text": query,
+        }
+    response = requests.post(url, json=payload, proxies=proxies)
+    print(response)
+    if "documents" in response.json():
+        docs = response.json()["documents"]
+        context = ""
+        for i, doc in enumerate(docs):
+            if i == 0:
+                context = str(i) + ": " + doc
+            else:
+                context += "\n" + str(i) + ": " + doc
+        # print(context)
+        return context
+    elif "text" in response.json():
+        return response.json()["text"]
+    elif "reranked_docs" in response.json():
+        docs = response.json()["reranked_docs"]
+        context = ""
+        for i, doc in enumerate(docs):
+            if i == 0:
+                context = doc["text"]
+            else:
+                context += "\n" + doc["text"]
+        # print(context)
+        return context
+    else:
+        return "Error parsing response from the knowledge base."
+
+
+def main():
+    parser = argparse.ArgumentParser(description="Index data")
+    parser.add_argument("--host_ip", type=str, default="localhost", help="Host IP")
+    parser.add_argument("--port", type=int, default=8889, help="Port")
+    parser.add_argument("--request_type", type=str, default="chat_completion", help="Test type")
+    args = parser.parse_args()
+    print(args)
+
+    host_ip = args.host_ip
+    port = args.port
+    url = "http://{host_ip}:{port}/v1/retrievaltool".format(host_ip=host_ip, port=port)
+
+    response = search_knowledge_base("OPEA", url, request_type=args.request_type)
+
+    print(response)
+
+
+if __name__ == "__main__":
+    main()
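
The script above accepts three response shapes, keyed by `documents`, `text`, or `reranked_docs`, and calls `response.json()` once per check. One possible refactoring, offered as a sketch and not part of the commit, parses the body once and moves the formatting into a pure function that can be tested without a running service. The name `format_context` is hypothetical; the output strings are intended to match what `search_knowledge_base` returns for each shape.

```python
def format_context(body: dict) -> str:
    """Turn a parsed /v1/retrievaltool response into a context string."""
    if "documents" in body:
        # Plain retriever output: a list of strings, numbered from 0.
        return "\n".join(f"{i}: {doc}" for i, doc in enumerate(body["documents"]))
    if "text" in body:
        return body["text"]
    if "reranked_docs" in body:
        # Reranker output: a list of objects that each carry a "text" field.
        return "\n".join(doc["text"] for doc in body["reranked_docs"])
    return "Error parsing response from the knowledge base."
```

Inside `search_knowledge_base`, the branch chain could then be replaced by `return format_context(response.json())`.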