Enhance SearchQnA (#28)
* draft searchQnA example

* skip the aio bug, test pass 1/2

* fix dep

* add copyright

* add README

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix sync chain issues

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Add SearchQnA ui part

Signed-off-by: lvliang-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* frontend workflow

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix linebreak and frontend

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix preci issue

Signed-off-by: lvliang-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix ci issue

Signed-off-by: lvliang-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* fix issue

Signed-off-by: lvliang-intel <[email protected]>

* update backend endpoint

Signed-off-by: lvliang-intel <[email protected]>

---------

Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: lvliang-intel <[email protected]>
3 people authored Mar 29, 2024
1 parent fa38b16 commit 0064b6e
Showing 51 changed files with 12,482 additions and 27 deletions.
9 changes: 5 additions & 4 deletions .pre-commit-config.yaml
@@ -11,7 +11,8 @@ repos:
- id: check-json
exclude: |
(?x)^(
ChatQnA/ui/tsconfig.json
ChatQnA/ui/tsconfig.json|
SearchQnA/ui/tsconfig.json
)$
- id: check-yaml
- id: debug-statements
@@ -25,7 +26,7 @@
- id: insert-license
files: |
(?x)^(
(ChatQnA|CodeGen|DocSum|VisualQnA)/.*(py|yaml|yml|sh)|
(ChatQnA|CodeGen|DocSum|SearchQnA|VisualQnA)/.*(py|yaml|yml|sh)|
)$
args:
[
@@ -37,7 +38,7 @@
- id: insert-license
files: |
(?x)^(
(ChatQnA|CodeGen|DocSum|VisualQnA)/.*(ts|js)|
(ChatQnA|CodeGen|DocSum|SearchQnA|VisualQnA)/.*(ts|js)|
)$
args:
[
@@ -50,7 +51,7 @@
- id: insert-license
files: |
(?x)^(
(ChatQnA|CodeGen|DocSum|VisualQnA)/.*(html|svelte)|
(ChatQnA|CodeGen|DocSum|SearchQnA|VisualQnA)/.*(html|svelte)|
)$
args:
[
42 changes: 39 additions & 3 deletions SearchQnA/README.md
@@ -2,7 +2,7 @@

Search Question and Answering (SearchQnA) uses a search engine (e.g., Google Search) to improve QA quality. Large language models have trouble answering questions about real-time information or specific details because they are limited to their prior training data. A search engine can make up for this shortcoming: the SearchQnA service first looks up the relevant source web pages and feeds them as context to the LLM, so the LLM can use that context to compose more precise answers.
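
As a rough sketch of how the pieces fit together (condensed from the server.py changes in this PR — the embedding model name and persist directory below are illustrative defaults, not values taken from the diff):

```python
# Minimal sketch of the SearchQnA pipeline, condensed from server.py in this PR.
import os

from langchain.chains import RetrievalQAWithSourcesChain
from langchain.retrievers.web_research import WebResearchRetriever
from langchain_community.embeddings import HuggingFaceInstructEmbeddings
from langchain_community.llms import HuggingFaceEndpoint
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_community.vectorstores import Chroma

llm = HuggingFaceEndpoint(endpoint_url=os.environ["TGI_ENDPOINT"])  # LLM served by TGI
search = GoogleSearchAPIWrapper()  # requires GOOGLE_API_KEY and GOOGLE_CSE_ID

# Fetched pages are embedded and indexed here so the QA step can retrieve them.
vectorstore = Chroma(
    embedding_function=HuggingFaceInstructEmbeddings(model_name="hkunlp/instructor-large"),
    persist_directory="./search_qna_db",  # illustrative path
)

# Generates search queries from the question, fetches and indexes the result pages ...
retriever = WebResearchRetriever.from_llm(vectorstore=vectorstore, llm=llm, search=search)

# ... then answers from the retrieved context, citing sources.
chain = RetrievalQAWithSourcesChain.from_chain_type(llm, retriever=retriever)
print(chain({"question": "Give me some latest news?"}))
```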

## Start Service
# Start Backend Service

- Start the TGI service to deploy your LLM

@@ -17,11 +17,47 @@ bash launch_tgi_service.sh
```sh
cd GenAIExamples/SearchQnA/langchain/docker
docker build . --build-arg http_proxy=${http_proxy} --build-arg https_proxy=${https_proxy} -t intel/gen-ai-examples:searchqna-gaudi --no-cache
docker run -e TGI_ENDPOINT=<TGI ENDPOINT> -e GOOGLE_CSE_ID=<GOOGLE CSE ID> -e GOOGLE_API_KEY=<GOOGLE API KEY> -e HUGGINGFACEHUB_API_TOKEN=<HUGGINGFACE API TOKEN> -p 8085:8000 -e http_proxy=$http_proxy -e https_proxy=$https_proxy -v $PWD/qna-app:/qna-app --runtime=habana -e HABANA_VISIBE_DEVILCES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host intel/gen-ai-examples:searchqna-gaudi
docker run -e TGI_ENDPOINT=<TGI ENDPOINT> -e GOOGLE_CSE_ID=<GOOGLE CSE ID> -e GOOGLE_API_KEY=<GOOGLE API KEY> -e HUGGINGFACEHUB_API_TOKEN=<HUGGINGFACE API TOKEN> -p 8080:8000 -e http_proxy=$http_proxy -e https_proxy=$https_proxy -v $PWD/qna-app:/qna-app --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host intel/gen-ai-examples:searchqna-gaudi
```

- Test

```sh
curl http://localhost:8080/v1/rag/web_search_chat_stream -X POST -d '{"query":"Give me some latest news?"}' -H 'Content-Type: application/json'
```
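
For reference, here is a hedged sketch of a Python client for the streaming endpoint (the URL assumes the 8080:8000 port mapping from the `docker run` command above). As the `stream_generator` changes in server.py below show, the server encodes spaces as `@#$` and line breaks as `<br/>` in each SSE `data:` line and terminates the stream with `data: [DONE]`:

```python
# Hypothetical client for the SSE stream; the URL assumes the port mapping above.
import requests

url = "http://localhost:8080/v1/rag/web_search_chat_stream"

with requests.post(url, json={"query": "Give me some latest news?"}, stream=True) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip blank SSE separators
        token = line[len("data: "):]
        if token == "[DONE]":
            break
        # Undo the server-side escaping of spaces and line breaks
        print(token.replace("@#$", " ").replace("<br/>", "\n"), end="", flush=True)
```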

# Start Frontend GUI

Navigate to the "ui" folder and run the following commands to install Node.js and npm:

```bash
cd ui
sudo apt-get install npm && \
npm install -g n && \
n stable && \
hash -r && \
npm install -g npm@latest
```

For CentOS, please use the following commands instead:

```bash
curl -sL https://rpm.nodesource.com/setup_20.x | sudo bash -
sudo yum install -y nodejs
```

Update the `BACKEND_BASE_URL` environment variable in the `.env` file by replacing the IP address '127.0.0.1' with the actual IP address.

Run the following command to install the required dependencies:

```bash
npm install
```

Start the development server by executing the following command:

```bash
nohup npm run dev &
```

This will initiate the frontend service and launch the application.
72 changes: 52 additions & 20 deletions SearchQnA/langchain/docker/qna-app/server.py
@@ -16,6 +16,7 @@
# limitations under the License.

import os
import shutil
import sys
from queue import Queue
from threading import Thread
@@ -24,13 +25,15 @@
from fastapi.responses import StreamingResponse
from langchain.callbacks.base import BaseCallbackHandler
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.globals import set_debug
from langchain.retrievers.web_research import WebResearchRetriever
from langchain_community.embeddings import HuggingFaceInstructEmbeddings
from langchain_community.llms import HuggingFaceEndpoint
from langchain_community.utilities import GoogleSearchAPIWrapper
from langchain_community.vectorstores import Chroma
from starlette.middleware.cors import CORSMiddleware

set_debug(True)
app = FastAPI()

app.add_middleware(
@@ -41,6 +44,9 @@
allow_headers=["*"],
)

TGI_ENDPOINT = os.getenv("TGI_ENDPOINT", "http://localhost:8080")
SHOW_INTERMEDIATE_LOG = os.getenv("SHOW_INTERMEDIATE_LOG", "True").lower() in ("true", "1")


class QueueCallbackHandler(BaseCallbackHandler):
"""A queue that holds the result answer token buffer for streaming response."""
@@ -52,13 +58,25 @@ def __init__(self, queue: Queue):
def on_llm_new_token(self, token: str, **kwargs):
sys.stdout.write(token)
sys.stdout.flush()
if self.enter_answer_phase:
if SHOW_INTERMEDIATE_LOG or self.enter_answer_phase:
self.queue.put(
{
"answer": token,
}
)

def on_llm_start(self, *args, **kwargs):
if SHOW_INTERMEDIATE_LOG:
if not self.enter_answer_phase:
msg = "The search engine begin to fetch the HTML pages with these questions:"
else:
msg = "\nGet the answer from Large Language Models:\n"
self.queue.put(
{
"answer": msg,
}
)

def on_llm_end(self, *args, **kwargs):
self.enter_answer_phase = not self.enter_answer_phase
return True
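
The handler above relies on the chain invoking the LLM twice — once inside `WebResearchRetriever` to generate search questions, and once in `RetrievalQAWithSourcesChain` to compose the final answer — and it flips `enter_answer_phase` on each `on_llm_end`. A toy, dependency-free walk-through of that toggle (assuming the constructor sets `enter_answer_phase = False`, as the logic implies; the tokens are made up):

```python
from queue import Queue

queue: Queue = Queue()
handler = QueueCallbackHandler(queue=queue)

# Phase 1: the retriever's LLM call generates search questions.
handler.on_llm_start()                # queues the "search engine" notice if SHOW_INTERMEDIATE_LOG
handler.on_llm_new_token("1. What ")  # intermediate token, queued only if SHOW_INTERMEDIATE_LOG
handler.on_llm_end()                  # flips enter_answer_phase -> True

# Phase 2: the QA LLM call streams the final answer.
handler.on_llm_start()                # queues the "answer" notice
handler.on_llm_new_token("Recent ")   # answer tokens are always queued
handler.on_llm_end()                  # flips back, ready for the next request

while not queue.empty():
    print(queue.get())                # each item is {"answer": <token or notice>}
```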
@@ -94,11 +112,13 @@ def __init__(
callbacks=[QueueCallbackHandler(queue=self.queue)],
)

# check google api key is provided
# Check that google api key is provided
if "GOOGLE_API_KEY" not in os.environ or "GOOGLE_API_KEY" not in os.environ:
raise Exception("Please make sure to set GOOGLE_API_KEY and GOOGLE_API_KEY environment variables!")

# Notice: please check or manually delete the vectordb directory if you do not previous histories
# Clear the previous search history so it does not interfere with current retrievals
if os.path.exists(vectordb_persistent_directory) and os.path.isdir(vectordb_persistent_directory):
shutil.rmtree(vectordb_persistent_directory)
self.vectorstore = Chroma(
embedding_function=HuggingFaceInstructEmbeddings(model_name=vectordb_embedding_model),
persist_directory=vectordb_persistent_directory,
@@ -109,7 +129,10 @@

# Compose the websearch retriever
self.web_search_retriever = WebResearchRetriever.from_llm(
vectorstore=self.vectorstore, llm=self.llm, search=self.search
vectorstore=self.vectorstore,
llm=self.llm,
search=self.search,
# num_search_results=3
)

# Compose the whole chain
@@ -119,14 +142,16 @@ def __init__(
)

def handle_search_chat(self, query: str):
response = self.llm_chain({"question": query})
try:
response = self.llm_chain({"question": query})
except Exception as e:
print(f"LLM chain error: {e}")
return "Internal Server Error", ""
return response["answer"], response["sources"]


tgi_endpoint = os.getenv("TGI_ENDPOINT", "http://localhost:8080")

router = SearchQuestionAnsweringAPIRouter(
entrypoint=tgi_endpoint,
entrypoint=TGI_ENDPOINT,
)


@@ -135,24 +160,28 @@ async def web_search_chat(request: Request):
params = await request.json()
print(f"[websearch - chat] POST request: /v1/rag/web_search_chat, params:{params}")
query = params["query"]
answer, sources = router.handle_search_chat(query=query)
print(f"[websearch - chat] answer: {answer}, sources: {sources}")
return {"answer": answer, "sources": sources}


@router.post("/v1/rag/web_search_chat_stream")
async def web_search_chat_stream(request: Request):
params = await request.json()
print(tgi_endpoint)
print(f"[websearch - streaming chat] POST request: /v1/rag/web_search_chat_stream, params:{params}")
query = params["query"]

def stream_callback(query):
finished = object()

def task():
_ = router.llm_chain({"question": query})
router.queue.put(finished)
try:
_ = router.llm_chain({"question": query})
router.queue.put(finished)
except Exception as e:
print(f"LLM chain error: {e}")
router.queue.put({"answer": "\nInternal Server Error\n"})
router.queue.put(finished)

t = Thread(target=task)
t.start()
@@ -166,22 +195,25 @@ def task():
continue

def stream_generator():
chat_response = ""
# FIXME need to add the sources and chat_history
for res_dict in stream_callback({"question": query}):
for res_dict in stream_callback(query=query):
text = res_dict["answer"]
chat_response += text
if text == " ":
yield "data: @#$\n\n"
continue
if text.isspace():
# if text.isspace():
# continue
if "\n" in text or "\r" in text:
text = text.replace("\n", "<br/>").replace(" ", "@#$")
yield f"data: {text}\n\n"
continue
if "\n" in text:
yield "data: <br/>\n\n"
new_text = text.replace(" ", "@#$")
yield f"data: {new_text}\n\n"
text = text.replace(" ", "@#$")
yield f"data: {text}\n\n"
chat_response = chat_response.split("</s>")[0]
print(f"[rag - chat_stream] stream response: {chat_response}")
print(f"\n\n[rag - chat_stream] stream response: {chat_response}\n\n")
yield "data: [DONE]\n\n"

return StreamingResponse(stream_generator(), media_type="text/event-stream")
10 changes: 10 additions & 0 deletions SearchQnA/ui/.editorconfig
@@ -0,0 +1,10 @@
[*]
indent_style = tab

[package.json]
indent_style = space
indent_size = 2

[*.md]
indent_style = space
indent_size = 2
1 change: 1 addition & 0 deletions SearchQnA/ui/.env
@@ -0,0 +1 @@
BACKEND_BASE_URL = 'http://xxxxx:8003/v1/rag'
13 changes: 13 additions & 0 deletions SearchQnA/ui/.eslintignore
@@ -0,0 +1,13 @@
.DS_Store
node_modules
/build
/.svelte-kit
/package
.env
.env.*
!.env.example

# Ignore files for PNPM, NPM and YARN
pnpm-lock.yaml
package-lock.json
yarn.lock
34 changes: 34 additions & 0 deletions SearchQnA/ui/.eslintrc.cjs
@@ -0,0 +1,34 @@
// Copyright (c) 2024 Intel Corporation
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

module.exports = {
root: true,
parser: "@typescript-eslint/parser",
extends: ["eslint:recommended", "plugin:@typescript-eslint/recommended", "prettier"],
plugins: ["svelte3", "@typescript-eslint", "neverthrow"],
ignorePatterns: ["*.cjs"],
overrides: [{ files: ["*.svelte"], processor: "svelte3/svelte3" }],
settings: {
"svelte3/typescript": () => require("typescript"),
},
parserOptions: {
sourceType: "module",
ecmaVersion: 2020,
},
env: {
browser: true,
es2017: true,
node: true,
},
};
13 changes: 13 additions & 0 deletions SearchQnA/ui/.prettierignore
@@ -0,0 +1,13 @@
.DS_Store
node_modules
/build
/.svelte-kit
/package
.env
.env.*
!.env.example

# Ignore files for PNPM, NPM and YARN
pnpm-lock.yaml
package-lock.json
yarn.lock
13 changes: 13 additions & 0 deletions SearchQnA/ui/.prettierrc
@@ -0,0 +1,13 @@
{
"pluginSearchDirs": [
"."
],
"overrides": [
{
"files": "*.svelte",
"options": {
"parser": "svelte"
}
}
]
}
33 changes: 33 additions & 0 deletions SearchQnA/ui/README.md
@@ -0,0 +1,33 @@
<h1 align="center" id="title"><img align="center" src="./static/favicon.png" alt="project-image" width="50" height="50">
Neural Chat</h1>

### 📸 Project Screenshots

![project-screenshot](https://imgur.com/SmhJSmC.png)
![project-screenshot](https://imgur.com/iGTDcwU.png)
![project-screenshot](https://imgur.com/cbJi5gj.png)

<h2>🧐 Features</h2>

Here are some of the project's features:

- Start a Text Chat: Initiate a text chat by typing a question or request; the dialogue can also be grounded in the contents of uploaded files.
- Upload File: Choose between uploading a local file or pasting a remote link; the chat then answers according to the uploaded knowledge base.
- Clear: Clear the record of the current dialog box without retaining its contents.
- Chat History: Chat records persist after a page refresh, making it easier for users to review the context.
- Scroll to Bottom / Top: The chat automatically scrolls to the bottom; users can also click the top icon to jump to the top of the chat history.
- End-to-End Time: Shows the time spent on the current conversation.

<h2>🛠️ Get it Running:</h2>

1. Clone the repo.

2. `cd` into this folder (SearchQnA/ui).

3. Modify the required .env variables.
```
BACKEND_BASE_URL = ''
```
4. Execute `npm install` to install the corresponding dependencies.

5. Execute `npm run dev` to start the frontend in development mode.
