Skip to content

Latest commit

 

History

History
175 lines (119 loc) · 6.02 KB

README.md

File metadata and controls

175 lines (119 loc) · 6.02 KB

Multi-purpose Chatbot (Local, Remote and HF spaces)

A Chatbot UI that support Chatbot, RAG, Text completion, Multi-modal across HF Transformers, llama.cppp, Apple MLX and vLLM.

Designed support both locally, remote and huggingface spaces.

image


Checkout cool demos using Multi-purpose chatbot.

Supported features

Support backend

  • GPU Transformers with full support MultiModal, document QA, RAG, completion.
  • llama.cppp like Transformers, except pending MultiModal. PR welcome.
  • Apple MLX like Transformers, except pending MultiModal. PR welcome.
  • vLLM like Transformers + Batch inference via file upload, pending MultiModal. PR welcome.

Multi-purpose Chatbot use ENVIRONMENT VARIABLE instead of argparse to set hyperparmeters to support seamless integration with HF space, which requires us to set params via environment vars. The app is launch only with python app.py

Installation

pip install -r requirements.txt

Transformers

pip install -r transformers_requirements.txt

VLLM

pip install -r vllm_requirements.txt

llama.cpp

Follow Llama-cpp-python to install llama.cpp

e.g: On Macos

CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python

MLX

Only on MacOS, remember to install NATIVE python environment.

python -c "import platform; print(platform.processor())"
# should output "arm", if not reinstall python with native

Install requirements

pip install -r mlx_requirements.txt

Usage

We use bash environment to define model variables

Transformers

MODEL_PATH must be a model with chat_template with system prompt (e.g Mistral-7B-Instruct-v0.2 does not have system prompt)

export BACKEND=transformers
export MODEL_PATH=teknium/OpenHermes-2.5-Mistral-7B
export RAG_EMBED_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2
export DEMOS=DocChatInterfaceDemo,ChatInterfaceDemo,RagChatInterfaceDemo,TextCompletionDemo
python app.py

Llava-1.5 Transformers

export CUDA_VISIBLE_DEVICES=0
export TEMPERATURE=0.7
export MAX_TOKENS=512
export MODEL_PATH=llava-hf/llava-1.5-7b-hf
export IMAGE_TOKEN="<image>"
export BACKEND=llava15_transformers
export DEMOS=VisionChatInterfaceDemo,VisionDocChatInterfaceDemo,TextCompletionDemo
python app.py

VLLM

export BACKEND=vllm
export MODEL_PATH=teknium/OpenHermes-2.5-Mistral-7B
export RAG_EMBED_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2
export DEMOS=DocChatInterfaceDemo,ChatInterfaceDemo,RagChatInterfaceDemo,TextCompletionDemo
python app.py

llama.cpp

export BACKEND=llama_cpp
export MODEL_PATH=/path/to/model.gguf
export RAG_EMBED_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2
export DEMOS=DocChatInterfaceDemo,ChatInterfaceDemo,RagChatInterfaceDemo,TextCompletionDemo
python app.py

MLX

export BACKEND=mlx
export MODEL_PATH=mlx-community/Nous-Hermes-2-Mistral-7B-DPO-4bit-MLX
export RAG_EMBED_MODEL_NAME=sentence-transformers/all-MiniLM-L6-v2
export DEMOS=DocChatInterfaceDemo,ChatInterfaceDemo,RagChatInterfaceDemo,TextCompletionDemo
python app.py

Customization

Configs:

  • configs.py where you can find customize UI markdowns and settings global variables

Backend and engines

Gradio Demo tabs

Enableing demos

Setting comma-separated demo class names (e.g ChatInterfaceDemo to enable demo).

export DEMOS=VisionDocChatInterfaceDemo,VisionChatInterfaceDemo,DocChatInterfaceDemo,ChatInterfaceDemo,RagChatInterfaceDemo,TextCompletionDemo

Contributing

We welcome and value any contributions and collaborations. Feel free to open a PR

Citation

If you find our project useful, hope you can star our repo and cite our repo as follows:

@article{multipurpose_chatbot_2024,
  author = {Xuan-Phi Nguyen, },
  title = {Multipurpose Chatbot},
  year = 2024,
}