Intel Generative AI Examples (e.g., ChatQnA with RAG) on Xeon and Gaudi2

minmin-intel/GenAIExamples

Generative AI Examples

Introduction

GenAIExamples are designed to give developers an easy entry into generative AI, featuring microservice-based samples that simplify deploying, testing, and scaling GenAI applications. All examples are fully compatible with Docker and Kubernetes and support a wide range of hardware platforms, including Gaudi, Xeon, NVIDIA GPUs, and others, ensuring flexibility and efficiency for your GenAI adoption.

Architecture

GenAIComps is a service-based tool that includes microservice components such as llm, embedding, and reranking. Using these components, the various examples in GenAIExamples can be constructed, such as ChatQnA and DocSum.
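
To make the composition idea concrete, here is a minimal sketch of how an example might be assembled from component stages. It does not use the GenAIComps API: every function below is a hypothetical in-process stand-in for what would be a separate networked microservice, and the `retrieval` stage and the stage ordering are assumptions for illustration only.

```python
from typing import Callable, Dict, List

# Each stage takes the shared request state and returns an updated copy.
# These are toy stand-ins, NOT GenAIComps components.
Stage = Callable[[Dict], Dict]


def embedding(state: Dict) -> Dict:
    # Toy "embedding": a one-element vector holding the query length.
    return {**state, "embedding": [float(len(state["query"]))]}


def _overlap(query: str, doc: str) -> int:
    # Number of lowercase words shared by the query and a document.
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieval(state: Dict) -> Dict:
    # Keep documents that share at least one word with the query.
    docs = [d for d in state["corpus"] if _overlap(state["query"], d) > 0]
    return {**state, "docs": docs}


def reranking(state: Dict) -> Dict:
    # Order candidates by word overlap and keep only the best one.
    ranked = sorted(
        state["docs"], key=lambda d: _overlap(state["query"], d), reverse=True
    )
    return {**state, "docs": ranked[:1]}


def llm(state: Dict) -> Dict:
    # Toy "generation": echo the selected context instead of calling a model.
    return {**state, "answer": "Based on: " + " ".join(state["docs"])}


def run_pipeline(stages: List[Stage], state: Dict) -> Dict:
    # Apply the stages in order, threading the state through each one.
    for stage in stages:
        state = stage(state)
    return state


# A ChatQnA-like chain built from the stand-in stages above.
CHATQNA_LIKE: List[Stage] = [embedding, retrieval, reranking, llm]
```

Under this sketch, a different example would only need a different list of stages, which is the property the component-based design aims for.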

GenAIInfra, part of the OPEA containerization and cloud-native suite, enables quick and efficient deployment of GenAIExamples in the cloud.

GenAIEval measures service performance metrics such as throughput, latency, and accuracy for GenAIExamples. This feature helps users compare performance across various hardware configurations easily.
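
As a rough illustration of the three metrics named above, the sketch below computes them from raw measurements. It is not GenAIEval code: the function names, the nearest-rank percentile method, and the exact-match definition of accuracy are all assumptions chosen for brevity.

```python
import math
from typing import List, Sequence


def throughput(num_requests: int, wall_seconds: float) -> float:
    # Completed requests per second over the whole measurement window.
    return num_requests / wall_seconds


def latency_percentile(latencies: Sequence[float], pct: float) -> float:
    # Nearest-rank percentile: the smallest sample such that at least
    # pct percent of all samples are less than or equal to it.
    ordered: List[float] = sorted(latencies)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    # Fraction of predictions that exactly match their reference answer.
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)
```

Running the same functions on measurements collected from two hardware configurations gives directly comparable numbers, provided the request mix and measurement window are the same.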

Getting Started

GenAIExamples offers flexible deployment options that cater to different user needs, enabling efficient use and deployment in various environments. Here’s a brief overview of the three primary methods: Python startup, Docker Compose, and Kubernetes.

Users can choose the most suitable approach based on ease of setup, scalability needs, and the environment in which they are operating.

Deployment Guide

Deployments are based on released Docker images by default; check the docker image list for detailed information. You can also build your own images by following the instructions.

Prerequisite

  • For Docker Compose based deployment, you should have Docker Compose installed. Refer to docker compose install.

  • For Kubernetes based deployment, we provide three approaches, ranging from the simplest (manifests) to the more powerful GMC based deployment.

    • You should have a Kubernetes cluster ready for use. If not, you can refer to k8s install to deploy one.
    • (Optional) You should have GMC installed on your Kubernetes cluster if you want to try GMC. Refer to GMC install for more information.
    • (Optional) You should have Helm (version >= 3.15) installed if you want to deploy with Helm Charts. Refer to the Helm Installation Guide for more information.
  • Recommended Hardware Reference

    Depending on the size of the deployed model and your performance requirements, you may choose different hardware platforms or cloud instances. Here are some reference platforms:

    Use Case | Deployment model          | Reference Configuration                                                   | Hardware access/instances
    ---------|---------------------------|---------------------------------------------------------------------------|------------------------------------------
    Xeon     | Intel/neural-chat-7b-v3-3 | 64 vCPUs, 365 GB disk, 100 GB RAM, and Ubuntu 24.04                       | Visit the [Intel Tiber Developer Cloud].
    Gaudi    | Intel/neural-chat-7b-v3-3 | 1 or 2 Gaudi cards, 16 vCPUs, 365 GB disk, 100 GB RAM, and Ubuntu 24.04   | Visit the [Intel Tiber Developer Cloud].
    Xeon     | Intel/neural-chat-7b-v3-3 | 64 vCPUs, 100 GB disk, 64 GB RAM, and Ubuntu 24.04                        | AWS Cloud/c7i.16xlarge
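
The software prerequisites above can be checked with a short script before deploying. The following is a minimal sketch, not part of this repository: the function names are hypothetical, and it assumes the `docker`, `kubectl`, and `helm` executables are the ones you intend to use and are expected on `PATH`.

```python
import re
import shutil
from typing import List, Optional, Sequence, Tuple


def parse_version(text: str) -> Optional[Tuple[int, ...]]:
    # Extract the first MAJOR.MINOR.PATCH triple, with or without a "v" prefix.
    match = re.search(r"v?(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(part) for part in match.groups()) if match else None


def helm_is_new_enough(
    version_text: str, minimum: Tuple[int, ...] = (3, 15, 0)
) -> bool:
    # Compare as integer tuples; comparing the raw strings would wrongly
    # rank "3.9" above "3.15".
    parsed = parse_version(version_text)
    return parsed is not None and parsed >= minimum


def missing_tools(
    tools: Sequence[str] = ("docker", "kubectl", "helm"),
) -> List[str]:
    # Report which of the required executables cannot be found on PATH.
    return [tool for tool in tools if shutil.which(tool) is None]
```

A typical use would be to pass the text printed by `helm version --short` to `helm_is_new_enough` and to stop early if `missing_tools()` returns a non-empty list. Note that finding `docker` on `PATH` does not by itself prove that the Compose plugin is installed; running `docker compose version` is a more direct check.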

Deploy Examples

Use Case          | Docker Compose Deployment on Xeon | Docker Compose Deployment on Gaudi | Kubernetes with Helm Charts | Kubernetes with GMC
------------------|-----------------------------------|------------------------------------|-----------------------------|---------------------
ChatQnA           | Xeon Instructions                 | Gaudi Instructions                 | ChatQnA with Helm Charts    | ChatQnA with GMC
CodeGen           | Xeon Instructions                 | Gaudi Instructions                 | CodeGen with Helm Charts    | CodeGen with GMC
CodeTrans         | Xeon Instructions                 | Gaudi Instructions                 | CodeTrans with Helm Charts  | CodeTrans with GMC
DocSum            | Xeon Instructions                 | Gaudi Instructions                 | DocSum with Helm Charts     | DocSum with GMC
SearchQnA         | Xeon Instructions                 | Gaudi Instructions                 | Not Supported               | SearchQnA with GMC
FaqGen            | Xeon Instructions                 | Gaudi Instructions                 | Not Supported               | FaqGen with GMC
Translation       | Xeon Instructions                 | Gaudi Instructions                 | Not Supported               | Translation with GMC
AudioQnA          | Xeon Instructions                 | Gaudi Instructions                 | Not Supported               | AudioQnA with GMC
VisualQnA         | Xeon Instructions                 | Gaudi Instructions                 | Not Supported               | VisualQnA with GMC
MultimodalQnA     | Xeon Instructions                 | Gaudi Instructions                 | Not Supported               | Not Supported
ProductivitySuite | Xeon Instructions                 | Not Supported                      | Not Supported               | Not Supported
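
The support matrix above can also be captured as data, which is handy when scripting deployments across several examples. The sketch below transcribes the table into a dictionary; the short method labels (`compose-xeon`, `compose-gaudi`, `helm`, `gmc`) and the function names are assumptions introduced here, not identifiers from the repository.

```python
from typing import Dict, List, Set

# Column order of the table, using hypothetical short labels.
METHODS = ("compose-xeon", "compose-gaudi", "helm", "gmc")

_ALL: Set[str] = set(METHODS)
_NO_HELM: Set[str] = {"compose-xeon", "compose-gaudi", "gmc"}

# One entry per table row, listing the methods that are not "Not Supported".
SUPPORT: Dict[str, Set[str]] = {
    "ChatQnA": _ALL,
    "CodeGen": _ALL,
    "CodeTrans": _ALL,
    "DocSum": _ALL,
    "SearchQnA": _NO_HELM,
    "FaqGen": _NO_HELM,
    "Translation": _NO_HELM,
    "AudioQnA": _NO_HELM,
    "VisualQnA": _NO_HELM,
    "MultimodalQnA": {"compose-xeon", "compose-gaudi"},
    "ProductivitySuite": {"compose-xeon"},
}


def deployment_options(example: str) -> List[str]:
    # Supported methods for one example, in table column order.
    return [method for method in METHODS if method in SUPPORT[example]]


def examples_supporting(method: str) -> List[str]:
    # Examples that can be deployed with the given method, in table row order.
    return [name for name, methods in SUPPORT.items() if method in methods]
```

For instance, `examples_supporting("helm")` lists the four examples that currently have Helm Charts, and `deployment_options("ProductivitySuite")` shows that only Docker Compose on Xeon is available for that example.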

Supported Examples

Check here for detailed information on supported examples, models, hardware, etc.

Contributing to OPEA

Welcome to the OPEA open-source community! We are thrilled to have you here and excited about the potential contributions you can bring to the OPEA platform. Whether you are fixing bugs, adding new GenAI components, improving documentation, or sharing your unique use cases, your contributions are invaluable.

Together, we can make OPEA the go-to platform for enterprise AI solutions. Let's work together to push the boundaries of what's possible and create a future where AI is accessible, efficient, and impactful for everyone.

Please check the Contributing guidelines for a detailed guide on how to contribute a GenAI component and all the ways you can contribute!

Thank you for being a part of this journey. We can't wait to see what we can achieve together!

Additional Content
