add tpu jetstream reference
mfuntowicz committed Jan 13, 2025
1 parent 15a6a62 commit 00e5a7b
Showing 2 changed files with 4 additions and 1 deletion.
2 changes: 2 additions & 0 deletions _blog.yml
@@ -5284,5 +5284,7 @@
- tgi
- backends
- vllm
- neuron
- jetstream
- tensorrt-llm
- community
3 changes: 2 additions & 1 deletion tgi-multi-backend.md
@@ -40,7 +40,8 @@ The new multi-backend capabilities of TGI open up many impactful roadmap opportunities
* **NVIDIA TensorRT-LLM backend**: We are collaborating with the NVIDIA TensorRT-LLM team to bring the full performance of optimized NVIDIA GPUs + TensorRT to the community. This work will be covered more extensively in an upcoming blog post. It closely relates to our mission to empower AI builders through open source: `optimum-nvidia` to quantize, build, and evaluate TensorRT-compatible artifacts, alongside TGI+TRT-LLM to easily deploy, execute, and scale deployments on NVIDIA GPUs.
* **Llama.cpp backend**: We are collaborating with the llama.cpp team to extend support for production server use cases. The llama.cpp backend for TGI will provide a strong CPU-based option for anyone looking to deploy on Intel, AMD, or ARM CPU servers.
* **vLLM backend**: We are contributing to the vLLM project and are looking to integrate vLLM as a TGI backend in Q1 '25.
- * **Neuron backend**: we are working with the Neuron teams at AWS to enable Inferentia 2 and Trainium 2 support natively in TGI
+ * **AWS Neuron backend**: We are working with the Neuron teams at AWS to enable Inferentia 2 and Trainium 2 support natively in TGI.
+ * **Google TPU backend**: We are working with the Google Jetstream & TPU teams to provide the best performance through TGI.

We are confident TGI Backends will help simplify the deployment of LLMs, bringing versatility and performance to all TGI users.
You'll soon be able to use TGI Backends directly within [Inference Endpoints](https://huggingface.co/inference-endpoints/). Customers will be able to deploy models with TGI Backends on various hardware with top-tier performance and reliability out of the box.
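
Whichever backend ends up serving the model, clients keep talking to the same TGI HTTP API. As a minimal sketch (assuming a TGI server is already running locally on port 8080; the endpoint URL and generation parameters below are illustrative), querying it with `huggingface_hub.InferenceClient` could look like this:

```python
# Sketch: query a running TGI server. The client code does not change with
# the backend serving the model (TensorRT-LLM, llama.cpp, vLLM, Neuron, TPU, ...).
# Assumes a server is reachable at http://localhost:8080.
from huggingface_hub import InferenceClient

client = InferenceClient("http://localhost:8080")

# Standard TGI text-generation call; parameter values are illustrative.
output = client.text_generation(
    "What is Text Generation Inference?",
    max_new_tokens=64,
    temperature=0.7,
)
print(output)
```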
