
[Feature]: Bedrock latency-optimized inference #7606

Open
marchellodev opened this issue Jan 7, 2025 · 1 comment
Labels
enhancement New feature or request

Comments

@marchellodev

The Feature

https://docs.aws.amazon.com/bedrock/latest/userguide/latency-optimized-inference.html

Motivation, pitch

This feature reduces inference latency for the following models:

  • Anthropic Claude 3.5 Haiku | us.anthropic.claude-3-5-haiku-20241022-v1:0 | US East (Ohio)
  • Meta Llama 3.1 70B Instruct | us.meta.llama3-1-70b-instruct-v1:0 | US East (Ohio)
  • Llama 3.1 405B Instruct
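Per the AWS documentation linked above, latency-optimized inference is requested through the Bedrock runtime Converse API's `performanceConfig` field. A minimal sketch of what opting in could look like, assuming that field and using the Claude 3.5 Haiku model ID and US East (Ohio) region (`us-east-2`) from the table above (the `build_converse_request` helper is illustrative, not part of any library):

```python
def build_converse_request(model_id: str, prompt: str, optimized: bool = False) -> dict:
    """Build keyword arguments for a bedrock-runtime converse() call."""
    request = {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
    }
    if optimized:
        # Opt-in flag; when omitted, Bedrock uses standard latency.
        request["performanceConfig"] = {"latency": "optimized"}
    return request


request = build_converse_request(
    "us.anthropic.claude-3-5-haiku-20241022-v1:0",
    "Hello",
    optimized=True,
)

# A boto3 client would then send it, e.g.:
#   client = boto3.client("bedrock-runtime", region_name="us-east-2")
#   client.converse(**request)
```

This framing also hints at the maintainer's question below: making `optimized=False` the default would keep the flag opt-in, while flipping the default would pass it to Bedrock for every request.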

Are you a ML Ops Team?

No

Twitter / LinkedIn details

No response

@marchellodev marchellodev added the enhancement New feature or request label Jan 7, 2025
@krrishdholakia
Contributor

Interesting - would you expect this to be passed to Bedrock by default, or as an opt-in? @marchellodev
