Text-Generation-Inference introducing Multi-Backend #2580
Co-authored-by: Jeff Boudier <[email protected]>
Co-authored-by: Michelle Habonneau <[email protected]>
00e5a7b to 3aa00fd
TGI is made of multiple components, primarily written in Rust and Python. Rust powers the HTTP and scheduling layers, and Python remains the go-to for modeling.

Long story short: Rust allows us to improve the overall robustness of the serving layer with static analysis and compiler-based memory safety enforcement: it brings the ability to scale to multiple cores with the same safety guarantees more easily. Leveraging Rust’s strong type system for the HTTP layer and scheduler makes it possible to avoid memory issues while maximizing the concurrency, bypassing Global Interpreter Lock (GIL) in Python-based environments.
just noticed this small detail when skimming the text:
"...bypassing Global Interpreter Lock (GIL) in..."
--> "...bypassing the Global Interpreter Lock (GIL) in ..."
"the" is not actually needed here in this case because it is not referring to a definite article due to the use of "in Python-based environments"
Ah okay, I thought the definite article would still commonly be used when talking about specific entities, like "the Mona Lisa in Paris", "The Global Interpreter Lock in Python" etc 👍
LGTM 🚀