
QTIP Quantization Support? #2663

Open
aikitoria opened this issue Jan 6, 2025 · 0 comments
Labels
Investigating · Low Precision (issues about lower-bit quantization, including int8, int4, fp8) · triaged (issue has been triaged by maintainers)

Comments

@aikitoria

I've recently come across QTIP, a new 2-4 bit quantization format that reports significantly better accuracy than the int4 formats currently in TensorRT-LLM, so I wanted to make sure you were aware of it! Perhaps it could be integrated?

https://arxiv.org/pdf/2406.11235

[Image attachment]

cc @Tracin
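
For context, a minimal sketch of the kind of baseline QTIP is being compared against: plain per-group round-to-nearest int4 weight quantization. This is not QTIP itself (which, per the paper, uses trellis-coded quantization with incoherence processing) and not TensorRT-LLM code; the function names and group size below are illustrative assumptions only.

```python
# Illustrative sketch: symmetric per-group int4 round-to-nearest quantization.
# Hypothetical helper names; not the TensorRT-LLM API and not the QTIP algorithm.
import numpy as np

def quantize_int4_groupwise(weights: np.ndarray, group_size: int = 128):
    """Quantize a 1-D float weight vector to signed int4 with one scale per group."""
    assert weights.ndim == 1 and weights.size % group_size == 0
    groups = weights.reshape(-1, group_size)
    # One symmetric scale per group: map the max magnitude onto 7 (int4 range is -8..7).
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero groups
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_int4_groupwise(q: np.ndarray, scales: np.ndarray) -> np.ndarray:
    """Reconstruct float weights from int4 codes and per-group scales."""
    return (q.astype(np.float32) * scales).reshape(-1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal(4096).astype(np.float32)
    q, s = quantize_int4_groupwise(w)
    err = np.abs(w - dequantize_int4_groupwise(q, s)).mean()
    print(f"mean abs quantization error: {err:.4f}")
```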

@nv-guomingz added the Low Precision label on Jan 7, 2025
@github-actions bot added the triaged and Investigating labels on Jan 7, 2025