QTIP Quantization Support? #2663
Labels
Investigating
Low Precision
Issue about lower bit quantization, including int8, int4, fp8
triaged
Issue has been triaged by maintainers
I recently came across QTIP, a new 2-4 bit quantization format that reportedly gives significantly better accuracy than the int4 formats currently in TensorRT-LLM, so I wanted to make sure you were aware of it. Perhaps it could be integrated?
https://arxiv.org/pdf/2406.11235
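For context, the appeal of a 2-4 bit format over int4 is largely storage arithmetic. A rough sketch (raw weight bytes only; real formats add per-group scales, codebooks, and other metadata on top):

```python
# Back-of-the-envelope weight storage for a 7B-parameter model at
# several bit widths. Illustrative arithmetic only; ignores the
# scale/codebook overhead that any real quantization format adds.
def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Raw bytes needed to store n_params weights at the given width."""
    return n_params * bits_per_weight // 8

n = 7_000_000_000
for bits in (16, 8, 4, 2):
    gib = weight_bytes(n, bits) / 2**30
    print(f"{bits:2d}-bit: {gib:.2f} GiB")
```

At 2 bits a 7B model's weights fit in roughly half the memory of an int4 build, which is why a format that holds accuracy at that width is attractive.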
cc @Tracin