QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models

Jan 1, 2024 · Jing Liu, Ruihao Gong, Xiuying Wei, Zhiwei Dong, Jianfei Cai, Bohan Zhuang

Type: Conference paper
Publication: The Twelfth International Conference on Learning Representations