Ruihao Gong

Ruihao Gong focuses on systems and algorithms for accelerating industrial model production, model deployment, and model efficiency. His research interests include model quantization, model sparsity, hardware-friendly neural networks for hardware ranging from cloud servers to mobile/edge devices, systems for large model training and inference, and applications such as smart cities, L2/L4 autonomous driving, and personal AI assistants.

News:
[2024.10] Our LLM compression benchmark paper with the LLMC toolkit is accepted by EMNLP 2024.
[2024.05] We released LLMC: an off-the-shelf tool designed for compressing LLMs.
[2024.02] Our TFMQ-DM on Diffusion Model Quantization is accepted by CVPR 2024.
[2024.01] Our QLLM method for LLM quantization is accepted by ICLR 2024.
[2023.12] One paper on lane detection quantization and one paper on post-training sparsity are accepted by AAAI 2024.
[2023.11] We released EasyLLM: a training framework for large language models and vision-language models.
[2023.10] Our team won the championship of LPCV 2023.
[2023.10] One paper Outlier Suppression+ for LLM quantization is accepted by EMNLP 2023.
[2023.08] We released LightLLM: a serving framework for large language models and vision-language models.
[2023.07] One paper on model size compression is accepted by ICCV 2023.
[2023.06] One FamilySeer paper on deep learning compilers is accepted by ICPP 2023.
[2023.04] One SysNoise paper on model robustness is accepted by MLSys 2023.

Accomplishments

LPCV
Winner of Low Power Computer Vision Challenge
IEEE Computer Society
2023 IEEE UAV Chase Challenge Award
LPCV
Winner of Low Power Computer Vision Challenge
Tencent
Tencent Rhino-Bird Elite Training Program
SenseTime
SenseTime Future Star Talents
Beihang University
National Scholarship
China Computer Federation
CCF Outstanding Undergraduate Award

Recent Publications

(2024). Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in Minutes. Proceedings of the AAAI Conference on Artificial Intelligence.

(2024). Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection. Proceedings of the AAAI Conference on Artificial Intelligence.

(2024). TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

(2024). QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models. The Twelfth International Conference on Learning Representations.

(2024). Rectify representation bias in vision-language models for long-tailed recognition. Neural Networks.

(2023). Outlier Suppression+: Accurate quantization of large language models by equivalent and effective shifting and scaling. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.

(2023). Lossy and Lossless (L2) Post-training Model Size Compression. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV).

(2023). Annealing-Based Label-Transfer Learning for Open World Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

(2023). Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
