Ruihao Gong

Associate Research Director of Artificial Intelligence

SenseTime Research

Ruihao Gong is currently an associate research director and the leader of the Model Toolchain Team at SenseTime Research, under the supervision of Xiaogang Wang and Dahua Lin. He focuses on systems and algorithms that accelerate industrial model production, model deployment, and model efficiency. His research interests include model quantization, model sparsity, hardware-friendly neural networks for various hardware such as cloud servers and mobile/edge devices, building systems for large model training and inference, and applications such as smart cities, L2/L4 autonomous driving, and personal AI assistants.

News:
[2024.02] Our TFMQ-DM paper on diffusion model quantization is accepted by CVPR 2024.
[2024.01] Our QLLM method for LLM quantization is accepted by ICLR 2024.
[2023.12] One paper on lane detection quantization and one paper on post-training sparsity are accepted by AAAI 2024.
[2023.11] We released EasyLLM: a training framework for large language models and vision-language models.
[2023.10] Our team won the championship of LPCV 2023.
[2023.10] Our Outlier Suppression+ paper on LLM quantization is accepted by EMNLP 2023.
[2023.08] We released LightLLM: a serving framework for large language models and vision-language models.
[2023.07] One paper on model size compression is accepted by ICCV 2023.
[2023.06] Our FamilySeer paper on deep learning compilers is accepted by ICPP 2023.
[2023.04] Our SysNoise paper on model robustness is accepted by MLSys 2023.

Experience

 
 
 
 
 
SenseTime
Associate Research Director
Jun 2023 – Present · Beijing

Responsibilities include:

  • Model deployment and compression for deep learning algorithms on various hardware
  • Building the system and infrastructure for large model training and serving
  • Building fundamental function-calling capabilities for LLMs
 
 
 
 
 
SenseTime
Senior Research Manager
May 2022 – Jun 2023 · Beijing

Responsibilities include:

  • Model deployment and compression for deep learning algorithms on various hardware
  • Building the infrastructure to scale up deep learning model production efficiency
  • Hardware-friendly neural networks
 
 
 
 
 
SenseTime
Senior Researcher
Jan 2021 – May 2022 · Beijing

Responsibilities include:

  • Model deployment and compression for deep learning algorithms on various hardware
  • Building the infrastructure to scale up deep learning model production efficiency

Accomplishments

SenseTime
SenseTime Black Sheep Award (five people per year)
LPCV
Winner of Low Power Computer Vision Challenge
IEEE Computer Society
2023 IEEE UAV Chase Challenge Award
LPCV
Winner of Low Power Computer Vision Challenge
Tencent
Tencent Rhino-Bird Elite Training Program
SenseTime
SenseTime Future Star Talents
Beihang University
National Scholarship
China Computer Federation
CCF Outstanding Undergraduate Award

Recent Publications

(2024). Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in Minutes. Proceedings of the AAAI Conference on Artificial Intelligence.


(2024). Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection. Proceedings of the AAAI Conference on Artificial Intelligence.


(2024). TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR).


(2024). QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models. The Twelfth International Conference on Learning Representations.


(2024). Rectify representation bias in vision-language models for long-tailed recognition. Neural Networks.


(2023). Outlier Suppression+: Accurate quantization of large language models by equivalent and effective shifting and scaling. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.


(2023). Lossy and Lossless (L2) Post-training Model Size Compression. Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV).


(2023). Annealing-Based Label-Transfer Learning for Open World Object Detection. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).


(2023). Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).


Contact