Ruihao Gong focuses on systems and algorithms for accelerating industrial model production, model deployment, and model efficiency. His research interests include model quantization, model sparsity, hardware-friendly neural networks for various hardware platforms such as cloud servers and mobile/edge devices, building systems for large model training and inference, and various applications such as smart cities, L2/L4 autonomous driving, and personal AI assistants.
News:
[2024.10] Our LLM compression benchmark paper with LLMC toolkit is accepted by EMNLP 2024.
[2024.05] We released LLMC: an off-the-shelf tool designed for compressing LLMs.
[2024.02] Our TFMQ-DM on Diffusion Model Quantization is accepted by CVPR 2024.
[2024.01] Our QLLM method for LLM quantization is accepted by ICLR 2024.
[2023.12] One paper on lane detection quantization and one paper on post-training sparsity are accepted by AAAI 2024.
[2023.11] We released EasyLLM: a training framework for Large Language Models and Vision Language Models.
[2023.10] Our team won the championship of LPCV 2023.
[2023.10] Our Outlier Suppression+ paper on LLM quantization is accepted by EMNLP 2023.
[2023.08] We released LightLLM: a serving framework for Large Language Models and Vision Language Models.
[2023.07] One paper on model size compression is accepted by ICCV 2023.
[2023.06] Our FamilySeer paper on deep learning compilers is accepted by ICPP 2023.
[2023.04] Our SysNoise paper on model robustness is accepted by MLSys 2023.