Ruihao Gong
Paper-Conference
Pre^3: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation
Aug 1, 2025
AtomNet: Designing Tiny Models from Operators Under Extreme MCU Constraints
Apr 1, 2025
Tool Playgrounds: A Comprehensive and Analyzable Benchmark for LLM Tool Invocation
Jan 1, 2025
Robust Long-Tailed Recognition with Distribution-Aware Adversarial Example Generation
Jan 1, 2025
ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding
Jan 1, 2025
Past-Future Scheduler for LLM Serving under SLA Guarantees
Jan 1, 2025
OMNIBAL: Towards Fast Instruction-Tuning for Vision-Language Models via Omniverse Computation Balance
Jan 1, 2025
HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration
Jan 1, 2025
DA-KD: Difficulty-Aware Knowledge Distillation for Efficient Large Language Models
Jan 1, 2025
LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit
Nov 1, 2024