PiLLM: Resource-efficient LLM Inference Using Workload Prediction

Jan 1, 2026
Yunqian Fan, Shihao Bai, Ruihao Gong (📧 Corresponding Author), Zaijun Wang, Rui Fan (📧 Corresponding Author)
Publication: Proceedings of the 21st European Conference on Computer Systems (EuroSys)