PiLLM: Resource-efficient LLM Inference Using Workload Prediction
Jan 1, 2026
Yunqian Fan
Shihao Bai
Ruihao Gong (Corresponding Author)
Zaijun Wang
Rui Fan (Corresponding Author)
Type
Publication
Proceedings of the 21st European Conference on Computer Systems (EuroSys)