TokenSim: Enabling Hardware and Software Exploration for Large Language Model Inference Systems

Jan 1, 2026 · Feiyang Wu, Zhuohang Bian, Guoyang Duan, Tianle Xu, Junchi Wu, Teng Ma, Yongqiang Yao, Ruihao Gong, Youwei Zhuo

Type: Conference paper
Publication: Advanced Parallel Processing Technologies (APPT)
Last updated on Jan 1, 2026