I am a Software Engineer at Meta, working on AI efficiency on MTIA. My research areas include LLM infrastructure, recommendation systems, high performance computing, storage systems, resource management, and cloud computing.

Previously, I was a Tech Lead and Senior Researcher at Microsoft (2022-2025), working on DeepSpeed and recommendation systems. Before that, I was a research scientist at Oak Ridge National Laboratory (2017-2022). During my time at ORNL, I received the 2021 IEEE-CS TCHPC Early Career Researchers Award for Excellence in High Performance Computing.

News

2026 Serving as a TPC member for SC’26, IPDPS’26, and ICPP’26
2025-05 Started a new position at Meta
2025 Served as a TPC member for SC’25 and IPDPS’25
2024 One paper accepted to ASPLOS’24
2024 One paper accepted to IPDPS’24
2023-11 One paper accepted to SC’23
2022-09 Started a new position at Microsoft
2022-05 Two papers accepted to HPDC’22

Recent Publications

[2026] Bing Xie et al. Peek: Queue-Predictive KV Cache Management for LLM Serving (under preparation)
[2025] Bing Xie*, Junqi Yin*, Zihao Zhou*, Sarp Oral, Feiyi Wang. Scaling Up Data Parallelism in Decentralized Deep Learning (* equal contribution, arXiv preprint)
[2024] Guanhua Wang, Olatunji Ruwase, Bing Xie, Yuxiong He. FastPersist: Accelerating Model Checkpointing in Deep Learning (arXiv preprint)
[ASPLOS’24] Shaleen Garg, Jian Zhang, Rajesh Pitchumani, Sharath Parashar, Bing Xie. CrossPrefetch: Accelerating I/O Prefetching for Modern Storage
[IPDPS’24] Di Zhang, Manav Raj, Bing Xie, Sheng Di, Dong Dai. Cross-System Analysis of Job Characterization and Scheduling in Large-Scale Computing Clusters
[SC’23] Zheng Pan, Yikang Ding, Zhen Zheng, Fei Zhang, Bing Xie, Ruofan Wu, Samuel Smith, Changyou Liu, Olatunji Ruwase, Xiaoyong Du. RecFlex: Enabling Feature Heterogeneity-Aware Optimization for Deep Recommendation Models with Flexible Schedules
[HPDC’22] Jean Luca Bez*, Ahmad Maroof Karimi*, Arnab K. Paul*, Bing Xie*, Suren Byna, Philip Carns, Sarp Oral, Feiyi Wang, Jesse Hanley. Access Patterns and Performance Behaviors of Multi-layer Supercomputer I/O Subsystems under Production Load (* equal contribution)
[HPDC’22] Di Zhang, Dong Dai, Bing Xie. SchedInspector: A Batch Job Scheduling Inspector Using Reinforcement Learning
[NSDI’22] Joshua Romero, Junqi Yin, Nouamane Laanait, Bing Xie, Todd Young, Sean Treichler, Vitalii Starchenko, Albina Borisevich, Alex Sergeev, Michael Matheson. Accelerating Collective Communication in Data Parallel Training across Deep Learning Frameworks