Xiangchen Song

I am a PhD student in the Machine Learning Department at Carnegie Mellon University, advised by Prof. Kun Zhang (CMU-CLeaR Group). Previously, I studied Computer Science at UIUC with Prof. Jiawei Han.

research

Large language models are sequence models, and my work aims to make their internal representations provably identifiable so we can interpret and steer model behavior with principled guarantees. I build on causal representation learning to recover latent structure in temporal data (time series, video, and text), and bring this lens to mechanistic interpretability of LLMs: designing identifiable sparse autoencoders with feature consistency for LLM activations, analyzing internal reasoning mechanisms, and enabling targeted control for more reliable and efficient model behavior.

contact

Email: xiangchs [at] cs [dot] cmu [dot] edu

news

Sep 23, 2025	Two papers on efficient LLM reasoning have been accepted to the NeurIPS 2025 Workshop on Efficient Reasoning (ER@NeurIPS’2025)!!
Sep 22, 2025	Two papers on mechanistic interpretability have been accepted to the Mechanistic Interpretability Workshop at NeurIPS 2025 (MechInterp@NeurIPS’2025)!!
Sep 18, 2025	One paper “LLM Interpretability with Identifiable Temporal-Instantaneous Representation” has been accepted to The Thirty-ninth Conference on Neural Information Processing Systems (NeurIPS’2025)!!
May 01, 2025	One paper “Reflection-Window Decoding: Text Generation with Selective Refinement” has been accepted to The Forty-Second International Conference on Machine Learning (ICML’2025)!!
Jan 22, 2025	One paper “On the Identification of Temporal Causal Representation with Instantaneous Dependence” has been accepted to The Thirteenth International Conference on Learning Representations (ICLR’2025) as an oral presentation!!

selected publications

NeurIPS

LLM Interpretability with Identifiable Temporal-Instantaneous Representation

Xiangchen Song^*, Jiaqi Sun^*, Zijian Li, Yujia Zheng, and Kun Zhang

In The Thirty-ninth Annual Conference on Neural Information Processing Systems, Dec 2025

arXiv HTML PDF Code

NeurIPS MechInterp

Position: Mechanistic Interpretability Should Prioritize Feature Consistency in SAEs

Xiangchen Song^*, Aashiq Muhamed^*, Yujia Zheng, Lingjing Kong, Zeyu Tang, Mona T. Diab, Virginia Smith, and Kun Zhang

In Mechanistic Interpretability Workshop at NeurIPS (Spotlight), Dec 2025

arXiv HTML PDF Code

NeurIPS

Causal Temporal Representation Learning with Nonstationary Sparse Transition

Xiangchen Song, Zijian Li, Guangyi Chen, Yujia Zheng, Yewen Fan, Xinshuai Dong, and Kun Zhang

In The Thirty-eighth Annual Conference on Neural Information Processing Systems, Dec 2024

arXiv HTML PDF Code

NeurIPS

Temporally Disentangled Representation Learning under Unknown Nonstationarity

Xiangchen Song, Weiran Yao, Yewen Fan, Xinshuai Dong, Guangyi Chen, Juan Carlos Niebles, Eric Xing, and Kun Zhang

In Thirty-seventh Conference on Neural Information Processing Systems, Dec 2023

arXiv HTML PDF Code