Latent Causal Discovery in Reinforcement Learning and Large Language Models

Overview
Latent causal discovery seeks to uncover hidden causal mechanisms in data where the true causal variables are unobserved. Our research develops theoretical foundations and practical algorithms for identifying latent causal structures, particularly in reinforcement learning (RL) and large language models (LLMs).
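As a rough formalization (our generic notation, not tied to any single paper): the latent variables follow a structural causal model, and only an unknown mixture of them is observed:

$$
z_i := f_i\big(\mathrm{pa}(z_i),\, \varepsilon_i\big), \qquad x = g(z),
$$

where $\mathrm{pa}(z_i)$ denotes the causal parents of $z_i$, the $\varepsilon_i$ are independent noise terms, and $g$ is an unknown mixing function. Identifiability asks under what conditions $z$ and its causal graph can be recovered from observations of $x$ alone.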
Research Focus
Causal Reinforcement Learning
In RL, learning disentangled causal state representations is crucial for robust decision-making, generalization, and transfer learning. Our work challenges traditional assumptions about what disentanglement requires and redefines state disentanglement in terms of causal constraints and interventions.
- Rethinking State Disentanglement in Causal RL – proposing a new framework for interpretable and robust state representations.
- Silver Linings: When Distribution Shifts Enhance Identifiability – investigating favorable shifts that improve the identifiability of latent variables in RL (see the toy sketch after this list).
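To see why a shift can help rather than hurt, here is a minimal toy sketch (our own NumPy construction under simplified linear-Gaussian assumptions, not code from the paper). Two environments share the same latent SCM except for the noise scale of one mechanism; contrasting their observed covariances isolates the direction that mechanism occupies in observation space.

```python
# Toy illustration: a distribution shift that aids identifiability.
# The setup and all names are hypothetical simplifications.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])  # fixed mixing, unknown to the learner

def sample_env(n, noise_scale):
    # Latent SCM: z1 -> z2, with an environment-specific noise scale on z2.
    z1 = rng.normal(size=n)
    z2 = 0.8 * z1 + noise_scale * rng.normal(size=n)
    z = np.stack([z1, z2], axis=1)
    return z @ A.T  # observations x = A z

x_env1 = sample_env(50_000, noise_scale=0.5)
x_env2 = sample_env(50_000, noise_scale=1.5)  # only z2's noise changes

# The covariance contrast between environments is (up to sampling error)
# rank one, spanned by the mixing column of the shifted latent z2.
diff = np.cov(x_env2.T) - np.cov(x_env1.T)
_, eigvecs = np.linalg.eigh(diff)
print("recovered shift direction:", eigvecs[:, -1])
print("true mixing column (normalized):", A[:, 1] / np.linalg.norm(A[:, 1]))
```

Up to sign, the top eigenvector matches the normalized second column of A, roughly (0.45, 0.89): the shift alone pins down which observation-space direction belongs to z2, something no single environment could reveal.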
Causal Representation Learning in Large Language Models
Do LLMs learn meaningful causal representations? We analyze whether next-token prediction is sufficient for learning human-interpretable causal concepts and propose methods to enhance causal learning.
- I Predict Therefore I Am – examining whether transformer-based models implicitly encode causal structures.
- Identifiable Latent Polynomial Causal Models – leveraging distribution changes to identify latent causal factors in LLMs (see the sketch after this list).
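To make the modeling assumption concrete, here is a small sketch (again our own NumPy construction, not the papers' code) of a weight-variant polynomial latent causal model: only the coefficient of the quadratic mechanism z1 → z2 changes across environments, and that change propagates into the observed distribution.

```python
# Toy weight-variant polynomial latent causal model; all names hypothetical.
import numpy as np

rng = np.random.default_rng(1)
M = np.array([[1.0, 0.4],
              [0.2, 1.0]])  # invertible linear part of the mixing

def sample_env(n, w):
    # Latent polynomial SCM: z2 is a polynomial in its parent z1, with an
    # environment-specific weight w on the quadratic term.
    z1 = rng.normal(size=n)
    z2 = 0.5 * z1 + w * z1**2 + 0.3 * rng.normal(size=n)
    z = np.stack([z1, z2], axis=1)
    return np.tanh(z @ M)  # invertible nonlinear mixing into observations

# Environments differ only in the mechanism weight; the observed statistics
# shift accordingly, which is the variability identifiability results exploit.
for w in (0.0, 0.5, 1.0):
    x = sample_env(20_000, w)
    print(f"w={w}: observed mean = {x.mean(axis=0).round(3)}")
```

The mixing composes an invertible linear map with an elementwise tanh, so it is invertible by construction; across environments only the causal mechanism varies, and the printed means shift with the weight w.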
Applications
- AI for Science – improving interpretability and robustness in scientific AI applications.
- Autonomous Systems – enabling RL agents to adapt to dynamic environments.
- Fair & Robust AI – reducing bias by ensuring models learn true causal relationships rather than spurious correlations.
Selected Publications
- ICLR 2024 – Identifiable Latent Polynomial Causal Models Through the Lens of Change.
- JMLR (Submitted) – Identifying Weight-Variant Latent Causal Models.
- ICML 2025 (Submitted) – Rethinking State Disentanglement in Causal Reinforcement Learning.
- ICML 2025 (Submitted) – Silver Linings: On the Types of Distribution Shifts that Enhance Identifiability in Causal Representation Learning.
- ICML 2025 (Submitted) – I Predict Therefore I Am: Is Next Token Prediction Enough to Learn Human-Interpretable Concepts from Data?
Get in Touch
For further details or collaboration opportunities, feel free to reach out!