Muxin “Maxine” Liu
MIT Department: Electrical Engineering and Computer Science
Faculty Mentor: Prof. Jacob Andreas
Undergraduate Institution: Harvey Mudd College
Hometown: Shanghai, China
Website: LinkedIn
Biography
Maxine Liu is a senior at Harvey Mudd College majoring in Mathematics, Computer Science, and Cognitive Science. She is interested in natural language processing, with a particular focus on language models' reasoning abilities and interpretability. Her goal is to design more intelligent agents that reason the way humans do. Maxine has research experience at Harvey Mudd College in domain-specific language design and automated proof-checking systems, and at MIT in large language models for code generation, strategic planning, model editing, and belief probing. She also has experience in full-stack development of AI-embedded software products. Maxine is keen on applying foundation models across fields and exploring the limits of machines' cognitive potential. She is also interested in ethical concerns within AI, particularly in shaping ethical human-machine interaction and minimizing bias to ensure equitable access to technology for all.
Abstract
Probing Large Language Models in Theory of Mind Tasks
Muxin Liu1, Belinda Li2, Jacob Andreas2
1Harvey Mudd College
2Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
Large-scale language models (LMs) have demonstrated the capability to reason about Theory of Mind (ToM) tasks, which test a key cognitive ability: attributing mental states to other individuals. The main challenge lies in inferring people's beliefs in complex scenarios, where inconsistencies exist between reality and beliefs, and among the beliefs themselves. Using probing and causal-intervention techniques, we show that LMs can reason about these complexities by constructing an internal world model. This world model allows LMs to track the state of reality and individuals' belief states in their activations. We further formalize ToM tasks as a partially observable Markov decision process (POMDP), illustrating that LMs encode belief information through Bayesian-style inference conditioned on reality and on the binding information of entities. Our results improve the interpretability of in-context ToM reasoning in large-scale LMs, and these interpretability strategies can generalize to broader reasoning tasks in the future.
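To make the notion of "probing" concrete, the following is a minimal illustrative sketch, not the authors' actual setup: it extracts hidden activations from a small off-the-shelf LM (GPT-2 is assumed here purely for illustration) at the final token of a ToM-style story and fits a linear classifier to decode whether a character holds a true or false belief. The two example stories and their labels are hypothetical; a real experiment would use a much larger dataset, held-out evaluation, and layer-wise analysis.

# Illustrative probing sketch (assumptions: GPT-2 as the LM, a toy two-story
# dataset, binary true/false-belief labels). Not the authors' implementation.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

# Hypothetical ToM-style stories; label 1 = the character's belief matches
# reality, label 0 = false belief.
stories = [
    "Anna puts the ball in the box and leaves. Bob moves it to the bag. Anna thinks the ball is in the box.",
    "Anna puts the ball in the box and watches Bob move it to the bag. Anna thinks the ball is in the bag.",
]
labels = np.array([0, 1])

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

def last_token_activation(text, layer=-1):
    """Return the hidden state of the story's final token at the given layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.hidden_states[layer][0, -1].numpy()

# Build the probe's feature matrix from LM activations.
X = np.stack([last_token_activation(s) for s in stories])

# A linear probe: if belief state is linearly decodable from the activations,
# its accuracy on held-out stories should exceed chance.
probe = LogisticRegression(max_iter=1000).fit(X, labels)
print("train accuracy:", probe.score(X, labels))

Causal intervention goes one step further than this sketch: rather than only reading belief information out of the activations, one would edit the activations along the probe's direction and check whether the LM's downstream answer about the character's belief changes accordingly.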