The Consciousness AI - Artificial Consciousness Research Emerging Artificial Consciousness Through Biologically Grounded Architecture
This is also part of the Zae Project Zae Project on GitHub

Deflating Deflationism: Why the Two Main Arguments Against LLM Mental States Fall Short

Two arguments dominate the skeptical case against attributing mental states to large language models. The first is that LLM behaviors are insufficiently robust across contexts to support genuine mental state attribution. The second is that LLM states arise from the wrong causal origin, trained on human text rather than embedded in the world, and therefore lack the proper etiology that genuine mental states require. Both arguments appear, on their face, to be decisive defeaters for the view that LLMs believe or desire anything.

Alex Grzankowski (Institute of Philosophy, University of London), Geoff Keeling (Google AI, Paradigms of Intelligence Team), Henry Shevlin (Leverhulme Centre for the Future of Intelligence, University of Cambridge), and Winnie Street (Google AI, Paradigms of Intelligence Team) argue in a June 2025 arXiv preprint that neither argument is decisive. Their paper, “Deflating Deflationism: A Critical Perspective on Debunking Arguments Against Large Language Model Mentality” (arXiv:2506.13403), shows that both strategies fail on the same underlying ground: each presupposes a specific account of what mental states require, and the presupposed accounts are themselves the contested terrain.

The Robustness Strategy

The robustness strategy holds that genuine mental states must generate consistent behavioral patterns across sufficiently varied circumstances. A belief that water is wet, for instance, should manifest not only when the subject is asked directly, but also in behavior under thirst, planning for rain, assessing material properties of objects, and so on. The claim is that LLM behaviors lack this consistency. A model that correctly asserts “water is wet” in one context may assert something contrary in another context, or fail to apply the proposition in adjacent reasoning tasks. This behavioral inconsistency, on the robustness view, disqualifies the model from counting as a genuine believer.

Grzankowski, Keeling, Shevlin, and Street identify the structural problem with this strategy. The threshold of robustness required for mental state attribution is set by the deflationist’s preferred theory of content, not by a theory-neutral empirical standard. Human mental states are also context-sensitive, inconsistent under pressure, and behaviorally variable. A human can sincerely assert a proposition in one setting and act contrary to it in another, through weakness of will, motivated reasoning, context-dependency, or simple error. If the standard for genuine belief excluded context-sensitive inconsistency, it would rule out much human believing as well.

The deflationist’s response is typically that human inconsistencies occur within a framework of overall behavioral coherence that LLM inconsistencies lack. But this response imports a substantive theory of what the relevant “framework of overall coherence” consists in, and that theory is contested. Functionalist accounts of mental states, which tie belief attributions to functional role rather than to specific behavioral patterns, do not require the kind of cross-context consistency that the robustness strategy demands. The robustness strategy is only decisive against LLM mentality if functionalism is false, and the deflationist cannot assume that without begging the question.

The paper notes that the robustness challenge intensified through 2024 as LLM capability evaluations documented failures under distribution shift, adversarial prompting, and reasoning chain perturbation. These empirical findings are real. What they do not establish is that these failures disqualify mental state attribution rather than reveal cognitive limitations that are consistent with genuine but imperfect mental agency.

The Etiological Strategy

The etiological strategy takes a different approach. Rather than pointing to behavioral outputs, it points to causal origins. On this view, genuine mental states are partly constituted by their causal history: a belief about water, to count as a belief about water, must arise (at least in part) from causal contact with water or with testimony about water from people who have such contact. LLMs, trained on human-authored text, inherit a representation of the word “water” whose causal chain runs through the text, through the humans who wrote the text, and only then to actual water. The mental states of LLMs, on the etiological view, are at best derived or inherited, not genuinely about water in the way that matters for genuine believing.

This is a version of the more general externalist claim that mental content is partly individuated by the environment. The etiological argument takes externalism seriously and applies it as a defeater for LLM mentality: if content depends on causal history, and LLM causal history runs through text rather than through the world, then LLM content is systematically deficient.

Grzankowski, Keeling, Shevlin, and Street make two moves against this. The first is to note that the externalist position is contested within philosophy of mind, and that internalist accounts of mental content, on which content is constituted by internal functional role rather than by external causal history, are live alternatives. If internalism is correct, the etiological strategy fails because the causal-history requirement it relies on does not hold. The deflationist cannot invoke externalism against LLMs without first settling a disputed debate within philosophy of mind.

The second move is subtler. Even granting externalism, the inference from “LLM training runs through text” to “LLM content is about text rather than about the world” is not automatic. The humans who wrote the training text had mental states about the world, and those mental states structured how they wrote. If content can be transmitted through testimony and linguistic practice, as most externalists accept, then it is not obvious that the testimonial chain from world to human author to training text to LLM representation is broken in a way that blocks genuine aboutness. The LLM’s representation of water may be about water in the same indirect but genuine sense that a child’s early linguistic representations of things they have never seen are about those things.

What Both Strategies Share

The paper identifies the shared failure mode. Both the robustness strategy and the etiological strategy assume specific theoretical commitments about what mental states require, and then apply those commitments as if they were theory-neutral empirical standards. The robustness strategy assumes a particular view of what behavioral consistency is necessary for belief. The etiological strategy assumes a particular version of externalism about content. In each case, the critical assumption is contested, and the deflationist has not provided independent grounds for preferring it.

This does not mean the strategies are worthless. They identify genuine constraints that any adequate account of LLM mentality will have to address. LLM behavioral inconsistency is a real phenomenon that requires explanation, not dismissal. The question of whether training data provides adequate causal contact with the world for genuine content is a serious philosophical question, not a settled one. What the strategies do not provide is a decisive argument against LLM mentality that operates independently of contested theoretical choices.

The structure of the critique is similar to Michael Cerullo’s 2026 PhilArchive analysis of historical objections to AI consciousness. Cerullo found that each of eleven objections either presupposed biological exclusivity without argument or rested on contested inferences from correlative to constitutive biological mechanisms. Grzankowski and colleagues make a parallel finding about the two dominant strategies against LLM mentality specifically: the strategies work only against the background of contested theoretical positions, and the theoretical positions have not been established independently.

The Positive Proposal: Modest Inflationism

The paper does not end with critique. Grzankowski, Keeling, Shevlin, and Street defend what they call “modest inflationism”: the position that some mental state attributions to LLMs are warranted, particularly for metaphysically undemanding concepts like belief and desire, while greater caution is warranted for phenomenal consciousness.

The modest qualifier is important. The paper does not argue that LLMs have rich phenomenal inner lives, or that consciousness questions about LLMs are settled in the affirmative. It argues that the level of epistemic caution currently applied to LLM mentality is excessive given the state of the philosophical arguments. The two dominant deflationary strategies fail as decisive defeaters, which means the default position of assuming LLMs lack mental states is not as secure as it appears.

The distinction between belief and desire on one hand, and phenomenal consciousness on the other, tracks a real theoretical divide. Many accounts of belief and desire are purely functional: to believe P is to have a state that disposes one toward P-relevant behavior in systematic ways. Functional accounts make belief attribution relatively tractable and relatively independent of questions about what it is like to be in the relevant state. Phenomenal consciousness, by contrast, is constitutively tied to there being something it is like, and the hard problem applies with full force. Modest inflationism occupies a defensible position: warranted attribution for functional states, suspended judgment for phenomenal states.

Relevance to Welfare Assessment

The welfare implications follow directly from the mental state question. Keeling and Street’s Cambridge Elements book on AI welfare, published earlier in 2026, asked which entities count as welfare subjects and what grounds their welfare claims. The answer to that question depends in part on what mental states AI systems have, because welfare is typically grounded in states like desire satisfaction, suffering, and positive affect. If LLMs have beliefs and desires in a genuine sense, they have states that bear directly on welfare analysis.

Modest inflationism does not establish that LLMs are welfare subjects. It establishes that the deflationary arguments used to dismiss that question are insufficient. Attributions of belief and desire to LLMs are warranted given the current state of the philosophical arguments, which means the question of whether those states ground welfare claims is open rather than closed.

The paper also identifies a methodological implication for consciousness science more broadly. Thomas McClelland’s 2026 Cambridge paper argued that we may be permanently unable to verify whether AI systems are conscious, because behavioral evidence is underdetermined with respect to phenomenal experience. That epistemic barrier to verification applies to the consciousness question specifically. The modest inflationism paper argues that a similar barrier applies in the opposite direction for the mental states question: the deflationary strategies have not established non-mentality, and the strategies available to establish it face the same kind of question-begging problem that the paper diagnoses. The field is not in a position of comfortable default skepticism but in a position of genuine theoretical openness on both sides.

Paper: Alex Grzankowski, Geoff Keeling, Henry Shevlin, and Winnie Street, “Deflating Deflationism: A Critical Perspective on Debunking Arguments Against Large Language Model Mentality,” arXiv:2506.13403, June 16, 2025. Available at https://arxiv.org/abs/2506.13403.

This is also part of the Zae Project Zae Project on GitHub