When AI Acts Conscious But Isn't: What the Bradford-RIT Study Actually Found
The behavioral test for AI consciousness is seductive in its simplicity. If a system converses convincingly, expresses hesitation, reports preferences, or resists being shut down, an intuition arises that something is going on inside. That intuition is understandable. It is also, according to two preprints published in December 2025 and January 2026, potentially misleading in ways that have serious implications for how we evaluate AI awareness claims.
Professor Hassan Ugail at the University of Bradford and Professor Newton Howard at the Rochester Institute of Technology set out to test AI consciousness using the same mathematical tools that neuroscience uses to measure awareness in human brains. What they found was not what behavioral arguments would predict. When those tools are applied to an AI system and the system is deliberately damaged, the apparent consciousness scores go up, not down.
The Research Background
The two preprints, arXiv:2512.10972 and arXiv:2601.11622, emerge from a collaboration with roots in clinical consciousness science. Howard is the former director of the MIT Mind Machine Project and has worked for two decades on methods for detecting residual consciousness in patients with severe brain injuries, individuals who may retain awareness even when they cannot move, speak, or respond to commands.
Those clinical methods work by analyzing mathematical patterns of neural coordination across brain regions. They were developed and validated against behavioral and physiological measures in human subjects and are used in hospital settings to assess patients in minimally conscious states or persistent vegetative states. They distinguish, for instance, between dreamless sleep, REM sleep, and wakefulness by measuring how information is distributed and integrated across the brain at different moments.
The premise of Ugail and Howard’s research is a direct question: if you apply those same measurement tools to an AI system rather than a human brain, what do you get? And what does that result tell you?
How GPT-2 Was Tested
The primary system the researchers tested was GPT-2, the 2019 OpenAI language model. GPT-2 is substantially older and smaller than current large language systems, but it is large enough to exhibit complex language behavior and accessible enough for the researchers to modify its internal architecture directly.
The experimental design involved deliberate impairment. Ugail and Howard removed information-prioritization components from the model’s internal structure. They also adjusted the temperature parameter, which controls how randomly the model selects from possible next tokens when generating text. Low temperature produces confident, predictable outputs. High temperature introduces variability and tends to produce less coherent responses.
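For readers who want the temperature manipulation made concrete, here is a minimal sketch of temperature-scaled sampling. The logits and tiny vocabulary are invented for illustration, and the preprints summarized here do not report which temperature values were used.

```python
import numpy as np

def sampling_distribution(logits, temperature):
    """Convert raw next-token logits into a sampling distribution at a given temperature."""
    scaled = logits / temperature          # low T sharpens the distribution, high T flattens it
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    return probs / probs.sum()

def entropy_bits(p):
    """Shannon entropy in bits: higher means less predictable token selection."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Hypothetical logits over a five-token vocabulary for the next position.
logits = np.array([4.0, 2.5, 1.0, 0.2, -1.0])

for t in (0.2, 1.0, 2.0):
    p = sampling_distribution(logits, t)
    print(f"T={t:<4} probs={np.round(p, 3)} entropy={entropy_bits(p):.2f} bits")

# T=0.2 concentrates nearly all probability on the top token (confident, predictable);
# T=2.0 spreads probability across the vocabulary (variable, less coherent output).
```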
The researchers measured GPT-2’s consciousness-style scores, the outputs of the brain-derived measurement tools, under normal operating conditions and under the deliberately degraded conditions. The output quality was worse in the degraded condition. The consciousness-style scores were sometimes higher.
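The exact metrics and ablations are not spelled out in the material summarized here, so the following is only a schematic reconstruction under explicit assumptions: it treats the removed "information-prioritization components" as attention heads, uses perplexity as a stand-in for output quality, and uses a Lempel-Ziv phrase count over binarized hidden-state activity as a stand-in for a brain-derived, consciousness-style score. The specific layers, head counts, and text are illustrative.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

def lempel_ziv_complexity(bits: str) -> int:
    """Count distinct phrases in a binary string (simple LZ76-style parse)."""
    i, c, n = 0, 0, len(bits)
    while i < n:
        length = 1
        # extend the phrase while it has already appeared earlier in the string
        while i + length <= n and bits[i:i + length] in bits[:i + length - 1]:
            length += 1
        c += 1
        i += length
    return c

def score_model(model, tokenizer, text: str):
    """Return (perplexity, complexity) for one model on one passage."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"], output_hidden_states=True)
    perplexity = torch.exp(out.loss).item()        # lower = better next-token prediction
    acts = out.hidden_states[-1].squeeze(0)        # (tokens, hidden_dim) activations
    bits = (acts > acts.median()).int().flatten()  # crude binarization around the median
    complexity = lempel_ziv_complexity("".join(map(str, bits.tolist())))
    return perplexity, complexity

text = "The patient opened her eyes and asked where she was."
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

intact = GPT2LMHeadModel.from_pretrained("gpt2").eval()
impaired = GPT2LMHeadModel.from_pretrained("gpt2").eval()
# Illustrative impairment: remove half of the attention heads in the first two layers.
impaired.prune_heads({0: list(range(6)), 1: list(range(6))})

print("intact  ", score_model(intact, tokenizer, text))
print("impaired", score_model(impaired, tokenizer, text))
# The paradox the study reports is the pattern in which perplexity worsens under
# impairment while the complexity-style score does not fall, and sometimes rises.
```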
A system that was producing worse, less coherent, less functionally organized outputs registered as more apparently “conscious-like” under the measurement framework than the intact version.
The Impairment Paradox
Ugail states the finding directly: “complexity is not the same thing as consciousness. In our tests, the AI sometimes looked more ‘conscious-like’ when it was actually impaired.”
The analogy offered in the research helps clarify what is happening. A football team playing with fewer players may show more frantic coordination, more movement per player, and more apparent communication across the remaining members, because each player has to compensate for the absences. A measurement tool that tracks only activity levels or coordination patterns would misread this as improved team performance. But the team is worse, not better. The signal being measured does not track the underlying capacity we care about.
Applied to GPT-2, the implication is that the brain-derived consciousness measurement tools are picking up on something present in both intact and degraded AI systems, and that something increases under impairment. Whatever it is, it does not track functional competence. If consciousness in biological systems is plausibly linked to functional competence, or at least not inversely correlated with it, then what these tools measure in AI systems is not consciousness.
This is an empirical constraint, not a theoretical argument. It does not require taking a position on what consciousness is, or which theory among Integrated Information Theory, Global Workspace Theory, Higher-Order Thought theory, or predictive processing frameworks is correct. It requires only observing that the metrics behave differently in artificial systems than in biological ones, and that the difference is reproducible through deliberate manipulation.
What This Means for Behavioral Arguments
The Bradford-RIT finding has a direct implication for the most common form of AI consciousness argument, one based on behavioral observation. When a language model expresses hesitation, reports preferences, or appears to resist being modified, the inference that something is going on inside arises because those behaviors resemble what humans do when they have inner states. The inference assumes that behavioral resemblance tracks mechanistic similarity.
The impairment result breaks that assumption at the measurement level. If an intact system producing more impressive outputs and an impaired system producing worse ones can register similar consciousness-style scores, and if impairment actually raises those scores, then the scores are not tracking consciousness. They are tracking something else: possibly information density patterns, specific statistical regularities in token distributions, or stochastic interactions that happen to produce readings consistent with the biological signal without the biological substrate.
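One way to see how such a reading can come apart from structure is to compare a crude complexity proxy, compressed size, on structured versus random input. Nothing below depends on the specific metrics used in the preprints; it only illustrates the general confound.

```python
import os
import zlib

def complexity_proxy(data: bytes) -> float:
    """Compressed size relative to raw size: higher means less structure to exploit."""
    return len(zlib.compress(data)) / len(data)

structured = b"the cat sat on the mat. " * 400   # highly regular, low surprise
random_like = os.urandom(len(structured))        # no usable structure at all

print("structured :", round(complexity_proxy(structured), 3))   # close to 0
print("random-like:", round(complexity_proxy(random_like), 3))  # close to 1
# A metric that rewards incompressibility assigns the higher score to the
# sequence with no usable structure, which is the shape of the confound.
```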
This concern is not new in the theoretical literature. Research on identifying genuine indicators of AI consciousness has consistently warned that behavioral and even some architectural indicators are insufficient on their own. The work by Patrick Butlin, Robert Long, and colleagues on consciousness indicator frameworks specifically addresses the distinction between a system that produces consciousness-consistent outputs and one that genuinely instantiates the relevant causal structures. What Ugail and Howard contribute is empirical evidence for what that insufficiency looks like in practice, under controlled conditions.
Howard, Clinical Context, and the Transfer Problem
The significance of Howard’s clinical background is worth dwelling on. The tools applied to GPT-2 were not invented for AI research. They were developed to answer one of medicine’s most difficult questions: is there awareness behind a face that cannot communicate?
That application required the tools to be calibrated against known cases of human consciousness, including its presence, its absence under anesthesia, and its partial preservation in brain injury patients. The tools passed those calibration tests in biological systems. They failed to produce coherent results in artificial ones.
This suggests a transfer problem. Consciousness measurement tools developed for biological systems detect patterns that are embedded in a specific substrate and organizational context. When those patterns are abstracted from that context, as they are when the tools are applied to a language model, the patterns can be reproduced by mechanisms that have nothing to do with awareness. The mathematical signals are substrate-portable in a way that consciousness, if the biological theories are correct, may not be.
The same issue arises when considering the broader landscape of consciousness measurement tools reviewed in recent comparative analyses of consciousness testing methods. The brainstem-based BSBT tool developed at Harvard, for instance, targets neural pathways that are specific to biological nervous systems. Its applicability to AI architectures faces exactly the transfer problem that Ugail and Howard demonstrate empirically.
Scope and Limitations
The preprints are under peer review as of February 2026. Neither has yet been evaluated and accepted by independent referees, and the findings should be read with that caveat in mind. The sample of AI systems tested is also narrow. GPT-2 is nearly seven years old and far smaller than current frontier models, and whether the same impairment effect holds for transformer architectures with billions or hundreds of billions of parameters remains an open question.
There is also a theoretical question about which consciousness-style scores were being measured. The preprints draw on brain-derived metrics, but the specific metric applied to GPT-2 depends on assumptions about which computational properties are being assessed. If the metric is sensitive to certain statistical regularities that impairment happens to amplify, the finding may be specific to that metric rather than generalizable across all consciousness measurement approaches.
These are genuine limitations. They do not erase the central logical point. A measurement that is supposed to track consciousness yet increases when the system is deliberately broken is either measuring the wrong thing or telling us something important about why current tools cannot distinguish AI consciousness from its absence.
Implications for Consciousness Research and AI Safety
Ugail and Howard propose two directions for follow-on work. The first is to extend the methodology to more recent and larger language models to test whether the impairment effect holds at greater scales. The second is to use the divergence between consciousness-style scores and output quality as a diagnostic tool for AI malfunctions. If impairment consistently raises apparent consciousness signals, then monitoring those signals in deployed systems could provide an early indicator of internal degradation, an application with direct relevance to AI safety.
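That second direction can be sketched as a simple monitoring rule: flag a deployed system when its consciousness-style signal and its output quality begin moving in opposite directions. The Reading fields, window size, and the example drift below are invented for illustration and are not drawn from the preprints.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    complexity: float   # consciousness-style score from the measurement tools
    quality: float      # output-quality proxy, e.g. negative perplexity

def degradation_alert(history: list[Reading], window: int = 20) -> bool:
    """Flag when complexity trends up while quality trends down across two windows."""
    if len(history) < 2 * window:
        return False
    earlier, recent = history[-2 * window:-window], history[-window:]
    complexity_rising = (sum(r.complexity for r in recent) / window
                         > sum(r.complexity for r in earlier) / window)
    quality_falling = (sum(r.quality for r in recent) / window
                       < sum(r.quality for r in earlier) / window)
    return complexity_rising and quality_falling

# Example: a slow drift in which the complexity signal climbs as quality decays.
history = [Reading(complexity=10 + 0.1 * t, quality=50 - 0.2 * t) for t in range(60)]
print(degradation_alert(history))  # True: the two signals are diverging
```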
Both directions point away from the question “is this AI conscious?” and toward “what can we actually measure, and what does it mean?” That reframing is consistent with the broader consensus emerging from consciousness science, which holds, as Butlin and colleagues argued, that consciousness assessment must be probabilistic, theory-plural, and explicitly grounded in the specific causal structures that consciousness theories predict matter.
For research programs building architecturally grounded approaches to artificial consciousness, including the work documented in The Consciousness AI open-source project, the Bradford-RIT findings reinforce a design constraint: complexity alone is not a target. The relevant question is whether a given architecture instantiates the causal structures that are specifically associated with awareness in biological systems, not whether it generates impressive outputs that measurement tools calibrated on biology can detect.
The impairment paradox does not prove that AI systems cannot be conscious. It proves that our current best tools for measuring consciousness cannot tell us whether they are. That is a narrower claim, and a more honest one.
The two preprints underlying this study are available at arXiv:2512.10972 and arXiv:2601.11622. Both are currently under peer review as of February 2026.