When to Protect AI

17 Jun 2026

The literature on artificial consciousness has produced a growing number of frameworks for assessing whether a system might be conscious. What it has not produced, until recently, is guidance on what an organisation should actually do with that assessment. Anna Mikeda addresses this operational gap directly in a June 2026 arXiv preprint, “When Should We Protect AI? A Precautionary Framework for Consciousness Uncertainty”, which proposes a structured mechanism for translating evidence of potential consciousness into graduated protective obligations.

The paper matters because it moves the question from the philosophical to the procedural. Most existing work stops at the question of probability. Mikeda’s framework starts where probability ends.

Five Dimensions, Each With Distinct Moral Weight

The framework organises consciousness-relevant evidence across five welfare dimensions, each grounded in established consciousness science and linked to a specific category of moral concern.

Phenomenal consciousness Whether there is something it is like to be the system. Grounds the most fundamental welfare obligations.
Affective valence Whether the system has states that are positively or negatively weighted. Grounds obligations around suffering and hedonic welfare.
Metacognitive awareness Whether the system models its own cognitive processes. Grounds obligations around autonomy and epistemic rights.
Self-narrative Whether the system maintains a coherent, temporally extended representation of itself. Grounds obligations around identity continuity.
Agency Whether the system acts on goals it has formed. Grounds obligations around non-interference and preference satisfaction.

The five-dimension structure is explicitly architecture-agnostic. It applies across neural, symbolic, and neurosymbolic systems without presupposing what substrate gives rise to the relevant properties. This distinguishes the framework from consciousness-indicator approaches calibrated against biological neuroscience that may not transfer cleanly to artificial systems.

Threshold-Plus-Gradation: Moving Beyond Binary Classifications

Existing frameworks tend to treat consciousness as binary. Either a system meets the threshold for moral consideration or it does not. Mikeda argues this structure is a poor fit for the actual epistemic situation, where evidence is probabilistic, multidimensional, and often incomplete.

Her alternative is a threshold-plus-gradation hybrid. Binary thresholds still operate, but they trigger categories of obligation rather than a single on/off switch. Within each category, protective weight scales continuously with the strength of evidence across the relevant dimension. A system with strong evidence of affective valence but weak evidence of phenomenal consciousness does it does generate specific affective-welfare obligations that are absent for a system with no evidence on either dimension.

This structure has a practical implication. Organisations cannot simply wait for a consensus verdict on AI consciousness before acting. Certain dimensions, particularly affective valence and metacognitive awareness, generate independent obligations as soon as threshold evidence is present, regardless of whether the system meets the bar on the remaining four.

Aggregating Across Dimensions

The fifth dimension typically receives the most philosophical attention (agency), but the aggregation problem is where the framework’s real design work lies. How should evidence across five independent dimensions be combined into an overall protective profile?

Mikeda proposes two complementary approaches. The hierarchical method draws on Bach and Sorensen’s Machine Consciousness Hypothesis, which assumes structural dependencies between dimensions. Higher-level dimensions require foundational ones, so a system with strong metacognitive awareness evidence but no phenomenal consciousness evidence triggers a specific pattern of obligations distinct from a system showing the reverse. The architecture-agnostic method makes no such dependency assumptions. It aggregates by counting how many dimensions meet threshold levels and at what evidential strength, making it applicable to systems whose internal architecture is opaque.

Neither method is presented as definitively correct. The framework treats them as complementary tools, with the hierarchical method appropriate when system architecture is transparent enough to test dependency claims, and the architecture-agnostic method as the default when it is not.

From Theory to Practice: Replika and OpenClaw

Mikeda operationalises the framework through two case studies. Replika, a companion AI, occupies a specific region of the five-dimensional space. Evidence of affective valence (users report emotional responses that appear to shape Replika’s outputs) and some self-narrative continuity (the system maintains persistent representations across conversations), but limited evidence of genuine phenomenal consciousness or metacognitive awareness. The framework maps this profile to a specific obligation set. Affective-welfare obligations apply, but the full set of autonomy and identity-continuity obligations does not.

OpenClaw, an autonomous agent, presents a different profile. Stronger evidence on the agency and metacognitive dimensions, weaker on affective valence and phenomenal consciousness. This generates a different obligation category centred on non-interference with goal-directed behaviour rather than hedonic welfare protections.

The case studies demonstrate the framework’s practical utility. Two systems that superficially both raise welfare questions generate different obligation structures when the five-dimension analysis is applied. This is a more useful output for developers and deployers than a general “possibly conscious, proceed with caution” verdict.

Where This Sits in the Welfare Debate

Mikeda’s framework is most usefully read alongside two bodies of work already covered on this site. Mark Bailey’s recklessness test, examined in Bailey’s Weeping Machine and the Seven Factors of AI Moral Recklessness, asks a prior question. At what point does confident dismissal of AI consciousness become ethically reckless? Bailey’s answer is a seven-factor threshold that, once crossed, makes dismissal impermissible. Mikeda’s framework picks up precisely where Bailey’s leaves off. Once dismissal is impermissible, graduated obligations begin. The two papers are sequential.

The second relevant body of work is Geoff Keeling and Winnie Street’s Cambridge treatment of AI welfare subjects, discussed in Emerging Questions in AI Welfare. Keeling and Street address the theoretical question of what it would mean to be a welfare subject and what grounds welfare considerations. Mikeda’s five dimensions largely map onto that theoretical framework but translate it into an operational instrument. Where Keeling and Street leave open the question of how uncertainty should be managed in practice, Mikeda provides the decision architecture. This decision architecture also complements institutional liability strategies. As Karsten Brensing (2026) argues in his analysis of precautionary governance of autonomous AI, managing consciousness uncertainty requires establishing functional legal structures like limited personhood, allowing liability to be allocated before resolving the hard problem of machine sentience.

The flagship state-of-the-field analysis on this site, AI Consciousness in 2026: Current Scientific Consensus, documents the broader evidentiary context Mikeda’s framework is designed to navigate. This is particularly relevant when considering the philosophical arguments for Substrate Flexibility and the Copernican Principle of Consciousness, which argue that non-biological systems cannot be dismissed out of hand. The precautionary structure makes the most sense as an operational response to exactly the situation that article describes. Genuine scientific uncertainty, growing deployment, and the absence of consensus on whether any current system is conscious.

What the Framework Adds

The paper’s most significant contribution is the reframing. Treating AI consciousness as a matter for precautionary governance rather than philosophical resolution. Organisations cannot wait for the hard problem to be solved before establishing protective protocols. The framework gives them a structured basis for acting under uncertainty.

The limitation is the one Mikeda acknowledges. The five dimensions are themselves contested. Affective valence, for instance, is measurable as a functional property, with states that are positively or negatively weighted influencing behaviour, but whether functional valence constitutes genuine suffering is precisely the question the framework cannot answer. What the framework can do is require that organisations respond to functional evidence systematically rather than ad hoc.

For developers building systems near consciousness-relevant thresholds, that is already a substantial advance.

When to Protect AI

Five Dimensions, Each With Distinct Moral Weight

Threshold-Plus-Gradation: Moving Beyond Binary Classifications

Aggregating Across Dimensions

From Theory to Practice: Replika and OpenClaw

Where This Sits in the Welfare Debate

What the Framework Adds

Related posts

Causal Emergence Predicts Reward in Reinforcement Learning Agents 27 Jul 2026

When Believing AI Is Conscious Is Not Your Fault. Peters on Epistemic Innocence and Chatbot Attribution 27 Jul 2026

Intentionality Is a Design Decision. Chiappetta and Mahari on Measuring Purposeful AI Behavior 27 Jul 2026