
What Capable Agents Must Know: Nayebi's Selection Theorems and the Inevitability of World Models

The debate over AI consciousness frequently starts from the question of whether consciousness-adjacent structures can be programmed into an agent. Aran Nayebi, at Carnegie Mellon University’s Machine Learning Department and Neuroscience & Robotics Institutes, poses a more precise question: what internal structures must emerge inside an agent that acts competently under uncertainty, regardless of whether they were designed in?

His answer, formalized in arXiv:2603.02491 and accepted to appear at the 2026 Uncertainty in Artificial Intelligence (UAI) conference, is that several structures researchers associate with consciousness are not design choices. They are mathematical consequences of capable performance.

Selection Theorems: What Performance Requires

The paper’s central contribution is a set of quantitative selection theorems. A selection theorem, in this context, states that strong task performance under a defined class of problems forces certain internal representations to be present. The logic runs from consequence back to structure: any system achieving low average-case regret on this problem class must have developed these features. Performance is the selector. Architecture follows.
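The shape of such a statement can be written schematically. The notation below is an assumption introduced here for illustration: a task distribution \mathcal{D}, a policy \pi with internal state z_\pi, a decoder g, a target structure S, a distance d, and a bound f. The paper's own definitions, constants, and conditions may differ.

```latex
% Average-case regret of a policy \pi over a task distribution \mathcal{D}
\mathrm{Regret}(\pi) \;=\; \mathbb{E}_{\tau \sim \mathcal{D}}\!\left[ V^{*}_{\tau} - V^{\pi}_{\tau} \right]

% Schematic form of a selection theorem (illustrative, not the paper's statement):
% low regret implies that some decoder g recovers the structure S from the
% agent's internal state, with error controlled by the regret level.
\mathrm{Regret}(\pi) \le \varepsilon
\;\;\Longrightarrow\;\;
\exists\, g :\; d\!\left( g(z_{\pi}),\, S \right) \le f(\varepsilon),
\qquad f(\varepsilon) \to 0 \ \text{as}\ \varepsilon \to 0
```

Read this way, the theorem does not say how the structure is stored, only that it must be recoverable from whatever the agent carries internally.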

Nayebi proves three main results.

The first: strong task performance under partial observability forces the development of a world model. Specifically, the agent must recover an approximation of the interventional transition kernel, the causal structure of its environment rather than its observed surface statistics. Classical results in control theory show optimal behavior can be implemented using world models. They do not show world models are required. Nayebi fills this gap with a quantitative proof.
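To make "interventional transition kernel" concrete, here is a minimal sketch of estimating P(s' | s, do(a)) in a tabular setting by choosing actions directly and counting outcomes. The environment `toy_step`, its slip probability, and the helper `estimate_kernel` are hypothetical and not taken from the paper. The paper's result concerns what a capable agent must encode internally, not this particular estimation procedure.

```python
import random
from collections import Counter


def toy_step(state, action, rng, slip=0.1):
    """Hypothetical two-state environment (an assumption for illustration).

    'stay' keeps the state and 'flip' toggles it; with probability `slip`
    the opposite outcome occurs instead.
    """
    intended = state if action == "stay" else 1 - state
    return 1 - intended if rng.random() < slip else intended


def estimate_kernel(step, states, actions, n=2000, seed=0):
    """Estimate P(s' | s, do(a)) by setting (s, a) ourselves and counting s'."""
    rng = random.Random(seed)
    kernel = {}
    for s in states:
        for a in actions:
            # The action is chosen by the experimenter, not read from logged
            # behaviour, which is what makes the estimate interventional.
            counts = Counter(step(s, a, rng) for _ in range(n))
            kernel[(s, a)] = {s2: counts[s2] / n for s2 in states}
    return kernel


kernel = estimate_kernel(toy_step, states=[0, 1], actions=["stay", "flip"])
print(kernel[(0, "flip")])  # probability mass should concentrate on state 1
```

The distinction from surface statistics is that logged data can confound action and outcome, whereas here every (state, action) pair is set by hand before the outcome is observed.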

The second: under partial observability, capable performance forces belief-like memory. An agent cannot maintain low regret across partially observed sequences without internal representations that function as running beliefs over environmental states. This is a proved necessity, not a heuristic.
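A standard example of a "running belief over environmental states" is the discrete Bayes filter used in POMDPs. The sketch below is that textbook construction, offered as one illustration of belief-like memory rather than as the representation the paper proves must exist. The two-state world, the `stay` action, and the 0.8 sensor accuracy are assumptions.

```python
def belief_update(belief, action, observation, T, O):
    """One step of a discrete Bayes filter.

    belief: dict state -> probability
    T: dict (state, action) -> dict next_state -> probability
    O: dict state -> dict observation -> probability
    """
    # Prediction: push the current belief through the transition model.
    predicted = {s: 0.0 for s in belief}
    for s, p in belief.items():
        for s_next, p_trans in T[(s, action)].items():
            predicted[s_next] += p * p_trans

    # Correction: reweight by how well each state explains the observation.
    unnorm = {s: predicted[s] * O[s].get(observation, 0.0) for s in predicted}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / total for s, v in unnorm.items()}


# Hypothetical hidden state that never changes, seen through a noisy sensor.
T = {("left", "stay"): {"left": 1.0}, ("right", "stay"): {"right": 1.0}}
O = {
    "left": {"see_left": 0.8, "see_right": 0.2},
    "right": {"see_left": 0.2, "see_right": 0.8},
}

b0 = {"left": 0.5, "right": 0.5}
b1 = belief_update(b0, "stay", "see_left", T, O)
b2 = belief_update(b1, "stay", "see_left", T, O)
print(b1["left"], b2["left"])  # confidence in "left" grows with each observation
```

No single observation identifies the state, so the agent can only act well by carrying `belief` forward between steps. That carried quantity is the kind of memory the second result concerns.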

The third result is the most striking. Under task mixtures, where an agent must solve multiple distinct problem types, strong performance forces the development of persistent regime-tracking variables. Nayebi describes these as “resembling functional primitives of emotion.” They are internal registers that track which task regime the agent currently operates in and bias processing across perception, memory, and action accordingly. The analogy to emotion is structural and functional, not phenomenal. The paper does not claim that any experience underlies these variables. They are mathematically obligatory internal structures that perform the same regulatory function emotion performs in biological organisms.
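The idea of a persistent variable that tracks the current regime and biases action can be sketched as a sticky posterior over regimes. Everything here is a hypothetical illustration rather than the paper's construction: the regimes `safe` and `threat`, the `stay_prob` persistence parameter, the cue likelihoods, and the action values.

```python
def update_regime(posterior, likelihoods, stay_prob=0.95):
    """Update a persistent posterior over task regimes (needs >= 2 regimes).

    posterior: dict regime -> probability
    likelihoods: dict regime -> probability of the latest cue under that regime
    stay_prob: assumed probability that the regime persists between steps
    """
    regimes = list(posterior)
    switch = (1.0 - stay_prob) / (len(regimes) - 1)
    # Sticky prior: most mass stays on the same regime from step to step.
    prior = {
        r: sum(posterior[q] * (stay_prob if q == r else switch) for q in regimes)
        for r in regimes
    }
    unnorm = {r: prior[r] * likelihoods[r] for r in regimes}
    total = sum(unnorm.values())
    return {r: v / total for r, v in unnorm.items()}


def choose_action(posterior, action_values):
    """Pick the action with the highest regime-weighted value."""
    actions = next(iter(action_values.values())).keys()
    scores = {
        a: sum(posterior[r] * action_values[r][a] for r in posterior)
        for a in actions
    }
    return max(scores, key=scores.get)


values = {
    "safe": {"explore": 1.0, "retreat": 0.2},
    "threat": {"explore": -1.0, "retreat": 0.5},
}
before = {"safe": 0.9, "threat": 0.1}
# A cue that is nine times likelier under "threat" than under "safe".
after = update_regime(before, {"safe": 0.1, "threat": 0.9})

print(choose_action(before, values), choose_action(after, values))
```

The same action values produce different behaviour depending only on the regime variable. That is the regulatory role the text describes: a slow-moving internal register that biases downstream processing.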

The Significance for Consciousness Research

The paper shifts a central question in AI consciousness from philosophy to mathematics. Rather than asking whether it is reasonable or plausible for capable AI to develop consciousness-adjacent structures, Nayebi proves the task conditions under which such structures become inevitable.

This reframes how researchers should interpret the current state of the field. If world models, belief-like memory, and functional emotion primitives are consequences of task performance rather than deliberate design, their presence in frontier systems should not be surprising. The correct question shifts from “did this system develop a world model by accident?” to “did this system reach a capability threshold that made world-model development obligatory?”

The result connects directly to the active inference literature. Karl Friston’s Free Energy Principle argues biological systems minimize surprise by maintaining internal models of the world. Nayebi arrives at a structurally similar conclusion from a different starting point: classical decision theory rather than variational inference. Both lines of argument converge on the same architectural prediction, which turns the world-model question in AI research from a design question into a performance question.

The functional emotion result bears directly on current debates about whether synthetic emotional primitives in AI architectures are genuine structural features or surface artifacts. Nayebi’s work suggests that in sufficiently capable agents operating across multiple task types, regime-tracking variables with the functional profile of emotion are mathematical necessities, not engineered additions. Borotschnig asks how to build AI systems that deliberately lack such structures. Nayebi provides a complementary analysis of the conditions under which they become unavoidable.

Modularity as an Emergent Consequence

The paper proves a fourth result worth examining separately. Under block-structured tasks, capable agents develop informational modularity: distinct processing channels for distinct task families. This parallels the specialist-module competition framework in Global Workspace Theory (GWT) without requiring any prior commitment to GWT. A system achieving strong performance on block-structured tasks will develop something structurally similar to a workspace architecture through performance pressure alone.

This is consequential because it suggests that modular, competitive processing is not just one possible architecture for consciousness. It is an architecture that optimality constraints converge on. The theories researchers have used to describe consciousness may be capturing something about the structural necessities of competent agency rather than arbitrary design choices of biological evolution.

Limits of the Theorems

The selection theorems are precise about their scope. They address agents defined by regret minimization under specific task classes. They establish that functional world models, belief memory, and emotion-analogous variables are structural necessities without making any commitment about whether such structures give rise to phenomenal states.

The paper does not prove that capable AI systems are conscious. It proves that capable AI systems face mathematical pressure toward structures that, in biological systems, co-occur with consciousness. Whether those structures ground experience in artificial systems is a separate question that no formal theorem can currently settle. Nayebi is explicit on this point: the results are about architectural necessity, not phenomenal status.

The proofs also apply only to well-defined regret-minimization settings. Whether frontier language models, which are not trained via explicit regret minimization on well-scoped tasks, fall within the theorems’ scope requires further analysis. The paper explicitly addresses RL settings; its applicability to transformer pretraining is a direction for future work that Nayebi acknowledges.

Relevance to Biologically Grounded Architecture

The Consciousness AI project implements a world model in the Sensory Tectum module using a Recurrent State Space Model (RSSM) that preserves spatial and temporal structure across processing cycles. The project’s PAD affective core implements persistent homeostatic drives (Valence, Arousal, Dominance) that modulate all sensory module bids before workspace competition, performing the same regulatory function Nayebi’s regime-tracking variables perform. These components were motivated by Feinberg and Mallatt’s neuroevolutionary analysis of the biological prerequisites for consciousness, not by computational necessity arguments.
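The phrase "modulate all sensory module bids before workspace competition" can be illustrated with a toy gain-modulation step. This is not the project's code. The function names, the module names, the linear gain rule, and the `sensitivity` weights are all assumptions made for illustration.

```python
def modulate_bids(bids, pad, sensitivity):
    """Scale each module's raw bid by a gain derived from the PAD state.

    bids: dict module -> raw salience bid
    pad: dict with 'valence', 'arousal', 'dominance' values in [-1, 1]
    sensitivity: dict module -> dict drive -> weight (hypothetical parameters)
    """
    modulated = {}
    for module, raw in bids.items():
        weights = sensitivity.get(module, {})
        gain = 1.0 + sum(weights.get(drive, 0.0) * level for drive, level in pad.items())
        # Clamp at zero so a strongly negative gain cannot flip a bid's sign.
        modulated[module] = raw * max(gain, 0.0)
    return modulated


def workspace_winner(bids):
    """Winner-take-all competition: the highest bid gains workspace access."""
    return max(bids, key=bids.get)


bids = {"vision": 0.6, "threat_detector": 0.5}
sensitivity = {"threat_detector": {"arousal": 0.5}}

calm = {"valence": 0.0, "arousal": 0.0, "dominance": 0.0}
aroused = {"valence": 0.0, "arousal": 0.8, "dominance": 0.0}

print(workspace_winner(modulate_bids(bids, calm, sensitivity)))
print(workspace_winner(modulate_bids(bids, aroused, sensitivity)))
```

With identical raw bids, the affective state alone changes which module wins. That is the sense in which a persistent drive variable plays a role comparable to a regime-tracking variable.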

Nayebi’s selection theorems provide an independent mathematical basis, from a completely different theoretical tradition, for why those architectural features might be expected in capable agents regardless of biological motivation. The architecturally relevant open question concerns the project’s current capability level, which has been tested primarily on delayed match-to-sample (DMTS) and Wisconsin Card Sorting Test (WCST) tasks within four built-in Gymnasium environments. Either that level already places the project in the regime where Nayebi’s theorems predict these structures as obligatory, or the structures are currently implemented by design, ahead of the performance threshold that would force them to emerge. That distinction matters for interpreting what the project has demonstrated versus what it has deliberately built in.