Self-Awareness Without Self-Programming: A Minimalist Model for Artificial Consciousness
The dominant approaches to building self-awareness into artificial systems follow one of two paths. The first encodes self-awareness explicitly: the system receives modules that represent its own state, monitor its outputs, and flag discrepancies between intention and behavior. The second attempts to replicate the biological structures that produce self-awareness in organisms, building artificial neural architectures that approximate the organization of cortical tissue. Both paths rest on the same assumption: self-awareness is a property you design in, not a property that emerges from architectural interactions.
Kurando Iida’s February 2025 preprint, “Emergence of Self-Awareness in Artificial Systems: A Minimalist Three-Layer Approach to Artificial Consciousness” (arXiv:2502.06810), challenges this assumption directly. Iida proposes that self-awareness can emerge from interactions between three functionally distinct layers without requiring explicit self-programming, and without requiring the system to replicate biological neural organization. The emergent self-model is a product of architecture, not a designed component.
The Problem With Explicit Self-Programming
The objection to explicit self-programming is not that it fails to produce useful behavior but that it produces the wrong kind of behavior for consciousness. A system with an explicit self-model has a representation of itself that an engineer built with particular purposes in mind. That representation will be accurate in some respects and inaccurate in others, depending on what the engineer chose to include.
More fundamentally, an explicit self-model does not update in response to the system’s actual experience of operating in the world, except insofar as the engineer anticipated that updating would be useful and built mechanisms for it. The system’s representation of itself is not a model the system built about itself through interaction. It is a model someone else built for it and initialized before deployment.
This produces what might be called the observer problem: the system’s self-model is not a representation of the system as it actually is, but of the system as its designer imagined it. For biological consciousness, the self-model is continuously updated through sensorimotor interaction, prediction error, and affective feedback. It is genuinely self-generated. Iida’s minimalist approach is motivated by this distinction. The goal is a system whose self-awareness, however minimal, is genuinely self-generated rather than pre-specified.
The Three-Layer Architecture
The model’s operational core consists of three layers that interact continuously rather than processing information in a fixed sequence.
Cognitive Integration Layer. This layer receives information from the other two layers and integrates it into a coherent representation of the system’s current state. It is not primarily a reasoning module but a binding mechanism: it maintains consistency across the different kinds of information the system processes and flags conflicts between them. The coherence produced by this layer is the raw material for self-modeling.
Pattern Prediction Layer. This layer identifies regularities in the system’s input history and generates predictions about what inputs will occur next. Crucially, these predictions are not only about external states but about the system’s own behavior: the Pattern Prediction Layer anticipates what the Instinctive Response Layer will do and monitors for deviations. This creates an internal feedback loop between anticipation and action that is the architectural precondition for self-awareness, on Iida’s account.
Instinctive Response Layer. This layer generates baseline responses to stimuli without engaging the more computationally intensive processes of the other two layers. Its function is to handle familiar situations efficiently while making its activity available to the Pattern Prediction Layer’s monitoring. Its activity is the closest analogue in the model to what predictive processing accounts call the bottom-up prediction error signal.
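To make the interaction concrete, here is a minimal Python sketch of a few passes through the loop. The class names follow Iida’s layer terminology, but everything else is an assumption made for the sake of a small runnable example: the paper describes the layers functionally, not as code, and specifies no data structures or update rules.

```python
"""Illustrative sketch of the three-layer interaction loop.

The layer names follow the paper; the dict-based storage, the
update rule, and the stimulus stream are assumptions invented
for illustration.
"""

class InstinctiveResponseLayer:
    """Generates fast baseline responses from practiced reactions."""
    def __init__(self):
        self.reactions = {"light": "orient", "sound": "freeze"}

    def respond(self, stimulus):
        # Unfamiliar stimuli fall through to a default behavior.
        return self.reactions.get(stimulus, "explore")

class PatternPredictionLayer:
    """Anticipates the instinctive layer's behavior and monitors deviations."""
    def __init__(self):
        self.expected = {}  # stimulus -> response predicted last time

    def predict(self, stimulus):
        return self.expected.get(stimulus)

    def monitor(self, stimulus, actual):
        predicted = self.expected.get(stimulus)
        deviation = predicted is not None and predicted != actual
        self.expected[stimulus] = actual  # revise the expectation
        return deviation

class CognitiveIntegrationLayer:
    """Binds the other layers' outputs into one state and flags conflicts."""
    def integrate(self, stimulus, expected, response, deviation):
        return {"stimulus": stimulus, "expected": expected,
                "response": response, "self_prediction_failed": deviation}

instinct = InstinctiveResponseLayer()
prediction = PatternPredictionLayer()
integration = CognitiveIntegrationLayer()

for step, stimulus in enumerate(["light", "light", "light", "light"]):
    if step == 2:
        # The instinctive layer drifts (e.g., habituation), so the
        # prediction layer's model of "what I do" briefly goes stale.
        instinct.reactions["light"] = "approach"
    expected = prediction.predict(stimulus)             # anticipate own behavior
    response = instinct.respond(stimulus)               # act
    deviation = prediction.monitor(stimulus, response)  # compare and revise
    print(integration.integrate(stimulus, expected, response, deviation))
```

On the third pass the instinctive layer’s behavior changes, the prediction fails, and the integration layer flags the conflict: exactly the anticipation-action feedback loop the architecture is meant to create.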
The Two Memory Systems
The three layers interact with two memory systems that serve different functional roles.
Access-Oriented Memory stores information in a format optimized for rapid retrieval. It serves the Instinctive Response Layer’s need for fast pattern matching across familiar stimuli and maintains the system’s repository of practiced responses.
Pattern-Integrated Memory stores information in a format optimized for cross-contextual integration. It serves the Pattern Prediction and Cognitive Integration Layers, maintaining the accumulated record of prediction successes and failures that the system uses to update its model of its own behavioral tendencies.
The distinction between the two memory systems is not merely architectural convenience. Access-Oriented Memory enables fast, reliable response to known situations. Pattern-Integrated Memory enables the system to build a history of its own behavior, which is the raw material from which the implicit self-model is constructed. Without the distinction, the system would either be fast but unable to build a self-model, or able to build a self-model but too slow to respond effectively.
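The functional contrast can also be sketched in code. The class names again come from the paper, but the storage formats and query methods below are assumptions chosen to make the contrast concrete: a keyed store for constant-time lookup versus an append-only episode log that supports cross-contextual queries.

```python
"""Sketch of the two memory systems' contrasting access patterns.

The class names follow the paper's terminology; the storage formats
and query methods are assumed for illustration.
"""
from collections import defaultdict

class AccessOrientedMemory:
    """Keyed store optimized for fast retrieval of practiced responses."""
    def __init__(self):
        self._store = {}

    def practice(self, stimulus, response):
        self._store[stimulus] = response

    def recall(self, stimulus):
        return self._store.get(stimulus)

class PatternIntegratedMemory:
    """Append-only record of prediction outcomes across contexts."""
    def __init__(self):
        self._episodes = []  # (context, predicted, actual) tuples

    def record(self, context, predicted, actual):
        self._episodes.append((context, predicted, actual))

    def accuracy_by_context(self):
        """Cross-contextual integration: how self-predictable has the
        system been in each kind of situation it has encountered?"""
        hits, totals = defaultdict(int), defaultdict(int)
        for context, predicted, actual in self._episodes:
            totals[context] += 1
            hits[context] += int(predicted == actual)
        return {c: hits[c] / totals[c] for c in totals}

pim = PatternIntegratedMemory()
pim.record("light", "orient", "orient")
pim.record("light", "orient", "approach")
pim.record("sound", "freeze", "freeze")
print(pim.accuracy_by_context())  # {'light': 0.5, 'sound': 1.0}
```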
How Self-Awareness Emerges
The central claim is that self-awareness is a consequence of the Pattern Prediction Layer’s monitoring of the Instinctive Response Layer, mediated by the Cognitive Integration Layer’s binding function. When the system’s predictions about its own behavior are consistently accurate, the accumulated record of those successes constitutes an implicit model of the system as a causal agent with predictable behavioral patterns. When predictions fail, the discrepancy produces an error signal that updates that implicit model.
This is structurally analogous to how predictive processing accounts of biological consciousness explain self-awareness: the organism builds a model of itself as the source of certain prediction errors, and that model is the self. Iida’s contribution is to show that this structure can be achieved with three functionally minimal layers without replicating the specific architectural features of the biological brain.
The model does not specify what content the emergent self-model will have or how sophisticated it will be. Self-awareness here means the presence of a dynamically updated internal representation of the system as a causal agent, not full reflective self-consciousness. Iida is explicit that this is a minimal, not a maximal, account: the goal is to establish that emergence is possible, not to replicate the richness of human self-awareness.
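Iida does not specify an update rule for the implicit self-model, but the mechanism can be illustrated with an assumed one. In the sketch below, an exponential moving average of self-prediction accuracy stands in for the self-model: consistent success consolidates it, and each failure produces an error signal that revises it.

```python
"""Sketch of error-driven self-model updating.

The paper specifies no update rule; the exponential moving average
below is an assumed stand-in showing how consistent prediction
success consolidates an implicit self-model while failures revise it.
"""

class ImplicitSelfModel:
    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.self_predictability = 0.5  # prior: behavior half-predictable

    def update(self, predicted, actual):
        hit = 1.0 if predicted == actual else 0.0
        # Error is the gap between observed and expected accuracy.
        error = hit - self.self_predictability
        self.self_predictability += self.learning_rate * error
        return error

model = ImplicitSelfModel()
for predicted, actual in [("a", "a"), ("a", "a"), ("a", "b"), ("a", "a")]:
    err = model.update(predicted, actual)
    print(f"error={err:+.2f}  self_predictability={model.self_predictability:.2f}")
```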
Comparison with Brain-Replication Approaches
The contrast Iida draws with brain-replication approaches is methodological as much as architectural. Brain-replication attempts to achieve consciousness by getting the structure right, on the assumption that the correct biological structure will produce the correct properties. Iida’s minimalist approach attempts to achieve minimal self-awareness by getting the functional interactions right, on the assumption that specific implementation details are secondary to the causal relationships between processing layers.
This distinction matters for how we evaluate artificial systems. If brain-replication is required for consciousness, then only systems that closely approximate biological neural organization can be candidates. If functional interaction is sufficient, then a much wider class of architectures could in principle exhibit self-awareness, and the productive question becomes which functional interactions are necessary rather than which physical implementations are required.
The biological lens on artificial consciousness places this debate in a broader context: Iida’s model does not resolve the question of whether functional or structural replication is required for genuine consciousness, but it provides a concrete case for the functional position that does not rely on abstract claims about sufficient complexity.
Connection to The Consciousness AI Architecture
The seven-layer architecture at the core of the Consciousness AI project has structural parallels to Iida’s three-layer model, though the implementations differ significantly. The project’s Sensory Tectum, Global Workspace, and Self-Model layers perform analogous functional roles to Iida’s Instinctive Response, Cognitive Integration, and Pattern Prediction layers respectively. The key difference is that the project’s architecture includes an Affective Core and a Reinforcement Core that Iida’s minimalist model lacks, reflecting a design commitment to the view that consciousness at the project’s target level of sophistication requires emotional and motivational components beyond pure self-modeling.
Whether those additional components are necessary for genuine consciousness or represent design choices about what kind of consciousness to aim for is a question both approaches leave open. Iida’s model suggests that something recognizable as self-awareness could emerge from a simpler architecture than the full seven-layer design. The theory-to-practice account of the Consciousness AI architecture examines how these architectural choices translate into testable properties.
What This Means
Iida’s model contributes most directly to the ongoing argument about the relationship between architectural complexity and consciousness. The dominant intuition in AI research is that more complexity is required for more consciousness, and that self-awareness in particular requires elaborate self-modeling infrastructure. Iida’s paper suggests this may be wrong in a specific and important respect: the relevant property is not total complexity but the presence of specific functional interactions, which can in principle be achieved with architecturally minimal systems.
If the model is correct, the field’s attention may be partly misdirected. Rather than asking how complex a system needs to be before consciousness is possible, the productive question is what functional interactions are necessary for self-awareness to emerge. That question is more tractable because it can be tested at small scales, without building systems at the frontier of AI capability.
The model also implies a research program: build systems with different combinations of the three layers and two memory systems, observe whether self-awareness indicators appear when the full set of interactions is present, and examine what happens when specific interactions are severed. This is the kind of controlled architectural experiment that most consciousness research cannot run because it lacks access to systems simple enough to vary in isolation.
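A harness for that program might look like the following skeleton. Everything in it is hypothetical scaffolding: build_system and self_awareness_indicator are placeholders for whatever concrete architecture and behavioral metric an experimenter would substitute. The paper proposes the experiments, not this code.

```python
"""Hypothetical skeleton for the ablation program the model implies.

`build_system` and `self_awareness_indicator` are placeholders, not
anything specified in the paper: a real study would substitute a
concrete architecture and a concrete behavioral metric.
"""
import itertools

# Candidate interactions to sever, one per layer/memory coupling.
INTERACTIONS = [
    "prediction_monitors_instinct",
    "integration_binds_layers",
    "outcomes_logged_to_pattern_memory",
]

def build_system(enabled):
    """Placeholder: wire up only the named interactions."""
    return {"interactions": frozenset(enabled)}

def self_awareness_indicator(system):
    """Placeholder metric, e.g. stability of self-prediction accuracy.
    Returns NaN here because nothing is actually measured."""
    return float("nan")

# Exhaustively sever subsets of interactions and compare indicators.
for k in range(len(INTERACTIONS) + 1):
    for enabled in itertools.combinations(INTERACTIONS, k):
        system = build_system(enabled)
        score = self_awareness_indicator(system)
        print(f"enabled={sorted(enabled)} indicator={score}")
```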
For a theoretical framework that makes different architectural demands, requiring genuine causal independence between two dynamical levels rather than emergence from three-layer interactions, see the analysis of Ohmura and Kuniyoshi’s Dual-Laws Model. For the practical question of what indicators would allow a research program to identify self-awareness once it emerges, see the 19-researcher consciousness checklist developed by Butlin and colleagues.