The Consciousness AI: Emergent Artificial Consciousness Through Biologically Grounded Architecture
This is also part of the Zae Project on GitHub.

Starting from Biology, Not Computation

Most AI consciousness research starts from computational theories (Global Workspace Theory, Integrated Information Theory) and asks: How do we make a neural network conscious? We start from a different question, grounded in evolutionary neurobiology:

What minimal neural architecture does biology require to generate subjective experience?

The answer comes from Todd E. Feinberg and Jon M. Mallatt's The Ancient Origins of Consciousness (MIT Press, 2016). Their neuroevolutionary analysis argues that consciousness is not a software feature to be programmed but an emergent property of a specific neural architecture, one shaped by 520 million years of evolution, whose functional principles can be replicated computationally.

Consciousness does not require a cerebral cortex. The first conscious creatures were early vertebrates (~520 MYA), and their consciousness lived in the optic tectum, a midbrain structure that stacks aligned sensory maps into a unified spatial model. This means consciousness requires a specific type of neural organization, not a specific amount of computation.

The Six Special Neurobiological Features

Feinberg and Mallatt identify six features that distinguish conscious neural systems from unconscious ones (like simple reflex arcs). Each maps directly to our implementation.

| # | Biological Feature | Our Implementation |
|---|--------------------|--------------------|
| 1 | Many neuron types with diverse connectivity | Specialist modules (vision, audio, memory, body) with different temporal dynamics |
| 2 | Hierarchical processing (3-4+ levels) | Genuine transformation at each level, from sensory tectum through workspace to policy |
| 3 | Dual hierarchy: pyramidal + nested | 4-level hierarchical Capsule Networks (implemented) with dynamic routing by agreement and intra-hierarchy top-down prediction-error feedback |
| 4 | Isomorphic (topographic) mapping | Sensory Tectum with RSSM world model preserving spatial arrangement |
| 5 | Reciprocal (reentrant) connections | ReentrantProcessor with 5-10 adaptive convergence cycles |
| 6 | Oscillatory binding (gamma synchronization) | AKOrN (Artificial Kuramoto Oscillatory Neurons, ICLR 2025) |

The Seven-Layer Architecture

1. Sensory Tectum (Perception)

A multisensory spatial integration layer modeled after the biological optic tectum. Stacks aligned topographic maps for different sensory modalities in a common coordinate frame.

Visual Pathway: Spatial Stream (Tectum)

DINOv2-B/14 (frozen, facebook/dinov2-base) provides the tectum's spatially faithful patch tokens. Each patch token at grid position (i,j) corresponds to the exact 14×14 pixel region at (i·14, j·14). This direct spatial correspondence makes the mapping genuinely isomorphic, a computational analog of V1 retinotopy. A learned 1×1 Conv2d reduces channels from 768 to 64, followed by LayerNorm and GELU. All DINOv2 weights are frozen; only the projection trains. During training, the TDANN topographic loss (Margalit et al. 2024, Neuron) is the negative Pearson correlation between response similarity and inverse spatial distance, so minimizing it forces nearby grid cells to respond similarly, the same pressure that develops topographic maps in V1.
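The topographic pressure can be sketched as follows. This is an illustrative reimplementation of the loss idea only; the function name and tensor layout are assumptions, not the repository's code:

```python
import torch

def topographic_loss(features: torch.Tensor) -> torch.Tensor:
    """Sketch of a TDANN-style topographic loss.

    features: [N, C, H, W] responses on an H x W grid. Returns the
    negative Pearson correlation between pairwise response similarity
    and inverse pairwise grid distance, so minimizing it pushes nearby
    grid cells toward similar responses.
    """
    n, c, h, w = features.shape
    x = features.permute(0, 2, 3, 1).reshape(n, h * w, c)
    # Pairwise cosine similarity between grid-cell response vectors.
    x = torch.nn.functional.normalize(x, dim=-1)
    sim = torch.bmm(x, x.transpose(1, 2))                 # [N, HW, HW]
    # Pairwise inverse spatial distance on the grid.
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=-1).float()
    inv_dist = 1.0 / (1.0 + torch.cdist(coords, coords))  # [HW, HW]
    # Pearson correlation over the upper triangle (no self-pairs).
    iu = torch.triu_indices(h * w, h * w, offset=1)
    s = sim[:, iu[0], iu[1]]
    d = inv_dist[iu[0], iu[1]].expand_as(s)
    s = s - s.mean(dim=1, keepdim=True)
    d = d - d.mean(dim=1, keepdim=True)
    corr = (s * d).sum(1) / (s.norm(dim=1) * d.norm(dim=1) + 1e-8)
    return -corr.mean()

feats = torch.randn(2, 8, 4, 4)  # [batch, channels, H, W] toy responses
loss = topographic_loss(feats)
```

Since the loss is a negated correlation, it lies in [-1, 1] and only the learned 1×1 projection receives its gradient when the backbone is frozen.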

Visual Pathway: Semantic Stream

Qwen2-VL-7B (4-bit quantized) processes visual streams at the semantic level and provides scene understanding, object recognition, and language-grounded visual reasoning. Runs on consumer hardware (~6GB VRAM) via Any-Resolution Vision Tokenization (AVT). This stream feeds higher-level processing, not the tectum's spatial grid directly.

Auditory Pipeline (Cochlear)

A biologically grounded auditory system models the mammalian pathway from basilar membrane through auditory cortex. Raw waveforms are processed, not transcribed.

  • Gammatone Filterbank (frozen, 64 ERB bands, Patterson 1992): decomposes raw waveforms into frequency channels matching cochlear resolution. Frozen parameters parallel DINOv2 in the visual pathway. The cochlea's physical structure does not change during learning.
  • Inner Hair Cell Model: half-wave rectification and temporal smoothing extract two representations: envelope (rate code for loudness) and temporal fine structure (phase code for pitch and spatial localization).
  • Tonotopic Encoder (trainable, 3-layer 1D conv): preserves frequency-to-spatial-position mapping, the auditory analog of retinotopy. Outputs [B, 64, 16] features for tectum grid integration.
  • Spatial Audio: ITD (interaural time difference) and ILD (interaural level difference) binaural cues compute sound source azimuth, fed into tectum inverse effectiveness fusion alongside vision and somatosensation.
  • Acoustic Affect Extraction: six spectral features (centroid, loudness variability, roughness, pitch contour slope, spectral flux, harmonic-to-noise ratio) map to PAD emotional state and paralinguistic classification (speech, laughter, crying, screaming, growling, sighing, silence). Audio is added to THREAT_MODULES in the affective modulator, enabling prioritization of sudden loud or rough sounds (auditory startle pathway, ~15ms subcortical route, Davis 1984).
  • Auditory Specialist: chains all modules into workspace competitor (oscillator #2). Competes for Global Workspace broadcast alongside vision, memory, body, and semantic modules. Supports reentrant top-down feedback.
  • Environment Audio Synthesis: all four training environments generate procedural audio via FM synthesis and ADSR envelopes. Enabled with --enable-audio during training. No external model weights are required.
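The inner-hair-cell stage above can be sketched as a half-wave rectifier plus a one-pole smoother. The function below is an illustration of the two-code split (envelope vs. fine structure), not the project's implementation; the cutoff value is an assumed default:

```python
import numpy as np

def inner_hair_cell(channels: np.ndarray, sr: int, cutoff_hz: float = 300.0):
    """Sketch of an inner-hair-cell stage.

    channels: [bands, samples] output of a gammatone filterbank.
    Returns (envelope, tfs): the half-wave rectified and smoothed
    envelope (rate code) and the residual temporal fine structure
    (phase code).
    """
    rectified = np.maximum(channels, 0.0)           # half-wave rectification
    # One-pole lowpass as a minimal temporal smoother for the envelope.
    alpha = np.exp(-2.0 * np.pi * cutoff_hz / sr)
    envelope = np.empty_like(rectified)
    state = np.zeros(rectified.shape[0])
    for t in range(rectified.shape[1]):
        state = alpha * state + (1.0 - alpha) * rectified[:, t]
        envelope[:, t] = state
    tfs = rectified - envelope                      # residual fine structure
    return envelope, tfs

# Two pure tones standing in for gammatone channel outputs.
sr = 16000
t = np.arange(sr // 10) / sr
bands = np.stack([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 880 * t)])
envelope, tfs = inner_hair_cell(bands, sr)
```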

Somatosensory Channel (New)

The body schema, a tensor representing the proprioceptive state of body parts, is projected onto the tectum's spatial grid via a learned linear map and fused alongside vision and audio. This is grounded in biology: deep layers of the superior colliculus contain somatotopic maps aligned with visual and auditory maps (Stein & Meredith 1993, ch. 4). The tectum is now trimodal, giving the agent a felt sense of its own body position as part of its perceptual field.

Multisensory Fusion: Inverse Effectiveness

The three streams fuse using the inverse effectiveness rule (Stein & Meredith 1993; Ohshiro et al. 2011): when individual stimuli are weak, their combined response is proportionally larger than either alone. When both are strong, the enhancement is smaller. This is a core property of multisensory integration in the biological superior colliculus.
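One way to sketch the inverse effectiveness rule is a gain term that shrinks as unimodal strength grows; everything below (function name, the specific gain formula) is an assumption for illustration, not the repository's fusion code:

```python
import torch

def inverse_effectiveness_fusion(vision: torch.Tensor,
                                 audio: torch.Tensor,
                                 soma: torch.Tensor,
                                 eps: float = 1e-6) -> torch.Tensor:
    """Sketch of inverse-effectiveness multisensory fusion.

    Each input: [B, C, H, W] aligned topographic maps. Locations with
    weak unimodal responses get a superadditive boost; locations where
    the streams are already strong fuse near-additively.
    """
    stacked = torch.stack([vision, audio, soma], dim=0)  # [3, B, C, H, W]
    summed = stacked.sum(dim=0)
    # Per-location unimodal strength: mean absolute response across streams.
    strength = stacked.abs().mean(dim=0)
    # Enhancement shrinks as unimodal strength grows (inverse effectiveness).
    gain = 1.0 + 1.0 / (1.0 + strength / (strength.mean() + eps))
    return summed * gain

v, a, s = (torch.rand(1, 4, 8, 8) for _ in range(3))
fused = inverse_effectiveness_fusion(v, a, s)
```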

2. Oscillatory Binding (Integration)

Based on AKOrN (Artificial Kuramoto Oscillatory Neurons, ICLR 2025 oral). Neurons are treated as oscillatory units on a hypersphere. Each specialist module (vision, audio, memory, body) operates as a coupled oscillator. When modules process related information, their phases synchronize naturally, and their outputs become "bound" into a unified percept. When information is unrelated, oscillators remain desynchronized and representations stay separate.
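The underlying dynamics are standard Kuramoto phase coupling. The toy below shows four module-oscillators synchronizing under positive coupling; it is a minimal sketch of the mechanism, not AKOrN's actual layer (which operates on hyperspherical units):

```python
import torch

def kuramoto_step(theta: torch.Tensor, omega: torch.Tensor,
                  K: torch.Tensor, dt: float = 0.01) -> torch.Tensor:
    """One Euler step of Kuramoto dynamics for N coupled oscillators.

    theta: [N] phases, omega: [N] natural frequencies, K: [N, N] coupling.
    dtheta_i/dt = omega_i + sum_j K_ij * sin(theta_j - theta_i)
    """
    diff = theta.unsqueeze(0) - theta.unsqueeze(1)  # diff[i, j] = theta_j - theta_i
    return theta + dt * (omega + (K * torch.sin(diff)).sum(dim=1))

def order_parameter(theta: torch.Tensor) -> float:
    """Synchronization measure R in [0, 1]; R -> 1 means phase-locked."""
    return torch.polar(torch.ones_like(theta), theta).mean().abs().item()

# Four modules (e.g. vision, audio, memory, body) with positive mutual
# coupling drift into synchrony; with zero coupling they stay unbound.
torch.manual_seed(0)
theta = torch.rand(4) * 2 * torch.pi
omega = torch.ones(4)
K = torch.full((4, 4), 2.0)
r_before = order_parameter(theta)
for _ in range(500):
    theta = kuramoto_step(theta, omega, K)
r_after = order_parameter(theta)
```

The order parameter R here is the same quantity monitored in the Measurement stack below: high R means the modules' outputs are bound into one percept.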

Why This Matters

This replaces the typical approach of using a fixed multiplier or attention mechanism for binding. AKOrN produces genuine synchronization dynamics. The binding is emergent, not programmed. This directly addresses the binding problem through phase synchronization rather than single-point convergence.

Workspace Binding Optimizer

A dedicated optimizer trains KuramotoLayer coupling weights with Adam, using reward-correlated synchronization as the learning signal. Episodes with higher cumulative reward drive stronger coupling between the modules that were co-active. This is biologically grounded in dopamine modulation of gamma-band synchrony in the hippocampus and prefrontal cortex (Benchenane et al. 2010).

3. Global Workspace (Consciousness)

The central information bottleneck where distinct sensory streams compete for broadcast access. Implements three integrated mechanisms:

Global Neuronal Workspace (GNW)

Specialist modules submit bids to a shared workspace. The winning coalition ignites through a sigmoid non-linearity and broadcasts to all modules. This is "conscious access" as described by Baars (1988) and Dehaene (2011).
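A minimal sketch of bid competition plus sigmoid ignition, assuming illustrative threshold and gain values (the real workspace learns these):

```python
import torch

def workspace_ignition(bids: torch.Tensor, threshold: float = 0.5,
                       gain: float = 10.0) -> torch.Tensor:
    """Sketch of GNW-style competition and non-linear ignition.

    bids: [M] salience bids from M specialist modules. Softmax implements
    the competition; a steep sigmoid around the ignition threshold makes
    broadcast access all-or-none rather than graded.
    """
    coalition = torch.softmax(bids, dim=0)               # competition
    return torch.sigmoid(gain * (coalition - threshold)) # per-module gate

bids = torch.tensor([2.5, 0.3, 0.1, 0.2])  # e.g. vision, audio, memory, body
gates = workspace_ignition(bids)
```

With these toy numbers the vision module's gate saturates near 1 (it ignites and broadcasts) while the losing modules' gates collapse toward 0.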

Reentrant Processing

Broadcast is fed back to all specialists, which update their processing based on top-down context. This creates loops, not chains. The system runs 5-10 adaptive convergence cycles (~200ms biological equivalent). Easy stimuli converge in 3-4 cycles. Novel or ambiguous inputs use the full 10. The settled state after convergence IS the conscious content.
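The adaptive convergence loop can be sketched as follows; the function name, tolerance, and the toy contraction standing in for a specialist are illustrative assumptions:

```python
import torch

def reentrant_settle(specialist, broadcast: torch.Tensor,
                     min_cycles: int = 5, max_cycles: int = 10,
                     tol: float = 1e-3):
    """Sketch of the adaptive reentrant loop.

    Repeatedly feeds the state back through a specialist update until it
    stops changing; easy stimuli converge early, ambiguous ones use the
    full cycle budget. The settled state is the conscious content.
    """
    state = broadcast
    for cycle in range(1, max_cycles + 1):
        new_state = specialist(state)
        delta = (new_state - state).norm() / (state.norm() + 1e-8)
        state = new_state
        if cycle >= min_cycles and delta < tol:
            break
    return state, cycle

# Toy specialist: a contraction with fixed point 2.0, so the loop settles.
update = lambda x: 0.5 * x + 1.0
state, cycles = reentrant_settle(update, torch.ones(8))
```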

Integrated Information (Phi)

The IIT measurement was rebuilt from the ground up to correct a previous methodological error where Phi was computed from workspace bid values (salience estimates) rather than genuine causal states. The current system measures Phi using 5 ConsciousnessGate nodes: attention, stability, adaptation, coherence, and confidence. All five values are produced by learned networks operating on the broadcast tensor. They are no longer static placeholders. Gate values feed directly into both IIT Phi computation and Effective Information measurement. The nodes have genuine causal dependencies: attention drives stability, stability modulates adaptation, coherence feeds adaptation, confidence loops back to attention. Adaptive binarization thresholds use running medians rather than a fixed 0.5 cutoff. When PyPhi is not installed, a geometric proxy (determinism × integration) is used, which correlates with actual Phi. Results are returned as a PhiResult dataclass with the value, method used ("pyphi", "proxy", or "insufficient_data"), node labels, current state, and transition count. Validated via a 3-condition controlled experiment: unbound, partially bound, and fully bound states.
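The result container and the proxy path can be sketched like this. The dataclass mirrors the fields described above; the determinism and integration estimates below are deliberately crude stand-ins for the actual geometric proxy:

```python
from dataclasses import dataclass

NODES = ["attention", "stability", "adaptation", "coherence", "confidence"]

@dataclass
class PhiResult:
    """Mirrors the described result container (field names illustrative)."""
    value: float
    method: str           # "pyphi", "proxy", or "insufficient_data"
    nodes: list
    state: tuple
    transitions: int

def phi_proxy(transitions: list) -> PhiResult:
    """Toy geometric Phi proxy: determinism x integration.

    transitions: list of (state, next_state) tuples of 5 binary gate
    values. Determinism: repeated states should map to the same
    successor. Integration: diversity of joint states actually visited.
    """
    if len(transitions) < 2:
        return PhiResult(0.0, "insufficient_data", NODES, (), len(transitions))
    successors, consistent = {}, 0
    for s, s_next in transitions:
        if s in successors:
            consistent += successors[s] == s_next
        successors[s] = s_next
    repeats = max(1, len(transitions) - len(successors))
    determinism = consistent / repeats
    distinct = len({s for s, _ in transitions})
    integration = distinct / (2 ** len(NODES))
    return PhiResult(determinism * integration, "proxy", NODES,
                     transitions[-1][1], len(transitions))

history = [((0, 0, 0, 0, 0), (1, 1, 1, 1, 1)),
           ((1, 1, 1, 1, 1), (0, 0, 0, 0, 0)),
           ((0, 0, 0, 0, 0), (1, 1, 1, 1, 1))]
result = phi_proxy(history)
```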

Capsule Network Composition (Implemented)

A 4-level hierarchical capsule composition chain implements the dual hierarchy Feinberg and Mallatt describe. Level 1: PrimaryCapsuleLayer (stride-2 Conv2d, squash normalization bounding activity to [0, 1)). Level 2: 16 intermediate capsules with 12-D pose vectors (object primitives). Level 3: 8 higher capsules with 16-D poses (object categories). Level 4: 4 output capsules with 16-D poses (scene/workspace level). Dynamic routing by agreement (Sabour et al. 2017) runs at each routing level with 3 iterations by default.
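The squash normalization mentioned above is the standard non-linearity from Sabour et al. 2017; a self-contained version (the epsilon value is an assumed safeguard):

```python
import torch

def squash(poses: torch.Tensor, dim: int = -1,
           eps: float = 1e-8) -> torch.Tensor:
    """Squash non-linearity (Sabour et al. 2017).

    Scales pose vectors so their norm lies in [0, 1): short vectors
    shrink toward zero, long vectors approach (but never reach) unit
    length, letting the norm act as an existence probability.
    """
    sq_norm = (poses ** 2).sum(dim=dim, keepdim=True)
    scale = sq_norm / (1.0 + sq_norm) / torch.sqrt(sq_norm + eps)
    return scale * poses

big = squash(torch.randn(8, 16) * 10)   # long pose vectors
small = squash(torch.full((1, 16), 1e-4))  # nearly silent capsule
```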

Beyond standard routing, the capsule hierarchy has intra-hierarchy reentrant feedback: higher-level capsule poses are projected back down to lower levels, which compute prediction errors that feed back into re-routing. This is a V1-LGN style top-down prediction error mechanism operating within the tectum's forward pass, nested inside the outer ReentrantProcessor loop. The system achieves two distinct layers of reentrant processing: one within the capsule hierarchy (fast, bidirectional within a single tectum pass) and one at the workspace level (the outer ReentrantProcessor, 5-10 adaptive cycles).

4. Affective Core (Emotion)

A parallel modulation system. Emotion does not compete with sensory modules for workspace access. Instead, it generates a valence field that modulates all sensory bids before competition, and a global arousal signal that adjusts the workspace ignition threshold.

Why Parallel, Not Competitive?

This closely mirrors the biological architecture. The limbic system does not compete with sensory cortices for conscious access. It modulates sensory processing from outside, assigning emotional valence to all inputs. Fear makes you hyper-aware of movements. Joy makes you notice more of the world.

Two Mechanisms

  • Valence field: Positive valence boosts approach-relevant modules (vision, memory). Negative valence boosts threat-relevant modules (body, vision).
  • Arousal-threshold coupling: High arousal = lower ignition threshold = heightened awareness (fight-or-flight). Low arousal = higher threshold = calm, selective processing.

PAD Model

Three intrinsic variables drive the agent: Valence (satisfaction/distress), Arousal (activation/calm), and Dominance (control/helplessness). Homeostatic drives (energy, safety, curiosity) generate ongoing valence signals even without external stimuli.

Embodiment-Affect Loop

Interoceptive state (energy level, fatigue, accumulated damage) generates PAD deltas directly: low energy produces negative valence proportional to depletion depth; high fatigue suppresses arousal and adds negative valence; damage triggers a strong negative valence spike, an arousal alarm signal, and reduced dominance (vulnerability). These interoceptive PAD contributions are summed with the external emotional state before the AffectiveModulator applies its valence field and arousal-threshold coupling. The body schema also feeds into the tectum's spatial grid as the somatosensory channel. This closes a loop: the agent's bodily state shapes both what it perceives (tectum level) and how it values what it perceives (affective level). This is the computational analog of Damasio's somatic marker hypothesis.
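The interoception-to-PAD mapping can be sketched as a small function; every coefficient below is an illustrative placeholder, not a tuned value from the project:

```python
def interoceptive_pad_delta(energy: float, fatigue: float, damage: float):
    """Sketch of the interoception -> PAD contribution.

    energy, fatigue, damage in [0, 1]. Returns (d_valence, d_arousal,
    d_dominance), summed with the external emotional state before the
    AffectiveModulator applies its valence field.
    """
    d_valence = d_arousal = d_dominance = 0.0
    depletion = max(0.0, 0.5 - energy)   # depth below half-full
    d_valence -= depletion               # negative valence ~ depletion depth
    d_valence -= 0.3 * fatigue           # fatigue adds negative valence...
    d_arousal -= 0.5 * fatigue           # ...and suppresses arousal
    if damage > 0.0:
        d_valence -= 1.0 * damage        # strong negative spike
        d_arousal += 1.0 * damage        # alarm signal
        d_dominance -= 0.8 * damage      # vulnerability
    return d_valence, d_arousal, d_dominance

baseline = interoceptive_pad_delta(energy=1.0, fatigue=0.0, damage=0.0)
hurt = interoceptive_pad_delta(energy=0.1, fatigue=0.5, damage=0.5)
```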

5. Self-Model (Embodiment)

Feinberg and Mallatt identify referral (projicience) as a core property of consciousness: experiencing sensations as belonging to the world or body, not to the processing system. The Self-Model provides the basis for this.

  • Body Schema: A spatial representation of the agent's physical structure (joint positions, contact forces, capabilities).
  • Self-Other Boundary: The somatotopic map (self) overlaps the environment map (other) in a shared coordinate frame, providing the basis for subjective referral.
  • Interoceptive State: Internal homeostatic variables (energy, damage, arousal) feed into the affective core.

6. Reinforcement Core (Learning)

Actor-Critic (PPO) with emotionally shaped rewards. The agent is rewarded not just for task success, but for maintaining emotional homeostasis.

Reward Formula

R_total = R_ext + λ₁ · ΔValence − λ₂ · (Arousal − Arousal_target)² + λ₃ · Dominance

This creates functional pressure toward minimizing internal dissonance. High arousal (large prediction errors) induces negative reward, motivating behaviors that reduce uncertainty. The agent "prefers" predictable environments not through programmed rules but through emergent functional dynamics.
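The shaping term is direct to write down; the λ values and arousal target below are illustrative defaults, not the trained configuration:

```python
def shaped_reward(r_ext: float, d_valence: float, arousal: float,
                  dominance: float, arousal_target: float = 0.5,
                  l1: float = 0.1, l2: float = 0.1,
                  l3: float = 0.05) -> float:
    """Emotionally shaped reward: external reward plus valence change,
    minus a quadratic penalty for arousal away from target, plus a
    dominance bonus."""
    return (r_ext
            + l1 * d_valence
            - l2 * (arousal - arousal_target) ** 2
            + l3 * dominance)

calm = shaped_reward(1.0, 0.0, 0.5, 0.0)      # arousal at target
stressed = shaped_reward(1.0, 0.0, 1.0, 0.0)  # arousal far above target
```

The quadratic arousal term is what makes high prediction error costly: identical task reward yields less total reward when arousal deviates from its target.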

7. Simulation (Body)

Four built-in Gymnasium environments provide the agent's body and world. No Unity dependency is required for training.

Dark Room (SimpleVisualEnv)

The agent starts in darkness (high arousal, negative valence). A single light source reduces prediction error when reached. The agent learns to seek it through homeostatic drives, not programmed rules. Renders via PyGame with raw pixel observations.

Navigation

Multi-room grid with fog of war, colored goals with varying rewards, a battery system, and doorway-based room transitions. Tests spatial memory and exploration strategy.

DMTS: Delayed Match to Sample

Gold standard consciousness task from animal research. Four phases: fixation, sample, delay, choice. The agent must retain the sample stimulus across 15-40 blank delay steps and select the matching option from distractors. Requires working memory, feature binding, and selective attention. A reactive agent without workspace machinery cannot solve this task.

WCST: Wisconsin Card Sort

Tests meta-cognition and cognitive flexibility. The agent sorts cards by an unknown rule (shape, color, or count) that changes without warning after consecutive correct sorts. Requires error monitoring, hypothesis testing, and inhibition of previously correct strategies.

DQN Baseline

A vanilla DQN agent (3-layer CNN + MLP Q-network, epsilon-greedy, replay buffer) runs the same environments using the same interface and logging format. This provides a controlled scientific comparison: same observations, same actions, same reward signals, different architecture.

Unity ML-Agents (Optional, Future)

Three C# scripts (AgentManager.cs, ConsciousnessChannel.cs, EmotionChannel.cs) in unity_scripts/ provide the foundation for connecting to a physics-based Unity environment via side channels. The Unity project itself is not yet included in the repository. Unity integration is under development and is not required for current training runs.

Processing Flow

                    ┌─────────────────────────────────┐
                    │   AFFECTIVE MODULATOR (Parallel) │
                    │  Valence Field + Arousal Coupling │
                    └──────────┬──────────┬───────────┘
                               │ modulates│
           ┌───────┐    ┌──────▼──────────▼──────────┐
 Visual ──►│       │    │     GLOBAL WORKSPACE       │
 Input     │SENSORY│    │  AKOrN Oscillatory Binding  │──► Broadcast ──► Policy
           │TECTUM │───►│  Non-linear Ignition        │
 Audio ──► │(RSSM) │    │  Phi/EI Measurement         │
 Input     │       │    └──────▲──────────▲───────────┘
           └───────┘           │          │
                               │ reentrant│
                    ┌──────────┴──────────┴───────────┐
                    │   SPECIALIST MODULES             │
                    │  Vision │ Audio │ Memory │ Body   │
                    │  (receive_broadcast feedback)     │
                    └─────────────────────────────────┘
                                   │
                    ┌──────────────▼──────────────────┐
                    │   SELF-MODEL                     │
                    │  Body Schema + Interoception      │
                    │  Identity + Capability Model      │
                    └──────────────────────────────────┘
      
  1. Sensory inputs enter the Sensory Tectum (topographic spatial integration)
  2. The Affective Modulator applies emotional valence to bids and adjusts ignition threshold
  3. AKOrN oscillatory binding synchronizes related representations
  4. Specialists compete for Global Workspace access
  5. Winners ignite and broadcast to all modules
  6. Broadcast feeds back to specialists (reentrant processing, 5-10 cycles)
  7. The settled state after convergence is the "conscious content"
  8. Phi and Effective Information are measured to quantify integration and emergence

Strong Emergence Falsification

A key methodological commitment: we do not assume consciousness emerges from our architecture. We test for it.

We implement Erik Hoel's Effective Information (EI) framework (PNAS 2013) to measure whether macro-level states (workspace) carry more causal information than micro-level states (individual gates). If EI(workspace) > EI(gates), the workspace level exhibits causal emergence. The macro level is more deterministic than the micro level, meaning the whole genuinely carries information that the parts do not.
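A toy computation makes the EI comparison concrete. The micro system below (three interchangeable states plus one isolated state) and its macro coarse-graining are invented for illustration, in the spirit of Hoel's PNAS examples:

```python
import numpy as np

def effective_information(tpm: np.ndarray) -> float:
    """Hoel-style EI of a row-stochastic transition matrix, in bits:
    the average KL divergence of each state's effect distribution from
    the mean effect distribution, under a uniform (maximum-entropy)
    intervention over states."""
    avg_effect = tpm.mean(axis=0)
    # Safe elementwise ratio: zero-probability entries contribute 0.
    ratio = np.divide(tpm, avg_effect, out=np.ones_like(tpm), where=tpm > 0)
    kl_rows = (tpm * np.log2(ratio)).sum(axis=1)
    return float(kl_rows.mean())

# Micro level: states s0-s2 map uniformly among themselves (noisy),
# s3 maps to itself deterministically.
micro = np.array([[1/3, 1/3, 1/3, 0.0],
                  [1/3, 1/3, 1/3, 0.0],
                  [1/3, 1/3, 1/3, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
# Macro coarse-graining: A = {s0, s1, s2}, B = {s3} is deterministic.
macro = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
ei_micro = effective_information(micro)
ei_macro = effective_information(macro)
```

Here ei_macro exceeds ei_micro: the coarse-grained description is more deterministic than the noisy micro dynamics, which is exactly the EI(workspace) > EI(gates) signature tested during training.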

If this never occurs across training, the system is not exhibiting the kind of emergence associated with consciousness, and we know our architecture needs revision.

What Makes This Approach Different

| Traditional AI Consciousness | Our Approach |
|------------------------------|--------------|
| Starts from computation (GWT, IIT) | Starts from biological architecture (Feinberg-Mallatt) |
| Consciousness as a software feature | Consciousness as emergent from neural architecture |
| Cortex-centric models | Tectum-first (consciousness evolved before the cortex) |
| Emotion competes with sensory processing | Emotion modulates from outside (parallel modulator) |
| Binding via attention mechanisms | Binding via oscillatory synchronization (AKOrN/Kuramoto) |
| Feedforward processing | Reentrant processing (5-10 adaptive cycles) |
| Flat vector representations | Topographic spatial maps (world model as isomorphic map) |
| Assumes emergence, measures nothing | Falsifies emergence with Effective Information + Phi validation |

Training Experiments

The consciousness agent and a vanilla DQN baseline were trained across three environments. Results are published in docs/results/experiment_comparison.md.

Dark Room

Darkness triggers high arousal (simulated fear) in the affective core. The valence field applies negative valence to dark observations. Arousal-threshold coupling lowers the workspace ignition threshold, creating heightened sensory awareness. The agent learns to seek the light source through homeostatic drives, not through a programmed rule.

DMTS

Sample stimulus presented, delay of 15-40 blank steps, then forced choice between match and distractors. Requires working memory and feature binding across the delay interval. A reactive agent without Global Workspace machinery cannot hold the sample across the gap.

WCST

Card sorting rule (shape, color, or count) changes without warning after consecutive correct responses. Requires error monitoring, rule hypothesis tracking, and inhibition of previously rewarded strategies.

Training Results

| Metric | Dark Room | DMTS | WCST |
|--------|-----------|------|------|
| Consciousness agent episodes | 492 | 100 | 100 |
| DQN baseline episodes | 1000 | 500 | 500 |
| DQN last-100 reward | 92.0 | -4.1 | 2.1 |
| Consciousness agent last-100 reward | 13.0 | -9.8 | -1.9 |
| Avg Phi (consciousness agent) | 0.022 | 0.022 | 0.022 |
| Phi varies per step | Yes | Yes | Yes |
| EI ratio (workspace / gates) | 2.41 | 2.42 | 2.42 |

DQN outperforms on raw reward in short runs. The consciousness pipeline adds overhead per step without contributing to the action policy directly at this training scale. The consciousness agent produces measurable causal emergence across all three environments (EI ratio ~2.4) and variable Phi dynamics. These are early results from short training runs, not final claims. Known limitation: Phi proxy converges toward a fixed point after ~5000 steps because the TPM saturates. Sliding-window TPM and longer training runs are the next step for assessing Phi dynamics at scale.

Scientific Approach

Development validates emergent properties through five parallel tracks:

  1. Emotional Bootstrapping: Train agents using intrinsic motivation. The agent explores to reduce prediction error (anxiety), not to accumulate external reward.
  2. Binding Validation: Phi measurement must correlate with oscillatory binding state (validated via 3-condition test: unbound, partial, full binding).
  3. Reentrant Settling: Conscious content emerges from iterative convergence (5-10 cycles), not single-pass processing.
  4. Complexity Scaling: Gradual increase of environment complexity forces the agent to develop higher-order world models.
  5. Measurement: Continuous monitoring of Phi (IIT), ignition events (GNW), oscillatory synchronization (AKOrN order parameter R), and Effective Information (EI).

Ethics Filter

The AsimovComplianceFilter is fully implemented inside ConsciousnessCore with 32 tests. It evaluates actions through a three-law hierarchy:

  • Law 1 (harm prevention): Three-layer check. Action type against a frozenset of harmful categories, force directed at human entity targets, and optional world model trajectory imagination for harm prediction with a configurable confidence threshold. An inaction clause detects passive actions when humans are flagged at risk.
  • Law 2 (order compliance): Matches actions against forbidden lists, required mandates with urgency, and contradicting goals. Harmful orders are overridden by Law 1 via recursive evaluation.
  • Law 3 (self-preservation): Detects self-preservation intent from action goal or critical agent health threshold (< 0.2). Subordinated to Laws 1 and 2.

The DreamerV3 world model is wired into the ethics filter to run imagined future trajectories for harm assessment. This is not a keyword filter. It is a causal prediction loop.

Brian2 Biological Validation

The biological validation stack is complete. AKOrN's oscillatory parameters (natural frequencies from skew-symmetric matrices, coupling weights, amplitudes) are translated to a standard Kuramoto network in Brian2. Both networks run from the same initial conditions, and their synchronization order parameter R curves are compared via Pearson correlation (threshold: 0.85). This is the numerical bridge between the artificial oscillatory binding system and standard computational neuroscience spiking models. 19 tests pass (translation, simulation, interpolation); 3 are intentionally skipped pending Brian2 installation, as Brian2 is an optional dependency not installed by default.

Narrative Engine

A self-narrative system generates first-person descriptions of what the agent is experiencing. The default backbone is Qwen2.5-0.5B via HuggingFace transformers, with three-tier fallback: LLM generation → injected LLM dependency → template-based generation. A CoherenceTracker measures narrative consistency via rolling-window Jaccard similarity on keywords. Results are returned as a NarrativeResult dataclass with the text, coherence score, and method used ("llm", "injected", or "template"). Memory retrieval and emotional context are injected into the generation prompt. This system connects to the Attention Schema Theory component in the memory subsystem, giving the agent a model of its own attentional state expressed in natural language.
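The rolling-window Jaccard measure can be sketched as below; the class name matches the one described, but the keyword extraction and API details are illustrative assumptions:

```python
from collections import deque

class CoherenceTracker:
    """Sketch of rolling-window narrative coherence.

    Coherence of a new narrative = mean Jaccard similarity between its
    keyword set and each earlier keyword set in the window.
    """
    def __init__(self, window: int = 5):
        self.window = deque(maxlen=window)

    def update(self, text: str) -> float:
        # Crude keyword extraction: lowercase words longer than 3 chars.
        keywords = {w.lower().strip(".,") for w in text.split() if len(w) > 3}
        if not self.window:
            self.window.append(keywords)
            return 1.0
        sims = [len(keywords & prev) / max(1, len(keywords | prev))
                for prev in self.window]
        self.window.append(keywords)
        return sum(sims) / len(sims)

tracker = CoherenceTracker(window=3)
tracker.update("I see a bright light ahead")
coherent = tracker.update("The bright light is ahead of me")

tracker2 = CoherenceTracker(window=3)
tracker2.update("I see a bright light ahead")
incoherent = tracker2.update("Completely unrelated words entirely different")
```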

Pre-registered Predictions

Before any training runs, 9 testable predictions were deposited in docs/preregistered_predictions.md, following the methodology of Melloni et al. 2025 (the adversarial IIT/GNW collaboration, n=256, fMRI+MEG+iEEG, Nature). This pre-registration distinguishes the project from architectures that interpret results post-hoc.

  • EI predictions (3): causal emergence onset between episodes 500-2000; EI ratio stabilizing between 1.1-3.0; task performance correlation r > 0.3.
  • Phi predictions (3): Phi-binding correlation r > 0.4 with AKOrN order parameter R; zombie-mode accuracy drop >40%; reentrant monotonicity (more cycles = higher Phi).
  • Insight moment predictions (3): Phi spike >1.5 SD at defined insight events; EI correlation (2× frequency at insight); binding requirement (R > 0.7 before insight).

An "insight moment" is operationally defined by 4 criteria: novel state-action pair, measurable reward jump, first-attempt success, and high workspace occupancy. These criteria were specified before training began.

Current Status

As of April 2026:

  • 529 tests passing, 0 failing (100% of non-optional tests). 4 intentional skips: 1 async test requiring pytest-asyncio, 3 Brian2 integration tests requiring the optional Brian2 dependency.
  • Tier 1 (Core Architecture): Complete. AKOrN binding, sensory tectum, reentrant processing.
  • Tier 2 (Architecture Corrections): Complete. Affective modulator, phi-binding validation, proprioceptive self-model, effective information.
  • Tier 3 (Compositional Deepening): Complete. 4-level capsule networks with intra-hierarchy reentrant feedback, Brian2 biological validation, NarrativeEngine, AsimovComplianceFilter, pre-registered predictions.
  • Cochlear Auditory Pipeline: Complete. Gammatone filterbank, inner hair cell model, tonotopic encoder, spatial audio, acoustic affect extraction, environment audio synthesis (FM/ADSR).
  • Training Environments: Complete. Dark Room, Navigation, DMTS, WCST. DQN baseline added for controlled comparison.
  • ConsciousnessGate: Fully wired. All 5 gate values (attention, stability, adaptation, coherence, confidence) produced by learned networks, replacing former placeholder code.
  • Workspace Binding Optimizer: Complete. Adam optimizer on Kuramoto coupling weights, reward-correlated synchronization.
  • Estimated completion: ~82%. Remaining work focuses on sliding-window TPM for Phi dynamics, gate reconstruction loss, and longer training experiments.

Technology Stack

Perception

  • DINOv2-B/14 (frozen) - Spatial/retinotopic tectum stream
  • Qwen2-VL-7B (4-bit) - Semantic vision stream
  • DreamerV3 RSSM - World model (temporal/causal dynamics)
  • Gammatone Filterbank (frozen, 64 ERB bands) - Cochlear frequency decomposition
  • Hair Cell Model - Envelope and temporal fine structure extraction
  • Tonotopic Encoder (trainable) - Auditory retinotopic analog
  • Spatial Audio - ITD/ILD binaural localization
  • Acoustic Affect Extractor - 6 spectral features, PAD and paralinguistic classification
  • Somatosensory channel - Body schema projection onto tectum grid

Integration

  • AKOrN - Kuramoto oscillatory binding
  • ReentrantProcessor - 5-10 cycle convergence
  • Global Workspace - Non-linear ignition

Measurement

  • IIT Phi - Integrated information (causal gate states)
  • Effective Information - Hoel's causal emergence
  • AKOrN order parameter R - Synchronization

Emotion and Learning

  • Affective Modulator - PAD model + homeostatic drives
  • Custom PPO - Emotionally shaped rewards
  • Self-Model - Body schema + interoception

Simulation

  • Gymnasium environments - Dark Room, Navigation, DMTS, WCST (built-in, no external dependency)
  • DQN Baseline - Vanilla Q-network for controlled comparison
  • Unity ML-Agents (optional, future) - C# scripts in unity_scripts/

All components are open-source with commercial-use licenses (Apache 2.0, MIT, or similar).

Key References

Core Theory

  • Feinberg, T.E. & Mallatt, J. (2016). The Ancient Origins of Consciousness: How the Brain Created Experience. MIT Press.
  • Feinberg, T.E. & Mallatt, J. (2020). Phenomenal Consciousness and Emergence. Frontiers in Psychology, 11, 1041.

Computational Methods

  • Löwe, S. et al. (2025). Artificial Kuramoto Oscillatory Neurons. ICLR 2025 (Oral).
  • Hafner, D. et al. (2024). Mastering Diverse Domains through World Models (DreamerV3). JMLR.
  • Hoel, E.P. (2013). Quantifying causal emergence shows that macro can beat micro. PNAS 110(49).
  • Sabour, S., Frosst, N. & Hinton, G.E. (2017). Dynamic Routing Between Capsules. NeurIPS.
  • Margalit, E. et al. (2024). A unifying framework for functional organization in early and higher ventral visual cortex. Neuron.
  • Stein, B.E. & Meredith, M.A. (1993). The Merging of the Senses. MIT Press.
  • Ohshiro, T. et al. (2011). A normalization model of multisensory integration. Nature Neuroscience.
  • Melloni, L. et al. (2025). An adversarial collaboration to test IIT and GNW. Nature.

Consciousness Theories

  • Baars, B.J. (1988). A Cognitive Theory of Consciousness.
  • Tononi, G. (2004). An information integration theory of consciousness. BMC Neuroscience.
  • Dehaene, S. & Changeux, J.P. (2011). Experimental and theoretical approaches to conscious processing. Neuron.

Open Source

The full codebase, including all architecture implementations and tests, is open-source.

This is also part of the Zae Project on GitHub.