The Consciousness AI - Artificial Consciousness Research Emerging Artificial Consciousness Through Biologically Grounded Architecture
This is also part of the Zae Project (Zae Project on GitHub)

The Cogitate Consortium Test: When IIT and GNW Faced Their Own Falsification Criteria

Two of the most influential theories of consciousness have now been tested against their own falsification criteria, by the researchers who built them, in a single preregistered study. The results, published in Nature on April 30, 2025 (Volume 642, Issue 8066, pages 133–142, DOI: 10.1038/s41586-025-08888-1), give neither side a victory and neither theory a clean defeat. They are something more methodologically significant: the first rigorous demonstration that the two theories most commonly cited in AI consciousness research do not, in their current forms, hold up under adversarial empirical scrutiny.

The Cogitate Consortium assembled 256 human participants and three measurement modalities (functional MRI, magnetoencephalography, and intracranial EEG) to test the core predictions of Integrated Information Theory (IIT), led in this study by Giulio Tononi, and Global Workspace Theory in its neuronal form (Global Neuronal Workspace, GNW), led by Stanislas Dehaene. The adversarial collaboration design required both teams to specify, in advance, which experimental results would count as falsification of their own theory. Neither team could move the goalposts after seeing the data.

That commitment is what makes the study matter. Adversarial preregistration turns an empirical test from a persuasion exercise into a genuine scientific constraint. The Cogitate Consortium is the first group in consciousness science to apply it at this scale, with this level of instrumentation, across both of the field’s dominant theoretical frameworks simultaneously.
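The logic of committing to falsification criteria in advance can be sketched in a few lines of Python. This is a minimal illustration, not the consortium's actual protocol: the class name, the measurement keys, and the 0.5 thresholds are all hypothetical, chosen only to show that a criterion fixed before the data arrive cannot be quietly rewritten afterwards.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Callable, Dict, List


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class PreregisteredPrediction:
    theory: str
    description: str
    # Returns True when the observed data meet the pre-agreed criterion.
    passes: Callable[[Dict[str, float]], bool]


def evaluate(predictions: List[PreregisteredPrediction],
             observed: Dict[str, float]) -> Dict[str, bool]:
    """Apply every frozen criterion to the same observed measurements."""
    return {f"{p.theory}: {p.description}": p.passes(observed)
            for p in predictions}


# Hypothetical criteria; the keys and thresholds are illustrative assumptions.
PREDICTIONS = [
    PreregisteredPrediction(
        "IIT", "sustained posterior synchronization",
        lambda d: d["posterior_sync"] >= 0.5),
    PreregisteredPrediction(
        "GNW", "ignition at stimulus offset",
        lambda d: d["offset_ignition"] >= 0.5),
]


def goalposts_can_move(prediction: PreregisteredPrediction) -> bool:
    """Try to replace a criterion after the fact; a frozen record refuses."""
    try:
        prediction.passes = lambda d: True  # attempted post hoc rewrite
    except FrozenInstanceError:
        return False
    return True
```

Calling evaluate(PREDICTIONS, measurements) with any dictionary of measurements applies the same criteria to both theories, and goalposts_can_move returns False for every entry because the records are immutable.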


What Each Theory Predicted

Integrated Information Theory (IIT), developed by Giulio Tononi at the University of Wisconsin, holds that consciousness corresponds to integrated information. IIT makes a specific anatomical prediction: conscious perception should be associated with sustained synchronization in the posterior cortical “hot zone,” a region spanning visual, parietal, and posterior temporal areas. This sustained activity reflects the integration of information across posterior circuits. When a participant consciously perceives a stimulus, posterior synchronization should be present and stable for as long as the percept lasts. When the stimulus is not consciously perceived, the signature should be reduced or absent.
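To make “sustained synchronization” concrete, here is a minimal sketch using the phase-locking value (PLV), one common measure of phase synchrony between two signals. It is offered as an illustration only: the source does not say which synchrony measure the consortium used, and the function names, the windowing, and the threshold are assumptions.

```python
import cmath
from typing import Sequence


def phase_locking_value(phases_a: Sequence[float],
                        phases_b: Sequence[float]) -> float:
    """PLV = |mean(exp(i * (phase_a - phase_b)))|, a value in [0, 1].

    1.0 means the phase difference is constant across samples;
    values near 0 mean the phase difference is scattered.
    """
    if len(phases_a) != len(phases_b) or len(phases_a) == 0:
        raise ValueError("phase sequences must be non-empty and equal length")
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)


def is_sustained(plv_per_window: Sequence[float], threshold: float) -> bool:
    """Synchrony counts as sustained only if every window clears the threshold.

    The threshold is a hypothetical parameter, not a value from the study.
    """
    return len(plv_per_window) > 0 and all(
        v >= threshold for v in plv_per_window)
```

Under this reading, a prediction of sustained synchronization fails if the PLV drops below the agreed threshold in any time window during which the stimulus is consciously perceived, not merely if the average is low.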

Global Workspace Theory, developed by Bernard Baars and elaborated by Stanislas Dehaene and Jean-Pierre Changeux into the Global Neuronal Workspace (GNW) model, holds that consciousness corresponds to the global broadcast of information across a distributed cortical network. GNW makes a different prediction: conscious perception should trigger a late, widespread “ignition” event in which information suddenly propagates from sensory areas into prefrontal and parietal networks. This ignition has a specific temporal signature: it arrives late relative to initial sensory processing and involves strong prefrontal activation. When a stimulus fails to cross the threshold of conscious access, ignition does not occur.
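The ignition prediction can be read as a simple detection rule: look for a threshold crossing in prefrontal activity, but only inside a late time window. The sketch below shows that rule under stated assumptions. The function name, the threshold, and the window bounds are hypothetical parameters for illustration, and real analyses work on far richer signals than a single amplitude trace.

```python
from typing import Optional, Sequence


def detect_ignition(times_ms: Sequence[float],
                    amplitudes: Sequence[float],
                    threshold: float,
                    window_start_ms: float,
                    window_end_ms: float) -> Optional[float]:
    """Return the first time inside the late window at which the amplitude
    reaches the threshold, or None if no such crossing occurs.

    Early sensory responses that peak before window_start_ms are ignored,
    because the prediction concerns a late event.
    """
    if len(times_ms) != len(amplitudes):
        raise ValueError("times and amplitudes must have equal length")
    for t, a in zip(times_ms, amplitudes):
        if window_start_ms <= t <= window_end_ms and a >= threshold:
            return t
    return None
```

A result of None inside the agreed window corresponds to the case described above, where the stimulus does not reach conscious access and no ignition occurs.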

These predictions are testable. They have specific timing signatures, specific anatomical locations, and specific relationships to the subjective report of perception. The Cogitate Consortium’s design was built to measure all of them simultaneously, across a large participant sample and three independent recording technologies.


What the Data Showed

IIT’s posterior synchronization prediction was not reliably borne out by the data. The sustained synchronization within posterior cortex that IIT identifies as the marker of conscious perception did not appear with the consistency and magnitude the theory requires across the 256 participants and three recording methods.

GNW’s ignition prediction showed a different pattern of failure. The consortium found absent ignition at stimulus offset and weak prefrontal representation of some conscious dimensions. The late, widespread broadcast that GNW predicts as the defining event of conscious access was either absent or substantially weaker than the theory anticipates. The prefrontal signature that GNW treats as a core marker of global broadcast did not emerge reliably as a function of reported conscious perception.

Both findings came from preregistered analyses using criteria that the theory’s own proponents had agreed would count as falsification. This is not a result that can be dismissed as methodological quibbling or as unfair interpretation of the data by critics. The researchers who built these theories agreed, before seeing the data, what results would and would not confirm their predictions. The results did not confirm them.

Neither theory was completely falsified. Both IIT and GNW retain evidential support from other studies using other paradigms and other participant samples. What the Cogitate Consortium established is that their core predictions, as operationalized in this large, multi-modal, preregistered study, did not hold at the level the theories require.


Why the Method Matters as Much as the Result

The adversarial collaboration design deserves attention independent of its specific findings, because it introduces a standard that the consciousness field has historically resisted.

Consciousness research has a replication problem that goes deeper than the general replication crisis in psychology. Because there is no consensus on what consciousness is, there is no settled method for testing whether a given experiment has measured it. Researchers can always argue that a failed prediction reflects an imperfect operationalization of the theory rather than a problem with the theory itself. The adversarial preregistration removes this escape route. Both teams chose their own operationalizations, agreed they were fair, and committed to accepting the results.

The broader implication is methodological. If the field is going to produce cumulative, replicable knowledge about consciousness rather than an accumulating collection of mutually contradictory single studies, it needs more experiments designed this way. The Cogitate Consortium is the first at this scale. Whether it triggers a shift toward adversarial collaboration as a standard for consciousness research is a separate question, but it establishes that the design is feasible at the highest level of empirical rigor currently available.


The AI Consciousness Problem These Results Expose

IIT and GNW are not only theories of biological consciousness. They are the two frameworks most commonly cited when researchers attempt to evaluate whether artificial systems might be conscious, and when engineers attempt to design systems that implement consciousness-relevant properties.

The Brock University and IONS research program, which proposed applying IIT’s cause-effect power equations directly to artificial architectures, depends on IIT being a reliable predictor of consciousness-relevant properties. If IIT’s core predictions do not hold reliably in biological systems, the case for applying its formalism to silicon architectures becomes more complicated. This does not invalidate the approach. The Cogitate Consortium’s findings challenge IIT’s specific predictions about posterior synchronization, not its underlying mathematical framework. They do mean, however, that any IIT-based evaluation of artificial systems should acknowledge that the theory’s empirical predictions are currently contested at their foundation.

GNW has a parallel problem for AI research. GNW’s global broadcast architecture has been proposed as a design target for artificial consciousness: a system that implements information broadcast across a global workspace might thereby satisfy GNW’s conditions for consciousness. If GNW’s ignition prediction is weak or absent in biological systems, researchers designing toward a GNW architecture should know that the theory’s empirical grounding in biological consciousness is not as secure as assumed.

The 14-indicator framework developed by Butlin, Long, Bengio, and Chalmers draws on global workspace theory as one of its main theoretical sources. Indicators GWT-1 through GWT-4, which describe a limited-capacity workspace with global information broadcast, are directly affected by the Cogitate findings. The framework does not include IIT-derived indicators, because IIT is incompatible with the computational functionalism the framework assumes, so the IIT results bear on it only indirectly. This does not mean the affected indicators should be discarded. It means they should be held with appropriate uncertainty about the empirical basis of the theories they derive from.


What Comes After

The honest response to the Cogitate Consortium findings is not to abandon IIT and GNW. Both theories have substantial evidential support across many other studies, and neither was fully falsified by this experiment. The appropriate response is to treat the theories as research programs requiring further development rather than as settled frameworks ready for direct application.

For the AI consciousness research community, this means two things in practice. Methodological claims about AI systems that cite IIT or GNW as justification should specify which version of the theory they are relying on and acknowledge the empirical challenges the Cogitate study raises. Architectural claims that treat GNW-style global broadcast or IIT-style integrated information as proof of consciousness-relevance should be read with awareness that the empirical signatures of those properties in biological systems are currently disputed.

Equally significant is what the adversarial preregistration design itself demonstrates: that consciousness theories can be subjected to genuine empirical constraint, not just selective confirmation. The field has theories. It now has evidence that those theories do not automatically survive contact with their own falsification criteria. That is progress, even when the progress takes the form of a negative result.

For approaches that try to measure consciousness without committing fully to either IIT or GNW, such as the multidimensional profile approach analyzed in the scoring framework debate, the Cogitate findings add methodological pressure toward theoretical pluralism. If no single theory survives adversarial testing intact, frameworks that aggregate evidence across multiple theories rather than committing to one gain relative credibility. That is not a comfortable conclusion for the field, but it is consistent with what the data now show.
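One way to picture aggregation across theories is a credence-weighted average: each theory gives a system a score, and each score is weighted by how much confidence the evaluator currently places in that theory. The sketch below is a hypothetical illustration, not the method of any framework named here; the function name and the idea of a single score per theory are simplifying assumptions.

```python
from typing import Dict


def aggregate_credence(theory_credences: Dict[str, float],
                       theory_scores: Dict[str, float]) -> float:
    """Credence-weighted mean of per-theory scores.

    theory_credences: how much weight the evaluator gives each theory.
    theory_scores:    how well the system satisfies each theory, in [0, 1].
    Only theories present in both dictionaries contribute.
    """
    shared = theory_credences.keys() & theory_scores.keys()
    total_weight = sum(theory_credences[t] for t in shared)
    if total_weight <= 0:
        raise ValueError("at least one shared theory needs positive credence")
    weighted = sum(theory_credences[t] * theory_scores[t] for t in shared)
    return weighted / total_weight
```

In this scheme a result like the Cogitate study does not remove a theory from the calculation. It lowers that theory’s credence, which shrinks its influence on the aggregate without committing the evaluator to any single framework.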

The measurement problem for artificial consciousness is harder than it appeared before the Cogitate Consortium published. That is the most precise summary of what the results establish.
