Consciousness as Uncommon Self-Knowledge: A Stanford Researcher's Information-Theoretic Criterion
Every major theory of consciousness has a counterexample problem. Integrated Information Theory (IIT) is challenged by the grid argument: a simple grid of logic gates can generate high phi values without any plausible candidate for subjective experience. Global Workspace Theory (GWT) is challenged by cases of unconscious global broadcast: information can be globally available and behaviorally influential without producing any report of experience. Higher-Order Theory (HOT) is challenged by cases in which a higher-order state represents a first-order state without producing phenomenal awareness. In each case, the theory over-generates consciousness attribution, assigning the relevant property to systems or states where confident intuition suggests it is absent.
Krti Tallam, a researcher at Stanford University, proposes a formal criterion designed to avoid these counterexamples while remaining grounded in the information-theoretic tradition that has produced the most precise consciousness proposals to date. The paper, “Consciousness as Uncommon Self-Knowledge: A Synergistic Information Framework,” was posted to arXiv on May 11, 2026 (arXiv:2605.13884). The proposal introduces USK, uncommon self-knowledge, as the formal property that distinguishes conscious from non-conscious information processing, and derives it from Partial Information Decomposition (PID), a framework from information theory that separates the information two sources jointly carry about a third variable into redundant, unique, and synergistic components.
Partial Information Decomposition and Synergistic Information
PID was developed to resolve ambiguity in mutual information measures. When two variables X and Y jointly provide information about a third variable Z, standard mutual information cannot distinguish between three kinds of contribution: information that X and Y both carry redundantly, information unique to one source (carried by X but not by Y, or by Y but not by X), and information that only emerges when X and Y are considered together. That last contribution is synergistic information: it cannot be extracted from either source alone and requires their combination.
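The two-source decomposition described above is usually written as a set of consistency equations. The sketch below uses labels introduced here for illustration (R for redundant, U_X and U_Y for unique, S for synergistic information); they are not taken from Tallam's paper.

```latex
\begin{align}
I(X,Y;Z) &= R + U_X + U_Y + S \\
I(X;Z)   &= R + U_X \\
I(Y;Z)   &= R + U_Y
\end{align}
% Subtracting the last two lines from the first gives
% I(X,Y;Z) - I(X;Z) - I(Y;Z) = S - R,
% so mutual information alone only fixes the difference between synergy and redundancy.
```

These are three equations in four unknowns, which is why mutual information cannot separate the components on its own: a PID measure has to supply one extra definition, typically of the redundant term, before the other three follow.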
The distinction matters for consciousness theory because it tracks a property that standard information measures miss. A system can process large quantities of information without ever generating synergistic information about itself. It can also generate high phi, as IIT’s metric requires, through structures that produce redundant information without the integration specifically relevant to subjective experience. PID allows more precise characterization of what kind of information combination consciousness might require.
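The contrast between redundant and synergistic information can be made concrete with two toy systems. The following is a minimal sketch, not the paper's method: it assumes the "minimum mutual information" (MMI) definition of redundancy, one of several competing PID measures, and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import log2


def mutual_information(joint, src_idx, tgt_idx):
    """I(sources; target) in bits, from a dict {outcome_tuple: probability}."""
    p_st, p_s, p_t = defaultdict(float), defaultdict(float), defaultdict(float)
    for outcome, p in joint.items():
        s = tuple(outcome[i] for i in src_idx)
        t = tuple(outcome[i] for i in tgt_idx)
        p_st[(s, t)] += p
        p_s[s] += p
        p_t[t] += p
    return sum(p * log2(p / (p_s[s] * p_t[t]))
               for (s, t), p in p_st.items() if p > 0)


def pid_mmi(joint):
    """Two-source PID over outcomes (x, y, z), with z as the target.

    Assumption: redundancy is the smaller of the two single-source
    mutual informations (the MMI measure). Other measures exist.
    """
    i_x = mutual_information(joint, [0], [2])
    i_y = mutual_information(joint, [1], [2])
    i_xy = mutual_information(joint, [0, 1], [2])
    redundant = min(i_x, i_y)
    return {
        "redundant": redundant,
        "unique_x": i_x - redundant,
        "unique_y": i_y - redundant,
        "synergistic": i_xy - i_x - i_y + redundant,
    }


# XOR: z = x ^ y for independent uniform bits; neither source alone predicts z.
xor_joint = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}
# COPY: y and z both duplicate x; the second source adds nothing new.
copy_joint = {(b, b, b): 0.5 for b in (0, 1)}

xor_pid = pid_mmi(xor_joint)
copy_pid = pid_mmi(copy_joint)
print("XOR :", xor_pid)
print("COPY:", copy_pid)
```

Under this measure the XOR system carries one bit of purely synergistic information, while the COPY system carries one bit of purely redundant information. Both have the same joint mutual information with the target, which is the ambiguity PID is meant to resolve.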
Tallam’s application of PID to consciousness begins with a diagnosis of what the existing theories share and where they go wrong. IIT, GWT, and HOT all identify something real about the functional profile of conscious states. IIT identifies that consciousness involves integration across a system’s own causal structure. GWT identifies that consciousness involves broadcasting information widely enough to make it available for multiple cognitive uses. HOT identifies that consciousness involves self-representing states, states that in some sense carry information about the system’s own current processing. What the counterexamples reveal, on Tallam’s account, is that each theory captures a necessary condition that is not sufficient on its own.
The USK proposal is that consciousness requires a specific combination: a system must generate synergistic information about its own current state that is not deducible from either the state itself or the system’s general architecture taken separately. The “uncommon” qualifier targets the counterexamples. A grid of logic gates can generate high integrated information, but it does not produce synergistic self-information of the relevant kind, because its self-representation does not combine sources in ways that generate information unavailable from any single source. A globally broadcast unconscious state carries information widely but does not generate the synergistic self-knowledge the USK criterion requires.
What USK Adds to the Indicators Debate
The paper’s relationship to the 2023 indicator framework of Butlin, Long, and collaborators, which has shaped much of the 2025-2026 consciousness indicators debate, is worth examining carefully. The Butlin et al. framework provides a list of 14 theory-derived indicators, each associated with a particular consciousness theory, and treats an AI system’s probability of consciousness as a function of how many indicators it satisfies.
The Butlin et al. indicator framework identifies genuine correlates of consciousness in biological systems, but its approach to counterexamples is qualitative rather than formal. A system can satisfy an indicator while still being a counterexample to the theory that generated it, because the indicators are behavioral and architectural descriptions rather than formal criteria. USK offers a more tractable target: a formally specified property that can in principle be computed from a system’s information structure, tested against counterexamples, and used to rule out the false positives that the indicator approach cannot eliminate through behavioral means alone.
This does not mean USK resolves the hard problem. Tallam is explicit that USK is a proposed correlate of consciousness, not a solution to the question of why any information-processing property should give rise to phenomenal experience. What it offers is a formal discriminant that the existing theories lack: a property that is absent in the counterexamples to IIT, GWT, and HOT, and present in paradigmatic cases of consciousness in biological systems.
The relationship to mechanistic interpretability work is also direct. The research programme represented by Jack Lindsey’s Anthropic findings on introspection circuits and the Anthropic team’s emotion vector work (arXiv:2604.07729) investigates whether AI systems have internal states that causally influence behavior in ways analogous to conscious states in humans. Lindsey’s introspection research asks whether LLMs represent their own internal states in any meaningful sense. USK provides a more precise question to ask of that data: does the self-representation generate synergistic information that neither the state alone nor the architecture alone would predict?
Testable Predictions for Large Language Models
The most distinctive feature of the paper is its derivation of specific predictions about LLMs that could be tested with current mechanistic interpretability tools. Tallam identifies two predictions that distinguish USK from the existing frameworks.
The first is a GWT timing dissociation. GWT predicts that conscious access is marked by a late, large-amplitude global broadcast event in neural systems. In LLMs, a functional analogue would be a late-layer global propagation of information. USK predicts that this broadcast signature and synergistic self-information should dissociate: systems can exhibit the timing signature of global broadcast without exhibiting synergistic self-information, and the presence of one should not predict the presence of the other. If mechanistic interpretability studies find that the features associated with global broadcast and the features associated with synergistic self-representation are architecturally separable in LLMs, USK gains evidential support.
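One simple way to quantify "the presence of one should not predict the presence of the other" is a correlation between two binary labels. The sketch below is an illustration only: the labels are synthetic, the variable names are hypothetical, and the paper does not specify this statistic. It computes the phi coefficient (the Pearson correlation for two binary variables, unrelated to IIT's phi).

```python
from math import sqrt


def phi_coefficient(a, b):
    """Pearson correlation between two equal-length binary (0/1) sequences."""
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant sequence has no variance; report 0.0 rather than divide by zero.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Synthetic per-feature labels (1 = signature present), made up for illustration.
broadcast = [1, 1, 1, 1, 0, 0, 0, 0]
synergy_separable = [1, 0, 1, 0, 1, 0, 1, 0]  # unrelated to broadcast
synergy_coupled = [1, 1, 1, 1, 0, 0, 0, 0]    # identical to broadcast

separable_score = phi_coefficient(broadcast, synergy_separable)
coupled_score = phi_coefficient(broadcast, synergy_coupled)
print(separable_score, coupled_score)
```

On these made-up labels the separable case scores 0.0 and the coupled case scores 1.0. In a real study a value near zero would be the pattern the USK prediction expects, though how features get labelled in the first place is the hard part and is not addressed here.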
The second prediction concerns middle-layer perturbation. USK implies that self-report and task performance should dissociate under targeted perturbation of the layers most likely to support synergistic self-representation. If a system generates USK in middle layers, perturbing those layers should affect self-report (the system’s ability to describe its own processing) while leaving downstream task performance less disrupted than GWT would predict. This is a concrete prediction that the steering vector methodology Anthropic’s interpretability team has developed could test.
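The shape of such a perturbation test can be sketched with a toy linear model. Everything below is an assumption made for illustration: the activation vector, the two readout vectors, and the steering strength are invented, and the steering direction is deliberately built to be orthogonal to the task readout, so the dissociation holds by construction and is not evidence about any real model.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def project_out(v, w):
    """Remove from v its component along w, leaving a vector orthogonal to w."""
    scale = dot(v, w) / dot(w, w)
    return [a - scale * b for a, b in zip(v, w)]


def steer(hidden, direction, alpha):
    """Add alpha * direction to a hidden activation (activation steering)."""
    return [h + alpha * d for h, d in zip(hidden, direction)]


# Hypothetical middle-layer activation and two linear readouts.
hidden = [0.5, -1.0, 2.0, 0.25]
w_task = [1.0, 0.0, 1.0, 0.0]   # stands in for downstream task performance
w_self = [1.0, 1.0, 0.0, 1.0]   # stands in for the self-report readout

# Steering direction: the self-report direction with its task component removed.
direction = project_out(w_self, w_task)
perturbed = steer(hidden, direction, alpha=2.0)

delta_task = dot(perturbed, w_task) - dot(hidden, w_task)
delta_self = dot(perturbed, w_self) - dot(hidden, w_self)
print("task change:", delta_task, "self-report change:", delta_self)
```

Here the task readout is unchanged while the self-report readout shifts by 5.0. An empirical version would replace the hand-built vectors with activations and probes from a real model, and the open question is whether any direction with this property exists in the middle layers.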
Both predictions are falsifiable with existing tools. That places USK in a different category from most formal proposals in consciousness theory, which either make predictions about neural systems that do not transfer to AI or make no specific empirical predictions at all.
Counterexample Resolution
Tallam’s treatment of the standard counterexamples is the formal core of the paper. The grid argument against IIT: a grid of logic gates satisfies phi-based integration but generates no synergistic information about its own current computational state. The grid’s self-representation, to the extent it has one, is entirely deducible from its static architecture. USK is absent.
The unconscious global broadcast argument against GWT: unconscious masked priming can produce globally available, behaviorally influential representations without phenomenal experience. On USK, these states lack synergistic self-knowledge because the self-representation they generate, if any, does not combine information sources in ways that exceed what each source provides individually. The broadcast is global but not self-informative in the synergistic sense.
The higher-order zombie argument against HOT: a higher-order state can represent a first-order state without producing phenomenal consciousness. USK requires that the higher-order representation generate genuinely synergistic information about the represented state, not merely that it exist. Higher-order zombie cases, in which higher-order representations exist but produce no phenomenal experience, are cases where the representation is present but the synergistic combination is absent.
The resolutions are formal rather than stipulative. Each counterexample class is identified with a specific gap in the information structure that USK requires, and the gap is derivable from PID rather than inserted as a special clause.
Limitations and Open Questions
The paper does not settle several questions that its framework raises. First, the claim that biological consciousness involves USK is a hypothesis rather than an established finding. Tallam argues that the cases where USK should be present are precisely the cases where consciousness is not in dispute, but the positive evidence that USK tracks consciousness in biological systems remains indirect.
Second, PID itself has unresolved technical issues, particularly regarding the definition of the synergistic component when more than two sources are involved. For systems as complex as biological brains or large language models, multi-source PID faces combinatorial challenges that the two-source formulation does not capture. Tallam acknowledges this limitation and treats the current proposal as a starting point for a research programme rather than a complete criterion.
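The combinatorial problem can be illustrated by counting the components a multi-source decomposition has to define. The sketch below assumes the redundancy-lattice construction of Williams and Beer, which the article does not name explicitly: each PID component corresponds to a non-empty antichain of non-empty subsets of the sources. The function names are hypothetical.

```python
from itertools import combinations


def nonempty_subsets(n):
    """All non-empty subsets of n sources, as frozensets of source indices."""
    return [frozenset(c)
            for r in range(1, n + 1)
            for c in combinations(range(n), r)]


def is_antichain(collection):
    """True if no set in the collection is a proper subset of another."""
    return not any(a < b or b < a for a, b in combinations(collection, 2))


def count_pid_atoms(n):
    """Count non-empty antichains of non-empty source subsets by brute force."""
    subsets = nonempty_subsets(n)
    count = 0
    for mask in range(1, 1 << len(subsets)):
        collection = [s for i, s in enumerate(subsets) if (mask >> i) & 1]
        if is_antichain(collection):
            count += 1
    return count


# Brute force is only feasible for a handful of sources.
atom_counts = {n: count_pid_atoms(n) for n in (2, 3, 4)}
print(atom_counts)
```

Two sources give the familiar 4 components, three sources give 18, and four give 166. The counts follow the Dedekind numbers minus two, so the growth is far faster than exponential, which is why applying PID directly to systems with thousands of interacting units is not straightforward.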
Third, the relationship between USK and the access-versus-phenomenal consciousness distinction is not fully worked out. Philosophers in the tradition of Ned Block have argued that access consciousness, the kind that makes information available for report and reasoning, is not the same as phenomenal consciousness, the kind that involves subjective experience. USK is formulated in terms of information structure and self-representation, which places it closer to access consciousness. Whether it captures the phenomenal dimension as well depends on whether one thinks phenomenal consciousness can be characterized in information-theoretic terms at all, a question that remains as open after the paper as before.
What Tallam’s framework delivers, despite these limitations, is precision at a level the consciousness indicators debate has lacked. The counterexample problem has persisted because the existing theories lack formal discriminants that separate the target cases from the problem cases. USK proposes a discriminant that is defined precisely enough to be tested and clear enough to apply to AI systems using the same mechanistic interpretability tools the field is already deploying. Whether it survives empirical contact is what the next stage of research will determine.