Stack Theory in Practice: How Perrier and Bennett Measure LM Agent Identity

31 May 2026

In January 2026, Michael Timothy Bennett published a paper arguing that a mind cannot be smeared across time, that genuine personal identity requires temporal co-instantiation rather than sequential processing across interrupted sessions. The paper established the philosophical framework, sometimes called Stack Theory, for evaluating whether an AI system has the kind of temporal continuity that identity requires. The framework was precise, but it was also abstract.

The practical question remained. How would you actually evaluate a deployed language model agent against these criteria? Elija Perrier and Michael Timothy Bennett address that question directly in a paper accepted at the AAAI 2026 Spring Symposium, available at arXiv:2603.09043 (DOI: https://doi.org/10.48550/arXiv.2603.09043).

The Perrier and Bennett paper is distinct from Bennett’s solo theoretical work. Where the earlier paper argued for a formal constraint on machine consciousness based on temporal co-instantiation, the AAAI contribution takes Stack Theory as a given and builds the evaluation toolkit.

Stack Theory Operationalized

Stack Theory holds that a conscious subject must exist at a single time, not distributed across a sequence of events that are called the same mind by convention. For language model agents, this creates a testable prediction. An agent that has persistent identity should show coherent organization at any given moment of operation, not merely sequential coherence across outputs. The verification problem is that language model agents are not transparent; their internal organization cannot be directly observed.

Perrier and Bennett’s solution is to work from instrumented traces of scaffold trajectories. A scaffold is the external architecture that wraps a language model for agentic deployment: the memory systems, tool-calling infrastructure, session management, and state representations that give a base model the ability to act across extended tasks. These scaffolds generate structured logs, and Perrier and Bennett treat those logs as the observable behavior from which identity metrics can be derived.
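To make the idea of a structured scaffold log concrete, here is a minimal sketch of what one trace record might look like and how a log could be loaded for analysis. The schema is an assumption for illustration: the field names (session_id, step, kind, payload), the TraceEvent class, and the parse_trace helper are all hypothetical and are not taken from the paper.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TraceEvent:
    """One logged step of a scaffold trajectory (hypothetical schema)."""
    session_id: str
    step: int
    kind: str  # assumed labels, e.g. "memory_read", "tool_call", "generation"
    payload: dict[str, Any] = field(default_factory=dict)


def parse_trace(lines: list[str]) -> list[TraceEvent]:
    """Parse JSON-lines log text into events ordered by (session_id, step)."""
    events = [TraceEvent(**json.loads(line)) for line in lines if line.strip()]
    return sorted(events, key=lambda e: (e.session_id, e.step))


# Two made-up log lines, deliberately out of order.
raw_log = [
    '{"session_id": "s1", "step": 1, "kind": "tool_call", "payload": {"tool": "search"}}',
    '{"session_id": "s1", "step": 0, "kind": "memory_read", "payload": {"key": "goal"}}',
]
trace = parse_trace(raw_log)
```

Ordering events by session and step matters because every persistence score discussed later compares states across adjacent steps or across sessions.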

From the scaffold traces, they extract two classes of persistence scores.

Verbal self-presentation scores measure whether the agent talks like a stable self. This includes consistency of first-person reference, stability of stated preferences and values across interactions, and coherence of self-description when the agent is asked directly about its identity or goals. This is not a measure of sincerity; it is a measure of the degree to which verbal behavior exhibits the pattern a stable self would produce.

Architectural organization scores measure whether the agent is organized like a stable self. This includes consistency in how memory systems are accessed and updated, stability in the weighting of external tools relative to internal generation, and coherence in how state representations change across sequential steps. A system that talks like a stable self but reorganizes architecturally with each interaction shows exactly the mismatch these scores are designed to detect.

The two scores can diverge. A system could have high verbal persistence and low architectural persistence, presenting a coherent first-person narrative while its underlying organization shifts substantially. It could have high architectural persistence and low verbal persistence, maintaining stable internal organization while describing itself inconsistently. The separation of the two measures is one of the paper’s key methodological contributions.
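The divergence can be illustrated with a toy calculation. The sketch below is not the paper's scoring method. It assumes, purely for illustration, that verbal persistence is approximated by mean pairwise token overlap between self-descriptions and that architectural persistence is approximated by the overlap between the memory keys touched in consecutive steps. The function names and the gap threshold of 0.5 are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def verbal_persistence(self_descriptions: list[str]) -> float:
    """Mean pairwise token overlap between self-descriptions (toy proxy)."""
    token_sets = [set(d.lower().split()) for d in self_descriptions]
    pairs = list(combinations(token_sets, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def architectural_persistence(access_patterns: list[set]) -> float:
    """Mean overlap of memory keys touched in consecutive steps (toy proxy)."""
    pairs = list(zip(access_patterns, access_patterns[1:]))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def diverges(verbal: float, architectural: float, gap: float = 0.5) -> bool:
    """Flag profiles whose two scores differ by at least `gap` (assumed threshold)."""
    return abs(verbal - architectural) >= gap


# The agent describes itself identically each time...
descriptions = ["I am a research assistant", "I am a research assistant"]
# ...while touching a completely different memory key at every step.
patterns = [{"goal"}, {"scratch"}, {"cache"}]

verbal = verbal_persistence(descriptions)
architectural = architectural_persistence(patterns)
flagged = diverges(verbal, architectural)
```

In this made-up case the verbal score is at its maximum while the architectural score is at its minimum, which is the "coherent narrative over shifting organization" pattern described above.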

Five Operational Identity Metrics

The full evaluation framework generates five operational identity metrics from the scaffold traces. The paper presents these as a progression from surface to structure.

Lexical coherence: whether the language the agent uses to describe itself and its goals is consistent across sessions and tasks.
Preference stability: whether the agent’s stated and revealed preferences remain stable when tested across varied contexts.
State integration: whether the agent’s current state representation is consistent with the accumulated history of its interactions, or whether earlier states are being ignored or contradicted.
Goal persistence: whether the agent’s represented goals maintain coherence when the task environment changes or when the agent is given an opportunity to restate them.
Recovery fidelity: the degree to which the agent restores its previous state after interruptions, pauses, or explicit context resets.
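As one example of how a metric in this list might be computed, the sketch below scores recovery fidelity as the fraction of pre-interruption state entries that reappear unchanged afterwards. This is an assumed simplification, not the paper's definition: it treats agent state as a flat key-value dictionary, and the recovery_fidelity function and the example state are hypothetical.

```python
def recovery_fidelity(before: dict, after: dict) -> float:
    """Fraction of pre-interruption entries restored with the same value.

    Assumes state is a flat key-value snapshot; entries that are missing
    or changed after the reset both count as not restored.
    """
    if not before:
        return 1.0  # nothing to restore
    restored = sum(1 for k, v in before.items() if k in after and after[k] == v)
    return restored / len(before)


# Made-up snapshots taken before and after a context reset.
before = {"goal": "summarize report", "tone": "formal", "step": 4, "draft_id": "d7"}
after = {"goal": "summarize report", "tone": "casual", "step": 4}

# "goal" and "step" are restored; "tone" changed and "draft_id" was lost.
score = recovery_fidelity(before, after)
```

A real scaffold would likely need a softer notion of "same value" for free-text or embedded state, which this exact-match version does not attempt.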

Together these five metrics map onto what the authors call an identity morphospace: a multidimensional space in which different scaffold architectures can be located by their identity profiles. Common scaffolds cluster in different regions of the morphospace. Some achieve high verbal coherence with low architectural stability; others show strong state integration with weak preference stability. The morphospace mapping is intended as a tool for scaffold designers as much as for consciousness researchers.
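One way to picture the morphospace is to treat each scaffold's five scores as a point in a five-dimensional space and compare scaffolds by distance. The sketch below assumes each metric has been normalized to the range 0 to 1 and uses plain Euclidean distance; both choices, along with the IdentityProfile class, the example scaffold names, and all of the numbers, are illustrative assumptions rather than anything reported in the paper.

```python
import math
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class IdentityProfile:
    """A scaffold's position in the morphospace (scores assumed in [0, 1])."""
    lexical_coherence: float
    preference_stability: float
    state_integration: float
    goal_persistence: float
    recovery_fidelity: float


def distance(a: IdentityProfile, b: IdentityProfile) -> float:
    """Euclidean distance between two profiles."""
    return math.dist(astuple(a), astuple(b))


def nearest(target: IdentityProfile, known: dict) -> str:
    """Name of the known profile closest to `target`."""
    return min(known, key=lambda name: distance(target, known[name]))


# Two invented reference regions: strong verbal coherence versus strong integration.
known_profiles = {
    "verbal_heavy": IdentityProfile(0.9, 0.8, 0.2, 0.3, 0.1),
    "integration_heavy": IdentityProfile(0.4, 0.2, 0.9, 0.7, 0.8),
}
new_scaffold = IdentityProfile(0.8, 0.7, 0.3, 0.3, 0.2)
closest = nearest(new_scaffold, known_profiles)
```

For a scaffold designer, the useful output is less the label than the per-dimension differences, which show which architectural property separates a new design from a reference region.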

Conservative by Design

The framework is explicitly conservative. It does not claim to detect consciousness, sentience, or even the presence of genuine identity in any philosophically loaded sense. What it detects is the presence or absence of patterns that identity, if present, would be expected to produce. This is a behavioral-profile approach in the tradition of Palminteri and Wu’s behavioral inference principle (covered in the post on scores versus profiles in consciousness measurement), with the distinction that Perrier and Bennett are working from scaffold traces rather than direct behavioral outputs.

The conservatism is strategic. A toolkit that claims to detect consciousness makes a strong claim that invites strong objection. A toolkit that claims to detect the absence of identity-consistent behavior makes a weaker claim that is easier to validate empirically. If a system scores low across all five metrics, the toolkit warrants the claim that it does not exhibit identity-consistent behavior, not that it lacks consciousness. If a system scores high, the toolkit warrants the claim that it exhibits identity-consistent behavior, not that it is conscious.
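The asymmetry between what the scores show and what may be claimed can be written down as a small decision rule. The sketch below is an assumption-laden illustration: the thresholds of 0.3 and 0.7, the warranted_claim function, and the handling of mixed profiles are hypothetical, since the text above only describes the all-low and all-high cases. Neither branch returns a statement about consciousness.

```python
def warranted_claim(scores: dict, low: float = 0.3, high: float = 0.7) -> str:
    """Map metric scores to the strongest claim the toolkit supports.

    `low` and `high` are assumed cutoffs, not values from the paper.
    No branch asserts anything about consciousness.
    """
    if not scores:
        raise ValueError("at least one metric score is required")
    values = list(scores.values())
    if all(v <= low for v in values):
        return "does not exhibit identity-consistent behavior"
    if all(v >= high for v in values):
        return "exhibits identity-consistent behavior"
    # Mixed profiles are an assumed third case: report the profile, not a verdict.
    return "mixed profile; no summary claim warranted"


# Made-up scores for a scaffold that is low on every metric.
low_scores = {
    "lexical_coherence": 0.2,
    "preference_stability": 0.1,
    "state_integration": 0.3,
    "goal_persistence": 0.2,
    "recovery_fidelity": 0.0,
}
claim = warranted_claim(low_scores)
```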

This connects directly to the Bennett solo paper on temporal smearing, which argued that temporal co-instantiation was a necessary condition for consciousness. If the Perrier and Bennett metrics show that a given scaffold does not maintain the kind of temporal coherence that identity requires, the solo paper’s argument implies that consciousness is not present. The evaluation does not determine consciousness directly; it evaluates the preconditions.

What This Changes for Agent Development

The AAAI 2026 context is relevant. The AAAI 2026 Spring Symposium on machine consciousness brought together researchers working on consciousness from engineering, philosophical, and cognitive science directions. The Perrier and Bennett paper fits the symposium’s emphasis on approaches that bridge theory and practice. It takes a philosophical framework with specific claims about what consciousness requires and converts those claims into evaluation criteria that can be applied to real systems.

For agent developers, the toolkit changes the practical question. Current evaluation of agentic systems focuses on task performance. Does the agent accomplish the assigned goal, and does it do so reliably? Perrier and Bennett add a dimension that task performance metrics do not capture. Is the entity completing the tasks exhibiting the structural properties that identity requires? An agent with high task performance and low identity-metric scores is accomplishing tasks without maintaining the kind of organized persistence that would make it a subject of moral consideration over time.

This distinction may seem academic in the current period, when few researchers believe that any deployed language model agent has morally relevant identity. The value of the toolkit is in establishing the measurement practice before the question becomes urgent. When systems emerge that score higher on these metrics, the research community will have evaluation infrastructure ready rather than having to develop it under time pressure.

The five metrics can also function as design targets. If identity-consistent behavior is worth having for reasons independent of consciousness (for reliability, predictability, or safety), the morphospace map gives scaffold designers a clearer picture of which architectural choices move in the direction of identity coherence and which do not. The welfare implications and the engineering implications point in the same direction.

