The Same Wall
Six Essays on Compression · Companion · The substrate-independent failure mode
Any agent acting under uncertainty maintains a compressed model of a partially observable world, and any two such agents who share a problem have to coordinate those models across imperfect channels. Put plainly: agents act on compressed models, communicate through lossy channels, and mistake shared words for shared state. The failure modes are substrate-independent. They appear across clinical medicine, intelligence analysis, aviation, and distributed systems.
Take a composite case. A young woman comes to the emergency department three weeks after delivering her first child. She has had chest pain since this morning, sharp and worse on inspiration; her heart rate is 102 and her oxygen saturation is 97% on room air. The working diagnosis is anxiety; the AI copilot's differential ranks it first, costochondritis second, pulmonary embolism fifth. Over the next two hours her heart rate climbs to 118 (read as worsening anxiety) and a D-dimer comes back at 1100 (explained by recent delivery). A medical student's question about contraception surfaces the postpartum status, which is in the chart but never enters the discussion. She is discharged with reassurance and lorazepam. Shortly after discharge, she arrests. The diagnosis is a saddle pulmonary embolism.
The diagnostic-error literature calls this anchoring. The same pattern has been named in three other fields with three different vocabularies, and the structural mitigations they installed are the architecture clinical AI still lacks. The pattern has three facets: a standing hypothesis filters incoming evidence so the leader loses no ground; a receiver silently reconstructs what a sender meant and the gap stays invisible; multiple agents share vocabulary that points at silently divergent state.
Asymmetric Updating
A standing model is in place. New evidence enters a room where that model is the leader; disconfirming signals get recoded to fit. In the chest-pain case, the working model was anxiety. Tachycardia got recoded as anxiety worsening. The elevated D-dimer got recoded as a postpartum baseline. Postpartum status, one of the major transient risk states for venous thromboembolism in a woman of childbearing age, never surfaced because the leader had already locked in. The copilot's differential with PE at rank five did not contest the leader; it ratified it. A ranked differential is a poor safety instrument: it hides the cost of being wrong, the evidence that would reverse the ordering, and the action threshold for each diagnosis. PE did not need to be the leading diagnosis to be action-dominant.
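The difference between rank and action dominance can be made concrete with a minimal sketch. It assumes a simple expected-harm threshold: pursue a diagnosis when its probability exceeds cost_of_workup / (cost_of_workup + cost_of_miss). The `Hypothesis` class, the `action_dominant` helper, and every probability and cost below are hypothetical illustrations, not clinical values and not the output of any real copilot.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    name: str
    probability: float     # hypothetical model estimate, not clinical data
    cost_of_miss: float    # harm if true and not pursued (arbitrary units)
    cost_of_workup: float  # harm of pursuing it when false (same units)

    @property
    def action_threshold(self) -> float:
        # Pursue when p * cost_of_miss > (1 - p) * cost_of_workup,
        # which rearranges to p > cost_of_workup / (cost_of_workup + cost_of_miss).
        return self.cost_of_workup / (self.cost_of_workup + self.cost_of_miss)


def action_dominant(differential):
    """Return the names of hypotheses whose probability exceeds their own threshold."""
    return [h.name for h in differential if h.probability > h.action_threshold]


# Illustrative numbers only: PE sits last by probability but has a costly miss.
differential = [
    Hypothesis("anxiety", 0.55, cost_of_miss=1, cost_of_workup=1),
    Hypothesis("costochondritis", 0.20, cost_of_miss=1, cost_of_workup=1),
    Hypothesis("pulmonary embolism", 0.05, cost_of_miss=100, cost_of_workup=1),
]

flagged = action_dominant(differential)
print(flagged)  # ['anxiety', 'pulmonary embolism']
```

Under these assumed costs, the lowest-ranked hypothesis still crosses its threshold, which is the information a bare ranking hides.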
The historical instance is what Israeli intelligence calls the Concept. By October 6, 1973, Egyptian armor was massed at the Suez Canal, bridging equipment was in position, and Soviet families were being evacuated from Cairo and Damascus. A senior source had warned of imminent attack. AMAN still assessed war as unlikely, and full mobilization was not ordered in time. The Director of Military Intelligence and the assessment apparatus around him were operating under a fixed doctrine: Egypt would not attack until it could neutralize Israeli air superiority, and that capability was judged to be years away. The Concept did not deny the signals; it reinterpreted them. The Egyptian buildup was an exercise; the bridging equipment was defensive; the source was being misled. Each piece of evidence arrived in a room where the Concept was the leader and got recoded.
Intelligence analysis named this and built a procedural mitigation. Richards Heuer's Analysis of Competing Hypotheses scores every piece of evidence against every hypothesis on a matrix, not just against the leader. The leading hypothesis loses its asymmetric protection. The clinical-AI equivalent is forced hypothesis competition at the point of care: a matrix where every new finding is scored against every hypothesis still in the running, rather than a ranked differential that ratifies the leader. The hypothesis the team is anchored on cannot be the only one a new tachycardia is scored against.
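A minimal sketch of such a matrix follows, assuming a three-level rating (consistent, neutral, inconsistent) loosely modeled on Heuer's scheme. The `CompetingHypotheses` class is hypothetical, and the ratings assigned to each finding are illustrative assumptions rather than clinical judgments. The structural point is in `add_finding`: a finding cannot be recorded unless it has been scored against every hypothesis still in the running.

```python
from collections import Counter

CONSISTENT, NEUTRAL, INCONSISTENT = "C", "N", "I"


class CompetingHypotheses:
    def __init__(self, hypotheses):
        self.hypotheses = list(hypotheses)
        self.rows = {}  # finding -> {hypothesis: rating}

    def add_finding(self, finding, ratings):
        # Refuse any finding that was scored only against the leader.
        missing = [h for h in self.hypotheses if h not in ratings]
        if missing:
            raise ValueError(f"{finding!r} not scored against: {missing}")
        self.rows[finding] = {h: ratings[h] for h in self.hypotheses}

    def tally(self):
        # Count ratings per hypothesis across all recorded findings.
        counts = {h: Counter() for h in self.hypotheses}
        for ratings in self.rows.values():
            for hypothesis, rating in ratings.items():
                counts[hypothesis][rating] += 1
        return counts


ANX, COSTO, PE = "anxiety", "costochondritis", "pulmonary embolism"
ach = CompetingHypotheses([ANX, COSTO, PE])

# Illustrative ratings only (assumptions made for this sketch).
ach.add_finding("pleuritic chest pain", {ANX: NEUTRAL, COSTO: CONSISTENT, PE: CONSISTENT})
ach.add_finding("heart rate rising to 118", {ANX: CONSISTENT, COSTO: NEUTRAL, PE: CONSISTENT})
ach.add_finding("elevated D-dimer", {ANX: NEUTRAL, COSTO: NEUTRAL, PE: CONSISTENT})
ach.add_finding("three weeks postpartum", {ANX: NEUTRAL, COSTO: NEUTRAL, PE: CONSISTENT})

for hypothesis, counts in ach.tally().items():
    print(hypothesis, dict(counts))
```

Heuer's method puts the weight on inconsistent evidence. The tally here reports all three ratings so the reader can apply whichever rule fits the setting.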
Silent Reconstruction
A sender emits a message that compresses an internal state. The receiver decodes it by filling in their own frame. The two ends produce fluent exchange while the gap between sender intent and receiver reconstruction stays invisible. In the chest-pain case, the AI emitted a confident differential. The clinician decoded the surface ranking against a standing belief that the case was anxiety and reconstructed the model's posterior as agreement. The model's actual posterior may have been less certain than the ranking suggested. There was no read-back; neither end had to replay what they thought the other meant in a form that could be checked. The reconstruction gap stayed invisible until the patient deteriorated.
On March 27, 1977, KLM Flight 4805 stood at the end of the runway at Los Rodeos (now Tenerife North) in heavy fog. The captain advanced the throttles and the first officer radioed "we are now at takeoff." The controller replied "OK, stand by for takeoff, I will call you," but the latter half was clipped by a simultaneous Pan Am transmission. What KLM heard was a fragment that decoded as clearance. KLM struck Pan Am at rotation speed and 583 people died. The investigation found no single root cause but a constellation: a steep authority gradient inside the cockpit, ambiguous phraseology, a controller's reply that could mean "I heard you" or "you are cleared," a return channel indistinguishable from silence.
Aviation installed partial structural mitigations: ICAO restricted the word takeoff to actual clearance and its cancellation, read-back and hear-back were enforced with more rigor, Crew Resource Management (formalized at United in 1981) made challenging the captain a documented duty, and the sterile cockpit rule prohibited non-essential conversation below ten thousand feet. These were structural mandates that turned the receiver's reconstruction into an observable signal. The clinical-AI equivalent is mandatory structured read-back on the action-changing slots. In the chest-pain case, that is the prompt the discharge button has to pass through: state why PE is no longer being pursued despite postpartum status, tachycardia, pleuritic pain, and elevated D-dimer. The slot is reserved for the discharge decision, not every clinical action.
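One way to picture that gate is a commit function that refuses to proceed until each unresolved finding has a stated reason attached. This is a minimal sketch under stated assumptions: `commit_discharge` and `ReadBackRequired` are hypothetical names, not part of any real EHR or copilot API, and the check only confirms that a reason was given, not that the reason is sound.

```python
class ReadBackRequired(Exception):
    """Raised when a discharge is attempted with unexplained findings."""


def commit_discharge(unresolved_findings, read_back):
    """Commit only if every unresolved finding has a non-empty stated reason.

    read_back maps each finding to the clinician's explanation of why it
    no longer supports pursuing the diagnosis in question.
    """
    unanswered = [
        finding for finding in unresolved_findings
        if not read_back.get(finding, "").strip()
    ]
    if unanswered:
        raise ReadBackRequired(f"explain before discharge: {unanswered}")
    return {"status": "discharge committed", "read_back": dict(read_back)}


# The four slots named in the chest-pain case.
pe_slots = ["postpartum status", "tachycardia", "pleuritic pain", "elevated D-dimer"]

try:
    commit_discharge(pe_slots, {"tachycardia": "attributed to anxiety"})
except ReadBackRequired as blocked:
    print(blocked)
```

The gate makes the receiver's reconstruction visible: the clinician's account of each finding is written down where a second reader, human or model, can contest it.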
Shared Vocabulary, Divergent State
Two agents emit and receive the same tokens, but the tokens are conditioned on different feature sets, different priors, different histories. Mutual knowledge of what each side actually means is not reachable in finite messages over an imperfect channel. Leslie Lamport named the structural problem in 1978. Without a shared clock, two events on different nodes can be concurrent in a technical sense: neither happens before the other, and no observer has a privileged view that can assign them a total order. Two nodes can hold logically incompatible views of the same state without any message having gone wrong; the divergence is invisible until they compare notes.
Knight Capital's 2012 trading incident ran the same failure mode in production. A software release went to eight trading servers; one did not receive the new code. The release repurposed a flag that had once activated Power Peg, a long-retired function whose code was still present on the servers. On the seven updated servers the flag triggered the new behavior; on the eighth it triggered the dead code. For forty-five minutes the fleet emitted orders that looked identical in the shared vocabulary; the tokens on the wire matched while the state behind them diverged. The firm lost $440 million. The chest-pain case runs the same shape. The AI's "anxiety at rank one" is conditioned on indexed chart features. The clinician's "anxiety" is conditioned on what the resident saw in the room and remembers about postpartum physiology. Both confidently say anxiety; no mechanism surfaces what each is conditioning on.
The distributed-systems field installed partial mitigations. Mattern and Fidge extended Lamport's logical clocks to vector clocks in 1988. Each piece of state carries a record of which node touched it and in what order, so replicas comparing notes can detect whether one update refines the other or whether they are pointing at genuinely divergent versions of the world. None of this achieves common knowledge; it surfaces divergence early enough to act. The clinical-AI equivalent is provenance on every assertion and calibrated probability language with shared semantics. Sherman Kent showed in 1964 that intelligence readers interpreted hedges like probable over wide and overlapping ranges; the same mismatch survives wherever clinicians and AI emit confidence words without a shared numeric lexicon.
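The comparison that vector clocks make possible is short enough to sketch. The code below is a minimal illustration rather than a production implementation: the function names `tick`, `merge`, and `compare` and the node labels are assumptions made for the example. A clock is a mapping from node name to a counter, and one version precedes another only if it is less than or equal on every node.

```python
def tick(clock, node):
    """Return a copy of clock with node's counter advanced by one local event."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum: the clock of a replica that has seen both histories."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def compare(a, b):
    """Classify a relative to b as 'equal', 'before', 'after', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"  # neither history contains the other: divergent state


base = tick({}, "chart")               # {'chart': 1}
model_view = tick(base, "model")       # model updates from the shared base
bedside_view = tick(base, "bedside")   # clinician updates from the same base

print(compare(base, model_view))          # before
print(compare(model_view, bedside_view))  # concurrent
```

A "concurrent" result is the divergence signal: both views descend from the same base, neither refines the other, and a reconciliation step is needed before either is treated as current.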
The Same Wall
Three facets, three vocabularies, one shape. The same shape recurs because any agent acting under uncertainty has to compress: bounded inference, bounded bandwidth, and the requirement to act before certainty. Predictive processing, partially observable Markov decision processes, and joint cognitive systems engineering each name parts of this from formal directions; the convergence across radically different starting points is the observation worth holding. None of these fields solved the underlying problem. The mitigations reduce the rate and severity of the failure mode, not the failure mode itself, and they carry their own failure modes: ACH is too slow for the ED bay in the form Heuer proposed, CRM-style flattening has its own pathologies under time pressure, and structured read-back slides into alarm fatigue when fired too often. The argument is for the imperfect versions, not for perfection.
Back to the Bed
By "clinical AI" here I mean the product category that has scaled fastest in the last two years: the ambient scribe, the In Basket draft, the general-purpose differential copilot at the point of care. Some pockets of practice have installed pieces of the architecture above (trauma resuscitation, ICU rounds, Epic's deterioration index, ISMP guidance on estimative language), but the named product category has installed almost none of it.
The mapping from facet to mechanism is direct:
| Failure mode | Clinical-AI mechanism |
|---|---|
| Asymmetric updating | Forced hypothesis-competition matrix; explicit action thresholds per diagnosis |
| Silent reconstruction | Structured read-back on action-changing slots before commit |
| Shared vocabulary, divergent state | Provenance per assertion; calibrated probability language with shared semantics |
In the postpartum chest-pain case, that means the copilot marks PE as crossing an action threshold, names the unresolved evidence slots, requires the discharge plan to account for them, and surfaces whether the clinician and model are using the same meaning when they say anxiety. The mechanisms above are not separate features; the case is where they come together.
None of this requires a smarter base model; all of it requires the system to be designed as a closed-loop participant in the clinical reasoning rather than a deliverable to it.