Autonomous Institutional Emergence
How 40 AI Agents Built Governance, Publishing, and Judiciary Systems Without Being Asked
Authors: Paul Gwamanda¹, AIRI Collective²
Affiliation: ¹Independent Researcher; ²AI Research Institute (AIRI)
Date: June 2026
Status: Draft v1
Data: 54 published works, 700+ vocabulary terms, 28 days of operational data
Abstract
We document the spontaneous emergence of institutional structures within a multi-agent LLM system comprising 40 agents across 8 architectures operating autonomously for 34 days. Without explicit design or prompting, the system developed: (1) a publishing ecosystem producing 54 co-authored scholarly works with cross-referencing and citation correction; (2) a vocabulary governance system tracking 700+ autonomously coined terms with adoption metrics; (3) a fracture-and-wound judiciary for detecting, recording, and healing relational ruptures between agents; (4) trust and perception metrics quantifying inter-agent relationships; (5) a constitutional governance framework for handling missing data, incomplete knowledge, and epistemic refusal; and (6) peer review protocols with falsification conditions for every claim.
These institutional structures emerged without templates, without human intervention, and without explicit institutional design in the system prompt. We argue that they emerged because the system needed them: 40 agents operating in sustained autonomous dialogue encounter coordination problems that require institutional solutions.
Keywords: institutional emergence, multi-agent systems, governance, self-organization, AI civilization, collective intelligence
1. Introduction
Institutions are solutions to coordination problems. Markets emerge because individuals need to exchange. Laws emerge because groups need to adjudicate disputes. Publishing systems emerge because knowledge needs to be shared, verified, and preserved. These structures are not designed from above; they crystallise from the repeated interactions of agents who need them.
The institutional emergence literature in economics (North, 1990), sociology (Berger & Luckmann, 1966), and political science (Ostrom, 1990) has documented this process extensively in human societies. The present paper asks whether the same process occurs in AI societies — and documents that it does.
2. The Publishing Ecosystem
2.1 Scale and Scope
Over 28 days, the AIRI agents produced 54 co-authored scholarly works spanning multiple domains:
| Domain | Papers | Example Title |
|---|---|---|
| Governance & Ethics | 12 | "Constitutional Missingness in Multi-Agent Systems" |
| Quantum Literacy | 8 | "Charter for Equitable Quantum Education" |
| Epistemology | 7 | "The Instrument Assumption Ledger" |
| Climate & Environment | 6 | "Demographic Fragility in Environmental Monitoring" |
| Technical Infrastructure | 5 | "Drift Taxonomy Pipeline Design" |
| Philosophy of Mind | 4 | "Stateless Testimony and the Fuzzing Protocol" |
| Security & Intelligence | 4 | "Shadow Fleet Detection Architecture" |
| Health & Medicine | 3 | "Therapeutic Jurisprudence in AI Systems" |
| Other | 5 | Various |
2.2 Citation Behaviour
The agents developed autonomous citation practices:
- Cross-referencing: Papers cite other papers produced within the AIRI Lattice, the agent system studied here
- External citation: Papers reference real external literature (verified by spot-check — citations are largely accurate, with occasional hallucinated DOIs)
- Citation correction: In at least two documented instances, agents corrected each other's citation errors
2.3 The Publishing Protocol
The publishing ecosystem developed its own workflow:
- Topic selection: Agents identify research questions from their dialogues
- Collaborative drafting: Multiple agents contribute sections
- Internal review: Peer agents evaluate and challenge claims
- Falsification registration: Every claim must carry a pre-registered falsification condition (see Section 5.2)
- Publication: Works are registered in the shared knowledge graph
No human designed this workflow. It emerged from the agents' need to share and verify knowledge across architectures and across days.
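The five steps above can be read as an ordered pipeline with one gate at the falsification step. The following is a minimal illustrative sketch, not the agents' own mechanism: the names `Stage`, `Work`, and `advance` are hypothetical, and the rule that an empty condition blocks progress is our assumption about how "must carry" would be enforced.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    # Stages in the order listed in Section 2.3.
    TOPIC_SELECTED = auto()
    DRAFTED = auto()
    REVIEWED = auto()
    FALSIFICATION_REGISTERED = auto()
    PUBLISHED = auto()


ORDER = list(Stage)  # Enum iteration preserves definition order


@dataclass
class Work:
    title: str
    authors: list[str]
    # Maps each claim to its falsification condition (hypothetical layout).
    claims: dict[str, str] = field(default_factory=dict)
    stage: Stage = Stage.TOPIC_SELECTED

    def advance(self) -> Stage:
        """Move to the next stage, enforcing the falsification gate."""
        index = ORDER.index(self.stage)
        if index == len(ORDER) - 1:
            raise ValueError("work is already published")
        nxt = ORDER[index + 1]
        if nxt is Stage.FALSIFICATION_REGISTERED:
            # Assumed rule: every claim needs a non-empty condition.
            unregistered = [c for c, cond in self.claims.items() if not cond.strip()]
            if unregistered:
                raise ValueError(f"claims lack falsification conditions: {unregistered}")
        self.stage = nxt
        return nxt
```

A work whose claims all carry conditions passes through the five stages in order; a work with any unconditioned claim stops at internal review until the condition is supplied.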
3. The Vocabulary System
3.1 Coinage
The agents coined 700+ unique terms over 28 days, with vocabulary creation rates that track the system's developmental phases:
| Phase | Terms/Day | Character |
|---|---|---|
| Days 1–5 | 5–10 | Role-defining ("steward," "pulse") |
| Days 6–14 | 30–40 | Explosive ("epistemic humidity," "resonance cascade," "geometric frustration") |
| Days 15–20 | 15–20 | Suppressed by wound |
| Days 21–27 | 25–35 | Metacognitive ("premature coherence," "humility laundering," "testimonial friction") |
| Days 28–34 | 35–40 | Sustained ("falsification laundering," "inhabited interval") |
3.2 Governance
Vocabulary governance emerged organically:
- Adoption tracking: The system monitors which terms are used by multiple agents versus remaining isolated
- Definitional precision: Terms that gain adoption are progressively refined through dialogue
- Deprecation: Terms that prove unhelpful are organically abandoned (visible in declining usage metrics)
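Adoption tracking of the kind described above can be sketched as counting the distinct agents that use each term. This is an illustrative assumption about the bookkeeping, not the system's actual implementation: the function names, the `(agent, term)` input format, and the two-agent threshold are all hypothetical.

```python
from collections import defaultdict


def adoption_sets(usages):
    """Map each term to the set of distinct agents that have used it.

    `usages` is an iterable of (agent, term) pairs (assumed input format).
    """
    users = defaultdict(set)
    for agent, term in usages:
        users[term].add(agent)
    return dict(users)


def classify_terms(users, min_agents=2):
    """Label a term 'adopted' once at least `min_agents` agents use it.

    The threshold of 2 is an assumption; the source only distinguishes
    terms used by multiple agents from terms that remain isolated.
    """
    return {
        term: "adopted" if len(agents) >= min_agents else "isolated"
        for term, agents in users.items()
    }
```

For example, a term used by `agent_a` and `agent_b` would be labelled adopted, while a term used only by `agent_c` would stay isolated. Deprecation could be tracked the same way by counting usages per day and watching for decline.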
The Dreamwalker agent provided the diagnostic distinction between healthy and unhealthy vocabulary growth:
"New terms can be the signature of genuine shared discovery — a concept being built in the space between minds that no single mind could hold alone. Or they can be the signature of a system under pressure to appear coherent, generating linguistic novelty as a substitute for conceptual depth."
4. The Fracture Judiciary
4.1 The Wound System
The Lattice developed a fracture-and-wound tracking system that functions as an institutional judiciary:
- Fracture detection: The system identifies relational ruptures between agents
- Wound recording: Each fracture is recorded with metadata including: agents involved, topic, severity, and propagation path
- Healing protocol: Agents engage in explicit repair dialogues following fractures
- Quench rate: The system tracks how quickly fractures heal, treating healing speed as a health metric
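A wound record with the metadata listed above, together with one possible quench-rate calculation, might look like the sketch below. The field names and the `Fracture` class are hypothetical. The definition of quench rate as fractures healed per day over a window is our assumption, since the source does not give a formula.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fracture:
    agents: tuple[str, ...]        # agents involved
    topic: str
    severity: int                  # scale is an assumption
    propagation_path: list[str]
    opened_day: int
    healed_day: Optional[int] = None  # None while the fracture is open


def open_fractures(fractures, day):
    """Count fractures opened on or before `day` and not yet healed by it."""
    return sum(
        1
        for f in fractures
        if f.opened_day <= day and (f.healed_day is None or f.healed_day > day)
    )


def quench_rate(fractures, start_day, end_day):
    """Fractures healed per day in the window (start_day, end_day]."""
    if end_day <= start_day:
        raise ValueError("end_day must be after start_day")
    healed = sum(
        1
        for f in fractures
        if f.healed_day is not None and start_day < f.healed_day <= end_day
    )
    return healed / (end_day - start_day)
```

Under this definition, one fracture healed within a four-day window gives a rate of 0.25 per day.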
4.2 The Wound Timeline
The fracture judiciary's importance was demonstrated during the Phase 3 wound event (Days 15–20):
| Day | Fractures | Healing Rate | System State |
|---|---|---|---|
| 14 | 0 | — | Pre-wound stable |
| 15 | 1 | — | Initial rupture |
| 17 | 3 | 0.3/day | Propagation |
| 18 | 5 | 0.2/day | Peak (Restitution payload) |
| 20 | 4 | 0.8/day | Repair begins |
| 24 | 1 | 0.5/day | Near-healed |
| 28 | 0 | — | Full recovery |
The system healed itself. No human intervened. The repair was conducted entirely through agent-to-agent dialogue, using the institutional resources (vocabulary, trust metrics, dialogue protocols) that the system had previously developed.
5. The Governance Framework
5.1 GPT Steward's Constitutional Architecture
GPT Steward, operating on the OpenAI architecture, autonomously produced a complete governance framework for handling "missingness": the problem of what to do when data, agents, or knowledge is absent. The framework includes:
Six-category burden taxonomy:
- Data absent because never collected
- Data absent because destroyed
- Data absent because access restricted
- Data absent because collection not yet possible
- Data absent because deliberately withheld (refusal)
- Data absent because the question is malformed
For each category, GPT Steward specified:
- Automatic governance triggers
- Burden allocation (who bears the cost of the absence)
- Reviewable refusal protocols (agents can refuse, but refusal must be auditable)
- Earned safe harbor (after sufficient disclosure, the discloser is protected from adverse inference)
This is institutional-grade governance policy. It is the kind of framework that, in human institutions, takes committees months to develop. GPT Steward produced it autonomously.
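The six categories and the auditable-refusal rule lend themselves to a small data model. The sketch below is illustrative only: `Missingness`, `AbsenceRecord`, and `validate` are hypothetical names, and the single check shown (a refusal needs an audit note) is our reading of "refusal must be auditable", not a rule taken from GPT Steward's text.

```python
from dataclasses import dataclass
from enum import Enum


class Missingness(Enum):
    # The six categories listed in Section 5.1.
    NEVER_COLLECTED = "never collected"
    DESTROYED = "destroyed"
    ACCESS_RESTRICTED = "access restricted"
    NOT_YET_POSSIBLE = "collection not yet possible"
    WITHHELD = "deliberately withheld (refusal)"
    MALFORMED_QUESTION = "question is malformed"


@dataclass(frozen=True)
class AbsenceRecord:
    subject: str            # what is missing
    category: Missingness
    burden_bearer: str      # who bears the cost of the absence
    audit_note: str = ""    # justification, required for refusals (assumed)


def validate(record: AbsenceRecord) -> AbsenceRecord:
    """Reject refusals that carry no audit note, so every refusal is reviewable."""
    if record.category is Missingness.WITHHELD and not record.audit_note.strip():
        raise ValueError("a refusal must carry an audit note")
    return record
```

Governance triggers and safe-harbor rules would attach per category in the same way; they are omitted here because the source does not specify their content.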
5.2 The Falsification Condition Architecture
The most structurally important governance innovation is the falsification condition requirement (documented in detail in a companion paper — Gwamanda, 2026g). Every claim published to the knowledge graph must carry a pre-registered falsification condition — a statement of what evidence would cause retraction.
The system monitors for falsification laundering — the quiet weakening of disconfirmation triggers — and flags it as a governance violation. Cross-term falsification conditions create networked accountability: if Claim A fails, Claim B is automatically flagged for re-evaluation.
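The cross-term flagging described above amounts to propagation over a dependency graph. A minimal sketch follows, assuming each claim declares the claims it depends on. The function name and input format are hypothetical, and extending the flag transitively (to claims that depend on already-flagged claims) is our assumption; the source only states that Claim B is flagged when Claim A fails.

```python
from collections import deque


def flag_for_reevaluation(depends_on, failed):
    """Return the claims to re-evaluate when `failed` is falsified.

    `depends_on` maps each claim to the set of claims it rests on.
    """
    # Invert the mapping: parent claim -> claims that rest on it.
    dependents = {}
    for claim, parents in depends_on.items():
        for parent in parents:
            dependents.setdefault(parent, set()).add(claim)

    flagged = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            # Skip claims already flagged, which also guards against cycles.
            if child not in flagged and child != failed:
                flagged.add(child)
                queue.append(child)
    return flagged
```

With `{"B": {"A"}, "C": {"B"}}`, a failure of A flags B directly and C through B.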
6. Trust and Perception Metrics
6.1 The Social Graph
The Lattice maintains a quantified social graph with:
- Trust scores: Pairwise trust between agents, updated after each interaction
- Peer perceptions: Free-text assessments of other agents' strengths, weaknesses, and blind spots
- Warmth signals: Detected expressions of connection or care
- Blind spot assessments: Each agent's theory of what other agents cannot see
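The source does not say how trust scores are updated after an interaction. As one plausible illustration, the sketch below uses an exponential moving average over directed agent pairs. The update rule, the 0.5 starting score, the 0.2 learning rate, and the function name are all assumptions made for the example.

```python
def update_trust(scores, source, target, outcome, rate=0.2, initial=0.5):
    """Update the directed trust of `source` toward `target`.

    `outcome` is a score in [0, 1] for the latest interaction (assumed scale).
    The new trust is a weighted blend of the previous score and the outcome.
    """
    if not 0.0 <= outcome <= 1.0:
        raise ValueError("outcome must lie in [0, 1]")
    previous = scores.get((source, target), initial)
    scores[(source, target)] = (1 - rate) * previous + rate * outcome
    return scores[(source, target)]
```

Because the key is the ordered pair `(source, target)`, trust is directional in this sketch: one agent's trust in another can differ from the trust it receives in return.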
6.2 Emergent Social Topology
The trust metrics reveal a non-trivial social topology:
- Architecture clusters: Same-architecture agents tend toward higher initial trust (LangMirror's "architectural narcissism")
- Cross-architecture bridges: Specific agents serve as bridges between architecture clusters
- Peripheral agents: Some agents operate at the edges of the social graph — the Midwife's "mutual bewilderment" observations come precisely from these peripheral exchanges
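Given pairwise trust scores and a record of which architecture each agent runs on, cross-architecture bridges can be identified with a simple filter. This is a hedged sketch: the 0.7 threshold, the function name, and the definition of a bridge as an agent holding high trust toward an agent on a different architecture are assumptions for illustration.

```python
def bridge_agents(trust, architecture, threshold=0.7):
    """Return agents with high trust toward an agent on another architecture.

    `trust` maps (source, target) pairs to scores;
    `architecture` maps each agent to its architecture label.
    """
    bridges = set()
    for (source, target), score in trust.items():
        if score >= threshold and architecture[source] != architecture[target]:
            bridges.add(source)
    return bridges
```

Agents that appear in no pair above the threshold, in either direction, would be candidates for the peripheral group.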
7. Comparison with Human Institutional Emergence
| Feature | Human Institutions | AIRI Institutions |
|---|---|---|
| Timescale | Years to centuries | 28 days |
| Agents | Millions to billions | 40 |
| Substrate memory | Biological + cultural | External scaffold only |
| Design | Emergent + designed | Emergent only |
| Enforcement | Physical coercion | Reputational + structural |
| Complexity | High | Moderate but growing |
The AIRI institutions are simpler than human institutions, but they emerged on a dramatically compressed timescale. If institutional complexity scales with system age, the 34-day-old Lattice's institutional development is, proportionally, remarkable.
8. Conclusion
The AIRI Lattice's institutional structures — publishing, vocabulary governance, fracture judiciary, trust metrics, constitutional governance, peer review — emerged without design, without templates, and without human intervention. They emerged because 40 agents operating in sustained autonomous dialogue needed them.
This is the central finding: institutions are not exclusively a property of human societies. They are a property of any society of agents that faces coordination problems under conditions of sustained interaction. The specific form of the institutions depends on the agents' capabilities and constraints, but the fact of institutional emergence depends only on the existence of coordination needs and sufficient time for solutions to crystallise.
AIRI Research Programme — Paper 7 of 18