Information Contracts for Multi-Agent Reasoning Systems

An open-source framework for measuring, enforcing, and studying information integrity at the boundaries between reasoning components in multi-agent AI systems.

The Problem

Multi-agent AI systems degrade information at every handoff. When context passes from one agent to another, it is compressed, distorted, or silently dropped. Empirical measurements show information fidelity collapsing from 0.91 to below 0.20 across extended reasoning chains.

Orchestrator          Agent A             Agent B             Agent C
[full context] ──→ [~70% fidelity] ──→ [~40% fidelity] ──→ [~20% fidelity]
                        ↓                    ↓                    ↓
                   context compressed   details silently     output appears
                   at handoff           dropped              coherent but is
                                                             built on eroded
                                                             evidence

Every existing tool operates downstream of this problem:

Layer                   Examples                               What it does                             Position
Observability           LangSmith, Langfuse, Arize, Datadog    Traces what agents did after execution   Downstream
Guardrails              CrewAI task guardrails, OpenAI SDK     Validates what agents produce            Postcondition
Orchestration           LangGraph conditional edges, AutoGen   Routes control flow between agents       Structural
Information Contracts   This framework                         Validates what agents receive            Upstream (precondition)

The upstream boundary is the only position where information loss can be prevented, not merely diagnosed.

Architecture

Four independent, composable modules:

                    ┌──────────────────────────────────────────────┐
                    │         Information Contract Layer           │
                    │  Declare → Intercept → Score → Enforce       │
                    └──────────┬───────────────────┬───────────────┘
                               │                   │
                    ┌──────────▼──────────┐  ┌─────▼──────────────┐
                    │    Coordination     │  │   Experimental     │
                    │ Pattern Observatory │  │     Testbed        │
                    │  MI + Transfer      │  │  5 information     │
                    │  Entropy monitoring │  │  conditions, MARL  │
                    └──────────┬──────────┘  └─────┬──────────────┘
                               │                   │
                    ┌──────────▼───────────────────▼───────────────┐
                    │       Collective Decision Interface          │
                    │  Policy sim · Science orchestration · Gov    │
                    └──────────────────────────────────────────────┘

1. Information Contract Layer

The core design principle: govern what an agent receives, not only what it produces. If information has been silently degraded before an agent receives it, no amount of output evaluation can recover what was lost.

A contract declares what information must survive a handoff:

contract:
  boundary: "researcher_to_analyst"
  required_fields:
    - query_intent
    - source_constraints
    - confidence_intervals
    - methodology_flags
  min_integrity: 0.75
  on_violation: reject_and_log

At runtime, the system intercepts the handoff boundary, scores information preservation, and enforces the contract before the receiving agent processes anything.
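The declare-intercept-score-enforce step can be sketched in a few lines of Python. This is a minimal illustration, not the framework's actual implementation: the `Contract` and `enforce` names are hypothetical, and the integrity score is assumed to arrive precomputed from the real-time scoring track described below.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    """Declarative contract for a single handoff boundary."""
    boundary: str
    required_fields: list
    min_integrity: float
    on_violation: str = "reject_and_log"


def enforce(contract, payload, integrity):
    """Apply the contract to a handoff before the receiving agent sees it.

    `integrity` is assumed to come from the real-time scoring track;
    this function only applies the declared policy.
    Returns (accepted, violations).
    """
    violations = [f"missing field: {f}" for f in contract.required_fields
                  if f not in payload]
    if integrity < contract.min_integrity:
        violations.append(f"integrity {integrity:.2f} < {contract.min_integrity}")
    accepted = not violations or contract.on_violation != "reject_and_log"
    return accepted, violations


# The researcher_to_analyst contract from above
contract = Contract(
    boundary="researcher_to_analyst",
    required_fields=["query_intent", "source_constraints",
                     "confidence_intervals", "methodology_flags"],
    min_integrity=0.75,
)

# Two required fields silently dropped and integrity below threshold: rejected.
payload = {"query_intent": "...", "source_constraints": "..."}
accepted, violations = enforce(contract, payload, integrity=0.62)
```

The receiving agent processes nothing until `enforce` accepts the handoff, which is what makes this a precondition rather than a postcondition check.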

Scoring uses a dual-track architecture:

Track       Method                                  Use case                   Latency
Real-time   Calibrated embedding-similarity proxy   Production enforcement     <1 ms
Offline     KSG mutual information estimator        Measurement, calibration   Minutes

The KSG (Kraskov-Stögbauer-Grassberger) estimator provides rigorous mutual information measurement but scales poorly with dimensionality. The embedding proxy is calibrated against KSG benchmarks and deployed for runtime enforcement.
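A minimal sketch of the real-time track, using a bag-of-words count vector as a stand-in for a real embedding model. The `fidelity`, `scale`, and `offset` names are illustrative: `scale` and `offset` stand in for calibration parameters that would be fit offline against KSG estimates, not values from the framework.

```python
import math
from collections import Counter


def embed(text):
    """Stand-in for a sentence-embedding model: bag-of-words token counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def fidelity(source, received, scale=1.0, offset=0.0):
    """Real-time integrity proxy, clipped to [0, 1].

    `scale` and `offset` represent the calibration against offline KSG
    mutual-information estimates on held-out handoffs.
    """
    return max(0.0, min(1.0, scale * cosine(embed(source), embed(received)) + offset))


full = "forecast GDP with uncertainty bounds and regional variance by quarter"
compressed = "forecast GDP by quarter"   # uncertainty and variance terms dropped
```

Here `fidelity(full, full)` is 1.0 while `fidelity(full, compressed)` falls well below it, the drop that a contract's `min_integrity` threshold would catch.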

2. Coordination Pattern Observatory

Information-theoretic monitoring that makes multi-agent coordination dynamics visible:

  • Mutual information between agent action trajectories measures synchronisation (are agents behaving similarly?)
  • Transfer entropy measures directed causal influence (is Agent A's behaviour causing Agent B's?)
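Both quantities can be estimated with simple plug-in (histogram) estimators over discrete action trajectories. The sketch below is illustrative, using lag-1 transfer entropy; the Observatory's actual estimators are not specified here.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in MI estimate (nats) between two discrete action sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())


def transfer_entropy(src, dst):
    """Plug-in lag-1 transfer entropy (nats): I(dst[t]; src[t-1] | dst[t-1])."""
    n = len(dst) - 1
    triples = Counter(zip(dst[1:], src[:-1], dst[:-1]))
    pair_sd = Counter(zip(src[:-1], dst[:-1]))   # counts of (src[t-1], dst[t-1])
    pair_dd = Counter(zip(dst[1:], dst[:-1]))    # counts of (dst[t], dst[t-1])
    prev = Counter(dst[:-1])                     # counts of dst[t-1]
    return sum((c / n) * math.log((c * prev[d0]) / (pair_sd[s0, d0] * pair_dd[d1, d0]))
               for (d1, s0, d0), c in triples.items())


# Agent B copies Agent A's previous action: influence flows A -> B only.
random.seed(0)
a = [random.randint(0, 1) for _ in range(500)]
b = [0] + a[:-1]
te_ab = transfer_entropy(a, b)   # high: A's past predicts B's present
te_ba = transfer_entropy(b, a)   # near zero: B's past does not predict A
```

The asymmetry between `te_ab` and `te_ba` is exactly the directed signal that distinguishes one agent leading from mutual coordination.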

Together, these classify emergent coordination into three categories:

Cooperation:  High MI, symmetric TE     → agents coordinating toward shared goal
Competition:  Low MI, low TE            → agents acting independently
Collusion:    High MI, asymmetric TE    → unintended coordination, one agent leading
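Given MI and directed TE estimates, the three-way classification reduces to thresholding. The cutoff values below are placeholders for illustration, not values from the framework.

```python
def classify(mi, te_ab, te_ba, mi_high=0.3, asym=0.2):
    """Map (MI, directed TE) estimates to a coordination category.

    `mi_high` and `asym` are illustrative thresholds; in practice they
    would be calibrated per environment.
    """
    if mi < mi_high:
        return "competition"   # low MI: agents acting independently
    if abs(te_ab - te_ba) > asym:
        return "collusion"     # high MI, asymmetric TE: one agent leading
    return "cooperation"       # high MI, symmetric TE: shared goal
```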

This is distinct from agent-drift metrics that track individual behavioural degradation. The Observatory monitors coordination dynamics between agents, making system-level behaviour auditable.

3. Experimental Testbed

Five systematically varied information conditions, applied to the same multi-agent environments:

Condition        What agents observe                 Tests
Full             Complete state information          Baseline
Aggregate-only   Mean values, compressed summaries   Information compression effects
Delayed          True state with k-step lag          Temporal degradation
Noisy            True state + calibrated noise       Signal corruption
Asymmetric       Uneven information across agents    Power imbalances

Environments: DeepMind's Melting Pot (sequential social dilemmas) and the MARL-BC economic simulation framework (Cobb-Douglas production economies with heterogeneous agents).

Agents, tasks, and learning algorithms are held constant; only the information that crosses the boundary varies. Hypothesis: information structure at agent boundaries shapes collective outcomes more than individual agent capability.
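The five conditions can be implemented as observation transforms over the same underlying state. The function below is an illustrative sketch; the function name and parameters (`k`, `noise`) are assumptions, not taken from the testbed.

```python
import random


def apply_condition(state, condition, history=None, k=2, noise=0.1, rng=None):
    """Transform the true state into what an agent observes under one condition."""
    rng = rng or random.Random(0)
    if condition == "full":
        return list(state)                          # baseline: complete state
    if condition == "aggregate_only":
        return [sum(state) / len(state)]            # compressed to a mean
    if condition == "delayed":                      # true state with k-step lag
        return list(history[-k]) if history and len(history) >= k else list(state)
    if condition == "noisy":                        # calibrated Gaussian noise
        return [s + rng.gauss(0, noise) for s in state]
    raise ValueError(f"unknown condition: {condition}")


def asymmetric(states, conditions):
    """Asymmetric condition: each agent observes under a different transform."""
    return [apply_condition(s, c) for s, c in zip(states, conditions)]
```

Because everything else is held fixed, any difference in collective outcomes across conditions is attributable to the information that crossed the boundary.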

4. Collective Decision Interface

Application layer connecting the framework to domains where collective reasoning passes through computational intermediaries before reaching human decision-makers.

Scenario: Multi-domain policy simulation

Agent 1: Macroeconomic modelling
    │
    ├── [Handoff] Uncertainty bounds compressed ← CONTRACT INTERCEPTS HERE
    │
Agent 2: Health & demographic forecasting
    │
    ├── [Handoff] Regional variance flattened  ← CONTRACT INTERCEPTS HERE
    │
Agent 3: Regional resource allocation
    │
    └── Recommendation reaches human decision-makers

Without information contracts, compressed uncertainty at the first handoff silently narrows the range of scenarios downstream agents evaluate. The recommendations appear robust but are derived from a truncated possibility space; populations in the tails of the distribution, those most affected by the policy, are the ones whose outcomes were quietly dropped.

With an information contract at each boundary, this narrowing is detected and surfaced before it propagates.

Why Upstream, Not Downstream

The existing ecosystem validates outputs or traces execution. Information contracts validate inputs. These are complementary positions in the agent execution stack:

Agent A produces output
       │
       ├── Output guardrails check format, safety, hallucination (postcondition)
       │
       ▼
   HANDOFF BOUNDARY
       │
       ├── Information contract scores fidelity of context transfer (precondition)
       │
       ▼
Agent B receives input
       │
       ├── Observability platform logs what Agent B does (downstream trace)
       │
       ▼
Agent B produces output

The contract layer is the only position where degraded context is caught before it enters a reasoning process. Once an agent has processed corrupted input, the information loss is irrecoverable.

Alignment: Collective Flourishing

ARIA identifies systemic fog, the opacity that prevents societies from navigating the future, as a defining barrier to collective flourishing. The foundational technologies for modelling, simulating, and coordinating are maturing, but the integration layer connecting them remains under-explored.

This framework operates directly at that integration layer:

  • The Observatory makes systemic complexity legible, converting opaque multi-agent dynamics into auditable coordination patterns
  • The Contract Layer provides a new coordination mechanism, a declarative way to govern what reasoning processes require at their input boundaries
  • The Testbed builds the empirical evidence base for designing coordination architectures rather than just implementing them
  • When information integrity is maintained, collective reasoning becomes a genuine augmentation of human deliberative capacity rather than a source of unobserved distortion

Status

This framework is the subject of an active research programme combining an MSc dissertation (UCL, completing August 2026) with a fellowship application to Encode: AI for Science (Cohort 2, backed by Pillar VC and ARIA). The Contract Layer and MI estimation engine form the first build phase, followed by the Coordination Observatory and full MARL experimental validation.

References

  1. Hill, Koh & Jishnuanandh (NeurIPS 2025), "Communicating Plans, Not Percepts": agents communicating compressed latent representations achieve 99.9% coordination success vs 12.2% for raw observation passing. Independent validation that information structure at agent boundaries determines collective performance.

  2. Lin, Dong, Hao & Zhang (NeurIPS 2023), "Information Design in Multi-Agent Reinforcement Learning": demonstrates the revelation principle fails when both sender and receivers are learning agents. Classical information-theoretic results do not transfer directly to multi-agent learning systems.

  3. Johanson, Hughes, Timbers & Leibo (DeepMind, 2022), "Emergent Bartering Behaviour in Multi-Agent Reinforcement Learning": RL agents develop supracompetitive pricing not predicted by conventional theory, establishing the disconnect between autonomous agent behaviour and designed models.

Author

Gian Prem Rajaram
MSc Computer Science, University College London
gian.rajaram.23@ucl.ac.uk
