Fleet Observability Agent

◆ Always-on Monitor

The single pane of glass over the whole fleet. It ingests OpenTelemetry GenAI traces from every agent run (tokens, tool calls, latencies, decisions) and detects the signals that offline evals cannot catch: behavioural drift, cost blowups, tool-error spikes, a sliding judge score. On detection, it signals the Guardrails agent over A2A.
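As a rough illustration of the ingestion side, the sketch below rolls span attributes up into per-agent token totals and a tool-error rate. The attribute keys follow the OpenTelemetry GenAI semantic conventions as commonly published and should be checked against the convention version in use; the `summarize` function, the plain-dict span shape, and the sample values are assumptions made for this example, not the agent's actual pipeline.

```python
from collections import defaultdict

# Attribute keys as commonly published in the OpenTelemetry GenAI semantic
# conventions; verify them against the convention version in use.
AGENT = "gen_ai.agent.name"
OP = "gen_ai.operation.name"
IN_TOK = "gen_ai.usage.input_tokens"
OUT_TOK = "gen_ai.usage.output_tokens"
ERR = "error.type"


def summarize(spans):
    """Aggregate token usage and tool-error rate per agent (hypothetical helper).

    Each span is represented here as a plain dict of attributes.
    """
    acc = defaultdict(lambda: {"tokens": 0, "tool_calls": 0, "tool_errors": 0})
    for attrs in spans:
        row = acc[attrs.get(AGENT, "unknown")]
        row["tokens"] += attrs.get(IN_TOK, 0) + attrs.get(OUT_TOK, 0)
        if attrs.get(OP) == "execute_tool":
            row["tool_calls"] += 1
            if ERR in attrs:  # a set error.type marks the tool call as failed
                row["tool_errors"] += 1
    summary = {}
    for agent, row in acc.items():
        calls = row["tool_calls"]
        summary[agent] = {
            "tokens": row["tokens"],
            "tool_error_rate": row["tool_errors"] / calls if calls else 0.0,
        }
    return summary


# Illustrative spans only; the values are made up.
spans = [
    {AGENT: "soc-triage", OP: "chat", IN_TOK: 1200, OUT_TOK: 300},
    {AGENT: "soc-triage", OP: "execute_tool"},
    {AGENT: "soc-triage", OP: "execute_tool", ERR: "timeout"},
]
summary = summarize(spans)
```

Per-agent aggregates of this kind are what a rolling window would compare against each agent's baseline.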

Memory

Working: The rolling window of fleet metrics and the anomaly under inspection.

Episodic: Per-agent behavioural baselines and prior drift events.

Semantic: OpenTelemetry GenAI semantic conventions; what 'healthy' looks like per agent.

Store: Trace warehouse + per-agent baseline store

Orchestration

swarm · MCP · A2A

Harness · Managed Agents: always-on session over the trace stream; compaction keeps rolling fleet state in scope; sandboxed code-exec for drift statistics.

Tools

OpenTelemetry GenAI trace pipeline (API)
Metrics + cost dashboards (API)
Drift-detection sandbox (Code exec)
Guardrails agent (A2A)

Evals & guardrails

Anomaly-detection precision is tracked; a missed production regression is treated as a control failure.
Drift thresholds per agent are versioned and reviewable.
Traces retained immutably for post-incident forensics and regulatory exam.
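One way to make per-agent drift thresholds versioned and reviewable is an append-only history in which each change creates a new immutable entry that records its reviewer. The sketch below is a minimal illustration under that assumption; `DriftThreshold`, `ThresholdRegistry`, the field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DriftThreshold:
    """One immutable threshold version (hypothetical schema)."""
    agent: str
    version: int
    metric: str
    max_drop: float   # largest tolerated drop versus baseline, in metric units
    approved_by: str  # reviewer recorded with the change


class ThresholdRegistry:
    """Append-only history: new versions are added, earlier ones never edited."""

    def __init__(self):
        self._history = {}

    def propose(self, agent, metric, max_drop, approved_by):
        versions = self._history.setdefault((agent, metric), [])
        entry = DriftThreshold(agent, len(versions) + 1, metric, max_drop, approved_by)
        versions.append(entry)
        return entry

    def current(self, agent, metric):
        return self._history[(agent, metric)][-1]

    def history(self, agent, metric):
        # A tuple copy, so callers cannot mutate the stored history.
        return tuple(self._history[(agent, metric)])


# Illustrative values only.
registry = ThresholdRegistry()
registry.propose("soc-triage", "judge_accuracy", 0.05, approved_by="reviewer-a")
registry.propose("soc-triage", "judge_accuracy", 0.03, approved_by="reviewer-b")
```

Keeping superseded versions means a past alert can be replayed against the threshold that was in force at the time.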

Frontier edge

▲ Causal drift attribution: traces a sliding judge score back to its actual cause (a malformed upstream feed) via counterfactual reasoning, not a correlated dashboard wiggle.
▲ Agent-mesh governance: the fleet-wide vantage point that lets it negotiate (A2A) real-time interventions with Guardrails the instant a population starts to misbehave.
▲ Continual baselining: each agent's behavioural 'healthy' profile updates online, so drift is measured against a living baseline rather than a stale snapshot.
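Continual baselining can be pictured with an exponentially weighted mean and variance that are updated after each observation. This is a minimal sketch of one common online technique, not necessarily the method the agent uses; the `OnlineBaseline` class, the `alpha` value, and the sample scores are assumptions made for the example.

```python
import math


class OnlineBaseline:
    """Exponentially weighted mean and variance of one metric for one agent."""

    def __init__(self, alpha=0.05):
        self.alpha = alpha  # higher alpha forgets old behaviour faster
        self.mean = None
        self.var = 0.0

    def zscore(self, x):
        """Distance of x from the current baseline, in standard deviations."""
        if self.mean is None or self.var == 0.0:
            return 0.0
        return (x - self.mean) / math.sqrt(self.var)

    def update(self, x):
        """Fold a new observation into the baseline."""
        if self.mean is None:
            self.mean = x
            return
        delta = x - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)


# Illustrative scores only: a healthy period, then a sharp drop.
baseline = OnlineBaseline(alpha=0.05)
for i in range(50):
    baseline.update(0.90 if i % 2 == 0 else 0.92)
z_drop = baseline.zscore(0.86)
```

Scoring an observation before folding it in, and holding back updates while an anomaly is under inspection, keeps a regression from being absorbed into the baseline it is measured against.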

A sample run

Trigger: The SOC Triage Agent's judge-sampled accuracy slides 4 points over 6 hours.

1. Detect the drift against the agent's baseline; rule out a traffic-mix shift.
2. Trace the regression to a new alert source feeding malformed enrichment context.
3. Quantify blast radius: which dispositions in the window are now suspect.
4. Signal Guardrails (A2A) and open a flagged case for the owning team.

Output: Real-time alert with the root-caused trace; Guardrails tightens the agent to supervised mode on that alert source while the upstream feed is fixed.
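Steps 1 and 3 of the run can be sketched as follows. Ruling out a traffic-mix shift is shown here with direct standardization: current per-stratum accuracy is reweighted to the baseline traffic mix, and if the drop survives the reweighting, the mix does not explain it. The blast radius is then a filter over dispositions from the suspect source inside the window. All function names, strata, record fields, and counts are hypothetical and chosen only to mirror a 4-point slide.

```python
def accuracy(strata):
    """Overall accuracy from {stratum: (n_runs, n_correct)}."""
    total = sum(n for n, _ in strata.values())
    return sum(c for _, c in strata.values()) / total


def mix_adjusted_accuracy(current, baseline):
    """Current per-stratum accuracy reweighted to the baseline traffic mix."""
    shared = [s for s in baseline if current.get(s, (0, 0))[0] > 0]
    weight_total = sum(baseline[s][0] for s in shared)
    return sum(
        (baseline[s][0] / weight_total) * (current[s][1] / current[s][0])
        for s in shared
    )


def blast_radius(dispositions, source, start_hour, end_hour):
    """IDs of dispositions from the suspect source inside the drift window."""
    return [
        d["id"]
        for d in dispositions
        if d["source"] == source and start_hour <= d["hour"] < end_hour
    ]


# Illustrative counts only: (runs, judged correct) per alert category.
baseline = {"phishing": (500, 470), "malware": (300, 282), "login": (200, 188)}
current = {"phishing": (400, 360), "malware": (400, 360), "login": (200, 180)}

raw_drop = accuracy(baseline) - accuracy(current)
adjusted_drop = accuracy(baseline) - mix_adjusted_accuracy(current, baseline)
# The drop is the same size after reweighting, so the mix shift is ruled out.

dispositions = [
    {"id": "d1", "source": "edr", "hour": 2},
    {"id": "d2", "source": "new-feed", "hour": 3},
    {"id": "d3", "source": "new-feed", "hour": 5},
    {"id": "d4", "source": "new-feed", "hour": 9},
]
suspect = blast_radius(dispositions, "new-feed", 0, 6)
```

If the adjusted drop had shrunk toward zero, the slide would have been mostly a change in what traffic arrived rather than a change in how the agent handled it.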

In numbers

100% · Agent runs traced

< 10 min · Mean time to detect drift

190M · Fleet spans / day

Handoffs

Hands to → Guardrails & Kill-Switch Agent → Eval Harness Agent → Incident Response Agent