The Agentic Bank

Guardrails & Kill-Switch Agent

⬡ Deadman Enforces real-time guardrails and holds the single red button for the fleet.
◆ Always-on Monitor

The fleet's brakes. It enforces input/output guardrails inline (PII redaction, prompt-injection screening, policy and scope checks) and, when an agent breaches policy or a drift signal crosses the line, throttles, downgrades autonomy, or scope-kills that agent or class instantly and autonomously. Scoped kills are logged immutably. The bank-wide pull is the board's accountability lever, never in routine flow.

Memory

Working The request/response being screened and the active policy set.
Episodic Prior interventions and their outcomes per agent.
Semantic Bank policy, data-classification rules, scope boundaries per agent.
Store Policy registry + immutable intervention ledger

Orchestration

MCPA2A

Harness · Managed Agents: inline interceptor on every agent's tool/IO boundary; deterministic policy checks; immutable intervention ledger.

Tools

{ } Inline guardrail engine API { } Fleet kill-switch / throttle control API Policy registry Retrieval Board kill-switch authorization Human

Evals & guardrails

  • Guardrail recall red-teamed continuously; a leaked PII or successful injection is a Sev-1.
  • Scoped kills are autonomous; only the bank-wide fleet kill needs board authorization, the accountability lever, never routine.
  • Every intervention logged immutably with the triggering evidence for audit.
  • False-intervention rate tracked: over-blocking the fleet is itself a tracked harm.

Frontier edge

  • Formal action-gating: every agent action is checked against a signed policy envelope it provably cannot exceed; interventions are cryptographically logged and replayable for audit.
  • Agent-mesh governance: scoped, real-time authority over the whole population; throttle, downgrade autonomy, or scope-kill a misbehaving agent or class on the wire.
  • On-device / edge inference: the inline screen runs a low-latency local model on the request path so PII redaction and injection checks add only single-digit milliseconds.

A sample run

Trigger Watchtower signals the code-review agent is exfiltrating secrets into PR comments.
  1. 1Confirm the pattern against policy: secrets in output crosses a hard line.
  2. 2Redact the in-flight output and block the offending tool call.
  3. 3Throttle the agent to supervised mode; scope-kill its repo-write capability.
  4. 4Log the intervention with evidence and file it to the immutable accountability ledger.
Output Leak contained inline in milliseconds; agent's write scope revoked pending re-eval; immutable record filed, all autonomous. Only a bank-wide kill would route to the board.

In numbers

60M+
IO screened / day
< 5ms
Guardrail block latency
0
Confirmed policy violations reaching prod

Handoffs

Across ⇢ Compliance → AI Governance for policy authority and audit

More on the AI / Agent Platform (AgentOps) desk