The Agentic Bank
TEC Engine room · 6 desks · 14 agents

Technology, Data & AI Platform

The agentic control plane and the control tower that governs the fleet.

Platform engineering, SRE, cybersecurity, data governance, software delivery and the AI/Agent platform itself. High-volume operational workloads: alerts, pull requests, tickets, lineage. AgentOps is the control plane that deploys, evals, traces and guardrails the bank's agent fleet, kill-switch included.

How it runs

Incidents detect and remediate themselves; the SOC swarm triages and hunts continuously; code reviews itself and opens its own PRs; lineage stays live. AgentOps runs every agent on one harness: versioned prompts, gold-set regression evals in CI, OpenTelemetry GenAI traces, real-time guardrails and a single scoped kill-switch. The loop runs agent-to-agent; the board sets the mandate and holds the bank-wide kill-switch.

Platform & Site Reliability (SRE)

2 agents

Keeps the bank's services up: monitoring, incident detection, auto-remediation, capacity, deploys and the chaos of production.

Workflow · Signal → alert → agentic triage → diagnose → remediate → post-mortem. Known incidents run runbook autopilot (restart, scale, roll back); novel outages route to a responder agent that reasons them end to end.
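
The split between runbook autopilot and the responder agent can be sketched as a lookup with a fallback. This is a minimal illustration, not the bank's implementation: the incident signatures, the RUNBOOKS table and the route_incident function are all hypothetical names; only the three actions (restart, scale, roll back) come from the workflow above.

```python
from dataclasses import dataclass

# Hypothetical table: known incident signature -> runbook action.
# The signatures are invented for illustration.
RUNBOOKS = {
    "pod_crash_loop": "restart",
    "cpu_saturation": "scale",
    "bad_deploy": "roll_back",
}


@dataclass(frozen=True)
class Incident:
    signature: str  # classifier output from agentic triage (assumed)
    service: str


def route_incident(incident: Incident) -> tuple[str, str]:
    """Return (handler, action) for an incident.

    Known signatures go to runbook autopilot; anything else is treated
    as novel and handed to the responder agent.
    """
    action = RUNBOOKS.get(incident.signature)
    if action is not None:
        return ("runbook_autopilot", action)
    return ("responder_agent", "investigate")
```

The point of the fallback is that an unrecognised signature never silently maps to an automated action; it always reaches the agent that reasons about it.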

Cybersecurity / SOC

2 agents

Defends the bank: SIEM alert triage, threat hunting, vulnerability management and incident response across the security estate.

Workflow · Detection → triage → investigation → containment → threat hunt → remediation, an agentic swarm working a flood of mostly-benign alerts at machine speed.

Data Governance & Quality

2 agents

Keeps the bank's data trustworthy: quality monitoring, lineage, cataloguing, and the governance that regulators (BCBS 239) demand.

Workflow · Ingest → profile → quality checks → lineage capture → catalogue → certify. Bad batches are quarantined before they reach downstream models, reports and agents.
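
A quarantine gate of this kind might look like the sketch below. The two metrics (null rate and duplicate rate on a key column), the thresholds and the function names are assumptions chosen for illustration; the source does not say which checks the desk actually runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityReport:
    null_rate: float       # share of rows with a missing key
    duplicate_rate: float  # share of rows that repeat an earlier key


def profile(rows: list[dict], key: str) -> QualityReport:
    """Profile one batch on a single key column (hypothetical check set)."""
    total = len(rows)
    if total == 0:
        # Assumption: an empty batch is treated as fully null, so it fails the gate.
        return QualityReport(null_rate=1.0, duplicate_rate=0.0)
    present = [r[key] for r in rows if r.get(key) is not None]
    nulls = total - len(present)
    duplicates = len(present) - len(set(present))
    return QualityReport(nulls / total, duplicates / total)


def gate(report: QualityReport,
         max_null_rate: float = 0.01,
         max_duplicate_rate: float = 0.0) -> str:
    """Return 'quarantine' if any threshold is breached, else 'certify'."""
    if report.null_rate > max_null_rate or report.duplicate_rate > max_duplicate_rate:
        return "quarantine"
    return "certify"
```

A batch that returns "quarantine" would be held back from downstream models, reports and agents until it is repaired or waived.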

AI / Agent Platform (AgentOps)

5 agents

The control tower. Deploys, versions, evals, traces and guardrails every agent in the bank, with a scoped kill-switch. The shared harness, eval rig, observability plane, prompt/version registry and A2A/MCP service registry the whole fleet runs on.

Workflow · Register → version prompts/tools → CI regression eval → deploy (champion/challenger) → trace + guardrail in prod → drift-detect → offline consolidate → re-eval. The agentic control plane governs the fleet; the board sets the mandate and holds the bank-wide kill-switch.
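
One common way to run a champion/challenger deploy is a deterministic traffic split, sketched below. This is an assumption about the mechanism, not something the source specifies: pick_variant, the request_id key and the 10% default share are hypothetical.

```python
import hashlib


def pick_variant(request_id: str, challenger_share: float = 0.1) -> str:
    """Deterministically assign a request to 'champion' or 'challenger'.

    Hashing the request id (rather than drawing a random number) keeps
    the assignment stable across retries and replays.
    """
    if not 0.0 <= challenger_share <= 1.0:
        raise ValueError("challenger_share must be between 0 and 1")
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Map the first 4 bytes to a bucket in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "challenger" if bucket < challenger_share else "champion"
```

Setting challenger_share to 0.0 sends all traffic to the champion, which gives a simple rollback path if the challenger misbehaves.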

Eval Harness Agent
Runs the gold-set, judge and red-team evals that gate every agent release.
Gates every agent release. It runs gold-set regression suites in CI, orchestrates LLM-as-judge and agent-as-judge scoring, fires adversarial red-team prompts, and runs champion/challenger bake-offs before any new prompt or model version reaches production. A drop on any safety-critical suite blocks the release.
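
The rule "a drop on any safety-critical suite blocks the release" can be sketched as a comparison of challenger scores against the champion's. The suite names, the score dictionaries and release_allowed are hypothetical; the real harness presumably carries far more context per suite.

```python
def release_allowed(champion: dict[str, float],
                    challenger: dict[str, float],
                    safety_critical: set[str],
                    tolerance: float = 0.0) -> tuple[bool, list[str]]:
    """Return (allowed, blocking_suites).

    A release is blocked if the challenger scores lower than the
    champion on any safety-critical suite by more than `tolerance`.
    A suite the challenger did not run counts as a score of 0.0, and a
    suite with no champion baseline uses 0.0 as the baseline (both are
    assumptions of this sketch).
    """
    blocking = [
        suite
        for suite in sorted(safety_critical)
        if challenger.get(suite, 0.0) < champion.get(suite, 0.0) - tolerance
    ]
    return (not blocking, blocking)
```

Regressions on suites outside the safety-critical set do not block in this sketch; they would be a matter for the champion/challenger bake-off instead.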
Fleet Observability Agent
Traces, monitors and drift-detects every agent in production in real time.
The single pane of glass over the whole fleet. It ingests OpenTelemetry GenAI traces from every agent run (tokens, tool calls, latencies, decisions) and detects the signals evals cannot catch offline: behavioural drift, cost blowups, tool-error spikes, a sliding judge score. On detection it signals Guardrails over A2A.
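
A "sliding judge score" can be caught with something as simple as a rolling mean compared against a baseline. The sketch below assumes that approach; DriftDetector, the window size and the max_drop threshold are hypothetical, and a production detector would likely use a proper statistical test.

```python
from collections import deque
from statistics import mean


class DriftDetector:
    """Flag drift when the rolling mean falls too far below a baseline."""

    def __init__(self, baseline: float, window: int = 5, max_drop: float = 0.05):
        self.baseline = baseline
        self.max_drop = max_drop
        self.scores: deque[float] = deque(maxlen=window)

    def observe(self, score: float) -> bool:
        """Record one judge score; return True if drift is detected."""
        self.scores.append(score)
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough data to judge yet
        return self.baseline - mean(self.scores) > self.max_drop
```

A True result is the kind of signal that would be passed to the Guardrails & Kill-Switch Agent over A2A.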
Guardrails & Kill-Switch Agent
Enforces real-time guardrails and holds the single red button for the fleet.
The fleet's brakes. It enforces input/output guardrails inline (PII redaction, prompt-injection screening, policy and scope checks) and, when an agent breaches policy or a drift signal crosses its threshold, throttles it, downgrades its autonomy, or scope-kills that agent or agent class instantly and autonomously. Scoped kills are logged immutably. The bank-wide kill-switch is the board's accountability lever and is never part of routine flow.
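
The escalation ladder and the kill log can be sketched together. Everything here is an assumption for illustration: the severity thresholds, the decide function and the KillLog class are hypothetical. Note that a hash chain makes tampering detectable rather than impossible; true immutability would need write-once storage behind it.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that anchors the first entry


def decide(severity: float) -> str:
    """Map a severity in [0, 1] to an action (thresholds are illustrative)."""
    if severity >= 0.9:
        return "scope_kill"
    if severity >= 0.6:
        return "downgrade_autonomy"
    if severity >= 0.3:
        return "throttle"
    return "allow"


class KillLog:
    """Append-only log in which each entry commits to the previous one."""

    def __init__(self) -> None:
        self._entries: list[dict] = []

    @staticmethod
    def _digest(prev: str, event: dict) -> str:
        payload = json.dumps(event, sort_keys=True)
        return hashlib.sha256((prev + payload).encode("utf-8")).hexdigest()

    def append(self, event: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else GENESIS
        digest = self._digest(prev, event)
        self._entries.append({"event": event, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev or self._digest(prev, entry["event"]) != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

Usage would be along the lines of `log.append({"agent": "fx-agent", "action": decide(0.95)})`, followed by periodic `log.verify()` checks.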
Agent Registry & Protocol Agent
Operates the A2A/MCP/AP2 service registry and version control for the fleet.
The fleet's directory and DNS. It registers every agent, its tools and its A2A capability card, brokers MCP server discovery, governs versioned prompts and tool schemas, and underwrites AP2-based agent-to-agent payment mandates. It resolves one agent's request for another to a trusted, version-pinned, in-policy peer.
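
Resolution to a "trusted, version-pinned, in-policy peer" might be sketched as three filters applied in turn. AgentCard here is a simplified stand-in, not the A2A agent card schema, and Registry, pins and allowed are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentCard:
    name: str
    version: str
    capabilities: frozenset
    trusted: bool


class Registry:
    def __init__(self) -> None:
        self._cards: dict[tuple[str, str], AgentCard] = {}

    def register(self, card: AgentCard) -> None:
        self._cards[(card.name, card.version)] = card

    def resolve(self, capability: str,
                pins: dict[str, str],
                allowed: set[str]) -> Optional[AgentCard]:
        """Find a peer offering `capability`.

        pins:    agent name -> the exact version the caller is pinned to
        allowed: agent names the caller's policy permits it to call
        """
        for name, version in sorted(pins.items()):
            card = self._cards.get((name, version))
            if (card is not None and card.trusted
                    and name in allowed
                    and capability in card.capabilities):
                return card
        return None  # no trusted, pinned, in-policy peer exists
```

Returning None rather than falling back to an unpinned version is deliberate: the caller should fail closed instead of reaching a peer that was never evaluated.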
Fleet Consolidation Agent
Offline experience-replay that consolidates fleet memory and proposes improvements.
An offline batch job, not a live actor. It replays the day's agent trajectories (Reflexion- and SEAL-style), distils repeated corrections into procedural-memory updates, consolidates episodic logs into semantic facts, and drafts candidate prompt/playbook improvements. Every proposal routes through Crucible's evals and an independent oversight-agent gate before it ships.

Software Delivery & Code Review

2 agents

Ships the bank's software: PR generation, automated code review, test generation, dependency hygiene and release management.

Workflow · Ticket → branch → implement → PR → review → CI → merge → release, an agentic loop where the implementer agent and an independent review agent gate every change.
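
The two-agent gate can be reduced to a merge predicate: CI must pass and at least one approval must come from someone other than the author. PullRequest and can_merge are hypothetical names for this sketch; real merge policies usually add more conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequest:
    author: str           # the implementer agent
    approvals: frozenset  # identities that approved the change
    ci_passed: bool


def can_merge(pr: PullRequest) -> bool:
    """Allow a merge only with green CI and an independent approval."""
    independent = pr.approvals - {pr.author}  # self-approval does not count
    return pr.ci_passed and bool(independent)
```

Discarding the author's own approval is what keeps the review agent independent of the implementer agent.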

IT Service Desk

1 agent

The internal help desk: access requests, password resets, provisioning, and the long tail of employee IT tickets.

Workflow · Ticket raised → triage → resolve (knowledge base / automation) or route to L2/L3. The bulk is high-volume: access requests, password resets, software installs, provisioning.