Change Implementation Agent

◆ Supervised Worker

Takes a well-specified ticket (a dependency bump, a flaky-test fix, a small feature behind a flag), implements it, writes the tests, runs the build-and-test loop in a sandbox, and opens a PR for the review agent to gate. It handles bounded, scoped changes and queries the originating agent when a ticket is underspecified.
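The flow described above can be sketched as a small control loop. This is only an illustration under stated assumptions: `run_ticket`, `RunResult`, the four callables and the `max_attempts` cap are hypothetical names introduced here, not part of any real harness API.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RunResult:
    status: str    # "pr_opened", "needs_clarification", or "gave_up"
    attempts: int  # number of implement/test iterations used


def run_ticket(
    is_well_specified: Callable[[], bool],
    implement: Callable[[int], None],
    build_and_test: Callable[[], bool],
    open_pr: Callable[[], None],
    max_attempts: int = 3,  # assumed retry cap, not stated in the source
) -> RunResult:
    """Drive one ticket: clarify or implement, then iterate until the sandbox build is green."""
    if not is_well_specified():
        # Query the originating agent instead of guessing.
        return RunResult("needs_clarification", 0)
    for attempt in range(1, max_attempts + 1):
        implement(attempt)      # write or revise the change and its tests
        if build_and_test():    # sandboxed build-and-test run
            open_pr()           # handed to the review agent; no self-merge
            return RunResult("pr_opened", attempt)
    return RunResult("gave_up", max_attempts)
```

Passing the steps in as callables keeps the sketch independent of any particular Git, sandbox, or issue-tracker client.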

Memory

Working: The ticket, the plan, the files changed and the test results.

Episodic: Similar past changes in this codebase.

Semantic: The codebase architecture, conventions and build system.

Procedural: Implementation patterns that passed review in this repo.

Store: Repo-context retrieval + change-history store

Orchestration

pipeline · MCP · A2A

Harness · Managed Agents: session per ticket; sandboxed code-exec for build + test loops; structured note-taking across multi-file changes.

Tools

Git / repo API
Build + test sandbox (code exec)
Issue tracker API
Originating-agent clarification channel (A2A)

Evals & guardrails

Every change goes through the Code Review Agent and its judge gate before merge; the agent never self-merges.
Must pass CI in the sandbox before opening a PR; red builds aren't submitted.
Scope guardrail: refuses to touch files outside the ticket's stated blast radius.
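The scope and red-build guardrails above could be checked before a PR is opened roughly as follows. This is a minimal sketch that assumes the ticket states its blast radius as glob patterns; the `Ticket` shape and both function names are hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Ticket:
    # Hypothetical ticket shape: the blast radius is a tuple of glob patterns.
    ticket_id: str
    blast_radius: tuple[str, ...]


def out_of_scope(ticket: Ticket, changed_files: list[str]) -> list[str]:
    """Return changed files that match none of the ticket's allowed patterns."""
    return [
        path for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in ticket.blast_radius)
    ]


def may_open_pr(ticket: Ticket, changed_files: list[str], ci_passed: bool) -> tuple[bool, str]:
    """Apply both pre-PR gates: scope first, then the sandbox CI result."""
    stray = out_of_scope(ticket, changed_files)
    if stray:
        return False, "out of scope: " + ", ".join(stray)
    if not ci_passed:
        return False, "sandbox CI is red"
    return True, "ready for the Code Review Agent"
```

Note that `fnmatchcase` lets `*` match across `/`, so a pattern such as `services/billing/*` covers the whole subtree in this sketch.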

Frontier edge

▲ Long-horizon autonomy: drives a multi-file change end to end (implement, test, iterate on red builds) across a multi-hour run, well along the METR time-horizon curve.
▲ World-model simulation: runs the build-and-test loop in a sandbox to verify the change before opening a PR, so red builds aren't submitted.
▲ Proactive scoping: detects an underspecified ticket and asks before coding, rather than guessing and producing a plausible-but-wrong patch.
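As a rough illustration of proactive scoping, a pre-flight check might turn missing ticket fields into questions for the originating agent. The field names and question wording below are assumptions made for this sketch; a real agent would more likely rely on model judgement than on a fixed checklist.

```python
# Hypothetical required fields, each paired with the question to ask when it is missing.
REQUIRED_FIELDS = {
    "acceptance_criteria": "What observable behaviour shows the ticket is done?",
    "blast_radius": "Which files or directories may this change touch?",
    "target": "Which package, test, or flag does this ticket concern?",
}


def clarification_questions(ticket: dict) -> list[str]:
    """Return one question per required field that is missing, blank, or an empty list."""
    questions = []
    for field, question in REQUIRED_FIELDS.items():
        value = ticket.get(field)
        is_blank_text = isinstance(value, str) and not value.strip()
        if value is None or is_blank_text or value == []:
            questions.append(question)
    return questions
```

An empty result means the ticket is specified well enough to start coding; any other result is sent over the clarification channel before work begins.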

In numbers

58%

Scoped tickets auto-implemented

3.5x

Backlog toil-ticket throughput

Handoffs

Hands to → Code Review Agent