Where agents belong in the loop
Reversibility, observability, blast radius. The criteria for handing off.
A typical engineering ticket comes in pieces: design a small feature, write the contracts, build the UI, add unit and end-to-end tests, run the database migration, deploy it. One common pattern is to hand the entire ticket to an AI coding agent and ask it to ship. Another is to keep agents away from the ticket until the next model lands. Both treat the ticket as one job. It isn't. It is a stack of slices with very different reversibility, observability, blast radius, and substrate profiles, and each slice deserves its own placement decision.
This essay is a playbook for placing agents slice by slice. The lever you control is the substrate the agent runs on, not the model itself.
Place agents per slice instead of per ticket
Stop treating a ticket as a single placement decision; make the call per slice. The answer can be agent, gated agent, or human, and it can vary across slices inside the same ticket.
The slice menu I carry into every placement conversation:
- spec gathering
- solution document
- contracts and schema
- UI implementation
- unit tests
- end-to-end tests
- code review
- database migrations
- deployment and rollout
- grunt-work automation (release notes, changelogs)
Each item scores differently. UI implementation scores high on reversibility and is well bounded by typed components. Database migrations score low on reversibility and carry a large blast radius. Treating the whole ticket as one job forces one decision on very different work. Treating each slice separately gets you to a different answer per slice, in the same ticket, with the same model.
Score the slice on three criteria
Before handing a slice to an agent, score it on three things. These are slice properties; the model does not affect them.
Reversibility. Can the action be undone in minutes, locally, without coordinating with someone else? Editing a file in a branch is reversible. Sending a message to a customer is not, and neither is dropping a database column. The faster and more local the rollback, the more autonomy a slice can have.
Observability. Can a human read what the agent did fast enough, and clearly enough, to catch a wrong answer before damage is done? A diff is observable. A multi-step tool sequence with side effects across systems is less observable. The worst case is when the output looks fine, is wrong, and nobody finds out for weeks.
Blast radius. How many records, customers, services, or dollars does one action touch? A sandbox with one synthetic record is small. A production database write at scale is large. Blast radius is how big the regret is if you are wrong.
The mistake I see most often is scoring on one lens and skipping the others. The typical version: a team optimizes for reversibility, decides "we can roll back", and ignores observability, so nobody notices that a rollback is needed. The agent then quietly produces wrong output for weeks.
Because these are properties of the slice, even a capable model on a low-reversibility, low-observability slice still has to be gated. The model does not raise the slice's score; the substrate does.
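One way to make the three scores explicit is a small record per slice plus a rule that maps scores to a placement. The sketch below is illustrative only: the 1 to 3 scale, the thresholds, the example scores, and the names `SliceScore` and `place` are assumptions made for this example, not something the essay prescribes.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    AGENT = "agent"
    GATED_AGENT = "gated agent"
    HUMAN = "human"


@dataclass(frozen=True)
class SliceScore:
    """Scores on a hypothetical 1 (worst) to 3 (best) scale.

    For blast radius, 3 means a small radius and 1 means a large one.
    """
    name: str
    reversibility: int
    observability: int
    blast_radius: int


def place(score: SliceScore) -> Placement:
    """Map a slice's scores to a placement using the weakest lens.

    Taking the minimum means a strong score on one lens cannot
    compensate for a weak score on another.
    """
    weakest = min(score.reversibility, score.observability, score.blast_radius)
    if weakest >= 3:
        return Placement.AGENT
    if weakest == 2:
        return Placement.GATED_AGENT
    return Placement.HUMAN


# Illustrative scores only; a real team would assign its own.
ui = SliceScore("UI implementation", reversibility=3, observability=3, blast_radius=3)
migration = SliceScore("database migration", reversibility=1, observability=2, blast_radius=1)
backfill = SliceScore("nightly data backfill", reversibility=3, observability=1, blast_radius=2)

for s in (ui, migration, backfill):
    print(f"{s.name}: {place(s).value}")
```

The hypothetical backfill slice is the "we can roll back" case: it scores well on reversibility, but the weakest-lens rule still keeps it with a human because nobody would see the wrong output in time.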
Build the substrate; that's the lever you control
Watch two teams with the same model access work on the same slice. One team's agent ships useful work; the other's is unusable. The difference is rarely the model; it is the substrate.
Substrate is the environment the agent runs in. It decides what the agent can be trusted with. Five components, all of which you control:
- Institutional memory the agent can read at runtime: the team's decisions, conventions, and in-flight context.
- General AI infrastructure: the IDE setup, the model access tier the org pays for, and the in-house automations already in place.
- Tool coverage: typed connectors into the systems the agent acts on (the design tool, the component library, the database, the build pipeline).
- Deterministic verification: typed contracts, lint, CI, and tests dense enough that the agent's output can be checked without a human in the path.
- Reviewer bandwidth: humans who can read agent output at the rate it arrives, without a queue silently growing behind them.
A few months back my team was choosing where to first place an agent. We picked UI implementation: the slice scored well on reversibility and observability (a bad change is a branch revert away, and anything wrong shows up in a diff) and well on blast radius (a broken layout in a branch is the worst case). The first attempts were mediocre. We connected the IDE to the design tool the designers were already using, and added a typed integration to the team's component library. Then we tightened custom rules week after week to catch the mistakes the agent kept making. The model did not change in those weeks; the substrate did. By the end of the run a large fraction of the team's UI work had shifted from human-led to agent-led, on the same model that had been mediocre at the start.
Anthropic's measurement work on real agent traffic makes a similar point: by the 750th session, operators grant auto-approve roughly twice as often as new operators do on the same kinds of tasks. The criteria stay the same; more slices pass them over time as evidence and substrate build up.
Substrate is the lever. Waiting for the next model is the slowest version of this work; investing in substrate now is the fastest.
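The five substrate components listed earlier can be tracked as a per-slice checklist, so that "what is missing" is a concrete list and not a feeling. This is a minimal sketch; the class name `Substrate`, its field names, and the all-or-nothing `ready` rule are assumptions made for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class Substrate:
    """One flag per substrate component; True means good enough for this slice."""
    institutional_memory: bool = False
    ai_infrastructure: bool = False
    tool_coverage: bool = False
    deterministic_verification: bool = False
    reviewer_bandwidth: bool = False

    def gaps(self) -> list[str]:
        """Return the names of components that are not yet in place."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def ready(self) -> bool:
        """A slice's substrate is ready only when no component is missing."""
        return not self.gaps()


# Hypothetical starting point: no design-tool connector, no custom rules yet.
ui_substrate = Substrate(
    institutional_memory=True,
    ai_infrastructure=True,
    reviewer_bandwidth=True,
)
print(ui_substrate.gaps())
```

In this hypothetical starting state, the gaps are tool coverage and deterministic verification, which is where the investment would go while the model stays the same.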
Be pessimistic on low-reversibility, iterative everywhere else
Your stance on a slice splits along reversibility.
Where reversibility is low (database migrations, irreversible deployments, anything irreversible by policy, external messages), keep the agent out of the final action until guardrails are proven and the rollback path is bulletproof. The agent can still help with a draft; the final action is gated.
Where reversibility is high (UI work, unit tests, code review comments, document drafts), the default flips. Put the agent in the loop now and tighten guardrails as you learn. The cost of an unhelpful agent comment is the reviewer skipping it. The cost of a bad migration is not.
This asymmetry lets a team make progress without betting the business. My team introduced agent code review knowing the first iterations would be mostly noise. Reviewers were told to skip the noise comments. We tightened guardrails on what the agent was allowed to comment on, round after round. Each pass, the noise dropped and the signal rose. The total cost of that iteration was limited by a slice property: reviewers ignoring a comment costs almost nothing.
The trap is mistaking "we have an agent here" for "the guardrails work". On a high-reversibility slice the guardrails still matter; you just build them in the open instead of perfecting them first.
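The split along reversibility can be enforced at the point where an action runs. The sketch below assumes a hypothetical `Action` record and `execute` function: reversible actions run directly, while anything else refuses to run until a named human has signed off. The agent can still prepare the action; only the final step is gated.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class ApprovalRequired(Exception):
    """Raised when a low-reversibility action is run without a human sign-off."""


@dataclass
class Action:
    description: str
    reversible: bool        # can it be undone in minutes, locally?
    run: Callable[[], str]  # the side effect itself


def execute(action: Action, approved_by: Optional[str] = None) -> str:
    """Run reversible actions directly; gate everything else on a named human."""
    if not action.reversible and approved_by is None:
        raise ApprovalRequired(f"{action.description!r} needs human approval")
    return action.run()


# Hypothetical actions; the lambdas stand in for real side effects.
review_comment = Action("post review comment", True, lambda: "comment posted")
drop_column = Action("drop database column", False, lambda: "migration applied")

print(execute(review_comment))
try:
    execute(drop_column)
except ApprovalRequired as exc:
    print(f"blocked: {exc}")
```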
Design guardrails for human over-delegation as well as model error
Most guardrail design assumes the agent is the thing that fails. Sometimes the operator is.
A junior team member of ours started delegating everything. Whole slices were going to agents that the per-slice placement would have kept with a human. The guardrails the team had built against model failure caught the pattern early. We had originally designed for "model writes wrong code". The actual catch was "human handed off a slice that should not have left their desk".
The lesson is that the guardrails have to catch human-side failures too. Some of the same mechanisms (review gates, contract enforcement, the deterministic verification substrate above) double as a safety net against over-delegation. If you design only for model error, you miss the operator who hands off work that should have stayed with them. The people most likely to do that are the most junior on the team.
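A guardrail against over-delegation can be as simple as checking each handoff against the team's placement map before the agent starts. The sketch below is hypothetical: the map contents, the three-level ordering, and the `check_handoff` name are assumptions, and slices missing from the map default to the most conservative placement.

```python
# Least to most autonomous. The ordering is an assumption of this sketch.
ORDER = ["human", "gated agent", "agent"]

# Hypothetical placement map; a real team would maintain its own.
PLACEMENT_MAP = {
    "UI implementation": "agent",
    "unit tests": "agent",
    "deployment and rollout": "gated agent",
    "database migrations": "human",
}


def check_handoff(slice_name: str, requested: str) -> bool:
    """Return True if the requested handoff is no more autonomous than the map allows.

    Slices that are not on the map are treated as human-only.
    """
    allowed = PLACEMENT_MAP.get(slice_name, "human")
    return ORDER.index(requested) <= ORDER.index(allowed)


print(check_handoff("UI implementation", "agent"))
print(check_handoff("database migrations", "agent"))
```

The check does not care whether the agent would have produced good output; it only asks whether the operator was allowed to hand the slice off at that level.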
Treat placement as perishable; re-tier as substrate grows
Whatever placement map you draw today will be wrong within a year.
That is not a problem with the criteria; the criteria are stable. The number of slices that pass them keeps growing, because substrate keeps changing. A slice that was "do not hand off" twelve months ago can be "fully autonomous loop" today, on a substrate that did not exist then.
The clearest example on my team is the loop that watches for crash signals, files a ticket, attempts a fix on a narrow class of bugs, and raises a PR. I would not have believed this loop was possible six to twelve months ago. The criteria did not change; the substrate caught up. The five components above crossed the bar one by one. Once every lens passed, the slice moved up a tier.
Make re-tiering routine. Hold the criteria steady. Re-check which slices pass every quarter, or whenever a substrate component changes (a new integration, a model upgrade). The conversation costs you half an hour. Not running it costs you a slowly-wrong map of where agents belong.
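The quarterly re-check reduces to comparing the current placement map with freshly rescored placements and reviewing only what moved. A minimal sketch, assuming placements are stored as plain strings and that a slice missing from the current map starts as human-only; the `retier` name and the example entries are hypothetical.

```python
def retier(current: dict[str, str], rescored: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return the slices whose placement changed, as {slice: (old, new)}."""
    changes = {}
    for name, new in rescored.items():
        old = current.get(name, "human")  # unknown slices start human-only
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical maps for one re-tiering conversation.
last_quarter = {"crash triage and fix": "human", "database migrations": "human"}
this_quarter = {"crash triage and fix": "agent", "database migrations": "human"}

print(retier(last_quarter, this_quarter))
```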
When not to
A few zones stay out of bounds regardless of how well a slice scores on the three criteria.
Anything irreversible by policy (billing actions, PII handling, security-sensitive permission changes) is gated by policy, not by blast-radius math. The math may say "small impact"; the policy still says "human in the loop". See security and privacy non-negotiables and the ethics of AI on a team.
Cost is a fourth lens this essay has skipped. "Cost, latency, reliability as first-class" covers it. If a slice passes reversibility, observability, and blast radius but the token economics are wrong, the agent does not belong there yet.
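Both exceptions act as overrides applied after the three-criteria score. The sketch below is one possible reading, with several assumptions: slices carry hypothetical tags, "human in the loop" is interpreted as capping the placement at gated agent, and a failed cost check sends the slice back to a human. The names `POLICY_GATED` and `apply_overrides` are illustrative.

```python
# Least to most autonomous. The ordering is an assumption of this sketch.
ORDER = ["human", "gated agent", "agent"]

# Hypothetical tags for work that is gated by policy, whatever the score says.
POLICY_GATED = {"billing", "pii", "permissions"}


def apply_overrides(scored: str, tags: set[str], cost_ok: bool) -> str:
    """Adjust a scored placement for policy and cost.

    - If the token economics are wrong, the agent does not belong there yet.
    - If the slice touches a policy-gated area, never exceed 'gated agent'.
    """
    if not cost_ok:
        return "human"
    if tags & POLICY_GATED:
        return min(scored, "gated agent", key=ORDER.index)
    return scored


print(apply_overrides("agent", {"billing"}, cost_ok=True))
print(apply_overrides("agent", {"ui"}, cost_ok=False))
```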
Every section above teaches the same loop:
- Decompose the ticket.
- Score each slice on reversibility, observability, and blast radius.
- Build the substrate.
- Hold the criteria steady.
- Re-tier when the substrate moves.
Agents go where a slice's three lenses let them go, on a substrate the team has actually built.