Reserving capacity for the unknown
Teams at 100 percent planned capacity cannot absorb a single surprise.
A team I once ran shipped on a three-week cadence. One Wednesday in week one, three things landed in the same morning. A leadership P0 about a pricing change the go-to-market team needed shipped before quarter end. A production issue customer support had triaged for two days, now escalated to us. And the head of growth dropping into our channel to ask why the experiment we owed them was sliding. The team was already at full sprint capacity. There was no decision that didn't break a commitment.
That morning is the failure mode every new EM eventually runs into, and the root cause is structural. In a fast-moving company, priorities come from many places. Engineering owns the roadmap, but customer support drives incident urgency, growth defines its own deadlines, and leadership reserves the right to a true P0.
The clean version of this is one senior engineering manager whose call is full and final, with every priority routing through them. Larger, slower orgs sometimes have it because there's enough stability for one person to hold the full priority map in their head. In fast-moving companies, the role is rare and brittle. Either the person can't keep up with the pace and becomes a bottleneck, or the org grows around them and starts routing decisions through PMs, growth leads, and skip-level execs anyway. Most teams end up with four people who all believe they have veto power, and your job as the EM is no longer to wait for the arbiter.
If you accept that surprises are a property of the environment, the question becomes how much room your plan leaves to absorb one.
The first move most teams reach for is wrong. It's the per-task buffer: pad every estimate by 10 to 20 percent and trust that the cushion will be there when you need it. It's defensive and invisible, and it fails twice. It fails inward because cushion buried inside an estimate disappears into the work. Pad a three-day task to four and it takes four. The cushion doesn't survive as recoverable slack at the end. It gets eaten during execution, and by the time you go looking for it, there is nothing left to pull. It fails outward because the cushion isn't a thing you can point to and protect. It's buried across every task estimate, and PM can't see any of it. PM doesn't believe in it. Growth doesn't believe in it. Leadership definitely doesn't.
The mental model that fixed this for me is a bucket. Capacity is the bucket. Work is the water. A planned sprint is what you've already poured in, and any incoming surprise is more water you're trying to add. People often think you can swap water out, pull a task to make room, but in practice that swap is expensive. You're mid-flight on the bumped task, dependencies are already moving, and the team has to re-plan in the middle of executing. A bucket that isn't full can absorb the new water. A bucket that is full forces a tradeoff every time.
The framework I run leaves a named line in the bucket that everyone, including PM and growth and leadership, can see. Reserve is a category of work in the plan, not a margin on estimates. Martin Fowler called the underlying idea "slack" years ago; what follows is two ways I've found to operationalize it so it doesn't quietly dissolve.
The first is a sprint-level buffer with the breakdown written out. On a three-week sprint, each engineer has about fifteen working days. Nine of those go to planned task work, two to three I carve out as named reserve, and the remaining three or four are absorbed by standups and small coordination tasks that aren't worth estimating. I'll name where the reserve goes: roughly one day for ad hoc leave (each person on the team takes about one day a month, and across the team that's a near-certainty in any given sprint), roughly two days for ad hoc work pulled in mid-sprint. Once you start naming the drains, the list grows: interview loads when hiring is on, onboarding cost for a recent hire, on-call carryover from last week, the code review queue, cross-team unblocks. You don't have to model these formally. Naming them is most of the work, because the buffer stops being a vibe and starts being a line item PM can argue with. A rolling average over a few sprints will tighten the number. Do that later, not first.
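If it helps to see the arithmetic, here is a rough sketch of that per-engineer breakdown in Python. The class and field names (SprintPlan, reserve, and so on) are illustrative assumptions, not part of any planning tool, and the numbers take the three-day end of the reserve range described above.

```python
from dataclasses import dataclass


@dataclass
class SprintPlan:
    """Per-engineer capacity for one sprint, in working days (hypothetical model)."""
    working_days: float          # e.g. 15 for a three-week sprint
    planned_days: float          # estimated task work
    reserve: dict[str, float]    # named reserve line items, in days

    def reserve_days(self) -> float:
        # Total reserve is the sum of the named line items.
        return sum(self.reserve.values())

    def overhead_days(self) -> float:
        # Whatever is left covers standups and small coordination work.
        return self.working_days - self.planned_days - self.reserve_days()

    def planned_share(self) -> float:
        # Fraction of the sprint committed to planned task work.
        return self.planned_days / self.working_days


plan = SprintPlan(
    working_days=15,
    planned_days=9,
    reserve={"ad hoc leave": 1.0, "ad hoc mid-sprint work": 2.0},
)

for name, days in plan.reserve.items():
    print(f"reserve: {name} = {days} days")
print(f"total reserve: {plan.reserve_days()} days")
print(f"unestimated overhead: {plan.overhead_days()} days")
print(f"planned share: {plan.planned_share():.0%}")
```

The point of keeping reserve as a dictionary of named items rather than a single number is the same as in the prose: each line is something a stakeholder can see and argue with.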
The second is structural rather than per-sprint. We run a one-week mid-release after every three-week main sprint. The full cycle is three weeks of planned work followed by one week of calendar reserve. The week absorbs spillover from the last release, production issues that don't warrant a hotfix, internal bugs, tech debt, and small tech initiatives. It's also the natural home for sprint retro, 1-1s, and next-sprint planning, which would otherwise eat into "real" sprint capacity and be first to drop under pressure. Every stakeholder knows the week exists. No one has to negotiate it back into the schedule each cycle, which is the part that matters.
Both of these will have weeks where the reserve isn't consumed. That's the point. But you do want a staged backlog ready for those weeks: low-priority production issues, bugs, crashes, well-scoped debt items. If a developer is freeing up ahead of time and the reserve isn't being eaten, pull from there. Tech changes fast enough that finding good filler work is rarely the bottleneck. The discipline is keeping the backlog triaged and shallow, not letting it accumulate into a junk drawer that nobody trusts.
Negotiate the reserve before the surprise lands. Mid-incident is the worst time to argue with PM about whether slack is real. The smallest version of this contract is a pre-agreed swap rule. If a P0 comes in, the lowest-ranked item in the current sprint bumps to next sprint, automatically, without a meeting. You don't have to get every stakeholder to love the rule. You have to get them to acknowledge it once, in the calm.
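The swap rule is mechanical enough to write down. A minimal sketch, assuming the sprint is kept as a list ranked from highest to lowest priority; the function name and the item names are hypothetical and exist only for illustration.

```python
def apply_swap_rule(current_sprint, next_sprint, p0_item):
    """Admit a P0 and bump the lowest-ranked current item to next sprint.

    Both sprints are lists ordered from highest to lowest priority.
    Returns (new_current, new_next, bumped) without mutating the inputs.
    """
    if not current_sprint:
        # Nothing to bump: the P0 simply becomes the sprint.
        return [p0_item], list(next_sprint), None

    bumped = current_sprint[-1]
    # The P0 goes to the top; everything except the last item stays.
    new_current = [p0_item] + current_sprint[:-1]
    # The bumped item leads next sprint so it is not lost again.
    new_next = [bumped] + list(next_sprint)
    return new_current, new_next, bumped


current = ["checkout refactor", "growth experiment", "settings cleanup"]
upcoming = ["search filters"]

current, upcoming, bumped = apply_swap_rule(current, upcoming, "pricing change P0")
print("this sprint:", current)
print("next sprint:", upcoming)
print("bumped:", bumped)
```

Placing the bumped item at the head of next sprint is one design choice among several. What matters is that the outcome is determined by the ranking everyone already agreed to, not by a meeting held mid-incident.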
A team running at 100 percent planned capacity has decided to pay for every surprise with chaos. Reserve is what you write into the plan so the decision is already made.