Sarthak Garg

Diagnosing underperformance

Skill, will, fit, context. The four-way split that changes the action.

·5 min read

Six to twelve months. That is how long it took me, the first time, to realise the engineer I was coaching did not need more coaching. He agreed in every 1:1. He acknowledged every piece of feedback. He sometimes even brought ideas. The PRs kept coming back with the same comments. Releases slipped. Reviewers burnt time iterating on the same issues. I kept reading the surface and missing the absence. What I should have been reading was the rest of the team, watching me tolerate a problem I had misdiagnosed, and quietly losing trust in my judgement.

Flagging underperformance is a separate problem. If your ladder and quarterly reviews are not yet doing that job, start with levels and progression. What follows is the diagnosis you run once someone has been flagged.

The standard coaching frame is a 2x2 you have seen in most management books: skill on one axis, will on the other. Four quadrants, four labels (enthusiastic beginner, disillusioned learner, reluctant expert, disengaged), four actions. It is popular because it lands a clean diagnosis in a single conversation. It is also wrong about two things at once. It labels the person instead of the situation, and it hides the failure mode that fooled me for a year.

The real diagnosis runs on two axes, but they are skill and intent, and one check runs before either of them.

Most lists you will find add fit (does this role match this person) and context (does the environment let them succeed) as third and fourth axes. In a 200-engineer org with role mobility, that can be honest. In a Series B startup with 40 engineers and three product areas, fit and context collapse to one of two answers. Either the role is what it is and the fit is not there (part ways), or the context is broken and the fix is yours to make (own it, do not diagnose a person for it). The four-way split sounds rigorous and acts like a stall. Two axes, plus a room test. That is the working version.

The room test comes first: the room, not the engineer. Hold pay, project mix, and people roughly constant. Now look at the team.

If intent looks low across most of the team, the cause is yours. Look at the work mix, the last pay cycle, the narrative you have been telling them, your own manager. In December 2023 my team came out of a layoff into a year with no hike. Intent across the whole room looked rough. None of those people were the problem. The story I was telling them was. That arc belongs to re-narrating change when direction shifts. What matters here is the room test itself: a team-wide intent dip means look at yourself before you look at any individual.

If intent looks low in one person while the rest of the team is performing, the team is your control group. Their good performance is the evidence that pay, project, and people are not the problem. The signal is the person. Only now do you start diagnosing the individual.

Inside the individual diagnosis, four cells. All four can look the same in a Tuesday status meeting; they require different actions.

Skill gap, fixable. The engineer wants to close the gap, the gap is bounded, the ramp is realistic. Go to the repair loop. Most of your diagnostic load should land here in the early years of someone's career.

Skill gap, unfixable. The gap is too large or the ramp is too long for the team to absorb. Part. This is rare. Most "unfixable" calls turn out to be skill-fixable cases the manager gave up on too early, or intent cases hiding behind a skill label.

Intent gap, loud. Dodging the harder ticket, debating every piece of feedback, gaming the standup. The easy diagnosis and the easy action. Once obvious motivators (pay, people, project) have been ruled out, it is rarely fixable.

Intent gap, silent. This is the one that took me a year. The engineer engages. Agrees. Acknowledges. Sometimes brings ideas. There is nothing on the surface to push back on. What is missing is the ambition to push past the current level. They are not unmotivated by the work; they have made peace with their ceiling and would simply like to do their job. It shows up most often in seniors who arrive with a bar set by a previous company. If their work was good enough there, they expect it to be good enough here. Easy to misread for many months as a slow skill build.

The thing that finally flipped the diagnosis on my engineer was not a piece of code or a missed release. It was a question I asked myself between two 1:1s: when was the last time he came to me about this without waiting for the calendar?

The answer was never.

The kernel test is whether the engineer drives the improvement process or only reacts to it. Driving looks like the engineer setting the next move on their own. Reacting looks like waiting for you to set it. Do they bring their own solutions, or do they ask which solution you would like? Do they come asking for feedback between 1:1s, or do they wait? Do they propose the checklist, or do they build the checklist you suggested and never touch it again? Agreement and acknowledgement without self-driven motion is the ceiling tell.

The same kernel test catches a different case that looks identical from the outside. I have had engineers with the same surface symptoms whose root was confidence, not ceiling, often after years of harsh feedback from a previous manager. They want to drive the process; they just second-guess every motion. The kernel test still answers cleanly (yes, they drive), and the action is different. That arc belongs to the repair loop, not here.

Once you have the diagnosis, the action is one question: is their current level good enough for the team's required bar?

For a confirmed ceiling case, the answer is usually no, and the action is to part. Hand off to parting well. For skill-fixable or confidence cases, hand off to the repair loop. For the team-level intent dip, hand off to the work you owe the room.

The cycle time on this diagnosis was six to twelve months the first time. It is closer to weeks now, not because the test is sharper but because I have learnt to read the absence of self-driven motion instead of waiting for a delta in results. What used to require three quarterly reviews and a stack of PR comments now reads off one or two 1:1s and the pattern of who initiates.

The audience for the action is not the engineer being parted. The audience is the rest of the team, watching to see whether you can name a problem they have already named in their own heads. You raise the team bar; you do not lower it. The cost of a misdiagnosis is not in the person you keep too long; it is in the strong engineers who watch you keep them and quietly stop trusting your judgement.