The two-week notice came and went. You thought you had a plan. But now it’s week three, and the team keeps getting stuck on questions nobody can answer. Something broke in production this week, and it took four hours just to understand what the failing system was doing — let alone fix it. The systems are running, but nobody is confident about why.
This is a bus factor problem. And if you’re honest with yourself, you probably knew it existed before this person left. You just hoped it wouldn’t become a crisis before you fixed it.
Here’s how to get through the next 60 days.
The First Week: Map What You Don’t Know
The instinct is to start recruiting immediately. Resist it. Hiring the wrong person into a knowledge vacuum is worse than having the position open — they’ll make decisions based on incomplete context and you’ll spend the next year cleaning up after them.
The first week is about inventory.
Sit down with your remaining engineering team — all of them — and run through these questions:
- What systems did this person build or primarily own? List them by name.
- For each system: who else has worked in it, even briefly?
- What are the failure modes of each system? Does anyone know?
- What are the deployment and release processes? Are they written down, or did they live in someone’s head?
- What decisions were recently made or in progress that only this person knew the full context for?
- What external systems, vendors, or integrations did they manage personally?
You will find gaps in this exercise. That’s the point. A gap you know about is manageable. A gap you discover at 2am during an incident is a crisis.
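It can help to capture the answers as structured data rather than loose meeting notes, so that an empty field is visibly empty. Below is a minimal sketch in Python; the field names and the example system are illustrative assumptions, not a prescribed schema.

```python
# A minimal inventory sketch. The field names and the example system are
# illustrative assumptions, not a schema; rename them to fit your stack.
from dataclasses import dataclass

@dataclass
class SystemRecord:
    name: str
    primary_owner: str        # usually the person who just left
    backup_owners: list[str]  # anyone who has worked in it, even briefly
    failure_modes: list[str]  # empty means "nobody knows" -- itself an answer
    deploy_docs: str          # link to written deploy/rollback notes; "" means none
    external_deps: list[str]  # vendors and integrations they managed personally
    open_decisions: list[str] # in-flight decisions only they had full context on

inventory = [
    SystemRecord(
        name="billing-worker",  # hypothetical example system
        primary_owner="departed engineer",
        backup_owners=[],
        failure_modes=[],
        deploy_docs="",
        external_deps=["payment provider webhooks"],
        open_decisions=["half-finished queue migration"],
    ),
]

# Surface the gaps explicitly: no backup owner or no deploy docs is a risk.
for rec in inventory:
    if not rec.backup_owners or not rec.deploy_docs:
        print(f"GAP: {rec.name} has no backup owner or no written deploy process")
```

The format matters less than the discipline: every question from the list above becomes a field someone has to fill in or explicitly leave blank.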
The First Two Weeks: Protect Production
Based on your inventory, identify the three systems or processes most likely to need intervention in the next 60 days. These are your immediate risks.
For each one, assign a named owner from your existing team — even if they’re not the ideal person, even if they’re not fully comfortable yet. Named ownership with accountability is better than collective uncertainty. “Nobody knows” is how you end up with six engineers looking at a broken deploy script and none of them willing to touch it.
For each of these systems, do a basic documentation sprint. It doesn’t need to be perfect. It needs to be enough that someone can understand what the system does, how to deploy it, how to roll back, and who to call if something breaks. One afternoon per system, paired with whoever knows the most. Write it in a shared doc, not someone’s notes app.
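To keep the sprint focused, it can help to pre-create a skeleton per system so the afternoon goes into filling in answers rather than debating structure. Here is a minimal sketch, assuming a shared docs/runbooks folder in your repo; the headings and the three example system names are placeholders.

```python
# Tiny stub generator for the documentation sprint: one markdown runbook
# skeleton per system. The folder path and system names are assumptions.
from pathlib import Path

TEMPLATE = """\
# Runbook: {name}

## What this system does
(two or three sentences, in plain language)

## How to deploy
(exact commands or pipeline link)

## How to roll back
(exact commands; verify these before you need them)

## Who to call
(named owner first, then escalation)

## Known failure modes
(symptoms, likely causes, first things to check)
"""

docs = Path("docs/runbooks")
docs.mkdir(parents=True, exist_ok=True)
for name in ["billing-worker", "ingest-api", "deploy-scripts"]:  # your top three
    stub = docs / f"{name}.md"
    if not stub.exists():  # never clobber notes someone has already written
        stub.write_text(TEMPLATE.format(name=name))
        print(f"created {stub}")
```

A blank heading that nobody can fill in tells you exactly where to spend the pairing time.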
If your outgoing engineer is still on good terms — and if they left voluntarily, they usually are — consider a short-term consulting arrangement for the first 60 days. Many engineers are willing to take a few hours a week at a consulting rate to be available for questions. This is a reasonable thing to ask for, and it’s much cheaper than the alternative.
The First Month: Find Your Real Technical Lead
There is someone on your team right now who is carrying more than their title suggests. The person who answers the hardest Slack questions. The person whose name comes up when something is ambiguous. The one the rest of the team defers to.
That person needs a conversation — not a promotion announcement, but a real conversation. Tell them what you’re dealing with. Ask them directly: what would they need to feel more confident owning the technical direction for the next 90 days? Better title, better pay, protected time to document instead of just ship?
This is not necessarily your future CTO or VP of Engineering. But right now you need technical continuity more than you need an organizational chart. And whoever is carrying the load informally should have that role made explicit, with the recognition and compensation to match.
What Went Wrong and How to Fix It
You had a bus factor problem. One person knew too much and documented too little. That’s not unusual — it’s almost universal at companies that built fast. But now is the time to fix the structural cause, not just the immediate symptom.
The fix has three parts.
Make documentation a delivery criterion, not an aspiration. Every significant feature, every infrastructure change, every architectural decision should produce a brief written artifact. Not a novel. A few paragraphs that capture what was built, why, and how to operate it. This takes 20 minutes after you’ve done the work, and it pays for itself many times over in future incident response.
Require shared ownership on anything in production. No system should have a single person who understands it. Pair on critical work. Do regular knowledge-sharing sessions. Rotate who runs deploys so nobody is the only one who knows how.
Make bus factor part of your engineering review process. Every quarter, go through your systems and ask: if this person left tomorrow, who would own this? If the answer is nobody, that’s a risk on your risk register, not something you hope doesn’t happen.
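If you want a quick starting point for that quarterly review, commit history is a usable proxy. Below is a minimal sketch, assuming a git repository and Python 3: it counts distinct commit authors per top-level directory over the past year and flags anything with a single author. The twelve-month window and the directory granularity are choices to tune, not rules.

```python
# Rough bus-factor signal: distinct commit authors per top-level directory.
# Run from the repo root. Window and granularity are assumptions to adjust.
import subprocess
from collections import defaultdict

log = subprocess.run(
    ["git", "log", "--since=12 months ago", "--format=AUTHOR:%an", "--name-only"],
    capture_output=True, text=True, check=True,
).stdout

authors_by_dir = defaultdict(set)
current_author = None
for line in log.splitlines():
    if line.startswith("AUTHOR:"):
        current_author = line[len("AUTHOR:"):]
    elif line and current_author:
        top_level = line.split("/", 1)[0]  # bucket files by top-level directory
        authors_by_dir[top_level].add(current_author)

# Lowest bus factor first: these are the areas to pair on next quarter.
for path, authors in sorted(authors_by_dir.items(), key=lambda kv: len(kv[1])):
    marker = "  <-- single owner" if len(authors) == 1 else ""
    print(f"{len(authors):3d}  {path}{marker}")
```

Treat the output as a conversation starter, not a verdict: commit counts measure who touched the code, not who understands it.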
On Hiring the Replacement
When you do hire — and you should, once you’ve stabilized — don’t hire for the skills that left. Hire for the skills you need next.
The engineer who just left built the system you have. The next hire needs to scale it, maintain it, and eventually hand it off to someone else. Those are different skills. Write the job description based on where you’re going, not where you’ve been.
If you’re in the middle of this right now — trying to figure out what’s at risk, who can own what, and how to stop the bleeding — that’s exactly what a 15-minute call can help with. I’ve done this assessment more than once for companies that thought they were in worse shape than they were. Book time at go.nebari.cc/15-min and we’ll map out the next 60 days.
Related: Your CTO Just Quit — What Now? | Engineering Culture and Retention | Signs Your Engineering Team Needs Outside Help
