Bus Factor: The Vulnerability of Concentrated Knowledge
A team’s bus factor is the minimum number of people who would have to “get hit by a bus” for the project to become paralyzed. A bus factor of 1 means a single absence is enough to block the organization. The metaphor is morbid, but the reality it describes is mundane: vacation, sick leave, resignation, or a simple job change. In most IT teams, the actual bus factor hovers between 1 and 2 for critical systems.
This vulnerability is not an accident. It emerges naturally from how organizations manage (or ignore) technical knowledge. While some individuals deliberately cultivate their indispensability, most situations of concentrated knowledge result from systemic dynamics that no one consciously chose.
The Faces of the Single Point of Failure
The legacy DBA. He’s been there for fifteen years. He knows every stored procedure, every trigger, every historical workaround in the production database. Documentation? “It’s in his head.” When he goes on vacation, the team holds its breath. When he resigns, it will be a crisis.
The silent architect. She designed the system five years ago. She knows why this module communicates with that one, why this dependency exists, why “you absolutely must not touch this file.” She never documented these decisions: they seemed obvious to her. Each new hire spends weeks reconstructing a fraction of this knowledge through code archaeology.
The solo DevOps. He manages the infrastructure, CI/CD pipelines, secrets, certificates, cloud access. He’s the only one who knows how to deploy to production. Developers send him tickets; he resolves them. No one else has the rights, the knowledge, or even the curiosity to understand what he does.
The technical founder. In startups, this is often the CTO or first developer. He wrote the first lines of code, chose the stack, defined the conventions. Five years later, the team has grown, but the founding decisions remain opaque to everyone but him.
These profiles are not exceptions. They are the norm.
A Systemic Symptom, Not Individual Fault
It’s tempting to blame these individuals: “He keeps information to himself,” “She refuses to document.” This reading misses the point. Knowledge concentration is rarely a deliberate choice; it is the predictable result of several organizational forces.
Velocity pressure. Documenting takes time. Training a colleague takes time. When every sprint is a race against the clock, knowledge stays in the head of whoever acquired it. The organization implicitly rewards fast execution, not transmission.
Lack of planned redundancy. Teams are often sized to the minimum. Each person has “their” domain. This specialization maximizes apparent short-term efficiency and creates single points of failure in the medium term.
A premium on individual expertise. Organizations celebrate “experts,” “go-to people,” and “subject matter specialists.” This recognition, legitimate in itself, reinforces knowledge centralization. Being indispensable becomes a marker of value, consciously or not.
Turnover as taboo. Many organizations plan as if their key employees would never leave. Preparing a transition is sometimes perceived as a lack of trust, or even an invitation to leave.
When the system pushes toward concentration, expecting individuals to resist this pressure alone is unrealistic.
Measuring and Reducing the Risk
Bus factor is not inevitable. It can be measured and addressed, provided it becomes an explicit priority.
Map knowledge dependencies. For each critical system, identify who knows what. The question is not “who is responsible?” but “who can intervene if the responsible person isn’t there?” A simple table suffices: system, primary expert, backup, documentation level.
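The table described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the `SystemKnowledge` class, the `at_risk` helper, the documentation levels, and all system and person names are hypothetical. It also assumes a system may list several backups, and that a system's bus factor can be approximated by counting the distinct people able to intervene on it.

```python
from dataclasses import dataclass, field


@dataclass
class SystemKnowledge:
    """One row of the knowledge map (all field names are illustrative)."""
    system: str
    primary_expert: str
    backups: list[str] = field(default_factory=list)
    documentation_level: str = "none"  # assumed scale: "none", "partial", "complete"

    def bus_factor(self) -> int:
        # Count distinct people who can intervene: the primary expert plus backups.
        return len({self.primary_expert, *self.backups})


def at_risk(knowledge_map: list[SystemKnowledge], threshold: int = 1) -> list[str]:
    """Return the systems whose bus factor is at or below the threshold."""
    return [row.system for row in knowledge_map if row.bus_factor() <= threshold]


# Hypothetical data, for illustration only.
knowledge_map = [
    SystemKnowledge("production database", "alice"),
    SystemKnowledge("deployment pipeline", "bob", ["carol"], "partial"),
    SystemKnowledge("billing service", "dave", ["erin", "frank"], "complete"),
]

for row in knowledge_map:
    print(f"{row.system}: bus factor {row.bus_factor()}, docs {row.documentation_level}")

print("At risk:", at_risk(knowledge_map))
```

Counting names is only a starting point: a listed backup who has never actually operated the system does not really raise the bus factor, which is why the documentation level sits in the same row.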
Institutionalize pair programming and code review. These practices don’t just serve quality; they spread knowledge. Code reviewed by two people is code understood by two people. A feature developed by a pair is a feature two brains can maintain.
Rotate responsibilities. Periodically require that the “backup” take the lead on a system they have not yet mastered. Short-term discomfort produces long-term resilience. Google’s SRE teams pursue a similar goal with the “Wheel of Misfortune,” a role-playing exercise in which engineers rehearse past incidents so that more than one person is able to handle an outage on a given service.
Documentation as deliverable. Integrate documentation into the definition of “done.” An undocumented system is not shipped. This rule, difficult to maintain under pressure, is the only one that produces up-to-date documentation.
Onboarding as resilience test. Every new hire who struggles to understand a system reveals a transmission deficit. Treat onboarding not as a chore but as a bus factor audit.
The Hidden Cost of Indispensability
Organizations systematically underestimate the cost of knowledge concentration. The visible costs (the expert’s salary, training time) are small compared to the invisible ones.
Opportunity cost: the overloaded expert becomes a bottleneck. Projects wait for their availability. Their time is monopolized by support rather than innovation.
Fear cost: teams avoid touching systems they don’t understand. Legacy code becomes untouchable, technical debt accumulates, agility disappears.
Departure cost: when the expert finally leaves (and they will leave), the organization discovers the extent of its dependency. Projects stop, incidents multiply, knowledge must be rebuilt at great expense.
Investing in knowledge distribution is not a luxury. It’s insurance against a risk that always materializes: the only question is when.