DevOps and the Three Ways

DevOps is often misunderstood as a set of tools or a job title. The Phoenix Project by Gene Kim, Kevin Behr, and George Spafford argues that it is something more fundamental: a philosophy of the flow of work, derived from manufacturing principles and applied to the relationship between software development and IT operations. The Three Ways are the theoretical framework underpinning the DevOps movement, and they constitute one of the most operationally rigorous frameworks for understanding why software organizations fail — and how to fix them.

The Context: Why IT Organizations Break Down

The Phoenix Project is structured as a novel — a deliberate choice that allows the authors to dramatize the dysfunction of a typical IT organization before explaining why it occurs. The dysfunction is recognizable to anyone who has worked in software organizations:

  • Urgent projects consume all capacity, leaving no room for operational stability
  • A single expert (“Brent”) becomes a bottleneck through whom all complex work must flow
  • Changes are made without proper documentation, leading to undiagnosable failures
  • Development throws untested code “over the wall” to Operations, which then owns the chaos
  • Unplanned work (firefighting) continuously displaces planned work, preventing progress

The theoretical insight, borrowed from the Lean manufacturing tradition, is that these are not bad luck or bad people — they are the predictable consequences of managing IT work without understanding its nature as a flow system.

The Four Types of Work

Before the Three Ways can be understood, one must distinguish the four types of IT work:

  1. Business projects — features and capabilities requested by the business
  2. IT operations projects — infrastructure and internal capability improvements
  3. Changes — modifications to existing systems
  4. Unplanned work — firefighting, incident response, recovery from failures

The critical insight is that unplanned work is the destroyer:

“Unplanned work kills your ability to do planned work, so you must always do whatever it takes to eradicate it. Murphy does exist, so you’ll always have unplanned work, but it must be handled efficiently.”

And the compounding mechanism:

“‘Technical debt’ that is not being paid down… comes from taking shortcuts, which may make sense in the short-term. But like financial debt, the compounding interest costs grow over time. If an organization doesn’t pay down its technical debt, every calorie in the organization can be spent just paying interest, in the form of unplanned work.”

This is the IT capacity death spiral: shortcuts create technical debt, technical debt generates unplanned work, unplanned work prevents planned improvement, so more shortcuts are taken to meet commitments, generating more technical debt.
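The death spiral can be made concrete with a toy model. The numbers below (starting debt, "interest rate," and how much new debt each shortcut adds) are illustrative assumptions, not figures from the book; the point is only the shape of the curve, where unplanned work compounds until it consumes nearly all capacity.

```python
def death_spiral(cycles, capacity=100.0, debt=20.0, interest_rate=0.25):
    """Toy model: each unit of technical debt demands `interest_rate` units
    of unplanned work per cycle, and the shortcuts taken to cover displaced
    planned work add new debt. Returns (unplanned, planned, debt) per cycle.
    All parameter values are illustrative, not from The Phoenix Project."""
    history = []
    for _ in range(cycles):
        unplanned = min(capacity, debt * interest_rate)  # firefighting comes first
        planned = capacity - unplanned                   # what is left for real work
        debt += unplanned * 0.5                          # shortcuts add new debt
        history.append((round(unplanned, 1), round(planned, 1), round(debt, 1)))
    return history

for cycle in death_spiral(20):
    print(cycle)
```

Running this shows planned capacity shrinking every cycle while unplanned work grows — the "every calorie spent paying interest" outcome described above.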

The First Way: Fast Flow

The First Way is about creating fast, predictable, unidirectional flow of work from Development through IT Operations to the customer. The manufacturing analogy is a production line: material should move steadily forward, with minimal work-in-process (WIP) and no backward flow.

Key principles:

  • Limit WIP: Too many projects in flight create constant context-switching and priority conflicts. In the novel, freezing all non-essential work to focus on one project produced dramatic improvements.
  • Protect the constraint: Brent, as the bottleneck, must never be idle for trivial reasons, and must never be assigned to work that others could do. “Any improvement made after the bottleneck is useless, because it will always remain starved, waiting for work from the bottleneck.”
  • Make work visible: Kanban boards, WIP limits, and visual management tools make the state of the system apparent. “If it’s not on the kanban board, it won’t get done.”
  • Reduce batch sizes: The theoretical ideal is single-piece flow. Larger batches increase variance, delay feedback, and lock up capital. “In any system of work, the theoretical ideal is single-piece flow, which maximizes throughput and minimizes variance.”
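The "limit WIP" principle has a quantitative basis in Little's Law, a standard result from the same queueing tradition the book draws on: average lead time equals average WIP divided by average throughput. A minimal sketch, with made-up numbers:

```python
def lead_time(wip, throughput_per_day):
    """Little's Law: average days an item spends in the system
    equals work-in-process divided by completion rate."""
    return wip / throughput_per_day

# Same team finishing 5 items per day, under two WIP policies:
print(lead_time(wip=50, throughput_per_day=5))  # 10.0 days per item
print(lead_time(wip=10, throughput_per_day=5))  # 2.0 days per item
```

The team completes work at the same rate in both cases; only the WIP limit changes. Cutting WIP by a factor of five cuts every item's time-in-system by the same factor.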

The wait time formula is particularly clarifying:

“The wait time is the ‘percentage of time busy’ divided by the ‘percentage of time idle.’ If a resource is ninety percent busy, the wait time is ninety percent divided by ten percent, or nine hours. Our task would wait in queue nine times longer than if the resource were fifty percent idle.”

This explains why overloaded systems grind to a halt: the queue time grows nonlinearly as utilization approaches 100%.
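The formula quoted above is simple enough to compute directly. This sketch just evaluates busy/idle at several utilization levels to show the nonlinear blow-up:

```python
def wait_time(utilization):
    """Relative wait time for a resource at the given utilization (0..1),
    per the formula quoted above: percent busy divided by percent idle."""
    busy = utilization
    idle = 1.0 - utilization
    return busy / idle

for u in (0.50, 0.80, 0.90, 0.95, 0.99):
    print(f"{u:.0%} busy -> wait factor {wait_time(u):.0f}x")
```

At 50% utilization the wait factor is 1; at 90% it is 9; at 99% it is 99. The last few points of utilization cost far more in queue time than all the earlier ones combined, which is why "keeping everyone busy" destroys flow.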

The Second Way: Amplify Feedback

The Second Way is about creating fast feedback loops from right to left — from customers back to development, from operations back to development, from testing back to coding. The goal is to detect and fix quality problems as early as possible, before they compound.

Key principles:

  • Fix defects where they originate: Every defect passed downstream becomes exponentially more expensive to fix.
  • Continuous testing: Integrating automated tests into the deployment pipeline means every code commit triggers a quality check.
  • Environment parity: Development, test, and production environments must be identical. The famous dysfunction: “it’s running on my laptop” — indicating that no one had ensured environment consistency.
  • Ten deploys a day: The target isn’t arbitrary — it’s an aspiration that forces automation. You cannot deploy ten times a day with manual processes. The constraint forces investment in the deployment pipeline.
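The "every commit triggers a quality check" loop can be sketched as a fail-fast gate. The stage names and checks below are illustrative placeholders, not any particular CI tool's API; the structural point is that the first defect stops the line and is reported immediately rather than passed downstream.

```python
def run_gate(commit, checks):
    """Run checks in order; return (passed, name_of_first_failure_or_None).
    Stopping at the first failure gives the fastest possible feedback."""
    for name, check in checks:
        if not check(commit):
            return False, name   # stop the line at the first defect
    return True, None

# Illustrative checks (a real pipeline would run compilers, linters,
# unit tests, and integration tests against parity environments):
checks = [
    ("compiles", lambda c: "syntax_error" not in c),
    ("unit_tests", lambda c: "failing_test" not in c),
    ("integration", lambda c: "env_mismatch" not in c),
]

print(run_gate({"diff": "..."}, checks))         # (True, None)
print(run_gate({"failing_test": True}, checks))  # (False, 'unit_tests')
```

Because the gate names the failing stage, the defect is fixed where it originated rather than discovered weeks later in production.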

“The First Way is all about controlling the flow of work from Development to IT Operations. Now you must prove that you can master the Second Way, creating constant feedback loops from IT Operations back into Development, designing quality into the product at the earliest stages.”

The Third Way: Culture of Experimentation and Learning

The Third Way is about creating a culture that simultaneously rewards experimentation (which requires accepting failure) and builds mastery through repetitive practice.

“The Third Way is all about ensuring that we’re continually putting tension into the system, so that we’re continually reinforcing habits and improving something. Resilience engineering tells us that we should routinely inject faults into the system, doing them frequently, to make them less painful.”

The key insight: in complex systems, the question is not whether failures will occur, but whether the organization learns from them. Sensei Mike Rother’s principle:

“‘Improving daily work is even more important than doing daily work.’ If you are not improving, entropy guarantees that you are actually getting worse.”

The Improvement Kata — repetitive practice of improvement cycles — is the mechanism by which teams build the habits that produce mastery. “Studies have shown that practicing five minutes daily is better than practicing once a week for three hours.”
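The fault-injection idea from the resilience engineering quote can be sketched in miniature: deliberately fail a dependency at a known rate so that the recovery path is exercised routinely, not only during a real outage. All names and rates here are illustrative assumptions.

```python
import random

def flaky_service(fail_rate, rng):
    """Simulated dependency that fails with probability `fail_rate`
    (the injected fault). Illustrative, not a real service."""
    if rng.random() < fail_rate:
        raise ConnectionError("injected fault")
    return "ok"

def call_with_retry(fail_rate, rng, attempts=3, fallback="cached"):
    """Recovery path under test: retry a few times, then degrade
    gracefully to a fallback instead of failing the caller."""
    for _ in range(attempts):
        try:
            return flaky_service(fail_rate, rng)
        except ConnectionError:
            continue
    return fallback

rng = random.Random(42)  # seeded so the drill is repeatable
results = [call_with_retry(0.5, rng) for _ in range(1000)]
print(results.count("ok"), "direct,", results.count("cached"), "via fallback")
```

Even with the dependency failing half the time, every call still gets an answer — and because faults are injected frequently and cheaply, the fallback path stays proven rather than rotting untested.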

The Deployment Pipeline: Operationalizing the Three Ways

The synthesis of all three ways is the deployment pipeline — the automated path from code commit to production deployment:

“You need to get everything in version control. Everything. Not just the code, but everything required to build the environment. Then you need to automate the entire environment creation process. You need a deployment pipeline where you can create test and production environments, and then deploy code into them, entirely on-demand.”

This is not merely a technical choice — it is an organizational transformation. When deployment is automated, the separation between Development and Operations dissolves. Both groups are responsible for the pipeline; both are invested in its reliability; both benefit from its speed.

Jeff Lawson in Ask Your Developer frames the organizational implication:

“The person who writes the code also ‘wears the pager’ for that code after it goes into production. It’s your code. If it crashes, you fix it. We like this idea because it pushes developers to deliver higher-quality code.”

And the infrastructure investment required:

“It’s not uncommon for great software companies to invest upward of 50 percent of all R&D funds into infrastructure.”

The Manufacturing Analogy

Throughout The Phoenix Project, the IT organization is compared to a manufacturing plant. This is not merely a pedagogical device — it is a claim that the principles governing physical production systems (Theory of Constraints, Lean, Total Productive Maintenance) transfer directly to knowledge work when properly translated.

The “Brent problem” is a direct analog to a bottleneck machine in a factory: everything routes through the single constrained resource, work piles up upstream, and the entire system’s throughput is limited by that one constraint. The solution is the same in both cases: identify the constraint, exploit it (ensure it is never idle or doing low-value work), subordinate all other decisions to it, and then elevate it (expand its capacity).
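The bottleneck claim reduces to a one-line observation: system throughput is the minimum over stage capacities. A sketch with illustrative numbers shows why improving a non-constraint stage changes nothing, while elevating the constraint lifts the whole system:

```python
def throughput(stage_capacity):
    """Items per day the whole system can finish: the slowest stage wins."""
    return min(stage_capacity.values())

# Illustrative capacities (items/day); "brent" is the constrained resource.
stages = {"dev": 30, "review": 25, "brent": 5, "deploy": 20}
print(throughput(stages))   # 5 -- limited by the constraint

stages["deploy"] = 40       # improving a non-constraint stage...
print(throughput(stages))   # still 5 -- no system-level effect

stages["brent"] = 12        # elevating the constraint itself
print(throughput(stages))   # 12 -- the whole system speeds up
```

This is the Theory of Constraints sequence from the paragraph above in executable form: until the constraint is elevated, every other improvement is local optimization with zero global payoff.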

The Organizational Resistance

The Phoenix Project’s deepest insight may not be technical but political: the dysfunction it describes is maintained by an organizational immune system. Leaders who optimize for their own metrics (development velocity, security compliance checklists, infrastructure uptime) without understanding the system-level effect create the exact chaos they’re trying to prevent. “Your job as VP of IT Operations is to ensure the fast, predictable, and uninterrupted flow of planned work that delivers value to the business while minimizing the impact and disruption of unplanned work.” This systems-level framing requires leaders to relinquish local optimization in favor of global throughput — a politically difficult ask.