Hi, I’m Russell. This is issue #1 of Russell’s Index, where I write about the lessons I’ve learned—and continue to learn—as a founding employee at SharpestMinds.
Managerial fiction and bottlenecks
I recently read The Goal: A Process of Ongoing Improvement. It’s a book about operations management in the form of fiction. Alex Rogo—plant manager—struggles to save his manufacturing plant from being shut down by corporate. The day is won by focusing on the plant’s constraints (or bottlenecks) and increasing throughput. It’s a little cheesy, but I found it educational and surprisingly hard to put down.
The first lesson Alex Rogo learns is that optimizing anything but the bottleneck is a waste of time. Throughput—the rate of production—can only be as fast as the slowest step in the chain. Any gains in the efficiency of the system have to come from the bottlenecks.
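The arithmetic here is simple enough to sketch. In this toy Python model (the step names and hourly rates are invented for illustration), overall throughput is just the minimum capacity across the chain—and that minimum tells you which step is the bottleneck:

```python
# Toy production line: three steps with different capacities.
# Throughput is capped by the slowest step, no matter how fast
# the other steps run. Rates are hypothetical.
capacities = {"cutting": 12, "heat_treatment": 4, "assembly": 9}  # parts/hour

throughput = min(capacities.values())          # 4 parts/hour
bottleneck = min(capacities, key=capacities.get)  # "heat_treatment"

print(f"bottleneck: {bottleneck}, throughput: {throughput}/hour")
```

Doubling the capacity of cutting or assembly changes nothing in this model; only raising `heat_treatment` above 4 moves the throughput.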
Like most good advice, it sounds like common sense. But—like most good advice—it’s easier said than done. Without a holistic view of the system, it’s easy to get trapped in local optima.
In the book, one of the bottlenecks turns out to be a heat treatment machine with limited capacity. The steps that lead to the heat treater have a higher capacity. They can produce parts faster than the heat treater can process them, resulting in a growing backlog in front of the heat treatment machine.
In operations jargon, that backlog is work-in-process (WIP). WIP can hide inefficiencies in the system, encourage busy-work, and increase costs (storage and maintenance, for example). There’s also a heavy psychological toll that comes from an ever-growing backlog.
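To see how fast WIP piles up, here’s a toy calculation with made-up rates, loosely echoing the heat treater example: when upstream steps run at full capacity, the queue in front of the bottleneck grows by the rate difference every hour.

```python
# Toy model: upstream steps feed a bottleneck that can't keep up.
# Rates are hypothetical, chosen to echo the heat-treater example.
upstream_rate = 10   # parts/hour arriving at the heat treater
bottleneck_rate = 4  # parts/hour the heat treater can process

wip = 0
for hour in range(8):  # one 8-hour shift
    wip += upstream_rate - bottleneck_rate

print(wip)  # 48 parts queued after a single shift
```

Pacing the upstream steps at the bottleneck’s 4 parts/hour would keep that queue near zero—at no cost to throughput, since the heat treater was the limit all along.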
Most of the steps in any process should not run at their maximum capacity. They should match the capacity of the bottleneck. [1]
It can seem counter-intuitive. Letting the bottleneck set the pace means that employees responsible for non-bottleneck steps will have less to do. From their point of view (and their managers’), this might feel wasteful. There’s often pressure—explicit and implicit—to stay busy and fill the day with work. That time would be better spent trying to increase the capacity of the bottleneck.
Bottlenecks and code review
The Goal is set in a manufacturing plant, but the ideas it teaches—inspired by the Toyota Production System—can be applied to any process.
Consider software development. Throughput, in this case, is the rate at which code is deployed to production. [2] At SharpestMinds, we make sure that every update to our code-base goes through a code review step before we merge and deploy. This is a bottleneck. No matter how much code I write on any given day, it still has to be tested and reviewed.
The easy fix is to get rid of code review. But it’s a step I’m not ready to abandon. There are a lot of benefits—catching mistakes and flaws, keeping co-workers up to date on changes to the code-base, and forcing accountability.
There are ways to make code review more efficient, however. Linting and automated tests can catch a lot of mistakes early, reducing the load on human reviewers. Another healthy developer habit is to be the first reviewer of your own code: you’ll catch the obvious mistakes and avoid extra back-and-forth with your reviewer.
It can be tempting to dismiss CI tools and automated tests—especially in the early days of a startup—because it feels like time taken away from building. But, in the long run, investing in automated testing will increase the capacity of a bottleneck.
Even the smallest steps can help. In my first few months working as a developer, I did not have a linter enabled on my editor. When I finally clued in, it saved me tons of time. By catching formatting and syntax errors as I was coding, it reduced the need for additional commits to fix stupid mistakes.
Taking the time to think through processes and find the bottlenecks can be useful, even though it feels like common sense. We can be blinded by the need to feel busy, or by legacy procedures—because that’s how we’ve always done it.
Just recently, I found a blind spot in a process at SharpestMinds. Once a pull request is approved, our procedure has been that the person who wrote the code merges the change. There may have been a good reason for this, but I forget it now. It’s just how we’ve always done it. [3] But it’s increasing the time spent at the bottleneck—potentially extending the lead-time of a code change by days. If the reviewer approves the code on Friday afternoon but the author is off until Monday, it becomes idle WIP all weekend. A quick update to our process—if you approve it, you can merge it—will give us an easy win by reducing WIP and increasing throughput.
[1] Though it’s good to maintain a small buffer at the bottleneck in case a non-bottleneck step stops or breaks. Idle time at the bottleneck is costly.
[2] Actually, I’d be a bit worried about misaligned incentives with this definition. The goal of any company is not to ship the most code, it’s to provide value for its stakeholders. A better measure of throughput might be something like the rate at which we can add useful features for our users. The bottleneck here will often be the measurement step—collecting data and feedback from users to determine whether a feature is indeed useful, and what to build next.
[3] Perhaps it was because the author of the code would be better suited to watch for obvious errors as their changes are rolled out to production. With good enough testing and documentation (on how to revert a change if it breaks in production, for example), this is a moot point.
Thanks for reading! Subscribe for a new post every week(ish).