Thoughts On Tech Debt
Posted: April 3, 2009
Martin Fowler recently updated his article on technical debt, and we’ve been discussing it in-house as well (though isn’t that always a conversation at any company with long-lived products?), so I’ve been thinking about it a lot lately.
Personally, I think it’s perhaps the most difficult engineering concept for non-engineers to internalize, because most things in the real world just don’t work that way; hence, the necessity of some analogy to a more common real-world concept. The core feature of development that lets technical debt happen is that the input of one “operation” is always the output of the previous one, meaning that mistakes and shortcuts build up over time, progressively dragging you down. Everyone has infrastructure that subtly impacts everything they do (if you’re a chef and your pots are poor quality or your kitchen is poorly laid out, it makes everything harder), and reputation can always make your business suffer (if you’re a sales guy and you offend a potential client, that can hurt you long term), but there are very few disciplines where you have to continuously build something up over the course of years. Even if you’re in construction, once you’re done with a given building you move on to the next one. But code bases are expected to essentially live forever, meaning that mistakes made during the beginning add up over time. That just doesn’t happen if you’re a chef, or a salesman, or a doctor, or an artist, or almost any other job you can think of. For engineers, the concept comes naturally: everyone understands that the decisions they make now will affect how they do their job in the months and years ahead. But for someone that’s never experienced that, I think it’s just a very difficult concept to internalize.
That said, it’s an important concept to grasp. To me, the important part about technical debt isn’t the principal, as it were: it’s the interest. Realistically, though, it doesn’t work like real interest. Rather, it’s more like a tax: the amount you pay isn’t fixed based on the size of the “debt,” but rather is generally proportional to how much work you want to do, and the size of the “debt” really determines the tax rate rather than some fixed amount of overhead. (One might argue that there is, in fact, some fixed amount in the form of ongoing maintenance, so there’s probably an argument to be made that the tax analogy isn’t really accurate either.) But either way you think about it, the important part of the concept isn’t just that there’s a backlog of stuff to fix, but rather that prior decisions that were made affect your ability to work productively in the future.
There are two things that I think are less obvious about how insidious technical debt is. The first one is that incurring the debt sets expectations artificially high about how much work the team can do; if you incur a ton of debt in the first version of a product in order to get it out the door, you’ve set the expectation that the team can do X amount of work in a 12-month release, when in reality you could only do X/2 without incurring debt . . . and because of that debt, it’s now more like X/2.5. The second insidious thing is that paying it down requires a huge resource commitment and delivers very little short term benefit. It often seems like a total black hole; if you’re paying 10% yearly interest on a $100,000 loan, paying back $50,000 on top of the interest only saves you $5,000 a year. So paying off the debt often seems like a poor investment, which means it just builds up and slowly exerts more and more of a tax on development, making it even harder to do something about it.
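To put a number on that "black hole" feeling, here is a minimal sketch of the loan arithmetic above. The function names are made up for illustration, and the break-even figure assumes a flat rate with no compounding, which is a simplification rather than anything claimed in the example itself.

```python
def yearly_interest_saved(principal_repaid, rate_pct=10):
    """Interest no longer owed each year after repaying part of the principal."""
    return principal_repaid * rate_pct // 100


def years_to_break_even(principal_repaid, rate_pct=10):
    """Years of saved interest needed to add up to the amount repaid.

    Assumes a flat rate and ignores compounding (a simplification).
    """
    return principal_repaid // yearly_interest_saved(principal_repaid, rate_pct)


# Repaying $50,000 of a 10% loan frees up $5,000 a year.
print(yearly_interest_saved(50_000))
# At that rate the repayment takes 10 years to "earn back" its cost.
print(years_to_break_even(50_000))
```

Seen that way, the paydown looks like a ten-year payback on a large up-front cost, which is why it is so easy to keep deferring.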
Of course, technical debt isn’t exactly measurable, and neither is productivity, but just for fun let’s pretend that they are and do a little math anyway, since I think it’s an interesting exercise. Imagine we’re measuring both productivity and debt in feature-dollars, and that we have a team that can deliver $100k of features in a given year. The first version of the product, however, needs to be ready in one year and have $200k worth of features in it. So to get over the hump, the team borrows $100k at 10% APR. Of course, the technical debt lenders are cutthroat, and in reality it’s always harder to fix things than it would have been to do them right in the first place; we can imagine that as if the tech debt lenders charged back-breaking fees, say 30%. So after year one, we’ve got $200k worth of features and $130k worth of debt costing us $13k a year.
The team realizes that it overextended in the first release, but no one can quite swallow a 50% cut in productivity; the team did $200k the first time around, right? So instead, they shoot for $120k, thinking that’s a much more reasonable target. But their original rate minus the $13k in interest means that to get $120k of features out, they need to incur another $33k of debt, which with fees (about $43k) we’ll round to $40k.
By the time the third year rolls around, the project is $170k in debt, and the team decides to do something about it. They decide to cut their dev effort in half and only deliver $60k worth of features, so they pay $17k in interest on the debt, do $60k worth of feature work, and have $23k left over to pay down technical debt. By sacrificing about 1/4 of their total dev capacity for the release (and more like 40% of their actual feature-building capacity), the team manages to reduce the debt to $147k, saving them all of $2.3k per year in interest. So next time around, they’re basically in exactly the same boat.
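The three releases above can be replayed as a rough sketch. The `simulate` function and its constants are illustrative names, not a real model of anything, and the code applies the 30% fee exactly instead of rounding, so the year-two and year-three debt figures come out a few thousand dollars away from the rounded ones in the text.

```python
CAPACITY = 100_000    # feature-dollars the team can produce per year
INTEREST_PCT = 10     # yearly interest on outstanding debt
FEE_PCT = 30          # one-off "fee" added to every new loan


def simulate(targets, capacity=CAPACITY):
    """Return one (year, interest, borrowed, debt) tuple per release."""
    debt = 0
    history = []
    for year, target in enumerate(targets, start=1):
        interest = debt * INTEREST_PCT // 100
        available = capacity - interest  # what is left after servicing debt
        if target > available:
            # Shortfall is borrowed, and the fee is added to the debt.
            borrowed = target - available
            debt += borrowed * (100 + FEE_PCT) // 100
        else:
            # Any surplus capacity goes toward paying the debt down.
            borrowed = 0
            debt -= min(debt, available - target)
        history.append((year, interest, borrowed, debt))
    return history


for row in simulate([200_000, 120_000, 60_000]):
    print(row)
# (1, 0, 100000, 130000)
# (2, 13000, 33000, 172900)
# (3, 17290, 0, 150190)
```

Even with exact fees the shape is the same: a release spent mostly on restraint barely moves the debt, and the interest bill drops by only a couple of thousand feature-dollars.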
As the debt gets ever higher, there comes a point where the debt is high enough to nearly bring development to a total halt, and yet so large that nothing can be done about it. Imagine if the team instead tried to deliver $200k of features in each release. In the second release, they’re paying $13k in interest, so they have to take out $113k in loans to hit their target, adding maybe $150k after fees. In the third release, they’re paying $28k in interest, so they have to take out $128k in loans, adding $160k in debt. So by the fourth release, they’ve got $440k in debt; if they take on no further loans (and after some point you really can’t), their dev capacity will be around half of what it should be. But the debt is also so large that there’s realistically no way to pay it down; it would take at least five years of no further feature work. So at that point, it’s basically checkmate for the product . . . either you limp along with a product that doesn’t really evolve anymore and hope that a competitor doesn’t blow by you while you’re standing still, or you try to rewrite the whole thing and hope that doesn’t completely kill the project (which is by far the most likely outcome of a total rewrite effort).
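The runaway scenario can be sketched the same way. The helper names here are hypothetical, and fees are applied exactly, so the debt lands near $443k instead of the rounded $440k. The payoff loop also assumes interest keeps accruing while the team pays the debt down, which is why it takes longer than simply dividing the debt by yearly capacity would suggest.

```python
CAPACITY = 100_000    # feature-dollars the team can produce per year
INTEREST_PCT = 10     # yearly interest on outstanding debt
FEE_PCT = 30          # one-off "fee" added to every new loan


def debt_after(releases, target):
    """Debt after shipping `target` feature-dollars in each of `releases` years."""
    debt = 0
    for _ in range(releases):
        interest = debt * INTEREST_PCT // 100
        shortfall = target - (CAPACITY - interest)
        if shortfall > 0:
            debt += shortfall * (100 + FEE_PCT) // 100
    return debt


def years_to_repay(debt):
    """Years of zero feature work needed to clear the debt.

    Assumes all capacity left after interest goes to repayment.
    """
    years = 0
    while debt > 0:
        interest = debt * INTEREST_PCT // 100
        payment = CAPACITY - interest
        if payment <= 0:
            raise ValueError("interest consumes all capacity; debt never shrinks")
        debt = max(0, debt - payment)
        years += 1
    return years


debt = debt_after(3, 200_000)
print(debt)                  # 442897
print(years_to_repay(debt))  # 7
```

Under these assumptions the payoff stretches to about seven years with no feature work at all, which only strengthens the checkmate point.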
You can play through that scenario with different perceived interest rates, or thinking of the debt as a tax instead of a constant amount, and over longer periods of time, but hopefully it illustrates the problems that I mentioned above, both around artificially increasing expectations for the team, leading to yet more debt, leading to yet more pressure to cut corners, and around the fact that paying down the debt often requires a Herculean effort for very little payoff. Make of that what you will. As with real debt, there’s a time and a place to incur it: sometimes it’s important to hit a deadline, or to get a feature in for a key client, and the interest and fees are worth the cost. But in the long run, technical debt can’t be allowed to build up to the point where it’s both too large to pay down and too large to allow for future development work, which requires walking a fine line between incurring debt when it’s necessary to get things done fast enough and holding it off or paying it down so that it doesn’t get out of control.