An Apology for Agile: When to (Not) Use It and How to Make It Work

After reading this post about scrum yesterday, I went over to comment on the article on HackerNews, and was genuinely saddened by the overall negative tone of the comments there around scrum/agile. It seems like a fairly large percentage of people have only had negative experiences with “agile” processes and are either actively hostile to them or, at best, don’t see the point.

My overly-long comment response got eaten by what appeared to be a server timeout, but I think a blog post is a better explanation anyway. So here’s what you might consider an apology for agile process, wherein I’ll discuss what problems it solves, when not to use it, and how to avoid screwing it up if you do decide to use it.

In this discussion, I’ll refer to “agile” as a general thing, even though there are many variants on it. For purposes of this discussion, “agile” is a development process that involves timeboxed iterations of fixed length, work broken down into small-scale stories, and a product owner who generates (most of) the stories and decides on their priorities. That’s a simplification, and there are many variants on that theme, but for purposes of this post that’s what I mean.

We’ve All Got Problems, Right?

Anyone writing code is, kind of by definition, following some sort of process, even if it’s not an explicit process: your “process” might consist of writing code, testing a few things, deploying your website, and then repeating the cycle. So there’s always some current or default process that you’re following, and the only reason you should ever consider switching processes to agile (or anything else) is to solve a problem. If everything’s working great for you, then stop reading right now: keep doing what you’re doing!

The analogy that comes to mind here is, oddly, barefoot running. I struggled for years with shin splints and knee problems to the point where I could run, at most, once a week, on a trail, or else I’d get horribly injured. Eventually I tried barefoot-ish running (in those weird toe shoes) as a way to try to avoid those injuries and, for me, it’s worked wonders. There are people out there who evangelize barefoot running as if everyone should do it no matter what, because it’s better for you or “more natural” and so forth, but you know what? If you’re a runner and you don’t have recurring knee or lower leg injuries and you’re happy running, then keep doing whatever it is you’re doing! It’s working! The last thing you want to do is fix something that’s not broken. If you get shin splints and knee pain, by all means, try the barefoot thing; it might help. But if it ain’t broke, don’t fix it.

A development process is like that: if what you’re doing is working, keep doing it. If someone tries to tell you there’s One True Way to develop software and that if you’re not doing full-time pairing with test-driven development using story cards and iterations and daily standups . . . well, just ignore them, and be content in the knowledge that if that guy is your competitor, your business is going to do just fine.

What Agile Does

So what problems does agile solve? Primarily, agile solves problems that arise due to the interaction between developers and product owners. If you’ve got a division of labor where one person or set of people is responsible for defining what the product should do and prioritizing the features, while other people are responsible for the actual implementation, then you’re likely to run into these problems; if there’s no such division, then you’re much less likely to run into these problems, and agile is likely to be far less helpful. I can think of seven specific problems that agile helps address. Don’t have these problems? Then agile isn’t going to help you.

The first problem is killing developer productivity by constantly shifting priorities and directions. The classic problem here is that a developer starts working on feature A, but gets interrupted because the product owner decides suddenly that feature B is more important. The next day, after talking with a prospective customer, the product owner decides that feature C is really the most important thing. As a result, the developer is left with a bunch of half-finished work, and everyone loses. The primary mechanism for addressing this in agile is the iteration/sprint, which is supposed to be a “time box” where priorities are adjusted only at the start, but not within the time box.

The second problem is an inability for product owners to make well-informed tradeoffs around priorities. For example, in order to decide which out of features A, B, C, and D to work on, it’s important that a product owner know how much work those features are relative to one another. D might be the most important individual feature, but if A, B, and C combined are as much work as D, then that combination of features might be more compelling. Without reasonably-accurate estimates, a product owner can’t make those tradeoffs. Agile attempts to address that with relatively-estimated, small-scale stories (which are ideally also fairly independent, but that’s often easier said than done).

The third problem is having too many unnecessary meetings and status checks. I realize this sounds odd given the number of meetings and ceremony that often accompanies agile (standups, estimation meetings, acceptance meetings, retrospectives, demos . . .), but the theory behind the daily standup meeting is to give everyone who cares about the project’s status a well-known place to listen in, so they don’t bug people for status updates at random one-off times.

The fourth problem is an inability to accurately predict when a project will be done, primarily with the aim of either cutting scope or moving the deadline (or adding developers, which is always dicey), and knowing that one of those will need to be done as much ahead of time as possible. If you’re doing continuous deployment, this probably matters a whole lot less. If you’re releasing packaged software with a hard ship date, it matters a lot more. Agile attempts to address this by empirically measuring the team’s “velocity” and then comparing that to the number of “points” of work left, which (in my experience) tends to work much, much better than just constantly estimating and re-estimating when things will be done. (More on this later, because it’s probably the most important bit.)

The fifth problem is frustrated developers due to poorly-defined features. I’ve been in situations where developers attempted to start work on a particular feature only to find that the product owner hadn’t really thought it through, and that tends to just lead to a bunch of frustration and wheel-spinning: at best you waste time while you wait for a hastily-conceived answer, at worst the developer just makes their own decisions about how things should work and manages to get it completely wrong. Agile attempts to address this problem via story generation and estimation; if you can’t estimate a story, or it seems way too big, it’s a pretty good sign it’s not well defined yet.

The sixth problem is the temptation to adjust the quality knob in order to meet a date target. This one is pretty self-explanatory, I’d imagine, to anyone who’s ever actually developed any software. Agile attempts to address this by getting a shared definition of “done-ness” up front, and then providing accurate information around progress such that other levers can be pulled instead.

Lastly, this isn’t so much a problem per se, but agile builds in time for reflection on the product and the process. The rhythm of iterations gives you natural points for retrospectives where you analyze what’s working and try to change what’s not.

Again, don’t have those problems? Then agile probably isn’t going to buy you much. Have those problems? Maybe it’ll help.

Let me also take this chance to say that development processes work best when they’re voluntarily adopted by the team in question in response to real problems that they want to address. When the developers themselves see the process as something they want, because it helps them do their work well, you’re much more likely to succeed than when the process is imposed on the developers by some outside agent to solve their problems. As a general rule, trying to get developers to do anything which they don’t perceive as helping them do their work is going to be a failure. Developers are happy and think they’re doing awesome work but product owners feel like they can’t prioritize and have no visibility? You’re in for some rough conversations if you’re a product owner or manager trying to impose a new process. Developers are frustrated by constantly changing priorities, vaguely-defined features, and constant nagging about when things will be done? They’ll likely be much more receptive to trying agile.

One problem that agile most definitely doesn’t solve is a dysfunctional organization. Agile evangelists sometimes spin it as a way to make a dysfunctional organization functional, which is precisely the wrong thing: agile can help competent, well-intentioned people be more productive by improving the communication and information flow between different parties in the development process. If people are incompetent, if management and development are at odds on fundamental issues (management wants stuff done as fast as possible, developers don’t want to cut corners), or if the developers don’t trust the product owners to make prioritization decisions around features, agile isn’t going to solve any of those problems. Agile can perhaps help build back trust in an organization by allowing a team to be successful, but if the organizational prerequisites aren’t there, it’s not going to work. And, of course, agile is definitely not a guarantee of success; it’s just one thing that can, in the right situations, help make success more likely.

When Everything Looks Like A Flat-Head Screw

Allow me to make another poor analogy. A development process such as agile is a tool, so think of agile as a flat-head screwdriver. The general problem domain for flat-head screwdrivers is effectively “attaching things to other things,” but a flat-head screwdriver is only useful if you’re going to attach those things with flat-head screws. If you have a Phillips-head screw, you can maybe wedge a flat-head screwdriver in at an angle and try to use it, but if someone tells you to do it, you’re going to think they’re an idiot. If they ask you to pound a nail in with it, you’re really going to be annoyed. Sometimes, you need a different type of screwdriver, sometimes you need a hammer, and sometimes you just need superglue. Just because you’re attaching two things together doesn’t imply that you need a flat-head screwdriver or that one will even be helpful. And if your only exposure to flat-head screwdrivers is in situations where you really need a hammer, you’re going to think that flat-head screwdrivers pretty much suck, and you’ll wonder why any idiot would ever want one.

Unfortunately, there are agile disciples who seem to omit all those little subtleties. “Flat-head screwdrivers are awesome for attaching things together!” they say. “If it doesn’t work for you, you must be doing it wrong.” Or maybe, just maybe, you really just need some glue, and a screwdriver isn’t going to help at all . . . The one-process-fits-all evangelists have really done a lot to harm the popular perception of agile, I’m afraid.

So when should you *not* use agile? Agile works best when you have a team of people working with a product owner in a known technology and problem domain on definable features that can be relatively estimated with reasonable reliability and built and delivered incrementally. It doesn’t work so hot for, well, anything else. If you’re prototyping something, if it’s a small scale effort (i.e. a few days) you can fit it into agile by doing a timeboxed spike, but for large-scale prototypes it’s inappropriate. If it’s a fundamentally difficult or new problem domain, where you can’t reliably predict even the relative difficulty of any task, agile isn’t going to work. If there are major non-linearities to the work such that things are highly unpredictable, agile isn’t going to work. For example, performance tuning doesn’t fit the model at all: you have no idea how long it’ll take to make something run 50% faster, or if it’s even possible, so writing a story card that says “Make process X run 50% faster” is pretty pointless since there’s no possible way to estimate it. If you’re doing a large-scale refactoring or re-architecting project that’s really an all-or-nothing thing, agile doesn’t work so well; estimates are likely to be unreliable, and you don’t have the option to cut scope by shipping some stories out of that but not others. Hopefully you get the idea. The core of agile is really relative estimation of small-scale stories, and if that’s not possible then it’s not going to work, and things are going to break down pretty severely. In my experience agile also doesn’t deal well with large-scale architectural decisions; those have to be made outside the process, before it starts, or at some point during development you have to temporarily jettison agile as you re-architect things, then re-start your sprints.

Agile is neither useless nor a suicide pact. Use it when it works, do something else when it doesn’t, and use the built-in feedback mechanisms to adjust the process based on the problem domain. If you’ve got a bunch of screws to turn, use a screwdriver. If you’ve got a bunch of nails to bang in, by all means put the screwdriver down and go get a hammer instead. Don’t assume everyone else has flat-head screws to turn, but don’t assume that everyone else has nails either: there’s room in this world for all kinds of tools, and people using ones you don’t find useful are quite probably just solving a different set of problems than you are.

How To Not Screw Up Agile

Here at Guidewire we’ve done our fair share of experimenting, and I think we have a pretty good idea of what doesn’t work, as well as what can work in the right circumstances. And in my perspective on the world, the two non-obvious-and-easiest-to-screw-up aspects of agile that are critical to its success are relative estimation and agreement on “doneness.” We certainly didn’t do those at first, and I hear of lots of other teams that make the same mistakes, so I can only assume it’s an oversight that a lot of people make. The two things work together, and without them everything else in agile kind of falls apart.

Agreeing on doneness means deciding ahead of time what it means for a story to be finished. Sometimes people call this “done done,” since it’s not uncommon for a developer to say something like, “Oh yeah, the FooBar widget is done, I just need to write the tests for it,” with the implication that merely being done doesn’t actually mean work has halted, so it’s not really finished until it’s really truly done done.

Anyway, it’s absolutely critical that doneness be defined up front and that everyone (product owner, developers, QA, and anyone else involved) agree on it. Does it mean that unit tests are written? That QA has signed off and all identified bugs are fixed? That code documentation was added? That it was code reviewed? That it was merged into the main code line? That customer-facing documentation was added? It doesn’t matter what the answers are; it just matters that there are answers that everyone agrees to. (Obviously, though, you want your answers to correlate with the condition the code needs to be in for you to ship it.)

Relative estimation means that stories are estimated in a relative fashion, relative to other stories, rather than in some absolute measure of days. That’s generally done in terms of “points,” with all 1 point stories being about the same size, all 2 point stories being roughly twice as much work as the 1 point stories, and so on. A game like planning poker is often used to help the team converge on estimates to improve their accuracy, and accuracy usually improves over time as the team becomes more familiar with the problem domain. Those relative estimates are then mapped back to actual days by empirically tracking how many points’ worth of stories the team actually gets “done done” in a given period of time, known as the team’s “velocity.” Note that velocity is a team metric, not an individual metric: if you change the team composition, the velocity will change, and the team as a whole can be either greater than or less than the sum of its parts, depending on how well people work together. Also note that velocity is likely to bounce around a bit, especially early in a project, so in practice you’ll often use something like the running average of the team’s velocity for planning purposes. Not perfect, because this is software development that we’re talking about and it’s inherently unpredictable, but it’s far better than anything else I’ve ever seen anyone try.
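
To make the running-average idea a bit more concrete, here’s a rough Python sketch. The function names, the three-iteration window, and the sample numbers are all made up for illustration; they aren’t taken from any particular agile tool, and a real team would pick its own window.

```python
import math


def running_velocity(points_per_iteration, window=3):
    """Average the points finished ("done done") over the last `window` iterations."""
    if not points_per_iteration:
        raise ValueError("need at least one completed iteration to measure velocity")
    recent = points_per_iteration[-window:]
    return sum(recent) / len(recent)


def iterations_remaining(points_left, velocity):
    """Round up, since a partially used iteration still has to happen."""
    return math.ceil(points_left / velocity)


# Hypothetical history: points completed in each past iteration.
history = [9, 15, 12, 14, 13]
velocity = running_velocity(history)       # (12 + 14 + 13) / 3 = 13.0
print(velocity)                            # 13.0
print(iterations_remaining(60, velocity))  # 60 / 13 is about 4.6, so 5
```

Nobody estimated days anywhere in that calculation: the points-to-time mapping falls out of what the team actually finished.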

Relative estimation is much easier to do reliably than absolute estimation; not 100% reliably, of course, but more reliably. Absolute estimation requires a developer to take too many things into account: development time, test-writing time, documentation time, bug-fixing time, the probabilistic chance that something will blow up and go horribly wrong, even general overhead from meetings or other interruptions. It even requires you to take into account individual differences, since Alice might do a story in 1 day that takes Bob 2 days. Taking all of that into account is really hard to do, as it turns out. Rather than saying something will take “2 days,” which might mean “2 days of uninterrupted work if everything goes smoothly and nothing blows up and I only write a few tests,” it’s much easier to say “this is 2 points because I think it’s about twice as much work as that thing I said was 1 point.” You don’t have to estimate your overhead, or how much time you have to spend on development versus testing, you just measure it. You don’t even have to take who does the work into account, so long as both Alice and Bob take twice as long to finish 2-point stories as they do to finish 1-point stories; Alice could work twice as fast as Bob, and the math still all works out, because you’re measuring overall team velocity, not individual velocity. Maybe your 4 developers get 40 points of work done in 10 days, and maybe they get 8 points of work done; it doesn’t really matter, so long as the estimates are about right relative to each other. That’s how fast you’re working, so now you can start to get an idea of how long the project will take, and make decisions accordingly.

The other crucial advantage is that relative estimation doesn’t pressure people into rushing things to meet the time estimate they gave. As a developer, if you say something is 2 days of work, and you’ve been working on it for 4 days, it’s very, very tempting to just call it done and move on; the psychological pull is pretty strong there. If you said it’s 2 points of work, and the points to days mapping is computed and somewhat variable anyway, it’s much easier to just keep working until it’s done. Maybe our velocity will be lower this sprint as a result, or maybe it was a poor estimate and it was more work than I thought, or maybe that’s just how long 2-point stories take. It’s much easier to just keep working until it’s “done done” if you don’t give a fixed date estimate up front.

I really can’t over-emphasize how important these two things are. If you don’t do these, you’re likely not going to have a good experience with agile. Here’s kind of how things tend to break down. Say you don’t agree on what “done done” means up front. Suddenly, everyone is tempted to adjust the quality knob when the going gets rough, which is exactly what no one really wants. (Note: if that is in fact what management wants, you have deeper organizational problems which agile isn’t going to fix.) Or perhaps you count stories as “done” which aren’t really “done done,” which gives you an inflated velocity for the initial stages of the project, so maybe you think you’re getting 20 points of work done an iteration when in reality you can only get 14 done but you’re fudging the numbers. Now you have two problems: you’ve got 6 points of unscheduled off-the-grid work lurking in the future *and* you’ve got everyone making their plans based on the team doing 20 points of work per iteration, which completely throws off everyone’s ability to plan and prioritize. Or say you try to do absolute estimation instead of relative estimation. Now there’s pressure to cut corners to meet the estimates, and the estimates themselves are wildly inaccurate, but no one is really sure how inaccurate they are because you’re not really rigorous about tracking it or making sure that things are “done”, so once again your ability to measure your rate of progress is gone. Once you lose that visibility into the real rate of progress, not only does it throw off your ability to plan your development schedule and cut scope/shift dates/take other action as necessary, but it starts to make people nervous. 
Product owners know that they don’t really know how things are going, so they start bothering developers outside of standup meetings and changing priorities mid-iteration to try to exert some control over the process, or they make hasty decisions to try to course-correct, or they just freak out and kill the project because they have no idea if it’ll be done on time and on budget or if it’ll come in 6 months late and 200% over budget, and if it comes in 200% over budget they’ll get fired.
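
To put numbers on the inflated-velocity scenario, here’s a small Python sketch. The 20 and 14 points per iteration come from the example above; the 140-point backlog is a figure I’ve made up purely so the arithmetic is easy to follow.

```python
import math


def iterations_needed(points_left, velocity):
    """How many whole iterations it takes to burn down `points_left`."""
    return math.ceil(points_left / velocity)


backlog = 140            # hypothetical amount of work left, in points
reported_velocity = 20   # what the team claims per iteration
actual_velocity = 14     # what really gets "done done" per iteration

planned = iterations_needed(backlog, reported_velocity)
needed = iterations_needed(backlog, actual_velocity)
hidden_per_iteration = reported_velocity - actual_velocity

print(planned)               # 7
print(needed)                # 10
print(hidden_per_iteration)  # 6
```

Under those assumptions everyone is planning around 7 iterations when the work really needs 10, and nobody finds out until the unfinished points come due.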

So if you take away just one thing from this ramble of a blog post, let it be this: relative estimation and agreement on doneness are absolutely critical to the success of the agile process.

Summing It Up

So that’s what I’ve got for you this time around. In summary: agile is useful in certain circumstances, it solves specific problems that you may or may not have, it’s only worth trying if you do in fact have those problems and if it seems like a good match to your problem domain, and relative estimation and “done doneness” are essential to the success of the process.

Why There Have Been No Posts Lately…

I apologize for the extended absence of posts here on the dev blog, but I’ve been busy with a new Open Source project:

The JSchema Project is an attempt to define a very simple schema system for JSON documents. JSON, for those who are not familiar, is a subset of JavaScript that has proven to be a useful and less verbose alternative to XML for data interchange:

    {
      "group" : "Developers",
      "members" : [
        { "id" : 1, "first_name" : "Joe", "last_name" : "Smith" },
        { "id" : 2, "first_name" : "Jennifer", "last_name" : "Mitchum" }
      ]
    }

It is easy to parse and produce, and integrates well with newer front-end technologies written in JavaScript.

The motivation for this project, it will not surprise you, was Gosu. An excellent engineer at Amica, JP Camara, had been playing around with a JSON-based Type Loader for Open Source Gosu. I contacted him about adopting a schema-based approach, and began looking at the industry standard, JSON Schema. It quickly became apparent that JSON Schema was too complicated, and that a small extension to JP’s simple and elegant template approach would yield an expressive schema language, with a Gosu Type Loader to boot.

To give you a taste of how simple JSchema is, here is the schema for the JSON example above:

    {
      "group" : "string",
      "members" : [
        { "id" : "integer", "first_name" : "string", "last_name" : "string" }
      ]
    }

I hope that schema makes intuitive sense to most people: JSchema was designed such that the schema corresponds closely to the documents it describes, and it should be easy to take an example JSON document and transform it into a JSchema schema.
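
To illustrate how closely the schema mirrors the document, here is a toy conformance check in Python. This is not the JSchema implementation or the Gosu Type Loader. It is a sketch that assumes only the two type names used above (“string” and “integer”), treats a one-element array in the schema as describing every element of the document’s array, and requires every key named in the schema to be present. The `conforms` function and `TYPE_NAMES` table are hypothetical names invented for this example.

```python
import json

# Assumed mapping from the type names in the example schema to Python types.
TYPE_NAMES = {"string": str, "integer": int}


def conforms(doc, schema):
    """Return True if `doc` has the shape described by `schema` (toy rules only)."""
    if isinstance(schema, str):
        expected = TYPE_NAMES.get(schema)
        if expected is None:
            return False  # type name this sketch does not know about
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(doc, expected) and not isinstance(doc, bool)
    if isinstance(schema, list):
        # Assumption: a one-element schema array describes every element.
        return isinstance(doc, list) and all(conforms(item, schema[0]) for item in doc)
    if isinstance(schema, dict):
        # Assumption: every key named in the schema must be present in the document.
        return isinstance(doc, dict) and all(
            key in doc and conforms(doc[key], sub) for key, sub in schema.items()
        )
    return False


schema = json.loads("""
{ "group" : "string",
  "members" : [ { "id" : "integer", "first_name" : "string", "last_name" : "string" } ] }
""")

document = json.loads("""
{ "group" : "Developers",
  "members" : [
    { "id" : 1, "first_name" : "Joe", "last_name" : "Smith" },
    { "id" : 2, "first_name" : "Jennifer", "last_name" : "Mitchum" } ] }
""")

print(conforms(document, schema))                      # True
print(conforms({"group": 42, "members": []}, schema))  # False
```

The recursion walks the schema and the document in lockstep, which only works because the schema has the same nesting as the documents it describes.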

The JSchema specification is still very young but I think it is reasonably complete. We may add some more core data types (a ‘bytes’ datatype for raw byte data seems like it might be useful, for example) but the core ideas are working out well in our Gosu implementation (I’ll blog more about that in a later post).

If you are interested in the specification and/or participating in its design (or making the website prettier), you can fork it here. Participation is very welcome!