Phoenix First Two XP Sprints

Phoenix has finished Sprints 5 and 6, its first two XP Sprints.

An XP team, like any team, goes through four stages: “forming, storming, norming, performing”. The first Sprint felt like the storming stage, where we were trying to figure out the best way to get the code in without spending too much time on upfront design. At the same time, we were also getting used to pair programming.

Even though pair programming is an old trick for me, I still feel that my pairing skills have gotten worse during the past three years of working solo. The second Sprint felt a lot better, and I am hoping to keep this trend going.

Items worth noting:

  • We modified the lava lamp to keep a green light on when everything is good. Even though it is redundant, it has a very positive effect on us. The only thing we might need to watch out for is that someone mentioned the lamps could be a fire hazard, because they get very hot by the end of the day. So we are going to turn them off at the end of the day. This is when I found out that the X10 remote controller does not work, so they are back for replacement now.
  • The lava lamps are helping us get into the habit of treating broken tests as the highest priority. Due to the nature of Phoenix, we have already had some interesting test breakage: tests that only break on the server, tests that only break on Linux, and a test that hung. One interesting discovery is that each time we are forced to figure out what is wrong and fix it, our tests end up making more sense and being more behavior driven. And to think I was planning on settling for hacks to keep the tests passing!
  • At the beginning of the project, we chose to create just enough stories to get us through the first Sprint, then created a few more for the second Sprint. Looking back, I think that was a good choice. The stories that we create now are very different from, and much better than, the earlier ones. I think that is because at the beginning, your system has literally nothing. It would take a very good story writer to come up with a list of stories that really meet the “INVEST” criteria. I am not saying it is impossible; I just think that two Sprints of bad stories is not a bad price to pay to get the ball rolling as early as possible and to avoid the hassle of learning, teaching, and debating good stories vs. bad stories.

Checking in on Process Changes

As I blogged about a while back, for this release cycle of the PolicyCenter product we’ve made a number of changes to our development process to try to make things work more smoothly and predictably.  We re-organized the team into cross-functional sub-teams (which we call “pods”), moved from four-week sprints to two-week sprints, started scheduling and estimating work based on story cards with assigned points instead of estimating in days, and put more of an emphasis on being “done done” with features before moving on to anything else.  We’ve been at it for 8 full sprints now, which is long enough to get a pretty good read on how it’s worked out.

In something of a pleasant surprise, given the totally unpredictable nature of software development where hardly anything ever works like you’d expect, the changes have actually worked out very well.  While it’s too early to tell how “done done” we’re really getting (we’ll obviously find out more as we try to close out the release), the product is certainly more stable and more complete early in the release than it’s ever been before.  The biggest benefit though, by a wide margin in my estimation, has been the change to break out all our work into small (1-5 day) stories that get written up and estimated with the entire team.  While the estimation meetings where we do that can seem fairly tedious and slow-moving (because they are), the time spent in them has proven absolutely invaluable.  Working off of stories instead of PRDs has helped keep the development team moving by making sure there’s always a steady supply of ready-to-work-on items (and we can tell when that supply is low), and doing the meetings as a pod makes sure that everyone is on the same page, or at least much closer to it than happened before.

So suppose you want to try this out on your own team.  What exactly do you need to do?  The INVEST model is a great starting point for thinking about what makes for a good story, though personally I’m not yet sold on the whole “incremental architecture” thing, so I’m less strict about avoiding infrastructure-only stories.  Where do the stories come from?  On our team, the stories are worked out collaboratively during a weekly meeting we call an “estimation session,” where whoever is driving a feature (generally the product manager, but theoretically the developers themselves for dev-driven features) presents what they’d like done and has help from the rest of the group turning that into bite-sized chunks.  The stories are then estimated using “planning poker,” where each developer has a set of cards with estimate numbers on them (we use 0, 1/2, 1, 2, 3, 5, 8, and ?) and independently chooses the estimate they think is appropriate; everyone hides their chosen card until everyone else is done, to avoid any sort of groupthink, and everyone shows at once.  If there are discrepancies, we talk them out (other teams re-vote until the vote settles on a number, but my group has been less strict about that and tends to just come to a group decision quickly, since none of us think it’s worth a huge amount of time debating if something is really a 2 or a 3).  The estimates are done in “points” which, at the start, you can think of as “ideal developer days,” but which ideally just become a relativized estimate (3 point cards should take roughly 3 times as long as a 1 point card) that then allows you to empirically measure how many points you do per sprint.
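For anyone who thinks better in code, here is a rough Python sketch of the mechanics described above: one simultaneous reveal of cards, plus the points-per-sprint arithmetic. It is only an illustration, not a tool we actually use. The function names (poker_round, velocity, sprints_needed), the status strings, and the sample numbers are all made up for the example; only the card values come from our deck.

```python
import math

# Card values from our deck; "?" means "I can't estimate this yet".
DECK = [0, 0.5, 1, 2, 3, 5, 8, "?"]


def poker_round(votes):
    """Summarize one simultaneous reveal (hypothetical helper).

    votes maps a developer's name to the card they played.
    Returns a (status, value) pair.
    """
    for name, card in votes.items():
        if card not in DECK:
            raise ValueError(f"{name} played a card that is not in the deck: {card!r}")
    cards = list(votes.values())
    if "?" in cards:
        # Someone can't estimate: the story probably needs research first.
        return ("needs_research", None)
    if len(set(cards)) == 1:
        # Everyone showed the same card.
        return ("agreed", cards[0])
    # Discrepancy: report the spread so the group can talk it out.
    return ("discuss", (min(cards), max(cards)))


def velocity(points_per_sprint):
    """Average points completed per sprint, measured from past sprints."""
    return sum(points_per_sprint) / len(points_per_sprint)


def sprints_needed(backlog_points, points_per_sprint):
    """Rough forecast: whole sprints needed to burn down the backlog."""
    return math.ceil(backlog_points / velocity(points_per_sprint))


if __name__ == "__main__":
    # Sample data is invented for illustration.
    print(poker_round({"ann": 2, "bob": 3, "cy": 2}))
    print(velocity([18, 22, 20]))
    print(sprints_needed(45, [18, 22, 20]))
```

Note that the sketch stops at "discuss" rather than re-voting automatically, which matches how my group handles discrepancies; a team that re-votes until the numbers settle would simply call poker_round again after the discussion.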

The meetings tend to run the smoothest when the driver for the feature has at least an idea of how things break down into stories already; the rest of the team can then question that, suggest alternative splits or combinations, etc., but having that starting point helps keep things moving.  The important thing that happens during the meeting, though, is really that people ask questions.  Lots and lots of questions.  Does this need to work in cases X and Y or only X?  Do we have any test infrastructure for this sort of thing?  When you say you want feature “foo,” do you mean A or do you mean B?  Where does the field labeled “Total” on the mockup come from and how is it defined?  Does this need to be configurable by customers or can we hardcode the logic? Etc.  Everyone on the team (or, in our case, pod) is at the meeting, so everyone comes away with approximately the same understanding of what needs to be done.  Perhaps more importantly, the questioning serves to identify stories and features that aren’t really ready yet for development:  either no one understands the issue well enough to estimate it (so we need to do some research), or the feature owner can’t answer the questions in enough detail yet, or the estimates turn out to be much higher than the PM expected and they decide to re-think the feature and de-scope it in some way.  To me, it’s really the estimation part that drives all of that:  in order to really accurately estimate a small chunk of work, you need to really understand what it is, and if you didn’t have to give that estimate you’d be much less vigilant about trying to understand the feature.

So out of all the different agile practices, story-based workflow might well be my favorite at this point.