Quality Assurance in Guidewire’s Agile World

The goal of quality software development is to release a product that meets or exceeds customer expectations. It sounds simple, but few software companies (especially those developing enterprise applications) ever achieve this goal, and even fewer do so consistently. From its inception, Guidewire adopted aspects of Agile Development and Extreme Programming with the objective of high-quality, on-time, customer-relevant releases. My aim with this blog entry is to describe the QA team’s experience working within Guidewire’s Agile model: the lessons we’ve learned, how we work, and what we strive for.

There’s no QA in XP

QA and Agile/XP are not natural bedfellows. The issue has been discussed ad nauseam in various forums (see http://www.theserverside.com/news/thread.tss?thread_id=38785 for a relatively entertaining discourse on the matter). In short, Kent Beck (the creator of Extreme Programming) did not see a role for a QA team or for testers in general. In Extreme Programming Explained, Beck stated: “An XP tester is not a separate person, dedicated to breaking the system and humiliating the programmers.” Rather, in Agile/XP, programmers develop extensive unit tests and the end customers decide whether the resulting product (or feature) is acceptable. In fact, the founders of Guidewire debated whether or not to have a QA team at all, based both on the ideals of XP and on negative experiences with the effectiveness of QA teams at prior companies. Yet Guidewire has consistently maintained a 2-to-1 developer-to-QA-Engineer ratio, a very high investment in QA compared to the software industry as a whole. Why the disconnect? Is Guidewire really an Agile/XP company? Is Agile/XP too idealistic?

My 2 cents

My first assessment is that Guidewire does indeed try to emulate Agile/XP goals. My second is that strict adherence to Agile/XP is simply unrealistic. Here are several reasons why Guidewire has required a dedicated QA effort:

  • Unit testing, test-first development, and pair programming fit into that category of ideas that almost everyone agrees are worth the investment. However, the intense dedication required to implement XP practices successfully and consistently means the reality of XP projects often falls short of ambition.
  • Unit tests generally fail to account for integration between features, for the final packaged customer deliverable, and for the various environments the product will operate within. Without an effort to exercise the application in its end-customer state, including against all supported platforms, too many issues will be missed by unit testing alone.
  • From a psychological point of view, it is difficult to objectively critique your own creation (i.e. your own code). An interesting case study would be to contrast the tests a developer implements with the tests defined by a QA Engineer. The developer’s unit tests would likely cover the obvious use cases the code was designed for, but how likely is it that they would exercise the uglier cases that lurk at the boundaries? (A sketch of this contrast follows the list.)
  • Automation (specifically unit testing) cannot be applied to every feature or situation. Some amount of manual testing is realistically unavoidable.
  • QA Engineers live at a higher level of abstraction than a typical developer and are exposed to a broader view of the product and its public requirements. Considerations that are obvious at the customer level are often invisible to a developer focused on a specific block of code.
  • The customer feedback loop is not always ideal. Agile depends on customers willing to invest heavily in the product development effort, communicating their priorities to development and telling them whether the product being built fits their needs. Many customers are unwilling or ill-equipped to make this investment. It’s arguable whether that is the best decision on the customer’s part, but the reality is that in Agile you’re asking customers to engage in what amounts to beta testing on steroids. As a result, some intermediary is necessary (namely QA) to determine whether a feature meets its stated requirements.
  • Finally, in the world of mission-critical insurance applications it is simply unacceptable to rely on the end customer to discover product issues. Of course some bugs will slip into a release and be caught by customers, but keeping their number to a minimum is key to successful deployments and content customers.
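
To make that contrast concrete, here is a minimal sketch, written in Java with JUnit rather than GScript, with every name in it invented purely for illustration. The first test is the happy path a developer typically writes; the last two probe the boundaries a QA Engineer tends to hunt for.

      import junit.framework.TestCase;

      public class DivideTest extends TestCase {

        // Hypothetical method under test: integer division with rounding.
        static int divide(int numerator, int denominator) {
          if (denominator == 0) {
            throw new IllegalArgumentException("denominator must be non-zero");
          }
          return Math.round((float) numerator / denominator);
        }

        // The test a developer typically writes: the case the code was designed for.
        public void testSimpleDivision() {
          assertEquals(2, divide(4, 2));
        }

        // Boundary case 1: the input the design forgot about.
        public void testZeroDenominatorThrows() {
          try {
            divide(1, 0);
            fail("expected IllegalArgumentException");
          } catch (IllegalArgumentException expected) {
            // the contract held
          }
        }

        // Boundary case 2: Math.round() rounds halves toward positive infinity,
        // so divide(-3, 2) yields -1, not -2; a subtlety no happy-path test exercises.
        public void testNegativeHalfRoundsUp() {
          assertEquals(-1, divide(-3, 2));
        }
      }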

Hmmm. We do need QA. Now, what is QA again?

So, given that QA has proven justified at Guidewire, what is its most effective application? First off, I feel a little guilty using “we” when describing the QA team. A separate QA organization truly belongs only in a waterfall development model, where testing is a stand-alone stage and code is delivered wholesale from development to QA. A team expressly charged with ensuring quality also inherently defeats the purpose of the Agile/XP process, where quality should be addressed by everyone in the organization and at all points of development. My belief is that QA in an Agile/XP environment should follow general Agile/XP tenets (when in Rome…), above all a passion for continual improvement. Here are some guiding principles that I feel are inherent to any successful QA effort within an Agile/XP environment:

  • Never outsource QA. Luckily, this has never been an issue at Guidewire. The fact is that no salary differential (and no communication device) will compensate for the collaboration that occurs when Engineers sit in the same room with no barriers to conversation.
  • There is no substitute for a good Engineer. Hiring is key.
  • Strive for tightly integrated development and QA teams. Ideally, QA Engineers sit next to their developer counterparts, and the testing effort is shared and occurs in lockstep with code development. The automated testing infrastructure should be common as well. At Guidewire we have what may be the ultimate automation solution: tests developed by QA are run 24/7 by a harness which assigns broken tests to those responsible for their regression (usually development). Thus, a valid automated test is maintained ad infinitum. Simply check in your test and walk away… (A bare-bones sketch of this harness idea follows the list.)
  • Make holistic quality-related decisions. Involving Development and Product Management in decisions impacting testing resources allows for more effective use of limited time. QA should focus on areas known to be high risk, for example new features where unit testing is known to be lacking or where the code base is likely to exhibit buggy behavior due to inherent complexity. As well, the likely customer impact of a bug in a given area (knowledge usually unique to PM) is valuable in determining whether that feature deserves special attention.
  • Establish a model of continual training and leverage your knowledge base to keep Engineers up to date. At Guidewire we send all QA Engineers through the training courses developed for Field Engineers. The expense is rather large (3 weeks of full-time training), but the payoff is Engineers exposed to the entire product and its customer-facing interface. Without this training it would likely take years for each Engineer to attain the same broad base of product knowledge.
  • Develop good tests, whether manual or automated. A bad test (one that is redundant, poorly defined, trivial, or, worst of all, deceptive) is expensive both in maintenance and in the false confidence it creates.
  • Automate, automate, automate. Guidewire strives for 100% automated test coverage. This is a brash goal and oftentimes the reality is far from ideal, but if you don’t shoot for the moon…
  • Treat test code as production code. Follow good coding conventions, comment well, and refactor each test such that it remains relevant. This is another example of a pinnacle of testing that is difficult to reach.
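
To give a flavor of the harness concept from the third bullet above, here is a bare-bones sketch in Java with JUnit. This is not Guidewire’s actual ToolsHarness; assignToOwner() is a hypothetical stand-in for whatever routes a failure to the responsible engineer. The point is simply the loop: run everything, route every failure to an owner, repeat.

      import java.util.Enumeration;
      import junit.framework.TestFailure;
      import junit.framework.TestResult;
      import junit.framework.TestSuite;

      public class MiniHarness {

        public static void main(String[] args) throws InterruptedException {
          while (true) {  // run around the clock
            // DivideTest is the hypothetical suite from the earlier sketch;
            // a real harness would discover every checked-in test.
            TestResult result = junit.textui.TestRunner.run(new TestSuite(DivideTest.class));
            for (Enumeration e = result.failures(); e.hasMoreElements(); ) {
              assignToOwner((TestFailure) e.nextElement());
            }
            Thread.sleep(60 * 60 * 1000);  // wait an hour, then run again
          }
        }

        // Hypothetical: a real harness would consult source control to find the
        // last committer and assign the broken test to that person.
        static void assignToOwner(TestFailure failure) {
          System.out.println("Broken: " + failure.failedTest()
              + " -> assigning to the responsible committer");
        }
      }

The interesting part is not the loop but the assignment: making every broken test someone’s explicit responsibility is what keeps a suite valid ad infinitum.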

Finally

Hopefully this is a decent primer on the often misunderstood and historically maligned area of software development called Quality Assurance, especially as applied to Agile. There are many quality-related topics I’d like to eventually delve into or expand upon. Test coverage is a fascinating pseudo-science that’s fun to debate. The evolution of the QA Engineer from a key-banging monkey to a full-fledged object-oriented programmer is interesting as well (especially from a staffing point of view, where such QA programmers are as rare as an early Triassic mammal). I would also like to explore whether it may in fact be a healthy goal for a development organization to reach a state where QA is superfluous. In addition, I’d like to cover what is perhaps the greatest challenge of Agile development: scaling what works well on a small team to a much larger organization. Finally, it’s fun for me to reminisce on the history of the Guidewire QA team, from pure manual testing, to a nascent Test Harness and limited test automation, to the world today, where GScript tests and the ToolsHarness infrastructure allow for near-limitless automation potential.


7 Comments on “Quality Assurance in Guidewire’s Agile World”

  1. feelsgood11 says:

    Interesting thoughts.

    There are several things about QA we should keep in mind:
    1) QA is a way of thinking;
    2) A human won’t check his own work as well as another human will.

    Unit testing is a good practice. But it’s not a panacea. Developers should not rely on it completely, or they will come up short.

    Testing itself is a huge science. I don’t quite understand those people who want to eliminate QA. I think that’s impossible if we want to produce software of good quality.

  2. feelsgood11 says:

    Even now, some customers don’t understand that quality is not free.

  3. Stanislav says:

    Thanks for the article. I’m serving as a ClaimCenter test lead at one of Guidewire’s international clients. First, some background. I began as the only QA colleague among 10 developers. We were 3 months into the project when I was hired, with just 2 months until pilot. Our time to production was super-short, and so product training was scrapped, insurance training was minimal, and, worst of all, QA was treated as nothing more than repetitive tasks that a monkey could perform. Ten developers were committing code that fed to 1 QA resource (me), which caused many long 12-17 hour workdays, up to 7 days a week. Big changes to CC 4.x went into that first production build. (We had to fully support Cyrillic and multi-currency.)

    Fast-forward to now, 10 months later: I lead a team of 6 testers (including myself) and have built out regression sets that cover about 70% of the system. Most importantly, I have dedicated one person to creating automated test cases with Test Complete. We are now being invited to new-feature workshops and, I feel, the quality of the end product is increasing.

    My team and I still face difficulties testing. Maybe you can shed some light on these areas:

    1) Is Test Complete (TC) the best tool for automated CC testing? We tried others and TC seemed to be the best fit.
    2) We have almost no written requirements, so it is hard to determine what is a bug and what is not. Is it right to push back and not test until written requirements are in hand?
    3) Our test scripts are not reviewed. Who should review test scripts?
    4) We deploy new releases with critical bugs. My team has found the bugs and escalated them, but the release still gets deployed. Any ideas on how we can formalize a process that will prevent this?

    I welcome your feedback and comments.

    Thanks for hearing me out,
    Stanislav

  4. rsmithguidewire says:

    Hi Stanislav,

    Are you Russian?

    1) Our preferred 3rd-party automation tool is Watir – http://wtr.rubyforge.org/. A couple of years ago we trialed TestComplete as a replacement for WinRunner, which we were using for a small set of browser tests. TC is inexpensive compared to other professional solutions, and it passed our browser-testing requirements. However, we decided to go with Watir, a Ruby library which automates browser actions. It’s open-source, free, and so far we have not found any limitations in driving Guidewire applications. As well, there is an active user community for Watir, and Ruby skills are highly desirable.

    2) I would say no. Assuming you have hard deadlines, QA has to find whatever means necessary to understand what a product/feature is trying to accomplish. Frustrating, yes. Unproductive, yes. But it’s better than waiting for requirements which will likely never materialize. I would attack the lack of written requirements as part of a broader effort to improve the way you develop your product; I have to assume QA’s frustration on this matter is shared by development. For instance, integrating QA and development more tightly should at least increase QA’s understanding of what the developers are implementing. If the people you work with buy into Agile development, then you can begin by developing stories with developers (and hopefully with those defining the product requirements). It’s important to continually try new things, keep those that work, and be willing to alter the status quo.

    3) Code review (whether of functional code or tests) is most effective a) at the time the code is written and b) when done selectively. IMO, retrospective code review brings little value. Pair testing and constant refactoring are two practices which I believe do bring meaningful results. Pair testing, to me, means involving a fellow developer or QA Engineer on a particularly complex chunk of test code that could benefit from inspection; it is also useful for brainstorming test cases. Constant refactoring means that QA (and hopefully development) consistently run and repair/modify tests as development progresses. The assumption is that there is never a perfect test: consistent review and refactoring over time, with different eyes and different levels of knowledge, hopefully moves each test in the right direction.
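
    To make “constant refactoring” concrete, here is a minimal before-and-after sketch in Java with JUnit. The Claim and Approver classes and the approval limit are entirely hypothetical, just enough scaffolding to show the shape of the change.

        import junit.framework.TestCase;

        // Hypothetical domain code, just enough to make the example concrete.
        class Claim {
          final double amount;
          Claim(double amount) { this.amount = amount; }
        }

        class Approver {
          static final double APPROVAL_LIMIT = 50000.0;
          boolean approve(Claim claim) { return claim.amount <= APPROVAL_LIMIT; }
        }

        public class ApproverTest extends TestCase {

          // Before refactoring: a magic value that happens to pass. It tells the
          // reader nothing about the rule under test or how close 100.0 is to
          // any boundary.
          public void testApprove() {
            assertTrue(new Approver().approve(new Claim(100.0)));
          }

          // After refactoring: the tests pin both sides of the boundary and name
          // it, so they document the rule and point straight at it when they fail.
          public void testClaimAtLimitIsApproved() {
            assertTrue(new Approver().approve(claimOf(Approver.APPROVAL_LIMIT)));
          }

          public void testClaimOverLimitIsRejected() {
            assertFalse(new Approver().approve(claimOf(Approver.APPROVAL_LIMIT + 0.01)));
          }

          private Claim claimOf(double amount) { return new Claim(amount); }
        }

    Run this kind of pass every time a test is touched, and the suite slowly converges on tests that explain themselves.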

    4) Usually the final release decision is not made by QA. QA’s role is to make the known issues (and their severity) public. Setting the bar for quality must be a collaborative process. You, as QA, can set criteria for when a product enters/exits certain states (e.g. alpha/beta/RC). Tracking bug status over time (e.g. the number of bugs in a certain state) is also very useful for understanding trends in development. Sharing these criteria and the product state with the rest of the product team at least makes them aware of your position. A healthy release may in fact allow a handful of critical issues (not blockers) as long as there are workarounds or the issues are in non-critical areas of functionality. Again, it’s what you decide as a team that matters. Making a feature-rich, quality release on schedule is an incredible challenge… However, having the bug state at your disposal allows you to try (as early as possible) to influence decisions which affect the release: cutting scope, moving release dates, adding resources, having development/QA focus on blockers and criticals.

    From many of your comments it appears your QA team works in a relative vacuum (and more waterfall than agile). This can’t be repaired overnight, and I wouldn’t pretend to know the right answers for your team. I have found that many of the ideas associated with Agile tend to help a great deal. One of Guidewire’s founders was fond of saying how quickly incremental improvements can move an organization. I really think it’s true…

  5. Stanislav says:

    Yes, I’m Russian.

    Thanks for your comments and feedback; it is appreciated. Our organization has struggled with the agile vs. waterfall approach from day 1, and as you can see, the issue still isn’t resolved. Our QA team, for sure, follows the waterfall approach while dev and the analysts follow an agile model. [Conversations with analysts and developers are usually devoid of QA’s input/presence.]

    Thanks again.

    Goodbye!

  6. PeterMcC says:

    Thanks for these articles; they are very useful. I am a developer working for one of Guidewire’s clients, focused on setting up a Continuous Integration environment and supporting developer testing. We are setting up a framework for developers to write unit tests for the Java and GScript code they will be developing. I have a couple of questions relating to this.
    Reading the GScript documentation, it appears that GScript classes can be unit tested with GUnit, but not Business Rules and Enhancements. Is it possible to unit test these as well?
    Also, the documentation I’ve read shows how to kick off GUnit tests from Guidewire Studio. We are interested in kicking off these tests from our Continuous Integration environment, which currently kicks off the build, JUnit tests, code coverage, etc. using Ant. Is it possible to automate running the GUnit tests as well?
    Thanks
    Peter

  7. Taras says:

    Stanislav,

    We are using the Webtest package for our automated testing. Tests are written by developers when they write code and then handed over to testers. I’ve also tried Watij, a Java version of the Watir tool suggested above. It was somewhat useful for performance testing, but we didn’t do much work with it; it was more of a proof of concept.

    The tool our testers prefer when left to their own devices is called QTP.

    In an ideal world, your developers shouldn’t even start developing if there are no story cards. If there are story cards, they should be enough for you to test against. Ideally, you should be present when the story cards are created and have your input reflected in them.

