Tuesday, February 8, 2011

BOOK CLUB: How We Test Software at Microsoft (16/16)

This is the second part of section 4 of How We Test Software at Microsoft. This is also the final chapter of the book. After three months of near weekly updates (some more often, some less often… sorry about that, this approach was a learning process for me, too :) ), this project has now come to an end. I will post a follow-on to this final post with a more conventional “total” review of the book and some comments on this BOOK CLUB process (will I do this again? What did I learn from doing this? What went well and what would I want to do differently in the future?), but first, let’s close out this endeavor with some thoughts from Alan regarding where testing may be heading and how Microsoft is trying to shape that future, both within its own company culture and in the broader culture outside of it.


Chapter 16: Building the Future

Alan starts out this final chapter with the reminder that, compared to software development, software testing is a new player in the culture. Computer services offered to the public commercially began in earnest in the 1950s. In those days, software development and software testing were the same discipline; the developer did both. As the systems grew more complex and more lines of code were being written, and spurred on by developments in the manufacturing world, quality of the process became more of a focus, and the idea took hold that a separate, non-partisan entity should be part of the process to review and inspect the systems. Thus, the role of finding bugs and doing “desk checks” of programs broke off from development practice, leaving two disciplines: the software developer wrote the code, and a tester checked it and made sure it was free of defects (or, barring that, found what defects they could).

Today, software testing is still primarily a process of going through software and verifying that it does what it claims to do, and keeping our eyes out for the issues that would be embarrassing or downright dangerous to a company’s future livelihood if a customer were to run across them. The programs written today are bigger, more complex and have more dependencies than ever. Think of the current IDE culture; so many tools are available at developers’ fingertips that they are able to write code without writing much of anything, it seems. Full featured utilities can be created with just twenty lines of code. Of course, those of us in testing know full well that that’s not the real story; those 20 lines of code contain references to references to objects and classes that we have to be very alert to if we want to ensure that we have done thorough testing.


As far as we can tell, the future is looking to get more wired, more connected, more blurring of the digital lines structuring our lives. The days of a discrete computer are ancient history. Nearly every digital device we come into contact with today now has ways and means to synchronize with other devices, either through cabled connections or through the ether. The need to test has never been greater, and the need for good testing is growing all the time. The question of this final chapter is simple, but by no means easy… where do we go from here?


The Need for Forward Thinking

In the beginning there was debugging, then we moved to verification and analysis. Going forward, the question is not going to be so much “how do we verify that the system is working?” but rather “how do we prevent errors in the first place?” A metaphor I often use when I talk about the areas where we have a stake is two concentric circles. The inner one I call the sphere of control, the outer one I call the sphere of influence. Verification and analysis sit very much in the sphere of control for a tester; it’s something we can do directly and provide immediate value. When it comes to prevention, there are some things we can do to control it, but much of it falls outside of our direct control and into the sphere of influence. Alan recognizes this, and makes the point that the biggest gains going forward towards developing better quality will not be taking place in the verification and analysis sphere, but in the preventative sphere. The rub is, what we as testers can do to prevent bugs is a bit more limited. What we can do is provide great information that will help to influence the behaviors and practices of those who develop code, so that the preventative gains can be realized.

Thinking Forward by Moving Backward

I like this story, so it’s going in unedited :):

As the story goes, one day a villager was walking by the river that ran next to his village and saw a man drowning in the river. He swam into the river and brought the man to safety. Before he could get his breath, he saw another man drowning, so he yelled for help and went back into the river for another rescue mission. More and more drowning men appeared in the river, and more and more villagers were called upon to come help in the rescue efforts. In the midst of the chaos, one man began walking away along a trail up the river. One of the villagers called to him and asked, “Where are you going? We need your help.” He said, “I’m going to find out who is throwing all of these people into the river.”

Another phrase I like a lot comes from Stephen R. Covey’s book “The 7 Habits of Highly Effective People”. His habit #7 is called “Sharpen the Saw”. To picture why this would be relevant here, he uses the example of a guy trying to cut through a big log. He’s huffing and puffing, and he’s making progress, but it’s slow going. An observer notes that he’s doing a lot of work, and then helpfully asks “have you considered sharpening your saw?”, to which the man replies “Hey, I’m busy sawing here!” The point is, we get so focused on what we are doing right now that we neglect to see what we could do instead: stop the process, repair or remedy the situation, and then go forward with renewed vigor and sharper tools.

How many software projects rely on the end of the road testing to find the problems that, if we believe the constant drum beat from executives and others who champion quality, would be way more easily found earlier in the process? Is it because we are so busy sawing that we never stop to sharpen the saw? Are we so busy saving drowning people we don’t bother to go up river and see why they are falling in?


All of us who are testers recall the oft mentioned figures of the increase of cost for each bug found later on in the process.

A bug introduced in the requirements phase that might cost $100 to fix if found immediately will cost 10 times as much to fix if not discovered until the system test phase, or as much as 100 times as much if detected post-release. Bugs fixed close to when they are introduced are generally easier to fix. As bugs age in the system, the cost can increase as the developers have to reacquaint themselves with the code to fix the bug or as dependencies to the code in the area surrounding the bug introduce additional complexity and risk to the fix.
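
To make those multipliers a little more concrete, here is a rough sketch of the arithmetic in Python. The only numbers taken from the passage above are the $100 base cost and the 10x and 100x multipliers; the phase names, the function names, and the bug counts in the two scenarios are my own made-up illustration, not anything from the book.

```python
# Illustrative sketch only. The $100 base and the 1x/10x/100x multipliers are
# the figures quoted above; everything else here is a hypothetical example.
BASE_FIX_COST = 100  # dollars, if the requirements bug is fixed immediately

# Hypothetical phase names mapped to the quoted cost multipliers.
COST_MULTIPLIER = {
    "requirements": 1,
    "system_test": 10,
    "post_release": 100,
}


def fix_cost(phase_found: str, base_cost: int = BASE_FIX_COST) -> int:
    """Estimated cost to fix one requirements bug found in the given phase."""
    return base_cost * COST_MULTIPLIER[phase_found]


def total_cost(bugs_found_by_phase: dict) -> int:
    """Total estimated fix cost for a mapping of phase name to bug count."""
    return sum(fix_cost(phase) * count
               for phase, count in bugs_found_by_phase.items())


# Two made-up scenarios with the same ten bugs, found at different times.
found_late = total_cost({"system_test": 8, "post_release": 2})
found_early = total_cost({"requirements": 8, "system_test": 2})

print(f"Found late:  ${found_late:,}")
print(f"Found early: ${found_early:,}")
```

With these invented counts, the same ten bugs come to $28,000 when most are found late versus $2,800 when most are found early. The real ratios will vary from project to project; the sketch only shows how quickly the quoted multipliers add up.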


Striving for a Quality Culture

Alan points to the work of Joseph Juran and the fact that it is the culture of a place that will determine its approach to quality issues; its resistance to change, or lack thereof, will have a cultural element to it as well. When I refer to culture here (and Alan, too) we are referring to the corporate culture, the management culture, the visions and values of a company. Those vary widely as you go from company to company, but company mores and viewpoints can hold for a long time and become ingrained in the collective psyches of organizations. The danger is that, if the culture is not one that embraces quality as a first order factor of doing business, quality will take a back seat to other initiatives and priorities until it absolutely must be dealt with (in some organizations, failing to deal with it results in the closure of said company).

For many, the idea of a front-end quality investment sounds like a wonderful dream, but for many of us, that’s what it has proven to be… just a dream. How can we help make the step to earlier in the process? It needs to be a culture everyone in the organization embraces, one where prevention trumps expediency (or we could go with a phrase that Matt Heusser used on the TWiST podcast that I’ve grown to love… “If heroics are required, I can be a hero, but it’s going to cost you!”). Seriously, I love this phrase, and I’ve actually used it a few times myself… because it’s 100% true. If an organization waits until the end of the process for last minute heroics, it will cost the organization, either in crunch time overtime of epic proportions, or in reactive fixes because something made its way out into the wild that shouldn’t have and, with some preventative steps, very likely could have been caught earlier in the life cycle.


Testing and Quality Assurance

“In the beginning of a malady it is easy to cure but difficult to detect, but in the course of time, not having been either detected or treated in the beginning, it becomes easy to detect, but difficult to cure.” –Niccolo Machiavelli


Alan, I just have to say “bless you” for bringing this up over and over in the book, and making sure it is part of the “parting shot” and summation. Early detection of a problem always trumps last minute heroics, especially when it comes to testing. Testing is the process of unearthing/uprooting problems before a customer can find them. No question, Microsoft has a lot of testers, and I know quite a few of them and have worked with several of them personally over the years (as I said in the previous chapter, I worked at Connectix in 2001 and 2002, and a number of the software engineers and testers from that team are active SDEs and SDETs for Microsoft today). It’s not that they are not good at testing, it’s that even Microsoft still focuses on the wrong part of the equation…:

“YOU CAN’T TEST QUALITY INTO A PRODUCT!”


Testing and Quality Assurance are often treated as though they are the same thing. They are not. They are two different disciplines. When we test a product, it’s an after the fact situation. The product is made, and we want to see if it will withstand the rigor of being run through its paces. Quality Assurance, by contrast, is meant to be proactive and to happen early in the life of a process or a product, to make sure the process delivers the intended result. It sounds like semantics, but it’s not; they are two very different processes with two different approaches. Of course, to assure quality, we use testing to make sure that the quality initiatives are being met, but using the terms interchangeably is both inaccurate and misleading (as well as confusing).


Who Owns Quality?

This is not a trick question, but the answers often vary. Does the test team own quality? No. They own the testing process. They own the “news feed” about the product. Others would say that the entire team owns quality, but do they really? If everyone owns something, does anyone really own anything?! Alan makes the point that saying the test team owns quality is putting the emphasis in the wrong place, and saying everyone owns quality is to de-emphasize it entirely. The fact is, the management team are the ones who own quality, because they are the ones who make the ship decisions. Testing doesn’t have that power. The mental image of the “Guardian of the Gate” for testing is a bad one, as it makes it seem as though we are the ones that make the decision as to who shall pass and who will not, and we don’t. I’m a little more comfortable with the idea of the “last tackle on the field” because often the test team is the last group to see a feature before it goes out into the wild, but even then, there’s no guarantee we will catch a problem, or, if we do catch it, that we can prevent it from going out into the field. Management owns that. The best metaphor, to me, is the idea of being a beat reporter. We get the story, we tell the story, as much of it as we know, and as much of it as we can learn. We tell our story, and then we leave it to the management team to decide if we have a shipping product or not.

In short, a culture of quality and a commitment to it must exist first before major changes and focus on quality will meet with success.


The Cost of Quality

The Cost of Quality is not the price of making a high quality product. It’s the price paid by a company when a poor quality product gets out. Everything from extra engineering cycles to provide a fix to lost opportunity because of bad press, to actual loss of revenue because a service isn’t working, all of these work into the cost of quality. Other examples of the price to pay when quality issues escape into the wild are:



  • Rewriting or redesigning a component or feature
  • Retesting as a result of test failure or code regression
  • Rebuilding a tool used as part of the engineering process
  • Reworking a service or process, such as a check-in system, build system, or review policy



The point that is being made is that, were none of these situations to have happened because testing and quality assurance were actually perfected to the point where no bugs slipped through (to dream… the impossible dream…), these expenses would not have caused the bottom line to take a hit. So perhaps the real cost of quality is what Alan calls the Cost of Poor Quality (COPQ).


Philip Crosby says each business has three specific cost areas:


  • Appraisal (salaries, equipment, software, etc.)
  • Preventative (expenditures associated with implementing and maintaining preventative techniques)
  • Failure (the cost of rework or “do-over”)


To put it bluntly, preventative work gets a lot of lip service, but rarely does it actually get implemented.
Failure costs? We pay them in spades, usually way more often than the other types (overtime, crunch time, the death march to release, etc.).


The takeaway from many testers (believe me, if we could impart no other message, this would be really high on my list of #1 takeaways…):

We don’t need heroics; we need to prevent the need for them.


A New Role for Test

One of the great ironies is that, when testers talk about the desire to move away from the focus on late in the game testing to earlier in the process prevention of bugs, an oft-heard comment is, “come on, if we do that, what will the testers test?” Well, let’s see… there’s the potential for looking at the human factors that influence how a product is actively used, there’s performance and tuning of systems, there’s system up time and reliability, there’s researching and examining different testing techniques to get deeper into the application… in short, there are lots of things that testers can do, even if the end of the cycle heroic suicide missions are done away with entirely (many of us can only dream of such a world). Many of the more interesting and compelling areas of software testing do not get explored in many companies because testers are in perpetual firefighting mode. For most of us, were we given the opportunity to get out of that situation and be able to explore more options, we would welcome it gladly!

Test Leadership

At the time HWTSAM was written, there were over 9,000 testers at Microsoft. Seriously, wrap your head around that if you can. How do you develop a discipline that large at a company the size of Microsoft, so that the tech of the trade keeps moving forward? You encourage leadership and provide a platform for that leadership to develop and flourish.


The Microsoft Test Leadership Team

Microsoft developed the Microsoft Test Leadership Team (MSTLT) to encourage the sharing of good practices and testing knowledge between various testing groups and between other testers.

The MSTLT’s mission is as follows:

The Microsoft Test Leadership Team vision


The mission of the Microsoft Test Leadership Team (MSTLT) is to create a cross–business group forum to support elevating and resolving common challenges and issues in the test discipline.


The MSTLT will drive education and best practice adoption back to the business group test teams that solve common challenges.


Where appropriate the MSTLT will distinguish and bless business group differences that require local best practice optimization or deviation.


The MSTLT has around 25 members including the most senior test managers, directors, general managers, and VPs, and they are spread throughout the company and represent all products Microsoft makes. Membership is based on level of seniority and approval of the TLT chair and product line vice president. Having these members involved helps to make sure that testing advocacy grows and that the state of the craft develops and flourishes with the support of the very people that champion that growth and development.

Test Leadership in Action

The MSTLT group meets every month to discuss and develop plans to help grow the career paths of a number of contributors, as well as addressing new trends and opportunities that can help testers become better and (yet again) improve the state of the craft overall within Microsoft.

Some examples of topics covered by MSTLT:

Updates on yearly initiatives: At least one MSTLT member is responsible for every MSTLT initiative and for presenting to the group on its progress at least four times throughout the year.


Reports from human resources: The MSTLT has a strong relationship with the corporate human resources department. This meeting provides an opportunity for HR to disseminate information to test leadership as well as take representative feedback from the MSTLT membership.


Other topics for leadership review: Changes in engineering mandates or in other corporate policies that affect engineering are presented to the leadership team before circulation to the full test population. With this background information available, MSTLT members can distribute the information to their respective divisions with accurate facts and proper context.


The Test Architect Group

Another group that has developed is the Test Architect Group which, contrary to its name, does not just include a bunch of Test Architects (though it started out that way) but also includes senior testers and those individuals who are working in the role of being a test architect, whether they have the official title or not.

So what was envisioned for being a Test Architect? Well, here’s how it was originally considered and implemented:


The primary goals for creating the Test Architect position are:


  • To apply a critical mass of senior, individual contributors on difficult/global testing problems facing Windows development teams
  • To create a technical career path for individual contributors in the test teams


Some of the key things that Test Architects would focus on include:


  • Continue to evolve our development process by moving quality upstream
  • Increase the throughput of our testing process through automation, smart practices, consolidation, and leadership


The profile of a Test Architect:


  • Motivated to solve the most challenging problems faced by our testing teams
  • Senior-level individual contributor
  • Has a solid understanding of Microsoft testing practices and the product development process
  • Ability to work both independently and cross group developing and deploying testing solutions.


Test Architects will be nominated by VPs and would remain in their current teams. They will be focused on solving key problems and issues facing the test teams across the board. The Test Architects will form a virtual team and meet regularly to collaborate with each other and other Microsoft groups including Research. Each Test Architect will be responsible for representing unique problems faced by their teams and own implementing and driving key initiatives within their organizations in addition to working on cross-group issues.


Test Excellence

Microsoft created the Engineering Excellence (EE) team in 2003. The group was created to help push ahead initiatives for tester training, and to discover and share good practices in engineering across the company (some of you may notice that I didn’t say “best practices”. While Alan used the term “Best Practices”, I personally don’t think there is such a thing. There are some really great practices, but to say best means there’s no room for better practices to develop. It’s a pet peeve of mine, so I’m modifying the words a bit, but the sentiment and the idea are the same).

The mission of the Test Excellence team comes down to Sharing, Helping, and Communicating.


Sharing

Sharing means focusing on the following areas:


  • Practices: The Test Excellence team identifies practices or approaches that have potential for use across different teams or divisions at Microsoft. The goal is not to make everyone work the same way, but to identify good work that is adoptable by others.
  • Tools: The approach with tools is similar to practices. For the most part, the core training provided by the Test Excellence team is tool-agnostic, that is, the training focuses on techniques and methods but doesn’t promote one tool over another.
  • Experiences: Microsoft teams work in numerous different ways, often isolated from those whose experiences they could potentially learn from. Test Excellence attempts to gather those experiences through case studies, presentations (“Test Talks”), and interviews, and then share those experiences with disparate teams.



Helping

One of the primary purposes of the test excellence team is to help champion quality improvements and learning for all testers. They help accomplish these objectives in the following ways:


  • Facilitation: Test Excellence team members often assist in facilitating executive briefings, product line strategy meetings, and team postmortem discussions. Their strategic insight and view from a position outside the product groups are sought out and valued.
  • Answers: Engineers at Microsoft expect the Test Excellence team to know about testing and aren’t afraid to ask them. In many cases, team members do know the answer, but when they don’t, their connections enable them to find answers quickly. Sometimes, team members refer to themselves as test therapists and meet individually with testers to discuss questions about career growth, management challenges, or work–life balance.
  • Connections: Probably the biggest value of Test Excellence is connections. Their interaction with the TLT, TAG, Microsoft Research, and product line leadership ensures that they can reduce the degrees of separation between any engineers at Microsoft and help them solve their problems quickly and efficiently.


Communicating

Having these initiatives is great, and supporting them takes a lot of energy and commitment, but without communicating to the rest of the organization, these initiatives would have limited impact. Some of the ways that the Test Excellence team helps foster communication among other groups are:


  • A monthly test newsletter for all testers at Microsoft includes information on upcoming events, status of MSTLT initiatives, and announcements relevant to the test discipline.
  • University relationships are discussed, including reviews on test and engineering curriculum as well as general communications with department chairs and professors who teach quality and testing courses in their programs.
  • The Microsoft Tester Center (http://www.msdn.com/testercenter), much like this book, intends to provide an inside view into the testing practices and approaches used by Microsoft testers. This site, launched in late 2007, is growing quickly. Microsoft employees currently create most of the content, but industry testers provide a growing portion of the overall site content and are expected to become larger contributors in the future.



Keeping an Eye on the Future

Trying to anticipate the future of testing is a daunting task, but many trends make themselves visible often years in advance, and by trying to anticipate these needs and opportunities, the Test Excellence team can be positioned to help testers grow into and help develop these emerging skills and future opportunities.

Microsoft Director of Test Excellence

Each of the authors of HWTSAM has held (or is the current holder in the case of Alan Page) the position of the Director of Test Excellence.

Its primary responsibility is to work towards developing opportunities and the infrastructure and practices needed to help advance the testing profession at Microsoft.

The following people have all held the Director of Test position:


  • Dave Moore (Director of Development and Test), 1991–1994
  • Roger Sherman (Director of Test), 1994–1997
  • James Tierney (Director of Test), 1997–2000
  • Barry Preppernau (Director of Test), 2000–2002
  • William Rollison (Director of Test), 2002–2004
  • Ken Johnston (Director of Test Excellence), 2004–2006
  • James Rodrigues (Director of Test Excellence), 2006–2007
  • Alan Page (Director of Test Excellence), 2007–present

The Leadership Triad

The Microsoft Test Leadership Team, Test Architect Group, and Test Excellence are three pillars of emphasis and focus on the development and advancement of the software testing discipline within Microsoft.

Innovating for the Future

The final page of the book deals with a goal for the future. Since so many of Alan, Ken and BJ’s words are already included, I think it’s only fair to let them have the last word :)...

When I think of software in the future, or when I see software depicted in a science fiction movie, two things always jump out at me. The first is that software will be everywhere. As prevalent as software is today, in the future, software will interact with nearly every aspect of our lives. The second thing that I see is that software just works. I can’t think of a single time when I watched a detective or scientist in the future use software to help them solve a case or a problem and the system didn’t work perfectly for them, and I most certainly have never seen the software they were using crash. That is my vision of software—software everywhere that just works.


Getting there, as you’ve realized by reading this far in the book, is a difficult process, and it’s more than we testers can do on our own. If we’re going to achieve this vision, we, as a software engineering industry, need to continue to challenge ourselves and innovate in the processes and tools we use to make software. It’s a challenge that I embrace and look forward to, and I hope all readers of this book will join me. If you have questions or comments for the authors of this book (or would like to report bugs) or would like to keep track of our continuing thoughts on any of the subjects in this book, please visit http://www.hwtsam.com. We would all love to hear what you have to say.


—Alan, Ken, and Bj
