Six Things That Go Wrong With Discussions About Testing

Talking about software testing is not easy. It’s not natural! Testing is a “meta” activity. It’s not just a task, but a task that generates new tasks (by finding bugs that should be fixed or finding new risks that must be examined). It’s a task that can never be “completed” yet must get “done.”

Confusion about testing leads to ineffective conversations that focus on unimportant issues while ignoring the things that matter. Here are some specific ways that testing conversations fail:

  1. When people care about how many test cases they have instead of what their testing actually does. The number of test cases (e.g. 500, 257, 39345) tells no one anything about “how much testing” you are doing. The reason that developers don’t brag about how many files they created today while developing their product is that everyone knows it’s silly to count files, or keystrokes, or anything like that. For the same reasons, it is silly to count test cases. The same test activity can be represented as one test case or one million test cases. What if a tester writes software that automatically creates 100,000 variations of a single test case? Is that really “100,000” test cases, or is it one big test case, or is it no test case at all? (See the first sketch after this list.) The next time someone gives you a test case count, practice saying to yourself “that tells me nothing at all.” Then ask a question about what the tests actually do: What do they cover? What bugs can they detect? What risks are they motivated by?
  2. When people speak of a test as an object rather than an event. A test is not a physical object, although physical things such as documentation, data, and code can be part of a test. A test is a performance; an activity; something that you do. By speaking of a test as an object rather than a performance, you skip right over the most important part of a test: the attention, motivation, integrity, and skill of the tester. No two testers ever perform the “same test” in the “same way” in all the ways that matter. Technically, you can’t take a test case and give it to someone else without changing the resulting test in some way (just as no quarterback or baseball player will execute the same play in the same way twice), although the changes don’t necessarily matter.
  3. When people can’t describe their test strategy as it evolves. Test strategy is the set of ideas that guide your choices about what tests to design and what tests to perform in any given situation. Test strategy could also be called the reasoning behind the actions that comprise each test. Test strategy is the answer to questions such as “why are these tests worth doing?” “why not do different tests instead?” “what could we change if we wanted to test more deeply?” “what would we change if we wanted to test more quickly?” “why are we doing testing this way?” These questions arise not just after the testing, but right at the start of the process. The ability to design and discuss test strategy is a hallmark of professional testing. Otherwise, testing would just be a matter of habit and intuition.
  4. When people talk as if automation does testing instead of humans. If developers spoke of development the way that so many people speak of testing, they would say that their compiler created their product, and that all they do is operate the compiler. They would say that the product was created “automatically” rather than by particular people who worked hard and smart to write the code. And management would become obsessed with “automating development” by getting ever better tools instead of hiring and training excellent developers. A better way to speak about testing is the same way we speak about development: it’s something that people do, not tools. Tools help, but tools do not do testing. There is no such thing as an automated test. The most a tool can do is operate a product according to a script and check for specific output according to a script. That would not be a test, but rather a fact check about the product. Tools can do fact checking very well. But testing is more than fact checking, because testers must use technical judgment and ingenuity to create the checks, evaluate them, and maintain and improve them. The name for that entire human process (supported by tools) is testing. When you focus on “automated tests” you usually defocus from the skills, judgment, problem-solving, and motivation that actually control the quality of the testing, and then you are not dealing with the factors that matter most.
  5. When people talk as if there is only one kind of test coverage. There are many ways you can cover the product when you test it. Each method of assessing coverage is different and has its own dynamics. No one way of talking about it (e.g. code coverage) gives you enough of the story. Just as one example, if you test a page that provides search results for a query, you have covered the functionality represented by the kind of query that you just did (function coverage), and you have covered it with the particular data set of items that existed at that time (data coverage). If you change the query to invoke a different kind of search, you will get new function coverage. If you change the data set, you will get new data coverage. Either way, you may find a new bug with that new coverage. Functions interact with data; therefore good testing involves covering not just one or the other, but both together in different combinations. (See the second sketch after this list.)
  6. When people talk as if testing is a static task that is easily formalized. Testing is a learning task; it is fundamentally about learning. If you tell me you are testing, but not learning anything, I say you are not testing at all. And the nature of any true learning is that you can’t know what you will discover next– it is an exploratory enterprise. It’s the same way with many things we do in life, from driving a car to managing a company. There are indeed things that we can predict will happen and patterns we might use to organize our actions, but none of that means you can sleepwalk through it by putting your head down and following a script. To test is to continually question what you are doing and seeing.

    The process of professional testing is not to design test cases and then follow them. No responsible tester works this way. Responsible testing is a constant process of investigation and experiment design. This may involve designing procedures and automation that systematically collect data about the product, but all of that must be done with the understanding that we respond to the situation in front of us as it unfolds. We deviate frequently from the procedures we establish because software is complicated and surprising; because the organization has shifting needs; and because we learn of better ways to test as we go.
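
To make points 1 and 4 concrete, here is a small hypothetical sketch (the “product” is stubbed out as a trivial Python function, and none of this comes from a real project). A tool can mint 100,000 “test cases” from one scripted fact check; the count is pure bookkeeping, and the judgment, questioning, and learning that would make this testing are nowhere in the script.

```python
# Hypothetical sketch: one scripted check, mechanically "exploded" into
# 100,000 test cases. The product is stubbed out as a trivial function.

def product_search(query: str) -> str:
    """Stand-in for the product under test: returns a canned result string."""
    return f"results for {query!r}"

def check_search(query: str) -> bool:
    """A single fact check: does the output match the scripted expectation?"""
    return product_search(query) == f"results for {query!r}"

# Mint 100,000 trivial variations of the same check.
variations = [f"item-{i}" for i in range(100_000)]
results = [check_search(q) for q in variations]

# Is this 100,000 test cases, one big test case, or none at all? The count
# says nothing about coverage, risk, or what a failure here would even mean.
print(f"{sum(results)} of {len(results)} checks passed")
```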
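
And to illustrate point 5, another hypothetical sketch: crossing one coverage dimension (kinds of query) with another (kinds of data). The names are invented for the example; the point is that covering each list by itself is not the same as covering their combinations, and bugs often live in particular pairings.

```python
# Hypothetical sketch: crossing two coverage dimensions.
# Query kinds stand in for function coverage; data sets stand in for data coverage.
from itertools import product

query_kinds = ["exact_match", "wildcard", "phrase", "empty_query"]
data_sets = ["empty_catalog", "single_item", "unicode_names", "ten_million_items"]

# Covering each list on its own takes 4 + 4 = 8 test ideas; covering the
# combinations takes 4 * 4 = 16, and bugs often hide in particular pairings
# (for example, a wildcard search against unicode_names).
for query, data in product(query_kinds, data_sets):
    print(f"cover: {query} x {data}")
```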

Through these and other failures in testing conversations, people persist in the belief that good testing is just a matter of writing ever more “test cases” (regardless of what they do); automating them (regardless of what automation can’t do); passing them from one untrained tester to another; all the while fetishizing the files and scripts themselves instead of looking at what the testers are doing with them from day to day.

Variable Testers

I once heard a vice president of software engineering tell his people that they needed to formalize their work. That day, I was an unpaid consultant in the building to give a free seminar, so I had even less restraint than normal about arguing with the guy. I raised my hand, “I don’t think you can mean that, sir. Formality is about sameness. Are you really concerned that your people are working in different ways? It seems to me that what you ought to be concerned about is effectiveness. In other words, get the job done. If the work is done a different way every time, but each time done well, would you really have a problem with that? For that matter, do you actually know how your folks work?”

This was years ago. I’m wracking my brain, but I can’t remember specifically how the executive responded. All I remember is that he didn’t reply with anything very specific and did not seem pleased to be corrected by some stranger who came to give a talk.

Oh well, it had to be done.

I have occasionally heard the concern from managers that testers are variable in their work; that some testers are better than others; and that this variability is a problem. But variability is not a problem in and of itself. When you drive a car, there are different cars on the road each day, and you have to make different patterns of turning the wheel and pushing the brake. So what?

The weird thing is how utterly obvious this is. Think about managers, designers, programmers, product owners… think about ANYONE in engineering. We are all variable. Complaining about testers being variable– as if that were a special case– seems bizarre to me… unless…

I suppose there are two things that come to mind which might explain it:

1) Maybe they mean “testers vary between satisfying me and not satisfying me, unlike other people, who always satisfy me.” To examine this, we would need to discover what their expectations are. Maybe they are reasonable or maybe they are not. Maybe a better system for training and leading testers is needed.

2) Maybe they mean “testing is a strictly formal process that by its nature should not vary.” This is a typical belief among people who know nothing about testing. What they need is to have testing explained or demonstrated to them by someone who knows what he’s doing.

A Tester’s Commitments

This is the latest version of the commitments I make when I work with a programmer.

Dear Programmer,


My job is to help you look good. My job is to support you as you create quality; to ease that burden instead of adding to it. In that spirit, I make the following commitments to you.


Sincerely,


Tester

  1. I provide a service. You are an important client of that service. I am not satisfied unless you are satisfied.
  2. I am not the gatekeeper of quality. I don’t “own” quality. Shipping a good product is a goal shared by all of us.
  3. I will test your code as soon as I can after you deliver it to me. I know that you need my test results quickly (especially for fixes and new features).
  4. I will strive to test in a way that allows you to be fully productive. I will not be a bottleneck.
  5. I’ll make every reasonable effort to test, even if I have only partial information about the product.
  6. I will learn the product quickly, and make use of that knowledge to test more cleverly.
  7. I will test important things first, and try to find important problems. (I will also report things you might consider unimportant, just in case they turn out to be important after all, but I will spend less time on those.)
  8. I will strive to test in the interests of everyone whose opinions matter, including you, so that you can make better decisions about the product.
  9. I will write clear, concise, thoughtful, and respectful problem reports. (I may make suggestions about design, but I will never presume to be the designer.)
  10. I will let you know how I’m testing, and invite your comments. And I will confer with you about little things you can do to make the product much easier to test.
  11. I invite your special requests, such as if you need me to spot check something for you, help you document something, or run a special kind of test.
  12. I will not carelessly waste your time. Or if I do, I will learn from that mistake.

(This is cool! Yong Goo Yeo has created a Prezi of this.)

An Important Comment

This comment came in from Stuart Taylor. I think it’s important enough that I should add it to the post:

Hi James,

I’m not sure if this is meant to be a little tongue in cheek, but that’s how I read it. That said, I fell at point 1. “I provide a service” really?

Yes, really. This is not a joke. And point #1 is the most important point.

This implies there may be alternative service providers, and if my customer relationship management isn’t up to snuff, I could lose this “client”. Already I feel subservient, I’m an option.

Of course there are alternative service providers. (There are alternative programmers as well, but I’m not concerned with that, here. I’m making commitments that have to do with me and my role, here.) And yes, of COURSE you may lose your client. Actually that’s a lot of what “Agile” has done: fired the testers. In their place, we often find quasi-tester/quasi-programmers who are more concerned with fitting in and poking around tools than doing testing.

Testing is not programming. Programmers are not testers. If you mix that, you do a sorry job of both. In other words: I am a programmer and a tester, but not a good programmer at the same time that I’m a good tester.

Please meditate on the difference between service and subservience. I am a servant and I am proud of that. I am support crew. I spent my time as a production programmer and I’m glad I don’t do that any more. I don’t like that sort of pressure. I like to serve people who will take that pressure on my behalf.

This doesn’t make me a doormat. Nobody wipes their feet on me– I clean their feet. There’s a world of difference. Good mothers know this difference better than anyone.

Personally, when I work on a software project, I want to feel like I’m part of the team. I need the developer, and the developer wants me. We work together to bake quality in from the start, and not try to sprinkle it on at the end as part of a “service” that I offer.

What strange ideas you have about service. You think people on the same team don’t serve each other? You think services are something “sprinkled” on the end? Please re-think this.

I want to be on the same team, too. The difference, maybe, is that I require my colleagues to choose me. I don’t rap on the door and say “let me in, I belong here! I like to be on the team!” (You can’t just “join” a team. The easier it is to join a team, the less likely that it is actually a team, I think. Excellent teams must gel together over time. I want to speed that process– hence my commitments.) Instead, I am invited. I insist that it be an invitation. And how I get invited quickly is by immediately dissolving any power struggle that the programmers may worry about, then earning their respect through the fantastic quality of my work.

You can, of course, demand respect. Then you will fail. Then five years later, you will realize true respect cannot be demanded. (That’s what I did. Ask anyone about what a terror I was when I worked at Apple Computer. If you are less combative or more intelligent than I am, you may make this transition in less time.)

I may have actually agreed with your points before I was exposed to Continuous Integration, because that’s how my relationship used to be; hence me interpreting this as a light-hearted piece. However, I know that this kind of relationship still exists today (it’s here at this place). When I began working with continuous integration, the traditional role of the tester, with a discrete testing phase, became blurred, as the test effort is pushed back upstream towards the source. If the tester and developer share a conversation about “how can we be sure we build the right thing, and how can we ensure we built it right” before the code gets cut, and the resulting tests from that are used to help write the code (TDD, BDD, SBE), then both of us can have a nice warm fuzzy feeling about the code being of good quality. The automation removes the repetition (and reduces the feedback) to maintain that assertion.

First, your experience has nothing to do with “continuous integration.” Continuous integration is a concept. Your experience is actually related to people. Specific people are collaborating with you in specific satisfying ways, probably because it is their pleasure to do so. I have had wonderful and satisfying professional relationships on a number of projects– mostly at Borland, where we were not doing continuous integration, and where testers and programmers cooperated freely, despite being organized into separate administrative entities.

(Besides, I have to say… Testing cannot be automated. Automation doesn’t remove repetition, it changes it. Continuous integration, as I see it, is another way of forcing your customers to do the testing, while placing naive trust in the magic of bewildering machinery. I have yet to hear about continuous integration from anyone who seemed to me to be a student of software testing. Perhaps that will happen someday.)

In any case, nothing I’m saying here prevents or discourages collaboration. What it addresses are some of the unhealthy things that discourage collaboration between a tester and a programmer. I’m not sure if in your role you are doing testing. If you want to be a programmer or a designer or manager or cheerleader or hanger-on, then do that. However, testing is inherently about critical thinking, and therefore it always carries a risk (in the same way that being a dentist does) that we may touch a nerve with our criticism. This is the major risk I’m trying to mitigate.

[Note: Michael Bolton pointed out to me the “warm fuzzy” line. I can’t believe I let that go by when I wrote this reply, earlier. Dude! Stuart! If you are in the warm fuzzy business, you are NOT in the testing business. My goal has nothing to do with warm fuzzy thinking. My goal is to patrol out in the cold and shine my bright, bright torches into the darkness. On some level, I am never satisfied as a tester. I’m never sure of anything. That’s what it MEANS to be a professional tester, rather than an amateur or a marketer. Other people think positively. Other people believe. The warmth and fuzziness of belief is not for us, man. Do not pollute your testing mentality with that.]

My confusion over this being a tongue-in-cheek post is further compounded by point 2. Too many testers do believe that they own quality. They become the quality gatekeeper, and I think they enjoy it. The question “can this go live” is answered by the tester. The tester may be unwilling to relinquish that control, because they perceive that as a loss of power.

Just as excellent testers must understand they provide a service, they must also understand that they are not “quality” people. They simply hold the light so that other people can work. Testers do not create quality. If you create quality, you are not being a tester. You are being an amateur programmer/designer/manager. That’s okay, but my commitment list is meant for people who see themselves as testers.

I guess what I’m fumbling about with here is that some testers will read this and sign up to it. Maybe even print it out and put it up on the wall.

I hope they do. This list has served me well.

Either way, you have written another thought-provoking piece, James, but I can’t help wondering about the back story. What prompted you to write it?

Stuart – sat in the coffee shop, but with laptop and wifi, not a notepad 😉

This piece was originally prompted when I took over a testing and build team for SmartPatents, in 1997. I was told that I would decide when the software was good enough. I called a meeting of the programmers and testers and distributed the first version of this document in order to clarify to the programmers that I was not some obstacle or buzzing horsefly, but rather a partner with them in a SERVICE role.

That was also the first time I gave up my office and opted to sit in a low cubicle next to the entrance that no one else wanted– the receptionist’s cubicle. It was perfect for my role. Programmers streamed by all day and couldn’t avoid me.

I think this piece is also a reaction to the Tester’s Bill of Rights (the one by Gilb, not the one by Bernie Berger, although I have my concerns about that one, too), which is one of the most arrogant pieces of crap I’ve ever seen written on behalf of testers. I will not link to it.

And now that I think of it, I wrote this a year after I hired a personal assistant, and two months later had her promoted to be my boss, even though she retained almost the same duties. The experience of working for someone whose actual day-to-day role was to serve me– but who felt powerful and grateful as my manager– helped me understand the nature of service and leadership in a new way.

Finally, a current that runs through my life is my experience as a husband and father. I am very powerful. I can make my family miserable if I so choose. I choose to serve them, instead.

The CAST Testing Competition

I sponsored the testing competition at CAST, last week, awarding $1,426.00 of my own money to the winners.

My game, my rules, of course, but I tried to be fair and give out the prizes to deserving winners.

There was some controversy…

We set it up with simple rules, and put the onus on the contestants to sort themselves out. The way it worked is that teams signed up during the day (a team could be one tester or many), then at 6pm they received a link to the software. They had to download it, test it, report bugs, and write a report in 4 hours. We set up a website for them to submit reports and receive updates. The developer of the product was sitting in the same ballroom as the contestants, available to anyone who wished to speak with him.

I left the scoring algorithm unexplained, because I wanted the teams to use their testing skills to discover it (that’s how real life works, anyway). A few teams investigated the victory conditions. Most seemed to guess at them. No one associated with the conference organizers could compete for a prize.

During the competition, I made several rounds with my notebook, asking each team what they were doing and challenging them to justify their strategy. Most teams were not particularly crisp or informative in their answers (this is expected, since most testers do not practice their stand-up reporting skills). A few impressed me. When I felt good about an answer, I wrote another star in my notebook next to their name. My objective was partly to help me decide the winner, and partly to make myself available in case a team had any questions.

David reviewed the 350 bug reports, while I analyzed the final test reports. We created a multi-dimensional ordinal scale to aid in scoring.

Awards:

  • Worst Bug Report: Happy Purples ($26)
  • Best Bug Report: In 1st Place ($400)
  • Developer’s Choice Award: Springaby ($200)
  • Best Test Report: Springaby ($800)

These rankings don’t follow any algorithm. We used a heuristic approach. We translated the raw experience data into 1-5 scales (where 5 is OMG and 1 is WTF). David and I discussed and agreed to each assessment, then we looked at the aggregates and decided who would get the awards. My final orderings for best test report (where report means the overall test report, not just the written summary report) are on the left.

Note: I don’t have all the names of the testers involved in these teams (I’ll add them if they are sent to me).

Now for the special notes and controversies.

Happy Purples

Happy Purples won $26 for the worst bug report (the award actually covered two reports: one complaining that it was too slow to download the software at the start of the competition, and one complaining that a tooltip was inconsistent with a button title because it wasn’t a duplicate of the button title).

The Happies were not a very experienced team, and that showed in their developer relations. I thought their overall bug list was not terrible, although it wasn’t very deep, either. They earned the ire of the developer because they tried to defend the weird bug reports mentioned above, and that so offended David that he flipped the bozo bit on them as a team. Hey, that’s realistic. Developers do that. So be careful, testers.

TestMuse

Keith Stobie was the solo tester known as TestMuse. He was a good explainer when I stopped by to challenge him on what he was doing, but I don’t think he took his written test report very seriously. I had a hard time judging from that what he did and why he did it. I know Keith well enough that I think he’s capable of writing a good report, so maybe he didn’t realize it was a major part of the score.

In 1st Place

They didn’t report many bugs (9, I think). But the ones they reported were just the kind the developer was looking for. I don’t remember which report David told me was the bug that won them the Best Bug Report award, but each bug on their list was a solid functionality problem, rather than a nitpicky UI thing. We called these guys the sniper testers, because they picked their shots.

Springaby

A portmanteau of “springbok” and “wallaby”, Springaby consisted of Australian tester Ben Kelly and South African Louise Perold. Like “Hey David!”, the winner of the previous CAST competition (2007), they used the tactic of sitting right next to the developer during the whole four hours. Just like last time, this method worked. It’s so simple: be friendly with the developer, help the developer, ask him questions, and maybe you will win the competition. Springaby won Developer’s Choice, which goes to David’s favorite team based on personal interactions during the competition, and they won for best test report… But mainly that was because Miagi-Do wiped out.

Note: Springaby reported one of their bugs in Japanese. However, the developer took this as a jest and did not mark them down.

Miagi-Do

Miagi-Do was kind of an all-star team, reminiscent of the Canadian team that should have won the competition in 2007 before they were disqualified for having Paul Holland (a conference organizer) on their side. This time we were very clear that no conference organizer could compete for a prize. But Miagi-Do, which consisted mainly of the friends and proteges of Matthew Heusser (a conference organizer), decided they would rather have him on their team and lose prize money than not have him and lose fun. Ah, sportsmanship!

The Miagi-Do team was serious from the start. Some prominent names were on it: Markus Gaertner, Ajay Balamurugadas, and Michael Larsen, to name three. Also, gutsy newcomers to our community, Adam Yuret and Elena Houser.

Miagi-Do got the best rating from my walkaround interviews. They were using session-based test management with facilitated debriefings, and Matt grilled me about the scoring. They talked with the developer, and also consulted with my brother Jon about their final report. I expected them to cruise to a clear victory.

In the end, they won the “Spectacular Wipeout” award (an honorary award made up on the spot), for the best example of losing at the last minute. More about that, below.

The Controversy: Bad Report or Bad Call?

Let’s contrast the final reports of Miagi-Do and Springaby.

This is the summary from Miagi-Do. Study it carefully:

Now this is the summary from Springaby:

Bottom line is this: They both criticize the product pretty strongly, but Miagi-Do insulted the developer as well. That’s the spectacular wipeout. David was incensed. He spouted unprintable replies to Miagi-Do.

The reason why Miagi-Do was the goat, while Springaby was the pet, is that Springaby did not impose their own standard of quality onto the product. They did not make a quality judgment. They made descriptive statements about the product. Calling it unstable, for instance, is not to say it’s a bad product. In fact, Springaby was the ONLY team who checked with David about what quality standard was appropriate for his product. The other teams made assumptions about what the standard should be. They were generally reasonable assumptions, but still, the vendor of the product was right there– why assume you know what the intended customer, use, and quality standard are when you can just ask?

Meanwhile, Miagi-Do claimed the product was not “worthy” to be tested. Oh my God. Don’t say things like that, my fellow testers. You can say that the effort to test the product further at this time may not be justified, but that’s not the same thing as questioning the “worthiness” of the product, which is a morally charged word. The reference to black flagging, in this case, also seems gratuitous. I coined the concept of black flagging bugs (my brother came up with the term itself, by borrowing from NASCAR). I like the idea, but it’s not a term you want to pull out and use in a test report unless everyone is already fully familiar with it. The attempt to define it in the test report makes it appear as if the tester is reaching for colorful metaphors to rub in how much the programmer, and his product, suck.

Springaby did not presume to know whether the product was bad or good, just that it was unstable and contained many potentially interesting bugs. They came to a meeting of minds with the developer, instead of dictating to him. Thus, even though both teams concurred in their technical findings, one team pleased their client and the other infuriated him.

This judgment of mine and David’s is controversial, because Adam Yuret, the up-and-coming tester who actually wrote the report, consulted with my brother Jon on the wording. Jon felt that the wording was good, and that the developer should develop a thicker skin. However, Jon wasn’t aware that Miagi-Do was working on the basis of their own imagined quality standard, rather than the one their client actually cared about. I think Adam did the right thing consulting with Jon (although if they had been otherwise eligible to win a prize, that consultation would have disqualified them). Adam tried hard and did what he thought was right. But it turns out the rest of the Miagi-Do team had not fully reviewed the test report, and perhaps if they had, they would have noticed the logical and diplomatic issues with it.

Well, there you go. I feel good about the scoring. I also learned something: most testers are poorly practiced at writing test reports. Start practicing, guys.

Should Developers Test the Product First?

When a programmer builds a product, should he release it to the testers right away? Or should he test it himself to make sure that it is free of obvious bugs?

Many testers would advise the programmer to test the product himself, first. I have a different answer. My answer is: send me the product the moment it exists. I want to avoid creating barriers between testing and programming. I worry that anything that may cause the programmers to avoid working with me is toxic to rapid, excellent testing.

Of course, it’s possible for the programmer to test the product without making the testers wait for it. For instance, a good set of automated unit tests as part of the build process would make the whole issue moot. Also, I wouldn’t mind if the programmer tested the product in parallel with me, if he wants to. But I don’t demand either of those things. They are a lot of work.
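
As one hypothetical illustration (a sketch, not a prescription for any particular project), a unit check like the one below could run with every build, so developer testing never has to delay the hand-off; the function being checked is invented for the example.

```python
# Hypothetical sketch: a minimal unit check that could run on every build,
# so handing the build to testers does not wait on manual developer testing.
# The function under test (parse_price) is invented for this illustration.
import unittest

def parse_price(text: str) -> float:
    """Example production function: parse a price string such as '$12.50'."""
    return float(text.strip().lstrip("$"))

class ParsePriceCheck(unittest.TestCase):
    def test_plain_dollar_amount(self):
        self.assertEqual(parse_price("$12.50"), 12.50)

    def test_surrounding_whitespace_is_tolerated(self):
        self.assertEqual(parse_price("  $3 "), 3.0)

if __name__ == "__main__":
    unittest.main()
```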

As a tester I understand that I am providing a service to a customer. One of my customers is the programmer. I try to present a customer service interface that makes the programmers happy I’m on the project.

I didn’t always feel this way. I came to this attitude after experiencing a few projects where I drew sharp lines in the sand, made lots of demands, then discovered how difficult it is to do great testing without the enthusiastic cooperation of the people who create the product.

It wasn’t just malicious behavior, though. Some programmers, with the best of intentions, were delaying my test process by trying to test it themselves, and fix every bug, before I even got my first look at it (like those people who hire house cleaners, and then clean their own houses before the professionals arrive).

Sometimes a product is so buggy that I can’t make much progress testing it. Even then, I want to have it. Every look I get at it helps me get better ideas for testing it, later on.

Sometimes the programmer already knows about the bugs that I find. Even then, I want to have it. I just make a deal with the programmers that I will report bugs informally until we reach an agreed upon milestone. Any bugs not fixed by that time get formally reported and tracked.

Sometimes the product is completely inoperable. Even then, I want to have it. Just by looking at its files and structures I might begin to get better ideas for testing it.

My basic heuristic is: if it exists, I want to test it. (The only exception is if I have something more important to do.)

My colleague Doug Hoffman has raised a concern about what management expects from testing. The earlier you get a product, the less likely it is that you can make visible progress testing it– and then testing may be blamed for the apparently slow progress. Yes, that is a concern, but it’s a question of managing expectations. Hence, I manage them.

So, send me your huddled masses of code, yearning to be tested. I’ll take it from there.