Six Things That Go Wrong With Discussions About Testing

Talking about software testing is not easy. It’s not natural! Testing is a “meta” activity. It’s not just a task, but a task that generates new tasks (by finding bugs that should be fixed or finding new risks that must be examined). It’s a task that can never be “completed” yet must get “done.”

Confusion about testing leads to ineffective conversations that focus on unimportant issues while ignoring the things that matter. Here are some specific ways that testing conversations fail:

  1. When people care about how many test cases they have instead of what their testing actually does. The number of test cases (e.g. 500, 257, 39345) tells nothing to anyone about “how much testing” you are doing. The reason that developers don’t brag about how many files they created today while developing their product is that everyone knows that it’s silly to count files, or keystrokes, or anything like that. For the same reasons, it is silly to count test cases. The same test activity can be represented as one test case or one million test cases. What if a tester writes software that automatically creates 100,000 variations of a single test case? Is that really “100,000” test cases, or is it one big test case, or is it no test case at all? The next time someone gives you a test case count, practice saying to yourself “that tells me nothing at all.” Then ask a question about what the tests actually do: What do they cover? What bugs can they detect? What risks are they motivated by?
  2. When people speak of a test as an object rather than an event. A test is not a physical object, although physical things such as documentation, data, and code can be a part of tests. A test is a performance; an activity; it’s something that you do. By speaking of a test as an object rather than a performance, you skip right over the most important part of a test: the attention, motivation, integrity, and skill of the tester. No two different testers ever perform the “same test” in the “same way” in all the ways that matter. Technically, you can’t take a test case and give it to someone else without changing the resulting test in some way (just as no quarterback or baseball player will execute the same play in the same way twice) although the changes don’t necessarily matter.
  3. When people can’t describe their test strategy. Test strategy is the set of ideas that guide your choices about what tests to design and what tests to perform in any given situation. Test strategy could also be called the reasoning behind the actions that comprise each test. Test strategy is the answer to questions such as “why are these tests worth doing?” “why not do different tests instead?” “what could we change if we wanted to test more deeply?” “what would we change if we wanted to test more quickly?” “why are we doing testing this way?”The ability to design and discuss test strategy is a hallmark of professional testing. Otherwise, testing would just be a matter of habit and intuition.
  4. When people talk as if automation does testing instead of humans. If developers spoke of development the way that so many people speak of testing, they would say that their compiler created their product, and that all they do is operate the compiler. They would say that the product was created “automatically” rather than by particular people who worked hard and smart to write the code. And management would become obsessed with “automating development” by getting ever better tools instead of hiring and training excellent developers. A better way to speak about testing is the same way we speak about development: it’s something that people do, not tools. Tools help, but tools do not do testing.There is no such thing as an automated test. The most a tool can do is operate a product according to a script and check for specific output according to a script. That would not be a test, but rather a fact check about the product. Tools can do fact checking very well. But testing is more than fact checking because testers must use technical judgment and ingenuity to create the checks and evaluate them and maintain and improve them. The name for that entire human process (supported by tools) is testing. When you focus on “automated tests” you usually defocus from the skills, judgment, problem-solving, and motivation that actually controls the quality of the testing. And then you are not dealing with the important factors that control the quality of testing.
  5. When people talk as if there is only one kind of test coverage. There are many ways you can cover the product when you test it. Each method of assessing coverage is different and has its own dynamics. No one way of talking about it (e.g. code coverage) gives you enough of the story. Just as one example, if you test a page that provides search results for a query, you have covered the functionality represented by the kind of query that you just did (function coverage), and you have covered it with the particular data set of items that existed at that time (data coverage). If you change the query to invoke a different kind of search, you will get new functional coverage. If you change the data set, you will get new data coverage. Either way, you may find a new bug with that new coverage. Functions interact with data; therefore good testing involves covering not just one or the other but also with both together in different combinations.
  6. When people talk as if testing is a static task that is easily formalized. Testing is a learning task; it is fundamentally about learning. If you tell me you are testing, but not learning anything, I say you are not testing at all. And the nature of any true learning is that you can’t know what you will discover next– it is an exploratory enterprise.It’s the same way with many things we do in life, from driving a car to managing a company. There are indeed things that we can predict will happen and patterns we might use to organize our actions, but none of that means you can sleepwalk through it by putting your head down and following a script. To test is to continually question what you are doing and seeing.

    The process of professional testing is not design test cases and then follow the test cases. No responsible tester works this way. Responsible testing is a constant process of investigation and experiment design. This may involve designing procedures and automation that systematically collects data about the product, but all of that must be done with the understanding that we respond to the situation in front of us as it unfolds. We deviate frequently from procedures we establish because software is complicated and surprising; and because the organization has shifting needs; and because we learn of better ways to test as we go.

Through these and other failures in testing conversations, people persist in the belief that good testing is just a matter of writing ever more “test cases” (regardless of what they do); automating them (regardless of what automation can’t do); passing them from one untrained tester to another; all the while fetishizing the files and scripts themselves instead of looking at what the testers are doing with them from day to day.

“Are you listening? Say something!”

[Note: J. Michael Hammond suggests that I note right at the top of this post that the dictionary definition of listen does not restrict the word to the mere noticing of sounds. In the Oxford English Dictionary, as well as others, an extended and more active definition of listening is also given. It’s that extended definition of listening I am writing about here.]

I’m tired of hearing the simplistic advice about how to listen one must not talk. That’s not what listening means. I listen by reacting. As an extravert, I react partly by talking. Talking is how I chew on what you’ve told me. If I don’t chew on what you say, I will choke or get tummy aches and nightmares. You don’t want me to have nightmares, do you? Until you interrupt me to say otherwise, I charitably assume you don’t.

Below is an alternative theory of listening; one that does not require passivity. I will show how this theory is consistent the “don’t talk” advice if you consider that being quiet while other people speak is one heuristic of good listening, rather than the definition or foundation of it. I am tempted to say that listening requires talking, but that is not quite true. This is my proposal of a universal truth of listening: Listening requires you to change.

To Listen is to Change

  1. I propose that to listen is to react coherently and charitably to incoming information. That is how I would define listening.
  2. To react is to change. The reactions of listening may involve a change of mood, attention, concept, or even a physical action.

Notice that I said “coherently and charitably” and not “constructively” or “agreeably.” I think I can be listening to a criminal who demands ransom even if I am not constructive in my response to him. Reacting coherently is not the same as accepting someone’s view of the world. If I don’t agree with you or do what you want me to, that is not proof of my poor listening. “Coherently” refers to a way of making sense of something by interpreting it such that it does not contradict anything important that you also believe is true and important about the world. “Charitably” refers to making sense of something in a way most likely to fit the intent of the speaker.

Also, notice that coherence does not require understanding. I would not be a bad listener, necessarily, if I didn’t understand the intent or implications of what was told to me. Understanding is too high a burden to require for listening. Coherence and charitability already imply a reasonable attempt to understand, and that is the important part.

Poor listening would be the inability or refusal to do the following:

  • take in data at a reasonable pace. (“reasonable pace” is subject to disagreement)
  • make sense of data that is reasonably sensible in that context, including empathizing with it. (“reasonably sensible” is subject to disagreement)
  • reason appropriately about the data. (“reason appropriately” is subject to disagreement)
  • take appropriate responsibility for one’s feelings about the data (“appropriate responsibility” is subject to disagreement)
  • make a coherent response. (“coherent response” is subject to disagreement)
  • comprehend the reasonable purposes and nature of the interaction (“reasonable purposes and nature” is subject to disagreement)

Although all these elements are subject to disagreement, you might not choose to actively dispute them in a given situation, because maybe you feel that the disagreement is not very important. (As an example, I originally wrote “dispute” in the text above, which I think is fine, but during review, after hearing me read the above, Michael Bolton suggested changing “dispute” to “disagreement” and that seemed okay, too, so I made the change. In making his suggestion, he did not need to explain or defend his preference, because he’s earned a lot of trust with me and I felt listened to.)

I was recently told, in an argument, that I was not listening. I didn’t bother to reply to the man that I also felt he wasn’t listening to me. For the record, I think I was listening well enough, and what the man wanted from me was not listening– he wanted compliance to his world view, which was the very matter of dispute! Clearly he wasn’t getting the reaction he wanted, and the word he used for that was listening. Meanwhile, I had reacted to his statements with arguments against them. To me, this is close to the essence of listening.

If you really believe someone isn’t listening, it’s unlikely that it will help to say that, unless you have a strong personal relationship. When my wife tells me I’m not listening, that’s a very special case. She’s weaker than me and crucial to my health and happiness, therefore I will use every tool at my disposal to make myself easy for her to talk to. I generally do the same for children, dogs, people who seem mentally unstable, fire, and dangerous things, but not for most colleagues. I do get crossed up sometimes. Absolutely. Especially on Twitter. Sometimes I assume a colleague feels powerful, and respond to him that way, only later to discover he was afraid of me.

(This happened again just the other day on Twitter. Which is why it is unlikely you will see me teach in Finland any time soon! I am bitten by such a mistake a few times a year, at least. For me this is not a reason to be softer with my colleagues. Then again, it may be. I struggle with the pros and cons. There is no simple answer. I regularly receive counsel from my most trusted colleagues on this point.)

A Sign of Being Listened to is the Change that Happens

Introspect for a moment. How do you know that your computer is listening to you? At this moment, as I am typing, the letters I want to see are appearing on the screen as I press the keys. WordPress is talking back to me. WordPress is changing, and its changes seem coherent and reasonable to me. My purposes are apparently being served. The computer is listening. Consider what happens when you don’t see a response from your computer. How many times have you clicked “save” or “print” or “calculate” or “paste” and suffered that sinking feeling as the forest noises go completely silent and your screen goes glassy and gets that faraway grayed out look of the damned? You feel out of control. You want to shout at your screen “Come back! I’ve changed my mind! Undo! Cancel!” How do you feel then? You don’t say to yourself “what a good listener my computer is!”

Why is this so? It’s because you are involved in a cybernetic control loop with your computer. Without frequent feedback from your system you lose your control over it. You don’t know what it needs or what to do about it. It may be listening to something, but when nothing changes in a manner that seems to relate to your input, you suspect it is not listening to you.

Based just on this example I conjecture that we feel listened to when a system responds to our utterances and actions in a harmonious manner that honors our purposes. I further conjecture that the advice to maintain attentive silence in order to listen better is a special case of change in such a way as to foster harmony and supportiveness.

Can we think of a situation where listening to someone means shouting loudly over them? I can. I was recently in a situation where a quiet colleague was trying to get students to return to her tutorial after a break. The hallway was too noisy and few people could hear her. I noticed that, so I repeated her words very loudly that her students might hear. I would argue that I listened and responded harmoniously in support of her needs. I didn’t ask her if she felt that I listened to her. She knows I did. I could tell by her smile.

If my wife cries “brake!” when I’m driving, I hit the brake. The physical action of my foot on the brake is her evidence that I listened, not attentive silence or passivity.

It may be a small change or a large change, but for the person communicating with you to feel listened to, they must see good evidence of an appropriate change (or change process) in you.

Let me tell you about being a father of a strong-minded son. I have been in numerous arguments with my boy. I have learned how to get my point across: plant the idea, argue for a while, and then let go of it. I discovered it doesn’t matter if he seems to reject the idea. In fact, I’ve come to believe he cannot reject any idea of mine unless it is genuinely wrong for him. I know he’s listening because he argues with me. And if he gets upset, that means he must be taking it quite seriously. Then I wait. And I invariably see a response in the days that follow (I mean not a single instance of this not happening comes to mind right now).

One of the tragedies of fatherhood is that many fathers can’t tell when their children are listening because they need to see too specific a response too quickly. Some listening is a long process. I know that my son needs to chew on difficult ideas in order to process them. This is how to think about the listening process. True listening implies digestion and incubation. The mental metabolism is subtle, complicated, and absolutely vital.

Let People Chew on Your Ideas

Listening is not primarily about taking information into yourself, any more than eating is about taking food into yourself. With eating the real point is digestion. And for good listening you need to digest, too. Part of digestion is chewing, and for humans part of listening is reacting to the raw data for the purposes of testing understanding and contrasting the incoming data with other data they have. Listening well about any complicated thing requires testing. Does this apply to your spouse and children, too? Yes! But perhaps it applies differently to them than to a colleague at work, and certainly differently than testing-as-listening to politician or a telemarketer.

Why does this matter so much? Because if we uncritically accept ideas we risk falling prey to shallow agreement, which is the appearance of agreement despite an unrecognized deep disagreement. I don’t want to find out in the middle of a critical moment on a project that your definition of testing, or role, or collaboration, or curiosity doesn’t match mine. I want to have conversations about the meanings of words well before that. Therefore I test my understanding. Too many in the Agile culture seem to confuse a vacant smile with philosophical and practical comprehension. I was told recently that for an Agile tester, “collaboration” may be more important than testing skill. That is probably the stupidest thing I have heard all year. By “stupid” I mean willfully refusing to use one’s mind. I was talking to a smart man who would not use his smarts in that moment, because, by his argument, the better tester is the one who agrees to do anything for anyone, not the one who knows how to find important bugs quickly. In other words, any unskilled day laborer off the street, desperate for work, is apparently a better tester than me. Yeah… Right…

In addition to the idea digestion process, listening also has a critical social element. As I said above, whether or not you are listening is, practically speaking, always a matter of potential dispute. That’s the way of it. Listening practices and instances are all tied up in socially constructed rituals and heuristics. And these rituals are all about making ourselves open to reasonable change in response to each other. Listening is about the maintenance of social order as well as maintaining specific social relationships. This is the source of all that advice about listening by keeping attentively quiet while someone else speaks. What that misses is that the speaker also has a duty to perform in the social system. The speaker cannot blather on in ignorance or indifference to the idea processing practices of his audience. When I teach, I ask my students to interrupt me, and I strive to reward them for doing so. When I get up to speak, I know I must skillfully use visual materials, volume control, rhythm, and other rhetorical flourishes in order to package what I’m communicating into a more digestible form.

Unlike many teachers, I don’t interpret silence as listening. Silence is easy. If an activity can be done better and cheaper by a corpse or an inanimate object, I don’t consider it automatically worth doing as a living human.

I strongly disagree with Paul Klipp when he writes: “Then stop thinking about talking and pretend, if it’s not obvious to you yet, that the person who is talking is as good at thinking as you are. You may suddenly have a good idea, or you may have information that the person speaking doesn’t. That’s not a good enough reason to interrupt them when they are thinking.” Paul implies that interrupting a speaker is an expression of dominance or subversion. Yes, it can be, but it is not necessarily so, and I wish someone trained in Anthropology would avoid such an uncharitable oversimplification. Some interruptions are harmful and some are helpful. In fact, I would say that every social act is both harmful and helpful in some way. We must use our judgment to know what to say, how to say it and when. Stating favorite heuristics as if they were “best practices” is patronizing and unnecessary.

One Heuristic of Listening: Stop Talking

Where I agree with Paul and others like him is that one way of improving the harmony of communication and that feeling of being coherently and charitably responded to is to talk less. I’m more likely to use that in a situation where I’m dealing with someone whom I suspect is feeling weak, and whom I want to encourage to speak to me. However, another heuristic I use in that situation is to speak more. I do this when I want to create a rhetorical framework to help the person get his idea across. This has the side effect of taking pressure of someone who may not want to speak at all. I say this based on the vivid personal experience of my first date with the one who would become my wife. I estimate I spoke many thousands of words that evening. She said about a dozen. I found out later that’s just what she was looking for. How do I know? After two dates we got married. We’ve been married 23 years, so far. I also have many vivid experiences of difficult conversations that required me to sit next to her in silence for as long as 10 minutes until she was ready to speak. Both the “talk more” and “talk less” heuristics are useful for having a conversation.

What does this have to do with testing?

My view of listening can be annoying to people for exactly the same reason that testing is annoying to people. A developer may want me to accept his product without “judgment.” Sorry, man. That is not the tester’s way. A tester who doesn’t subject your product to criticism is, in fact, not taking it seriously. You should not feel honored by that, but rather insulted. Testing is how I honor strong, good products. And arguing with you may be how I honor your ideas.

Listening, I claim, is itself a testing process. It must be, because testing is how we come to comprehend anything deeply. Testing is a practice that enables deep learning and deeply trusting what we know.

Are You Listening to Me?

Then feel free to respond. Even if you disagree, you could well have been listening. I might be able to tell from your response, if that matters to you.

If you want to challenge this post, try reading it carefully… I will understand if you skip parts, or see one thing and want to argue with that. Go ahead. That might be okay. If I feel that there is critical information that you are missing, I will suggest that you read the post again. I don’t require that people read or listen to me thoroughly before responding. I ask only that you make a reasonable and charitable effort to make sense of this.

 

Variable Testers

I once heard a vice president of software engineering tell his people that they needed to formalize their work. That day, I was an unpaid consultant in the building to give a free seminar, so I had even less restraint than normal about arguing with the guy. I raised my hand, “I don’t think you can mean that, sir. Formality is about sameness. Are you really concerned that your people are working in different ways? It seems to me that what you ought to be concerned about is effectiveness. In other words, get the job done. If the work is done a different way every time, but each time done well, would you really have a problem with that? For that matter, do you actually know how your folks work?”

This was years ago. I’m wracking my brain, but I can’t remember specifically how the executive responded. All I remember is that he didn’t reply with anything very specific and did not seem pleased to be corrected by some stranger who came to give a talk.

Oh well, it had to be done.

I have occasionally heard the concern by managers that testers are variable in their work; that some testers are better than others; and that this variability is a problem. But variability is not a problem in and of itself. When you drive a car, there are different cars on the road each day, and you have to make different patterns of turning the wheel and pushing the brake. So what?

The weird thing is how utterly obvious this is. Think about managers, designers, programmers, product owners… think about ANYONE in engineering. We are all variable. Complaining about testers being variable– as if that were a special case– seems bizarre to me… unless…

I suppose there are two things that come to mind which might explain it:

1) Maybe they mean “testers vary between satisfying me and not satisfying me, unlike other people, who always satisfy me.” To examine this we would discover what their expectations are. Maybe they are reasonable or maybe they are not. Maybe a better system for training and leading testers is needed.

2) Maybe they mean “testing is a strictly formal process that by its nature should not vary.” This is a typical belief by people who know nothing about testing. What they need is to have testing explained or demonstrated to them by someone who knows what he’s doing.

 

 

 

 

 

No KPIs: Use Discussion

My son is ready to send the manuscript of his novel to publishers. It’s time to see what the interest is. In other words, we are going to beta on it. He made this decision tonight.

What is the quality level of his manuscript? There is no objective measure for that. Even if we might imagine “requirements” we could not say for sure if they are met. I can tell you that the novel is about 800 pages long, representing well more than 1,200 hours of his work alone. I have worked a lot on editing and review. The first half has been rewritten many times– maybe 20 or 30. It’s a mature draft.

The first third is good, in my opinion. I’m biased. I’ve read the parts I’ve read many many times. But it seems good to me. I cannot yet speak about the latter 2/3 because I haven’t gotten there yet. I know it will be good by the time we’ve completed the editing, because he’s using a methodical, competent editing process.

Here’s my point. My son, who relies on me to test his novel, has not asked me to quantify my process nor my results. I have not been asked for a KPI. He cares deeply about the quality of his work, but he doesn’t think that can be reduced to numbers. I think this is partly because my son is no longer a child. He doesn’t need me or anyone else to make complicated life simple for him.

How do you measure quality?

Gather relevant evidence through testing and other means. Then discuss that evidence.

That’s how it works for us. That’s how it works for publishers. That’s how it works for almost everything.

Who can’t accept this?

Children and liars.

But my company demands that I report quality in the form of an objective metric!

I’m sorry that you work for children and/or liars. You must feel awful.

 

Test Jumpers: One Vision of Agile Testing

Many software companies, these days, are organized around a number of small Agile teams. These teams may be working on different projects or parts of the same project. I have often toured such companies with their large open plan offices; their big tables and whiteboards festooned with colorful Post-Its occasionally fluttering to the floor like leaves in a perpetual autumn display; their too many earbuds and not nearly enough conference rooms. Sound familiar, Spotify? Skype?

(This is a picture of a smoke jumper. I wish test jumpers looked this cool.)

I have a proposal for skilled Agile testing in such places: a role called a “test jumper.” The name comes from the elite “smoke jumper” type of firefighter. A test jumper is a trained and enthusiastic test lead (see my Responsible Tester post for a description of a test lead) who “jumps” into projects and from project to project: evaluating the testing, doing testing or organizing people in other roles to do testing. A test jumper can function as test team of one (what I call an omega tester ) or join a team of other testers.

The value of a role like this arises because in a typical dedicated Agile situation, everyone is expected to help with testing, and yet having staff dedicated solely to testing may be unwarranted. In practice, that means everyone remains chronically an amateur tester, untrained and unmotivated. The test jumper role could be a role held by one person, dedicated to the mastery of testing skills and tools, who is shared among many projects. This is a role that I feel close to, because it’s sort of what I already do. I am a consulting software tester who likes to get his hands dirty doing testing and running in-house testing events. I love short-term assignments and helping other testers come up to speed.

 

 

What Does a Test Jumper Do?

A test jumper basically asks, How are my projects handling the testing? How can I contribute to a project? How can I help someone test today?

Specifically a test jumper:

  • may spend weeks on one project, acting as an ordinary responsible tester.
  • may spend a few days on one project, organizing and leading testing events, coaching people, and helping to evaluate the results.
  • may spend as little as 90 minutes on one project, reviewing a test strategy and giving suggestions to a local tester or developer.
  • may attend a sprint planning meeting to assure that testing issues are discussed.
  • may design, write, or configure a tool to help perform a certain special kind of testing.
  • may coach another tester about how to create a test strategy, use a tool, or otherwise learn to be a better tester.
  • may make sense of test coverage.
  • may work with designers to foster better testability in the product.
  • may help improve relations between testers and developers, or if there are no other testers help the developers think productively about testing.

Test jumping is a time-critical role. You must learn to triage and split your time across many task threads. You must reassess project and product risk pretty much every day. I can see calling someone a test jumper who never “jumps” out of the project, but nevertheless embodies the skills and temperament needs to work in a very flexible, agile, self-managed fashion, on an intense project.

Addendum #1: Commenter Augusto Evangelisti suggests that I emphasize the point about coaching. It is already in my list, above, but I agree it deserves more prominence. In order to safely “jump” away from a project, the test jumper must constantly lean toward nudging, coaching, or even training local helpers (who are often the developers themselves, and who are not testing specialists, even though they are super-smart and experienced in other technical realms) and local responsible testers (if there are any on that project). The ideal goal is for each team to be reasonably self-sufficient, or at least for the periodic visits of the test jumper to be enough to keep them on a good track.

What Does a Test Jumper Need?

  • The ability and the enthusiasm for plunging in and doing testing right now when necessary.
  • The ability to pull himself out of a specific test task and see the big picture.
  • The ability to recruit helpers.
  • The ability to coach and train testers, and people who can help testing.
  • A wide knowledge of tools and ability to write tools as needed.
  • A good respectful relationship with developers.
  • The ability to speak up in sprint planning meetings about testing-related issues such as testability.
  • A keen understanding of testability.
  • The ability to lead ad hoc groups of people with challenging personalities during occasional test events.
  • An ability to speak in front of people and produce useful and concise documentation as necessary.
  • The ability to manage many threads of work at once.
  • The ability to evaluate and explain testing in general, as well as with respect to particular forms of testing.

A good test jumper will listen to advice from anyone, but no one needs to tell a test jumper what to do next. Test jumpers manage their own testing missions, in consultation with such clients as arise. A test jumper must be able to discover and analyze the testing context, then adapt to it or shape it as necessary. It is a role made for the Context-Driven school of testing.

Does a Test Jumper Need to be a Programmer?

Coding skills help tremendously in this role, but being a good programmer is not absolutely required. What is required is that you learn technical things very quickly and have excellent problem-solving and social skills. Oh, and you ought to live and breathe testing, of course.

How Does a Test Jumper Come to Be?

A test jumper is mostly self-created, much as good developers are. A test jumper can start as a programmer, as I did, and then fall in love with the excitement of testing (I love the hunt for bugs). A test jumper may start as a tester, learn consulting and leadership skills, but not want to be a full-time manager. Management has its consolations and triumphs, of course, but some of us like to do technical things. Test jumping may be part of extending the career path for an experienced and valuable tester.

RST Methodology: “Responsible Tester”

In Rapid Software Testing methodology, we recognize three main roles: Leader, Responsible Tester, and Helper. These roles are situational distinctions. The same person might be a helper in one situation, a leader in another, and a responsible tester in yet another.

Responsible Tester

Rapid Software Testing is a human-centered approach to testing, because testing is a performance and can only be done by humans. Therefore, testing must be traceable to people, or else it is literally and figuratively irresponsible. Hence, a responsible tester is that tester who bears personal responsibility for testing a particular thing in a particular way for a particular project. The responsible tester answers for the quality of that testing, which means the tester can explain and defend the testing, and make it better if needed. Responsible testers also solicit and supervise helpers, as needed (see below).

This contrasts with factory-style testing, which relies on tools and texts rather than people. In the Factory school of testing thought, it should not matter who does the work, since people are interchangeable. Responsibility is not a mantle on anyone’s shoulders in that world, but rather a sort of smog that one seeks to avoid breathing too much of.

Example of testing without a responsible tester: Person A writes a text called a “test case” and hands it to person B. Person B reads the text and performs the instructions in the text. This may sound okay, but what if Person B is not qualified to evaluate if he has understood and performed the test, while at the same time Person A, the designer, is not watching and so also isn’t in position to evaluate it? In such a case, it’s like a driverless car. No one is taking responsibility. No one can say if the testing is good or take action if it is not good. If a problem is revealed later, they may both rightly blame the other.

That situation is a “sin” in Rapid Testing. To be practicing RST, there must always a responsible tester for any work that the project relies upon. (Of course students and otherwise non-professional testers can work unsupervised as practice or in the hopes of finding one more bug. That’s not testing the project relies upon.)

A responsible tester is like being the driver of an automobile or the pilot-in-command of an aircraft.

Helper

A helper is someone who contributes to the testing without taking responsibility for the quality of the work AS testing. In other words, if a responsible tester asks someone to do something simple to press a button, the helper may press the button without worrying about whether that has actually helped fulfill the mission of testing. Helpers should not be confused with inexperienced or low-skilled people. Helpers may be very skilled or have little skill. A senior architect who comes in to do testing might be asked to test part of the product and find interesting bugs without being expected to explain or defend his strategy for doing that. It’s the responsible tester whose job it is to supervise people who offer help and evaluate the degree to which their work is acceptable.

Beta testing is testing that is done entirely by helpers. Without responsible testers in the mix, it is not possible to evaluate in any depth what was achieved. One good way to use beta testers is to have them organized and engaged by one or more responsible testers.

Leader

A leader is someone whose responsibility is to foster and maintain the project conditions that make good testing possible; and to train, support, and evaluate responsible testers. There are at least two kinds of leader, a test lead and a test manager. The test manager is a test lead with the additional responsibilities of hiring, firing, performance reviews, and possibly budgeting.

In any situation where a leader is responsible for testing and yet has no responsible testers on his team, the leader IS the acting responsible tester. A leader surrounded by helpers is the responsible tester for that team.

 

Justifying Real Acceptance Testing

This post is not about the sort of testing people talk about when nearing a release and deciding whether it’s done. I have another word for that. I call it “testing,” or sometimes final testing or release testing. Many projects perform that testing in such a perfunctory way that it is better described as checking, according to the distinction between testing and checking I have previously written of on this blog. As Michael Bolton points out, that checking may better be described as rejection checking since a “fail” supposedly establishes a basis for saying the product is not done, whereas no amount of “passes” can show that it is done.

Acceptance testing can be defined in various ways. This post is about what I consider real acceptance testing, which I define as testing by a potential acceptor (a customer), performed for the purpose of informing a decision to accept (to purchase or rely upon) a product.

Do we need acceptance testing?

Whenever a business decides to purchase and rely upon a component or service, there is a danger that the product will fail and the business will suffer. One approach to dealing with that problem is to adopt the herd solution: follow the thickest part of the swarm; choose a popular product that is advertised or reputed to do what you want it to do and you will probably be okay. I have done that with smartphones, ruggedized laptops, file-sharing services, etc. with good results, though sometimes I am disappointed.

My business is small. I am nimble compared to almost every other company in the world. My acceptance testing usually takes the form of getting a trial subscription to service, or downloading the “basic” version of a product. Then I do some work with it and see how I feel. In this way I learned to love Dropbox, despite its troubling security situation (I can’t lock up my Dropbox files), or the fact that there is a significant chance it will corrupt very large files. (I no longer trust it with anything over half of a gig).

But what if I were advising a large company about whether to adopt a service or product that it will rely upon across dozens or hundreds or thousands of employees? What if the product has been customized or custom built specifically for them? That’s when acceptance testing becomes important.

Doesn’t the Service Level Agreement guarantee that the product will work?

There are a couple of problems with relying on vendor promises. First, the vendor probably isn’t promising total satisfaction. The service “levels” in the contract are probably narrowly and specifically drawn. That means if you don’t think of everything that matters and put that in the contract, it’s not covered. Testing is a process that helps reveal the dimensions of the service that matter.

Second, there’s an issue with timing. By the time you discover a problem with the vendor’s product, you may be already relying on it. You may already have deployed it widely. It may be too late to back out or switch to a different solution. Perhaps your company negotiated remedies in that case, but there are practical limitations to any remedy. If your vendor is very small, they may not be able to afford to fix their product quickly. If you vendor is very large, they may be able to afford to drag their feet on the fixes.

Acceptance testing protects you and makes the vendor take quality more seriously.

Acceptance testing should never be handled by the vendor. I was once hired by a vendor to do penetration testing on their product in order to appease a customer. But the vendor had no incentive to help me succeed in my assignment, nor to faithfully report the vulnerabilities I discovered. It would have been far better if the customer had hired me.

Only the accepting party has the incentive to test well. Acceptance testing should not be pre-agreed or pre-planned in any detail– otherwise the vendor will be sure that the product passes those specific tests. It should be unpredictable, so that the vendor has an incentive to make the product truly meet its requirements in a general sense. It should be adaptive (exploratory) so that any weakness you find can be examined and exploited.

The vendor wants your money. If your company is large enough, and the vendor is hungry, they will move mountains to make the product work well if they know you are paying attention. Acceptance testing, done creatively by skilled testers on a mission, keeps the vendor on its toes.

By explicitly testing in advance of your decision to accept the product, you have a fighting chance to avoid the disaster of discovering too late that the product is a lemon.

My management doesn’t think acceptance testing matters. What do I do?

1. Make the argument, above.
2. Communicate with management, formally, about this. (In writing, so that there is a record.)
It is up to management to make decisions about business risk. They may feel the risk is not worth worrying about. In that case, you must wait and watch. People are usually more persuaded by vivid experiences, rather than abstract principles, so:
1. Collect specific examples of the problems you are talking about. What bugs have you experienced in vendor products?
2. Collect news reports about bugs in products of your vendors (or other vendors) that have been disruptive.
3. In the event you get to do even a little acceptance testing, make a record of the problems you find and be ready to remind management of that history.

 

Tyranny of the Innocents, Exhibit A: Gary Cohen

Gary Cohen, Deputy Administrator and Director, Center for Consumer Information and Insurance Oversight, Centers for Medicare and Medicaid
Services, is not a computer guy. He’s a lawyer. He knows the insurance industry. And yet he was put in charge of a very large and important project: Healthcare.gov.

Here’s what BusinessWeek said about him on August 22nd:

“Gary Cohen seems awfully calm for a man whose job is to make sure Obamacare doesn’t flop. As head of CCIIO (awkward pronunciation: Suh-SIE-O), he oversees the complex, politically fraught system of state health insurance exchanges that will begin signing up uninsured Americans starting on Oct. 1. It hasn’t exactly been a smooth rollout. Many Americans still have no idea the exchanges exist, and the administration has struggled to explain who’s eligible for coverage under the Affordable Care Act and how they enroll. Cohen is convinced the confusion will clear up once things are up and running. “We’re going to get to the point where the discussions we’re having today will fade into the background,” he says.

He should have known that the system wasn’t going to work, at that point. But he’s not a technology guy, so perhaps he thought some big-brained hacker from the movies was going to pull it together at the last minute?

Here’s what he was asked and what he answered at a House Committee on Oversight hearing on May 21st:

Ms. DUCKWORTH. …Could you speak a little bit on the Administration’s readiness to
reach out to this huge number of people so that they can enroll in
time? Basically, you say that you are going to be ready to go on
October 1st, and you need to be. If not, what do you need in order
to get ready and have a successful rollout of these provisions?

Mr. COHEN. So we have a plan in place that basically is timed
so that people are getting the information close to the time in
which there is something that they can do with it. So right now we
are in what we call the education phase, which began in January
and proceeds through June, where we are just putting out information.
We are in the process of re-purposing the HealthCare.gov site
to be really a consumer information site. Our call center will be
going live in June, where people will be able to call and get information
that way. And then starting in the summer we will begin
what we call the anticipation, or get ready phase. And I am not an
expert in these things, but what I understand is that if you start
too early and then people say, well, what do I do, and then there
is nothing that they can do because it is too soon, then you may
end up having people who get a little bit kind of frustrated or disappointed.
So we really are gearing towards making sure the people get the
information they need in time for October, when they actually can
take action and begin to get enrollment coverage.

Hmm. He was asked directly if he needed anything to make sure he was ready to go on October 1st. His answer was basically: no thank you.

Did he really think everything was on track? Why didn’t his people prepare him to set expectations better?

Mr. GOSAR. Mr. Cohen, how closely is HHS working with IRS on Obamacare
implementation?
Mr. COHEN. We are working closely with IRS on those aspects of
implementation where we have to work together, so, for example,
as you know, in determining whether a person is eligible for Medicaid
or CHIP on the one hand, or tax credits in the marketplaces
on the other, income is a test, and we are working with IRS on
verifying people’s income when they apply.
Mr. GOSAR. So the IRS is going to be gathering and sending this
enormous amount of taxpayer information to all the 50 exchanges.
All 50 exchanges are to be ready by October 1st, right?
Mr. COHEN. Yes.
Mr. GOSAR. So will there be any problems with this massive
amount of data sharing?
Mr. COHEN. No. And data sharing may not be exactly the right
way to look at it. Basically what will happen is people will put information
about their income in an application; that information
will be verified by data that comes from the IRS, but there is no
exchange of information from the IRS to the exchange; the information
goes out, it is verified, and it comes back.
Mr. GOSAR. But it is still from the exchange going to the IRS,
and that is where I am going.
Mr. COHEN. It is going to the data hub. Information is coming
from the IRS to the data hub and from the exchange to the data
hub, and there is a comparison and then there is an answer back.
But the tax information isn’t actually going to the exchange.

What a refreshingly blunt answer to the question of whether there will be any trouble with data exchange: No. Unfortunately, we now know there are massive problems with that. Why didn’t he give a more nuanced answer? Why didn’t he hedge? This is why I think he’s an innocent– a child put in charge of the chocolate factory. He didn’t need to be, but that’s how he played it. I guess he was distracted by other duties and trusted the technologists? Or maybe he dismissed the concerns of the technologists as mere excuses? I wonder.

 

 

 

 

Healthcare.gov and the Tyranny of the Innocents

The failure of Healthcare.gov is probably not because of sinister people. It’s probably because of the innocents who run the project. These well-intentioned people are truly as naive as little children. And they must be stopped.

They are, of course, normal intelligent adults. I’m sure they got good grades in school– if you believe in that sort of thing– and they can feed and clothe themselves. They certainly look normal, even stately and wise. It’s just that they are profoundly ignorant about technology projects while being completely oblivious to and complacent about that ignorance. That is the biggest proximal cause of this debacle. It’s called the Dunning-Kruger syndrome (which you can either look up or confidently assure yourself that you don’t need to know about): incompetence of a kind that makes you unable to assess your own lack of competence.

Who am I talking about? I’m talking to some extent about everyone above the first level of management on that project, but mostly I’m talking about anyone who was in the management chain for that project but who has never coded or tested before in their working lives. The non-technical people who created the conditions that the technical people had to work under.

I also blame the technical people in a different way. I’ll get to that, below.

How do I come to this conclusion? Well, take a look at the major possibilities:

Maybe it didn’t fail. Maybe this is normal for projects to have a few glitches? Oh my, no. Project failures are not often clear cut. But among failures, this one is cut as clearly as the Hope Diamond. This is not a near miss. This is the equivalent of sending Hans away to sell the family cow and he comes back with magic beans. It’s the equivalent of going to buy a car and coming back with a shopping cart that has a cardboard sign on which someone has written “CAR” in magic marker. It’s a swing and a miss when the batter was not even holding a bat. It’s so bad, I hope criminal charges are being considered. Make no mistake, the people who ran this project scammed the US government.

Did it fail because it’s too hard a project to do? It’s a difficult project, for sure. It may have been too hard to do under the circumstances prescribed. If so, then we should have heard that message a year ago. Loudly and publicly. We didn’t hear that? Why? Could it have been that the technical people kept their thoughts and feelings carefully shrouded? That’s not what’s being reported. It’s come out that technical people were complaining to management. Management must have quashed those complaints.

Did politics prevent the project from succeeding? No doubt that created a terrible environment in which to produce the system. So what? If it’s too hard, just laugh and say “hey this is ridiculous, we can’t commit to creating this system” UNLESS, of course, you are hoping to hide the problem forever, like a child who has wet the bed and dumps the sheets out the back window. I suppose it’s possible that Republican operatives secretly conspired to make the project fail. If so, I hope that comes out. Doesn’t matter, though. Management could still have seen it coming, unless the whole development team was in on the fix.

Were the technical people incompetent? Probably. It’s likely that many of the programmers were little better than novices, from what I can tell by looking at the bug reports coming through. It was a Children’s Crusade, I guess. But again, so what? The purpose of management, at each of the contracting agencies and above them, is to assess and assure the general competence and performance of the people working on the job. That comes first. I’m sure there were good people mixed in there, somewhere. I have harsh feelings for them, however. I would say to them: Why didn’t you go public? Why didn’t you resign? You like money that much? Your integrity matters that little to you?

Management created the conditions whereby this project was “delivered” in a non-working state. Not like the Dreamliner. The 787 had some serious glitches, and Boeing needs to shape that up. What I’m talking about is boarding an aircraft for a long trip only to be told by the captain “Well, folks it looks like we will be stuck here at the gate for a little while. Maintenance needs to install our wings and engines. I don’t know much about aircraft building, but I promise we will be flying by November 30th. Have some pretzels while you wait.”

Management must bear the prime responsibility for this. I’m not sure that Obama himself is to blame. Everyone under him though? Absolutely.

What About Testing?

Little testing happened on the site. The testing that happened seems to have confirmed what everyone knew. Now this article has come out, about what’s happening behind the scenes. I sure hope they have excellent Rapid Testers working on that, because there is no time for TDD or much of any unit testing and certainly no time to write bloated nonsensical “test case specs” that usually infect government efforts like so much botfly larvae.

Notice the bit at the end?

“It’s a lot of work but people are committed to it. I haven’t heard anyone say it’s not a doable job,” the source said of the November 30th deadline to fix the online portal to purchase insurance on the federal exchange.

Exactly. That’s exactly the problem, Mr. Source. This is what I mean by the tyranny of the innocents. If no one is telling you that the November 30th deadline is not doable, and you think that’s a good sign, then you are an innocent. If you are managing to that expectation then you are a tyrant. It’s probably not doable. I already know that this can’t possibly leave enough time for reasonable testing of the system. Even if it is doable, only a completely dysfunctional project has no one on it speaking openly about whether it is doable.

What Can Be Done?

Politics will ruin everything. I have no institutional solution for this kind of problem. “Best practices” won’t help. Oversight committees won’t help. I can only say that each of us can and should foster a culture of personal ethical behavior. I was on a government project, briefly, years ago. I concluded it was an outlandish waste of taxpayer money and I resigned. I wanted the money. But I resigned anyway. It wasn’t easy. I had car payments and house payments to make. Integrity can be hard. Integrity can be lonely. I don’t always live up to my highest ideals for my own behavior, and when that happens I feel shame. The shame I feel spurs me to be better. That’s all I’m hoping for, really. I hope the people who knew better on this project feel shame. I hope they listen to that shame and go on to be better people.

I do have advice for the innocents. I’ll speak directly to you, Kathy Sebelius, since you are the most public example of who I am talking about…

Hi Kathy,

You’re not a technology person. You shouldn’t have to be. But you need people working for you who are, because technology is opaque. It may surprise you to know that unlike building bridges and monuments, the status of software can be effectively hidden from anyone more than one level above (or sideways from) the programmer or tester who is actually working on that particular piece of it. It’s like managing a gold mine without being able to go down into the mine yourself.

This means you are in a weak position, as an executive. You can pound the table and threaten to fire people, sure. It won’t help. The way in which an executive can use direct power will only make a late software project even later. Every use of direct power weakens your influence. Use indirect power, instead. Imagine that you are taming wild birds. I used to do that as a kid in Vermont. It requires quietness and patience. The first part is to stand for an hour holding birdseed in your hand. Stand quietly and eventually they are landing in your hand.

To have managed this project well, you needed to have created an environment where people could speak without fear. You needed to work with your direct reports to make sure they weren’t filtering out too much of the bad news. You needed to visit the project on a regular basis, and talk to the lowest level people. Then you needed to forgive their managers for not telling you all the bad news. It’s a maddeningly slow process. If you notice, the Pope is currently doing something very similar. Hey, I’m an atheist and yet I find myself listening to that guy. He’s a master of indirect leadership.

You did have the direct power to set expectations. I’m sure you realize you could have done a much better job of that, but perhaps you felt fear, yourself. As your employer (a taxpaying citizen), I bear a little of that responsibility. The country is getting the Healthcare.gov site that it deserves, in a sense.

If you are going to continue in public service, please do yourself and all of us a favor and take a class on software project management. Attend a few lectures. Get smart about what kinds of dodges and syndromes contractors use.

Don’t be an innocent, marching to the slaughter, while millions of dollars line the pockets of the people who run CGI and all those other parasite companies.

— Sincerely, James

My Political Agenda

I have $200,000 of unpaid medical bills due to the crazy jacked up prices and terrible insurance situation for individual citizens in the United States. I am definitely a supporter of the concept of health care reform, even the flawed Obamacare system, if that’s the best we can do for now.

I was pleased to see the failure of the Healthcare.gov website, at first. A little failure helps me make my arguments about how hard it is to do technology well; how getting it right means striving to better ourselves, and no formula or textual incantation will do that for us.

This is too much failure! I want it to stop now. Still, I’m an adult, a software project expert and not in any way an innocent. I know it’s not going to be resolved soon. No Virginia, there won’t be a Healthcare.gov website this Christmas.

Addendum:

From cnn.com:

Summers wrote a memo to the President in 2010 suggesting that HealthCare.gov was not something the government could handle and he needed to bring in experts.

While Summers would not provide details about internal discussions, he said Tuesday, “You need experts. You need to trust but you need to verify. You can’t go rushing the schedule when you get behind or you end up making more errors.”

Damn straight. If this is true then I’m sure glad someone around Obama had basic wisdom. I guess nobody listened to him.

Seven Kinds of Testers

Most of my work is teaching, coaching, and evaluating testers. But as a humanist, I want to apply the Diversity Heuristic: our differences can make us a stronger team. That means I can’t pick one comfortable kind of tester and grade people against that template. On the other hand, I do see interesting patterns of skill and temperament among testers, and it seems reasonable to talk about those patterns in a broad sense. Even though snowflakes are unique, it’s also true that snowflakes are all alike.

So, I propose that there are at least seven different types of testers: administrative tester, technical tester, analytical tester, social tester, empathic tester, user, and developer. As I explain each type, I want you to understand this:

These types are patterns, not prisons. They are clusters of heuristics; or in some cases, roles. Your style or situation may fit more than one of these patterns.

  • Administrative Tester. The administrative tester wants to move things along. Do the task, clear the obstacles, get to “done.” High level administrative testers want to be in the meetings, track the agreements, get the resources, update the dashboards. They are coordinators; managers.  Low level administrative testers often enjoy the paperwork aspect of testing: checking off boxes on spreadsheets, etc. (I was a test manager for years and did a lot of administrative work.) Warning: Administrative testers often are tempted to “fake” the test process. This pattern does not focus on the intellectual details of testing, but more the visible apparatus.
  • Technical Tester. The technical tester builds tools, uses tools, and in general thinks in terms of code. They are great as advocates for testability because they speak the language of developers. The people called SDETs are technical testers. Google and Microsoft love technical testers. (As a programmer I have one foot in this pattern at all times.) Warning: Technical testers are often tempted not to test things that can’t easily be tested with the tools they have. And they often don’t study testing, as such, preferring to learn more about tools.
  • Analytical Tester. The analytical tester loves models and typically enjoys mathematics (although not necessarily). Analytical testers create diagrams, matrices, and outlines. They read long specs. They gravitate to combination testing. (If I had to choose one category to be, I would have to say I am more analytical than anything else.) Warning: Analytical testers are prone to planning paralysis. They often dream of optimal test sets instead of good enough. If they can’t easily model it, they may ignore it.
  • Social Tester. The social tester wants you! Social testers discover all the people who can help them and prefer working in teams to being alone. Social testers understand that other people often have already done the work that needs to be done, and that no one person needs to have the whole solution. A social tester knows that you don’t have to be a coder to test– but it sure helps to know one. A good social tester cultivates social capital: credibility and services to offer others. (I follow a lot of the social tester pattern. My brother, Jon, is the classic social tester.) Warning: Social testers can get lazy and seem like they are mooching off of other people’s hard work. Also, they can socialize too much, at the expense of the work.
  • Empathic Tester. Empathic testers immerse themselves in the product. Their primary method is to empathize with the users. This is not quite the same as being a user expert, since there’s an important difference between being a tester who advocates for users and a user who happens to test. This is so different from my style that I have not recognized, nor respected, this pattern until recently. People with a non-technical background often adopt this pattern, and sometimes also the administrative or social tester pattern, too. Warning: Empathic testers typically have a difficult time putting into words what they do and how they do it.
  • User Expert. Notice I did not say “user tester.” User experts may be called domain experts or subject matter experts. They do not see themselves as testers, but as potential users who are helping out in a testing role. An expert tester can make tremendous use of user experts. Warning: User experts, not having a tester identity, tend not to study or develop deep testing skills.
  • Developer. Developers often test. They are ideally situated for unit testing, and they create testability in the products they design. A technical tester can benefit by spending time as a developer, and when a developer comes into testing, he is usually a technical tester. Warning: Developers, not having a tester identity, tend not to study or develop deep testing skills.

When I’m sizing up a tester during coaching. I find it useful to think in terms of these categories, so that I can more efficiently guess his strengths and weaknesses and be of service.

Do you think I have missed a category? Do you think I have de-composed them poorly? Make your case in the comments.

 

A Tester’s Commitments

This is the latest version of the commitments I make when I work with a programmer.

Dear Programmer,

 

My job is to help you look good. My job is to support you as you create quality; to ease that burden instead of adding to it. In that spirit, I make the following commitments to you.

 

Sincerely,

 

Tester

  1. I provide a service. You are an important client of that service. I am not satisfied unless you are satisfied.
  2. I am not the gatekeeper of quality. I don’t “own” quality. Shipping a good product is a goal shared by all of us.
  3. I will test your code as soon as I can after you deliver it to me. I know that you need my test results quickly (especially for fixes and new features).
  4. I will strive to test in a way that allows you to be fully productive. I will not be a bottleneck.
  5. I’ll make every reasonable effort to test, even if I have only partial information about the product.
  6. I will learn the product quickly, and make use of that knowledge to test more cleverly.
  7. I will test important things first, and try to find important problems. (I will also report things you might consider unimportant, just in case they turn out to be important after all, but I will spend less time on those.)
  8. I will strive to test in the interests of everyone whose opinions matter, including you, so that you can make better decisions about the product.
  9. I will write clear, concise, thoughtful, and respectful problem reports. (I may make suggestions about design, but I will never presume to be the designer.)
  10. I will let you know how I’m testing, and invite your comments. And I will confer with you about little things you can do to make the product much easier to test.
  11. I invite your special requests, such as if you need me to spot check something for you, help you document something, or run a special kind of test.
  12. I will not carelessly waste your time. Or if I do, I will learn from that mistake.

(This is cool! Yong Goo Yeo has created a Prezi of this.)

An Important Comment

This comment came in from Stuart Taylor. I think it’s important enough that I should add this to the post:

Hi James,

I’m not sure if this is meant to be a little tongue in cheek, but that’s how I read it. That said, I fell at point 1. “I provide a service” really?

Yes, really. This is not a joke. And point #1 is the most important point.

This implies there may be alternative service providers, and if my customer relationship management isn’t up to snuff, I could lose this “client”. Already I feel subservient, I’m an option.

Of course there are alternative service providers. (There are alternative programmers as well, but I’m not concerned with that, here. I’m making commitments that have to do with me and my role, here.) And yes, of COURSE you may lose your client. Actually that’s a lot of what “Agile” has done: fired the testers. In their place, we often find quasi-tester/quasi-programmers who are more concerned with fitting in and poking around tools than doing testing.

Testing is not programming. Programmers are not testers. If you mix that, you do a sorry job of both. In other words: I am a programmer and a tester, but not a good programmer at the same time that I’m a good tester.

Please meditate on the difference between service and subservience. I am a servant and I am proud of that. I am support crew. I spent my time as a production programmer and I’m glad I don’t do that any more. I don’t like that sort of pressure. I like to serve people who will take that pressure on my behalf.

This doesn’t make me a doormat. Nobody wipes their feet on me– I clean their feet. There’s a world of difference. Good mothers know this difference better than anyone.

Personally, when I work on a software  project, I want to feel like I’m part of the team. I need the developer, and the developer wants me. We work together to bake quality in from the start, and not try and sprinkle it on at the end as part of a “service” that I offer.

What strange ideas you have about service. You think people on the same team don’t serve each other? You think services are something “sprinkled” on the end? Please re-think this.

I want to be on the same team, too. The difference, maybe, is that I require my colleagues to choose me. I don’t rap on the door and say “let me in I belong here! I like to be on the team!” (You can’t just “join” a team. The easier it is to join a team, the less likely that it is actually a team, I think. Excellent teams must gel together over time. I want to speed that process– hence my commitments) Instead, I am invited. I insist that it be an invitation. And how I get invited quickly is by immediately dissolving any power struggle that the programmers may worry about, then earning their respect through the fantastic quality of my work.

You can, of course, demand respect. Then you will fail. Then five years later, you will realize true respect cannot be demanded. (That’s what I did. Ask anyone about what a terror I was when I worked at Apple Computer. If you are less combative or more intelligent than I am, you may make this transition in less time.)

I may have actually agreed with your points before I was exposed to Continuous Integration, because that’s how my relationship used to be; hence me interpreting this as a light hearted piece. However I know that this kind of relationship still exists today (its here at this place). When I began working with continuous integration, the traditional role of the tester, with a discrete testing phase became blurred, as the test effort is pushed back up stream towards the source. If the tester and developer share a conversation about “how can we be sure we build the right thing, and how can we ensure we built it right” before the code gets cut, and the resulting tests from that are used to help write the code (TDD, BDD, SBE), then both of us can have a nice warm fuzzy feeling about the code being of good quality. The automation removes the repetition (and reduces the feedback) to maintain that assertion.

First, your experience has nothing to do with “continuous integration.” Continuous integration is a concept. Your experience is actually related to people. Specific people are collaborating with you in specific satisfying ways, probably because it is their pleasure to do so. I have had wonderful and satisfying professional relationships on a number of projects– mostly at Borland, where we were not doing continuous integration, and where testers and programmers cooperated freely, despite being organized into separate administrative entities.

(Besides, I have to say… Testing cannot be automated. Automation doesn’t remove repetition, it changes it. Continuous integration, as I see it, is another way of forcing your customers do the testing, while placing naive trust in the magic of bewildering machinery. I have yet to hear about continuous integration from anyone who seemed to me to be a student of software testing. Perhaps that will happen someday.)

In any case, nothing I’m saying here prevents or discourages collaboration. What it addresses are some of the unhealthy things that discourage collaboration between a tester and a programmer. I’m not sure if in your role you are doing testing. If you want to be a programmer or a designer or manager or cheerleader or hanger-on, then do that. However, testing is inherently about critical thinking, and therefore it always carries a risk (in the same way that being a dentist does) that we may touch a nerve with our criticism. This is the major risk I’m trying to mitigate.

[Note: Michael Bolton pointed out to me the “warm fuzzy” line. I can’t believe I let that go by when I wrote this reply, earlier. Dude! Stuart! If you are in the warm fuzzy business, you are NOT in the testing business. My goal is not anything to do with warm fuzzy thinking. My goal is to patrol out in the cold and shine my bright bright torches into the darkness. On some level, I am never satisfied as a tester. I’m never sure of anything. That’s what it MEANS to be a professional tester, rather than an amateur or a marketer. Other people think positively. Other people believe. The warmth and fuzziness of belief is not for us, man. Do not pollute your testing mentality with that.]

My confusion over this being a tongue in cheek post is further compounded by point 2. Too many testers do believe that they own quality. They become the quality gate keeper, and i think they enjoy it. The question “can this go live” is answered by the tester. The tester may be unwilling to relinquish that control, because they perceive that as a loss of power.

Just as excellent testers must understand they provide a service, they must also understand that they are not “quality” people. They simply hold the light so that other people can work. Testers do not create quality. If you create quality, you are not being a tester. You are being an amateur programmer/designer/manager. That’s okay, but my commitment list is meant for people who see themselves as testers.

I guess what i’m fumbling about with here, is that some testers will read this, and sign up to it. Maybe even print it out and put it up on the wall.

I hope they do. This list has served me well.

Either way, you have written another thought provoking piece James, but i cant help wondering about the back story. What prompted you to write it?

Stuart – sat in the coffee shop, but with laptop and wifi, not a notepad 😉

This piece was originally prompted when I took over a testing and build team for SmartPatents, in 1997. I was told that I would decide when the software was good enough. I called a meeting of the programmers and testers and distributed the first version of this document in order to clarify to the programmers that I was not some obstacle or buzzing horsefly, but rather a partner with them in a SERVICE role.

That was also the first time I gave up my office and opted to sit in a low cubicle next to the entrance that no one else wanted– the receptionist’s cubicle. It was perfect for my role. Programmers streamed by all day and couldn’t avoid me.

I think this piece is also a reaction to the Tester’s Bill of Rights (the one by Gilb, not the one by Bernie Berger, although I have my concerns about that one, too), which is one of the most arrogant pieces of crap I’ve ever seen written on behalf of testers. I will not link to it.

And now that I think of it, I wrote this a year after I hired a personal assistant, and two months later had her promoted to be my boss, even though she retained almost the same duties. The experience of working for someone whose actual day-to-day role was to serve me– but who felt powerful and grateful as my manager– helped me understand the nature of service and leadership in a new way.

Finally, a current that runs through my life is my experience as a husband and father. I am very powerful. I can make my family miserable if I so choose. I choose to serve them, instead.

Technique: Paired Exploratory Survey

I named a technique the other day. It’s another one of those things I’ve been doing for a while, but only now has come crisply into focus as a distinct heuristic of testing: the Paired Exploratory Survey (PES).

Definition: A paired exploratory survey is a process whereby two testers confront one product at the same time for the purpose of learning a product, preparing for formal testing, and/or characterizing its quality as rapidly as possible, whereby one tester (the “driver”) is responsible for open-ended play and all direct interaction with the product while the other tester (the “navigator” or “leader”) acts as documentarian, mission-minder, and co-test-designer.

Here’s a story about it..

Last week, I was on my way home from the CAST conference with my 17 year-old son Oliver when a client called me with an emergency assignment: “Get down to L.A. and test our product right away!” I didn’t have time to take Oliver home, so we bought some clean clothes, had Oliver’s ID flown in from Orcas Island by bush plane, and headed to SeaTac.

(I love emergencies. They’re exciting. It’s like James Bond, except that my Miss Moneypenny is named Lenore. I got to the airport and two first class tickets were waiting for us. However, a gentle note to potential clients: making me run around like a secret agent can be expensive.)

This is the first time I had Oliver with me while doing professional testing, so I decided to make use of him as an unpaid intern. Basically, this is the situation any tester is in when he employs a non-tester, such as a domain expert, as a partner. In such situations, the professional tester must assure that the non-tester is strongly engaged and having good fun. That’s why I like to make that “honorary tester” drive. I get them twiddling the knobs, punching the buttons, and looking for trouble. Then they’ll say “testing is fun” and help me the next time I ask.

(Oliver is a very experienced video gamer. He has played all the major offline games since he was 3 or 4, and the online ones for the last 5 years. I know from playing with him what this means: he can be relentless once he decides to figure out how a system works. I was hoping his gamer instinct would kick in for this, but I was also prepared for him to get bored and wander off. You shouldn’t set your expectations too high with teenagers.)

The client gave us a briefing about how the device is used. I have already studied up on this, but it’s new for Oliver. The scene reminded me of that part in the movie Inception where Leonardo DiCaprio explains the dynamics of dream invasion.We have a workstation that controls a power unit and connects to a probe which is connected to a pump. It all looks Frankenstein-y.

(I can’t tell you much about the device, in this case. Let’s just say it zaps the patient with “healing energy” and has nothing whatsoever to do with weaponized subconscious projections.)

I set up a camera so that all the testing would be filmed.

(Video is becoming an indispensable tool in my work. My traveling kit consists of a little solid state Sony cam that plugs into the wall so I don’t have to worry about battery life, a micro-tripod so I can pose the camera at any desired angle, and a terabyte hard drive which stores all the work.)

Then, I began the testing just to demonstrate to Oliver the sort of thing I wanted to do. We would begin with a sanity check of the major functions and flows, while letting ourselves deviate as needed to pursue follow-up testing on anything we find that was anomalous. After about 15 minutes, Oliver became the driver, I became the navigator, and that’s how we worked for the next 6 or 7 hours.

Oliver quickly distinguished himself as as a remarkable observer. He noticed flickers on the screen, small changes over time, quirks in the sound the device made. He had a good memory for what he had just been doing, and quickly constructed a mental model of the product.

From the transcript:

“What?!…That could be a problem…check this out…dad…look, right now…settings, unclickable…start…suddenly clickable, during operation…it’s possible to switch its entire mode to something else, when it should be locked!”

and later

“alright… you can’t see the error message every single time because it’s corrupted… but the error message… the error message is exactly what we were seeing before with the sequence bug… the error message comes up for a brief moment and then BOOM, it’s all gone… it’s like… it makes the bug we found with the sequence thing (that just makes it freeze) destructive and takes down the whole system… actually I think that’s really interesting. It’s like this bug is slightly more evolved…”

(You have to read this while imagining the voice of a triumphant teenager who’s just found an easter egg in HALO3. From his point of view, he’s finding ways to “beat the boss of the level.”)

At the start, I frequently took control of the process in order to reproduce the bugs, but as I saw Oliver’s natural enthusiasm and inquisitiveness blossom, I gave him room to run. I explained bug isolation and bug risk and challenged him to find the simplest, yet most compelling form of each problem he uncovered.

Meanwhile, I worked on my notes and noted time stamps of interesting events. As we moved along, I would redirect him occasionally to collect more evidence regarding specific aspects of the evolving testing story.

How is this different from ordinary paired testing?

Paired testing simply means two testers testing one product on the same system at the same time. A PES is a kind of paired testing.

Exploratory testing means an approach to testing whereby learning, test design, and test execution are mutually supportive activities that run in parallel. A PES is exploratory testing, too.

A “survey session,” in the lingo of Session-Based Test Management, is a test session devoted to learning a product and characterizing the general risks and challenges of testing it, while at the same time noticing problems. A survey session contrasts with analysis sessions, deep coverage sessions, and closure sessions, among possible others that aren’t yet identified as a category. A PES is a survey test session.

It’s all of those things, plus one more thing: the senior tester is the one who takes the notes and makes sure that the right areas are touched and right general information comes out. The senior tester is in charge of developing a compelling testing story. The senior tester does that so that his partner can get more engaged in the hunt for vital information. This “hunt” is a kind of play. A delicious dance of curiosity and analysis.

There are lots of ways to do paired testing. A PES is one interesting way.

Hey, I’ve done this before!

While testing with my son, I flashed back to 1997, in one of my first court cases, in which I worked with my brother Jon (who is now a director of testing at eBay, but was then a cub tester). Our job was to apply my Good Enough model of quality analysis to a specific product, and I let Jon drive that time, too. I didn’t think to give a name to that process, at the time, other than ET. The concept of paired testing hadn’t even been named in our community until Cem Kaner suggested that we experiment with it at the first Workshop on Heuristic and Exploratory Techniques in 2001.

I have seen different flavors of a PES, too. I once saw a test lead who stepped to the keyboard specifically because he wanted his intern to design the tests. He felt that that letting the kid lean back in his chair and talk ideas to the ceiling (as he was doing when I walked in) would be the best way to harness certain technical knowledge the intern had which the test lead did not have. In this way, the intern was actually the driver.

I’m feeling good about the name Paired Exploratory Survey. I think it may have legs. Time will tell.

Here’s the report I filed with the client (all specific details changed, but you can see what the report looks like, anyway).

Introducing Thread-Based Test Management

Most of the testing world is managed around artifacts: test cases, test documents, bug reports. If you look at any “test management” tool, you’ll see that the artifact-based approach permeates it. “Test” for many people is a noun.

For me test is a verb. Testing is something that I do, not so much something that I create. Testing is the act of exploration of an unknown territory. It is casting questions, like Molotov cocktails, into the darkness, where they splatter and burst into bright revealing fire.

How to Manage Such a Process?

My brother Jon and I created a way to control highly exploratory testing 10 years ago, called session-based test management (SBTM). I recently returned from an intense testing project in Israel, where I used SBTM. But I also experimented with a new idea: thread-based test management (TTM).

Like many of my new ideas, it’s not really new. It’s the christening (with words) and sharpening (with analysis) of something many of us already do. The idea is this: organize management around threads of activity rather than test sessions or artifacts.

Thread-based testing is a generalized form of session-based testing, in that sessions are a form of thread, but a thread is not necessarily a session. In SBTM, you test in uninterrupted blocks of time that each have a charter. A charter is a mission for that session; a light sort of commitment, or contract. A thread, on the other hand, may be interrupted, it may go on and on indefinitely, and does not imply a commitment. Session-based testing can be seen as an extension of thread-based testing for especially high accountability and more orderly situations.

I define a thread as a set of one or more activities intended to solve a problem or achieve an objective. You could think of a thread as a very small project within a project.

Why Thread-Based Test Management?

Because it can work under even the most chaotic and difficult conditions. The only formalism required for TBTM is a list of threads. I use this form of test management when I am dropped into a project with as little a day or two to get it done.

What Does Thread-Based Test Management Looks Like?

It’s simple. Thread-based test management looks like a todo list, except that we organize the todo items into an outline that matches the structure of the testing process. Here’s a mocked-up example:

Test Facilities

  • Power meter calibration method
  • Backup test jig validation
  • Create standard test images

Test Strategy

  • Accuracy Testing
    • Sampling strategy
    • Preliminary-testing
    • Log file analysis program
  • Transaction Flow Testing
  • Essential Performance Testing
  • Safety Testing
    • warnings and errors FRS review
    • tool for forcing errors
  • Compliance Testing
  • Test Protocol V1.0 doc.

Test Management

  • Change protocol definition
  • Build protocol definition
  • Test cycle protocol definition
  • Bug reporting protocol definition
  • Bug triage
  • Fix verifications

This outline describes the high level threads that comprise the test project. I typically use a mind-mapping program like MindManager to organize and present them.

So, you should be thinking, “Is that it? Todo lists?” right about now. Well, no. That’s not it. But that’s one face of it.

What Else Does Thread-Based Test Management Look Like?

It looks like testers gathered around a todo list, talking about what they are going to work on that afternoon. Then they split up and go to work. Several times day they might come together like that. If the team is not co-located, then this meeting is done over instant messaging, email, or perhaps through a wiki.

Is That All it Looks Like?

Well, there is also the status report. Whether written or spoken, the thread-based test management version of a status report lists the threads, who is working on the threads, and the outlook for each thread. It typically also includes an issues list.

Other documentation may be produced, of course. TBTM doesn’t tell you what documents to create. It simply tells you that threads are the organizing principle we use for managing the project.

Where Do Threads Come From?

Threads are first spawned from our model of the testing problem. The Satisfice Heuristic Test Strategy Model is an example of such a model. By working through those lists, we get an idea of the kinds of testing we might want to do: those are the first of the threads. After that, threads might be created in many ways, including splitting off of existing threads as we gain a deeper understanding of what testing needs to be done. Of course, in an Agile environment, each user story kicks off a new testing thread.

Which Threads Do We Work On?

Think priority and progress. We might frequently drop threads, switch threads, and pick them up again. In general, we work on the highest priority threads, but we also work on lower priority threads many times, when we see the possibility for quick and inexpensive progress. If I’m trying to finish a sanity check on the new build, I might interrupt that to discuss the status of a particular known bug if the developer happens to wander by.

Major ongoing threads often become attached to specific people. For instance “client testing” or “performance testing” often become full-time jobs. Testing itself, after all, can be thought of as a thread so challenging to do well, and so different from programming, that most companies have seen fit to hire dedicated testers.

How Do Threads End?

A thread ends either in a cut or knot. Cutting a thread means to cancel that task. A knot, however, is a milestone; an achievement of some kind. This is exactly the meaning of the phrase “tying up the loose ends” and marks either the end of the thread (or group of threads) or a good place to drop it for a while.

How Do We Estimate Work?

In thread-based test management, there is no special provision or method for estimating work, except that this is done on a thread-by-thread basis. Session-based test management may be overlaid onto TBTM in order to estimate work in terms of sessions.

How Do We Evaluate Progress?

In thread-based test management, there is no special provision or method for evaluating progress, either, except that this is done on a thread-by-thread basis, and status reports may be provided frequently, perhaps at the end of each day. Session-based test management is also helpful for that.

So What?

This form of management is actually quite common. But, to my knowledge, no one has yet named and codified it. Without a convenient way to talk about it, we have a hard time explaining and justifying it. Then when the “process improvement” freaks come along, they act like there’s no management happening at all. This form of management has been “illegible” up to now (meaning that it’s there but no one notices it) and my brother and I are going to push to make it fully legible and respectable in the testing arena.

From now on, when asked about my approach to test management, I can say “I practice Rapid Testing methodology, which I track in either a thread-based or session-based manner, depending on the stage of the project, and in a risk-based manner at all times.”

How is TBTM Any Different From Using a TO-DO List?

Michel Kraaij questions the substance of TBTM by wondering how it’s different from the age-old idea of a “to-do” list? See his post here.

This is a good question. Yes, TBTM is different than just using a to-do list, but even so, I don’t think I’ve ever read an article about to-do list based test management (TDBTM?). Most textbooks focus on artifacts, not the activity of testing. Thread-based test management is trying to capture the essence of managing with to-do lists, plus some other things in addition to that.

The main additional element, beyond just making a to-do list, is that a traditional to-do list contains items that can be “done”, whereas many threads might not ever be “done.” They might be cut (abandoned) or knotted (temporarily parked at some level of completion). Some threads maybe tied up with a bow and “done” like a normal task, but not the main ones that I’m thinking of. As I practice testing, for instance, I’m rarely “done” with test strategy. I tinker with the test strategy all the way through the project. That’s why it makes sense to call it a thread.

Once again: Thread-based management is not focused on getting things “done.” In this way it is different from KanBan, Scrum, ToDo lists, Session-based test management,  etc., all of which are big into workflow management of definite units of work.

Another thing to recognize is that the main concern of TBTM is how to know what to put on your thread list. The answer to that invokes the entire framework of Rapid Software testing. So, yeah, it’s more than having an outline of threads, which does look very much like a to-do list– it’s the activity (and skills) of making the list and managing it. If you want to talk about to-do list based test management, then you would have to invent that lore as well. You couldn’t just say “make a to-do list” and claim to have communicated the methodology.

[You can find Jonathan’s take on TBTM here.]

[I credit Sagi Krupetski, the test lead on my recent project, for helping me get this idea. His clockwork status reporting and regular morning question “Where are we on the project and what do you think you need to work on today?” caused me to see the thread structure of our project clearly for the first time. He’s back on the market now (Chicago area), in case you need a great tester or test manager.]

Michelle Smith: True Test Leadership

I’m delighted to read Michelle Smith’s play-by-play description of how she is coaching new testers. Take a look.

Let me catalog the coolnesses:

1. “The team I work with was previously exposed to Rapid Software Testing. This exposure caused me to wonder what would happen if these new folks were exposed to some of these ideas early on?”

Notice that she uses the word “wonder”? That’s the attitude I hope to foster in people who take my class. It’s an attitude of curiosity and personal responsibility. She doesn’t speak of applying practices as if the people on the team were milling machines waiting to be programmed. She implies that her testers are learners under their own control. Her attitude is one of establishing a productive but not coercive  relationship.

I don’t know if she got this from my class– probably she had it beforehand– but it’s an attitude I share with her.

2. “I went in their shared office and opened up a five minute conversation with them by asking “What is a bug?” and following that with “who are the people that matter?”.”

Michelle mentions “five minute conversations” a few times. And notice how most of her interactions were in the form of puzzles and questions. It speaks of a light touch with coaching. Light touch is good. Especially with novice testers, experience speaks louder than lecture. Introduce an idea then try it, or try something first, then talk about the idea. Either way, I like how she was getting them working.

3. She had them practice explaining themselves, both in writing and by voice.

4. She was concerned more with the cognitive parts of testing than with the artifacts. That’s good because excellent testing artifacts, such as bug reports and test cases, come from the thinking of the testers. Think well and the rest will follow.

5. She has them work on several aspects of testing. Notice how she deals with oracles, tools, mission, the social process and gaining product knowledge.

I bet what Michelle is doing will lead to better, more passionate testers, and more dynamic, flexible testing. Compare this to what we see so often in our industry: testers simply told to sit down and create test cases. Look, even if you think pre-defined test cases make for great testing, I think to be successful with that you have to base it on skilled and knowledgeable testers. Michelle is creating that foundation with her team.

Overall, what I’m most happy with is that Michelle has made Rapid Testing her own thing. This is vital. This is fundamental to the spirit of my teaching. I want to grow colleagues who confidently think for themselves. Hats off to you, Michelle, for doing that and blogging about it.

Finally, Michelle writes, “I have no idea if what I am doing is going to produce any benefits to them, to the team, or to the stakeholders. Time will tell.”

No best practices nonsense, here. No certification mentality. Just healthy skepticism. Thank you, Michelle!

Yaron Sinai Says Stop Thinking, Stupid Tester

The Factory School is that community of process people who believe testing benefits from eliminating the human element as much as possible. They wish to mechanize testing, and to condition the humans within it to see themselves as machines and emulate machines as much as possible. It’s an idea that has a number of advantages, with the important caveat that it makes good software testing impossible by leaving no room in the process for skill and thinking.

Sometimes when I complain about the Factory School of software testing, people think I’m exaggerating or making it up.

But check out this quote from Yaron Sinai, CEO of Elementools:

“With Test Case, the team doesn’t think,” Sinai said. “They just need to follow the steps. And for you, as a testing manager, you know that once they completed a set of tests, you know they followed the steps that needed to be followed.”

At first when I saw this, I thought it was a joke news story. Apparently not. (For the sake of Mr. Sinai, I hope he was misquoted. I will gladly post a retraction if that is the case.)

The man’s tool is called Test Case, which is emblematic right there. It’s focus is on test cases, not good testing. His view of test cases appears to be test procedure steps, and he wants testers to stop thinking, dammit, and just follow steps. Like factory robots. Robots that don’t question or talk back. Nice. Saaafe. Rooooooboootttts.

I haven’t found much information about Mr. Sinai online. He apparently hasn’t written about testing, per se. At least he is consistent: he has contributed no ideas to the testing craft, and now he sells an idea prevention system in the guise of a test management tool.

I want to ask Mr. Sinai whether, as a CEO, he faithfully follows his “CEO cases” each day, written for him by other, smarter CEO’s. Or does he, gasp, think for himself? I bet he would reply that although he is a smart CEO, not all CEO’s are smart enough to make their own decisions, and thus it’s only reasonable that their work should be scripted. When I suggest that a minimum requirement to be a CEO should be the ability to think about business problems and make decisions on their own behalf, I’m sure he will say “that’s just not practical.”  By which he will mean, of course, that HE doesn’t know how to do it.

The Factory School promotes the antithesis of engineering, while often using the word engineering as if it were some corpse impaled on a stick. Their approach to managing an engineering process is to kill it. Indeed, a dead process, like a dead horse, is much easier to manage once you get used to the smell.

And a lot of top managers buy this crap because the demos are simple, they are unaware of alternatives that would actually help, and the perfume on their cravats tends to mask the stench of tool vendors who don’t know anything about the craft.