We. Use. Tools.

Context-Driven testers use tools to help ourselves test better. But there is no such thing as test automation.

Want details? Here’s the 10,000 word explanation that Michael Bolton and I have been working on for months.

Editor’s Note: I have just posted version 1.03 of this article. This is the third revision we have made due to typos. Isn’t it interesting how hard it is to find typos in your own work before you ship an article? We used automation to help us with spelling, of course, but most of the typos are down to properly spelled words that are in the wrong context. Spelling tools can’t help us with that. Also, Word’s spell-checker still thinks there are dozens of misspelled words in our article, because of all the proper nouns, terms of art, and neologisms. Of course, there are grammar-checking tools, too, right? Yeah… not really. The false positive rate is very high with those tools. I just did a sweep through every grammar problem the tool reported. Of the five problems it thinks it found, only one, a missing hyphen, is plausibly a problem. The rest are essentially matters of writing style.

One of the lines it complained about is this: “The more people who use a tool, the more free support will be available…” The grammar checker thinks we should not say “more free” but rather “freer.” This may be correct, in general, but we are using parallelism, a rhetorical style that we feel outweighs the general rule about comparatives. Only humans can make these judgments, because the rules of grammar are sometimes fluid.

Exploratory Testing 3.0

[Authors’ note: Others have already made the point we make here: that exploratory testing ought to be called testing. In fact, Michael said that about tests in 2009, and James wrote a blog post in 2010 that seems to say that about testers. Aaron Hodder said it quite directly in 2011, and so did Paul Gerrard. While we have long understood and taught that all testing is exploratory (here’s an example of what James told one student last year), we have not been ready to make the rhetorical leap away from pushing the term “exploratory testing.” Even now, we are not claiming you should NOT use the term, only that it’s time to begin assuming that testing means exploratory testing, instead of assuming that it means scripted testing that also has some degree of exploration in it.]

[Second author’s note: Some people start reading this with a narrow view of what we mean by the word “script.” We are not referring to text! By “script” we are speaking of any control system or factor that influences your testing and lies outside of your realm of choice (even temporarily). This includes text instructions, but also any form of instructions, or even biases that are not instructions.]

By James Bach and Michael Bolton

In the beginning, there was testing. No one distinguished between exploratory and scripted testing. Jerry Weinberg’s 1961 chapter about testing in his book, Computer Programming Fundamentals, depicted testing as inherently exploratory and expressed caution about formalizing it. He wrote, “It is, of course, difficult to have the machine check how well the program matches the intent of the programmer without giving a great deal of information about that intent. If we had some simple way of presenting that kind of information to the machine for checking, we might just as well have the machine do the coding. Let us not forget that complex logical operations occur through a combination of simple instructions executed by the computer and not by the computer logically deducing or inferring what is desired.”

Jerry understood the division between human work and machine work. But then the formalizers came and confused everyone. The formalizers—starting officially in 1972 with the publication of the first testing book, Program Test Methods—focused on the forms of testing, rather than its essences. By forms, we mean words, pictures, strings of bits, data files, tables, flowcharts and other explicit forms of modeling. These are things that we can see, read, point to, move from place to place, count, store, retrieve, etc. It is tempting to look at these artifacts and say “Lo! There be testing!” But testing is not in any artifact. Testing, at the intersection of human thought processes and activities, makes use of artifacts. Artifacts of testing without the humans are like state-of-the-art medical clinics without doctors or nurses: at best nearly useless, at worst a danger to the innocents who try to make use of them.

We don’t blame the innovators. At that time, they were dealing with shiny new conjectures. The sky was their oyster! But formalization and mechanization soon escaped the lab. Reckless talk about “test factories” and poorly designed IEEE standards followed. Soon all “respectable” talk about testing was script-oriented. Informal testing was equated to unprofessional testing. The role of thinking, feeling, communicating humans became displaced.

James joined the fray in 1987 and tried to make sense of all this. He discovered, just by watching testing in progress, that “ad hoc” testing worked well for finding bugs and highly scripted testing did not. (Note: We don’t mean to make this discovery sound easy. It wasn’t. We do mean to say that the non-obvious truths about testing are in evidence all around us, when we put aside folklore and look carefully at how people work each day.) He began writing and speaking about his experiences. A few years into his work as a test manager, mostly while testing compilers and other developer tools, he discovered that Cem Kaner had coined a term—”exploratory testing”—to represent the opposite of scripted testing. In that original passage, just a few pages long, Cem didn’t define the term and barely described it, but he was the first to talk directly about designing tests while performing them.

Thus emerged what we, here, call ET 1.0.

(See The History of Definitions of ET for a chronological guide to our terminology.)

ET 1.0: Rebellion

Testing with and without a script are different experiences. At first, we were mostly drawn to the quality of ideas that emerged from unscripted testing. When we did ET, we found more bugs and better bugs. It just felt like better testing. We hadn’t yet discovered why this was so. Thus, the first iteration of exploratory testing (ET) as rhetoric and theory focused on escaping the straitjacket of the script and making space for that “better testing”. We were facing the attitude that “Ad hoc testing is uncontrolled and unmanageable; something you shouldn’t do.” We were pushing against that idea, and in that context ET was a special activity. So, the crusaders for ET treated it as a technique and advocated using that technique. “Put aside your scripts and look at the product! Interact with it! Find bugs!”

Most of the world still thinks of ET in this way: as a technique and a distinct activity. But we were wrong about characterizing it that way. Doing so, we now realize, marginalizes and misrepresents it. It was okay as a start, but thinking that way leads to a dead end. Many people today, even people who have written books about ET, seem to be happy with that view.

This era of ET 1.0 began to fade in 1995. At that time, there were just a handful of people in the industry actively trying to develop exploratory testing into a discipline, despite the fact that all testers unconsciously or informally pursued it, and always have. For these few people, it was not enough to leave ET in the darkness.

ET 1.5: Explication

Through the late ‘90s, a small community of testers beginning in North America (who eventually grew into the worldwide Context-Driven community, with some jumping over into the Agile testing community) was also struggling with understanding the skills and thought processes that constitute testing work in general. To do that, they pursued two major threads of investigation. One was Jerry Weinberg’s humanist approach to software engineering, combining systems thinking with family psychology. The other was Cem Kaner’s advocacy of cognitive science and Popperian critical rationalism. This work would soon cause us to refactor our notions of scripted and exploratory testing. Why? Because our understanding of the deep structures of testing itself was evolving fast.

When James joined ST Labs in 1995, he was for the first time fully engaged in developing a vision and methodology for software testing. This was when he and Cem began their fifteen-year collaboration. This was when Rapid Software Testing methodology first formed. One of the first big innovations on that path was the introduction of guideword heuristics as one practical way of joining real-time tester thinking with a comprehensive underlying model of the testing process. Lists of test techniques or documentation templates had been around for a long time, but as we developed vocabulary and cognitive models for skilled software testing in general, we started to see exploratory testing in a new light. We began to compare and contrast the important structures of scripted and exploratory testing and the relationships between them, instead of seeing them as activities that merely felt different.

In 1996, James created the first testing class called “Exploratory Testing.”  He had been exposed to design patterns thinking and had tried to incorporate that into the class. He identified testing competencies.

Note: During this period, James distinguished between exploratory and ad hoc testing—a distinction we no longer make. ET is an ad hoc process, in the dictionary sense: ad hoc means “to this; to the purpose”. He was really trying to distinguish between skilled and unskilled testing, and today we know better ways to do that. We now recognize unskilled ad hoc testing as ET, just as unskilled cooking is cooking, and unskilled dancing is dancing. The value of the label “exploratory testing” is simply that it is more descriptive of an activity that is, among other things, ad hoc.

In 1999, James was commissioned to define a formalized process of ET for Microsoft. The idea of a “formal ad hoc process” seemed paradoxical, however, and this set up a conflict which would be resolved via a series of constructive debates between James and Cem. Those debates would lead to what we here will call ET 2.0.

There was also progress on making ET more friendly to project management. In 2000, inspired by the work for Microsoft, James and Jon Bach developed “Session-Based Test Management” for a group at Hewlett-Packard. In a sense this was a generalized form of the Microsoft process, with the goal of creating a higher level of accountability around informal exploratory work. SBTM was intended to help defend exploratory work from compulsive formalizers who were used to modeling testing in terms of test cases. In one sense, SBTM was quite successful in helping people to recognize that exploratory work was entirely manageable. SBTM helped to transform attitudes from “don’t do that” to “okay, blocks of ET time are things just like test cases are things.”

By 2000, most of the testing world seemed to have heard something about exploratory testing. We were beginning to make the world safe for better testing.

ET 2.0: Integration

The era of ET 2.0 has been a long one, based on a key insight: the exploratory-scripted continuum. This is a sliding bar on which testing ranges from completely exploratory to completely scripted. All testing work falls somewhere on this scale. Having recognized this, we stopped speaking of exploratory testing as a technique, but rather as an approach that applies to techniques (or as Cem likes to say, a “style” of testing).

We could think of testing that way because, unlike ten years earlier, we now had a rich idea of the skills and elements of testing. It was no longer some “creative and mystical” act that some people are born knowing how to do “intuitively”. We saw testing as involving specific structures, models, and cognitive processes other than exploring, so we felt we could separate exploring from testing in a useful way. Much of what we had called exploratory testing in the early ’90s we now began to call “freestyle exploratory testing.”

By 2006, we settled into a simple definition of ET: simultaneous learning, test design, and test execution. To help push the field forward, James and Cem convened a meeting called the Exploratory Testing Research Summit in January 2006. (The participants were James Bach, Jonathan Bach, Scott Barber, Michael Bolton, Elisabeth Hendrickson, Cem Kaner, Mike Kelly, Jonathan Kohl, James Lyndsay, and Rob Sabourin.) As we prepared for that, we made a disturbing discovery: every single participant in the summit agreed with the definition of ET, but few of us agreed on what the definition actually meant. This is a phenomenon we had no name for at the time, but which is now called shallow agreement in the CDT community. To combat shallow agreement and promote better understanding of ET, some of us decided to adopt a more evocative and descriptive definition of it, proposed originally by Cem and later edited by several others: “a style of testing that emphasizes the freedom and responsibility of the individual tester to continually optimize the quality of his work by treating test design, test execution, test result interpretation, and learning as mutually supporting activities that continue in parallel throughout the course of the project.” Independently of each other, Jon Bach and Michael had suggested the “freedom and responsibility” part of that definition.

And so we had come to a specific and nuanced idea of exploration and its role in testing. Exploration can mean many things: searching a space, being creative, working without a map, doing things no one has done before, confronting complexity, acting spontaneously, etc. With the advent of the continuum concept (which James’ brother Jon actually called the “tester freedom scale”) and the discussions at the ExTRS peer conference, we realized most of those different notions of exploration are already central to testing, in general. What the adjective “exploratory” added, and how it contrasted with “scripted,” was the dimension of agency. In other words: self-directedness.

The full implications of the new definition became clear in the years that followed, as James and Michael taught and consulted in Rapid Software Testing methodology. We now recognize that by “exploratory testing”, we had been trying to refer to rich, competent testing that is self-directed. In other words, in all respects other than agency, skilled exploratory testing is not distinguishable from skilled scripted testing. Only agency matters, not documentation, nor deliberation, nor elapsed time, nor tools, nor conscious intent. You can be doing scripted testing without any scrap of paper nearby (scripted testing does not require that you follow a literal script). You can be doing scripted testing that has not been in any way pre-planned (someone else may be telling you what to do in real-time as they think of ideas). You can be doing scripted testing at a moment’s notice (someone might have just handed you a script, or you might have just developed one yourself). You can be doing scripted testing with or without tools (tools make testing different, but not necessarily more scripted). You can be doing scripted testing even unconsciously (perhaps you feel you are making free choices, but your models and habits have made an invisible prison for you). The essence of scripted testing is that the tester is not in control, but rather is being controlled by some other agent or process. This one simple, vital idea took us years to apprehend!

In those years we worked further on our notions of the special skills of exploratory testing. James and Jon Bach created the Exploratory Skills and Tactics reference sheet to bring specificity and detail to answer the question “what specifically is exploratory about exploratory testing?”

In 2007, another big slow leap was about to happen. It started small: inspired in part by a book called The Shape of Actions, James began distinguishing between processes that required human judgment and wisdom and those which did not. He called them “sapient” vs. “non-sapient.” This represented a new frontier for us: systematic study and development of tacit knowledge.

In 2009, Michael followed that up by distinguishing between testing and checking. Testing cannot be automated, but checking can be completely automated. Checking is embedded within testing. At first, James objected that, since there was already a concept of sapient testing, the distinction was unnecessary. To him, checking was simply non-sapient testing. But after a few years of applying these ideas in our consulting and training, we came to realize (as neither of us did at first) that checking and testing was a better way to think and speak than sapience and non-sapience. This is because “non-sapience” sounds like “stupid” and therefore it sounded like we were condemning checking by calling it non-sapient.

Do you notice how fine distinctions of language and thought can take years to work out? These ideas are the tools we need to sort out our practical decisions. Yet much like new drugs on the market, it can sometimes take a lot of experience to understand not only benefits, but also potentially harmful side effects of our ideas and terms. That may explain why those of us who’ve been working in the craft a long time are not always patient with colleagues or clients who shrug and tell us that “it’s just semantics.” It is our experience that semantics like these mean the difference between clear communication that motivates action and discipline, and fragile folklore that gets displaced by the next swarm of buzzwords to capture the fancy of management.

ET 3.0: Normalization

In 2011, sociologist Harry Collins began to change everything for us. It started when Michael read Tacit and Explicit Knowledge. We were quickly hooked on Harry’s clear writing and brilliant insight. He had spent many years studying scientists in action, and his ideas about the way science works fit perfectly with what we see in the testing field.

By studying the work of Harry and his colleagues, we learned how to talk about the difference between tacit and explicit knowledge, which allows us to recognize what can and cannot be encoded in a script or other artifacts. He distinguished between behaviour (the observable, describable aspects of an activity) and actions (behaviours with intention), a distinction that had inspired James’ distinction between sapient and non-sapient testing. He untangled the differences between mimeomorphic actions (actions that we want to copy and to perform in the same way every time) and polimorphic actions (actions that we must vary in order to deal with social conditions); in doing that, he helped to identify the extents and limits of automation’s power. He wrote a book (with Trevor Pinch) about how scientific knowledge is constructed; another (with Rob Evans) about expertise; yet another about how scientists decide to evaluate a specific experimental result.

Harry’s work helped lend structure to other ideas that we had gathered along the way.

  • McLuhan’s ideas about media and tools
  • Karl Weick’s work on sensemaking
  • Venkatesh Rao’s notions of tempo, which in turn pointed us towards James C. Scott’s notion of legibility
  • The realization (brought to our attention by an innocent question from a tester at Barclays Bank) that the “exploratory-scripted continuum” is actually the “formality continuum.” In other words, to formalize an activity means to make it more scripted.
  • The realization of the important difference between spontaneous and deliberative testing, which is the degree of reflection that the tester is exercising. (This is not the same as exploratory vs. scripted, which is about the degree of agency.)
  • The concept of “responsible tester” (defined as a tester who takes full, personal responsibility for the quality of his work).
  • The advent of the vital distinction between checking and testing, which replaced the need to talk about “sapience” in our rhetoric of testing.
  • The subsequent redefinition of the term “testing” within the Rapid Software Testing namespace to make these things more explicit (see below).

About That Last Bullet Point

ET 3.0 as a term is a bit paradoxical because what we are working toward, within the Rapid Software Testing methodology, is nothing less than the deprecation of the term “exploratory testing.”

Yes, we are retiring that term, after 22 years. Why?

Because we now define all testing as exploratory.  Our definition of testing is now this:

“Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes: questioning, study, modeling, observation and inference, output checking, etc.”

Where does scripted testing fit, then?  By “script” we are speaking of any control system or factor that influences your testing and lies outside of your realm of choice (even temporarily). This does not refer only to specific instructions you are given and that you must follow. Your biases script you. Your ignorance scripts you. Your organization’s culture scripts you. The choices you make and never revisit script you.

By defining testing to be exploratory, scripting becomes a guest in the house of our craft; a potentially useful but foreign element to testing, one that is interesting to talk about and apply as a tactic in specific situations. An excellent tester should not be complacent or dismissive about scripting, any more than a lumberjack can be complacent or dismissive about heavy equipment. This stuff can help you or ruin you, but no serious professional can ignore it.

Are you doing testing? Then you are already doing exploratory testing. Are you doing scripted testing? If you’re doing it responsibly, you are doing exploratory testing with scripting (and perhaps with checking).  If you’re only doing “scripted testing,” then you are just doing unmotivated checking, and we would say that you are not really testing. You are trying to behave like a machine, not a responsible tester.

ET 3.0, in a sentence, is the demotion of scripting to a technique, and the promotion of exploratory testing to, simply, testing.

How Not to Standardize Testing (ISO 29119)

Many years ago I took a management class. One of the exercises we did was on achieving consensus. My group did not reach an agreement because I wouldn’t lower my standards. I wanted to discuss the matter further, but the other guys grew tired of arguing with me and declared “consensus” over my objections. This befuddled me, at first. The whole point of the exercise was to reach a common decision, and we had failed, by definition, to do that– so why declare consensus at all? It’s like getting checkmated in chess and then declaring that, well, you still won the part of the game that you cared about… the part before the checkmate.

Later I realized this is not so bizarre. What they had effectively done is ostracize me from the team. They had changed the players in the game. The remaining team did come to consensus. In the years since, I have found that changing the boundaries or membership of a community is indeed an important pillar of consensus building. I have used this tactic many times to avoid unhelpful debate. It is one reason why I say that I’m a member of the Context-Driven School of Testing. My school does not represent all schools, and the other schools do not represent mine. Therefore, we don’t need consensus with them.

Then what about ISO 29119?

The ISO organization claims to have a new standard for software testing. But ISO 29119 is not a standard for testing. It cannot be a standard for testing.

A standard for testing would have to reflect the values and practices of the world community of testers. Yet the concerns of the Context-Driven School of thought, which has been in development for at least 15 years, have been ignored and our values shredded by this so-called standard and the process used to create it. They have done this by excluding us. There are two organizations explicitly devoted to Context-Driven values (AST and ISST), and our community holds several major conferences a year. Members of our community speak at all the major practitioner conferences, and our ideas are widely cited. Some of the most famous testers in the world, including me, are Context-Driven testers. We exist, and together with the Agilists, we are the source of nearly every new idea in testing in the last decade.

The reason they have excluded us is that they know we won’t agree to any simplistic standard based on templates or simple formulae. We know those things look pretty but they don’t help. If ISO doesn’t exclude us, they worry they will never finish. They know we will challenge their evidence, and even their ethics and basic competence. This is why I say the craft is not ready for standards. It will be years before all the recognized experts in testing can come together and agree on anything substantial.

The people running the ISO effort know exactly who we are. I personally have had multiple public debates with Stuart Reid, on stage. He cannot pretend we don’t exist. He cannot pretend we are some sort of lunatic fringe. Tens of thousands of testers have watched my video lectures or bought my books. This is not a case where ISO can simply declare us to be outsiders.

The Burden of Proof

The Context-Driven community stands for excellence in testing. This is why we must reject this depraved attempt by ISO to grab power and assert control over our craft. Our craft is still an open marketplace of ideas, and it is full of strong debates. We must protect that marketplace and allow it to evolve. I want the fair chance to put my competitors out of business (or get them to change their business) with the high quality of my work. Context-Driven testing has been growing in strength and numbers over the years, whereas this ISO effort appears to be a job protection program for people who can’t stomach debate. They can’t win the debate, so they want to remake the rules.

The burden of proof is not on me or any of us to show that the standard is wrong, nor is it our job to make it right. The burden is on those who claim that the craft can be standardized to study the craft and recognize and resolve the deep differences among us. Failing that, there can be no ethical or rational basis for standardization.

This blog post puts me on record as opposing the ISO 29119 standard. Together with my colleagues, we constitute a determined and sustained and principled opposition.

A Test is a Performance

Testing is a performance, not an artifact.

Artifacts may be produced before, during, or after the act of testing. Whatever they are, they are not tests. They may be test instructions, test results, or test tools. They cannot be tests.

Note: I am speaking a) authoritatively about how we use terms in Rapid Testing Methodology, b) non-authoritatively of my best knowledge of how testing is thought of more broadly within the Context-Driven school, and c) of my belief about how anyone, anywhere should think of testing if they want a clean and powerful way to talk about it.

I may informally say “I created a test.” What I mean by that is that I designed an experience, or I made a plan for a testing event. That plan itself is not the test, any more than a picture of a car is a car. Therefore, strictly speaking, the only way to create a test is to perform a test. As Michael Bolton likes to say, there’s a world of difference between sheet music and a musical performance, even though we might commonly refer to either one as “music.” Consider these sentences: “The music at the symphony last night was amazing.” vs. “Oh no, I left the music on my desk at home.”

We don’t always have to speak strictly, but we should know how and know why we might want to.

Why can’t a test be an artifact?

Because artifacts don’t think or learn in the full human sense of that word, that’s why, and thinking is central to the test process. So to claim that an artifact is a test is like wearing a sock puppet on your hand and claiming that it’s a little creature talking to you. That would be no more than you talking to yourself, obviously, and if you removed yourself from that equation the puppet wouldn’t be a little creature, would it? It would be a decorated sock lying on the floor. The testing value of an artifact can be delivered only in concert with an appropriately skilled and motivated tester.

With procedures or code you can create a check. See here for a detailed look at the difference between checking and testing. Checking is part of testing, of course. Anyone who runs checks that fail knows that the next step is figuring out what the failures mean. A tester must also evaluate whether the checks are working properly and whether there are enough of them, or too many, or the wrong kind. All of that is part of the performance of testing.

When a “check engine” light, or any other strange alert, goes on in your car, you can’t know until you go to a mechanic whether it represents a big problem or a little problem. The check is not testing. The testing is more than the check itself.

But I’ve seen people follow test scripts and only do what the test document tells them to do!

Have you really witnessed that? I think the most you could possibly have witnessed is…

EITHER:

a tester who appeared to do “only” what the test document tells him, while constantly and perhaps unconsciously adjusting and reacting to what’s happening with the system under test. (Such a tester may find bugs, but does so by contributing interpretation, judgment, and analysis; by performing.)

OR:

a tester who necessarily missed a lot of bugs that he could have found, either because the test instructions were far too complex, or far too vague, or there were far too few of them (because that documentation is darn expensive), and the tester failed to perform as a tester to compensate.

In either case, the explicitly written or coded “test” artifact can only be an inanimate sock, or a sock puppet animated by the tester. You can choose to suffer without a tester, or to cover up the presence of the tester. Reality will assert itself either way.

What danger could there be in speaking informally about writing “tests?”

It’s not necessarily dangerous to speak informally. However, a possible danger is that non-testing managers and clients of our work will think of testers as “test case writers” instead of as people who perform the skilled process of testing. This may cause them to treat testers as fungible commodities producing “tests” that are composed solely of explicit rules. Such a theory of testing– which is what we call the Factory school of testing thought– leads to expensive artifacts that uncover few bugs. Their value is mainly in that they look impressive to ignorant people.

If you are talking to people who fully understand that testing is a performance, it is fine to speak informally. Just be on your guard when you hear people say “Where are your tests?” “Have you written any tests?” or “Should you automate those tests?” (I would rather hear “How do you test this?” “Where are you focusing your testing?” or “Are you using tools to help your testing?”)

Thanks to Michael Bolton and Aleksander Simic for reviewing and improving this post.

 

Finding Your Own Integrity

I have a belief that I’m not going to justify– I’m simply going to say it and challenge you to look into your own experience and your own heart and see the truth of it for yourself: Your sense of identity, as a human among humans, is the most powerful force that animates and directs your choices. It is more important than sex or food or religion. It lurks behind every neurosis (including those involving sex or food or religion). As I read history and experience life, answers to the questions “Who am I? Am I a good example of what I should be?” are the prime movers of human choice throughout all of history, and the proximal cause of every war.

There are certainly exceptions to this rule: drug addiction, mental illness, or panic over a sudden, surprising, physical threat. Maybe those things have little to do with identity. Granted. I’m talking about normal daily life (and every Shakespeare play).

“I am an American. I am a human. I am a father. I am a husband. I am lovable. I am helpful. I am a tester. I am a skeptic. I am an outsider. I am dangerous. I am safe. I am honorable. I am fallible. I am truthful. I am intellectual…”  Each of these statements, for me, is a reflective shard that tumbles in the kaleidoscope of my identity. The models of personhood they represent comprise my moral compass. Although the pattern formed in that kaleidoscope may seem to shift with the situation, the underlying logic of my adult identity changes little with time.

That is the context for integrity.

Integrity means wholeness: the harmony and completeness of one’s identity. Practically speaking, a person with integrity is a person who lives consistently according to their avowed moral code, as opposed to someone who has no moral code, or who changes it as a matter of convenience. A person of integrity therefore creates continuity across the events of his life, and other people feel they know who they are dealing with.

The Challenge of Finding Your Integrity

Recently, in a discussion about what is reasonable for an employer to ask of a tester, a colleague felt I was trying to impose my own values onto potential employers of my students and wrote that as teachers of new testers “employment [for the testers] should be our first priority.” I disagreed sharply, writing that “our first priority is integrity.” My correspondent seemed to take offense to that.

Now, the employment-first position might be construed to imply that we should advocate robbing banks, because it is the quickest way to get money, or perhaps we should train prostitutes, because prostitution is an old and reliable industry with lots of job security for the right people. That would be absurd, but it’s also a straw man argument. I am certain no one intends to argue that any job is better than no job. Safety, legality and morality do enter into the equation.

Conversely, the integrity-first position might be cast as requiring a tester to immediately protest or resign in the face of any ethical dilemma or systemic ethical lapse, no matter how seemingly minor. This would turn most testers into insufferable, dour lawyers on their projects. We would get very little done. Who would hire such people?

These extreme positions are not very interesting, except as tools for meditating on what might be reasonable behavior. Therefore, I’d like to describe a less extreme position that I think is more defensible and workable. It goes like this:

1. Integrity is a vital and important matter. We suffer as people and society suffers when we treat it too lightly.

2. As testers and technical people, our integrity is routinely threatened by well-meaning clients and colleagues who want us to portray ourselves and the world to be a certain way, even if that isn’t strictly the truth.

3. If we never think directly about integrity, and simply trust in the innate goodness of ourselves and others, we are definitely taking this matter too lightly.

4. Integrity is not like a vase that shatters easily, and that once shattered is irretrievable. Integrity is more like an ongoing public artwork, exposed to and buffeted by the elements, sometimes damaged but always ultimately repairable (although our reputation may be another matter). Integrity is a work in progress for all of us.

5. Integrity, like education, is both personal and social. Your society judges you. It is reasonable that it does. But it is also reasonable to negotiate limits on that judgment. We spend our lives negotiating those lines, one way or another.

6. Forgiveness, although perhaps difficult and expensive to obtain, should always be available to us. (I test this by occasionally imagining my most “depraved” enemies in testing, and then imagining what they could do that would allow me to forgive them and even collaborate with them.)

7. Although integrity is our highest priority, in general, it is not the only thing that matters. We must apply wisdom and judgment so that the maintenance of integrity does not unreasonably affect our ability to survive. There is no set formula for how to do that.

8. Therefore, our practical priority must be: to learn how to think through and solve problems of survival while maintaining reasonable integrity. This itself is an ongoing project, requiring temperance and self-forgiveness.

9. New testers need to realize that they are not necessarily responsible for the quality of their work. Sometimes you will be asked to do things you don’t understand the value of, even though there may be value. In those situations, it’s okay to be compliant, as long as you are under supervision and someone competent is taking responsibility for what you do. It’s okay to watch and learn and not necessarily to make trouble. (Although, I usually did, even as a newbie.)

10. Experienced testers? Well, much is expected of you. Your clients (your non-tester colleagues and bosses) don’t know how to test, but you are supposed to. You can’t just do what you are told. That would be like letting a drunk friend drive home. Remember, someday your clients may sober up and wonder why you agreed to their stupid plan when you were supposed to be the expert.

Having laid this hopefully reasonable and workable strategy before you… I actually think the dispute between me and my correspondent, above, was not about the importance of integrity or employment at all, but rather about the specifics of the case we were debating. I should tell you what that was: whether it is reasonable for an employer to expect an entry-level tester to “write test cases.”

From a context-driven testing perspective, no practice can be declared unreasonable outside all contexts. But I do know a lot about the typical contexts of testing. I have seen profound waste, all around the industry, due to reckless and thoughtless documenting and counting of things called “test cases.” So, I don’t think that it is reasonable, generally speaking, to require young testers to write test cases. First, because “writing test cases” is what people who don’t know how to test think testers do– so, it’s usually an indicator of incompetent management. Second, because entry-level testers do not have the skills to write test cases in such a way that won’t result in a near complete waste of their client’s time and money. And third, because there are such obviously better things to do, in most cases, including learning about the product and actually testing the product.

Many people disagree with me. But I believe their attitude on this is the single most direct and vital cause of the perpetual infancy and impotency that we see in the testing industry. In other words, it’s not just a disagreement about style; it’s something that I think threatens our integrity as sincere and thoughtful testers. Casual shrugging about test case writing must be stamped out the way trans fats are being outlawed in fast food. Yes, that also took years to accomplish.

Speaking of fast food…

Here’s a metaphor that might help: eating at McDonalds.

Eating at McDonalds will not kill you (well, not outright). But what if you were forced to eat at McDonalds for your work? Every day, breakfast, lunch and dinner. Nothing but McDonalds. What if it were obvious to you that eating at McDonalds was not helping you actually succeed in your work? What if instead it was clear to you that such a diet was harming your ability to get your work done? For instance, perhaps you are a restaurant reviewer, except you are almost always full of McDonalds food so you can’t ever enjoy a proper meal at a restaurant you are supposed to review? And yet your manager, who knows nothing about restaurant reviewing, insists that you maintain a McDonalds-dominated dietary regimen.

Couldn’t someone say, hey, it’s a job and you should do what you are told? Yes, they could say that. And it might be true enough at first. But over time, that diet would hurt you; over time, you would have to cope with how poorly you were doing what you believed to be your real job. You might even be criticized for missing bugs– I mean– failing to review restaurants fully, even though it’s largely due to your employer’s own unreasonable process prescriptions.

At some point you might say “enough!!” You might refuse to eat another Big Mac. From the point of view of your management and colleagues, it might look like you were risking your job just because you didn’t want to eat a hamburger. It might look crazy to them. But from your point of view, the issue isn’t the one burger, but rather the unhealthy system, slowly killing you. This breakdown comes more quickly if you happen to have a gluten allergy.

Ethics and integrity in testing is not just about following prissy little rules that many other people flout– it’s about not making yourself sick even if other people are willing to live a sickly life. This requires that you be able to judge what health and sickness means to you. Integrity is about identity health.

A Story of Quitting Even Though I Needed the Work

In 1998, I was hired by a consulting company outside of Washington D.C. I negotiated for a $30,000 sign-on bonus, and bought a house in Virginia. I was the sole breadwinner in my family, with a wife and son to support. I bought a new car, too. In short, I was “all in.”

Six months later, I quit. I had no other job to go to. I had bills due. It took me seven years to pay back my sign-on bonus, with interest (I forfeited it because I did not stay for two years). But with the help of colleagues and family over the following weeks, I made the transition to running my own business. I am most thankful for my wife’s response when I came home that night and told her I walked out on our only source of income. She shrugged and said it was surely for the best, and that something good would come of it. (I can only recommend, gentlemen, that you marry an optimist if you can.) I am also thankful to Cem Kaner, who bought me a laptop (my only computer was then owned by my employer) and said “times like these are when you discover who your true friends are.” This was fitting because it was partly because of Cem that I had previously decided never to sacrifice my professional integrity.

This illustrates one lesson about ethics: community support helps us find and maintain our integrity.

I quit because my company was insisting that I bill hours on a project that, in my opinion, was absolutely certain not to benefit from my work. The client wanted me to create fake test cases. They didn’t call them fake test cases, of course. They claimed to want real test cases; and good ones. But no product had been designed at that time! All I had access to was a general description of requirements, which in this case were literally statements of problems the product was intended to solve, with no information on how they would be solved. It was a safety-critical factory process control system, and no one could show me what it would look like or provide any examples of what specifically it might do. The only test cases I could possibly design would therefore be vague and imaginary, based on layers of soft, fluffy assumptions. The customer told me they would be happy if I delivered a document that consisted of the text of each requirement preceded by the phrase “verify that…” I told them they didn’t need a tester for that. They needed a macro.

The integrity picture was clouded, in that case, because the client believed they had to follow the “V-Model” process, which they had interpreted as demanding that I submit a test case specification document. It was a clash between the integrity of a heuristic (the V-Model) vs. the integrity of solving the problem for which the heuristic was designed. My client might have said that I was the one violating the integrity of the process. Whereas I would have said that my client was not competent to make that judgment.

I’m not saying I won’t do bad work… I’m just saying I won’t do bad work for money. If I do bad work, I want it to be for fun or for learning, but not at anyone’s expense or detriment. Hence a line I use once in a while: “I could do that for you, except that you pay me too much.” This is one reason I like being independent. I control what I bill for, and if I think a portion of my work is not acceptable, I don’t charge for it– like a chef who refuses to serve an overcooked steak.

It wasn’t as sudden as it looked…

I didn’t just lose my temper at the first sign of trouble. Things had been coming to a boil for a while. On my very first day I reviewed the RFP for that project and concluded it was doomed, but management bid on it anyway, telling me I needed to “be practical” and that surely “we could be helpful to them if they hired us.” I needed the job, so I relented against my better judgment.

During my first staff meeting, my first week on the job, I challenged the consulting staff about what they did to study testing on their own time. My challenge was met with an awkward silence, after which one of the consultants, sounding soul-wounded, told me he was offended that I would suggest that they weren’t already good enough as testers. “These are the best people I’ve ever worked with,” said the twenty-something tester with little experience and no public reputation. “But how do you know they are good?” I asked, knowing that our company had just issued a press release about having hired me (a “distinguished industry pioneer” to quote it exactly). There were other murmurs of annoyance around the table, and the manager stepped in to change the subject. I could have pushed the issue, but I didn’t. I needed the job, so I relented against my better judgment.

I was later told that despite my company’s public position, the other consultants felt that I was a mere armchair expert, whereas they were practical men. I don’t know what evidence they had for that. They never showed me what they could do that I supposedly could not. Management tolerated this attitude. That means they were lying directly to their customers about me– claiming I was an expert when clearly they did not believe I was one. I could have insisted they behave in accordance with their public statements about me. But… I needed the job, so I relented against my better judgment.

I knew the day had come when I must quit because I found myself fantasizing about throwing chairs through windows. That triggered a sort of circuit-breaker of judgment: change your life now, now, now.

So what happened after that?

I suffered for this decision. First came the panic attack. I felt dizzy and it was hard to breathe for a few hours. This was followed by a few years of patching together a project here and a project there, never more than 8 weeks from completely running out of money and credit. We were twice within a week of filing for bankruptcy in the early days. During that time I walked away from a few more projects. I resigned from a dysfunctional government project, hopefully saving valuable taxpayer dollars by not writing a completely unnecessary Software Configuration Management plan that no one on the team wanted. I got myself fired from a project at Texas Instruments after about 90 minutes, because I told them a few things they didn’t want to hear (but that I felt were both true and important).

It’s not all suffering, of course. I once was fired from a project (along with the rest of the test team) and then was the only one hired back– partly because the client realized that my high standards meant that I billed far fewer hours than other consultants. In other words, saying no and being a troublemaker earned me another 500 hours of work, while the yes-sayers lost their situations. I also got some great gigs, including my very first one as an independent, specifically because I am a rabble-rousing kind of thinker.

These days, I cultivate as many clients as I can, so that I don’t rely too much on any one of them. And I have established a reputation for being honest and blunt that generally prevents the wrong sort of people from trying to hire me. It’s not easy, but it can be done: I have made integrity my top priority.

What about before I was well known?

Well, I’ve always had this attitude. It’s not some luxury to me. It’s fundamental. That’s why I had to leave high school. I’ve never been able to “play the game” at the expense of feeling like a good honest man. Like I said, I suffered for it. I wanted to go try myself at MIT, where my much more pliable best friend from high school eventually graduated. I was born to be an academic, but since I can’t stand the compliance-to-ceremony culture of the academic world, I must be an independent scholar, without access to expensive journals and fantastic libraries.

Before anybody heard of me, I treated getting work like dating: be a slightly exaggerated version of myself so that I will be rejected quickly if the relationship is not a fit (a stress testing strategy, you might say). My big break came at Apple, where I worked for a man of vision and good humor who seemed to relish being the mentor I needed. The environment was open and supportive. There was an element of luck in that my first ten years in testing I worked for people who didn’t ask me to tell lies or do bad work on purpose.

So I know it’s possible to find such people. They are out there. You don’t have to work for bozos, and if you currently do, there is yet hope.

A person who does not live true to himself feels sick and weak inside. My identity as “excellent software tester” demands that I take my craft seriously. I hope you will take this craft seriously, too.

P.S. What if my sense of identity doesn’t require me to be good at my job?

Then, technically, none of this applies to you. Your ethical code can include doing bad work. But… why are you reading my blog? How did you get in? Guards! Seize him!

 

A Public Service Announcement About Exploratory Testing

[Updated: I revamped and added some more examples to the list.]

I got this message from Oliver Vilson, today:

Oliver V.: hi James. Just had a chat with Helena_JM. She reminded me something… don’t know if you’ve written blog about it or pushed it into  RST.. One Test lead from another company mentioned me he has problems with his testers. Some of them are saying that they don’t have to do test plans, since your teaching seems to align that…
James Bach: Any more details?
Oliver V.: rough translation from test team lead : “ET seems to have reputation as “excuse for shitty testing”. People can’t explain what they did and why. If you ask them for test plan or explanation, all you get is “but Bach said…”.

I have, from time to time, heard rumors that some people cite my writings and teachings as an excuse to do bad testing. I think it would help to do a public service announcement…

Attention Testers and Managers of Testers

If a tester claims he is justified in doing bad work because of something I’ve published or said, please email me at james@satisfice.com, or Skype me, and I will help you stop that silliness.

I teach skilled software testing for people who intend to do an excellent job. That process is necessarily exploratory in nature. It also necessarily will have some scripted elements– partly due to the nature of thinking and partly due to the requirements of excellent intellectual work.

I do not teach evasiveness or obscurantism. I do not ever tell a tester that he can get away with refusing to explain his test process. Explaining testing is an important part of being a professional.

Why People Get Confused

I reinvented software testing for myself, from first principles. So, I teach from a very different set of premises. This is necessary, because common ideas about testing are so idiotic. But it does result in confusion when my ideas are taken out of context and “mixed in” to the idiocy. Consider: “I won’t create a detailed test plan document” is a perfectly ordinary and potentially reasonable thing to say in RST. It is a statement about things made explicit in a document, not a statement about lack of planning. Yet Factory School methodology mistakes documents for content. If you say that to one of them, it may be mistaken for a refusal to apply appropriate rigor to your work.

Here are some examples of how someone might misapply my teachings:

  1. Rapid Software Testing methodology (RST) is not the same thing as exploratory testing. ET is very simple. Anyone can do ET, just as anyone can look at a painting. But there’s a huge difference between a skilled appraisal of a painting by an expert and a bored glance by a schoolkid. RST is a methodology for doing testing (including scripted and exploratory testing) well. Therefore, anyone doing ET badly is not doing my methodology.
  2. In RST, a plan is not a document, it’s a set of ideas. Therefore, I say you don’t need to have a test plan template, or any sort of written test plan document in order to have a good test plan. I often document my test ideas, though, in different ways, when that helps. Therefore, the lack of a test plan (a guiding set of ideas) probably represents an immature and possibly inadequate test process, but the lack of a test plan document is not necessarily a problem.
  3. In RST, a test is not a document, it’s a performance. Therefore the lack of documented tests is not necessarily a problem, but poor testing (which can be determined by direct observation by a skilled tester or test manager, just as poor carpentry or poor doctoring can be detected) is a problem.
  4. In RST, we have no templates for reporting. But reporting is crucial. Reporting skills are crucial. Accountability is crucial. Credibility is crucial. We teach the art of telling a testing story. Therefore, anyone who declines to explain himself when asked about his testing is not practicing RST. I disavow such testers. (However, just because explaining oneself is an important part of testing doesn’t mean a manager can insist on arbitrarily voluminous documentation or arbitrary metrics. I suspect that, in some cases, managers who complain about testers refusing to document or explain themselves are really just obsessed with a specific method of documentation and refusing to accept other viable solutions to the same problem.)
  5. In RST we say that testing cannot be automated, and that tools can become an obsession. This leads some to think I am against tools. No, I am against bad work. Unfortunately, some tools, such as expensive HP/Mercury tools, are often used to wastefully automate weak fact checking at the expense of good testing. Yes, tools and the technical skills to create and apply them play an important role in great testing. It’s not automating testing when I use tools, because testing is whatever testers do, not what tools do. Therefore a tester who refuses to learn and use tools in general is not practicing RST.
  6. In RST we distinguish between checking and testing. This allows us to distinguish between a test process that is appropriately thoughtful and deep, and one (based solely on checking) that would be reckless and shallow. But when we criticize a checking-only test strategy, some people get confused and think we are criticizing the presence of checking rather than the lack of testing. Therefore, a tester who refuses to design or perform checks that are actually economical and helpful is not doing RST.
  7. In RST, we ban unscientific, abusive attempts at using metrics to control the test process. But when some people hear us attack, say, the counting of test cases, they assume that means we don’t believe in even the concept or principle of measurement. Instead, we support using inquiry-focused metrics (which inspire questions rather than dictating decisions), we promote active skepticism about numbers applied to social systems, and we promote the development of observation, reasoning, and social skills that limit the need for quantification. Therefore any tester who simply refuses to consider using metrics of any kind is not doing RST.
  8. Some people hear about the freedom of exploratory testing, and they confuse that with irresponsibility. But that’s silly. If you drive a car, you are free to run over pedestrians or smash into buildings– except you don’t, because you are responsible! Also, it’s against the law. Freedom is not the same thing as having a right. Therefore, anyone who accepts the freedom of exploratory testing and cannot or will not manage that testing appropriately is an incompetent or irresponsible tester.

Testing and Checking Refined

This post is co-authored with Michael Bolton. We have spent hours arguing about nearly every sentence. We also thank Iain McCowatt for his rapid review and comments.

Testing and tool use are two things that have characterized humanity from its beginnings. (Not the only two things, of course, but certainly two of the several characterizing things.) But while testing is cerebral and largely intangible, tool use is out in the open. Tools encroach into every process they touch, and tools change those processes. Hence, for at least a hundred or a thousand centuries, the more philosophical among our kind have wondered “Did I do that or did the tool do that? Am I a warrior or just a spear-throwing platform? Am I a farmer or a plow pusher?” As Marshall McLuhan said, “We shape our tools, and thereafter our tools shape us.”

This evolution can be an insidious process that challenges how we label ourselves and things around us. We may witness how industrialization changes cabinet craftsmen into cabinet factories, and that may tempt us to speak of the changing role of the cabinet maker, but the cabinet factory worker is certainly not a mutated cabinet craftsman. The cabinet craftsmen are still out there– fewer of them, true– nowhere near a factory, turning out expensive and well-made cabinets. The skilled cabineteer (I’m almost motivated enough to Google whether there is a special word for cabinet expert) is still in demand, to solve problems IKEA can’t solve. This situation exists in the fields of science and medicine, too. It exists everywhere: what are the implications of the evolution of tools on skilled human work? Anyone who seeks excellence in his craft must struggle with the appropriate role of tools.

Therefore, let’s not be surprised that testing, today, is a process that involves tools in many ways, and that this challenges the idea of a tester.

This has always been a problem– I’ve been working with and arguing over this since 1987, and the literature of it goes back at least to 1961– but something new has happened: large-scale mobile and distributed computing. Yes, this is new. I see this as the greatest challenge to testing as we know it since the advent of micro-computers. Why exactly is it a challenge? Because in addition to the complexity of products and platforms, which has been growing steadily for decades, there now exists a vast marketplace for software products that are expected to be distributed and updated instantly.

We want to test a product very quickly. How do we do that? It’s tempting to say “Let’s make tools do it!” This puts enormous pressure on skilled software testers and those who craft tools for testers to use. Meanwhile, people who aren’t skilled software testers have visions of the industrialization of testing similar to those early cabinet factories. Yes, there have always been these pressures, to some degree. Now the drumbeat for “continuous deployment” has opened another front in that war.

We believe that skilled cognitive work is not factory work. That’s why it’s more important than ever to understand what testing is and how tools can support it.

Checking vs. Testing

For this reason, in the Rapid Software Testing methodology, we distinguish between aspects of the testing process that machines can do versus those that only skilled humans can do. We have done this linguistically by adapting the ordinary English word “checking” to refer to what tools can do. This is exactly parallel with the long established convention of distinguishing between “programming” and “compiling.” Programming is what human programmers do. Compiling is what a particular tool does for the programmer, even though what a compiler does might appear to be, technically, exactly what programmers do. Come to think of it, no one speaks of automated programming or manual programming. There is programming, and there is lots of other stuff done by tools. Once a tool is created to do that stuff, it is never called programming again.

Now that Michael and I have had over three years’ experience working with this distinction, we have sharpened our language even further, with updated definitions and a new distinction between human checking and machine checking.

First let’s look at testing and checking. Here are our proposed new definitions, which soon will replace the ones we’ve used for years (subject to review and comment by colleagues):

Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc.

(A test is an instance of testing.)

Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

(A check is an instance of checking.)

Explanatory notes:

  • “evaluating” means making a value judgment; is it good? is it bad? pass? fail? how good? how bad? Anything like that.
  • “evaluations” as a noun refers to the product of the evaluation, which in the context of checking is going to be an artifact of some kind; a string of bits.
  • “learning” is the process of developing one’s mind. Only humans can learn in the fullest sense of the term as we are using it here, because we are referring to tacit as well as explicit knowledge.
  • “exploration” implies that testing is inherently exploratory. All testing is exploratory to some degree, but may also be structured by scripted elements.
  • “experimentation” implies interaction with a subject and observation of it as it is operating, but we are also referring to “thought experiments” that involve purely hypothetical interaction. By referring to experimentation, we are not denying or rejecting other kinds of learning; we are merely trying to express that experimentation is a practice that characterizes testing. It also implies that testing is congruent with science.
  • the list of words in the testing definition is not exhaustive of everything that might be involved in testing, but represents the mental processes we think are most vital and characteristic.
  • “algorithmic” means that it can be expressed explicitly in a way that a tool could perform.
  • “observations” is intended to encompass the entire process of observing, and not just the outcome.
  • “specific observations” means that the observation process results in a string of bits (otherwise, the algorithmic decision rules could not operate on them). A short code sketch follows these notes.
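
Here is that sketch, a minimal illustration in Python rather than anything taken from the definitions themselves. The login-time measurement and the two-second limit are invented; the point is that the observation reduces to a specific value (a string of bits), and the decision rule could be applied by a tool with no human judgment involved:

    # A check: an algorithmic decision rule applied to a specific observation.
    # The login-time measurement and the 2.0-second limit are hypothetical.
    def check_login_time(observed_seconds: float, limit: float = 2.0) -> bool:
        """Return the evaluation as a bit: True (pass) or False (fail)."""
        return observed_seconds <= limit

    # Two specific observations, two evaluations, no human judgment required.
    print(check_login_time(1.4))  # True
    print(check_login_time(3.7))  # False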

There are certain implications of these definitions:

  • Testing encompasses checking (if checking exists at all), whereas checking cannot encompass testing.
  • Testing can exist without checking. A test can exist without a check. But checking is a very popular and important part of ordinary testing, even very informal testing.
  • Checking is a process that can, in principle, be performed by a tool instead of a human, whereas testing can only be supported by tools. Nevertheless, tools can be used for much more than checking.
  • We are not saying that a check MUST be automated. But the defining feature of a check is that it can be COMPLETELY automated, whereas testing is intrinsically a human activity.
  • Testing is an open-ended investigation– think “Sherlock Holmes”– whereas checking is short for “fact checking” and focuses on specific facts and rules related to those facts.
  • Checking is not the same as confirming. Checks are often used in a confirmatory way (most typically during regression testing), but we can also imagine them used for disconfirmation or for speculative exploration (i.e. a set of automatically generated checks that randomly stomp through a vast space, looking for anything different).
  • One common problem in our industry is that checking is confused with testing. Our purpose here is to reduce that confusion.
  • A check is describable; a test might not be (that’s because, unlike a check, a test involves tacit knowledge).
  • An assertion, in the Computer Science sense, is a kind of check. But not all checks are assertions, and even in the case of assertions, there may be code before the assertion which is part of the check, but not part of the assertion. (See the sketch just after this list.)
  • These definitions are not moral judgments. We’re not saying that checking is an inherently bad thing to do. On the contrary, checking may be very important to do. We are asserting that for checking to be considered good, it must happen in the context of a competent testing process. Checking is a tactic of testing.
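
And here is the sketch promised above for the point about assertions, using Python’s unittest. The Cart class and its prices are invented for the example. Everything in the test method is part of the check, since it produces and evaluates the specific observation, but only the final line is the assertion:

    import unittest

    # Hypothetical product code, inlined so the sketch runs on its own.
    PRICES = {"apple": 0.50, "bread": 2.25}

    class Cart:
        def __init__(self):
            self.items = []

        def add(self, name, qty):
            self.items.append((name, qty))

        def total(self):
            return sum(PRICES[name] * qty for name, qty in self.items)

    class CartCheck(unittest.TestCase):
        def test_total(self):
            # The setup that collects the specific observation is part of
            # the check...
            cart = Cart()
            cart.add("apple", 2)
            observed = cart.total()
            # ...but only this line is the assertion.
            self.assertEqual(observed, 1.00)

    if __name__ == "__main__":
        unittest.main()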

Whither Sapience?

If you follow our work, you know that we have made a big deal about sapience. A sapient process is one that requires an appropriately skilled human to perform. However, in several years of practicing with that label, we have found that it is nearly impossible to avoid giving the impression that a non-sapient process (i.e. one that does not require a human but could involve a very talented and skilled human nonetheless) is a stupid process for stupid people. That’s because the word sapience sounds like intelligence. Some of our colleagues have taken strong exception to our discussion of non-sapient processes based on that misunderstanding. We therefore feel it’s time to offer this particular term of art its gold watch and wish it well in its retirement.

Human Checking vs. Machine Checking

Although sapience is problematic as a label, we still need to distinguish between what humans can do and what tools can do. Hence, in addition to the basic distinction between checking and testing, we also distinguish between human checking and machine checking. This may seem a bit confusing at first, because checking is, by definition, something that can be done by machines. You could be forgiven for thinking that human checking is just the same as machine checking. But it isn’t. It can’t be.

In human checking, humans are attempting to follow an explicit algorithmic process. In the case of tools, however, the tools aren’t just following that process, they embody it. Humans cannot embody such an algorithm. Here’s a thought experiment to prove it: tell any human to follow a set of instructions. Get him to agree. Now watch what happens if you make it impossible for him ever to complete the instructions. He will not just sit there until he dies of thirst or exposure. He will stop himself and change or exit the process. And that’s when you know for sure that this human– all along– was embodying more than just the process he agreed to follow and tried to follow. There’s no getting around this if we are talking about people with ordinary, or even minimal cognitive capability. Whatever procedure humans appear to be following, they are always doing something else, too. Humans are constantly interpreting and adjusting their actions in ways that tools cannot. This is inevitable.

Humans can perform motivated actions; tools can only exhibit programmed behaviour (see Harry Collins and Martin Kusch’s brilliant book The Shape of Actions, for a full explanation of why this is so). The bottom line is: you can define a check easily enough, but a human will perform at least a little more during that check– and also less in some ways– than a tool programmed to execute the same algorithm.

Please understand, a robust role for tools in testing must be embraced. As we work toward a future of skilled, powerful, and efficient testing, this requires careful attention to both the human side and the mechanical side of the testing equation. Tools can help us in many ways far beyond the automation of checks. But in this, they necessarily play a supporting role to skilled humans; and the unskilled use of tools may have terrible consequences.

You might also wonder why we don’t just call human checking “testing.” Well, we do. Bear in mind that all this is happening within the sphere of testing. Human checking is part of testing. However, we think when a human is explicitly trying to restrict his thinking to the confines of a check– even though he will fail to do that completely– it’s now a specific and restricted tactic of testing and not the whole activity of testing. It deserves a label of its own within testing.

With all of this in mind, and with the goal of clearing confusion, sharpening our perception, and promoting collaboration, recall our definition of checking:

Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

From that, we have identified three kinds of checking:

Human checking is an attempted checking process wherein humans collect the observations and apply the rules without the mediation of tools.

Machine checking is a checking process wherein tools collect the observations and apply the rules without the mediation of humans.

Human/machine checking is an attempted checking process wherein both humans and tools interact to collect the observations and apply the rules.
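
As a rough sketch of that last, hybrid variety (our example; the page list and the timing function are invented, and a real version would drive the actual product): the tool collects and presents the observations, while a human applies the decision rules by scanning the report and judging what is acceptable.

    import random

    # Hypothetical stand-in for a real measurement of the product.
    def fetch_render_time(page: str) -> float:
        return random.uniform(0.5, 4.0)

    # The tool collects the specific observations...
    pages = ["/home", "/search", "/checkout"]
    observations = {page: fetch_render_time(page) for page in pages}

    # ...and presents them; the human applies the decision rules by reading
    # the report and judging which numbers are acceptable.
    for page, seconds in sorted(observations.items()):
        print(f"{page}: {seconds:.2f}s")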

In order to explain this thoroughly, we will need to talk about specific examples. Look for those in an upcoming post.

Meanwhile, we invite you to comment on this.

UPDATE APRIL 10th: As a result of intense discussions at the SWET5 peer conference, I have updated the diagram of checking and testing. Notice that testing now sits outside the box: since it describes the whole thing, a description of testing is inside of it. Human checking is characterized by a cloud, because its boundary with non-checking aspects of testing is not always clearly discernible. Machine checking is characterized by a precise dashed line, because although its boundary is clear, it is an optional activity. Technically, human checking is also optional, but it would be a strange test process indeed that didn’t include at least some human checking. I thank the attendees of SWET5 for helping me with this: Rikard Edgren, Martin Jansson, Henrik Andersson, Michael Albrecht, Simon Morley, and Micke Ulander.

Thoughts Toward The Ethics of Testing

I am thinking and talking a lot about ethics, lately. Maybe that’s because Context-Driven testing is getting better traction, now. More testers are approaching me, asking for guidance. More testers are challenging fake testing in their organizations.

Or maybe I’m just noticing it more, because Cem Kaner used to take the brunt of all this. In that respect, I have historically been more a follower than a leader in our community. Recently, I’ve stepped forward to be more of the role model that I am expected by my colleagues to be.

(Note: Ethics is a guaranteed sore subject. Whenever I talk about ethics, I can’t avoid implying that I believe some people are not as ethical as they should be. But, you know, that’s the burden of leadership. To those who avoid politically difficult subjects, enjoy your shadowy existence. Lurk on, yon lurkers.)

I have an ethical code. Actually, several! The Association for Software Testing adopted the ACM Code of Ethics, verbatim. It’s okay. I would have preferred the simpler IEEE Code, personally. I also almost completely follow Jerry Weinberg’s code. I take comfort from all of them, plus I have my own.

I need my own, because these codes above don’t directly deal with certain common testing ethics issues.

So, I’m the content-owner for the 2nd Kiwi Workshop on Software Testing, tomorrow morning, here in Wellington, New Zealand. Our topic is ethics. To get ready for it, I thought I would write out some of the ethical principles by which I strive to operate. Here they are:

  • Know what a test is. Avoid labeling an activity as a “test” unless it represents a sincere effort to discover a problem in a product.
  • Maintain a reasonable impartiality. The purpose of testing is to cast light on the status of the product and its context, in the service of my clients. I may play multiple roles on a project, but my purpose, insofar as I am a tester, is not to design or improve the product.
  • Do not claim to assure, ensure, or control quality. I don’t control anything about the product: a tester is a witness. In that capacity, I strive to assist the quality creation process.
  • Report everything that I believe, in good faith, to be a threat to the product or to the user thereof, according to my understanding of the best interests of my client and the public good.
  • Apply test methods that are appropriate to the level of risk in the product and the context of the project.
  • Alert my clients to anything that may impair my ability to test.
  • Recuse myself from any project if I feel unable to give reasonable and workmanlike effort.
  • Make my clients aware, with alacrity, of any mistake I have made which may require expensive or disruptive correction.
  • Do not accede to requests by my client to work in a wasteful, dangerous, or deceptive way. (e.g. I will not keep test case metrics, because they are damaging in almost any context)
  • If I do not understand or accept my mission, it shall be my urgent priority to discover it or renegotiate it.
  • Do not deceive my clients about my work, nor help others to perpetrate deception.
  • Do not accept tasks for which I am not reasonably prepared, or which I am not sufficiently competent to perform, unless I am under the direction and supervision of someone who can guide me.
  • Study my craft. Be alert to better solutions and better ways of working.


Immaturity of Maturity Models

Maturity models (TMMi, CMM, CMMi, etc.) are a dumb idea. They are evidence of immaturity in our craft. Insecure managers sometimes cling to them as they might a treasured baby blanket. In so doing, they retard their own growth and learning.

A client of mine recently asked for some arguments against the concept of maturity models in testing. My reply makes for a nice post, so here goes…

First, here are two articles attacking the general idea of “maturity” models:

Maturity Models Have it Backwards

The Immaturity of the CMM

Here is another article that attacks one of the central tenets of most maturity models, which is the idea that there are universal “best practices”:

No Best Practices

And of course commercial tester certification, which I often ridicule on this blog, is a related phenomenon.

Here are some talking points:

1. I suggest this definition of maturity: “Maturity is the degree to which a system has realized its potential and adapted to its context.”

In support of this definition, consider these relevant snippets from the Oxford English Dictionary.

* Fullness or perfection of growth or development.
* Deliberateness of action; mature consideration, due deliberation.
* The state of being complete, perfect, or ready; fullness of development.
* The stage at which substantial growth no longer occurs.

2. I suggest this definition of maturity model: “a maturity model is a plan for achieving maturity.”

By this definition, I know of nothing that is called a “maturity model” that actually is a maturity model. This is because maturity cannot be achieved by mimicking the “look” of mature organizations. Maturity is achieved through growing and learning as you encounter and deal with natural problems.

3. Maturity is not our goal in engineering. Our goal is to achieve success, satisfaction, security, and respect through the mechanism of doing good work.

No one gains success through maturity. It is not our goal. Some businesses benefit by the appearance of maturity, but that is a matter of marketing, not engineering. And regardless of how we achieve maturity, not all maturity is desirable. A creature approaching death is also mature.

Hey, blacksmithing is a mature craft, and yet look around… where are the blacksmiths? The world has moved on. We are in a period of growth, study, and creativity in testing. No one can say what the state of the art of our craft will be in 50 years. It will evolve, and we– our minds and experiences– are the engine of that evolution.

4. The behaviors of a healthy mature organization cannot be a template for success.

We achieve maturity by learning and growing as a testing organization, not by aiming at or emulating “mature” behaviors.

Maturity is a dependent variable. We don’t manipulate our maturity directly. We simply learn and grow, wherever that takes us. As any parent knows, you cannot speed up the maturation of your children by entreating them to “be mature.” Their very immaturity is partly the means by which they will mature. Immature creatures play and experiment. Research in rats, for instance, documents severe developmental problems in rats that were prevented from playing. Rats who don’t play as juveniles are much less able to deal with unexpected situations as adults. They cannot deal effectively with stress, compared to normal rats.

There can NEVER be one ultimate form or process that we declare to be mature, UNLESS our context never changes. This is because, in engineering, we are in the business of solving problems in context. However, our context changes regularly, because people change, technology changes, and because we are continuing to experiment and innovate.

Darwin’s theory of the origin of species is also a theory of the maturation and continual re-generation of species. As he understood, maturity is always a relative matter.

5. The “maturity model” of any outsider is essentially a propaganda tool. It is a marketing tool, not an engineering tool.

Every attempt to formalize testing constitutes a claim, on some person’s part, that he knows what testing should be, and that other people cannot be trusted to know this (otherwise, why not let them decide for themselves how to test?).

I have formalized testing, myself, and that’s exactly what I am thinking when I do so. But I do not impose my view of testing on any other organization or tester, unless they work for me. My formalizations are specific to my experiences and analysis of specific situations. I offer these ideas to my clients as input to a process of ongoing study and adaptation. To make any best practice claims would be irresponsible.

6. If you want to get better, try this: create an environment where learning and innovation are encouraged; institutionalize mechanisms that promote this, such as internal training, peer conferences, pilot projects, and mentoring.

As my colleague Michael Bolton likes to say, “no mature person, involved in a serious matter, lets any other mature person do their thinking for them.”

Mature people take responsibility for themselves. Therefore, don’t adopt anyone else’s “maturity model” of testing. Let your own people lead you.

Quality is Dead #1: The Hypothesis

Quality is dead in computing. Been dead a while, but like some tech’d up version of Weekend at Bernie’s, software purveyors are dressing up its corpse to make us believe computers can bring us joy and salvation.

You know it’s dead, too, don’t you? You long ago stopped expecting anything to just work on your desktop, right? Same here. But the rot has really set in. I feel as if my computer is crawling with maggots. And now it feels that way even when I buy a fresh new computer.

My impression is that up to about ten years ago most companies were still trying, in good faith, to put out a good product. But now many of them, especially the biggest ones, have completely given up. One sign of this is the outsourcing trend. Offshore companies, almost universally, are unwilling and unable to provide solid evidence of their expertise. But that doesn’t matter, because the managers offering them the work care for nothing but the hourly rate of the testers. The ability of the testers to test means nothing. In fact, bright inquisitive testers seem to be frowned upon as troublemakers.

This is my Quality is Dead hypothesis: a pleasing level of quality for end users has become too hard to achieve while demand for it has simultaneously evaporated and penalties for not achieving it are weak. The entropy caused by mindboggling change and innovation in computing has reached a point where it is extremely expensive to use traditional development and testing methods to create reasonably good products and get a reasonable return on investment. Meanwhile, user expectations of quality have been beaten out of them. When I say quality is dead, I don’t mean that it’s dying, or that it’s under threat. What I mean is that we have collectively– and rationally– ceased to expect that software normally works well, even under normal conditions. Furthermore, there is very little any one user can do about it.

(This explains how it is possible for Microsoft to release Vista with a straight face.)

I know of a major U.S. company that recently laid off a group of more than a dozen trained, talented, and committed testers, instead outsourcing that work to a company in India that obviously does not know how to test (judging from documents shown to me). The management of this well-known American company never talked to their testers or test managers about this (according to the test manager involved and the director above him, both of whom spoke with me). Top management can’t know what they are giving up or what they are getting. They simply want to spend less on testing. When testing becomes just a symbolic ritual, any method of testing will work, as long as it looks impressive to ignorant people and doesn’t cost too much. (Exception: sometimes charging a lot for a fake service is a way to make it seem impressive.)

Please don’t get me wrong. Saving money is not a bad thing. But there are ways to spend less on testing without eviscerating the quality of our work. There are smart ways to outsource, too. What I’m talking about is that this management team obviously didn’t care. They think they can get away with it. And they can: because quality is dead.

I’m also not saying that quality is dead because people in charge are bad people. Instead what we have are systemic incentives that led us to this sorry state, much as did the incentives that resulted in favorable conditions for cholera and plague to sweep across Europe, in centuries past, or the conditions that resulted in the Great Fire of London. It took great disasters to make them improve things.

Witness today how easily the financial managers of the world are evading their responsibility for bringing down the world economy. It’s a similar deal with computing. Weak laws pertaining to quality, coupled with mass fatalism that computers are always going to be buggy, and mass acceptance of ritualistic development and testing practices make the world an unsafe place for users.

If we use computers, or deal with people who do, we are required to adapt to failure and frustration. Our tools of “productivity” suck away our time and confidence. We huddle in little groups on the technological terrain, subject to the whims and mercies of the technically elite. This is true even for members of the technically elite– because being good in one technology does not mean you have much facility with the 5,000 other technologies out there. Each of us is a helpless user, in some respect.

Want an illustration? Just look at my desktop:

  • Software installation is mysterious and fragile. Can I look at any given product on my system and determine if it is properly installed and configured? No.
  • Old data and old bits of applications choke my system. I no longer know for sure what can be thrown away, or where it is. I seem to have three temp folders on my system. What is in them? Why is it there?
  • My task manager is littered with mysterious processes. Going through, googling each one, and cleaning them up is a whole project in and of itself.
  • I once used the Autoruns tool to police my startup. Under Vista, this has become a nightmare. Looking at the Autoruns output is a little like walking into that famous warehouse in Indiana Jones. Which of the buzillion processes are really needed at startup?
  • Mysterious pauses, flickers, and glitches are numerous and ephemeral. Investigating them saps too much time and energy.
  • I see a dozen or two “Is it okay to run this process?” dialog boxes each day, but I never really know if it’s okay. How could I know? I click YES and hope for the best.
  • I click “I Agree” to EULAs that I rarely read. What rights am I giving away? I have no idea. I’m not qualified to understand most of what’s in those contracts, except they generally disclaim responsibility for quality.
  • Peripherals with proprietary drivers and formats don’t play well with each other.
  • Upgrading to a new computer is now a task comparable with uprooting and moving to a new city.
  • I’m sick of becoming a power user of each new software package. I want to use my time in other ways, so I remain in a state of ongoing confusion.
  • I am at the mercy of confused computers and their servants who work for credit agencies, utility companies, and the government.
  • I have to accept that my personal data will probably be stolen from one of the many companies I do business with online.
  • Proliferating online activity now results in far-flung and sometimes forgotten pockets of data about me, clinging like Spanish moss to the limbs of the Web.

Continuous, low grade confusion and irritation, occasionally spiking to impotent rage, is the daily experience of the technically savvy knowledge worker. I shudder to think what it must be like for computerphobes.

Let me give you one of many examples of what I’m talking about.

I love my Tivo. I was a Tivo customer for three years. So why am I using the Dish Network and not Tivo? The Dish Network DVR sucks. I hate you Dish Network DVR developers! I HATE YOU! HAVEN’T YOU EVER SEEN A TIVO??? DO YOU NOT CARE ABOUT USABILITY AND RELIABILITY, OR ARE YOU TOTAL INCOMPETENT IDIOTS???

I want to use a Tivo, but I can’t use it with the Dish Network. I have to use their proprietary system. I don’t want to use the Dish Network either, but DirectTV was so difficult to deal with for customer service that I refuse to be their customer any more. The guy who installed my Dish Network DVR told me that it’s “much better than Tivo.” The next time I see him, I want to take him by the scruff of his neck and rub his nose on the screen of my Dish Network DVR as it fails once again to record what I told it to record. You know nothing of Tivos, you satellite installer guy! Do not ever criticize Tivo again!

Of all the technology I have knowingly used in the last ten years, I would say I’m most happy with the iPod, the Tivo, and the Neatworks receipt scanning system. My Blackberry has been pretty good, too. Most other things suck.

Quality is dead. What do we do about that? I have some ideas. More to come…

What the Certification Sales Lady Said…

At the Star conference, this week, the lady at the ASTQB booth was executive director Lois Kostroski. The ASTQB is the American chapter of the ISTQB. Here’s the gist of the conversation we had about certification…

James: “Do you need any experience to get certified?”

Lois: “No, you just have to pass the exam.”

James: “What are the benefits of certification?”

Lois: “JB.”

James: “JB?”

Lois: “Just Because. There are almost 90,000 certified testers. It’s fast becoming the norm. In some countries you can’t get a job unless you have our certification.”

James: “But isn’t there controversy surrounding certification?”

Lois: “No. The controversy is only around other certification programs. When you’re the Big Dog, there is no controversy.” (Yes, she referred to her organization as the Big Dog.)

James: “Are you a tester?”

Lois: “No. I run the organization.”

James: “You run the organization but you are not a tester? How do you know this program is any good?”

Lois: “I work with testers. I trust the experts on our advisory board.”

James: “Are you aware of any recognized industry experts who oppose this certification?”

Lois: “Yes.”

James: “Can you name them?”

Lois: “Yes. Cem Kaner and James Bach.”

James: “I’m James Bach.”

Lois: “Oh, I didn’t recognize you.”

James: “Well, I’m glad you recognize that there is controversy.” (I turned to go.)

Lois: (She called after me…) “From your side!”

Here’s what I take from that conversation. The ASTQB is using a combination of ignorance (Lois is ignorant of testing) and feigned ignorance (she pretends that there is no controversy) in order to scare the ignorant into giving the ASTQB money and buying into the ASTQB world-view.

Plus, her organization is so arrogant that it believes it can ignore principled opposition to its behavior.

Please don’t support these fools and scoundrels.

Methodology Debates: Traps and Transformations

(This article is adapted from work I did with Johanna Rothman, at the 1st Amplifying Your Effectiveness conference. It’s never been widely published, so here you go.)

As a context-driven testing methodologist, I am required to think through the methods I use. Sometimes that means debating methodology with people who have a different view about what should be done. Over time, I’ve gained a lot of experience in debate. One thing I’ve learned is that most people have good ideas, but few people know how to debate them. This is too bad, because a successful debate can make a community stronger, while avoiding debates creates a nurturing environment for weak ideas. Let’s look at how to avoid the traps that make debates fail, and how to transform disagreement into powerful consensus.

Sometimes a debate is really part of a war. The advice below won’t help much if that is the case. This advice is more for situations where you are highly motivated to create or maintain a working relationship with someone you disagree with– such as when you work in the next cubicle from the guy.

Traps

  • Conflicting Terminology: Be alert to how you are using technical terms. A common term like “bug” has different meanings to different people. If someone says “Unit testing is absolutely essential to good software quality,” among your first concerns should be “What does he mean by ‘unit testing’, ‘essential’, and ‘quality’?” Beware, sometimes a debate about definitions bears important fruit, but it can also be another trap. You can spend all your energy on them without necessarily touching the marrow of the subject. On the other hand, you can allow yourself to understand and even use someone else’s terminology in your debate without committing yourself to changing your preferred terminology in general.
  • Paradigm Conflict: A paradigm is an all-inclusive way of explaining the world, generally tied into terminology and assumptions about practices and contexts. Two different paradigms may explain the same phenomena in totally different ways. When two people from different paradigms come together, each may seem insane to the other. Whenever you feel that your opponent is insane, maybe that’s time to stop and consider that you are trying to cross a paradigmatic boundary. In which case, you should talk about that, first.
  • Ambiguous Metrics: Don’t be seduced by numbers. They can mean anything. The problem is knowing what they do, in fact, mean. When someone quotes numbers at me, I wonder how the numbers were collected, and what influenced the people who collected them. I wonder if the numbers were sanitized in any way. For instance, when someone tells me that he performed 1000 test cases, I wonder if he’s talking about trivial test cases, or vital ones. There’s no way to know unless I personally review the tests, or conduct a detailed interview of the tester.
  • Confusing Feeling and Rationality: Beware of confusing feeling communication with rational communication. Be alert to the intensity of the feelings associated with the ideas being presented. Many debates that seem to be about ideas may indeed be about loyalty, trust, respect, and other fundamental issues. A statement like “C++ is the best language in the world. All other languages are garbage” may actually mean “C++ is the only language I know. I am comfortable with what I know. I don’t want to concern myself with languages I don’t already know, because then I feel like a beginner, again.” There’s an old saying that you can’t use logic to refute a conclusion that wasn’t arrived at through logic. That may not be strictly true, but it’s a helpful guideline. So, if you sense a strange intensity around the debate, your best bet may be to stop talking about ideas and start exploring the feelings.
  • Confusing Outcome and Understanding: Sometimes one person can be debating for the purpose of getting a particular outcome, while the other person is debating to understand the subject better, or help you understand them. Confusing these approaches can lead to a lot of unnecessary pain. So, consider saying what your goal is, and ask the other person what they want to get out of the debate.
  • Hidden Context: You may not know enough about the world the other person lives in. Maybe work life for them is completely different than it is for you. Maybe they live under a different set of requirements and challenges. Try saying “I want to understand better why you feel the way you do. Can you tell me more about your [life, situation, work, company, etc.]?”
  • Hidden History: You may not know enough about other debates and other struggles that shaped the other person’s position. If you notice that the other person seems to be making many incorrect assumptions about what you mean, or putting words in your mouth, consider asking something like “Have you ever had this argument with someone else?”
  • Hidden Goals: Not knowing what the other person wants from you. You might try learning about that by asking “Are we talking about the right things?” or “What would you like me to do?” Keep any hint of sarcasm out of your voice when you say that. Your intent should be to learn about what they want, because maybe you can give it to them without compromising anything that’s important to you.
  • False Urgency: Feeling like you are trapped and have to debate right now. It’s always fair to get prepared to discuss a difficult subject. You don’t have to debate someone at a particular time just because that person feels like doing it right then. One way to get out of this trap is just to say “This subject is important to me, but I’m not prepared to debate it right now.”
  • Flipping the Bozo Bit: If you question the sanity, good faith, experience, or intelligence of the person who disagrees with you, then the debate will probably end right there. You’ll have a war, instead. So, if you do that, in the heat of the moment, your best bet for recovery may be to take a break. When you come back, ask questions and listen carefully to be sure you understand what the other guy is trying to say.
  • Short-Term Focus: Hey, think of the future. Successful spouses know that the ability to lose an argument gracefully can help strengthen the marriage. I lose arguments to my wife so often that she gives me anything I want. The same goes for teams. Consider a longer term view of the debate. For instance, if you sense an impasse, you could say “I’m worried that we’re arguing too much. Let’s do it your way.” or “Tell you what: let’s try it your way as an experiment, and see what happens.” or “Maybe we need to get some more information before we can come to agreement on this.”

Transforming Disagreement

An important part of transforming disagreement is to synchronize your terminology and frames of reference, so that you’re talking about the same thing (avoiding the “pro-life vs. pro-choice” type of impasse). Another big part is changing a view of the situation that allows only one choice into one that allows many reasonable choices (the “reasonable people can bet on different horses” position). Here are some ideas for how to do that:

  • Transform absolute statements into context-specific statements. Consider changing “X is true” to “In situation Y, X is true.” In other words, make your assumptions explicit. That allows the other person to say “I’m talking about a different situation.”
  • Transform certainties into probabilities and alternatives. Consider changing “X is true” to “X is usually true” or “X, Y, or Z can be true, but X is the most likely.” That allows the other person to question the basis of your probability estimate, but it also opens the door to the possibility of resolving the disagreement as a simpler matter of differing opinions on probability rather than the more fundamental problem of what is possible.
  • Transform rules into heuristics. Consider changing “You should do X” to something like “If you have problem Y and want to solve it, doing something like X might help.” The first statement is probably a suggestion in the clothing of a moral imperative. But in technical work, we are usually not dealing with morals, but rather with problems. If someone tells me that I should write a test plan according to the IEEE-829 template, then I wonder what problem that will solve, whether I indeed have that problem, how important that problem is, whether 829 would solve it, and what other ways that same problem might be solved.
  • Transform implicit stakeholders and concerns into explicit stakeholders and concerns. Consider changing “X is bad” to “I don’t like X” or “I’m worried about X” or “Stakeholder Y doesn’t like X.” There are no judgments without a judger. Bring the judger out into the open, instead of using language that makes an opinion sound like a law of physics. This opens the door to talk about who matters and who gets to decide, which can be a more important issue than the decision itself. Another response you can make to “X is bad” is to ask “compared to what?”, which will bring out the unspecified standard.
  • Translate the other person’s story into your terms and check for accuracy. Consider saying something like “I want to make sure I understand what you’re telling me. You’re saying that…” then follow with “Does that sound right?” and listen for agreement. If you sense a developing impasse, try suspending your part of the argument and become an interviewer, asking questions to make sure the other person’s story is fully told. Sometimes that’s a good last resort option. If they challenge you to prove them wrong or demand a reply, you can say “It’s a difficult issue. I need to think about it some more.”
  • Translate the ideas into a diagram. Try drawing a picture that shows both views of the problem. Sometimes that helps put a disagreement into perspective (literally). This can help especially in a “blind men and the elephant” situation, where people are arguing because they are looking at different parts of the same thing, without realizing it. For instance, if I argue that testing should start late, and someone else argues that testing should start early, we can draw a timeline and put things on the timeline that represent the various issues we’re debating. We may discover that we are making different assumptions about the cost-of-bugs curve, at which point we can draw several curves and discuss the forces that affect them.
  • Translate total disagreement into shades of agreement. Do you completely disagree with the other person, or disagree just a little? Consider looking at it as shades of agreement. Is it total opposition, or is it just discomfort? This is important because I know, sometimes, I begin an argument with a vague unease about someone’s point of view. If they then react defensively to that, as if I’ve attacked them, then I might feel driven firmly to the other side of the debate. Sometimes when looking for shades of agreement, you discover that you’ve been in violent agreement all along.
  • Transform your goal from being right to being a team. Is there a way to look at the issue being debated as related to the goal of being a strong team? This is something you can do in your own mind to reframe the debate. Is it possible that the other person is arguing less from the force of logic and more from the fear of being ignored? If so, then being a good listener may do more to resolve the debate than being a good thinker. Every debate is a chance to strengthen a relationship. If you’re on the “right” side, you can strengthen it by being a gracious winner and avoiding I-told-you-so behavior. If you’re on the “wrong” side, you can strengthen the team by publicly acknowledging that you have changed your mind, that you have been persuaded. When you don’t know who is right, you can still respect feelings and consider how the outcome and style of the debate might harm your ability to work together.
  • Transform conclusions to processes. If the other person is holding onto a conclusion you disagree with, consider addressing the process by which they came to adopt that conclusion. Talk about whether that process was appropriate and whether it could be revisited.
  • Express faith in the other person. If the debate gets tense, pause and remind the other person that you respect his good faith and intentions. But only say that if it’s true. If it’s not true, then you should stop debating about the idea immediately, and deal instead with your feelings of mistrust. Any debate that’s not based on trust is doomed from the start, unless of course it’s not really a debate, but a war, a game, or a performance put on for an audience.
  • Wait and listen. Sometimes, a conversation looks like a debate, and feels like a debate, but is actually something else. Sometimes we just need to vent for a bit, and be heard. That’s one reason why being a good listener is not only polite, but eminently practical.
  • Express appreciation when someone tries to transform your position. When you notice someone making an effort to use these transformations in a conversation with you, thank them. This is a good thing. It’s a sign that they are trying to connect with you and help you express your ideas.

Question: How Many Times Should You Run a Test?

Kevin asks: What is the best or industry standard for how many times a test case should be run?

There are questions that should not be answered. For instance, “What size unicorn do you wear?” or “How many cars should I own?” Sure, I could answer them, but the answers are worthless. My answers are A) I don’t wear unicorns and B) 2. In these cases, the more helpful reply is to question the question. For the first question, perhaps you said “uniform” and I misheard you. For the second question, perhaps you own a railroad and you were talking about train cars of different kinds, whereas I assumed you’re a small family and you were asking about automobiles.

I can tell you this for sure: No one I respect in the testing field will give you a direct answer to the general question of how many times a test should be run (except maybe as a joke).

Imagine if the answer was 100,000. Would you believe it? What if the answer was 7? Wouldn’t you wonder what was wrong with 6? I can imagine 7 being the right answer, but only for a very specific hypothetical case, not as any sort of general principle.

The first potentially useful answer I have is to tell you that this question would not even occur to you if you knew how to test, therefore, what you really need to do is start learning how to test. I mean if someone was re-wiring your house, and during that process he asked you what “voltage” is, wouldn’t you get someone else to wire your house? Like electrical work, plumbing, computer programming, or welding, good testing is a skilled activity.

I rarely give that answer, though, because I worry I will just leave people feeling discouraged.

The closest thing to a direct answer I can give you is this:

There exist no testing industry standards that are universally binding or even, in my opinion, more than negligibly helpful. Yes, there are documents that purport to be standards. If you are bound by them then you already know that. You aren’t subject to standards unless one has been imposed upon you by a regulating authority or by contract. Therefore, considering that testing costs money and time, I suggest that you don’t run any tests unless there is a reason to do so. In general, don’t do the same work a second time if you have already done it once. Certainly, if your clients would benefit from you running a test again, go for it. Otherwise, you are just indulging in compulsive/obsessive behavior, and you need help of a different kind than I offer.

A problem with this answer is that it begs the question of how you know when to run a test again. Fortunately, I wrote an essay on possible reasons to repeat tests. I can think of ten good reasons that you may want to repeat any given test (along with one big reason not to).

That’s a pretty good answer, but I think I can offer a little more:

Your job is probably to discover if there are terrible as-yet-unknown problems in your very complex product that you have little time to test. To do that job really well requires that you design and perform many tests, more tests than you probably have time to run. Therefore, when you run a test a second time, you are spending precious time and resources (even if it’s automated, though possibly less so) on something other than running a test you have not yet run that may find one of those big bugs you haven’t yet found. Get it?

So, how about having a small set of very basic tests that touch upon a lot of features of the product? You may even want to automate these. It should take ten minutes to run these tests, ideally. Perhaps as long as an hour. Repeat those for every build. Their purpose is to quickly detect huge obvious things that may be wrong. Call that the smoke test suite. For everything else, make a test coverage outline that lists every significant element of the product and every significant element of data. Visit the items on that list and test each one according to its importance and potential for failure. Whenever any part of the product changes, try to figure out what could have been affected, and retest that area– but using different tests; perhaps variations on what you’ve already done.
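
Here is a sketch of what such a smoke suite might look like, in pytest-style Python. Everything imported from “myproduct” is a hypothetical name standing in for your own product’s entry points:

    import pytest
    from myproduct import start_app  # hypothetical product module

    @pytest.fixture(scope="module")
    def app():
        # Start the product once for the whole suite; shut it down afterward.
        application = start_app()
        yield application
        application.shutdown()

    def test_app_starts(app):
        assert app.is_running()

    def test_open_save_roundtrip(app, tmp_path):
        doc = app.new_document()
        doc.append_text("smoke")
        doc.save(tmp_path / "smoke.txt")
        assert (tmp_path / "smoke.txt").exists()

    def test_search_returns_results(app):
        assert app.search("smoke") is not None

The design point is breadth and speed: each check touches a different feature shallowly, so the whole suite stays fast enough to run on every build.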

By the way, the more you learn about testing, the less you will find advice like the preceding paragraph and sketch useful, because you will carry within you the ability to design your own test strategy that fits your specific purposes and contexts.

Question: Tester’s Freedom of Thought

Subha asks:
A tester is usually bound by the constraints of specifications when he does functional testing. But what about usability? How much should the tester’s imagination be allowed to flow?

Hello Subha,

Read carefully– this is important:

The specification does not bind you, as a tester. The specification provokes you. In fact, the spec, the product, the things people say on the project– all of it is provocative to the tester. It might be where we start, but not where we end. If the product behaves in some important way (important either positively or negatively), then it is generally the role of the tester to test it, even if there is nothing in the spec about that behavior.

Doing testing well requires a great deal of imagination.

I think a tester is actually bound by six things that come to mind as I write this: the mission, the project culture, the particular constraints of the project, the skill and knowledge of the tester, ethical standards, and legal standards. The first three of these are to some extent negotiable:

Mission: This is the problem that your clients want you to solve for them; the outcome they want you to achieve. If you don’t honor your mission, you will not gain credibility or retain respect. Be sure that you negotiate a mission you are capable of fulfilling; and be sure that the story of your testing features the mission as its primary plot point. Is usability testing a part of your mission?

Project Culture: The other people on your project have expectations about what you will or will not do. These bind you. You can challenge those expectations and suggest alternatives, but you have to be careful about that. If usability is part of your mission, what methods of usability testing are acceptable or expected within your organization?

Project Constraints: On your project, you don’t have all the time and money to do everything you might find useful or interesting. You may need to find inexpensive ways to do the testing that your strategy calls for. You may need to acquire special tools, or use tools that don’t do everything you wish they did. What kind of usability testing is it even possible to do on your project?

Tester Skill & Knowledge: Even if you were granted permission and resources to do everything you want to do, you would still be limited to the things that you know how to do. If you want to do testing well, you need enough command of testing practices and tools to make that possible. One common problem with testers is that they don’t do enough to educate themselves. Do you know how to do usability testing?

Ethical Standards: A tester is bound by ethical standards not to, for instance, lie about the results of the tests, or to misrepresent his ability to do the work. Are you suggesting usability testing for selfish reasons, or do you really believe it will help your client?

Legal Standards: A tester is bound by legal standards. In some cases, there are laws, such as Sarbanes-Oxley or HIPAA, which guide how you must test. Is there any legal reason why you must or must not perform usability testing?

I realize that this is not a detailed answer to your question. What I’m trying to do is frame a way for you to think the issue through for yourself.


Defining Agile Methodology

Brian Marick has offered a definition of agile methodology. I think his definition is strangely bulky and narrow. That’s because it’s not really a definition, but an example.

Those of us who’ve worked with Brian know that he doesn’t like to talk about definitions. He’d rather deal with rich examples and descriptions than labels. He worries that labels such as “Agile” and “Version Control” can easily become empty talismans that can turn a conversation about practice into a ritualized exchange of passwords. “Oh, he said Agile, that must mean he’s one of us.” I admire how Brian tries to focus on practice rather than labelling.

Where Brian and I part ways is that I don’t think we have a choice about labels and their definitions. When we decline to discuss definitions we are not avoiding politics, we are simply driving the politics underground, where it remains insidious and unregulated. To discuss definitions is to discuss the terms by which your community governs itself, so that we do not inadvertently undercut each other.

Here’s an example of how postponing a conversation about definitions can bite you. A few years ago, at the Agile Fusion peer conference I hosted at my lab in Virginia, Brian and I got into a heated debate about the meaning of the word “agile”. He said he was completely uninterested in the dictionary definition. He was interested only in how he felt the word was used by a certain group of people– which group, it turned out, did not include me, Cem Kaner, or very many of my colleagues who can legitimately claim to have been working with agile methodologies since the mid-eighties (or in one case, mid-sixties). Perhaps because of Brian’s reluctance to discuss definitions, our disagreement came up out of the blue. I don’t know if it surprised Brian, but it shocked me to discover that he and I were operating by profoundly different assumptions about agile methodology.

Actually, I have had many clashes with people who claim to own the word agile. It’s not just Brian. But some agilists in the capital “A” camp don’t limit themselves to that proprietary attitude. Ward Cunningham is a great example. Find Ward. Meet him. Talk to him. He gives agile methodology a good name. I have had similar positive experiences with Alastair Cockburn and Martin Fowler.

There are at least two agile software development communities, then. My community practices agile development in an open-ended way. We support the Agile Manifesto (in fact, I was invited to the meeting where the manifesto was created, but could not attend). However:

  1. We do not divide the world into programmers and customers.
  2. We do not demand that everyone on the project be a generalist, and then define generalist to be just another word for someone who remains ignorant of all skills other than programming skills.
  3. We believe there can be different roles on the team, including, for instance, the role of tester; and that people performing a role ought to develop skill in that art.
  4. We don’t limit our practices to fit guru-approved slogans such as “YAGNI” and “100% automated testing”, but instead use our skills to match our practices to our context.
  5. We don’t accuse people who question practices of “going meta” as if that is a sin instead of ordinary responsible behavior.
  6. We aren’t a personality cult. (if you ever hear someone justify a practice by saying “because James Bach said so” please email me so I can put a stop to it. I like being respected; I hate being a blunt object for ending a debate.)
  7. We don’t talk as if software engineering was invented in 1998.
  8. We question. We criticize. We learn. We change. We are agile.
  9. When we make definitions, we strive to be inclusive and try not to redefine ordinary English words such as “pattern” or “agile”. Specifically, we probably won’t say you just don’t “get it” if you cite the dictionary instead of using approved gurucabulary. GURUCABULARY: (noun) idiosyncratic vocabulary, often a redefinition of preexisting words, asserted by one thinker or thinkers as a way of establishing a proprietary claim on a field of interest.

I want to offer an alternative definition for use outside of the insular world of capital “A” agilists.

First, here is Webster’s definition of agile:

agile
Function: adjective
Etymology: Middle French, from Latin agilis, from agere to drive, act, more at AGENT
Date: 1581

1 : marked by ready ability to move with quick easy grace *an agile dancer*
2 : having a quick resourceful and adaptable character *an agile mind*

Here, then, is my definition of agile methodology:

agile methodology: a system of methods designed to minimize the cost of change, especially in a context where important facts emerge late in a project, or where we are obliged to adapt to important uncontrolled factors.

A non-agile methodology, by comparison, is one that seeks to achieve efficiency by anticipating, controlling, or eliminating variables, so as to avoid the need for changes and the associated costs of changing.

Brian Marick’s definition of agile methodology is an example of how one community approaches what I would call agile methodology. My definition is intended to be less imperialistic and more pluralistic. I want to encourage more of us to explore the implications of agility, without having to accept the capital “A” belief system whole.

Fight the power!