Testing and Checking Refined

This post is co-authored with Michael Bolton. We have spent hours arguing about nearly every sentence. We also thank Iain McCowatt for his rapid review and comments.

Testing and tool use are two things that have characterized humanity from its beginnings. (Not the only two things, of course, but certainly two of the several characterizing things.) But while testing is cerebral and largely intangible, tool use is out in the open. Tools encroach into every process they touch and tools change those processes. Hence, for at least a hundred or a thousand centuries the more philosophical among our kind have wondered “Did I do that or did the tool do that? Am I a warrior or just a spear-throwing platform? Am I a farmer or a plow pusher?” As Marshall McLuhan said “We shape our tools, and thereafter our tools shape us.”

This evolution can be an insidious process that challenges how we label ourselves and things around us. We may witness how industrialization changes cabinet craftsmen into cabinet factories, and that may tempt us to speak of the changing role of the cabinet maker, but the cabinet factory worker is certainly not a mutated cabinet craftsman. The cabinet craftsmen are still out there– fewer of them, true– nowhere near a factory, turning out expensive and well-made cabinets. The skilled cabineteer (I’m almost motivated enough to Google whether there is a special word for cabinet expert) is still in demand, to solve problems IKEA can’t solve. This situation exists in the fields of science and medicine, too. It exists everywhere: what are the implications of the evolution of tools on skilled human work? Anyone who seeks excellence in his craft must struggle with the appropriate role of tools.

Therefore, let’s not be surprised that testing, today, is a process that involves tools in many ways, and that this challenges the idea of a tester.

This has always been a problem– I’ve been working with and arguing over this since 1987, and the literature of it goes back at least to 1961– but something new has happened: large-scale mobile and distributed computing. Yes, this is new. I see this as the greatest challenge to testing as we know it since the advent of micro-computers. Why exactly is it a challenge? Because in addition to the complexity of products and platforms, which has been growing steadily for decades, there now exists a vast marketplace for software products that are expected to be distributed and updated instantly.

We want to test a product very quickly. How do we do that? It’s tempting to say “Let’s make tools do it!” This puts enormous pressure on skilled software testers and those who craft tools for testers to use. Meanwhile, people who aren’t skilled software testers have visions of the industrialization of testing similar to those early cabinet factories. Yes, there have always been these pressures, to some degree. Now the drumbeat for “continuous deployment” has opened another front in that war.

We believe that skilled cognitive work is not factory work. That’s why it’s more important than ever to understand what testing is and how tools can support it.

Checking vs. Testing

For this reason, in the Rapid Software Testing methodology, we distinguish between aspects of the testing process that machines can do versus those that only skilled humans can do. We have done this linguistically by adapting the ordinary English word “checking” to refer to what tools can do. This is exactly parallel with the long established convention of distinguishing between “programming” and “compiling.” Programming is what human programmers do. Compiling is what a particular tool does for the programmer, even though what a compiler does might appear to be, technically, exactly what programmers do. Come to think of it, no one speaks of automated programming or manual programming. There is programming, and there is lots of other stuff done by tools. Once a tool is created to do that stuff, it is never called programming again.

Now that Michael and I have had over three years’ experience working with this distinction, we have sharpened our language even further, with updated definitions and a new distinction between human checking and machine checking.

First let’s look at testing and checking. Here are our proposed new definitions, which soon will replace the ones we’ve used for years (subject to review and comment by colleagues):

Testing is the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc.

(A test is an instance of testing.)

Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

(A check is an instance of checking.)

Explanatory notes:

  • “evaluating” means making a value judgment; is it good? is it bad? pass? fail? how good? how bad? Anything like that.
  • “evaluations” as a noun refers to the product of the evaluation, which in the context of checking is going to be an artifact of some kind; a string of bits.
  • “learning” is the process of developing one’s mind. Only humans can learn in the fullest sense of the term as we are using it here, because we are referring to tacit as well as explicit knowledge.
  • “exploration” implies that testing is inherently exploratory. All testing is exploratory to some degree, but may also be structured by scripted elements.
  • “experimentation” implies interaction with a subject and observation of it as it is operating, but we are also referring to “thought experiments” that involve purely hypothetical interaction. By referring to experimentation, we are not denying or rejecting other kinds of learning; we are merely trying to express that experimentation is a practice that characterizes testing. It also implies that testing is congruent with science.
  • the list of words in the testing definition is not exhaustive of everything that might be involved in testing, but represents the mental processes we think are most vital and characteristic.
  • “algorithmic” means that it can be expressed explicitly in a way that a tool could perform.
  • “observations” is intended to encompass the entire process of observing, and not just the outcome.
  • “specific observations” means that the observation process results in a string of bits (otherwise, the algorithmic decision rules could not operate on them).
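As a rough illustration of the checking definition above (a sketch only; every name in it is hypothetical and none of it comes from the Rapid Software Testing methodology itself), a machine check can be pictured as a function that takes a specific observation, already reduced to concrete data, and applies an explicit decision rule to produce an evaluation artifact:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evaluation:
    """The artifact a check produces: a verdict plus the data behind it."""
    passed: bool
    observed: str
    expected: str


def product_under_test(name: str) -> str:
    """Stand-in for a real product; purely illustrative."""
    return f"Hello, {name}!"


def observe_greeting(product) -> str:
    """Hypothetical observation step: capture one specific output as data."""
    return product("World")


def check_greeting(product) -> Evaluation:
    """Apply an explicit decision rule (string equality) to the observation."""
    expected = "Hello, World!"
    observed = observe_greeting(product)
    return Evaluation(passed=(observed == expected),
                      observed=observed,
                      expected=expected)


result = check_greeting(product_under_test)
print(result)
```

Nothing in the function decides which observation is worth making, whether equality is the right rule, or what a failing evaluation means for the product. In the terms of this post, those decisions belong to the testing that surrounds the check.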

There are certain implications of these definitions:

  • Testing encompasses checking (if checking exists at all), whereas checking cannot encompass testing.
  • Testing can exist without checking. A test can exist without a check. But checking is a very popular and important part of ordinary testing, even very informal testing.
  • Checking is a process that can, in principle, be performed by a tool instead of a human, whereas testing can only be supported by tools. Nevertheless, tools can be used for much more than checking.
  • We are not saying that a check MUST be automated. But the defining feature of a check is that it can be COMPLETELY automated, whereas testing is intrinsically a human activity.
  • Testing is an open-ended investigation– think “Sherlock Holmes”– whereas checking is short for “fact checking” and focuses on specific facts and rules related to those facts.
  • Checking is not the same as confirming. Checks are often used in a confirmatory way (most typically during regression testing), but we can also imagine them used for disconfirmation or for speculative exploration (i.e. a set of automatically generated checks that randomly stomp through a vast space, looking for anything different).
  • One common problem in our industry is that checking is confused with testing. Our purpose here is to reduce that confusion.
  • A check is describable; a test might not be (that’s because, unlike a check, a test involves tacit knowledge).
  • An assertion, in the Computer Science sense, is a kind of check. But not all checks are assertions, and even in the case of assertions, there may be code before the assertion which is part of the check, but not part of the assertion.
  • These definitions are not moral judgments. We’re not saying that checking is an inherently bad thing to do. On the contrary, checking may be very important to do. We are asserting that for checking to be considered good, it must happen in the context of a competent testing process. Checking is a tactic of testing.
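Two of the implications above lend themselves to a small sketch: that a check can include code that comes before its assertion, and that checks can be generated for speculative exploration rather than confirmation. The code below is only an illustration under assumed names; `slugify` stands in for a product, and `reference_slugify` is a hypothetical, independently written oracle.

```python
import random
from typing import List


def slugify(title: str) -> str:
    """Hypothetical product code under check."""
    return "-".join(title.lower().split())


def check_slug_of_known_title() -> None:
    # Choosing the input, exercising the product, and capturing the output
    # are all part of the check...
    title = "Testing and Checking Refined"
    observed = slugify(title)
    # ...but only this line is the assertion.
    assert observed == "testing-and-checking-refined", observed


def reference_slugify(title: str) -> str:
    """Hypothetical oracle, written a different way, used for comparison."""
    words = [w for w in title.split(" ") if w]
    return "-".join(w.lower() for w in words)


def find_differences(runs: int, seed: int) -> List[str]:
    """Generate inputs at random and collect any where the outputs differ."""
    rng = random.Random(seed)
    alphabet = "aB cD"
    differences = []
    for _ in range(runs):
        length = rng.randint(0, 12)
        title = "".join(rng.choice(alphabet) for _ in range(length))
        if slugify(title) != reference_slugify(title):
            differences.append(title)
    return differences


check_slug_of_known_title()
print(find_differences(runs=1000, seed=0))
```

The randomized loop is still checking, because every step is an explicit rule a tool can apply. Deciding that these two functions are worth comparing, and working out what a reported difference means, is where the testing happens.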

Whither Sapience?

If you follow our work, you know that we have made a big deal about sapience. A sapient process is one that requires an appropriately skilled human to perform. However, in several years of practicing with that label, we have found that it is nearly impossible to avoid giving the impression that a non-sapient process (i.e. one that does not require a human but could involve a very talented and skilled human nonetheless) is a stupid process for stupid people. That’s because the word sapience sounds like intelligence. Some of our colleagues have taken strong exception to our discussion of non-sapient processes based on that misunderstanding. We therefore feel it’s time to offer this particular term of art its gold watch and wish it well in its retirement.

Human Checking vs. Machine Checking

Although sapience is problematic as a label, we still need to distinguish between what humans can do and what tools can do. Hence, in addition to the basic distinction between checking and testing, we also distinguish between human checking and machine checking. This may seem a bit confusing at first, because checking is, by definition, something that can be done by machines. You could be forgiven for thinking that human checking is just the same as machine checking. But it isn’t. It can’t be.

In human checking, humans are attempting to follow an explicit algorithmic process. In the case of tools, however, the tools aren’t just following that process, they embody it. Humans cannot embody such an algorithm. Here’s a thought experiment to prove it: tell any human to follow a set of instructions. Get him to agree. Now watch what happens if you make it impossible for him ever to complete the instructions. He will not just sit there until he dies of thirst or exposure. He will stop himself and change or exit the process. And that’s when you know for sure that this human– all along– was embodying more than just the process he agreed to follow and tried to follow. There’s no getting around this if we are talking about people with ordinary, or even minimal cognitive capability. Whatever procedure humans appear to be following, they are always doing something else, too. Humans are constantly interpreting and adjusting their actions in ways that tools cannot. This is inevitable.

Humans can perform motivated actions; tools can only exhibit programmed behaviour (see Harry Collins and Martin Kusch’s brilliant book The Shape of Actions, for a full explanation of why this is so). The bottom line is: you can define a check easily enough, but a human will perform at least a little more during that check– and also less in some ways– than a tool programmed to execute the same algorithm.

Please understand, a robust role for tools in testing must be embraced. As we work toward a future of skilled, powerful, and efficient testing, this requires a careful attention to both the human side and the mechanical side of the testing equation. Tools can help us in many ways far beyond the automation of checks. But in this, they necessarily play a supporting role to skilled humans; and the unskilled use of tools may have terrible consequences.

You might also wonder why we don’t just call human checking “testing.” Well, we do. Bear in mind that all this is happening within the sphere of testing. Human checking is part of testing. However, we think when a human is explicitly trying to restrict his thinking to the confines of a check– even though he will fail to do that completely– it’s now a specific and restricted tactic of testing and not the whole activity of testing. It deserves a label of its own within testing.

With all of this in mind, and with the goal of clearing confusion, sharpening our perception, and promoting collaboration, recall our definition of checking:

Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.

From that, we have identified three kinds of checking:

Human checking is an attempted checking process wherein humans collect the observations and apply the rules without the mediation of tools.

Machine checking is a checking process wherein tools collect the observations and apply the rules without the mediation of humans.

Human/machine checking is an attempted checking process wherein both humans and tools interact to collect the observations and apply the rules.

In order to explain this thoroughly, we will need to talk about specific examples. Look for those in an upcoming post.

Meanwhile, we invite you to comment on this.

UPDATE APRIL 10th: As a result of intense discussions at the SWET5 peer conference, I have updated the diagram of checking and testing. Notice that testing is now sitting outside the box, since it is describing the whole thing; a description of testing is inside of it. Human checking is characterized by a cloud, because its boundary with non-checking aspects of testing is not always clearly discernible. Machine checking is characterized by a precise dashed line, because although its boundary is clear, it is an optional activity. Technically, human checking is also optional, but it would be a strange test process indeed that didn’t include at least some human checking. I thank the attendees of SWET5 for helping me with this: Rikard Edgren, Martin Jansson, Henrik Andersson, Michael Albrecht, Simon Morley, and Micke Ulander.

65 thoughts on “Testing and Checking Refined”

  1. “Testing is the process of evaluating a product by learning about it through experimentation, which includes to some degree: questioning, study, modeling, observation and inference”

    James, I’ve read the above description a few times, but I fail to comprehend a particular part of this description. Maybe it’s ignorance on my part, but can you explain why or how testing includes only “to some degree:” questioning, study, modelling, observation and inference?

    [James’ Reply: It means two things. A) That not ALL questioning is testing, not ALL study is testing, etc., but just that which directly relates to the goal of testing. B) That the list of activities is not complete.]

    Are you implying that there may be (or are) more ways of evaluating a product through experimentation other than questioning, study, modeling etc.?

    [James’ Reply: Yes.]

    Or are you saying that testing consists of experimentation through part questioning, part study, part modelling, part observation and part inference? (An analogy of my question is how a dish (test) is prepared… 2 cups of water (questioning), 2 teaspoons of salt (study) etc.).

    [James’ Reply: Yes, that, too.]

    Hope my questions make sense. By the way, nice Captcha words! 🙂

  2. Thanks Michael, I’ve been doing exactly that after I read James’ response. Your blogs opened my mind to many things I didn’t see before (but wanted to).

  3. I believe in human’s high cognitive capacity. This is something a person can build inside the brain. There’s a very interesting book on that called “The Talent Code” by Dan Coyle.

  4. [James’ Reply: Thank you for these interesting comments. My replies are below.]

    Great Article, very thought provoking. The comparison between machine checking and human checking I thought was really good. Thank you. See my comments below. NB: Apologies in advance if some of my comments have already been made as I haven’t read through all the comments made by others.

    “This situation exists in the fields of science and medicine, too. It exists everywhere: what are the implications to skilled human work of the evolution of tools? ”

    RF: I think this should be ‘what are the implications of the evolution of tools on skilled human work? ‘

    [James’ Reply: That is indeed another way to say it. I guess your way is less confusing, so I will change mine to yours.]

    “We want to test a product very quickly. …”

    RF: Do you mean testing quickly in terms of running more tests in a shorter period? Testing a product quickly is what those who think about cost first would say.

    [James’ Reply: “Running more tests in a shorter period” doesn’t really mean anything, because there is no general meaning to the numbering of tests. I mean we need to find those important bugs as soon as reasonable. We need to work with urgency.

    I don’t know what you mean about people who “think about cost first.” I think we all want to find bugs quickly, instead of slowly.]

    “…We believe that skilled cognitive work is not factory work. That’s why it’s more important than ever to understand what testing is and how tools can support it.”

    RF: I think this is a brilliant distinction to make. A warrior learns his craft and can apply it using any tool. A spear thrower learns his tool and attempts to apply it to any situation, limiting his craft to one tool. As testers we shouldn’t be letting our tools shape us but rather we should learn our craft. People seem to be learning the tools and calling themselves craftsmen which is the problem, this might be driven by the assumption or fact that products are becoming too complex for the craft.

    [James’ Reply: Are you familiar with the Book of Five Rings?]

    “Testing is the process of evaluating a product by learning about it through experimentation, which includes to some degree: questioning, study, modeling, observation and inference.
    (A test is an instance of testing.)
    Checking is the process of making evaluations by applying algorithmic decision rules to specific observations of a product.
    (A check is an instance of checking.)”

    RF: I am assuming the distinction is purely down to Testing being continuous experimentation whereas checking refers to stopping after evaluating an observation, i.e. an expected result?

    [James’ Reply: It has nothing to do with continuity. The difference is mainly that checking is algorithmic and testing is not. Testing involves creative interpretation and analysis, and checking does not.]

    I think a check is an instance of testing but we should identify in testing when it’s a check or a test. I think this will encounter the same pushback that exploratory testing constantly seems to face.

    [James’ Reply: Sorry, but a check is not an instance of testing, any more than tires are an instance of a car. A check is part of testing.]

    In organisations they seem more keen on checks than tests, and it seems a lot of testers instead of pushing to add more value are satisfied with just doing what customers want and ignoring tests. Checks come across as confidence builders, which do not focus on finding the important bugs. Testing ends up being just a confidence activity which is just a fraction of what testing is.

    [James’ Reply: I don’t think it’s fair to say that organizations prefer checking to testing, because such organizations don’t know what they are doing. They don’t know enough to have a preference.

    In other words, any organization that seems to have that preference can be moved to a different preference by a skilled tester.]

    I agree with the concept but my only disagreement is that Checks are instances of testing as they are being done to evaluate a product by learning about it through observations. Even if you only learn about an observation/expected outcome.

    [James’ Reply: You don’t actually disagree. What you are doing here is not understanding. If you don’t understand something, whether or not you agree is irrelevant.

    By saying checks are instances of testing you are committing exactly the error that we are trying to stamp out. You are confusing testing with checking. A check is part of testing, not an instance of it. Testing means experimenting. Checking is not experimenting (although it’s an important part of experimenting).]

    A question I raised to Michael regarding checking vs testing comment on twitter: Is executing theories in the form of test ideas/cases, checking or testing?

    [James’ Reply: There is no such thing as “executing theories” as far as I know. A theory cannot be executed because a theory is not a procedure.]

    My initial answer: This is based on my thinking that we experiment with theories, ask questions, observe etc during the test design process hence test ideas that get generated are still tests.

    [James’ Reply: That sounds like testing.]

    My new answer: Based on the above definition, if no further test ideas are generated after evaluating the theory then it’s a check, but if the theory is evaluated and based on the learning/outcome further test ideas are generated then it’s a test as long as the cycle of learning continues. Hence test ideas are still checks when we don’t go past the initial observation identified during evaluation of the test idea.

    [James’ Reply: I don’t get this at all. You’ll have to explain it to me over a beer.]

    “•“learning” is the process of developing one’s mind. Only humans can learn in the fullest sense of the term as we are using it here, because we are referring to tacit as well as explicit knowledge.”

    RF: Should this be Human learning as by the definition of learning, Machines are also capable.

    [James’ Reply: I don’t feel the need to emphasize that “machine learning” is not learning in the human sense.]

    “•“experimentation” implies interaction with a subject and observation of it as it is operating, but we are also referring to “thought experiments” that involve purely hypothetical interaction. By referring to experimentation, we are not denying or rejecting other kinds of learning; we are merely trying to express that experimentation is a practice that characterizes testing. It also implies that testing is congruent with science.”

    RF: During experimentation we observe the products behavior to interactions, its environment etc…it’s not just the product we are observing.

    [James’ Reply: Yes.]

    “Human Checking vs. Machine Checking…”

    RF: I agree, my initial thoughts being humans find it difficult to follow/stick to rules whereas machines thrive on applying rules, processes etc. Machines can perform a streamlined version of human activities, with boundaries/constraints intentionally/unintentionally set. What machines can do is human activity stripped down into logic and programmed by a human to be executed by a machine.
    My only disagreement is that Machine checking is an outcome/output of Human checking, hence should be within Human checking?

    [James’ Reply: That would miss the entire point of making this distinction. I don’t say “human checking” to denote that it is done by humans. I say it to distinguish it from how machines work.]

  5. Following up a discussion on Twitter: how do the ideas of test vs checking relate to the concept of scripted testing vs exploratory testing? I feel that there are similarities between the concepts. If we for instance imagine “pure” scripted testing, will that only contain checks and no testing?

    [James’ Reply:

    “Scripted testing” is about what controls testing. “Checking” is about the subset of testing that can be scripted (whether or not it is scripted.)

    Scripted testing means testing that in some *aspect* is scripted. Scripted means controlled from the outside of the conscious test execution process, rather than being completely controlled consciously by the tester. There are many ways in which bits and pieces of testing might be scripted. If *everything* is scripted, however, I would not call that testing. Testing is a thoughtful process of investigation and consideration. We can’t script that.

    What I once called “pure scripted testing” I would now call checking. Although it’s part of testing, I don’t think we should call it testing (the same way we don’t call a leaf a “tree”). My language has evolved.

    So, checking CAN be scripted but it isn’t necessarily so. All machine checking is scripted. All human checking is an attempt by a human to emulate machine checking, so it is often scripted, too. Checking is not necessarily scripted, however, because checking may be pursued in the moment, under the control of the tester. What makes checking “checking” is not that it is scripted in some way or another, but rather that it is *completely* scriptable.]

  6. Hi James,
    I’m wondering if the intention of the user’s actions (or process) is important when determining whether we are testing or just checking?

    [James’ Reply: Yes, it is important. In The Shape of Actions, a book that is the primary basis of our distinction between checking and testing, intention is discussed at length. You can’t do testing in the full sense of the word without the intent to test.]

    Clearly tools do not have any intent, they just follow a script. People may also follow a script without thought or intent. But, a good tester with a desire to learn will be testing even when following a script. A tester may also utilise a tool to carry out checks on their behalf; the tool may be just checking but the tester will be experimenting or exploring through the tool.

    [James’ Reply: If so, then by definition the tester is not ONLY following the script, because that allows only the “intent to follow” but not the intent to test. Meanwhile, a human is never ONLY following a script. If a fire alarm goes off, he will stop the check. If the system hangs, he will stop the check (and there is rarely a sure way to tell that a system has actually hung, so that is going to be outside the script, too.)

    Scripted testing and exploratory testing exist on a continuum. Even very scripted testing has some exploratory elements to it. This is one reason we distinguish between machine checking, which can be absolutely and purely algorithmic, and human checking, which will never quite be so.]

    Without intention in the definition:
    We could attempt to make discoveries through experimentation but not actually learn anything that helps us to evaluate a product.
    Is this not to be considered testing because we failed to learn anything to help in an evaluation? – I’d say it is still testing.

    [James’ Reply: It may be poor testing– it would have to be if you learned nothing or if all that you learned was false– but it’s testing.]

    In contrast we may observe an anomaly in the results of an automated suite that gives us salient information even if we had no intention of learning.

    [James’ Reply: How could that be? If you were running an automated set of checks, then you had intent to do so, right? That intent is probably to test the product. Or are you saying that someone delivering a pizza who happened to glance at a screen might “notice an anomaly that gives us salient information?” If so, then such a person would probably ignore it, but in any case I would not call that testing.

    There may be testing-like actions that people engage in for some purpose other than testing. But why worry about that? My concern is for testers and testing and people who do intend to learn the status of their products.]

    During the run there was just a machine performing checks. When it finishes and we examine the output, does this retrospectively become testing because we can make a value-based judgement from our observation? Seems to me the checks are still just checks; the testing doesn’t begin until after human interaction.

    [James’ Reply: It’s not that it retroactively becomes testing (although there might be such a thing in the case of a delayed oracle, whereby you see a specification today, and suddenly realize that a behavior you saw a week ago was actually a bug, thus rendering what had been a tour of the product into what retroactively becomes a test), it’s that the checking process is embedded ALL ALONG in a testing process. You can’t design the checks without engaging in testing (possibly poor testing), you can’t decide to use the checks without engaging in testing, and you can’t interpret the results of the checks without engaging in testing.

    What we are calling checking is ONLY found embedded in testing. It’s possible that what someone else might fairly call checking could be outside of testing, but in Rapid Testing methodology our sense of checking is parasitic on our sense of testing.]

    Or, is there perhaps a grey area between where checking stops and testing begins?

    [James’ Reply: Look at our diagram in the post. Checking is inside of testing. Also, notice the cloud shape and the dashed lines? Those are intended to suggest fuzziness of different kinds.]

    I would like to further illustrate with the following example:
    I was recently working on a large and complicated but poorly documented application undergoing transformation to a new platform. Knowing that testing resources were stretched and being charged with creating an automated regression suite we had to be creative in our activities. We used SQL queries to introduce some variability into the data for our automated scripts. This would allow us to perform the necessary regression checks whilst leaving open the opportunity to discover things about the product.

    [James’ Reply: Excellent idea.]

    For example a script to test might fail because:
    a) of a regression error,
    b) our (testing) understanding of the business requirements was not correct [so we needed to update the SQL query and/or test – but we learnt something],
    c) the script executed a scenario that had not been tested before (or possibly even documented) and the code had not been developed according to the business need, though not technically a regression error (again we learnt something).

    [James’ Reply: Also, D) a tool/platform failure, E) a bug in your checking code, or F) the state of the checking system or system-under-test was disturbed by some external agent.]

    Although the primary objective of this activity was regression checking, we were also learning about the system when the checks failed. I’m not sure I’d call this experimentation although there was the intention of learning through observation, so I’m not sure if it meets your definition of “testing” or not. Maybe this is an example of what you’d call “speculative exploration”, though I wouldn’t call the inputs exactly random.

    [James’ Reply: By any scientific notion of “experiment” I know, you are running an experiment. You are testing. Your primary objective is testing. You are wishing to know the status of the product. It is motivated by this research question “what is the status of my product after these changes?” You are using a machine-checking process to fulfill most of the fact gathering and initial evaluation of the results. Now, it may be poor testing, I don’t know. That depends on the quality of your design and management of the test-that-includes-the-checks.

    The way to make that not testing (and also not checking) would be to perform that process as a ceremony: press the button, wait until it stops, perhaps email the results to a third party who ignores it. Do this with the intent of “not getting yelled at.” Now it’s not even checking, because no fact gathering that might theoretically have happened has made any difference to anyone.]

    When the tests failed we learnt something, but if the tests all passed we didn’t (at least for that run).

    [James’ Reply: Not so! You learn that the checks did not detect failure. This you did not know before you ran them. If you DID know, then that wouldn’t be checking because you would already have gathered all the facts and the so-called “checks” would not be gathering anything new. You learned that no smoke came out of the machines. You learned that they didn’t hang. You learned how long it took them to complete, and if that were to be much different in time than you were used to, you would have investigated that.]

    We could’ve built the tests with hard coded data just to check for regression issues – I’d call this checking.

    [James’ Reply: Both your randomized thing and the hard-coded thing are checks under the definition of checking in the post you are commenting on.]

    However, we built the tests with the intention to learn, expecting to find issues and knowing that we would have to investigate and evaluate. We may have only been utilising automated checks (most of the time) but I’d still call this testing.

    [James’ Reply: The entire process is testing. The scriptable fact-gathering PART of it is checking.]

    I’d welcome your feedback.

    [James’ Reply: My feedback is that you are obviously a man of philosophical chops. I hope you continue to comment on my work. It will help me be sharper.]
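
    The distinction drawn in this exchange can be sketched in code. This is only an illustration under stated assumptions: `product_sort` is a hypothetical stand-in for the product, and the function names and the shape of the report are invented here, not taken from the post. It shows a hard-coded check and a randomized check side by side (both are checks), and a run that still yields facts when everything passes: that no failure was detected, and how long the run took.

```python
import random
import time
from collections import Counter


def product_sort(items):
    """Hypothetical stand-in for the product under test."""
    return sorted(items)


def hard_coded_check():
    """A check with fixed input and a fixed expected output."""
    return product_sort([3, 1, 2]) == [1, 2, 3]


def randomized_check(rng):
    """A check with generated input; the decision rule is still algorithmic."""
    data = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
    out = product_sort(data)
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_items = Counter(data) == Counter(out)
    return ordered and same_items


def run_checks(seed=0, rounds=50):
    """Run both checks and record facts that exist even when all pass."""
    rng = random.Random(seed)  # seeded so a failing run can be reproduced
    start = time.perf_counter()
    results = {
        "hard_coded": hard_coded_check(),
        "randomized": all(randomized_check(rng) for _ in range(rounds)),
    }
    elapsed = time.perf_counter() - start
    return {"results": results, "elapsed_seconds": elapsed}


if __name__ == "__main__":
    print(run_checks())
```

    Deciding what to check, reading the report, and investigating a surprising result or an unusual run time are the parts this sketch cannot contain. In the terms of the reply above, those belong to the testing that surrounds the checks.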

  7. Hi James,

    I read through some older posts from Michael Bolton and you on the topic of ‘testing and checking’. But there is a point in this post that makes me think.

    You say:
    “Testing is the process of evaluating a product by learning about it through experimentation, which includes to some degree: questioning, study, modeling, observation and inference.” and an implication
    “Testing is an open-ended investigation ….” and
    “evaluating means making a value judgment”.

    Is making a value judgement about a product really a testing task? I have a rule in mind that says “Never be the gate keeper.” How are testers supposed to say whether a product is good or bad? In addition, making a value judgement and investigating a product in an open-ended way seem to be at odds. Didn’t you say testing provides information for someone who has the authority to make an informed decision or a value judgement about a product?
    Have testers mutated into stakeholders?

    [James’ Reply: This is a great question, Florian.

    The value judgments we make as testers are not personal judgments, but rather judging with respect to the values of our stakeholders. When I see a bug, I am always asking myself whether my clients would consider that a threat to the value of the product, whether or not I personally might think so.

    The judging that stakeholders do can be yoked directly to decisions about the product, whereas the judgments testers make relate only to the direction of their work and what (and how) they report.

    If someone asks me my opinion of whether the product should ship, I am happy to give it, as long as they accept that as information pursuant to their decisions, and do not treat it AS the decision.]

  8. Hi James,

    There is another thought that came to my mind while I was thinking about your post. What is the essence of testing? Let’s do a short thought experiment:

    I experiment a little bit with a tablet on display in a store, just for fun after work. I’m not evaluating it with a view to buying it. I am only doing this for fun, but I am learning something about the product. Is this testing?

    [James’ Reply: That is not yet testing, but it may retroactively become testing. If you did this for a while and then decided you did want to test the tablet, then what you had just done would certainly be considered part of that testing process. So, it’s testing-like. It’s “proto-testing.”

    The reason why I would not call it testing is that you claim to be making no attempt to evaluate the product.]

    You wrote a reply on one comment here that says “You can’t do testing … without the intent to test”. I repeat your definition of testing here, too: “Testing is the process of evaluating a product by learning about it….”.

    I act consciously and in an act of volition in my thought experiment above, but I only want to play and learn(!) a little bit with the tablet. Is it testing?

    [James’ Reply: I see no good reason to call that testing by my definition that you cited.]

    Sure, there is no obvious intent for testing, but how can I differentiate between an intent for testing and other intents?

    [James’ Reply: You are the only one who can! Tell us what your intent is. If you don’t know your intent, then who else does?]

    That leads to my starting question “What is the essence of testing?” in your understanding?

    [James’ Reply: The essence of testing is to shine light so that others do not have to work in darkness. This is not merely the fun of waving a torch at night, but shining that light with purpose; as a service.]

  9. James and MB,

    Thank you for this clear discussion of the terms. I’ve been doing it wrong, and this is the perfect time to fix my work…

    In developing the MetaAutomation pattern language, I’ve been keeping in mind the vast difference in design and artifact between a manual test (executed by a human) and an “automated test” (executed by a test harness etc.) and how much I’ve struggled with widespread ignorance that there’s any difference there at all. I’ve even written a bunch about it for my book, but my treatment of that topic didn’t quite feel right.

    You’ve solved the problem for me: there is no such thing as an “automated test,” or at least not a fully “automated test.” These are called “checks.” If somebody talks about running automated tests, it’s just because they haven’t gotten the news yet.

    [James’ Reply: It’s like calling a whale a “fish.” It’s fine to do that if you aren’t a practicing biologist, but if your work involves research about sea life, you do need to keep the details straight. And it’s fine to say that a television is a sort of “babysitter” unless you are serious about daycare or parenting, in which case you do need to know that a TV cannot perform the role of a care-giver. Our terminology becomes refined to the degree that we need deeper or more reliable solutions.]

    I need to (ahem) make some checks with you on how you guys use some words
    • An assertion is an operation that doesn’t interact with the system under test (SUT) but does have a Boolean result

    [James’ Reply: I would simply say that an assertion is condition logging. When a certain condition is true, log that it is true. How the process of such logging interacts or does not interact with the SUT doesn’t seem to me to affect its status as an assertion. In fact, the very act of logging does interact with the SUT (affecting timing, disk space, network bandwidth, or something like that), although usually not in any material way. Assertions are, I suppose, the simplest kind of check.]

    • A check (as defined in your post) is zero or more operations on the SUT plus zero or one assertion, all fully automated

    [James’ Reply: A check does not have to be automated at all. The key idea is that the check CAN be automated. That’s what makes it special and interesting. I think you would have to have at least one assertion, though, or else I don’t see how it could fit the definition of a check, since no evaluation would be happening.]

    Does this make sense?

    Thanks to Michael Bolton for referring me to this blog post, and thanks in advance to James Bach for his feedback especially because I know he won’t pull his punches.
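
    The two clarifications in this exchange can be put into a small sketch. It rests on assumptions made only for illustration: the names `assert_condition` and `check`, the list used as a log, and the sample observation are all hypothetical. An assertion is modelled as condition logging, and a check as an observation evaluated by at least one assertion. Whether a machine or a person carries out the steps is left open.

```python
def assert_condition(name, condition, record):
    """Assertion as condition logging: record whether the condition held."""
    held = bool(condition)
    record.append((name, held))
    return held


def check(observe, rules):
    """Make one observation, then evaluate it with one or more named rules.

    observe: callable returning a specific observation of the product.
    rules:   list of (name, predicate) pairs; each predicate is an
             algorithmic decision rule applied to the observation.
    """
    if not rules:
        # With no assertion there is no evaluation, so this is not a check.
        raise ValueError("a check needs at least one assertion")
    observation = observe()
    record = []
    for name, rule in rules:
        assert_condition(name, rule(observation), record)
    passed = all(held for _, held in record)
    return passed, record


if __name__ == "__main__":
    # Hypothetical observation standing in for a real response from a SUT.
    passed, record = check(
        lambda: {"status": 200, "body": "ok"},
        [
            ("status is 200", lambda o: o["status"] == 200),
            ("body is not empty", lambda o: len(o["body"]) > 0),
        ],
    )
    print(passed, record)
```

    The empty-rules case raises an error rather than passing quietly, which mirrors the point in the reply that a check with zero assertions does no evaluating.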

  10. A good explanation of your concept. The distinction is useful. Unfortunately, it doesn’t align with how the words are used out there in the real world.

    [James’ Reply: Oh, I think it does. I don’t encounter much resistance.]

    You might encounter a lot less resistance if you came up with a new word for what you refer to as “Testing but not Machine Checking”. Maybe stick with “Exploring”.

    [James’ Reply: Exploring would obviously not be the right word for that. Also, there’s no need for a word for that, just as I have no need for a word for a car-with-everything-except-tires.]

    BTW, if my memory of how Venn diagrams work serves me well, your diagram contradicts your words.

    [James’ Reply: This is obviously not a Venn diagram. But it is related to one.]

    Well, actually, I’m not even sure how to interpret the diagram since Testing is outside the oval.

    [James’ Reply: I agree that you don’t know how to interpret the diagram. The word “Testing” is the title of the diagram, not part of it. I can understand how you might be confused, but it is not unusual for people to put titles on their diagrams that do not participate in the semantics of the diagram itself.]

    Is it just an oval that is so big that we cannot see any part of it?

    [James’ Reply: No.]

    If so, everything inside the large oval is part-of or a kind-of Testing.

    [James’ Reply: Okay, since you want to be precise let’s use words properly: it’s an ellipse, not an oval. I’m sure you already know the difference, but it’s fun and useful for me to point out that you, yourself, in criticizing my perceived inexactitude, have also found that in human communication, words are heuristics. How precisely we use them varies with the situation. I like to be quite precise, but I am rarely as precise in natural language/communication as I am when I write code. You sensed that you could say “oval” without sounding like a buffoon or confusing me– and you were right.

    In this case, my intent is to be very precise. So, I’m taking your criticism seriously.

    My intent in this diagram is to show an ellipse that represents testing, and that everything inside the ellipse is a part of testing. This comports with what you have supposed. So, yay!]

    And Machine Checking is a kind of “Learning by experimenting. …” Which doesn’t sound right.

    [James’ Reply: Not “a kind of.” Instead, “a part of” which is what you just suggested as a possibility (and you were right). Machine checking is an optional part of testing. I tried to imply “optional” partly by making the line for that ellipse dashed.]

    So maybe “Learning by experimenting” needs its own oval which doesn’t intersect with Machine Checking and maybe overlaps Human Checking a little bit.

    [James’ Reply: If I had been trying to suggest that machine checking exists outside of testing, then I would have done that. But my claim is that it does not exist outside of testing.]

    And Testing should be inside the large oval to indicate that all these kinds of checking and experimenting are different kinds of testing.

    [James’ Reply: Unless “Testing” is the title of the diagram, which is the case, here. Titles are conventionally placed outside of the diagram, although exceptions to that rule are not uncommon.]

    But wait, aren’t you saying that Machine Checking is not Testing?

    [James’ Reply: Yes. Just as tires are not the car. Tires are part of a car. Understand how that works?]

    Ack! That implies the oval for Testing should intersect Human Checking but not Machine Checking.

    [James’ Reply: Not if you interpret this diagram correctly.]

    The bottom line is: Since you are advocating for precision in language, it should be accompanied by precision in diagramming. 😉

    [James’ Reply: Of course, I agree. However, I also need my readers to be reasonably careful (and charitable) in their interpretations of my diagrams. Right under the diagram you are referring to are words (which are not part of the diagram, but you correctly understood that). The words read as follows:

    ‘You might also wonder why we don’t just call human checking “testing.” Well, we do. Bear in mind that all this is happening within the sphere of testing. Human checking is part of testing.’

    So, what I’m saying is, you might find it easier to understand the diagram if you read the accompanying post more carefully.]

  11. I’m having a hard time coming to a conclusion on the following thought I had, could you please offer your insight – thanks in advance.

    How do you decide which checks are worth automating and which checks are best left to be done manually? Automation seems like an obsession now and it feels to me that the pervading feeling is that if it can be automated, then automate it. That doesn’t sit right with me, since there’s an overhead in creating and maintaining the check.

    [James’ Reply: The answer to that relies on many specific factors such as: what tools I have, what skills I have with tools, what product risk is associated with the check, what else I have to do, how much the situation is changing, how complex the oracle is, how testable the product is, what my team wants me to do, what I want to do, etc.

    I must resist doing things merely because I feel like it, or merely because I haven’t bothered to think about alternative things I could do. I am here to find important bugs quickly.]

  12. Though a bit late to comment… it is a wonderful problem you are dealing with in your article!

    Perhaps the following subject might be of interest to you: well-structured vs. ill-structured problems. Herbert A. Simon [1] seems to have been one of the first people to examine ill-structured problems [2]. Although his focus was AI, the two subjects seem to be very close to each other, i.e. well-structured vs. ill-structured problems and testing vs. checking. The topic also seems to have a substantial literature…

    [1] https://en.wikipedia.org/wiki/Herbert_A._Simon
    [2] Herbert A. Simon: The structure of ill structured problems

  13. s/Am I a warrior or a just spear throwing /Am I a warrior or just a spear throwing

    Greets from New Zealand 🙂

    [James’ Reply: Thanks!]
