Round Earth Test Strategy

The “test automation pyramid” is a popular idea, but I see serious problems with it. In this article I suggest an alternative way of thinking that preserves what’s useful about the pyramid, while minimizing those problems:

  1. Instead of a pyramid, model the situation as concentric spheres, because the “outer surface” of a complex system generally has “more area” to worry about;
  2. ground it by referencing a particular sphere called “Earth” which is familiar to all of us because we live on its friendly, hospitable surface;
  3. illustrate it with an upside-down pyramid shape in order to suggest that our attention and concern is ultimately with the surface of the product, “where the people live” and also to indicate opposition to the pyramid shape of the Test Automation Pyramid (which suggests that user experience deserves little attention);
  4. incorporate dynamic as well as static elements into the analogy (i.e. data, not just code);
  5. acknowledge that we probably can’t or won’t directly test the lowest levels of our technology (i.e. Chrome, or Node.js, or Android OS). In fact, we are often encouraged to trust it, since there is little we can do about it;
  6. use this geophysical analogy to explain more intuitively why a good tooling strategy can access and test the product on a subterranean level, though not necessarily at a level below that of the platforms we rely upon.

Good analogies afford deep reasoning.

The original pyramid (really a triangle) was a context-free geometric analogy. It was essentially saying: “Just as a triangle has more area in its lower part than its upper part, so you should make more automated tests on lower levels than higher levels.” This is not an argument; this is not reasoning. Nothing in the nature of a triangle tells us how it relates to technology problems. It’s simply a shape that matches an assertion that the authors wanted to make. It’s semiotics with weak semantics.

It is not wrong to use semantically arbitrary shapes to communicate, of course (the shapes of a “W” and an “M” are opposites, in a sense, and yet nobody cares that what they represent are not opposites). But at best, it’s a weak form of communication. A stronger form is to use shapes that afford useful reasoning about the subject at hand.

The Round Earth model tries to do that. By thinking of technology as concentric spheres, you understand that the volume of possibilities– the state space of the product– tends to increase dramatically with each layer. Of course, that is not necessarily the case, because a lot of complexity may be locked away from the higher levels by the lower levels. Nevertheless that is a real and present danger with each layer you heap upon your technology stack. An example of this risk in action is the recent discovery that HTML emails defeat the security of PGP email. Whoops. The more bells, whistles, and layers you have, the more likely some abstraction will be fatally leaky. (One example of a leaky abstraction is the concept of “solid ground,” which can both literally and figuratively leak when hot lava pours out of it. Software is built out of things that are more abstract and generally much more leaky than solid ground.)

When I tell people about the Round Earth model they often start speaking of caves, sinkholes, landslides, and making jokes about volcanoes and how their company must live over a “hot spot” on that Round Earth. These aren’t just jokes, they are evidence that the analogy is helpful, and relates to real issues in technology.

Note: If you want to consider what factors make for a good analogy, Michael Bolton wrote a nice essay about that (Note: he calls it metaphor, but I think he’s referring to analogies).

The Round Earth model shows testing problems at multiple levels.

The original pyramid has unit testing at the bottom. At the bottom of the Round Earth model is the application framework, operating environment, and development environment– in other words, the Platform-That-You-Don’t-Test. Maybe someone else tests it, maybe they don’t. But you don’t know and probably don’t even think about it. I once wrote Assembler code to make video games in 16,384 bytes of memory. I needed to manage every byte of memory. Those days are long gone. Now I write Perl code and I hardly think about memory. Magic elves do that work, for all I know.

Practically speaking, all development rests on a “bedrock” of assumptions. These assumptions are usually safe, but sometimes, just as hot lava or radon gas or toxified groundwater breaks through bedrock, we can also find that lower levels of technology undermine our designs. We must be aware of that general risk, but we probably won’t test our platforms outright.

At a higher level, we can test the units of code that we ourselves write. More specifically, developers can do that. While it’s possible for non-developers to do unit-level checks, it’s a much easier task for the devs themselves. But, realize that the developers are working “underground” as they test on a low level. Think of the users as living up at the top, in the light, whereas the developers are comparatively buried in the details of their work. They have trouble seeing the product from the user’s point of view. This is called “the curse of expertise:”

“Although it may be expected that experts’ superior knowledge and experience should lead them to be better predictors of novice task completion times compared with those with less expertise, the findings in this study suggest otherwise. The results reported here suggest that experts’ superior knowledge actually interferes with their ability to predict novice task performance times.”

[Hinds, P. J. (1999). The curse of expertise: The effects of expertise and debiasing methods on prediction of novice performance. Journal of Experimental Psychology: Applied, 5(2), 205–221. doi:10.1037/1076-898x.5.2.205]

While geophysics can be catastrophic, it can also be more tranquil than a stormy surface world. Unit level checking generally allows for complete control over inputs, and there usually aren’t many inputs to worry about. Stepping up to a higher level– interacting sub-systems– still means testing via a controlled API, or command-line, rather than a graphical interface designed for creatures with hands and eyes and hand-eye coordination. This is a level where tools shine. I think of my test tools as submarines gliding underneath the storm and foam, because I avoid using tools that work through a GUI.

The Round Earth model reminds us about data.

Data shows up in this model, metaphorically, as the flow of energy. Energy flows on the surface (sunlight, wind and water) and also under the surface (ground water, magma, earthquakes). Data is important. When we test, we must deal with data that exists in databases and on the other side of micro-services, somewhere out in the cloud. There is data built into the code, itself. So, data is not merely what users type in or how they click. I find that unit-level and sub-system-level testing often neglects the data dimension, so I feature it prominently in the Round Earth concept.

The Round Earth model reminds us about testability.

Complex products can be designed with testing in mind. A testable product is, among other things, one that can be decomposed (taken apart and tested in pieces), and that is observable and controllable in its behaviors. This usually involves giving testers access to the deeper parts of the product via command-line interfaces (or some sort of API) and comprehensive logging.
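To make that concrete, here is a minimal sketch (in Python, with invented names and behavior) of what “decomposable, observable, and controllable” might look like: the component’s logic can be driven directly, below any GUI, and it logs every state change for a tester to comb through.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("cart")

class Cart:
    """A hypothetical component designed for testability: its core logic
    is callable directly (no GUI required), and it logs every state change."""

    def __init__(self):
        self.items = []

    def add(self, sku, qty):
        if qty <= 0:
            log.warning("rejected add: sku=%s qty=%d", sku, qty)
            raise ValueError("quantity must be positive")
        self.items.append((sku, qty))
        log.info("added: sku=%s qty=%d", sku, qty)

    def total_units(self):
        return sum(q for _, q in self.items)

# Because the component is controllable and observable below the GUI,
# a test tool can drive it directly and read the log afterward:
cart = Cart()
cart.add("A-100", 2)
cart.add("B-200", 3)
print(cart.total_units())  # prints 5
```

The names and the logging scheme are invented for illustration; the point is that the behavior is reachable and visible without going through a surface interface.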


The model also encapsulates a few principles:

  • Quality above requires quality below.
  • Quality above reduces dependence on expensive high-level testing.
  • Inexpensive low-level testing reduces dependence on expensive high-level testing.
  • Risk grows toward the user.



My Personal Source Code: Books to Learn Analysis

Occasionally people come to me and say they want to learn certain things. They ask “how do I become a good tester” or “how do I design test cases” or “how do I automate” or something specific like that. These are not really the right questions, though. The better question, which addresses all the other ones, is “how do I become a competent analyst?” Analysis is at the root of all technical work. It’s the master key to nearly everything else. You will almost automatically become a good tester, test case designer, or automater of whatever you choose, IF you master analysis. (Yes, there are other factors of equal precedence, such as humanity, temperance, and detachment. I’m going to focus on analysis, today.)

One simple way to answer the question is to suggest reading books. It’s not enough, but it’s an important step. Now, I own a lot of useful books. I’ve encountered many more. But there are just a few that express the essence of my thought process– the thought process that allows me to analyze difficult problems in complex systems and provide my clients with the help they need. These books have been so important to me that if you know them, too, you will have a good understanding of the “source code” by which I operate; my “secrets.”

These are difficult books in at least two senses: each of them is full of funny words and complicated sentences; but much more importantly, to digest each one is to change the structure of your mind, which is always a painful process. I can’t tell you it will be easy, or even fun. (Some of these books I can only read about 10 pages at a time, before getting too excited to continue.) I am simply saying I make my living as a consultant and expert witness who tackles very complex problems, and I believe it’s substantially down to what I learned from struggling with these books.

Against Method, by Paul Feyerabend
I encountered Feyerabend just after I quit high school. I had already read Ayn Rand and considered myself an Objectivist. Feyerabend cured me of that, more or less. He introduced me to the skeptical study of method; to methodology as a pursuit. I was also drawn to his combative, wild attitude.

Gödel, Escher, Bach: An Eternal Golden Braid, by Douglas R. Hofstadter
I had tried to study logic formally when I was in my teens. I just felt it was a lot of boring symbol manipulation and rule-following. Hofstadter’s book showed me the true essence of logic: exciting symbol manipulation and rule-following! Logic came alive for me through this amazing treatise.

The Hero with a Thousand Faces, by Joseph Campbell
When I joined Apple Computer as a young tester, I joined a philosophy discussion group. There I was introduced to Joseph Campbell’s work on mythology. He applied what I later came to know as “general systems thinking” to theology. What had seemed to me, an atheist, to be boring and silly rituals and statues suddenly became connected with all of humanity and history and with my own life. This was analysis connected directly to the meaning of life (although Campbell hated that phrase). I’m still an atheist, but I appreciate what religion is trying to do.

Introduction to General Systems Thinking, by Gerald M. Weinberg
This was the first book I encountered that actually taught me to do analysis. It taught me to be a tester. It cemented my career choice.

Conjectures and Refutations: The Growth of Scientific Knowledge, by Karl Popper
Read the first 30 pages about what defines science. The rest is optional. Popper was the opposite of Feyerabend. He believed that there was a best method of science. I ignore that. What impressed me about Popper is his convincing attack on Foundationalism. He showed me that science and testing are the same thing in slightly different wrappers. In testing, as in science, you can’t prove that your theory about the facts is correct. You can only try to refute it.

The Sciences of the Artificial, by Herbert Simon
This book is about what a science of design would look like. It provided a sort of road map for me about what my testing methodology had to include and accomplish. It opened my eyes to the central role that heuristics play in analysis.

The Pleasure of Finding Things Out, by Richard Feynman
Feynman’s book is really about attitude and agency. He convinced me never to seek permission to think, and to develop and follow my own code of conduct.

Discussion of the Method, by Billy Vaughan Koen
Billy Koen’s book is the best explanation of heuristics there is. But what he wrote goes beyond that, because he connected heuristics to skeptical philosophy. He showed me that I am not just using heuristics in testing; I am swimming in them; I am made of them. Also, I wrote a fan letter to him and he wrote back! So, there’s that.

Tacit and Explicit Knowledge, by Harry Collins
This is the book I encountered most recently, and it caused Michael Bolton and me to change how we teach. We now realize that many of the skills of the analyst are tacit in nature, and therefore cannot be taught directly. We teach them indirectly, by arranging and examining experiences. Michael Bolton and I made a pilgrimage to Harry’s home in Wales, too. To me, Harry is the sociologist of software testing.

TestInsane’s Mindmaps Are Crazy Cool

Most testing companies offer nothing to the community or the field of testing. They all seem to say they hire only the best experts, but only a very few of them are willing to back that up with evidence. Testing companies, by and large, are all the same, and the sameness is one of mediocrity and mendacity.

But there are a few exceptions. One of them is TestInsane, founded by ex-Moolyan co-founder Santosh Tuppad. This is a company to watch.

The wonderful thing about TestInsane is their mindmaps. More than 100 of them. What lovelies! Check them out. They are a fantastic public contribution! Each mindmap tackles some testing-related subject and lists many useful ideas that will help you test in that area.

I am working on a guide to bug reporting, and I found three maps on their site that are helping me cover all the issues that matter. Thank you TestInsane!

I challenge other testing companies to contribute to the craft, as well.

Note: Santosh offered me money to help promote his company. That is a reasonable request, but I don’t do those kinds of deals. If I did that even once I would lose vital credibility. I tell everyone the same thing: I am happy to work for you if you pay me, but I cannot promote you unless I believe in you, and if I believe in you I will promote you for free. As of this writing, I have not done any work for TestInsane, paid or otherwise, but it could happen in the future.

I have done paid work for Moolya and Per Scholas, both of which I gush about on a regular basis. I believe in those guys. Neither of them pays me to say good things about them, but remember, anyone who works for a company will never say bad things. There are some other testing companies I have worked for that I don’t feel comfortable endorsing, but neither will I complain about them in public (usually… mostly).

Agile Testing Heuristic: The Power of Looking

Today I broke my fast with a testing exercise from a colleague. (Note: I better not tell you what it is or even who gave it to me, because after you read this it will be spoiled for you, whereas if you read this and at a later time stumble into that challenge, not knowing that’s the one I was talking about, it won’t be spoiled.)

The exercise involved a short spec and an EXE. The challenge was how to test it.

The first thing I checked was whether it had a text interface that I could interact with programmatically. It did. So I wrote a program to flood it with “positive” and “negative” input. The results were collected in a log file. I programmatically checked the output and it was correct.

So far this is a perfectly ordinary Agile testing situation. It is consistent with any API testing or systematic domain testing of units you have heard of. The program I wrote performs a check, and the check is produced by my testing thought process and its output analyzed by a similar thought process. That human element qualifies this as testing and not merely naked checking. If I were to hand my automated check to someone else who did not think like a tester, it would not be testing anymore, although the checks would still have some value, probably.
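The actual EXE and its spec must stay secret, but the flooding pattern itself is easy to sketch. In this sketch the target is a stand-in process I invented (a trivial validator); the real thing would be driven the same way:

```python
import subprocess
import sys

# Stand-in for the mystery EXE: a tiny validator process. The real target
# and its spec are hypothetical here; only the flooding pattern matters.
target = [sys.executable, "-c",
          "import sys\n"
          "for line in sys.stdin:\n"
          "    n = line.strip()\n"
          "    print('OK' if n.lstrip('-').isdigit() else 'ERR')"]

positive = [str(i) for i in range(100)]   # inputs expected to succeed
negative = ["abc", "", "1.5", "!!"]       # inputs expected to fail

proc = subprocess.run(target,
                      input="\n".join(positive + negative) + "\n",
                      capture_output=True, text=True)
results = proc.stdout.split()

# Programmatic check of the output log: every positive input should
# yield OK, every negative one ERR.
ok = (all(r == "OK" for r in results[:len(positive)]) and
      all(r == "ERR" for r in results[len(positive):]))
print("all checks passed:", ok)
```

The check passes or fails mechanically; the thinking behind which inputs count as “positive” and “negative,” and what the log is telling you, is where the testing lives.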

Here’s my public service announcement: Kids! Remember to look at what is happening.

The Power of Looking

One aspect of my strategy I haven’t described yet is that I carefully watched the check as it was running. I do this not in a bored, offhand, or incidental way. It’s absolutely vital. I must observe all the output I can observe, rather than just the “pass/fail” status of my checks. I will comb through log files, watch the results in real-time, try things through the GUI; whatever CAN be seen, I want to see it.

As I watched the output flow by in this particular example, I noticed that it was much slower than I expected. Moreover, the speed of the output was variable. It seemed to vary semi-randomly. Since there was nothing in the nature of the program (as I understood it) that would explain slowness or variable timing, this became an instant focus of investigation. Either there’s a bug here or something I need to learn. (Note: that is known as the Explainability Oracle Heuristic.)

It’s possible that I could have anticipated and explicitly checked for performance issues, of course, but my point is that the Power of Looking is a heuristic for discovering lots of things you did NOT anticipate. The models in your mind generate expectations, automatically, that you may not even be aware of until they are violated.
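As a rough sketch of the idea (the system under test is simulated here, including its suspicious delay, so all the names and numbers are invented), a harness can record more than pass/fail while the checks run:

```python
import random
import time

def operation(x):
    # Stand-in for the system under test. The semi-random delay is
    # simulated here; in the real exercise it was an unexplained surprise.
    time.sleep(random.uniform(0.001, 0.02))
    return x * 2

timings = []
for i in range(50):
    start = time.perf_counter()
    assert operation(i) == i * 2          # the check itself: bare pass/fail
    timings.append(time.perf_counter() - start)

# "Looking" at more than pass/fail: is the timing behavior explainable?
print(f"min={min(timings)*1000:.1f}ms max={max(timings)*1000:.1f}ms")
if max(timings) > 3 * min(timings):
    print("timing varies widely -- either a bug or something to learn")
```

Every check passes, yet the timing data raises exactly the kind of “why is that?” question that the pass/fail status alone would never surface.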

This is important for all testing, but it’s especially important for tool-happy Agile testers, bless their hearts, some of whom consider automation to be next to godliness… Come to think of it, if God has automated his tests for human qualities, that would explain a lot…



Mechanical or Magical? Noah Says “Neither.”

As I was having dinner with Noah Höjeberg tonight, he said an interesting thing. “Some people think testing is mechanical, and that’s bad enough. But a lot of people seem to think the alternative to mechanical is magical.”

(Noah is the new test strategist at Scila AB, in Stockholm. Interesting guy. I’ve played a lot of testing dice with him, in the past. I meant to do the Art Show game with him, too, but we got so much into our conversation that I completely forgot.)

Mechanical and magical are false opposites. In Rapid Testing, we pursue another path: Heuristical. In other words, skilled testing, achieved through systematic study and the deliberate application of heuristics. This is neither a mechanical, algorithmic process, nor is it magical or mystical. We can show it, talk about it, etc. And yet it cannot be automated.

Technique: Paired Exploratory Survey

I named a technique the other day. It’s another one of those things I’ve been doing for a while, but only now has come crisply into focus as a distinct heuristic of testing: the Paired Exploratory Survey (PES).

Definition: A paired exploratory survey is a process whereby two testers confront one product at the same time for the purpose of learning a product, preparing for formal testing, and/or characterizing its quality as rapidly as possible, whereby one tester (the “driver”) is responsible for open-ended play and all direct interaction with the product while the other tester (the “navigator” or “leader”) acts as documentarian, mission-minder, and co-test-designer.

Here’s a story about it.

Last week, I was on my way home from the CAST conference with my 17-year-old son Oliver when a client called me with an emergency assignment: “Get down to L.A. and test our product right away!” I didn’t have time to take Oliver home, so we bought some clean clothes, had Oliver’s ID flown in from Orcas Island by bush plane, and headed to SeaTac.

(I love emergencies. They’re exciting. It’s like James Bond, except that my Miss Moneypenny is named Lenore. I got to the airport and two first class tickets were waiting for us. However, a gentle note to potential clients: making me run around like a secret agent can be expensive.)

This is the first time I had Oliver with me while doing professional testing, so I decided to make use of him as an unpaid intern. Basically, this is the situation any tester is in when he employs a non-tester, such as a domain expert, as a partner. In such situations, the professional tester must assure that the non-tester is strongly engaged and having good fun. That’s why I like to make that “honorary tester” drive. I get them twiddling the knobs, punching the buttons, and looking for trouble. Then they’ll say “testing is fun” and help me the next time I ask.

(Oliver is a very experienced video gamer. He has played all the major offline games since he was 3 or 4, and the online ones for the last 5 years. I know from playing with him what this means: he can be relentless once he decides to figure out how a system works. I was hoping his gamer instinct would kick in for this, but I was also prepared for him to get bored and wander off. You shouldn’t set your expectations too high with teenagers.)

The client gave us a briefing about how the device is used. I had already studied up on this, but it was new for Oliver. The scene reminded me of the part in the movie Inception where Leonardo DiCaprio explains the dynamics of dream invasion. We have a workstation that controls a power unit and connects to a probe which is connected to a pump. It all looks Frankenstein-y.

(I can’t tell you much about the device, in this case. Let’s just say it zaps the patient with “healing energy” and has nothing whatsoever to do with weaponized subconscious projections.)

I set up a camera so that all the testing would be filmed.

(Video is becoming an indispensable tool in my work. My traveling kit consists of a little solid state Sony cam that plugs into the wall so I don’t have to worry about battery life, a micro-tripod so I can pose the camera at any desired angle, and a terabyte hard drive which stores all the work.)

Then I began the testing, just to demonstrate to Oliver the sort of thing I wanted to do. We would begin with a sanity check of the major functions and flows, while letting ourselves deviate as needed to pursue follow-up testing on anything we found that was anomalous. After about 15 minutes, Oliver became the driver, I became the navigator, and that’s how we worked for the next 6 or 7 hours.

Oliver quickly distinguished himself as a remarkable observer. He noticed flickers on the screen, small changes over time, quirks in the sound the device made. He had a good memory for what he had just been doing, and quickly constructed a mental model of the product.

From the transcript:

“What?!…That could be a problem…check this out…dad…look, right now…settings, unclickable…start…suddenly clickable, during operation…it’s possible to switch its entire mode to something else, when it should be locked!”

and later

“alright… you can’t see the error message every single time because it’s corrupted… but the error message… the error message is exactly what we were seeing before with the sequence bug… the error message comes up for a brief moment and then BOOM, it’s all gone… it’s like… it makes the bug we found with the sequence thing (that just makes it freeze) destructive and takes down the whole system… actually I think that’s really interesting. It’s like this bug is slightly more evolved…”

(You have to read this while imagining the voice of a triumphant teenager who’s just found an easter egg in HALO3. From his point of view, he’s finding ways to “beat the boss of the level.”)

At the start, I frequently took control of the process in order to reproduce the bugs, but as I saw Oliver’s natural enthusiasm and inquisitiveness blossom, I gave him room to run. I explained bug isolation and bug risk and challenged him to find the simplest, yet most compelling form of each problem he uncovered.

Meanwhile, I kept my notes and recorded time stamps of interesting events. As we moved along, I would redirect him occasionally to collect more evidence regarding specific aspects of the evolving testing story.

How is this different from ordinary paired testing?

Paired testing simply means two testers testing one product on the same system at the same time. A PES is a kind of paired testing.

Exploratory testing means an approach to testing whereby learning, test design, and test execution are mutually supportive activities that run in parallel. A PES is exploratory testing, too.

A “survey session,” in the lingo of Session-Based Test Management, is a test session devoted to learning a product and characterizing the general risks and challenges of testing it, while at the same time noticing problems. A survey session contrasts with analysis sessions, deep coverage sessions, and closure sessions, among possible others that aren’t yet identified as a category. A PES is a survey test session.

It’s all of those things, plus one more thing: the senior tester is the one who takes the notes and makes sure that the right areas are touched and the right general information comes out. The senior tester is in charge of developing a compelling testing story. The senior tester does that so that his partner can get more engaged in the hunt for vital information. This “hunt” is a kind of play. A delicious dance of curiosity and analysis.

There are lots of ways to do paired testing. A PES is one interesting way.

Hey, I’ve done this before!

While testing with my son, I flashed back to 1997, in one of my first court cases, in which I worked with my brother Jon (who is now a director of testing at eBay, but was then a cub tester). Our job was to apply my Good Enough model of quality analysis to a specific product, and I let Jon drive that time, too. I didn’t think to give a name to that process, at the time, other than ET. The concept of paired testing hadn’t even been named in our community until Cem Kaner suggested that we experiment with it at the first Workshop on Heuristic and Exploratory Techniques in 2001.

I have seen different flavors of a PES, too. I once saw a test lead who stepped to the keyboard specifically because he wanted his intern to design the tests. He felt that letting the kid lean back in his chair and talk ideas to the ceiling (as he was doing when I walked in) would be the best way to harness certain technical knowledge the intern had which the test lead did not have. In this way, the intern was actually the driver.

I’m feeling good about the name Paired Exploratory Survey. I think it may have legs. Time will tell.

Here’s the report I filed with the client (all specific details changed, but you can see what the report looks like, anyway).

A Six-fold Example from Pradeep Soundararajan

Pradeep blogged this, today.

I need to amplify it because it provides a nice example of at least six useful and important patterns all in one post. This is why I believe Pradeep is one of the leading Indian testers.

Practical advice: “Ask for testability”

His story is all about asking for testability and all the good things that can come from that. It’s rare to see a good example present so vividly. I wanted more details, but the details he gave were enough to carry the point and fire the imagination.

Practical advice: “Try video test scripting”

I have never heard of using videos for scripted testing. Why didn’t I think of that?

Testing as a social process

Notice how many people Pradeep mentions in his post. Notice the conversations, the web of relationships. This aspect of testing is profoundly important, and it’s one at which I find Pradeep excels. It’s kind of like x-ray vision– the ability to see past the objects of the project to the true bones of it, which is how people think of, communicate with, and influence each other. Pradeep’s story is a little bit technical, but it’s mostly social, as I read it.

Experience report

Pradeep’s post is an example of an experience report. Not many of them around. It’s like sighting a rare orchid. He published it with the support of his client, otherwise we’d never have seen it. That’s why there can never be an accurate or profound history written about the craft of testing: almost everything is kept secret. The same dynamic helps preserve bad practice in testing, because that bad practice thrives in the darkness just as roaches do.

Sapient tester blogging

I have referred in the past to a phenomenon I call “sapient tester blogs.” These are introspective, self-critical, exploratory essays written by testers who see testing as a complex cognitive activity and seek to expand and develop their thinking. It’s particularly exciting to see that happening in India, which brings me to the final point…

Leadership in Indian testing

There’s not a lot of good leadership in Indian testing. Someday there will be. It’s beginning to happen. Pradeep’s post is an example of what that looks like.

There must be more than a hundred thousand testers in India. (I wonder if some agency keeps statistics on that?) I would expect to see at least a hundred great tester blogs from India, not six!

Heuristic Value is a Relationship

One of the comments on my post about The Essence of Heuristics went like this:

“An excellent example of a heuristic is a hammer.”


Ecstasy is your friend: it picks you up at the airport.

Non heuristics that can help an expert solve a problem, without being a guarantee – an abridged list:
* Money
* Time
* Expertise
* Newton-Raphson Algorithm
* Analogies

It was posted anonymously. I generally don’t publish anonymous comments that are argumentative, because opponents who yell things and run away bore me… but this one is helpful. The writer is a little confused, but I bet other people are confused too, in the same way.

He offers a list of things that he claims are non-heuristics (he doesn’t explain why so I guess he thinks they are “obviously” non-heuristics), and suggests that they meet my definition. I think he’s trying to make two points: either my definition of heuristics is wrong because it admits non-heuristics, or it’s trivial because then everything would be a heuristic and therefore the idea conveys no information.

Well, the first point is easily handled. By definition, each thing on his list is heuristic (because he has declared that they help without guaranteeing, in some scenario he seems to have in mind). There is no contradiction; he’s simply mistaken in calling those things non-heuristics.

As to his second point, that’s where the real confusion lies. I think he is mixed-up because he expects heuristics to have some essential nature that identifies them. But what makes something a heuristic is not its essential nature, but rather its relationship to the problem and the problem-solver. We say that anything that may help you solve a problem has “heuristic value” for you with respect to that problem. But if it is infallible (and also if it halts) then we don’t call it a heuristic. We call it an algorithm. For instance, long division is a guaranteed way for dividing one rational number with a finite number of digits by another such number. However, long division is just a heuristic if we were to apply it to splitting the check at a restaurant. It doesn’t guarantee that everyone will feel that they are paying a fair share.
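The restaurant example is worth making concrete. Division is guaranteed to produce a quotient, but it says nothing about fairness, about who ordered the wine, or even about the leftover cent:

```python
# Splitting a $100.00 check three ways: long division is an algorithm for
# producing a quotient, but only a heuristic for producing fairness.
total_cents = 10_000
people = 3
share, remainder = divmod(total_cents, people)
print(f"each pays ${share / 100:.2f}, with {remainder} cent(s) unassigned")
# prints: each pays $33.33, with 1 cent(s) unassigned
```

The arithmetic is infallible; whether anyone at the table feels they paid a fair share is a problem the algorithm cannot even see.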

How about instead of heuristics we think about weapons? Would it be absurd of me to suggest that a weapon is any fallible method of attacking your enemy or winning a battle? Someone might reply “But James, are pickles really weapons? Sand? Wind?” The answer is: they can be, depending on your situation. But you don’t need to catalog all possible weapons. Instead you study the art of combat and learn to spot the things that will help you. Miyamoto Musashi wrote exactly about this in 1645:

When you attack the enemy, your spirit must go to the extent of pulling the stakes out of a wall and using them as spears and halberds. — Book of Five Rings

We cannot speak intelligibly about heuristics without identifying the problem, the problem-solver, and the dynamics of the situation. You can make a list of anything you want, Mr. Anonymous, but can you answer my next questions, which are: what specific problems are you talking about for which these are heuristics and how specifically are they heuristic? Or if you think they aren’t heuristics, what specific problem do you think they can’t help with and why not?

Three New Testing Heuristics

A lot of what I do is give names to testing behaviors and patterns that have been around a long time but that people are not systematically studying or using. I’m not seeking to create a standard language; by applying some kind of terminology, I simply want to make these patterns easier to apply and to study.

This is a quick note about three testing heuristics I named this week:

Steeplechase Heuristic (of exploratory boundary testing)

When you are exploring boundaries, think of your data as having to get to the boundary and then having to go other places down the line. Picture it as one big obstacle course with the boundary you are testing right in the middle.

Then consider that the very large, long, extreme data that the boundary is designed to stop might founder on some obstacle before it ever gets to the boundary you want to test. In other words, a limit of 1,000 characters on a field might work fine unless you paste 1,000,000 characters in, in which case the program may crash instantly, before the boundary check ever gets a chance to reject the data.

But also look downstream, and consider that extreme data which barely gets by your boundary may get mangled on another boundary down the road. So don’t just stop testing when you see one boundary is handled properly. Take that data all around to the other functions that process it.
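The obstacle course can be sketched in code. Here is a hypothetical pipeline (all names and limits invented for illustration) in which the boundary under test sits between an upstream obstacle and a downstream one:

```python
MAX_COMMENT = 1000  # the boundary we actually want to test

def receive(raw):
    # Upstream obstacle: a crude parser with its own limit. Extreme
    # input may "crash" here before the real boundary check ever runs.
    if len(raw) > 1_000_000:
        raise MemoryError("parse buffer overflow (simulated crash)")
    return raw.strip()

def check_boundary(comment):
    # The boundary under test, right in the middle of the course.
    if len(comment) > MAX_COMMENT:
        raise ValueError("comment too long")
    return comment

def store(comment):
    # Downstream obstacle: a column slightly smaller than the boundary.
    # Data that barely squeaks past the check gets silently mangled here.
    return comment[:255]

def steeplechase(raw):
    return store(check_boundary(receive(raw)))

# Data that passes the boundary but doesn't survive the whole course:
survivor = steeplechase("x" * 1000)
assert len(survivor) == 255  # the boundary held; the data did not
```

A test that stops at `check_boundary` would miss both the upstream crash and the downstream truncation, which is the point of running the whole course.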

Galumphing (style of test execution)

Galumphing means doing something in a deliberately over-elaborate way. I’ve been doing this for a long time in my test execution. I add lots of unnecessary but inert actions that are inexpensive and shouldn’t (in theory) affect the test outcome. The idea is that sometimes– surprise!– they do affect it, and I get a free bug out of it.

An example is how I frequently click on background areas of windows while moving my mouse pointer to the button I intend to push. Clicking on blank space shouldn’t matter, right? Doesn’t hurt, right?

I actually learned the term from the book “Free Play” by Stephen Nachmanovitch, who pointed out that it is justified by the Law of Requisite Variety. But I didn’t connect it with my test execution practice until jogged by a student in my recent Sydney testing class, Ted Morris Dawson.
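One way galumphing might be folded into automated test execution is sketched below (the helper names are hypothetical; real inert actions would be things like clicking blank space or wiggling the mouse):

```python
import random

def galumph(action, inert_actions, seed=0):
    """Perform a test action in a deliberately over-elaborate way:
    sprinkle in cheap, supposedly inert actions that shouldn't affect
    the outcome. When they do, that's a free bug."""
    rng = random.Random(seed)  # seeded, so a surprise can be replayed
    for _ in range(rng.randint(1, 3)):
        rng.choice(inert_actions)()  # e.g. click background, resize window
    return action()

# Toy example: the "inert" actions here really are inert, so the
# outcome is unchanged. In a real UI, sometimes they aren't.
log = []
outcome = galumph(lambda: "submitted",
                  [lambda: log.append("click blank space"),
                   lambda: log.append("wiggle mouse")])
assert outcome == "submitted"
```

Seeding the randomness matters: a free bug is only free if you can reproduce the elaborate path that provoked it.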

Creep & Leap (for pattern investigation)

If you think you understand the pattern of how a function works, try performing some tests that just barely violate that pattern (expecting an error or some different behavior), and try some tests that boldly take that behavior to an extreme without violating it. The former I call creeping; the latter is leaping.

The point here is that we are likely to learn a little more from a mildly violating test than from a hugely violating test because the mildly violating test is much more likely to surprise us, and the surprise will be easier to sort out.

Meanwhile, stretching legal input and expectations as far as they can reasonably go also can teach us a lot.

Creep & Leap is useful for investigating boundaries, of course, but it works in situations without classic boundaries, too, such as when we creep by feeding a function a different type of data that is supposed to be rejected.
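For a simple range boundary, creep & leap can be sketched as a probe generator (a toy illustration, assuming a numeric range; the validator is hypothetical):

```python
def creep_and_leap(lo, hi):
    """Probes for a range [lo, hi] whose pattern we think we understand."""
    creeps = [lo - 1, hi + 1]  # barely illegal: expect rejection, and
                               # any surprise will be easy to sort out
    leaps = [lo, hi]           # boldly legal: the extremes of the
                               # pattern, without violating it
    return creeps, leaps

creeps, leaps = creep_and_leap(1, 1000)

# Check each probe against a hypothetical validator for the range:
valid = lambda n: 1 <= n <= 1000
assert all(not valid(c) for c in creeps)
assert all(valid(n) for n in leaps)
```

The hugely violating value (say, a million) still has its uses, but as the text notes, the mild violation is the one most likely to teach us something we can actually untangle.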

The Essence of Heuristics

Excellent testing requires skill, but heuristics give structure to that skill. Heuristics help us access our skills under pressure.

A heuristic is a fallible method of solving a problem or making a decision. Cem Kaner and I came to this definition based on an extensive search of papers and books across fifty years of psychology and engineering. Amazingly, we were not able to find a single coherent definition of “heuristic” in our research (the dictionaries are not much help here). Coleridge, Kant, and Polya have all written about heuristic reasoning. But we needed a definition that would bring the issues into focus.

There are two main issues. The first is that a heuristic helps you solve a problem without being a guarantee. This immediately suggests the second: that heuristics must be applied sapiently (meaning with skill and care).

An excellent example of a heuristic is a hammer. Do you see how a hammer can help a carpenter solve a problem, but does not itself guarantee the solution? I like this example because it’s so easy to see that a hammer may be critical to a skilled carpenter while being of little use to an unskilled lout who doesn’t know what to pound or how hard to pound it or when to stop pounding.

Heuristics do not replace skill. They don’t make skill unnecessary. But they can make skilled people more productive and reliable.

How Do You Tell if Consultants/Trainers Understand Heuristics?

I typically hear two reactions from rival testing consultants of other schools of thought (especially the Factory and Quality Control schools) who love “best practice” talk:

  1. “Oh yes, heuristics! That’s another name for ‘best practices’, right?”
  2. “Oh no, heuristics! That’s just a fancy name for ‘best practices’!”

Obviously both reactions miss the point. Even if these folks rename their ideas of what you should do to call them “heuristics,” they would be leaving out a key idea, which is that skill must rule methods. Talking about methods, focusing on methods, enshrining methods, is only sensible if humans are left in charge. The heuristic nature of engineering is the reason why a “best practice” is an absurdity. Seek not the perfect practice. Seek instead to practice your skills.

My friend Judah Mogilensky once commented that he thought my Heuristic Test Strategy Model was a lot like the Capability Maturity Model that he works with.

“Are people allowed to change or suspend the CMM whenever they see fit?” I asked.

“Oh no,” he replied.

“Then it isn’t being applied responsibly as a heuristic. It’s being treated like a law, or an oath, or something else that places itself above its subjects and binds them.”

So that’s how you tell. Look to see what’s driving things. People, or concepts? Fundamentally, “methodology” can’t control projects. Anytime you seem to see methods in charge, you are actually witnessing a project out of control, or else a project under covert control by shadowy masters.

When someone teaches you a way to solve a problem, check:

  • Do they teach you how to tell if it’s working?
  • Do they teach you how to tell if it’s going wrong?
  • Do they teach you heuristics for stopping?
  • Do they teach you heuristics for knowing when to apply it?
  • Do they compare it to alternative heuristics?
  • Do they show you why it works?
  • Do they help you understand when it probably works best?
  • Do they help you know how to re-design it, if needed?
  • Do they let you own it?
  • Do they ask you to practice it?
  • Do they tell stories about how it has failed?
  • Do they listen to you when you question or challenge it?
  • Do they praise you for questioning and challenging it?

Revenge of the Process Imperialists

Ben Kelly, who works in Japan, is reviewing the Japanese translation of my book, Lessons Learned in Software Testing. He writes:

Each lesson is numbered as per the original, but rather than ‘lesson’, they use the word tessoku, which means ‘Inviolable Rule’ or ‘Ironclad Regulation’

Ben goes on to say that it was probably a marketing decision on the publisher’s part to make that change. Apparently, Japanese testers want ironclad rules.

Interesting, but that’s a little like spicing up a film about Gandhi by having him carry an M60 machine gun and chomp on a cigar. The “lessons” in Lessons Learned are heuristics worth considering, not ironclad anythings.

Why Labels For Test Techniques?

Steve Swanson is very new to testing. I predict he has a great future. He has already noticed that the common idea of boundary testing is almost content-free. Michael Bolton and I do a whole session on how to turn boundary testing into a much more intellectually engaging activity. At the end of his post, he identifies one of the major weaknesses of the classic notion of boundary testing. This confirmed for me that he is a mind to watch.

Steve questions the idea of naming test techniques:

What’s the point of having names for things? To me having a name limits what you see and limits creativity. If you feel that certain things are not to be considered boundary tests, then maybe you won’t do them. Maybe you are pigeon-holing yourself into constraints that do not need to be there. Furthermore it seems that everyone has a different idea of what a “boundary” test is. If that is the case then why even have a name for it?

Dear Steve,

I’ve been studying testing for 19 years, and let me tell you, a lot of things people write about it are fairy tales. This is the first reason why you are confused about what’s in a name: most people (not everyone) are also confused, and thus just copy what they see other people write, without thinking much about it.

To use an example from my own history, I used to talk about the difference between conformance testing and deviance testing. I learned this distinction at Apple Computer. For about five years I talked about them, until one day I realized that it is an empty and unhelpful distinction. It was not a random realization, but part of a process of systematically doubting all my terminology and assumptions about testing, in traditional Cartesian fashion. I just couldn’t find a good enough reason to retain the division of testing into conformance and deviance.

It looks like you are on a similar path. If you continue on it, the kind of thinking you are doing will A) lead you to better resources to draw from about testing, B) allow you to become one of the leaders helping us out of this morass, and C) ensure that you will be attacked by certain bloggers from Microsoft and elsewhere who just hate people who apply philosophical methods to such an apparently straightforward and automatable task as testing. (Speaking of the philosophy of terminology, you may be interested in the Nominalism vs. Realism debate, or how the pragmatism movement swept aside that whole debate, or how the structuralists and post-modernists study the semiotics of discourse. All these things relate to the issues you raise about terminology.)

I will tell you now what book you need to read that will help you more than any other on this planet: An Introduction to General Systems Thinking, by Gerald M. Weinberg. It’s what I consider to be the fundamental textbook of software testing, yet not 1 in 100 testers knows about it.

A quick answer to your issue with names…

Terminology is useful for at least these reasons:

  1. A term can be a generative tool. It can evoke an idea or a thought process that you are interested in. (This is different from using a term to classify or label something, which as you point out limits us without necessarily helping us.) An example of the generative use of terms in this way is the HAZOP process which uses “guidewords” to focus analysis. Even a generative usage is susceptible to bias, which is why I use multiple, diverse, overlapping terms.
  2. A term can serve as a compact way to direct a co-worker. When I manage a test team, I establish terminology so that when I say “do boundary testing” I can expect the tester to know what I’m asking him to do without necessarily explaining every little thing. The term is thus attached to training, dialog, and shared experiences. (This needn’t be very limiting, although we must guard against settling into a rut and having only a narrow interpretation of terms.)
  3. A term can serve as a label indicating the presence of a bigger story under the surface, much like a manila folder marked “boundary testing” would be expected to hold some papers or other information about boundary testing. The danger, I think you’ve noticed, is that ignorant testers may quite happily pass folders back and forth that are duly labelled but quite perfectly empty. You have to open the folders on a regular basis by asking “can you describe, specifically, what you did when you did ‘exploratory testing’? Can you show me?”
  4. A term can hypnotize people. (I’m not recommending this; I’m warning you against it). Terminology, especially obscure terminology, is often used in testing to give the appearance of knowledge where there is no knowledge, in hopes that the client will fall asleep in the false assumption that everything is oooookkkkkkaaaayyyyyy. You appear not to be susceptible to such hypnosis. (Adam White has a similar resistance.)

I expect to see more examples of skeptical inquiry on your blog, as you wrestle with testing, Steve. I hope you find, as I do, that it’s a rewarding occupation.

Methodology Debates: Traps and Transformations

(This article is adapted from work I did with Johanna Rothman, at the 1st Amplifying Your Effectiveness conference. It’s never been widely published, so here you go.)

As a context-driven testing methodologist, I am required to think through the methods I use. Sometimes that means debating methodology with people who have a different view about what should be done. Over time, I’ve gained a lot of experience in debate. One thing I’ve learned is that most people have good ideas, but few people know how to debate them. This is too bad, because a successful debate can make a community stronger, while avoiding debates creates a nurturing environment for weak ideas. Let’s look at how to avoid the traps that make debates fail, and how to transform disagreement into powerful consensus.

Sometimes a debate is really part of a war. The advice below won’t help much if that is the case. This advice is more for situations where you are highly motivated to create or maintain a working relationship with someone you disagree with– such as when you work in the next cubicle from the guy.


  • Conflicting Terminology: Be alert to how you are using technical terms. A common term like “bug” has different meanings to different people. If someone says “Unit testing is absolutely essential to good software quality,” among your first concerns should be “What does he mean by ‘unit testing’, ‘essential’, and ‘quality’?” Beware: sometimes a debate about definitions bears important fruit, but it can also be another trap. You can spend all your energy on definitions without necessarily touching the marrow of the subject. On the other hand, you can allow yourself to understand and even use someone else’s terminology in a debate without committing yourself to changing your preferred terminology in general.
  • Paradigm Conflict: A paradigm is an all-inclusive way of explaining the world, generally tied into terminology and assumptions about practices and contexts. Two different paradigms may explain the same phenomena in totally different ways. When two people from different paradigms come together, each may seem insane to the other. Whenever you feel that your opponent is insane, maybe that’s time to stop and consider that you are trying to cross a paradigmatic boundary. In which case, you should talk about that, first.
  • Ambiguous Metrics: Don’t be seduced by numbers. They can mean anything. The problem is knowing what they do, in fact, mean. When someone quotes numbers at me, I wonder how the metric was collected, and what influenced the people who collected them. I wonder if the numbers were sanitized in any way. For instance, when someone tells me that he performed 1000 test cases, I wonder if he’s talking about trivial test cases, or vital ones. There’s no way to know unless I personally review the tests, or conduct a detailed interview of the tester.
  • Confusing Feeling and Rationality: Beware of confusing feeling communication with rational communication. Be alert to the intensity of the feelings associated with the ideas being presented. Many debates that seem to be about ideas may indeed be about loyalty, trust, respect, and other fundamental issues. A statement like “C++ is the best language in the world. All other languages are garbage” may actually mean “C++ is the only language I know. I am comfortable with what I know. I don’t want to concern myself with languages I don’t already know, because then I feel like a beginner, again.” There’s an old saying that you can’t use logic to refute a conclusion that wasn’t arrived at through logic. That may not be strictly true, but it’s a helpful guideline. So, if you sense a strange intensity around the debate, your best bet may be to stop talking about ideas and start exploring the feelings.
  • Confusing Outcome and Understanding: Sometimes one person can be debating for the purpose of getting a particular outcome, while the other person is debating to understand the subject better, or help you understand them. Confusing these approaches can lead to a lot of unnecessary pain. So, consider saying what your goal is, and ask the other person what they want to get out of the debate.
  • Hidden Context: You may not know enough about the world the other person lives in. Maybe work life for them is completely different than it is for you. Maybe they live under a different set of requirements and challenges. Try saying “I want to understand better why you feel the way you do. Can you tell me more about your [life, situation, work, company, etc.]?”
  • Hidden History: You may not know enough about other debates and other struggles that shaped the other person’s position. If you notice that the other person seems to be making many incorrect assumptions about what you mean, or putting words in your mouth, consider asking something like “Have you ever had this argument with someone else?”
  • Hidden Goals: Not knowing what the other person wants from you. You might try learning about that by asking “Are we talking about the right things?” or “What would you like me to do?” Keep any hint of sarcasm out of your voice when you say that. Your intent should be to learn about what they want, because maybe you can give it to them without compromising anything that’s important to you.
  • False Urgency: Feeling like you are trapped and have to debate right now. It’s always fair to get prepared to discuss a difficult subject. You don’t have to debate someone at a particular time just because that person feels like doing it right then. One way to get out of this trap is just to say “This subject is important to me, but I’m not prepared to debate it right now.”
  • Flipping the Bozo Bit: If you question the sanity, good faith, experience, or intelligence of the person who disagrees with you, then the debate will probably end right there. You’ll have a war, instead. So, if you do that, in the heat of the moment, your best bet for recovery may be to take a break. When you come back, ask questions and listen carefully to be sure you understand what the other guy is trying to say.
  • Short-Term Focus: Hey, think of the future. Successful spouses know that the ability to lose an argument gracefully can help strengthen the marriage. I lose arguments to my wife so often that she gives me anything I want. The same goes for teams. Consider a longer term view of the debate. For instance, if you sense an impasse, you could say “I’m worried that we’re arguing too much. Let’s do it your way.” or “Tell you what: let’s try it your way as an experiment, and see what happens.” or “Maybe we need to get some more information before we can come to agreement on this.”

Transforming Disagreement

An important part of transforming disagreement is to synchronize your terminology and frames of reference, so that you’re talking about the same thing (avoiding the “pro-life vs. pro-choice” type of impasse). Another big part is changing a view of the situation that allows only one choice into one that allows many reasonable choices (the “reasonable people can bet on different horses” position). Here are some ideas for how to do that:

  • Transform absolute statements into context-specific statements. Consider changing “X is true” to “In situation Y, X is true.” In other words, make your assumptions explicit. That allows the other person to say “I’m talking about a different situation.”
  • Transform certainties into probabilities and alternatives. Consider changing “X is true” to “X is usually true” or “X, Y, or Z can be true, but X is the most likely.” That allows the other person to question the basis of your probability estimate, but it also opens the door to resolving the disagreement as a simpler matter of differing opinions on probability, rather than the more fundamental problem of what is possible.
  • Transform rules into heuristics. Consider changing “You should do X” to something like “If you have problem Y and want to solve it, doing something like X might help.” The first statement is probably a suggestion in the clothing of a moral imperative. But in technical work, we are usually not dealing with morals, but rather with problems. If someone tells me that I should write a test plan according to the IEEE-829 template, then I wonder what problem that will solve, whether I indeed have that problem, how important that problem is, whether 829 would solve it, and what other ways that same problem might be solved.
  • Transform implicit stakeholders and concerns into explicit stakeholders and concerns. Consider changing “X is bad” to “I don’t like X” or “I’m worried about X” or “Stakeholder Y doesn’t like X.” There are no judgments without a judger. Bring the judger out into the open, instead of using language that makes an opinion sound like a law of physics. This opens the door to talking about who matters and who gets to decide, which can be a more important issue than the decision itself. Another response you can make to “X is bad” is to ask “compared to what?”, which will bring out the unspecified standard.
  • Translate the other person’s story into your terms and check for accuracy. Consider saying something like “I want to make sure I understand what you’re telling me. You’re saying that…” then follow with “Does that sound right?” and listen for agreement. If you sense a developing impasse, try suspending your part of the argument and become an interviewer, asking questions to make sure the other person’s story is fully told. Sometimes that’s a good last resort option. If they challenge you to prove them wrong or demand a reply, you can say “It’s a difficult issue. I need to think about it some more.”
  • Translate the ideas into a diagram. Try drawing a picture that shows both views of the problem. Sometimes that helps put a disagreement into perspective (literally). This can help especially in a “blind men and the elephant” situation, where people are arguing because they are looking at different parts of the same thing, without realizing it. For instance, if I argue that testing should start late, and someone else argues that testing should start early, we can draw a timeline and put things on the timeline that represent the various issues we’re debating. We may discover that we are making different assumptions about the cost-of-bugs curve, at which point we can draw several curves and discuss the forces that affect them.
  • Translate total disagreement into shades of agreement. Do you completely disagree with the other person, or disagree just a little? Consider looking at it as shades of agreement. Is it total opposition, or is it just discomfort? This is important because I know, sometimes, I begin an argument with a vague unease about someone’s point of view. If they then react defensively to that, as if I’ve attacked them, then I might feel driven firmly to the other side of the debate. Sometimes when looking for shades of agreement, you discover that you’ve been in violent agreement all along.
  • Transform your goal from being right to being a team. Is there a way to look at the issue being debated as related to the goal of being a strong team? This is something you can do in your own mind to reframe the debate. Is it possible that the other person is arguing less from the force of logic and more from the fear of being ignored? If so, then being a good listener may do more to resolve the debate than being a good thinker. Every debate is a chance to strengthen a relationship. If you’re on the “right” side, you can strengthen it by being a gracious winner and avoiding I-told-you-so behavior. If you’re on the “wrong” side, you can strengthen the team by publicly acknowledging that you have changed your mind, that you have been persuaded. When you don’t know who is right, you can still respect feelings and consider how the outcome and style of the debate might harm your ability to work together.
  • Transform conclusions to processes. If the other person is holding onto a conclusion you disagree with, consider addressing the process by which they came to adopt that conclusion. Talk about whether that process was appropriate and whether it could be revisited.
  • Express faith in the other person. If the debate gets tense, pause and remind the other person that you respect his good faith and intentions. But only say that if it’s true. If it’s not true, then you should stop debating about the idea immediately, and deal instead with your feelings of mistrust. Any debate that’s not based on trust is doomed from the start, unless of course it’s not really a debate, but a war, a game, or a performance put on for an audience.
  • Wait and listen. Sometimes, a conversation looks like a debate, and feels like a debate, but is actually something else. Sometimes we just need to vent for a bit, and be heard. That’s one reason why being a good listener is not only polite, but eminently practical.
  • Express appreciation when someone tries to transform your position. When you notice someone making an effort to use these transformations in a conversation with you, thank them. This is a good thing. It’s a sign that they are trying to connect with you and help you express your ideas.

“Mipping”: A Strategy for Reporting Iffy Bugs

When I first joined ST Labs, years ago, we faced a dilemma. We had clients telling us what kind of bugs we should not report. “Don’t worry about the installer. Don’t report bugs on that. We have that covered.” No problem, dear customer, we cheerfully replied. Then after the project we would hear complaints about all the installation bugs we “missed”.

So, we developed a protocol called Mention In Passing, or “mipping”. All bugs shall be reported, without exception. Any bug that seems questionable or prohibited we will “mention in passing” in our status reports or emails. In an extreme case we mention it by voice, but I generally want to have a written record. That way we are not accused of wasting time investigating and reporting the bug formally, but we also can’t be accused of missing it entirely.

If a client tells me to stop bothering him about those bugs, even in passing, I might switch to batching them, or I might write a memo to all involved that I will henceforth not report that kind of problem. But if there is reasonable doubt in my mind that my client and I have a strong common understanding of what should and should not be reported, I simply tell them that I “mip” bugs to periodically check to see if I have accidentally misconstrued the standard for reporting, or to see if the standard has changed.

Should Developers Test the Product First?

When a programmer builds a product, should he release it to the testers right away? Or should he test it himself to make sure that it is free of obvious bugs?

Many testers would advise the programmer to test the product himself, first. I have a different answer. My answer is: send me the product the moment it exists. I want to avoid creating barriers between testing and programming. I worry that anything that may cause the programmers to avoid working with me is toxic to rapid, excellent testing.

Of course, it’s possible for the programmer to test the product without delaying its delivery to the testers. For instance, a good set of automated unit tests as part of the build process would make the whole issue moot. Also, I wouldn’t mind if the programmer tested the product in parallel with me, if he wants to. But I don’t demand either of those things. They are a lot of work.

As a tester I understand that I am providing a service to a customer. One of my customers is the programmer. I try to present a customer service interface that makes the programmers happy I’m on the project.

I didn’t always feel this way. I came to this attitude after experiencing a few projects where I drew sharp lines in sand, made lots of demands, then discovered how difficult it is to do great testing without the enthusiastic cooperation of the people who create the product.

It wasn’t just malicious behavior, though. Some programmers, with the best of intentions, were delaying my test process by trying to test the product themselves, and fix every bug, before I even got my first look at it (like those people who hire house cleaners, and then clean their own houses before the professionals arrive).

Sometimes a product is so buggy that I can’t make much progress testing it. Even then, I want to have it. Every look I get at it helps me get better ideas for testing it, later on.

Sometimes the programmer already knows about the bugs that I find. Even then, I want to have it. I just make a deal with the programmers that I will report bugs informally until we reach an agreed upon milestone. Any bugs not fixed by that time get formally reported and tracked.

Sometimes the product is completely inoperable. Even then, I want to have it. Just by looking at its files and structures I might begin to get better ideas for testing it.

My basic heuristic is: if it exists, I want to test it. (The only exception is if I have something more important to do.)

My colleague Doug Hoffman has raised a concern about what management expects from testing. The earlier you get a product, the less likely you are to make visible progress testing it, and then testing may be blamed for the apparently slow progress. Yes, that is a concern, but it’s a question of managing expectations. Hence, I manage them.

So, send me your huddled masses of code, yearning to be tested. I’ll take it from there.