We. Use. Tools.

We Context-Driven testers use tools to help ourselves test better. But there is no such thing as test automation.

Want details? Here’s the 10,000-word explanation that Michael Bolton and I have been working on for months.

Editor’s Note: I have just posted version 1.03 of this article. This is the third revision we have made due to typos. Isn’t it interesting how hard it is to find typos in your own work before you ship an article? We used automation to help us with spelling, of course, but most of the typos came down to properly spelled words in the wrong context. Spelling tools can’t help us with that. Also, Word’s spell-checker still thinks there are dozens of misspelled words in our article, because of all the proper nouns, terms of art, and neologisms. Of course there are the grammar-checking tools, too, right? Yeah… not really. The false-positive rate of those tools is very high. I just did a sweep through every grammar problem the tool reported. Of the five problems it thinks it found, only one, a missing hyphen, is plausibly a problem. The rest are essentially matters of writing style.

One of the lines it complained about is this: “The more people who use a tool, the more free support will be available…” The grammar checker thinks we should not say “more free” but rather “freer.” This may be correct, in general, but we are using parallelism, a rhetorical style that we feel outweighs the general rule about comparatives. Only humans can make these judgments, because the rules of grammar are sometimes fluid.

Behavior-Driven Development vs. Testing

The difference between Behavior-Driven Development and testing:

This is a BDD scenario (from Dan North, a man I respect and admire):

Scenario 1: Account is in credit
Given the account is in credit
And the card is valid
And the dispenser contains cash
When the customer requests cash
Then ensure the account is debited
And ensure cash is dispensed
And ensure the card is returned

This is that BDD scenario turned into testing:

Scenario 1: Account is in credit
Given the account is in credit
And the card is valid
And the dispenser contains cash
When the customer requests cash
Then check that the account is debited
And check that cash is dispensed
And check that the card is returned
And check that nothing happens that shouldn’t happen and everything else happens that should happen for all variations of this scenario and all possible states of the ATM and all possible states of the customer’s account and all possible states of the rest of the database and all possible states of the system as a whole, and anything happening in the cloud that should not matter but might matter.

Do I need to spell it out for you more explicitly? This check is impossible to perform. To get close to it, though, we need human testers. Their sapience turns this impossible check into plausible testing. Testing is a quest within a vast, complex, changing space. We seek bugs. It is not the process of demonstrating that the product CAN work, but of exploring whether it WILL.
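
To see what is and is not automatable here, consider a minimal sketch of step definitions in Python, in the style of the behave library. The names context.account, context.dispenser, and context.card_slot are hypothetical stand-ins for whatever a real ATM test harness would expose:

from behave import then

@then("check that the account is debited")
def step_account_debited(context):
    # A machine can verify one predetermined, specific fact...
    assert context.account.balance == context.starting_balance - context.amount

@then("check that cash is dispensed")
def step_cash_dispensed(context):
    assert context.dispenser.amount_out == context.amount

@then("check that the card is returned")
def step_card_returned(context):
    assert context.card_slot.card_present

# But the fourth "check" has no step definition. No assertion can encode
# "nothing happens that shouldn't happen." That judgment needs a human.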

I think Dan understands this. I sometimes worry about other people who promote tools like Cucumber or jBehave.

I’m not opposed to such tools (although I continue to suspect that Cucumber is an elaborate ploy to spend a lot of time on things that don’t matter at all) but in the face of them we must keep a clear head about what testing is.

What Testers Find

While testing at eBay recently, I realized that we need a deeper account of what testers find. It’s not just bugs. Here’s my experimental list:

Testers find bugs. In other words, we look for anything that threatens the value of the product. (This ties directly into Jerry Weinberg’s famous dictum that quality means value to some person who matters, at some time.) Some people like to say that testers find “defects.” That is also true, but I avoid that word. It tends to make programmers and lawyers upset, and I have trouble enough. Example: a list of countries in a form is missing “France.”

Testers also find risks. We notice situations that seem likely to produce bugs. We notice behaviors of the product that look likely to go wrong in important ways, even if we haven’t yet seen that happen. Example: A web form is using a deprecated HTML tag, which works fine in current browsers, but may stop working in future browsers. This suggests that we ought to do a validation scan. Maybe there are more things like that on the site.
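
To make that idea of a validation scan concrete, here is a rough sketch in Python using the requests library. The tag list is illustrative, not a complete inventory of deprecated HTML, and the URL is a placeholder:

import re
import requests  # third-party; pip install requests

# A few tags deprecated in HTML -- illustrative, not exhaustive.
DEPRECATED = ("font", "center", "marquee", "blink")

def deprecated_tags(url):
    """Report which deprecated tags appear in the page at url."""
    html = requests.get(url, timeout=10).text
    pattern = re.compile(r"<(%s)\b" % "|".join(DEPRECATED), re.IGNORECASE)
    return sorted({match.group(1).lower() for match in pattern.finditer(html)})

print(deprecated_tags("https://example.com"))  # placeholder URL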

Testers find issues. An issue is something that threatens the value of the project, rather than the product itself. Example: There’s a lot of real-time content on eBay. Ads and such. Am I supposed to test that stuff? How should I test it?

Testers find testability problems. It’s a kind of issue, but it’s worth highlighting. Testers should point out aspects of the product that make it hard to observe and hard to control. There may be small things that the developers can do (adding scriptable interfaces and log files, for instance) that can improve testability. And if you don’t ask for testability, it’s your fault that you don’t get it. Example: You’re staring at a readout that changes five times a second, wondering how to tell if it’s presenting accurate figures. For that, you need a log file.

Testers find artifacts, too. This is another kind of issue worth highlighting: we sometimes see things that look like problems, but turn out to be manifestations of how we happen to be testing. Example: I’m getting certificate errors on the site, but it turns out to be an interaction between the site and Burp Proxy, which is my recording tool.

Testers find curios. We notice surprising and interesting things about our products that don’t threaten the value of the product, but may suggest hidden features or interesting ways of using the product. Some of them may represent features that the programmers themselves don’t know about. They may also suggest new ways of testing. Example: Hmm. I notice that a lot of complex content on eBay is stored in iframes. Maybe I can scan for iframes and systematically discover important scripts that I need to test.
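
Here is a rough sketch of that iframe scan in Python, using the requests and BeautifulSoup libraries. The URL is a placeholder, not a real eBay page:

import requests                # pip install requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def iframe_sources(url):
    """Return the src of every iframe on a page; each one is a lead worth testing."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    return [frame.get("src") for frame in soup.find_all("iframe")]

for src in iframe_sources("https://example.com/listing"):  # placeholder URL
    print(src)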

Maybe there are other things you think should be added to this list. The point is that the outcomes of testing can be quite diverse. Keep your eyes and your mind open.

Quick Oracle: Blink Testing

Background:

  1. In testing, an “oracle” is a way to recognize a problem that appears during testing. This contrasts with “coverage”, which has to do with getting a problem to appear. All tests cover a product in some way. All tests must include an oracle of some kind, or else we would call them mere tours rather than tests. (You might also call such a thing a test idea, but not a complete test.)
  2. A book called Blink: The Power of Thinking Without Thinking has recently been published on the subject of snap decisions. I took one look at it, flipped quickly through it, and got the point. Since the book is about making decisions based on little information, I can’t believe the author, Malcolm Gladwell, seriously expected me to sit down and read every word.

“Blink testing” represents an oracle heuristic I find quite helpful, quite often. (I used to call it “grokking”, but Michael Bolton convinced me that blink is better. The instant he suggested the name change, I felt he was right.)

What you do in blink testing is plunge yourself into an ocean of data: far too much data to comprehend. And then you comprehend it. Don’t know how to do that? Yes you do. But you may not realize that you know how.

You can do it. I can prove this to you in less than one minute. You will get “blink” in a wink.

Imagine an application that adds two numbers together. Imagine that it has two fields, one for each number, and it has a button that selects random numbers to be added. The numbers chosen are in the range -99 to 99.

Watch this application in action by looking at this movie (which is an interactive EXE packaged in a ZIP file) and ask yourself if you see any bugs. Once you think you have it, click here for my answer.

  • How many test cases do you think that was?
  • Did it seem like a lot of data to process?
  • How did you detect the problem(s)?
  • Isn’t it great to have a brain that notices patterns automatically?
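
If you can’t run the movie, you can reproduce the spirit of the demo yourself. Here is a minimal Python sketch; the stream length and the one-in-twenty error rate are my own choices for illustration, not anything from the original demo:

import random

# Print a fast stream of sums. Most are correct; a few are subtly off.
# Scroll through quickly and see whether your eye catches the bad ones.
for _ in range(200):
    a = random.randint(-99, 99)
    b = random.randint(-99, 99)
    total = a + b
    if random.random() < 0.05:           # inject an occasional bug
        total += random.choice((-1, 1))  # off by one: hard to spot
    print(f"{a} + {b} = {total}")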

There are many examples of blink testing, including:

  • Page through a long file super rapidly (hold your thumb on the Page Down key), notice the pattern of blurry text on the screen, and look for strange variations in that pattern.
  • Take a 60,000-line log file, paste it into Excel, and set the zoom level to 8%. Scroll down and notice the pattern of line lengths. You can also use conditional formatting in Excel to turn lines red if they meet certain criteria, then notice the pattern of red flecks in the gray lines of text as you scroll. (A console version of this trick is sketched just after this list.)
  • Flip back and forth rapidly between two similar bitmaps. What catches your eye? Astronomers once did this routinely to detect comets.
  • Take a five-hundred-page printout (it could be technical documentation, database records, or anything) and flip quickly through it. Ask yourself what draws your attention most. Ask yourself to identify three interesting patterns in it.
  • Convert a huge mass of data to sound in some way. Listen for unusual patterns amidst the noise.
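
As promised above, here is a console riff on the Excel trick, sketched in Python. The ERROR keyword and the 120-character threshold are arbitrary placeholders; tune them to whatever “certain criteria” matter in your own logs:

import sys

def blink_view(path, keyword="ERROR"):
    """Compress a log to one character per line so patterns pop out."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for line in log:
            if keyword in line:
                sys.stdout.write("!")  # the "red fleck"
            elif len(line.rstrip()) > 120:
                sys.stdout.write("#")  # an unusually long line
            else:
                sys.stdout.write(".")
    print()

blink_view("app.log")  # placeholder filename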

All of these involve pattern recognition on a grand scale. Our brains love to do this; our brains are designed to do this. Yes, you will miss some things; no, you shouldn’t care that you are missing some things. This is just one technique, and you use other techniques to find those other problems. We already have test techniques that focus on the trees; it also helps to look at the forest.