The Unbearable Lightness of Model-Based Testing
Last night I gave a webinar to the CAMUG people in Calgary. About forty people attended online, as well. You can see this presentation on Google Video. The resolution is poor, so you may also want to see the slides.
The talk itself is 20 minutes long, then I answer questions for an hour.
In this talk, I tried to illustrate why any testing based on a model is inherently limited, and why we must strive against those limitations. I talk about some ways of doing that.
Questions
Numerous questions came up during the talk, and I answered many of them. Below are the questions typed to me that I did not have a chance to answer during the event:
How often do we apply blink testing?
I apply blink testing any time I can arrange to be confronted with a blizzard of data: comparing screens (I glance at a million pixels, then at another million pixels, and in an instant I see the tiny difference between them), scrolling through huge log files, or watching an extremely rapid process take place. Anything that seems overwhelming to take in triggers me to consider a blink test. By the way, when I say “blink test” I’m talking mostly about a blink oracle.
When presented with an application to test, what triggers (if any) lead you to choose model-based testing? Or do you always think through state transitions?
Everyone, always, is doing model-based testing if that term means “testing according to a model.” The only possible exception to that would be testing by accident. As soon as you test purposefully, that means you already have a model in mind.
Automatically generating tests according to a specified model is the narrower definition of model-based testing that people like Google’s Harry Robinson prefer. What triggers that for me is whenever I want to meticulously cover a test space that I can conveniently describe with a tractable handful of variables (or if I see a regular and simple notation to describe those tests, regardless of the number of variables). I then write a little program in Perl to generate the test ideas. I may also try to automate those tests. For instance, I wrote a program to generate tests for the example I used in my talk. It produced 152 state transition cases, each consisting of a start state, three actions, and an expected end state. Like this:
TRANSITION SEQUENCES:
1 launching -> (finished launching) stop start -> running
2 launching -> (finished launching) stop reset -> resetted
3 launching -> (finished launching) stop stop -> stopped
4 launching -> (finished launching) reset start -> running
5 launching -> (finished launching) reset reset -> resetted
.....
148 stopped -> stop reset start -> running
149 stopped -> stop reset reset -> resetted
150 stopped -> stop stop start -> running
151 stopped -> stop stop reset -> resetted
152 stopped -> stop stop stop -> stopped
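A generator of this kind can be sketched in a few lines. The version below is in Python rather than Perl, and it is only an illustration: the transition table, the state and action names, and the function name generate_cases are assumptions made up for the sketch, not the model from the talk, so the number of cases it produces differs from 152.

```python
from itertools import product

# Hypothetical transition table: (state, action) -> next state.
# A missing pair means the action is not available in that state.
TRANSITIONS = {
    ("launching", "(finished launching)"): "running",
    ("running", "start"): "running",
    ("running", "stop"): "stopped",
    ("running", "reset"): "resetted",
    ("stopped", "start"): "running",
    ("stopped", "stop"): "stopped",
    ("stopped", "reset"): "resetted",
    ("resetted", "start"): "running",
    ("resetted", "stop"): "stopped",
    ("resetted", "reset"): "resetted",
}


def generate_cases(transitions, length=3):
    """Return (start state, action sequence, expected end state) tuples.

    Every sequence of `length` actions is tried from every start state;
    a sequence is dropped as soon as it hits an unavailable action.
    """
    states = sorted({state for state, _ in transitions})
    actions = sorted({action for _, action in transitions})
    cases = []
    for start in states:
        for sequence in product(actions, repeat=length):
            state = start
            for action in sequence:
                state = transitions.get((state, action))
                if state is None:
                    break  # action not possible here; skip this sequence
            else:
                cases.append((start, sequence, state))
    return cases


if __name__ == "__main__":
    print("TRANSITION SEQUENCES:")
    for number, (start, sequence, end) in enumerate(generate_cases(TRANSITIONS), 1):
        print(f"{number} {start} -> {' '.join(sequence)} -> {end}")
```

The expected end state comes from the same table that drives the generation, so the output is only as trustworthy as the table. Any behavior the table leaves out never shows up in the generated cases.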
How do you suggest applying this technique to a complex application, something like MS Word for instance?
I’m not sure what technique you are referring to. If you are talking about using state models to describe a system, then it’s interesting that you ask that, because I asked Harry Robinson that same question (it was even about applying it to Word) after seeing an otherwise fascinating talk about state-based testing that he gave, years ago. As I recall, he wasn’t ready to answer that question. But I am. My answer is: ask Harry. Seriously, he has actually worked with model-based testing tools at Microsoft, in the years since I challenged him on this point.
I have another answer, too. I don’t apply automated model-based testing to entire applications, I apply that method opportunistically. So, if I were testing Word, I would be looking for features of Word that seemed especially like tractable state machines, or were tractable (meaning not too many variables and complications) to test via some other kind of model. Otherwise, what I am always doing is developing models in my mind (we call it learning) and using those models to test any given product, no matter how complex it is.
Have you come across any tools for automated Model based testing?
I use Perl. I’m sure there are other kinds of tools, but I haven’t used them.
What’s too much modeling?
Too much formal modeling is when you give it more time than it is worth, or when many other interesting things don’t get done because you are obsessed with the formalisms and the cool tools. Too much attention to one kind of model will starve attention for other kinds of models that also have testing value.
Technology Notes
The technology of Webinars is still a bit immature. I used GoToWebinar, and I think the price/value arrangement is pretty good. WebEx has better features, but it’s much more expensive.
I had to wear two headsets, one to record my voice and talk to the CAMUG auditorium via Skype, the other to talk on the webinar conference line. The CAMUG people (in an auditorium) and the online people were not able to hear each other. I wasn’t able to hear the online people, but they could type questions to me. I wish the audio was all somehow integrated online. I would have wanted to record the CAMUG audio, too.
Animations happening on my screen were not smoothly displayed to the audience, but at least they were displayed.
Skype drops calls pretty often, but luckily not during this particular presentation.
January 9th, 2007 at 1:44 pm
The video is an unimaginable wealth of knowledge about Model Based Testing!
[James’ Reply: Well, maybe not unimaginable.]
January 10th, 2007 at 3:00 pm
Is there an open forum here for readers to post their questions?
[James’ Reply: It’s a private forum where I encourage readers to attempt to post their questions.]
January 10th, 2007 at 5:53 pm
James, thanks for posting your presentation on Google Video. I enjoyed learning about model-based testing and look forward to seeing more online presentations in the future.
January 14th, 2007 at 10:16 am
James,
Thanks for answering the question and for writing more about Harry Robinson and model-based testing. There seem to be a good number of interesting papers on model-based testing.
[Niteen]
January 24th, 2007 at 10:33 pm
James,
Automated model-based testing does not need to be as light as you suggest. You have appropriately identified the limitations that prevent automated model-based testing from scaling beyond simple application components.
I’ve been implementing automated model-based tests since I read Harry’s article in the September 2000 STQE magazine (http://www.geocities.com/model_based_testing/intelligent.pdf).
I believe that most automated UI testing should be model-based. And I am a proponent of exploratory manual testing. As someone that makes a living from automated testing, I often find myself in the awkward position of trying to communicate the value of manual testing when asked to automate “everything”. I want my test automation to provide value to the manual testers – not attempt to replace them. I am a firm believer in your Rule #1 of test automation: “A good manual test cannot be automated.”
I don’t like most purely scripted testing — whether it be manual or automated. Testing needs to be an interactive activity – whether it be manual or automated.
Test automation should be a tool to assist the manual tester – just as most software assists human users instead of replacing them. The industry needs to stop treating test automation as a batch process that runs over and over again without human intervention. I believe that many automation efforts fail because they are attempts to replace the manual tester.
Automated model-based testing allows me to simplify the automated test creation and maintenance. As with any test automation, the automation is only as good as what the designer put into it. If unexpected input is not modeled, it will not be tested. And a big part of any automation maintenance is to enhance the model based on new knowledge and ideas.
I use automated model-based testing to test whole applications. (Not to imply that I wholly test any application.) I am not manually defining every possible state transition. Most modern software applications are hierarchical in nature. I address the state explosion problem by using hierarchical state machines. In your example, I would only need to define the right-click actions once and then let the computer flatten the model into all the possible transitions.
[James’ Reply: The basic point I’m arguing for is that ALL testing is model-based and ANY explicit model is a simplification that may be an oversimplification. So, from my point of view, while I agree that you can just right-click to add the actions, the problem is that you have to think of doing that, and you have to decide whether it’s worth doing that. This is not a trivial matter. The unbearable lightness is the trouble we have getting to a model that we ought to trust; that is justifiable and sufficient.]
I also simplify the model definition by defining some things as data and others as states. In your example, I would likely make the clock a data item (rather than a different state) and add oracles to my test to confirm that the clock is ticking in the expected direction.
[James’ Reply: You would do that if you thought of it, and thought that it was worth doing.]
You say that I am not going to make a 1400 box state model. You are correct. I let the computer make it for me from one or more simple hierarchical state models. I am running automated model-based tests with hundreds of thousands of state transitions — and the automation is finding important bugs missed by manual testing.
[James’ Reply: Your automation is probably also missing important bugs that could be found by interactive, attentive, sentient, cognitively intensive testing. This is because your model is a simplification. The computer cannot, in fact, “make it for you”. What it’s doing is fulfilling an algorithm it has been given, and that algorithm is full of assumptions that cannot be questioned by your software. The computer is expanding a model that you have already provided.]
Hierarchical state machines even let me easily validate my assumptions about the behavior of an application. For example, if I expect the same behavior from pressing a button no matter what the state within an application, I can define the behavior once and let the computer run by itself and report any states from which the behavior is different. This information can then be used to help direct manual testing and future improvements to the model.
[James’ Reply: I think the point you are making is that model-based test tools can be wonderfully helpful. I agree. That’s why I use them. However, I use them (and I suspect you do, too) with a healthy skepticism and ongoing vigilance for things that may be missed.]
January 27th, 2007 at 1:22 pm
Thanks for the reply. And thank you for continually challenging testers to think.
I am a skeptic and agree that a healthy amount of skepticism is necessary to be a good tester. I apply my “defensive pessimism” to testing. (See the book “The Positive Power of Negative Thinking” or http://www.defensivepessimism.com.) Although always considering what I (and others) might be missing and what might go wrong probably isn’t a good way to win friends, it does contribute to good testing.
I agree with your point: ALL testing is model based. Mental models are the basis for all testing. Scripted testing is based on a tester’s mental model at the time of scripting, which is why I don’t like scripted automation: it is a specific implementation of part of an explicit model. An engaged manual tester updates their mental model as they learn during testing.
I am certain my automation is missing important bugs that could be found by “interactive, attentive, sentient, cognitively intensive testing”. Models are simplifications of the real thing, as automated testing is a simplification of the real thing. I find that automating tests with explicit models can be (if the automation designer thinks of the right things) less of a simplification than automating scripted test procedures.
My models are full of assumptions and (as you point out) my software cannot question those assumptions but it can report anything it finds that deviates from my assumptions. I can then analyze the reported data (another task that takes an engaged human mind) to refine my mental model. In this way, automation can be a great testing tool.
I think that the idea of scripting test cases came from a time when developers wrote code with pencil and paper, punched it into cards, waited for computer time, and finally attempted to compile the code. Coding software is now a more interactive activity. Developers can try things and instantly compile and test their ideas. For some reason, many testing practices are still based on the way software was developed 40 years ago.
[James’ Reply: Thanks for commenting, Ben. This makes sense to me.]
February 7th, 2007 at 4:43 pm
Jim,
Interesting presentation. I would make a couple of observations.
Your presentation seems to make a better case for exploratory testing than any form of model-based testing. However I would argue that you were not testing the app so much as learning its observed behavior and then itemizing what seemed “unusual.” If the point of the app was a quick and dirty promotion to sell high speed connectivity, then most of your list of behaviors would probably not be classified as bugs.
[James’ Reply: Well, that’s not really the point. I don’t care, in this case, what someone would consider a bug. My point is that any application, even a simple one, is more complex than we can conveniently model in any formal way. And I’m arguing that we must test in a way that admits surprises and helps us expand our models in real-time.]
Exploratory understanding of the app is much better done against the specification, when the developer can benefit, rather than after the code is written and rework becomes expensive. Exploratory testing remains a terribly unmanageable and inefficient process for testing that an application conforms to specification.
[James’ Reply: I don’t know where your assertion is coming from. What experience do you have with exploratory testing? I suspect you may be using a different definition of it than we do who foster this approach. You’re calling it unmanageable… Well, do you mean you don’t know how to manage it? Or are you saying that no one can manage it? If the latter, I find that surprising, considering that I can manage it, I do manage it, and I teach other people how to manage it.]
I would challenge the notion of being sceptical of model-based testing. Rather, the issue is to understand thoroughly the types of coverage which each model approach provides and apply the appropriate ones to the testing problem.
[James’ Reply: If you want to challenge, then go ahead and challenge, but where’s your challenge? What I see is a counter-assertion without any reason provided. Please let me hear your argument. Skepticism is a powerful attitude to have in testing. I demonstrated specifically, in my talk, why any model (any model at all) is going to limit you. This is not necessarily a problem, but it is a problem if you fall in love with a particular model, or with the idea of testing exclusively against formal models. It is my skepticism that protects me from such obsessions.]
For example, 70% of defects are the result of incorrect implementation of required functionality. Only 15% of bugs are actual coding errors. Cause-effect models provide test cases with very thorough functional coverage. So a C-E model is preferable over a code structure model for catching the most prevalent bugs. State models exercise state transitions but may not exercise all of the conditions which result in a state change. So if I am testing a cell phone app, a blend of C-E and state models is required to get adequate coverage.
[James’ Reply: Where did you get these numbers? Without context, they have no meaning or value to me. You might as well say “somebody somewhere at some time measured the temperature of something and the temperature was 70 degrees Fahrenheit, therefore the temperature where you are must be…” Come on, man. Don’t use numbers that way. Anyway, I would be happy to evaluate your argument once you put it in some sort of meaningful form. There are so many suppressed premises and assumptions that I can’t make sense of it. You can start by defining some of your terms. What do you think “functional testing” means? What do you think a cause-effect model is? Then show me a “cause-effect model” and tell me why I should believe that it represents the application that you claim it represents? Peter, I deal in fabulous and overwhelming complexities. I am a tester. I’m not a model believer, I’m a model questioner. I AM A TESTER.]
The value of formal model-based testing is that it provides a consistent and reproducible approach to designing test cases and obtaining measurable test coverage. That means manageability of the test process and predictability of time and cost. In our testing events, we pick the modeling approaches appropriate to the problem. This allows us to meet time and budget constraints, manage and track the process, and consistently deliver very high-reliability applications.
Pete Becker
Critical Logic
[James’ Reply: Peter, I would urge you to be more self-critical. I think you have achieved the illusion of manageability, and you have achieved this illusion by avoiding deep questions about your models and their intellectual and empirical foundations. I don’t offer my clients formulas for success. I offer them a way to gain the necessary skills to avoid fooling themselves.]
February 8th, 2007 at 2:03 pm
This is great, thanks James!