Rethinking Equivalence Class Partitioning, Part 1

Wikipedia’s article on equivalence class partitioning (ECP) is a great example of the poor thinking and teaching and writing that often passes for wisdom in the testing field. It’s narrow and misleading, serving to imply that testing is some little game we play with our software, rather than an open investigation of a complex phenomenon.

(No, I’m not going to edit that article. I don’t find it fun or rewarding to offer my expertise in return for arguments with anonymous amateurs. Wikipedia is important because it serves as a nearly universal reference point when criticizing popular knowledge, but just like popular knowledge itself, it is not fixable. The populus will always prevail, and the populus is not very thoughtful.)

In this article I will comment on the Wikipedia post. In a subsequent post I will describe ECP my way, and you can decide for yourself if that is better than Wikipedia.

“Equivalence partitioning or equivalence class partitioning (ECP)[1] is a software testing technique that divides the input data of a software unit into partitions of equivalent data from which test cases can be derived.”

Not exactly. There’s no reason why ECP should be limited to “input data” as such. The ECP thought process may be applied to output, or even to versions of products, test environments, or test cases themselves. ECP applies to anything you might consider doing that involves variations which may influence the outcome of a test.

Yes, ECP is a technique, but a better word for it is “heuristic.” A heuristic is a fallible method of solving a problem. ECP is extremely fallible, and yet useful.

“In principle, test cases are designed to cover each partition at least once. This technique tries to define test cases that uncover classes of errors, thereby reducing the total number of test cases that must be developed.”

This text is pretty good. Note the phrase “In principle” and the use of the word “tries.” These are softening words, which are important because ECP is a heuristic, not an algorithm.

Speaking in terms of “test cases that must be developed,” however, is a misleading way to discuss testing. Testing is not about creating test cases. It is for damn sure not about the number of test cases you create. Testing is about performing experiments. And the totality of experimentation goes far beyond such questions as “what test case should I develop next?” The text should instead say “reducing test effort.”

“An advantage of this approach is reduction in the time required for testing a software due to lesser number of test cases.”

Sorry, no. The advantage of ECP is not in reducing the number of test cases. Nor is it even about reducing test effort, as such (even though it is true that ECP is “trying” to reduce test effort). ECP is just a way to systematically guess where the bigger bugs probably are, which helps you focus your efforts. ECP is a prioritization technique. It also helps you explain and defend those choices. Better prioritization does not, by itself, allow you to test with less effort, but we do want to stumble into the big bugs sooner rather than later. And we want to stumble into them with more purpose and less stumbling. And if we do that well, we will feel comfortable spending less effort on the testing. Reducing effort is really a side effect of ECP.

“Equivalence partitioning is typically applied to the inputs of a tested component, but may be applied to the outputs in rare cases. The equivalence partitions are usually derived from the requirements specification for input attributes that influence the processing of the test object.”

Typically? Usually? Has this writer done any sort of research that would substantiate that? No.

ECP is a process that we all do informally, not only in testing but in our daily lives. When you push open a door, do you consciously decide to push on a specific square centimeter of the metal push plate? No, you don’t. You know that for most doors it doesn’t matter where you push. All pushable places are more or less equivalent. That is ECP! We apply ECP to anything that we interact with.

Yes, we apply it to output. And yes, we can think of equivalence classes based on specifications, but we also think of them based on all other learning we do about the software. We perform ECP based on all that we know. If what we know is wrong (for instance if there are unexpected bugs) then our equivalence classes will also be wrong. But that’s okay, if you understand that ECP is a heuristic and not a golden ticket to perfect testing.

“The fundamental concept of ECP comes from equivalence class which in turn comes from equivalence relation. A software system is in effect a computable function implemented as an algorithm in some implementation programming language. Given an input test vector some instructions of that algorithm get covered, ( see code coverage for details ) others do not…”

At this point the article becomes Computer Science propaganda. This is why we can’t have nice things in testing: as soon as the CS people get hold of it, they turn it into a little logic game for gifted kids, rather than a pursuit worthy of adults charged with discovering important problems in technology before it’s too late.

The fundamental concept of ECP has nothing to do with computer science or computability. It has to do with logic. Logic predates computers. An equivalence class is simply a set. It is a set of things that share some property. The property of interest in ECP is utility for exploring a particular product risk. In other words, an equivalence class in testing is an assertion that any member of that particular group of things would be more or less equally able to reveal a particular kind of bug if it were employed in a particular kind of test.

If I define a “test condition” as something about a product or its environment that could be examined in a test, then I can define equivalence classes like this: An equivalence class is a set of tests or test conditions that are equivalent with respect to a particular product risk, in a particular context. 

This implies that two inputs which are not equivalent for the purposes of one kind of bug may be equivalent for finding another kind of bug. It also implies that if we model a product incorrectly, we will be unable to know the true equivalence classes. Actually, considering that bugs come in all shapes and sizes, to have the perfectly correct set of equivalence classes would be the same as knowing, without having tested, where all the bugs in the product are. This is because ECP is based on guessing what kinds of bugs are in the product.
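To make that risk-relativity concrete, here is a minimal sketch in Python. The inputs and the two “risks” are illustrative assumptions, not drawn from any particular product:

```python
# A sketch of the idea above: the same inputs partition differently
# depending on which kind of bug you are hunting. The two risks and
# their partition keys are illustrative assumptions.

def partition(values, key):
    """Group values into equivalence classes under a given relation."""
    classes = {}
    for v in values:
        classes.setdefault(key(v), []).append(v)
    return classes

inputs = [-5, -1, 0, 3, 2**40]

# Risk: mishandled negative numbers -> partition by sign
by_sign = partition(inputs, key=lambda v: v < 0)

# Risk: 32-bit overflow -> partition by whether the value fits in int32
by_fit = partition(inputs, key=lambda v: -2**31 <= v < 2**31)

# -5 and 3 are inequivalent for the sign risk, but equivalent for the
# overflow risk: equivalence is relative to the risk.
print(by_sign)  # {True: [-5, -1], False: [0, 3, 1099511627776]}
print(by_fit)   # {True: [-5, -1, 0, 3], False: [1099511627776]}
```

Note that neither partitioning is “the” correct one; each is a guess tied to one conjectured kind of bug.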

If you read the technical stuff about Computer Science in the Wikipedia article, you will see that the author has decided that two inputs which cover the same code are therefore equivalent for bug finding purposes. But this is not remotely true! This is a fantasy propagated by people who I suspect have never tested anything that mattered. Off the top of my head, code-coverage-as-gold-standard ignores performance bugs, requirements bugs, usability bugs, data type bugs, security bugs, and integration bugs. Imagine two tests that cover the same code, and both involve input that is displayed on the screen, except that one includes an input which is so long that when it prints it goes off the edge of the screen. This is a bug that the short input didn’t find, even though both inputs are “valid” and “do the same thing” functionally.
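A minimal sketch of that scenario, assuming a hypothetical 20-character display width:

```python
# Both inputs exercise exactly the same statements in format_name
# (identical code coverage), yet only the long one reveals the
# display problem. SCREEN_WIDTH is an assumed value for illustration.

SCREEN_WIDTH = 20

def format_name(name: str) -> str:
    return "Hello, " + name + "!"  # the same code runs for any name

short = format_name("Ada")
long_one = format_name("Wolfeschlegelsteinhausen")

print(len(short) <= SCREEN_WIDTH)     # True  -- fits on the screen
print(len(long_one) <= SCREEN_WIDTH)  # False -- runs off the edge
```

By the code-coverage criterion these two inputs belong to the same equivalence class; by the display-risk criterion they do not.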

The Fundamental Problem With Most Testing Advice Is…

The problem with most testing advice is that it is either uncritical folklore that falls apart as soon as you examine it, or else it is misplaced formalism that doesn’t apply to realistic open-ended problems. Testing advice is better when it is grounded in a general systems perspective as well as a social science perspective. Both of these perspectives understand and use heuristics. ECP is a powerful, ubiquitous, and rather simple heuristic, whose utility comes from and is limited by your mental model of the product. In my next post, I will walk through an example of how I use it in real life.

13 thoughts on “Rethinking Equivalence Class Partitioning, Part 1”

  1. “…or else it is misplaced formalism that doesn’t apply to realistic open-ended problems.”

    I think that’s one of the dirty little secrets of these techniques. ECP, All-pairs, finite state machines: they are typically presented in these perfect magical problems that are specifically designed to fit the method. Then we go out and try to use them and see that we can sort-of shoehorn some of these ideas into the problem … sort of.

    Real testing is messy. And that’s okay. The false certainty presented by the methods makes me a little uncomfortable. To borrow a line, they might be a good place to start, but don’t stop there.

    And notice the word ‘might.’ 🙂

    [James’ Reply: If you just say these things are heuristic, all the rest of what you are saying simply follows from the basic meaning of the word.

    Formalism is itself a heuristic. A heuristic is not necessarily a starting place. It’s a tool– a fallible but potentially useful tool that with skill and judgment may provide just what you need in some context. When we discuss these things AS heuristics, with reference to a socially competent human operator, then we more or less automatically talk about their pros and cons without falling into the traps of idolizing or idealizing them.

    I observe that people who don’t use the word “heuristic” are forced to speak in indirect or almost mystical terms about tacit knowledge, which diminishes the message.]

  2. Thanks James, I enjoyed the post.

    One difference I noticed between your model of ECP and the model that I have was that I don’t consider ECP a prioritisation technique.

    I would describe ECP as a ‘chunking’ technique (or possibly a ‘filtering’ technique) for creating models with different partitions. I can partition with different ‘equivalence’ views on the same data to create different equivalent partitioned sets.

    [James’ Reply: Other than prioritization, what purpose does chunking serve? I say it is a prioritization technique because that is the only actual value I can see to the chunking. Creating chunks is just a fantasy, in and of itself. You never know whether your sets are valid as equivalence classes. They are guesses. When people say “see I made an equivalence class, and now I only have to test one element from it” I ask them why they even need to do that much: if we can just assume our way out of having to test the members of a supposed equivalence class, why not go all the way and just assume there are no bugs at all? Depending on the actual bugs in the product, your chunks may be all wrong.

    But if you see its purpose as prioritizing, then it makes sense: you say “hey I don’t have the time to try every little thing, so I’m going to make guesses about the things that are most important, then test breadth-first on each category. If I have more time I might unpack my equivalence guesses more deeply and question those assumptions.”]

    e.g. ‘vowel’, ‘punctuation’ & ‘consonant’ partition data differently than ‘ascii’ & ‘extended ascii’, but both ways can be applied to the same data set.

    [James’ Reply: You mean they are supposed to be applicable to the same data set. There may be a bug that spoils that.]
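    The two partitionings of the same data described above can be sketched like this (the sample string and the class definitions are arbitrary assumptions):

```python
# Two different equivalence partitionings of the same data. Which one
# is useful depends on the risk being probed -- and, as noted above, a
# bug may spoil the assumption that either partitioning really holds.

data = "a,b;E!é"
vowels = set("aeiouAEIOUéÉ")  # accented vowels included, by assumption

by_kind = {
    "vowel":       [c for c in data if c in vowels],
    "consonant":   [c for c in data if c.isalpha() and c not in vowels],
    "punctuation": [c for c in data if not c.isalnum()],
}
by_encoding = {
    "ascii":          [c for c in data if ord(c) < 128],
    "extended ascii": [c for c in data if 128 <= ord(c) < 256],
}

print(by_kind["vowel"])               # ['a', 'E', 'é']
print(by_encoding["extended ascii"])  # ['é']
```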

    Having created a bunch of sets through ECP, I can then use prioritisation techniques with those partitions, but I don’t view ECP itself as a prioritisation technique.

    I do have to choose to explore one partition before another, so I do have to prioritise one partition above the others.

    And I do have to prioritise different partitioning approaches.

    But I don’t think that ECP itself prioritises the output for me, or helps me prioritise the view to apply to partition the data. I have to use other techniques to do that.

    [James’ Reply: The technique specifically involves elevating or suppressing test conditions. The whole point and only meaning of this is to decide which of the many many possible conditions I should not have to worry about (because they are equivalent). This is most definitely prioritization. But, yes there is also prioritization that goes on outside the bounds of ECP.]

    For me ECP and prioritisation are separate, but I can use the output of ECP as input to my prioritisation process.

    [James’ Reply: Show me, step-by-step, your process of ECP, and I will show you where you are privileging some conditions while suppressing others.]

    Thanks.

    Alan

    [James’ Reply: When I first edited your comment I used the WordPress editing app on my iPad. However, the comment is too long, and the edit field couldn’t handle it. I had to switch to my laptop to finish… I guess somebody got their testing priorities wrong, that time. Maybe they thought that all comments are equivalent and didn’t test beyond the obvious.]

  3. Thanks. Creating some examples of ECP is on my todo list – I do want to go back to basics again.

    “Other than prioritization, what purpose does chunking serve?”

    The chunking creates a model – a collection of sets.

    I can use the model:

    – to prioritise
    – to ask questions, e.g.
      – are these values supposed to be treated as equivalent by the system at this point?
      – have we covered these equivalence classes in the Unit Test Code?
      – did the coding try to honour these equivalence classes?
    – to communicate
      – e.g. I am going to use data from these collections and I expect the system to process them the same way
    – to document model coverage
      – e.g. I could ‘cross off’ the elements in the set as I use them
    – to model the system
      – e.g. I can build up the partitions as I use the system, and I can expect the partitions to expand until I create generalised rules that describe the properties that make the elements in the partition equivalent. Then I can compare my rules to the system.

    [James’ Reply: I think what we are disagreeing about is whether establishing the partitions is by its very nature an act of highlighting some things and suppressing other things. I think it is, and I think highlighting and suppressing is a form of prioritizing. The things that I lump together are not going to get as much coverage as they would if I didn’t lump them.

    Otherwise, I agree that, having made those choices, you can go on to do all the other cool things you have suggested with your model.]
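    The ‘cross off’ usage mentioned in the list above can be sketched like this (the partition names and members are invented for illustration):

```python
# A tiny sketch of the "cross off" idea: track which equivalence
# classes a test session has touched so far. All names are assumed.

partitions = {
    "empty": {""},
    "short": {"a", "hi"},
    "long":  {"x" * 1000},
}

covered = set()

def record(value):
    """Mark every class whose members include this value as covered."""
    for name, members in partitions.items():
        if value in members:
            covered.add(name)

record("hi")
record("")

print(sorted(covered))                    # ['empty', 'short']
print(sorted(set(partitions) - covered))  # ['long'] -- still untested
```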

    I recently had a little encounter with the concept of ECP and immediately got the feeling that there was something about it that didn’t sit right with me, but I couldn’t quite put my finger on what it was, which was kind of frustrating.

    After reading this post I feel that the fog has cleared a bit so thank you for that!

    Do you have any suggestions for how I can improve my ability to put my intuitions into words?

    [James’ Reply: By practicing explaining yourself. For instance, I find that teaching others helps me tremendously. This post itself comes out of things that I developed in the classroom.]

  5. Hi James,

    I have the impression the Wikipedia article describes ECP from a programmer’s point of view, rather than a tester’s. (As evidenced by the further reading links to the Testing Standards Working Party website, which has a draft for a unit testing standard, and to a tool for creating partitions from UML diagrams.) Seeing it in that context, I think, explains some of what a tester must see as shortcomings in the text.

    [James’ Reply: It’s called a test technique. I’m talking about testing. If they are talking about testing, too, then they are selling it short by obsessing over code coverage. Even from a programmer’s point of view, the article is wrong, if the goal is to discuss this technique itself, rather than one specific application of the technique.]

    However, I do think of ECP as a rather technical thing. See, e.g., Cem Kaner and Sowmya Padmanabhan’s “Domain Testing Workbook”. In that sense, I think the example from the Wikipedia article works, as far as explaining the basics of the concept goes. It lacks severely in explaining how real life makes the application of the technique/heuristic much more difficult and error prone.

    [James’ Reply: Explaining the basics is exactly what the article is NOT doing. Those are not the basics, my friend. ECP is logical, not technological. It can be applied in many interesting technical ways, but those are variations or instantiations of the idea, not the basics.

    In that sense, I think you have it backwards: the article does present a “real world” example of a particular way to apply ECP, and yet presents that one example as definitional.]

    I am looking forward to reading in part 2 how you are expanding the concept outside this technical constraint.

    • Hi James,

      I should probably have stated clearly that I agree with you insofar as I think the article is badly written.

      [James’ Reply: My claim is not that the article is badly written, but rather that it is factually incorrect.]

      “Even from a programmer’s point of view, the article is wrong, if …”

      Not being a programmer myself, I cannot debate this with you. I just inferred from my experience with programmers that they seem to work well from examples and are quite capable of transferring what they learned from them to their own problem-solving.

      [James’ Reply: I agree, but how does that relate to the point I was making in my post? The article, no matter for whom it is written, gives a false description of equivalence class partitioning.]

      And it seemed to me that the article might “work” for that kind of person. I am not saying that it will lead them to a good understanding of the underlying concept, but that they might get from the article what they were looking for. (Hmmm, my test expert mentioned my unit test suite might benefit from something called Equivalence class partitioning; don’t quite remember how to do that, let’s look it up…)

      [James’ Reply: If they were looking for a description of ECP they won’t find it in that article. They will not go away with an understanding of it. The article perpetuates a shallow and false view of ECP.]

      “ECP is logical, not technological.”

      I am not sure if we are using different words to mean the same thing here? Looking at “The Domain Testing Workbook” (I should add Douglas Hoffmann to the list of authors) it looks very involved.

      [James’ Reply: The authors like to teach with rich examples, but the technique itself is not about technology. It’s about logical relationships of entities within models. It is a very simple concept that when situated in real life becomes complicated for reasons that have nothing to do with the technique itself. What I’m complaining about is how the article missed the point of ECP and described an example of ECP in misleading terms.]

      And I don’t really see myself applying all this logical workflow for anything but the most trivial applications, like the one given in the wikipedia article (and there it really doesn’t seem necessary either).

      [James’ Reply: I don’t know what logical flow you are referring to. ECP itself does not involve a complicated logical workflow. Perhaps you are referring to a specific application of ECP in a particular circumstance.]

      And it is this heavy framework I expect you to lift from my understanding of ECP with part 2 of the series 🙂

      “… the article does present a “real world” example of a particular way to apply ECP, and yet presents that one example as definitional.”

      Hmm, I’m not reading it this way. The paragraph before the code says “The demonstration can be done using a function written in C:”, which to me indicates an example as guidance for explaining some of the concepts (like “valid” and “invalid” partitions).

      [James’ Reply: I don’t think you could have read the paragraph above the example very carefully, man. It defines ECP specifically in terms of code coverage. It formalizes this in terms of equivalence classes within “computable functions.” That is wrong. That is a specific and narrow and technological view of ECP that makes us weaker if we embrace it– as opposed to the logical, general systems view, which applies to many more circumstances.]

      • Hi, James!

        In the post, you write “An equivalence class is simply a set. It is a set of things that share some property.”

        The Wikipedia article states “The fundamental concept of ECP comes from equivalence class which in turn comes from equivalence relation.”

        The linked article on equivalence class states “Formally, given a set S and an equivalence relation ~ on S, the equivalence class of an element a in S is the set…of elements which are equivalent to a.” (I have removed the formula from the quote.)

        So far, I see no disagreement between your statement and theirs.

        [James’ Reply: Me neither. That’s obviously not the problematic part.]

        The Wikipedia article on ECP continues with some questionable attempt to relate software systems to equivalence relations. I have read that passage of course with my own understanding of what ECP is. But after re-reading it, I agree that to someone reading about ECP for the first time, it must give the impression that code coverage is _the_ defining equivalence relation. That is, of course, wrong. It is one possible equivalence relation, which may be useful for certain kinds of unit tests, and, as you already pointed out in your post, quite useless for (almost?) all other types of tests.

        [James’ Reply: Okay, thanks. That was my point. I don’t think it is questionable (in the sense of dubious) to relate one example from computing to ECP. My problem is that the writer(s) are claiming this is definitional. Definitions are important. Definitions allow us to use a concept with confidence even while relating it to new situations. With overly narrow definitions, we will either get an explosion of highly related terms, or the terms will be used in a systematically sloppy way. That’s why I generally argue on the side of general-systems-based definitions of basic terms and then using adjectives and noun phrases to construct more specific ideas from those.]

        In your reply to my previous comment, you write: “I don’t know what logical flow you are referring to.”

        I am referring to the workflow described in “The Domain Testing Workbook”. This workflow involves up to 18 steps (A to R), not all of which need to be applied in all situations, of course, but at least you’d have to check whether a given step applies in the current situation.

        [James’ Reply: I am not familiar with that, but if it is claiming to represent the fundamentals of ECP, then it sounds too complicated to be reasonable. I collaborated with Cem and Doug for many years, and I know that they like to use rich examples for teaching. This sometimes comes at the cost of conceptual power and clarity. I suspect that the logic flow they described is but one possible flow, related to one situation, and not the template for all ECP. ECP is, at its heart, very simple.

        Back in the days when I was in the room with Cem (for instance, while working on our Lessons Learned book) we would have spirited arguments about the value of abstraction vs. example. I have come to a greater appreciation of examples. I don’t know if he has come to appreciate the power of abstraction.

        My commitment not to criticize Cem prevents me from going deeper into this.]

        Cheers,
        Marc

  6. Hi James,

    I found your blog (and this specific entry) after attempting to understand ECP better through reading the article on Wikipedia.

    I’m pretty new to testing and am finding it difficult to find quality sources of useful/practical information. The Wiki article wasn’t much help.

    I’m looking forward to Part 2 on the topic, and the debates that follow soon after in the comments section.

    Thanks!

  7. I look at ECP as a starting point for a modeling exercise. The outcome of ECP would be a test model of the system and sets of test conditions (as you define them here) that I want to use to set up the system and observe.

    If I were to apply ECP as a technique to a system, I would ask “what kind of model of the system would I need to create so that I can arrive at test conditions?” One model might take a black box view (outside-in, or behavioral) and another might take a white box view (inside view: code, libraries, OS, hardware, etc.).

    [James’ Reply: Isn’t it true that pretty much any model can give you test conditions? So, isn’t a better question simply “what is this system?” The answer to that question is the model that you want.]

    Your suggestion of thinking in terms of test conditions (instead of input or output alone) makes me think. Can I say a test condition is the same as the “state of the application” (and that of the OS, network, hardware, and so on)? I mostly have seen ECP used as a technique to sample the input space (explicit inputs from the user of the system).

    [James’ Reply: a test condition (as defined in RST) is not the same thing as a state. A test condition is some aspect (or, of course, combination of aspects taken together, which is also an aspect) of a product that you might examine (directly or indirectly) during the performance of a test. Thus a particular state of a product IS a test condition. But also a line of code is a test condition. A field on the screen is a test condition. If you need to run the test at midnight, then midnight is a test condition. “Condition” in this usage is drawn from the broader sense of “a convention, stipulation, proviso, etc.” (because it is a truth upon which we predicate a particular test, e.g. “I want a test that includes this button”) rather than “mode of being, state, position, nature” (although this second meaning of condition is also part of what we mean by condition and is included in the first sense).]

    Doug Hoffman’s diagram on how a system can fail can be useful here to think about explicit and implicit inputs and outputs.

    Connected to ECP is the infamous Boundary Value Analysis (infamous because it is abused extensively).

    [James’ Reply: BVA as it is normally practiced seems to be just a couple of dopey heuristics tacked onto ECP when you are dealing with ordered sets.]

    The model I use to think about BVA is that of a strip of paper (rectangular) with the shorter edges of the paper as the two boundaries. This strip represents the continuous range of values that one portion of a system can accept and process further (e.g. an input field in a web form). A black box model would focus on all the values that can be provided by the end user, and a white box model would focus on how the value propagates through various programming constructs and finally hits the CPU.
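    The strip-of-paper model suggests the classic BVA picks: values at, and just beyond, each edge of the strip. A minimal sketch, where the 1–100 field range is an assumed example:

```python
# For an ordered range, classic BVA samples at each boundary and just
# outside it. The 1..100 field range is an assumption for illustration.

def boundary_values(low, high):
    """Values at and adjacent to the two edges of an ordered range."""
    return [low - 1, low, low + 1, high - 1, high, high + 1]

print(boundary_values(1, 100))  # [0, 1, 2, 99, 100, 101]
```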

    For ECP, similarly, I would choose a model of the system and then ask where the equivalence classes of test conditions are. Conversely, ECP should be able to help us in developing various models of the application.

    Shrini

  8. Here is a model that I can think of for test condition/state.
    Let us say an application in running condition is like a running video that we are observing/seeing/listening to/thinking about.
    If I pause the video at, say, 13:15:45 hours (a specific time instant) today, I am examining a test condition for the application. I can ask questions like: how is the audio volume, the video clarity, etc.? Here, I think, I am examining the state of the application. To me, state is like a printout of all the parameters and values that characterize the system at a specific time instant.

    [James’ Reply: I understand that is how you see it. However, as I said, that is not the meaning of “condition” that we use in Rapid Software Testing. I prefer my definition because it is in keeping with our philosophy of more broadly inclusive (and therefore easier to use) terms. For the concept you are referring to, we already have a term, and that is “state.”]

    >>>But also a line of code is a test condition. A field on the screen is a test condition.
    I would say a line of code or a field on the screen are structural attributes of the system. When the code is not in a running state – meaning the binary of the application is not loaded in memory, it is on the file system – a line of code is simply a structure. When the binary is loaded in memory, the line of code, let’s say, creates a variable on the stack – it comes to life. Hence, for an element to be considered part of a test condition, it has to be live.

    [James’ Reply: That’s true for your definition of condition, but not for mine.]

    However, I also have an exception to this. A binary file lying on a file system (not live, in that sense) could be impacted by a disk defragmenter run on the file system, an OS update, or simply a computer reboot or power outage. So in a way even a non-live structure can be part of a test condition.

    I know this is a diverging idea. I am just thinking out loud.
