I have often said something like “We found a hundred bugs!” Lots of people have heard me say it. Statements like that are very valuable to me. But we should ask some vital questions about them.
Consider Raisin Bran cereal. If you lived in America and weren’t in solitary confinement during the ’80s and ’90s, you would have seen this commercial for Raisin Bran at some point (or one like it):
Two scoops of raisins!
Huh? Two scoops of raisins? What does that mean?
Perhaps the conversation went like this:
“I want my Bran to have MANY raisins!” barked Boss Kellogg.
“But, Mr. Kellogg, we already include nearly one full standard scoop,” replied the Chief Cereal Mixer. “No one has more raisins than we do.”
“Increase to maximum scoop!”
“But sir! that would violate every–”
“TWO SCOOPS! And damn the consequences.”
“The skies will be black with raisins!”
“Then we shall eat in the shade.”
I doubt anything like that happened, though. I suspect what happened is that somebody mixed some raisins with some bran flakes until it tasted pretty good. Maybe he adjusted it a little to optimize cost of goods (and perhaps they adjust the bran/raisin ratio as cost of goods change). Later, I bet, and completely unrelated to the engineering and manufacturing process, Kellogg’s advertising agency decided to create the impression that customers are getting a lot of value for the money, so they invented a distinguishing characteristic that actually makes no sense at all: an absolute measurement called a “scoop”. And began to speak of it AS IF it were meaningful.
The reason the measurement makes no sense at all is that the “Two Scoops” slogan was pasted onto boxes of substantially different sizes. But even if the measurement makes no sense, the pretentious claim makes a lot of sense, because we humans don’t think through the rational basis of measurements like this unless we are A) rather well trained, and more importantly B) highly motivated. So our unconscious lizard brain says to itself “two means yummy. two means yummy. means two yummy. yummy two…”
At some point, someone (an intern, perhaps) may have asked “But are there actually two scoops of raisins in those boxes?” and the answer was much laughing. Because it could be argued that if there are at least two raisins in the box, then there are two scoops of raisins in the box. It could be argued that if there is one raisin in the box and you used two scoops to measure it (“measure twice and cut once”) then there are two scoops of raisins in the box. If you make up your own measuring unit, such as, say, “scoop”, you can go on to make any other claim you want. This is exactly the point of Jerry Weinberg’s famous dictum “If quality doesn’t matter, you can achieve any other goal you want.”
I was thinking about doing a scientific analysis of this, but someone beat me to it.
Oh What Silliness… OR IS IT?
We have a real problem in testing, and no good solution for it. We are supposed to report the ground truth. Concrete reality. But this turns out to be a very difficult matter. Apart from all the problems of observation and interpretation, we have to summarize our findings. When we do that we are tempted to use scientific tropes (such as nonsensical measurements) to bolster our reports, even when they are poorly founded. We are often encouraged to do this by managers who were raised on Kellogg’s commercials and therefore confuse numbers with food.
Let’s look once again at the Raisin Bran situation and consider what might be the reasonable communication hidden there:
Maybe “two scoops” is intended to mean “ample” or “amply supplied with raisins.” In other words they are saying “You won’t regret buying our Raisin Bran, which always has enough raisins for you. While you’re eating it, we predict you will hum the ‘two scoops of raisins!’ song instead of calling a lawyer or becoming a cereal killer.”
I think there’s a scale built into all of us. It’s a comparative scale. It goes like this:
- *Minimum Possible*
- Nothing
- Hardly any
- Some
- Enough
- Plenty
- Remarkable
- “OMG! That must be a record!”
- *Maximum Possible*
This scale is a bit of a mess. The italicized values move around (e.g. maximum possible may be not enough in some situation). The others, although fixed relative to each other, aren’t fixed in any way more definite than their ordering. The scale is highly situational. It’s relative to our understanding of the situation. For instance, you might be impressed to learn that the Colonia cable ship, which was the largest cable ship in the world in 1925, could carry 300 miles of cable in her hold. If so, you would be very easily impressed, because I just lied to you… According to that article it actually could hold 3,000 miles of cable. (However, bonus points if you were thinking “what KIND of cable?”)
What I do with bug numbers, etc.
I want you to notice my first paragraph in this post. Notice that every sentence in that paragraph invokes an unspecified quantity.
- “I have often…” Often compared to what?
- “Lots of…” Lots compared to what?
- “Very…” Very compared to what?
- “Vital…” Vital compared to what?
You could say “He’s not saying anything definite in those sentences.” I agree, I’m not. I’m just giving an impression. My point is this: an impression is a start. An impression might be reasonable. An impression may make conversation possible. An impression may make conversation successful.
Most engineering statements like this don’t stand alone. Like flower buds, they blossom under the sunlight of questioning. And that’s why I can’t take any engineer seriously who gets offended when his facts are questioned. They cry: “Don’t you believe me?” I answer: “I don’t know what you mean, so belief has no meaning, yet.”
So, as a professional tester who prides himself on self-examination, I am ready for the probing perspective question that might follow my attempt to send an impression: “compared to what?” I am ready for the data question, too: “what did you see or hear that leads you to say this?”
I strive (meaning I consciously and consistently work on this) to be reasonable and careful in my use of qualifiers, quantifiers, quantities, and intensifiers. For instance, you will notice that I just used the word “reasonable”, by which I intend to invoke images of normal professional practice in your mind (A LOT like invoking the image of two healthy reasonable scoops of delicious raisins).
One important and definite thing that is accomplished by this relatively loose use of language is that it allows us to talk to each other without bogging down the conversation with ALL the specifics RIGHT NOW.
Kellogg’s used the method mostly to trick you into buying their bran-smothered raisin products. They didn’t have any reasoning behind “two scoops.” But we can use the same technique wisely and ethically, if we choose. We can be ready to back up our claims.
For Bugs: If I tell you I “found X bugs!!” in your product, the number of exclamation points indicates the true message. An exclamation point means “remarkable” or “lots.” If I tell you I found a lot of bugs in your product, I mean I found substantially more than I expected to find in the product, and more than a reasonable and knowledgeable person in this situation would consider acceptable. And by “more” I don’t mean quantity of bug reports, I mean the totality of diversity of problems, impact of problems, and frequency of occurrence of problems. The headline for that is “lots of bugs” or maybe I should say “two scoops of bugs!”
Gerald M. Weinberg says
Terrific article, James. I have a couple of comments (two scoops of comments, actually).
1. Before the problem of scoops is the problem of *counting* raisins (which the scoops business is designed to obscure). Personally, as concerns raisins in my cereal, if I’m interested in anything at all, that comes even before the counting problem–who cares that much about raisins, but the “two scoops” business presumes and suggests that’s what we should be interested in. Not, for example, the *taste* of the cereal, which in the case of raisin bran is disgusting (IMO). Gee, isn’t that like software, after all, where some people think “quality” is measured by the absence of bugs (bugs found, actually, which is different). What if I’m not interested in the app in the first place?
2. But counting raisins doesn’t make much sense, either, since raisins come in a great variety of sizes (as do scoops). (as do bugs–whaddya know?). To understand the significance of the “two scoops” claim, we’d have to know not so much about scoops as we do about *scooping*. So, I visited the cereal factory (we have one of those) and watched them boxing their raisin bran (not *making* it–God only knows who makes it, or what they make it out of). The cereal flakes come in a huge vat, as do the raisins, in a second vat. Since the customers want their raisins mixed well with the cereal (or the manufacturer wants them mixed well so the customer can’t see how many raisins s/he’s getting), the two ingredients have to be premixed before they fill the boxes. So, there’s no *scooping* at all, but just *dumping* and randomizing by spinning. Therefore, the “two scoops” claim has to be based on measurement after the dumping of the vats and the filling of the boxes. Therefore, it can only be statistical (like the study you cite). What they’re actually saying, then, as you point out, is that there are “lots of raisins” in this box. Of course, there couldn’t be parallels here in s/w, could there?
3. But I save the most important comment for the last. What’s wrong with you James? Don’t you understand how the game is played? Or do you understand, but intentionally not follow the rules? Customers are not supposed to *think* about what advertising slogans really mean. If they did that, the Great Depression would come to look like a tiny economic blip compared with the collapse of the American economy. Are you some kind of un-American? Can you show two forms of proof of your citizenship?
[James’ Reply: Good points. BTW, I almost included a section on “What the heck is a raisin?” similar to Rick Brenner’s famous analysis of the varieties of vanilla which caused me to stop saying “plain vanilla” forever after.
And your comment reminds me of how most of this post is a direct application of ideas I first encountered in at least six scoops of books you have written. So, thank you for that, too.]
And, since you write about software, don’t you see what will happen if managers and (especially) customers stop playing the game by following your example? (Oh, wait a minute, it’s actually happening. Sorry for the rant. It’s too late.)
[James’ Reply: Yeah, that’s why the last section of the post is about the key difference between being an ordinary blowhard and being responsible. We can be ethical and still speak simply if we use basic qualifiers (such as the General Semantics idea of appending “etc.” to every list, or your idea of drawing boxes with wavy lines) and learn to spot our own “marketing words” and be prepared to back them up with further discussion and evidence.]
Gerald M. Weinberg says
Somehow, the URL in my comment (above) got messed up (it’s a raisin, now). Hopefully, this comment will have the correct link: http://www.geraldmweinberg.com
[James’ Reply: I think I fixed it.]
Tim Western says
Great post, James (and awesome comment, Gerald). As I thought about the two scoops of raisins commercial, which I remember from when I was a kid, the thought occurred to me that their slogan might not have been about the quantity of raisins in their cereal at all. In fact, it may not necessarily even be meant to make raisin bran seem like a “high quality cereal” per se. I would consider the point that it was designed specifically to be catchy and to get people to talk about their product. The thinking here is: when you are advertising a product like cereal, how many people really are driven by commercials alone? How many will be more interested in trying, and perhaps eating regularly, a particular brand or variety of cereal if a friend has mentioned it?
I can almost see this kind of thing happening: Bobby and Marcus are friends. Bobby sees the Raisin Bran two scoops commercial and starts a conversation with Marcus about it. “Did you hear about Kellogg’s Raisin Bran?” Bobby says. Now perhaps Marcus’s answer would be that no, he hadn’t seen the commercial; at least early in the ad’s run that’s a possibility. So Marcus may reply, “No, what about Kellogg’s Raisin Bran?” “They’ve got two scoops of raisins in every bowl.” (See, Bobby doesn’t notice that the advertisement isn’t a quantifiable idea; he just thinks, wow, listen to how great they say their product is.) The result of this conversation in many cases would be that Marcus might just have an interest in trying raisin bran. Maybe he’d never had it before, but sure enough, after the ad had run for a long time and nearly everyone had heard the slogan, the ad begins to reinforce itself, and at some point you may not even think about any other bran matching up because of that apparently fantastic claim, which btw is typically the point of advertising.
It’s like the following quote: “There’s nothing so absurd that if you repeat it often enough, people will believe it.” Now I’m not saying that Kellogg’s was intentionally lying here, but there is reason to suspect they may have stretched the truth a little to try and push their product. Infomercials are the same way typically. Why does it matter if a vacuum cleaner can hold a bowling ball anyways? I want it to pick up all the dust and dirt and filth it encounters, not be so powerful my furniture may get stuck to it 😉
Great blog entry James! It really got me thinking this morning.
[James’ Reply: Thanks for that different perspective. I wonder how that might apply to testing. Could it be that some methodologies in our industry are motivated by the wish to get people to talk about them? I guess that’s sort of the definition of a meme.]
Joe Strazzere says
So in this instance, Gerald’s “two scoops of comments” appears to equate to about 3 paragraphs of commentary.
I rather like the idea of using indefinite measures. Then next time someone asks me how long it will take to test a build, I’ll tell them that my estimate is “two scoops of work”.
Simon Morley says
Good post!
Putting my critical thinking hat on – with regard to the first paragraph I could ask: What is the time period for “We found a hundred bugs!”? In the product under test, over the whole 6 months of development, ever in my career? I won’t bother asking about the severity of those bugs just now…
Ok, hat off now…
[James’ Reply: That hat looks good on you.]
There’s an interesting point here (at least my interpretation.) That is that we testers need to distinguish between our critical reasoning abilities and our abilities in testing communication (or telling the story about the product.)
This is something I’m currently thinking about – how testers communicate to non-testers. It’s very important to find that balance between the rhetorical devices (“lots”, “more than I’d expect”, “very”, “vital”) and use of testing results/figures (evidence).
In many cases the “numbers” only need to be a footnote to the story, for example:
“We found a number of bugs in the installation procedures for the application [ref x], of which 2 have been agreed with the technical product support group as being urgent to fix before the next release.”
Here, the actual number doesn’t matter for the “headline” story – the fact that there’s at least one that “must” be fixed (not according to the tester) is the important information here. (This isn’t the whole story either – I’ve missed the retest, areas not tested yet, areas for further test or investigation and the silent evidence part of the story.)
It’s very much a case-by-case issue but finding a formulation of telling the story without metrics (avoiding the metric traps) is something all testers could benefit from – that may invoke some rhetorical devices, but the important thing (I think) is that the “rhetoric” or the common language between tester and non-tester is handshaked/understood beforehand – this might be done by confidence building (brand-building by the tester even) right from the start of the project.
[James’ Reply: Good points, Simon.]
Abraham Heward says
Great post, but ARGH!
The past few days I have been working on a blog post comparing the medical pain scale to measuring progress in software projects. I read this post this morning and realized we were saying substantially the same thing.
The wind’s out of my sails, now. Guess I’ll keep it in my drafts folder for a while.
[James’ Reply: It’s not the topic, Abe, it’s how you write it. Go post that thing!]
James Christie says
Thanks, James. That was a good way of looking at a problem that’s clearly bugging people these days.
That’s just one of the reasons that Abe should go ahead with his piece. I blogged about the same sort of thing recently. I then realised how many other people were saying the same thing. If I’d seen all the top blog posts from other people on this subject before I wrote mine I wouldn’t have bothered.
However, I’m glad I did. One great blog article won’t change the world. We need testers to keep speaking out till people who don’t get it yet start to realise that they’re clinging to a view of the world that just doesn’t work. It’s not about one person, or a few, passing on great ideas. It’s about loads of testers taking in, and passing on these ideas to people in their own network. It’s a responsibility for the community, not a few individuals.
Brent says
Good post James! I found an ass-load of bugs today! (snicker-snicker)
http://qainsight.net/2007/01/12/HowMuchIsAnQuotassloadquot.aspx
Duane says
Now that’s food for thought if I ever saw it!! LOL
Michael M. Butler says
Joe: There’s a kindred expression from some software developers I used to work with…
Bossoid: “How long will it take to finish coding [x]?”
Lead Coder: “Two weeks!…”
[Bossoid gives a satisfied grunt and makes to leave]
Lead Coder: “…But they’ll have to be /the right/ two weeks…”
Markus says
Reading the “Tipping Point” I again stumbled over lots of things that I didn’t know or had another idea about. James, when you write “An impression may make conversation possible”, I have the impression that there are very few situations when this is not the case.
Aren’t all bugs we report “impressions”? I perceive something and make my mind up about it, then tell it to someone who might have another idea and so this difference will provoke discussion of some kind. Even backing up the report pointing to some document someone might say, “hey this is an old version of the doc”, bingo: discussion.
[James’ Reply: I think the point is not that they are impressions, but whether they are ONLY impressions, and have no interesting status beyond that.]
Lanette says
This article cracked me up. You forgot my favorite measurements, the metric buttload of bugs, the shedload of issues. In all seriousness, you didn’t mention the “OMG TOO MUCH! make it stop!” or the “I feel robbed. You are stingy.” measure. These measures are subjective, but important because it is about making judgments on a value of the amount. There comes a point in scoops of raisins where you want it to stop. Where there is nary a bran flake to be found and this is just Raisin Raisin with Bran flake (singular).
Lanette says
I thought about this all day, so much that I used 2 scoops for a measurement of learning in my own blog.
It doesn’t matter that it is imprecise because humans aren’t precise and ultimately quality is value to “some person(s)” at some point in time. Because quality is subjective, it is a squishy topic no matter how much we want it to be a hard science. We can make it as scientific as possible, but that satisfies only our developers and some of the stakeholders.
I do my best to explain the subjective stuff. I mean really share with others HOW to test for the squishy stuff, and I get told I have “hardly any content” when I do. It sort of makes me want to give up, and scream, “I didn’t make it FOR YOU!” I try to share testing ideas with the people I like. The people who see value in the subjective. The people who see not only that it is a forest full of trees, but that the trees are all different in some way from one another. Never a tree is a tree is a tree. Yet so many people think that specifying and documenting and standardizing exactly what a scoop is, and making it transparent, will solve “the problem”. Most people want some good cereal compared to other cereal. They don’t give two scoops of bird poops about the exact measure of two scoops. So, I think the marketing people are right on here compared to the people who care exactly what size each scoop is. One is trivia, the other has emotional resonance for humans over time.
[James’ Reply: Do you think anything that has resonance is acceptable? Is it possible for something to resonate and yet be misleading or untrue? I’m told that there are two scoops of raisins in the package. This seems to be a message about how many damn raisins are in the package. And yet maybe it’s not. Maybe it’s just an attempt to coldly manipulate us. What do you say to that?
I don’t have a problem with subjectivity. I am concerned about cynical manipulation.]
Oliver Smith says
Are those raisins organic or GM raisins? Are they sun-dried Californian raisins, or are they the value ones I get at my local branch of . I guess the point is you may have a misleading quantity, but the thing being measured is also misleading in its level of clarity around what you’re actually getting.
I think that one of the things I would take out of the “two scoops of raisins” message is its simplicity and its total clarity. It gets across the message that the advertiser wants really well (obviously unless there are software testers and other such critical thinkers about). I think sometimes we can spend too long digging out the detail, counting the beans, wrapping them up, when really all we need to be saying is “heck it’s two scoops darn it!”
[James’ Reply: I don’t think we need to be saying anything that doesn’t mean anything. “Heck it’s two scoops darn it!” doesn’t mean anything to me that corresponds with reality, in this case.
So, what is it that you mean when you say that?]
Oliver Smith says
I think my point is that we can spend forever trying to count those beans when in reality when you’re working on something you intrinsically know that there are far too many bugs. The process of justifying it to the nth degree is crippling to the project.
[James’ Reply: Fair enough.]
Lanette says
I just cracked up thinking about what would happen if they remade the commercial today and were forced to . The song sounds a bit less catchy when it says afterwards in the legal voice: “Each scoop is actually 8oz of raisins by weight. A scoop is meant to represent one cup in the Family Sized 16oz raisin bran. Smaller packages contain less than 16oz total raisins, which is less than 2 scoops, so that the ratio of bran flake to raisin remains pleasingly consistent. By raisin we mean dried grape. If you require more detail on what a raisin or a scoop is, read the side panel, and please don’t sue us so we have to add this to each and every commercial like the creepy medication commercials.” In fact, the commercial for Ambien sleep medication, which DID include the scariest list of side effects I’ve heard, and the commercial for eyelash enhancement products, which included long-term changing of the color of your eye’s iris to a darker color, each scared me so bad I have no idea why the companies bothered to make the advertisements at all.
I do think that there is a risk of measurements without meaning being used to imply something false. However, more precision in those measurements doesn’t do much to improve the ethics or make the average person interested.
kristin says
We recently had a candidate for Governor in NY who brilliantly (I thought) reduced all political, statistical, metrics-reliant and human elements into one clear statement: “the rent’s too damn high.” He was able to deliver the line with an apparently endless assortment of emphasis on one or another of the 5 words.
When empirical measurements fail to communicate meaningful information, we naturally rely on self-serving interpretations, and then delivery, as in all things, is all.
Love this site. I only wish I could find a job that gives me even a small tiny particle of the inspiration and motivation I find here.
Anna Andersson says
Thank you all for a very interesting discussion.
Communication with soft, ambiguous values is a very difficult task, and I would say that much of it boils down to who you are speaking to and why. As a tester you often work as a politician or a salesman, trying to sell the idea of something being wrong to different people in the organization in order to make them take action. A bug you consider “severe” might be regarded as “not significant” by a developer or stakeholder. You need the exclamations to really make your point and sell the argument that this bug is really important.
The phrase “There are 20 bugs” or “20 scoops of bugs” does not mean a thing, as you say, but can be useful to emphasize something. What it boils down to is that the tester should be able to answer the question “Are we good to go?”. It should be sufficient with a simple “yes” or “no” to that question. Yes does not mean “All bugs are fixed”; it means the quality of the product is sufficient for the purpose of the product. But in reality the yes or no is not always sufficient. That’s when you need the 20-bugs-reason, which can be in whatever measurement you like as long as you sell your point “No, we are not ready to go!”.
All depends on how much you trust your tester to be able to do their job. It is not a matter of measurement in quantity “how many” or “how much”. If you are wise you will ask the tester “WHAT do you think remains to be done before we are good to go?” and then trust your tester when she says “I have often…”, “Very…”, “Vital…” because a tester is hired to make use of their common sense. And how do you measure common sense?
Robert Strauch says
Ok, it’s been five years since the last comment but nevertheless I would like to add something to Anna’s posting 🙂
Anna, you say: “What it boils down to is that the tester should be able to answer the question “Are we good to go?”. It should be sufficient with a simple “yes” or “no” to that question.”
I would disagree with that. As a tester I cannot answer that question with a simple “Yes” or “No”. In fact, I cannot even answer it at all. In my opinion there’s only one person (or maybe a group) who can answer that question and that is who “owns” the product. Quality is the value a product creates for someone who matters. As a tester I do not matter. Maybe your boss matters, maybe the business analyst, maybe the legal department. That may sound harsh but it’s the truth.
The maximum answer to that question I will give in my project would be “If I mattered, I probably would release the product” or “If I mattered, I probably would not release the product”. But the tester often does not have all the relevant information when it comes to a release decision.