The Problem with Pasta Sauce

Written by Benjamin Hunter, Technology Product Manager & Sales Director

May 23, 2017

I don’t know about you, but I love podcasts – old-time radio stories for the new generation.

Last week, I found myself winding down by listening to a podcast of NPR's TED Radio Hour. The episode, called "Decisions, Decisions, Decisions," digs into the decision-making process and the negative effects that spring from having too many choices. I was hooked, not just because it happened to feature a favorite author of mine, Malcolm Gladwell, but because by the end of the 30-minute podcast I had learned something unexpectedly pertinent to my work in testing.

Test questions are a lot like pasta sauce.

Let me explain. In the podcast, Gladwell describes the visionary Howard Moskowitz, a psychophysicist turned market research consultant. In the early 1980s, Moskowitz was approached by Prego (yes, the spaghetti sauce company). Prego had been struggling to compete with marinara mammoth Ragu, and wanted Moskowitz to tell them what products they needed to start selling to be more competitive.

Moskowitz was intrigued. He worked with Prego to create 45 different pasta sauces, then took them on the road and asked thousands of Americans to rate each one. However, when Moskowitz returned to Prego headquarters, his recommendations were not what Prego management had expected, or even asked for. Prego had requested a list of pasta sauces to add to its marinara line-up, but Moskowitz came back with a shocking suggestion.

Rather than giving Prego the top ten marinara sauces, he recommended that Prego add just one variety to its product line: extra chunky. Moskowitz had seen a simple trend in the data, which suggested that people didn't need more options staring down at them from supermarket shelves. What a third of people wanted was extra chunkiness. A single, simple, straightforward product. So Prego released extra-chunky sauce, and over the next 10 years the company made $600 million from this one line.

For Gladwell, there are two lessons to be learned from the spaghetti story. The first is that more choices are not always better – sometimes one targeted choice designed to meet specific needs is better than ten variations.

The second, and perhaps more important, lesson is that in their quest for pasta-sauce dominance, Prego had lost sight of the real purpose of the marinara experiment. Somewhere along the line, they started believing that more varieties of pasta sauce = more profit. In doing so, they became so focused on increasing the varieties of pasta sauce that they lost sight of their original purpose: increasing market share. If it hadn't been for Moskowitz, Prego would certainly have succeeded in developing ten more delicious varieties of marinara sauce, but they would have failed in their original goal of increasing market share and profits.

In the assessment industry – specifically the test banking and delivery software arena in which I work – we face much the same dilemma. Yet instead of pasta sauce, we are inundated with question types. We assume that more choices = better questions = better client experience = better tests = more valid results.

Over time, this thought process has been simplified to the belief that more item-type choices = more valid test results. In pursuit of this belief, we have single-mindedly begun increasing the number of question types available. True/false, traditional 4-option multiple choice, multiple-correct multiple choice, essay, short answer, drag and drop, build list, hot spot, technology-enhanced items (graphics, audio, video), and simulation questions are just a few of the options available to us. Increasingly, the variety of question types has become the guiding factor in how we choose our software, how we design our exams, and how we write our test questions.

Like Prego, we are at risk of losing sight of our fundamental purpose. We have become so focused on creating and using more item types that we have forgotten that the end goal is achieving trustworthy, valid, and usable scores.

Valid test scores are the result of much more than question types. They depend on factors such as:

  • Fairness: Is the test free of bias, not offensive or controversial, and equitable to all individuals who might take it?
  • Usability: Is the test user-friendly, efficient in the amount of time it requires, and easy for the test taker to understand and follow?
  • Content: Is the subject matter relevant and effective at determining a test taker’s knowledge-level?
  • Security: Can we be confident that the test takers did not achieve their score through unethical or fraudulent means?

Don’t get me wrong – the types of questions we use in our tests impact all the factors listed above. This is precisely why item types are so important, and why we need to continue to refine and develop them. For example, there is one item type, the Discrete Option Multiple Choice (DOMC), that improves fairness, usability, content, and security, and as such should be readily used in our assessments. The DOMC item is about ten years old, but only now are we at a point where we can readily design, develop, deliver, and analyze it in a cost-effective technology platform.

It is my belief that the DOMC item type is the testing industry’s “chunky pasta sauce.” Just as Prego revolutionized the sauce industry, DOMC can revolutionize how we approach testing, with its focus on security, ease of use, limited item exposure, testing actual knowledge rather than test-taking skills, and allowing candidates to evaluate individual options rather than hunt for the “best” scenario. (DOMC has been discussed at length on this blog, so I will refrain from describing it in too much detail. If interested, please see the trydomc.com website for more information.)

The danger of choice overload lies in getting so caught up in developing, using, advertising, and promoting new item types that we begin using them simply for their novelty or, on the flip side, because they are “traditional,” without analyzing how they impact the validity of test scores. More choice is not necessarily better. Like Prego, we can’t get so caught up in developing new varieties that we forget to ask whether they are necessary to achieving our fundamental goal. We don’t need more item types; what we need are valid test scores. It is only when it accomplishes this that an item type can be considered “chunky.”

Benjamin Hunter

Technology Product Manager & Sales Director, Caveon Test Security