Written by: David Foster, CEO, Caveon Test Security
Performance testing means different things to different people. To some, it means using simulations in a test; to others, it means requiring test takers to construct a response rather than select one from a list. Some believe that using a software application during a test is performance testing. Others argue that performance testing should incorporate video, graphics, audio, or animation into the questions. Still others feel you need to observe a person actually performing the skill. And some simply believe that any test without multiple-choice questions is a performance test.
Personally, I don’t think performance testing is any of these things, or even all of them put together. Too often these features are added to an exam, sometimes for political or marketing reasons, and fail to actually measure the important skills. To me, performance testing is designing items and tests to better measure the important, identified skills. You may not find this definition very interesting; it sounds like it is saying that a performance test is just a good or valid test, and you may think that is what every psychometrician should be doing anyway. In my opinion, you’d be right on both counts. A performance test is simply a well-designed test with ample evidence of validity. It might have some innovative item designs, but then again, it might not.
Performance testing may involve simulations, software applications, audio or video, or responses produced through behaviors other than mouse movements, bubble filling, or keyboard input, but whether to use any of these depends entirely on whether it helps measure the skill. If a multiple-choice format is the best tool for measuring a particular skill, then it should be preferred and used, perhaps exclusively. If the skill requires more complexity, then it is important to bring any or all of these design innovations to bear.
More than twenty years ago, I was tasked with creating a valid test for network administrators. Some of the skills were behavioral, requiring the certification candidate to complete network administration tasks competently. Others were more cognitive, such as the important ability to recall features of the system, capabilities of the hardware and software, typical errors, and so on. Still others required analyzing existing technology capabilities and planning for future upgrades. Not surprisingly, a test measuring such a diverse set of skills required a multi-faceted design. Of the roughly 70 items on the test, about half were brief simulation items requiring the candidate to actually complete important network administration tasks in a minute or two. The remaining items were mostly multiple-choice, hot-spot, and drag-and-drop items designed to assess the candidate’s cognitive skills. All items were intermixed and administered in random order. I collected evidence of validity and candidate satisfaction and presented the results at a conference or two. The test correlated strongly with actual job performance and was highly praised by candidates and employers alike.