Can Good Items Have Bad Item Statistics?
Written by Sarah Toton of Caveon Test Security
Every psychometrician knows that tests can be improved by removing misbehaving items. Items that are too hard, too easy, or don’t correlate well with the rest of the test items or the overall score are often flagged for revision or removal from a test. But what if that isn’t the full story? In this article, I hope to provide you with evidence that good items can have bad item statistics.
The item I will present was administered as both a multiple-choice item and a Discrete Option Multiple Choice (DOMC) item, with similar results. We will discuss the DOMC item version here. For an overview of how Discrete Option Multiple Choice (DOMC) items work, see Figure 1, below. In a DOMC item, the item stem is presented and then options are presented one at a time (in a randomized order). Using standard DOMC scoring, an item terminates when 1) the correct option is endorsed* as correct (item score of 1), 2) the correct option is not endorsed (item score of 0), or 3) an incorrect option is endorsed as correct (item score of 0). After the item terminates, one of the options may be given as an extra, unscored option with some pre-determined probability (e.g., .40 for example indicates that 40% of the time, an extra option is given when the item terminates) for security purposes (to avoid disclosing the correct answer to the item). Then, the test moves on to the next item….