Ask an Expert: James A. Wollack
Written by James A. Wollack of the University of Wisconsin and Alison Foster Green of Caveon Test Security
James A. Wollack is an Associate Professor of Educational Psychology at the University of Wisconsin-Madison, where he serves as the Director of Testing & Evaluation Services at the University of Wisconsin Center for Placement Testing, and as the Research Director for the General Education Assessment. Wollack has published numerous journal articles and book chapters and is a frequent presenter at national conferences on topics such as detection of answer copying, test security, score scale stability, and test construction.
The testing industry has begun to recognize both the value of test security and the damage that is caused when individuals receive unearned test scores. However, do you think that the world at large (those individuals who don’t work in the testing industry, but who take tests and/or have loved ones who take tests) should care about test security and stopping test fraud?
“What kind of a psychometrician and test security advocate would I be if I said ‘no?’ I do think that the greater community should care, as cheating on tests really boils down to an ethical issue. I don’t need to have lost money in the Enron scandal to care that it happened and to want desperately for it to never happen again.
Testing is affecting increasingly large segments of society, so many individuals who haven’t yet found themselves needing to take and perform well on a standardized test may find that they have to in the future as their professions evolve or as they change careers, as so many people today are apt to do. And anyone who is related to or is close friends with someone who wants to go on to higher education, work in civil service, or be a teacher, nurse, hygienist, accountant, physical therapy assistant, police officer, or scores of other career choices which are gated by exams, has a personal interest in the security and validity of test scores.”
In your opinion, what is the most exciting advance in test security that has occurred in the past decade?
“This is a great question, as there have been many. Immediately, I’m thinking about the advancement of biometrics to reduce proxy testing; inexpensive, high-definition video cameras that allow proctors to more clearly observe potential cheating from different angles and document what they observe; remote proctoring for administrations which would otherwise, due to logistical reasons, have little choice but to be unproctored; lock-down browsers which prevent examinees taking computer-based assessments from accessing the web and peripheral devices during their exam; the introduction of standards, particularly around proctoring; the advancement of statistical detection methods, especially with respect to detection of group-based cheating, such as preknowledge and test tampering; the offering of an annual conference dedicated entirely to test security; and the openness with which testing companies now discuss security problems and work collaboratively to address them. With all these great advances, I don’t know that I can pick just one, so I might have to pick a 1a and a 1b.
My 1a is the dramatic shift towards computer-based testing for K-12 tests. I want to be clear that I’m really talking about the specific application of CBT for K-12 accountability testing, and not CBT in general, which does offer some security advantages but also offsets those with some rather significant security vulnerabilities. The events in Atlanta and many other cities across the country have illustrated quite clearly the problems associated with delivering a paper-based state testing program, most notably that teachers have easy access to testing materials prior to the administration and, more importantly, to students’ answer sheets following the administration. With CBT, teachers cannot access the test prior to administration.
To be sure, because the tests are still delivered over a window of some weeks, the teachers may still be able to learn about some of the content, but the mechanism for doing so is much more challenging. More importantly, with CBT, there is no reason that teachers should have access to student data post-exam. I realize that not all programs elect to have students’ exams automatically submitted at the completion of the test, but they should. And even if they don’t, response time logs available through CBT should be able to identify any tampering that took place after the exam was supposed to be done.
What I regard as 1b is the expanded use of metal detection wands during check-in. Of course, this technology has existed for more than 10 years, but its application to testing is fairly new. As devices have become smaller, cheaper, and smarter (both with respect to what they can do and how well they are being designed to be virtually invisible to the naked eye), metal detection wands provide perhaps the only mechanism for discovering these technologies and preventing candidates from bringing them into the exam room.”
How does a forensics analysis of test results help a program improve the security of its exams? What types of programs should include data forensics analyses as part of their test security plan?
“I like to tell the story of programs that insist they don’t have a test security problem, hence have no need to invest in statistical methods to detect cheating on tests. It’s very hard to find something that you aren’t looking for. No doubt ignorance is bliss, but it’s also dangerous and irresponsible. Even those programs that are fortunate enough to have not encountered a significant security breach should be taking measures to both prevent those breaches and detect them. I think I can say with confidence that no matter the exam, there are candidates (and those with competing professional interests) who are actively trying to compromise the test during every administration.
Aside from the obvious reason that a forensics program offers the opportunity to identify test scores and items that may not be valid for purposes of promoting fairness, protecting the public, and improving the scaling and operational psychometrics underlying the testing program, beginning a forensics program before a major breach happens is critical because it provides the program with accurate baseline data that can be used to more quickly and accurately identify when breaches do occur and quantify their impact.”
Are there types of test fraud that cannot be detected by a statistical analysis of test results? What might those be, if any? And why might they be immune from such detection?
“As of now, yes. I’d say that item harvesting is extremely difficult to reliably detect. We have some hypotheses about how harvesting may manifest itself, but there isn’t a strong empirical basis for those hypotheses. Furthermore, there isn’t an especially good way to validate an item harvesting index. Proxy testing is also something that can’t be detected by statistical analysis. Fortunately, biometrics can be quite helpful at detecting and preventing this form of cheating.”