Written by Dennis Maynes, Chief Scientist
October 2, 2015
A while ago, I noticed that many visitors to the Caveon website search for “cheating statistics.” Being a statistician, I also am extremely interested in cheating statistics. From my perspective, there are four viable and valuable ways to interpret the phrase “cheating statistics”:
- Using “cheating” as an adjective:
  - Statistical values that track cheating prevalence,
  - Statistical methods and procedures that detect and measure cheating,
- Using “cheating” as a verb:
  - A behavior by which statistical results are falsified, and
  - A behavior which seeks to deceive the test administrator after being flagged by a statistical procedure (e.g., obtaining preferential treatment or lying).
Tracking cheating prevalence
In general, I tend to avoid using the word cheating without some additional qualification because it does not convey what a test security professional wants to know. Instead of worrying about cheating prevalence, the test security professional usually desires to measure vulnerabilities, threats, attacks, and risks to the testing program. Prevalence estimates, in general, do not provide actionable data. It does not help very much to know that a certain percentage of high school students admit to having cheated on an exam. The value provides no context for making informed test security decisions. Instead, prevalence estimates need to be oriented toward the goal of understanding where, when, and how test security attacks were carried out. The prevalence estimates need to provide information about the kinds of vulnerabilities exploited by threats. The estimates also need to measure the amount of potential losses due to cheating and how likely those are to occur.
There are different kinds of statistical values that can be used to evaluate cheating or misbehavior on tests. The less valuable are those which simply seek to measure rates of misbehavior. Greater value is provided by information which can be used to improve test security policies and processes. I have never found estimates of cheating prevalence to be interesting, except as background context, because I do not believe they are actionable. On the other hand, I have found key performance indicators which measure the specific security strengths and weaknesses of a testing program to be incredibly useful.
Detecting potential cheating
A large number of methods have been devised for detecting potential cheating. As test administrations continue to migrate to computer-based testing, it is likely that new methods will be developed. Even though current methods mainly rely upon test response data, some use other forms of data (e.g., correlations between student performance and transferring to other schools, attendance data, and measures of student performance such as grades). The general problem of detecting cheating can be likened to the problem of detecting unauthorized aircraft in controlled airspace. In detecting unauthorized aircraft, the first priority is to detect the intruder. The second priority is to classify the intruder. In the same way, the first priority in detecting cheating is to determine whether test security was potentially violated. The second priority is to determine how, if possible, test security was violated.
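The "detect first, classify second" ordering can be illustrated with a small sketch. This is not a method described in this article or one attributed to Caveon: the function name `flag_outliers`, the robust z-score rule, the 3.5 threshold, and the sample data are all assumptions chosen only to show what the first step (detection) might look like before any attempt at classification.

```python
import statistics

def flag_outliers(values_by_group, threshold=3.5):
    """Return groups whose statistic is unusually high (hypothetical detector).

    Uses a robust z-score based on the median and the median absolute
    deviation (MAD). The threshold of 3.5 is an illustrative assumption.
    """
    values = list(values_by_group.values())
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # No spread in the data, so nothing can be called unusual.
        return []
    flagged = []
    for group, value in values_by_group.items():
        robust_z = 0.6745 * (value - median) / mad
        if robust_z > threshold:
            flagged.append(group)
    return flagged

# Made-up data: average wrong-to-right answer changes per student, by classroom.
wrong_to_right = {"A": 1.1, "B": 0.9, "C": 1.3, "D": 1.0, "E": 6.2}
flagged_groups = flag_outliers(wrong_to_right)
```

A flag produced this way only says that something may have happened in a classroom. Determining how test security was violated, if it was, is the separate second step.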
Falsifying test results
Being a statistician, I view a test score as being the result of statistical sampling. A test taker was presented with and answered a set of questions. However, that set was only a sample of all the questions that could have been administered (or that could have been written, given sufficient time and resources). Hence, when test takers “cheat” on a test, they are in reality falsifying a statistical result, which is the test score. This is rather important. It means that the question of determining whether test security was violated is really best answered statistically. There is precedent for this. Determination that accounting fraud occurred is best made by reviewing and finding accounting discrepancies. In the same way, determination that cheating on a test occurred is best made by finding statistical inconsistencies in the test result data.
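To make the sampling view concrete, here is a minimal sketch that treats a score as a sample proportion with a standard error. It rests on simplifying assumptions that are not made in this article: items are treated as an independent random sample from a large domain of possible questions, and the function name `score_with_standard_error` and the numbers are purely illustrative.

```python
import math

def score_with_standard_error(num_correct, num_items):
    """Treat a test score as a sample proportion (simplified binomial view).

    Assumes items are an independent random sample from a large item
    domain, which real tests only approximate.
    """
    if num_items <= 0:
        raise ValueError("num_items must be positive")
    if not 0 <= num_correct <= num_items:
        raise ValueError("num_correct must be between 0 and num_items")
    proportion = num_correct / num_items
    standard_error = math.sqrt(proportion * (1 - proportion) / num_items)
    return proportion, standard_error

# Illustrative numbers only: 42 correct answers on a 60-item test.
proportion, standard_error = score_with_standard_error(42, 60)
```

Under these assumptions, the score is an estimate with sampling error attached. That is why evidence of a falsified result is naturally looked for as a statistical inconsistency.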
Deceiving test administrators
Those who engage in misbehavior on exams frequently do so knowing that they could get caught and punished. Hence, one way to cheat statistics is to modify data (i.e., by modifying behavior) so that the misbehavior is not apparent to test administrators. For example, a few years ago teachers in one school inappropriately changed answers in test booklets so that answer changes on scan sheets would not generate statistical flags. Another way to try to deal with compelling statistical evidence of cheating is to put forth a convincing story which explains away statistical anomalies. Some individuals who are detected by statistical procedures seek to avoid punishment by claiming “Everyone else is doing it,” or “I didn’t realize that it was wrong.” Many of the ways that individuals try to avoid punishment or detection involve some sort of social engineering to convince the test administrator to withdraw action. Sometimes this attempt is itself the cheating behavior.
The next time that you search for or encounter an ambiguous phrase, such as “cheating statistics,” I encourage you to think carefully about what it might mean. Being aware of all meanings of the phrase “cheating statistics” is critical when your goal is to administer tests fairly and securely. Key performance indicators can provide important information for improving test security. Properly implemented statistical detectors are essential for detecting potential test security violations. Knowing which test security concerns should be addressed first prioritizes the inferential process. Establishing and maintaining fairness helps ensure that statistics are used properly.