An introductory text reads, Tests that claim to measure intelligence are everywhere: online, in your favorite magazine, at job interviews, and in many elementary and secondary schools. But can all of these tests be trusted? The results of an intelligence test aren’t meaningful unless the test is valid, reliable, and fair. But what do those concepts mean, and how can we be sure whether a test is valid, reliable, or fair, let alone all three? Let’s take a look.
The first panel depicts validity. Text reads, Does the test measure what it is intended to measure? An illustration shows a toy duckling placed on a bathroom scale; a question below reads, Is a bathroom scale valid for measuring height? Another illustration shows the toy duckling placed between two vertical graduated rulers. A question reads, How about a ruler missing its first inch? The text on the right side reads, A shortened ruler would not be a valid measure because it would provide different results than other rulers. A valid intelligence test will provide results that:
Intelligence tests.
The second panel depicts reliability. A question reads, will your score be consistent every time you take the test?
A shortened ruler isn’t valid, but it is reliable because it will give the same result every time it’s used.
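The difference between the two ideas can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the duckling's true height is taken as the 3 inches shown in the third panel, and the shortened ruler is assumed to begin at its 1-inch mark, so every reading comes out one inch too high. The function name `short_ruler_reading` is hypothetical.

```python
TRUE_HEIGHT_IN = 3.0  # duckling height in inches, taken from the third panel
MISSING_IN = 1.0      # the ruler's first inch is missing

def short_ruler_reading(true_height_in, missing_in=MISSING_IN):
    # Assumption: the markings start at the 1-inch line, so each
    # reading is shifted up by the length that was cut off.
    return true_height_in + missing_in

# Measure the same duckling three times, as in the illustration.
readings = [short_ruler_reading(TRUE_HEIGHT_IN) for _ in range(3)]

# Reliable: every repeated measurement gives the same result.
is_reliable = len(set(readings)) == 1

# Valid: the results match the true height (what other rulers would show).
is_valid = all(r == TRUE_HEIGHT_IN for r in readings)

print(readings, is_reliable, is_valid)  # [4.0, 4.0, 4.0] True False
```

The readings agree with one another, which makes the ruler reliable, but none of them matches the true height, which is why it is not valid.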
A reliable intelligence test will provide results that:
Three illustrations show the height of the toy duckling being measured three times, from different angles of the duck, with a vertical graduated ruler missing the first inch. A graph shows a bell curve plotting the number of scores against Wechsler I Q score. The scale on the horizontal axis reads 55, 70, 85, 100, 115, 130, and 145. The information presented in the graph is as follows:
About 2 percent of people have an I Q score in the range of 55 to 70, and about 2 percent in the range of 130 to 145; 95 percent have an I Q score in the range of 70 to 130; 68 percent have an I Q score in the range of 85 to 115. Text corresponding to 68 percent reads, 68 percent of all people score within 15 points above or below the average score. Text reads: Because most intelligence tests are standardized, you can determine how well you have performed in comparison to others. Test scores tend to form a bell-shaped curve, called the normal curve, around the average score. Most people (68 percent) score within 15 points above or below the average. If the test is reliable, each person’s score should stay around the same place on the curve across repeated testing.
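The percentages on the graph can be checked with a short Python sketch. It assumes the scores follow a normal curve with an average of 100 and a standard deviation of 15 points, values read off the graph's axis rather than stated outright in the text. The helper `share_between` is a hypothetical name; `NormalDist` comes from Python's standard `statistics` module.

```python
from statistics import NormalDist

# Assumption: average score 100, standard deviation 15,
# matching the axis labels 55, 70, 85, 100, 115, 130, 145.
iq = NormalDist(mu=100, sigma=15)

def share_between(low, high):
    """Fraction of people expected to score between low and high."""
    return iq.cdf(high) - iq.cdf(low)

within_15 = share_between(85, 115)   # roughly 68 percent
within_30 = share_between(70, 130)   # roughly 95 percent
low_tail = share_between(55, 70)     # roughly 2 percent
high_tail = share_between(130, 145)  # roughly 2 percent

for label, share in [
    ("85 to 115", within_15),
    ("70 to 130", within_30),
    ("55 to 70", low_tail),
    ("130 to 145", high_tail),
]:
    print(f"{label}: {share:.1%}")
```

Under these assumptions the computed shares agree with the rounded figures on the graph.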
The third panel depicts fairness. A question reads, Is the test valid for the group?
An animal weighing 2 stone is likely to be a:
(a) sparrow, (b) small dog, (c) mature lion, or (d) blue whale.
Unless you live in the United Kingdom, where the imperial system of weights is used, you probably wouldn’t know that a stone is approximately 14 pounds, and therefore the correct answer is B. Does this mean that you are less intelligent, or that the test is biased against people without a specific background? A test that is culture-fair is designed to minimize the bias of cultural background.
An illustration alongside shows the height of the toy duckling being measured in three different units with a vertical graduated ruler. The first shows 3 inches, the second shows 2.286 Chinese imperial cun, and the third shows 0.1667 cubits.
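The three readings describe the same height, which a small conversion sketch makes concrete. The conversion factors are assumptions chosen because they reproduce the figures in the illustration: the second unit is taken to be the Chinese cun at 1/30 of a meter, and a cubit is taken as 18 inches. The function name `convert_inches` is hypothetical.

```python
CM_PER_INCH = 2.54
CM_PER_CUN = 100 / 30            # assumption: 1 cun = 1/30 meter
CM_PER_CUBIT = 18 * CM_PER_INCH  # assumption: 1 cubit = 18 inches

def convert_inches(inches):
    """Express one height, given in inches, in all three units."""
    cm = inches * CM_PER_INCH
    return {
        "inches": inches,
        "cun": cm / CM_PER_CUN,
        "cubits": cm / CM_PER_CUBIT,
    }

heights = convert_inches(3)
for unit, value in heights.items():
    print(f"{value:.4g} {unit}")
```

The height itself never changes; only someone unfamiliar with a given unit would find the reading hard to interpret, which is the point of the fairness panel.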