SCIENCE: We must recognize that there are many ways of knowing, but … in the entire course of prehistory and history only one way of knowing has encouraged its own practitioners to doubt their own premises and to systematically expose their own conclusions to the hostile scrutiny of nonbelievers.
—Marvin Harris, American anthropologist (1927–
Over thousands of years, humans have refined everyday thinking to sharpen it and make it less susceptible to the many biases that limit it; the result is the scientific method. Science is a method for answering questions about the nature of reality that reduces the impact of the human biases we have just reviewed.
Just like everyone else, scientists make observations, look for patterns in what they observe, and then generate explanations for how or why things happen as they do. These explanations are called theories. Research is the process whereby scientists observe events in the world, look for consistent patterns, and evaluate theories proposed to explain those patterns. Research and theory are simply scientific refinements of the observations and explanations we all make every day to help us get through life. However, whereas ordinary intuitive thinking typically leads us to accept explanations relatively uncritically (especially if they are consistent with our expectations and desires), generating a plausible account of our observations is just the beginning of scientific inquiry.
Theory: An explanation for how and why variables are related to each other.
Research: The process whereby scientists observe events, look for patterns, and evaluate theories proposed to explain those patterns.
As the social psychologist Kurt Lewin put it, “There is nothing so practical as a good theory” (1952, p. 169). Theories tell us about causal factors that influence particular kinds of behavior. This knowledge can help us alter behavior in beneficial ways. For example, if theories specify factors that lead to bad things such as child abuse and good things such as charitable giving, we can design ways to alter these factors to reduce the occurrence of the bad behaviors and increase the occurrence of good behaviors. And research tells us whether our theories provide the right explanations. The concept of theory is often misunderstood: In grade school, many of us were taught to distinguish theories from facts. This probably gave a lot of people the idea that the difference between a theory and a fact lies in the level of certainty we have about its truth, as if a theory is a sort of weaker version of a fact that shouldn’t be taken all that seriously. But in scientific thinking, the concepts of fact and theory are entirely different from one another. They serve different functions and play different roles in the process of doing science. A fact is the content of research observations that have been replicated, that is, verified by multiple observers. A theory, on the other hand, is an explanation for the facts. Although a theory may be our current best explanation for how or why things happen as they do, it is not—
Scientific knowledge is continually evolving, moving toward a more and more useful understanding of reality.
To assess the validity of a theory, a scientist starts by deriving testable hypotheses from the theory (see FIGURE 1.3). A hypothesis is an “if-
An “if-
Typically, a theory generates numerous hypotheses. Once they are tested, either the theory is accepted as it is or is revised or replaced in light of the research findings. The reformulated theory (or the new theory) is then used to generate additional hypotheses, which are then tested, and the cycle continues. In this way, through the ongoing interplay between theory and research, the process spirals toward more sophisticated theories that provide increasingly accurate explanations of reality and programs of research that probe increasingly refined questions about these processes. Let’s consider the cycle of theory and research using the example of the development of stereotype threat theory, a topic we will cover more fully in chapter 11.
To illustrate the ongoing interplay between theory and research, let’s focus on some influential findings in social psychology that address the question of why people who are members of stereotyped groups sometimes perform poorly on standardized tests of their abilities. This work was inspired by the fairly consistent observation that members of stigmatized groups (groups within a culture that are viewed negatively in some way), such as African Americans and women, tend, on average, to perform less well in certain academic areas—
In 1995, Claude Steele and Josh Aronson proposed a creative new theoretical explanation for poor performance by members of stigmatized groups, which they labeled stereotype threat theory. The basic idea is that if you are a member of a group about which there are negative stereotypic beliefs, engaging in behavior that is relevant to those negative beliefs puts you in a doubly threatening situation. Not only will you be judged as an individual but your performance also will be taken as evidence of the ability of your entire group. So in the context of a test of verbal intelligence, unlike a White male, whose performance is typically taken as indicative of only his own ability, an African American male might worry that a low score will be viewed as evidence of his entire race’s alleged deficiencies in intelligence. Likewise, a woman who misses too many math questions could be seen as confirming the stereotypes of women’s inability to do math. Steele proposed that this resulting experience of stereotype threat is at least part of the reason members of stigmatized groups tend to perform less well in areas relevant to negative stereotypes concerning their group. Steele further posited that, because of the prevailing negative stereotypic beliefs about the group, the situation itself—
This, of course, is a very different explanation from one that assumes that differences in the abilities and potential of particular groups result from either genetic inferiority or a lifetime of experience with poverty or discrimination. If true, stereotype threat theory would also be a nice example of how understanding basic social psychological processes can shed new light on important personal and social issues. But to have any scientific credibility, this theoretical explanation must be tested. How would a social psychologist use the scientific method to assess the validity of the stereotype threat theory? To do so, the social psychologist will have to generate hypotheses from the theory, and then test those hypotheses with research. Consider these two hypotheses that have been generated from the theory of stereotype threat:
1. The more a person is conscious of the negative stereotype of his or her group, the worse that person will perform in areas related to the stereotype.
2. Situations that make a negative stereotype of a person’s group prominent in the person’s mind will lead to worse performance than situations that do not.
Hypothesis 1 proposes an association between two variables that can be assessed with correlational research. Hypothesis 2 posits that one variable has a causal influence on the other and can be assessed only through experimental research. We will discuss each of these two primary approaches to research in social psychology and how they were used to test these hypotheses derived from stereotype threat theory.
One of the most widely used approaches to doing research is the correlational method, whereby two or more preexisting characteristics (the variables) of a group of individuals are measured and compared to determine whether and/or to what extent they are associated. If the variables are associated, then knowing a person’s standing on one variable predicts, beyond chance levels, his or her standing on the other variable; if this is the case, we can say that the variables are correlated. To test stereotype threat hypothesis 1, we might: (1) measure the extent to which particular members of a given group are conscious of their stereotyped status; and (2) assess each person’s performance on stereotype-
Correlational method: Research in which two or more variables are measured and compared to determine to what extent, if any, they are associated.
Liz Pinel and colleagues (Pinel et al., 2005) tested this very hypothesis. They first measured the stigma consciousness—the tendency to be highly conscious of one’s stereotyped status and to believe that these stereotypes have a big effect on how one is viewed by others—
Correlation coefficient: A positive or negative numerical value that shows the direction and the strength of a relationship between two variables.
The correlation coefficient (typically indicated by r) gives us two vital pieces of information about a relationship: both the direction and the strength of the relationship (FIGURE 1.4).
The sign, positive (+) or negative (−), tells us the direction of the relationship. A positive correlation occurs when a high level of one variable tends to be accompanied by a corresponding high level of another variable. A negative correlation exists when a high level of one variable is accompanied by a low level of the other variable. If Pinel and colleagues had found that the higher a person scores on stigma consciousness, the better her GPA, they would have found a positive correlation. The negative correlation that they actually found tells us that the higher a person’s level of stigma consciousness the lower that person’s GPA. This negative correlation provides some evidence for stereotype threat hypothesis 1.
The numerical value tells us the strength of the relationship. The strength of a correlation refers to how closely associated the two variables are, that is, how much knowing a person’s standing on one variable tells us about, or enables us to predict, the person’s standing on the other variable. If knowing a person’s level of stigma consciousness enables us to predict his test performance with absolute certainty, the two variables are perfectly correlated, and the correlation coefficient equals −1.0 (or +1.0 if it were a positive relationship). Perfect correlations are virtually nonexistent in the behavioral sciences. When they do occur, it typically means that the two variables are different measures of the same underlying conceptual variable. For example, temperature as measured on Fahrenheit and Celsius thermometers will be perfectly correlated (as long as the thermometers are operating correctly). On the other hand, we would find a correlation of 0 if the two variables are completely unrelated. This means that knowing something about a person’s standing on one variable tells you nothing whatsoever about where she stands on the other. For example, according to stereotype threat theory, a person’s level of stigma consciousness should relate to his GPA only if he is a member of an academically stigmatized group. Sure enough, Pinel and colleagues observed no correlation between stigma consciousness and GPA for members of groups that are not academically stigmatized.
It’s important to be clear that although the sign of a correlation coefficient tells you whether two variables are positively or negatively correlated, it tells you nothing at all about the strength of that relationship. Thus, a correlation of −0.60 reflects a stronger relationship than a correlation of +0.35.
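The distinction between sign and strength can be made concrete with a small sketch. The helper names `pearson_r` and `stronger` are hypothetical, and the temperature readings are made-up values chosen only to echo the Fahrenheit and Celsius example; the formula is the standard Pearson correlation, which is assumed here to be the coefficient r the text has in mind.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def stronger(r1, r2):
    """Return the coefficient with the larger absolute value (the stronger relationship)."""
    return r1 if abs(r1) >= abs(r2) else r2

# Made-up readings: two measures of the same underlying variable
celsius = [-10.0, 0.0, 12.5, 25.0, 37.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

r_same = pearson_r(celsius, fahrenheit)                      # essentially +1.0
r_flipped = pearson_r(celsius, [-f for f in fahrenheit])     # essentially -1.0

# Strength is judged by absolute value, so -0.60 beats +0.35
print(r_same, r_flipped, stronger(-0.60, 0.35))
```

Flipping the direction of one scale changes the sign of r but leaves its absolute value untouched, which is why the sign alone says nothing about strength.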
Pinel and colleagues’ finding of a moderate negative correlation between stigma consciousness and GPA tells us that knowing how sensitive a person is to stereotypes about his or her group gives us some basis for predicting how well he or she is likely to score on measures of academic performance, although we couldn’t predict the person’s performance with absolute certainty or precision. Clearly, many variables other than stigma consciousness influence college GPA. And imperfections in our two measures would also reduce the size of any correlation we observe. Nonetheless, the negative correlation between stigma consciousness and test performance tells us that these two variables are indeed related, which is consistent with the hypothesis deduced from stereotype threat theory.
Scientists usually are interested in understanding why variables are correlated. But finding a negative correlation between stigma consciousness and test performance does not allow us to conclude that fear of confirming stereotypes about one’s group causes poorer performance. Correlation does not imply causality. There must be a correlation between the two variables if one variable causes the other, but there are two major reasons that correlation does not enable us to infer causation.
First, although it is certainly possible that stereotype threat causes poorer test performance, it is also possible that the causal relationship runs in the other direction: Doing poorly on tests makes a person especially sensitive to the stereotypes about his or her group, and perhaps fearful that he or she might be contributing to these stereotypes. This is known as the reverse causality problem: Correlations tell us nothing about which of two interrelated variables is the cause and which is the effect.
Reverse causality problem: A correlation between variables x and y may occur because one causes the other, but it is often impossible to determine if x causes y or y causes x.
The second major reason that we cannot draw causal inferences from correlations is referred to as the third variable problem: The two variables are correlated, but it is still possible that neither exerts a causal influence on the other. It may be that some third variable—
Third variable problem: The possibility that two variables may be correlated but do not exert a causal influence on one another; rather, both are caused by some additional variable.
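A short simulation may help show how a third variable can produce a correlation on its own. Everything in this sketch is an assumption made for illustration: the data are simulated, the variable names `x`, `y`, and `z` are placeholders, and the partial correlation used at the end is a standard statistical tool that the text itself does not discuss. It only works when the third variable has actually been measured.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with z statistically held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Simulated third variable z influences both x and y;
# x and y never influence each other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)  # clearly positive, despite no causal link between x and y
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))  # close to zero

print(round(r_xy, 2), round(r_xy_given_z, 2))
```

In real research the third variable is often unmeasured or unknown, so this kind of statistical control cannot rule the problem out.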
In longitudinal studies two variables are measured at multiple points in time. By examining correlations between one variable at time 1 and another variable at time 2, such studies can make us more confident about likely causal order. For example, one classic study of aggression (see Huesmann et al., 1984) found that amount of violent television watched in childhood correlated positively with amount of aggressive behavior in adulthood. In contrast, aggressiveness in childhood did not correlate with amount of violent television watching in adulthood. The result of this longitudinal study suggests that childhood television watching affected later aggression, rather than childhood aggressiveness affecting later television viewing. However, such studies are not definitive about causation because the third variable problem remains. For example, it could be that neglectful parents both allow their children to watch a lot of violence, and for other reasons produce adult offspring with aggressive tendencies.
Longitudinal studies: Studies in which variables are measured in the same individuals over two or more periods of time, typically over months or years.
Fortunately, there is an approach to research that lets us draw conclusions about cause and effect: the experimental method. As a consequence, this method is extremely popular among social psychologists. An experiment is a study in which the researcher takes active control and manipulates one variable, referred to as the independent variable, measures possible effects on another variable, referred to as the dependent variable, and tries to hold all other variables constant. The independent variable is manipulated because it is being investigated as the possible cause. The dependent variable is the one that is then measured to assess the effect. An experiment can tell us if the dependent variable depends on the independent variable. An experiment would be needed to test hypothesis 2: that conditions which increase the individual’s awareness of the negative stereotype of that person’s group (and thereby increase stereotype threat) will reduce the person’s test performance. Such an experiment must involve:
Manipulating our research participants’ awareness of the negative stereotype of their group, creating two or more conditions differing in the level of the independent variable: stereotype threat
Assessing participants’ performance on a test that is relevant to that negative stereotype, providing a measurement of the dependent variable
Holding everything else constant within the setting
Experimental method: A study in which a researcher manipulates a variable, referred to as the independent variable, measures possible effects on another variable, referred to as the dependent variable, and tries to hold all other variables constant.
When all the requirements of the experimental method are met, the study has internal validity, which means that it is possible to conclude that the manipulated independent variable caused the change in the measured dependent variable. Let’s translate that into a real example.
Internal validity: The judgment that for a particular experiment it is possible to conclude that the manipulated independent variable caused the change in the measured dependent variable.
Steele and Aronson (1995) conducted a series of experiments that provided the first evidence that stereotype threat caused reduced performance among members of stigmatized groups. In one study, African American and White college students were given a challenging test of verbal ability that consisted of sample items from the verbal portion of the Graduate Record Exam. Performance on the test was the dependent measure. To manipulate stereotype threat, the researchers simply asked half of the participants to indicate their race on the answer form prior to beginning the test; this simple act of indicating race was meant to bring to mind the stereotypes about how each participant’s group was supposed to perform on such tests. The other half of the participants, the control group, took the test with no mention being made of race, so they were much less likely to be thinking about stereotype-
As stereotype threat hypothesis 2 predicts, when participants were reminded of their race, there was a significant drop in the performance of African American students but not in the performance of White students (see FIGURE 1.5). This pattern of results is referred to as an interaction, which occurs when the effect of one independent variable on the dependent variable depends on the level of a second variable. In this study, the effect of the reminder of race depended on whether the participant’s racial identity was African American or White. Because African American students are stereotyped in the United States as being less intelligent, for them, the reminder of racial identity led to lower performance; for White students, however, it had no effect. Thus, even though we cannot randomly assign a person to his or her race, the fact that a reminder of race influenced Blacks and Whites differently suggests that racial identity is what mattered here.
Interaction: A pattern of results in which the effect of one independent variable on the dependent variable depends on the level of a second independent variable.
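One common way to express an interaction numerically is as a difference of differences between condition means. The sketch below uses made-up scores purely for illustration; they are not data from Steele and Aronson (1995), and the group labels and the helper `reminder_effect` are hypothetical names.

```python
from statistics import mean

# Made-up test scores for illustration only; NOT results from any published study.
scores = {
    ("stereotyped group", "reminder"): [5, 6, 4, 5],
    ("stereotyped group", "no reminder"): [9, 8, 10, 9],
    ("non-stereotyped group", "reminder"): [9, 10, 8, 9],
    ("non-stereotyped group", "no reminder"): [9, 9, 10, 8],
}

def reminder_effect(group):
    """Mean score with the reminder minus mean score without it, for one group."""
    return mean(scores[(group, "reminder")]) - mean(scores[(group, "no reminder")])

effect_stereotyped = reminder_effect("stereotyped group")   # -4 with these numbers
effect_other = reminder_effect("non-stereotyped group")     # 0 with these numbers

# If the manipulation had the same effect in both groups, this contrast would be 0.
interaction_contrast = effect_stereotyped - effect_other

print(effect_stereotyped, effect_other, interaction_contrast)
```

A nonzero contrast in a sample is only suggestive; whether it reflects a reliable interaction is a question for a significance test, which this sketch does not perform.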
The experimental method overcomes the limitations of the correlational method so that causal inferences are possible. As we previously noted, the first major obstacle to drawing causal inferences from correlational studies is the reverse causality problem, because you typically can’t tell which variable is the cause and which is the effect. In an experiment, because the researcher determines whether a participant is exposed to the experimental condition (race reminder) or the control condition (no race reminder) and subsequently measures performance, it is impossible for the participant’s poor test performance to have caused him or her to be reminded of his or her race. Causes must come before effects. Consequently, the reverse causality problem is eliminated.
What about the third variable problem? Recall that in an experiment, the only thing that differs between conditions is the independent variable. Everything else is held constant. The researcher treats participants in the various conditions in identical ways: the same instructions are given; the physical setting is the same; and any written, audio, and video materials are identical, except for what is to be manipulated between conditions (the independent variable). All this is done so that if there is a difference between conditions, we can be confident that the cause is the independent variable. By holding everything constant across the various conditions in the experiment except the independent variable, the experimenter solves the third variable problem.
But how do we know that the participants in the experimental group and the control group didn’t simply differ on the dependent measure to begin with? And how do we know that differences between the two samples on some other dimension that existed prior to manipulation of the independent variable were not responsible for the differences in test performance that occurred? The potential problem of preexisting differences among participants in the various experimental conditions is solved by random assignment, in which participants are assigned to conditions in such a way that each person has an equal chance of being in either condition (FIGURE 1.6). Deciding which treatment to give each participant can be done by tossing a coin, pulling names from a hat, or using a random number generator to put individuals into treatment conditions.
Random assignment: A procedure in which participants are assigned to conditions in such a way that each person has an equal chance of being in any condition of an experiment.
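Random assignment with a random number generator can be sketched in a few lines. The sample here is simulated, the heights are made-up values, and `randomly_assign` is a hypothetical helper name; the point is only to show that shuffling and splitting tends to produce groups with similar averages on a preexisting characteristic.

```python
import random
from statistics import mean

def randomly_assign(participants, rng):
    """Shuffle a copy of the participant list and split it into two equal-sized conditions."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated sample of 100 people, described by height in centimeters
heights = [rng.gauss(170, 10) for _ in range(100)]

experimental, control = randomly_assign(heights, rng)

# With random assignment, the two group means should be close but rarely identical
gap = abs(mean(experimental) - mean(control))
print(len(experimental), len(control), round(gap, 2))
```

The same shuffle balances, on average, every other characteristic the participants bring with them, including ones the researcher never thought to measure. Small samples can still end up unbalanced by chance.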
Random assignment is an essential component of all experiments in which participants are put in different conditions. It ensures that, if a sufficiently large sample is used, no systematic average differences will exist among the participants in the various experimental conditions. This is because random assignment evenly distributes people, and all the ways they may vary, across all the conditions of the experiment. For example, if a sample of 100 people was randomly divided into two groups of 50, the mean height, weight, level of self-
Because the experimental method enables us to infer causes for behavior, it is generally the preferred way to conduct research in social psychology, but in some situations experimental methods cannot be applied. Many of the variables that social psychologists are interested in cannot be manipulated. There are many important questions about the effect of variables like gender, age, race, and sexual preference, but people can’t be randomly assigned to be male or female, old or young, Black or White, or straight or gay. Furthermore, many of the questions of interest to social psychologists deal with long-
In fact, the correlational method and the experimental method provide complementary information about how or why people behave the way they do. Let’s go back to the example of research testing hypotheses derived from stereotype threat theory. The experimental research by Steele and Aronson (1995) provides compelling evidence that stereotype threat is at least one of the factors that cause poorer performance by members of stigmatized groups; on the other hand, the correlational research by Pinel and colleagues (2005) suggests that some students will be more vulnerable to these effects. When applied together, these two research strategies enable social psychologists to document the role that both individual differences and situational forces play in leading people to behave the way they do. Such evidence fits the first core assumption of social psychology: that behavior is a function of a combination of the features of the person and the situation.
Because social psychologists ultimately want to understand the forces that operate on us in real life, another important type of research is field research. This type of research occurs outside the laboratory, for example, in schools, office buildings, medical clinics, football games, or even in shopping malls or on street corners. Field research is not wedded to an experimental or correlational approach. It can be either. It also often utilizes quasi-
Field research: Research that occurs outside the laboratory, for example, in schools, office buildings, medical clinics, football games, or even in shopping malls or on street corners.
Type of research in which groups of participants are compared on some dependent variable, but for practical or ethical reasons, the groups are not formed on the basis of random assignment.
We can also use research on stereotype threat to highlight an example of field research. One goal of a field study might be to see if we can use stereotype threat theory to design interventions that reduce racial differences in students’ actual academic achievement. This is exactly what researchers such as Greg Walton and Geoff Cohen have done (Walton & Cohen, 2007, 2011). They reasoned that for many if not most college students, the transition to college can be stressful. These students have to adjust to a more rigorous type of study than they had in high school. They might also be living away from family for the first time and trying to make new friends. When we add feelings of stereotype threat, perhaps from not seeing many other faculty or students who share their racial background, students from minority backgrounds might be at greater risk for feeling that they don’t belong, and this might impair their academic performance. In the context of the transition to university, Walton and Cohen wanted to see if shoring up feelings of belonging at college would reduce stereotype threat and improve academic performance for racial minorities.
To do this, they randomly assigned a sample of White and Black first-
Freshman year even though I met large numbers of people, I didn’t have a small group of close friends… . I was pretty homesick, and I had to remind myself that making close friends takes time. Since then … I have met people some of whom are now just as close as my friends in high school were. (Walton & Cohen, 2007, p. 88)
These testimonials from students of different racial, gender, and ethnic backgrounds send the message that stress is a pretty normal and understandable part of all students’ experience. Those students in the control condition read similar testimonials about how students’ political attitudes had changed. Then the researchers proceeded to follow both groups of students for the next three years (FIGURE 1.7).
Among students in the control group, Black students earned GPAs that were significantly lower than those of their White peers. But for those students who received the intervention and learned that stress is a part of everyone’s experience at university, this racial gap in achievement was cut in half over the next three years. Whereas learning about how stressed other students are did not matter too much for White students, it significantly boosted how Black students performed in their courses, and it did so by helping students see that their experience of stress and adversity at college in no way meant that they didn’t belong there.
One of the strengths of field research like this study is that it tries to capture social behavior as it occurs out in the world. This is important because, as you well know, the world is a complex place and researchers need to study that complexity. The chief weakness, though, is that researchers often lose a lot of the control they have in the laboratory in terms of what participants are exposed to, and thus don’t always have the clearest manipulation or measurement of the variables they want to study.
Quasi-
The ultimate function of a good theory is to be useful by moving this ongoing cyclical process of science forward. It should advance our understanding of how and why people behave the way they do, facilitating efforts to make the world a better place. Our experiences in applying our newfound knowledge to issues of real human importance ultimately come back to tell us how well our theoretical understanding fits the world in which we live. A useful theory has the following characteristics.
First, a theory should organize the observations, or facts, that come out of the research process. Theories create order out of chaos and simplify the bewildering array of facts that we observe in the world around us. Theories provide a more abstract and general way of describing the nature of reality than the complex and sometimes messy observations that theories seek to explain. For example, Steele’s stereotype threat theory summarizes and simplifies results from other studies that have shown that members of stigmatized groups perform worse when very few other members of their group are present, when the person administering a test is from a different ethnic group, and when the test is presented as one on which their group tends to perform poorly. This rather disparate set of facts coheres within the broader theory that performance is impaired when conditions make it likely that people will think of a relevant negative stereotype about their group. Generally speaking, the broader the range of observations that a theory can make sense of, the better. Theories that are able to account for a wide variety of observations are said to have conceptual power.
Second, theories do much more than simplify and organize knowledge. A good theory should also give us insight into how or why things happen. To do this effectively, a theory must be conceptually coherent and logically consistent. It should specify clear relationships between variables that help us understand the processes through which particular events in the world occur. To be truly useful, a theory should provide us with understanding that goes beyond what we already know. It should shed new light on what we observe happening within and around us, giving us a sort of “aha, now I get it” experience. Stereotype threat theory provides an entirely new way of thinking about group differences in academic achievement, and it does this in a coherent and logically consistent way. It is also a relatively simple idea that fits well with our understanding of basic psychological processes. In this sense, stereotype threat theory is highly parsimonious: it explains a wide range of observations with a relatively small number of basic principles. Einstein’s theory of relativity and Darwin’s theory of evolution are two of the most parsimonious theories in the history of science in that both explain extremely diverse sets of observations with just a few relatively simple principles.
Third, a good theory should inspire research. It should enable us to deduce clear and novel hypotheses that follow logically from its propositions, hypotheses that in turn lead to research that tells us how well the theory fits with reality. Stereotype threat theory has inspired a great deal of research that has both supported its core propositions and led to refinements in our understanding of how stereotype threat undermines performance. Many potentially interesting ideas about why people behave the way they do have been discussed over the millennia; some of these ideas might be quite accurate. But unless a theory produces hypotheses that can be used to assess its fit with reality, it is not scientifically useful. That’s not to say that a useful theory must be easy to test, or that it must be testable immediately on its development. Indeed, some of the most influential and important theories in the history of science could not be tested directly for many years after they were proposed. For example, the theory that physical matter is made up of tiny particles moving about in space could not be tested until suitable techniques were developed to enable physicists to assess the nature and movement of atomic particles. An intriguing new theory that seems at first to defy scientific testing often provides the impetus for the development of new technologies that can be used to test the theory’s core propositions.
Fourth, in addition to inspiring research, a good theory should shed light on phenomena beyond what the theory was originally designed to explain. In other words, a good theory should be generative, providing new theoretical insights in other domains. When we combine a good theory with other ideas, new ideas should spill out. Stereotype threat theory has been generative in the sense that it has led to new ideas about performance deficits in a wide range of areas and among a wide variety of different groups of people. It has also led to finer-
Finally, a good theory should have practical applications that help us solve pressing problems and improve the quality of life. In recent years, stereotype threat theory has begun to inform interventions applied in schools and on college campuses (Walton & Spencer, 2009). For example, the theory implies that remedial programs to help negatively stereotyped minority-
Theories deal with the world of abstract conceptual variables, such as attitudes, self-
To conduct research on any conceptual variable, we first must develop an operational definition of that concept. Defining a concept operationally involves moving from the abstract world of concepts to the more concrete world of specific instances. An operational definition entails finding a specific, concrete way to measure or manipulate a conceptual variable. Ideally, an operational definition will capture a typical instance of the conceptual variable that illustrates its core meaning or essence. In reality, any conceptual variable can be operationalized in a variety of ways, so that no single operational definition is likely to provide the perfect or only instance of the concept.
A specific, concrete method of measuring or manipulating a conceptual variable.
Let’s first examine this issue with regard to a dependent variable. Operationalizing a dependent variable refers to specifying precisely how it will be measured in a particular study. For example, a researcher might operationalize the conceptual variable anxiety in the following ways:
Scores on a self-
Overt behaviors that are thought (on the basis of a theoretical conception) to be indicators of anxiety (e.g., chewing on the fingernails, rapidly tapping one’s foot, twitching eyelids)
Physiological measures that assess bodily symptoms or signs that are thought (again, on the basis of a theoretical conception) to be indicators of anxiety (e.g., rapid heart rate, sweaty palms, exaggerated startle response)
These various operationalizations tap into different aspects of the concept of anxiety. It’s important that multiple operationalizations of a given conceptual variable are highly correlated with each other, so that we can be confident that they are all tapping into the same underlying conceptual variable. Construct validity is the degree to which the dependent variable measures what it is intended to measure or the independent variable manipulates what it is intended to manipulate. Researchers often assess the construct validity of an independent variable by including a manipulation check, which is a measure that directly assesses whether the manipulation created the change that was intended. For dependent variables, if different operationalizations of a given conceptual variable are not strongly related to each other, we may actually be tapping into two different conceptual variables. Poor construct validity is one of the primary potential problems in the research process. If it is not clear that an operationalization of a dependent variable measures what it was intended to measure, then we can’t draw any clear conclusion from an experiment using that operationalization. More generally, an experiment that lacks construct validity for either the independent or the dependent variable does not have internal validity, and no clear conclusions can be drawn from its results.
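One way to make the idea of "highly correlated" operationalizations concrete is to compute a Pearson correlation between two measures of the same concept. The sketch below is only an illustration: the function name `pearson_r` and the participant scores are hypothetical and are not drawn from any study discussed here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a measure has no variance")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six participants (invented for illustration only).
self_report = [12, 18, 9, 22, 15, 20]   # anxiety questionnaire score
heart_rate = [68, 80, 64, 88, 74, 83]   # beats per minute

r = pearson_r(self_report, heart_rate)
print(f"r = {r:.2f}")
```

A correlation close to 1 would suggest that the two operationalizations are tapping the same underlying variable. A correlation near 0 would suggest that they may be measuring different things.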
The degree to which the dependent measure assesses what it is intended to assess or the manipulation manipulates what it is intended to manipulate.
Problems with the construct validity of independent variables are particularly common in social psychological research. Operationalizations of the manipulation of any one specific conceptual independent variable might also inadvertently alter several other conceptual variables. For example, if we manipulate stereotype threat by informing our research participants that it is widely believed that their group performs poorly on a particular task, this may well be increasing their concern that their poor performance might confirm a negative stereotype, just as our conceptual definition of stereotype threat would suggest. But it may also be doing other things. Maybe it’s just creating a general increase in fear of failure that has little to do with concerns about stereotypes. It might even be creating anger at the thought that some people view one’s group as inferior.
How can we know if the effect of our independent variable is due to concerns about stereotypes, performance anxiety, anger, or any number of other possible consequences of our manipulation? This is a crucial question for determining whether a study has internal validity. When more than one conceptual variable differs across conditions in an experiment, the independent variable is confounded. Confounds cloud the interpretation of research results because a variable other than the conceptual variable we intended to manipulate may be responsible for the effect on the dependent variable, making alternative explanations possible. Alternative explanations make it unclear which conceptual variable really is responsible for the changes in the dependent variable that occur. Confounds and alternative explanations are thus a major problem in social psychological research, and in all of science. Much of the controversy and disagreement among scientists results from the confounding of variables.
A variable other than the conceptual variable intended to be manipulated that may be responsible for the effect on the dependent variable, making alternative explanations possible.
Researchers do their best to avoid confounds in their studies. Ideally, the researcher carefully considers potential confounds and alternative explanations when planning the study and includes control groups that expose participants to these possible confounding variables without exposing them to the variable being investigated as a possible cause. To control for possible confounds in experiments on the effect of stereotype threat on test performance, we might include control conditions in which participants are threatened, distracted, or angered in ways unrelated to stereotypes of the groups to which they belong. If the stereotype threat group performs worse than each of these control groups, we can more confidently rule out performance anxiety, distraction, and anger as alternative explanations for our findings, which would increase our confidence that stereotype threat is, in fact, causing the poorer performance.
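The logic of comparing the stereotype threat condition against several control conditions can be sketched in a few lines. Everything here is assumed for illustration: the condition names, the scores, and the helper `lower_than_every_control` are hypothetical, and a real analysis would use an appropriate statistical test rather than a bare comparison of means.

```python
from statistics import mean

# Hypothetical test scores per condition (invented; not data from any study).
scores = {
    "stereotype_threat": [11, 9, 12, 10, 8],
    "general_threat": [14, 13, 15, 12, 14],
    "distraction": [13, 15, 12, 14, 13],
    "anger": [15, 13, 14, 14, 12],
}

def lower_than_every_control(scores, target):
    """Return True if the target condition's mean is below the mean of each other condition."""
    target_mean = mean(scores[target])
    return all(
        target_mean < mean(values)
        for name, values in scores.items()
        if name != target
    )

condition_means = {name: mean(values) for name, values in scores.items()}
threat_is_lowest = lower_than_every_control(scores, "stereotype_threat")

for name, value in condition_means.items():
    print(f"{name}: {value:.1f}")
print("Stereotype threat lowest:", threat_is_lowest)
```

If the stereotype threat condition were lower than only some of the controls, the corresponding alternative explanations would remain open.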
The problem of confounding can also be minimized by replicating our studies with different operationalizations of the crucial variables, a process known as conceptual replication. If different studies, each flawed in one way or another, with possible confounds operating, yield consistent results, the probability that an alternative explanation is responsible for the results is reduced. Science is thus a cumulative process, and scientific knowledge depends heavily on ongoing conceptual replications of findings to rule out any confounds that might be affecting our results.
The repetition of a study with different operationalizations of the crucial variables but yielding similar results.
As you can see, establishing the construct validity of an experiment’s independent and dependent variables is essential to the internal validity of the experiment. If a study has high internal validity, we may know, for instance, that stereotype threat undermined performance by a group of African American students at a university in California in the early 1990s. This is important because it supports a hypothesis derived from stereotype threat theory and thereby increases confidence in the theory. And even if this finding comes from a unique sample, it demonstrates that the effect can occur. Once internal validity has been established, we can then ask, What does this tell us about other people, in other settings, at other times? This is the basic question regarding external validity, the ability to generalize one’s findings. Can we generalize beyond the group of people studied at a particular time and place?
The judgment that a research finding can be generalized to other people, in other settings, at other times.
In the case of stereotype threat, one external validity question would be whether these effects are limited to African Americans or extend to other stigmatized groups, and even farther, to majority-
Another study (Stone et al., 1999) had White and Black participants engage in a task akin to miniature golf. Half the participants were told that the task measured sports intelligence, and the other half were told that it measured athletic ability. The researchers reasoned that Whites would feel stereotype threat when they were led to believe that the task measured athletic ability, but Blacks would experience stereotype threat when the task was framed as a measure of sports intelligence. These hypotheses were supported: Whites performed poorly when the task was described as a measure of athletic ability, and Blacks performed poorly when it was described as a measure of sports intelligence (FIGURE 1.8). Over the years, the results of many studies have shown that the problem of stereotype threat is indeed a general one that, depending on the performance domain, can affect members of any group that is negatively stereotyped—
These examples illustrate that if we are really to have confidence in the external validity of the findings of psychological research, the research needs to be replicated with other types of operationalizations and other participants from varying cultures, geographical regions, and socioeconomic levels. Social psychological research has been criticized for its heavy use of college students as research participants and for participants who might be described as WEIRD (that is, from countries that are Western, educated, industrialized, rich, and democratic [Henrich et al., 2010]). This is not surprising, because most of the research has been conducted by scientists who are themselves WEIRD. However, some have wondered whether we are simply piling up knowledge about the middle class in WEIRD nations but are learning little about other North Americans, Europeans, and Australians, let alone people from other continents. This narrow choice of participants is a problem, because if culture does exert a powerful role in shaping our view of ourselves and the world around us, then building a science of human behavior largely drawn from only a limited slice of human diversity is likely to skew the conclusions we draw. The ideal solution to this problem would be to sample people randomly from the entire population of the earth. Of course, such random sampling is never possible. Although the rare cross-
One important point to remember is that scientific progress is made in the aggregate. Every study that scientists carry out contains some limitation or weakness; only by conducting multiple studies, using a diverse set of procedures and with a diverse array of samples, can we learn the more general patterns of the human condition. According to this logic, a good, internally valid experiment teaches us what is possible and lends support to a broader theory, even when it doesn’t capture the effect as it actually occurs among people in general. For example, Steele and Aronson’s (1995) demonstration that merely marking one’s race on a cover sheet to a test can lead Black but not White students to underperform doesn’t apply only to the rare occurrences when students fill out demographic information in a testing context. It tells us something conceptual about how reminding people of their group identity can lead to subtle but profound shifts in behavior.
A second answer to the problem of nonrepresentative samples is the increasingly global nature of psychology. Social psychologists can currently be found on every populated continent. Although research from North America and western Europe still dominates the field, the broadening reach of social psychology as a science will continue to fuel efforts to replicate key findings in other cultural and geographic settings. Although these true tests of generalizability will sometimes confirm the universal nature of phenomena, they might also reveal important cultural differences in how we think and feel about ourselves and others. Throughout this text, we’ll highlight some of the research that has already revealed such interesting cultural variations.
The scientific method has helped improve our lives in many ways. By providing a way of assessing the merits of competing claims about the nature of reality, science has greatly enhanced our understanding of the world we live in, ourselves, and how we fit into that world. By applying the knowledge gained from scientific inquiry, humankind has solved many of the problems that have plagued us for millennia, greatly reducing our vulnerability to disease, providing improved means of meeting our basic needs, and giving us control over aspects of life that our ancestors never dreamed possible. But the knowledge science has given us has also created problems our ancestors could have never imagined, such as the potential to kill each other by the millions and to use up or poison the natural resources we rely on for survival. These are very real problems that must be faced. Social psychology can help us grapple with them by providing the knowledge needed to get people to look beyond their immediate personal benefits to see the long-
First, there are aspects of reality that we humans cannot know. Our knowledge of the world originates in the information provided to us by our sense organs. Unfortunately, human sense organs are capable of registering only a tiny fraction of the things that are actually happening in the world. For example, our hearing is limited to a relatively narrow range of sound frequencies. Our dogs can hear many sounds we have no hope of perceiving; bats live in an even more highly differentiated world of sound that we can’t even imagine. Although we often use the knowledge that science gives us to develop technologies that enable us to assess things that our raw sense organs cannot perceive, the fact that we are capable of perceiving only part of what is happening in the world makes a complete understanding of all aspects of reality an elusive goal.
Second, although the scientific method may be objective, the human beings who apply it are not. The scientific method was developed to provide a more objective way of answering questions and evaluating the validity of competing claims about how the world works. But science remains a human endeavor. Scientists may try their best to put their biases aside and be objective, but human nature makes a complete elimination of individual bias impossible. This is part of the reason that controversies continue to rage in all active areas of scientific inquiry. Scientists, social psychologists included, often stake their reputations, careers, and ultimately their self-
Third, not all questions can be answered scientifically. Many of the most pressing crises facing us today involve questions of values, morality, and ethics. Although social psychology can fruitfully employ the scientific method to understand how values develop, change, and influence human behavior, science cannot tell us which values are the right ones to invest in. Is safety more important than freedom? Are the rights of the individual more important than the welfare of the group? Should scientific knowledge be used to restrict behaviors that are injurious to the people who engage in them? These are important questions we all will be facing in the years to come, and although science can help us understand the consequences of different courses of action, it cannot tell us which consequences are more important than others and which values we should use to guide our decisions.
Fourth, human values exert a powerful influence on the way science is conducted. The questions we choose to ask—
The Scientific Method: Systematizing the Acquisition of Knowledge
Science is a method for answering questions that reduces the impact of human biases. Theory and research have a cyclical relationship: Research provides systematic observations; theory provides the basis for predicting and explaining these observations; research then tests hypotheses derived from the theory to assess its validity, refine it, or generate alternate theories.
Correlational Method: Two or more variables are measured and compared to determine whether or not they are related. A relationship between variables does not mean that one caused the other.
Experimental Method: This process seeks to control variables so that cause and effect can be determined. The independent variable is manipulated, and its effect on the dependent variable is observed. Participants must be randomly assigned to conditions to reduce possible confounds.
Features of a Good Theory: Organizes the facts. Explains observations. Inspires new research. Generates new questions. Has practical applications.
Internal and External Validity: Abstract ideas need to be made specific and quantifiable to be manipulated and measured properly. Studies should be able to be replicated using different operationalizations of variables.
Limitations of Science: Human knowledge is limited. Humans are biased. Some questions are outside the scope of science. Human values influence the questions asked.