2.2 Observation: Discovering What People Do

To observe means to use one’s senses to learn about the properties of an event (e.g., a storm or a parade) or an object (e.g., an apple or a person). For example, when you observe a round, red apple, your brain is using the pattern of light that is coming into your eyes to draw an inference about the apple’s identity, shape, and color. That kind of informal observation is fine for buying fruit but not for doing science. Why? First, casual observations are notoriously unstable. The same apple may appear red in the daylight and crimson at night or spherical to one person and elliptical to another. Second, casual observations can’t tell us about all of the properties that might interest us. No matter how long and hard you look, you will never be able to discern an apple’s crunchiness or pectin content simply by watching it. Luckily, scientists have devised techniques to overcome these problems.

Measurement

The last time you said, “Can you give me a second?” you probably didn’t know you were talking about atomic physics. Every unit of time has an operational definition, which is a description of a property in concrete, measurable terms. The operational definition of a second is the duration of 9,192,631,770 cycles of microwave light absorbed or emitted by the hyperfine transition of cesium-133 atoms in their ground state undisturbed by external fields (which takes roughly 6 seconds just to say). Counting those cycles requires an instrument, which is anything that can detect the condition to which an operational definition refers. An instrument known as a “cesium clock” can count cycles of light, and when it counts 9,192,631,770 of them, one second has officially passed.
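As a rough illustration of how an operational definition and an instrument fit together, the sketch below treats a cycle count as the instrument's raw reading and converts it to seconds using the definition given above. The names `CYCLES_PER_SECOND` and `elapsed_seconds` are illustrative assumptions and do not come from any real clock's software.

```python
# The operational definition of a second, as stated in the text:
# 9,192,631,770 cycles of the cesium-133 hyperfine transition.
CYCLES_PER_SECOND = 9_192_631_770


def elapsed_seconds(cycles_counted: int) -> float:
    """Convert a raw cycle count (the instrument's reading) into seconds."""
    return cycles_counted / CYCLES_PER_SECOND


# Hypothetical reading: the cycles that go by during the roughly
# 6 seconds it takes to say the definition aloud.
cycles_while_speaking = 6 * CYCLES_PER_SECOND
print(elapsed_seconds(cycles_while_speaking))
```

The definition supplies the conversion rule, and the instrument supplies the count. Neither is useful without the other.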

operational definition

A description of a property in concrete, measurable terms.

instrument

Anything that can detect the condition to which an operational definition refers.


The steps we take to measure the psychological properties of a person are the same steps we take to measure the physical properties of an apple. For example, if we wanted to measure a person’s intelligence, or shyness, or happiness, we would have to start by generating an operational definition of that property—that is, by specifying some concrete, measurable event that indicates it. A key feature of an operational definition is validity, which refers to the goodness with which a concrete event defines a property. For example, the concrete event called frequency of smiling is a valid way to define the property called happiness because, as we all know, people tend to smile more often when they feel happy. Do they eat more or talk more or spend more money? Well, maybe. But maybe not. And that’s why food consumption or verbal output or financial expenditures would probably be regarded by most people as invalid measures of happiness (though perfectly valid measures of something else).

validity

The goodness with which a concrete event defines a property.

What are the properties of a good operational definition and a good instrument?

FIGURE 2.1 Measurement. There are two steps in the measurement of a property.

Once we have a valid operational definition of happiness, we just need a smile-detecting instrument, such as a computer loaded with facial-recognition software or maybe just an attentive research assistant with a pencil and a clipboard. Whatever instrument we use, it needs to have two features. First, it needs to have reliability, which is the tendency for an instrument to produce the same measurement whenever it is used to measure the same thing. For example, if a person smiles just as much on Tuesday as on Wednesday, then a smile-detecting instrument should produce identical results on those two days. If it produced different results (i.e., if the instrument detected differences that weren’t actually there), it would lack reliability. Second, a good instrument needs to have power, which is an instrument’s ability to detect differences or changes in the property. If a person smiled more often on Tuesday than on Wednesday, then a good smile-detector should produce different results on those two days. If it produced the same result (i.e., if it failed to detect a difference that was actually there), then it would lack power (see FIGURE 2.1).
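The two requirements described above can be sketched as simple checks on an instrument. This is a toy model under stated assumptions: an instrument is represented as a function from the true number of smiles to the recorded number, and the three counters and the smile counts are hypothetical, invented only for illustration.

```python
import itertools
from typing import Callable

Instrument = Callable[[int], int]  # true smile count -> recorded smile count


def is_reliable(instrument: Instrument, true_smiles: int, trials: int = 6) -> bool:
    """Measuring the same thing repeatedly should give the same reading."""
    readings = {instrument(true_smiles) for _ in range(trials)}
    return len(readings) == 1


def has_power(instrument: Instrument, smiles_a: int, smiles_b: int) -> bool:
    """A real difference between two days should show up in the readings."""
    return instrument(smiles_a) != instrument(smiles_b)


def exact_counter(true_smiles: int) -> int:
    """Hypothetical ideal instrument: records every smile."""
    return true_smiles


def coarse_counter(true_smiles: int) -> int:
    """Hypothetical instrument that only records completed tens of smiles."""
    return (true_smiles // 10) * 10


def make_inconsistent_counter() -> Instrument:
    """Hypothetical instrument whose error changes from one use to the next."""
    errors = itertools.cycle([-2, 0, 2])
    return lambda true_smiles: true_smiles + next(errors)


tuesday, wednesday = 22, 20  # made-up smile counts for illustration
print(is_reliable(exact_counter, wednesday), has_power(exact_counter, tuesday, wednesday))
print(is_reliable(coarse_counter, wednesday), has_power(coarse_counter, tuesday, wednesday))
print(is_reliable(make_inconsistent_counter(), wednesday))
```

In this sketch the coarse counter is reliable but lacks power, because 22 and 20 smiles both register as 20. The inconsistent counter lacks reliability, because it reports differences that are not there. The power check is only meaningful for an instrument that has already passed the reliability check.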

reliability

The tendency for an instrument to produce the same measurement whenever it is used to measure the same thing.

power

An instrument’s ability to detect differences or changes in the property.

Demand Characteristics


Once we have a valid definition and a reliable and powerful instrument, we still have some work to do, because while we are trying to discover how people normally behave, normal people will be trying to behave as they think we want or expect them to. Demand characteristics are those aspects of an observational setting that cause people to behave as they think someone else wants or expects. We call these demand characteristics because they seem to “demand” or require that people say and do certain things. When someone you love asks, “Do these jeans make me look fat?” the right answer is always no, and if you’ve ever been asked this question, then you have experienced demand. Demand characteristics make it hard to measure behavior as it typically unfolds.

demand characteristics

Those aspects of an observational setting that cause people to behave as they think someone else wants or expects.

Culture & Community: Best Place to Fall on Your Face

Are most people prejudiced against people with disabilities? People rarely admit to being prejudiced when asked, and they generally won’t behave in prejudiced ways if someone is watching. So how could you measure prejudice in a way that minimized demand characteristics?

Robert Levine of California State University-Fresno sent his students to 23 large international cities for an observational study in the field. Their task was to observe helping behaviors in a naturalistic context. In two versions of the experiment, students pretended to be either blind or injured while trying to cross a street, while another student stood by to observe whether anyone would come to help. A third version involved a student dropping a pen to see if anyone would pick it up.

The results showed that people helped in all three events fairly evenly within cities, but there was a wide range of response between cities. Rio de Janeiro, Brazil, came out on top as the most helpful city in the study with an overall helping score of 93%. Kuala Lumpur, Malaysia, came in last with a score of 40%, and New York City placed next to last with a score of 45%. On average, Latin American cities ranked as most helpful (Levine, Norenzayan, & Philbrick, 2001).


One way that psychologists avoid the problem of demand characteristics is by observing people without their knowledge. Naturalistic observation is a technique for gathering scientific information by unobtrusively observing people in their natural environments. For example, naturalistic observation has shown that the biggest groups leave the smallest tips in restaurants (Freeman et al., 1975), that hungry shoppers buy the most impulse items at the grocery store (Gilbert, Gill, & Wilson, 2002), and that men do not usually approach the most beautiful woman at a singles’ bar (Glenwick, Jason, & Elman, 1978). Each of these conclusions is the result of measurements made by psychologists who observed people who didn’t know they were being observed. It seems unlikely that the same observations could have been made if the diners, shoppers, and singles had realized that they were being watched.

naturalistic observation

A technique for gathering scientific information by unobtrusively observing people in their natural environments.

What are some of the limits of naturalistic observation?

One way to avoid demand characteristics is to measure behaviors that people are unable or unlikely to control. For example, our pupils contract when we are bored (left) and dilate when we are interested (right), which makes pupillary dilation a useful measure of a person’s level of engagement in a task.

Unfortunately, naturalistic observation isn’t always a viable solution to the problem of demand characteristics. First, some of the things psychologists want to observe simply don’t occur naturally. If we wanted to know whether people who have undergone sensory deprivation perform poorly on motor tasks, we would have to hang around the shopping mall for a very long time before a few dozen blindfolded people with earplugs just happened to wander by and started typing. Second, some of the things that psychologists want to observe can only be gathered from direct interaction with a person—for example, by administering a survey, giving a test, conducting an interview, or hooking someone up to a machine. If we wanted to know how often people worry about dying, how accurately they can remember their high school graduations, or how much electrical activity their brains produce when they feel jealous, then simply watching them from the bushes won’t do.

Luckily, there are ways to avoid demand characteristics. For instance, people are less likely to be influenced by demand characteristics when they are allowed to respond privately (e.g., completing questionnaires when they are alone) or anonymously (e.g., when their names or addresses are not recorded). A second technique that psychologists often use to avoid demand characteristics is to measure behaviors that cannot easily be controlled. You may not want a psychologist to know that you are extremely interested in the celebrity gossip magazine that she’s asked you to read, but you can’t prevent your pupils from dilating, which is what they do when you are mentally engaged.

Why is it important for subjects to be “blind”?

A third way to avoid demand characteristics is to keep the people who are being observed from knowing the true purpose of the observation. When people are “blind” to the purpose of an observation, they can’t behave the way they think they should behave because they don’t know how they should behave. For instance, if you didn’t know that a psychologist was studying the effects of music on mood, you wouldn’t feel obligated to smile when music was played. This is why psychologists typically don’t reveal the true purpose of an observation to the people who are being observed until the study is over.


Observer Bias

The people being observed aren’t the only ones who can make measurement a bit tricky. Consider what happened when students in a psychology class were asked to measure the speed with which a rat learned to run through a maze (Rosenthal & Fode, 1963). Some students were told that their rats had been specially bred to be slow learners and others were told that their rats had been specially bred to be fast learners. Although all the rats were actually the same breed, the students who thought they were measuring the speed of a slow learner reported that their rats took longer to learn the maze than did the students who thought they were measuring the speed of a fast learner. In other words, the measurements revealed precisely what the students expected them to reveal.

People’s expectations can cause the phenomena they expect. In 1929, investors who expected the stock market to collapse sold their stocks and thereby caused the very crisis they feared. In this photo, panicked citizens stand outside the New York Stock Exchange the day after the crash, which the New York Times attributed to “mob psychology.”

Why did this happen? First, expectations can influence observations. It is easy to make errors when measuring the speed of a rat, and our expectations often determine the kinds of errors we make. Does putting one paw over the finish line count as learning the maze? If the rat falls asleep, should the stopwatch be left running, or should the rat be awakened and given a second chance? If a rat runs a maze in 18.5 seconds, should that number be rounded up or rounded down before it is recorded in the logbook? The answers to these questions may depend on whether one thinks the rat is a slow or fast learner. The students who timed the rat probably tried to be honest, vigilant, fair, and objective, but their expectations influenced their observations in subtle ways that they could neither detect nor control. Second, expectations can influence reality. Students who expected their rats to learn quickly may have unknowingly done things to help that learning along, for example, by muttering, “Oh no!” when the fast learner looked the wrong direction or by petting the slow learner less affectionately. (We’ll discuss these phenomena more in the Social Psychology chapter.)
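The rounding question above can be made concrete with a small sketch. It assumes, purely for illustration, that an observer who expects a slow learner rounds ambiguous stopwatch readings up and one who expects a fast learner rounds them down. The function `record_time` and the maze times are hypothetical and are not data from Rosenthal and Fode (1963).

```python
import math
from statistics import mean


def record_time(seconds: float, expectation: str) -> int:
    """Round a stopwatch reading the way a (hypothetical) biased observer might."""
    if expectation == "slow":
        return math.ceil(seconds)   # ambiguous readings drift upward
    if expectation == "fast":
        return math.floor(seconds)  # ambiguous readings drift downward
    raise ValueError("expectation must be 'slow' or 'fast'")


true_times = [18.5, 17.5, 19.5]  # made-up maze times, in seconds

logged_by_slow_group = [record_time(t, "slow") for t in true_times]
logged_by_fast_group = [record_time(t, "fast") for t in true_times]

print(mean(logged_by_slow_group), mean(logged_by_fast_group))
```

Both groups time identical runs, yet their logbooks differ by a full second on average. Each individual rounding decision is defensible, and the bias appears only when the decisions all lean the same way.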

Observers’ expectations, then, can have a powerful influence on both the observations they make and on the behavior of those whom they observe. Psychologists use many techniques to avoid these influences, and one of the most common is the double-blind observation, which is an observation whose true purpose is hidden from both the observer and the person being observed. For example, if the students had not been told which rats were fast learners and which were slow learners, then the students wouldn’t have had any expectations about the rats, and thus their expectations couldn’t have influenced their measurements. That’s why it is common practice in psychology to keep the observers as blind as the participants. For example, measurements are often made by research assistants who do not know what is being studied or why, and who therefore don’t have any expectations about what the people being observed will or should do.
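One way to picture the bookkeeping behind a blind observer is to keep the condition labels in a separate key that is consulted only after the measurements are recorded. The sketch below is a minimal illustration under that assumption. The helper names `split_for_blinding` and `unblind`, the rat identifiers, and the numbers are all hypothetical.

```python
def split_for_blinding(assignments: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Separate what the observer sees from what stays sealed.

    assignments maps a subject ID to its condition label.
    The observer sheet lists subject IDs only, sorted so that their
    order gives no hint about which condition each belongs to.
    """
    observer_sheet = sorted(assignments)
    sealed_key = dict(assignments)
    return observer_sheet, sealed_key


def unblind(recorded: dict[str, float], sealed_key: dict[str, str]) -> dict[str, list[float]]:
    """Group recorded measurements by condition once data collection is over."""
    grouped: dict[str, list[float]] = {}
    for subject, value in recorded.items():
        grouped.setdefault(sealed_key[subject], []).append(value)
    return grouped


# Made-up labels and measurements, for illustration only.
assignments = {"rat_3": "fast", "rat_1": "slow", "rat_2": "fast", "rat_4": "slow"}
observer_sheet, sealed_key = split_for_blinding(assignments)

recorded = {"rat_1": 19, "rat_2": 18, "rat_3": 17, "rat_4": 20}
print(observer_sheet)
print(unblind(recorded, sealed_key))
```

The observer works only from `observer_sheet`, so no expectation about a particular rat can shape what gets written down. The labels rejoin the data at the analysis stage.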

double-blind

An observation whose true purpose is hidden from both the observer and the person being observed.


SUMMARY QUIZ [2.2]

Question 2.4

1. When a measure produces the same measurement whenever it is used to measure the same thing, it is said to have
  a. validity.
  b. reliability.
  c. power.
  d. concreteness.

Answer: b.

Question 2.5

2. Aspects of an observational setting that cause people to behave as they think they should are called
  a. observer biases.
  b. reactive conditions.
  c. natural habitats.
  d. demand characteristics.

Answer: d.

Question 2.6

3. In a double-blind observation,
  a. the participants know what is being measured.
  b. people are observed in their natural environments.
  c. the purpose is hidden from both the observer and the person being observed.
  d. only surveys are used.

Answer: c.