3.3 How to Build a Graph

In this section, you will learn how to choose the most appropriate type of graph and then use a checklist to ensure that the graph conforms to APA style. We also discuss innovative graphs that some social scientists are already using to let their data speak more clearly, and that point to the exciting future of graphing. Innovations in graphing can help us deliver a persuasive message, much like that conveyed by Florence Nightingale’s coxcomb graph.

Choosing the Appropriate Type of Graph

When deciding what type of graph to use, first examine the variables. Decide which is the independent variable and which is the dependent variable. Also, identify which type of variable—nominal, ordinal, or scale (interval/ratio)—each is. Most of the time, the independent variable belongs on the horizontal x-axis and the dependent variable goes on the vertical y-axis.

MASTERING THE CONCEPT

3.4: The best way to determine the type of graph to create is to identify the independent variable and the dependent variable, along with the type of variable that each is—nominal, ordinal, or scale.

After assessing the types of variables that are in the study, use the following guidelines to select the appropriate graph:

  1. If there is one scale variable (with frequencies), use a histogram or a frequency polygon (Chapter 2).

  2. If there is one scale independent variable and one scale dependent variable, use a scatterplot or a line graph. (Figure 3-9 provides an example of how to use more than one line on a time plot.)
  3. If there is one nominal or ordinal independent variable and one scale dependent variable, use a bar graph or a Pareto chart.
  4. If there are two or more nominal or ordinal independent variables and one scale dependent variable, use a bar graph.
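
The four guidelines above can be read as a small decision rule. The sketch below is one way to write that rule in Python. The function name `choose_graph` and the text labels for the variable types are our own illustrative assumptions; they are not part of APA style or of any graphing program.

```python
def choose_graph(independent_types, dependent_type="scale"):
    """Suggest graph types from the measurement level of each variable.

    independent_types: a list with one entry per independent variable,
    each "nominal", "ordinal", or "scale". Pass an empty list when there
    is a single scale variable graphed against its frequencies.
    """
    if dependent_type != "scale":
        raise ValueError("these guidelines assume a scale dependent variable")
    allowed = {"nominal", "ordinal", "scale"}
    if any(t not in allowed for t in independent_types):
        raise ValueError("unknown variable type")

    # Guideline 1: one scale variable (with frequencies).
    if len(independent_types) == 0:
        return ["histogram", "frequency polygon"]
    # Guideline 2: one scale independent and one scale dependent variable.
    if independent_types == ["scale"]:
        return ["scatterplot", "line graph"]
    categorical = all(t in ("nominal", "ordinal") for t in independent_types)
    # Guideline 3: one nominal or ordinal independent variable.
    if categorical and len(independent_types) == 1:
        return ["bar graph", "Pareto chart"]
    # Guideline 4: two or more nominal or ordinal independent variables.
    if categorical:
        return ["bar graph"]
    raise ValueError("combination not covered by the four guidelines")


print(choose_graph(["ordinal", "nominal"]))  # ['bar graph']
```

Combinations that the four guidelines do not address, such as one scale and one nominal independent variable, raise an error rather than guess.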

How to Read a Graph

Figure 3-16

Two Independent Variables When we are graphing a data set that has two independent variables, we show one independent variable on the x-axis (in this case, chronic jealousy—low or high) and one independent variable in a color-coded key (in this case, type of prime—infidelity or neutral). This graph demonstrates that people high in chronic jealousy looked longer at an attractive person of the same sex, a potential threat, when primed by thinking about infidelity than when primed by thinking about a neutral topic. The pattern was reversed for people low in chronic jealousy.

Let’s use the graph in Figure 3-16 to confirm your understanding of independent and dependent variables. This study of jealousy (Maner, Miller, Rouby, & Gailliot, 2009) includes two independent variables: level of jealousy (whether a participant is low or high in chronic jealousy) and how people were primed (whether they were primed to think about infidelity or about a neutral topic). People primed to think about infidelity visualized and wrote about a time when they experienced infidelity-related concerns. People primed to think about a neutral topic wrote a detailed account of four or five things they had done the previous day. The dependent variable is how long participants looked at photographs of an attractive same-sex person; for a jealous heterosexual person, an attractive person of the same sex would be a potential threat to his or her relationship.

Here are the critical questions you need to ask to understand the graph of the findings of the jealousy study. A well-designed graph makes it easy to see the answers; a graph intended to mislead or lie will obscure the answers.

  1. What variable are the researchers trying to predict? That is, what is the dependent variable?
  2. Is the dependent variable nominal, ordinal, or scale?
  3. What are the units of measurement on the dependent variable? For example, if the dependent variable is IQ as measured by the Wechsler Adult Intelligence Scale, then the possible scores are the IQ scores themselves, ranging from 0 to 145.
  4. What variables did the researchers use to predict this dependent variable? That is, what are the independent variables?
  5. Are these two independent variables nominal, ordinal, or scale?
  6. What are the levels for each of these independent variables?

Now check your answers:

  1. The dependent variable is time, in milliseconds.
  2. Milliseconds is a scale variable.
  3. Milliseconds can range from 0 on up. In this case, no average exceeds 600.
  4. The first independent variable is level of jealousy; the second independent variable is priming condition.
  5. Level of jealousy is an ordinal variable but it can be treated as a nominal variable in the graph; priming condition is a nominal variable.

  6. The levels for jealousy condition are low chronic jealousy and high chronic jealousy. The levels for priming condition are infidelity and neutral.

Because there are two independent variables—both of which are nominal—and one scale dependent variable, we used a bar graph to depict these data.
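
To see how a design with two independent variables is laid out before it is drawn, it can help to store one mean per combination of levels. The sketch below does this in Python and prints a rough text version of a grouped bar graph. The four cell means are invented placeholders that only follow the pattern described above; they are not the values reported by Maner and colleagues (2009). The function name `text_bar_graph` is also our own.

```python
# Placeholder cell means in milliseconds. These numbers are invented for
# illustration only; they are NOT the results of Maner et al. (2009).
cell_means = {
    ("low", "infidelity"): 450,
    ("low", "neutral"): 520,
    ("high", "infidelity"): 580,
    ("high", "neutral"): 470,
}


def text_bar_graph(means, ms_per_char=10):
    """Return a grouped bar graph as a list of text lines.

    The first independent variable (jealousy) forms the groups along the
    axis. The second (prime) labels each bar, playing the role of the
    color-coded key. Every bar starts at 0.
    """
    groups = []
    for jealousy, _prime in means:
        if jealousy not in groups:
            groups.append(jealousy)

    lines = []
    for jealousy in groups:
        lines.append(f"{jealousy} chronic jealousy")
        for (group, prime), value in means.items():
            if group == jealousy:
                bar = "#" * (value // ms_per_char)
                lines.append(f"  {prime:<10} {bar} {value} ms")
    return lines


print("\n".join(text_bar_graph(cell_means)))
```

Each key in `cell_means` pairs one level of jealousy with one level of priming, so the four entries correspond to the four bars in a graph like Figure 3-16.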

Guidelines for Creating a Graph

Here is a helpful checklist of questions to ask when you’ve created a graph or when you encounter a graph. Some we’ve mentioned previously, and all are wise to follow.

The last of these guidelines involves a new term, the graph-corrupting fluff called chartjunk, a term coined by Tufte (2001). According to Tufte, chartjunk is any unnecessary information or feature in a graph that detracts from a viewer’s ability to understand the data. Chartjunk can take the form of any of three unnecessary features, all demonstrated in the rather frightening graph in Figure 3-17.

Figure 3-17

Chartjunk Run Amok Moiré vibrations, such as those seen in the patterns on these bars, might be fun to use, but they detract from the viewer’s ability to glean the story of the data. Moreover, the grid pattern behind the bars might appear scientific, but it serves only to distract. Ducks—like the 3-D shadow effect on the bars and the globe clip-art—add nothing to the data, and the colors are absurdly eye straining. Don’t laugh; we’ve had students submit carefully written research papers accompanied by graphs even more garish than this!

MASTERING THE CONCEPT

3.5: Avoid chartjunk—any unnecessary aspect of a graph that detracts from its clarity.

  1. Moiré vibrations are any visual patterns that create a distracting impression of vibration and movement. They are unfortunately sometimes the default settings for bar graphs in statistical software. Tufte recommends using shades of gray instead of patterns.
  2. A grid is a background pattern, almost like graph paper, on which the data representations, such as bars, are superimposed. Tufte recommends the use of grids only for hand-drawn drafts of graphs. In final versions of graphs, use only very light grids, if necessary.

  3. Ducks are features of the data that have been dressed up to be something other than merely data. Think of ducks as data in costume. Named for the Big Duck, a store in Flanders, New York, that was built in the form of a very large duck, graphic ducks can be three-dimensional effects, cutesy pictures, fancy fonts, or any other flawed design features. Avoid chartjunk!

Several computer graphing programs have defaults that correspond to many, but not all, of these guidelines. Computer defaults are the options that the software designer has preselected; these are the built-in decisions that the software will implement if you do not instruct it otherwise. You cannot assume that these defaults match the APA guidelines for your particular situation. In most programs, you can point the cursor at a part of the graph and click to view the available options.
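
Because defaults can introduce chartjunk without your noticing, one option is to review a graph's settings against the three forms of chartjunk before accepting them. The Python sketch below illustrates the idea. The setting names (`fill_pattern`, `grid`, `three_d`, `clip_art`) are hypothetical; real graphing programs name their options differently, so treat this as an outline of the check rather than as the interface of any particular package.

```python
def find_chartjunk(settings):
    """Return a list of chartjunk warnings for a dictionary of graph settings.

    The keys checked here are hypothetical stand-ins for whatever options
    a given graphing program actually exposes.
    """
    warnings = []
    # Moire vibrations: patterned fills. Tufte recommends shades of gray.
    if settings.get("fill_pattern") not in (None, "solid"):
        warnings.append("moire: replace patterned fills with shades of gray")
    # Grids: in a final graph, use only a very light grid, if any.
    if settings.get("grid") == "heavy":
        warnings.append("grid: remove the grid or make it very light")
    # Ducks: 3-D effects, clip art, and similar decoration.
    if settings.get("three_d") or settings.get("clip_art"):
        warnings.append("duck: remove 3-D effects and decorative pictures")
    return warnings


# Hypothetical defaults, as a program might preselect them.
software_defaults = {"fill_pattern": "crosshatch", "grid": "heavy", "three_d": True}
for warning in find_chartjunk(software_defaults):
    print(warning)
```

A graph with solid fills, a light grid, and no decoration produces an empty list of warnings.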

Edward Tufte’s Big Duck The graphics theorist Edward Tufte was fascinated by the Big Duck, a store in the form of a duck for which he named a type of chartjunk (graphic clutter). In graphs, ducks are any aspects of the graphed data that are “overdressed,” obscuring the message of the data. Think of ducks as data in a ridiculous costume.

The Future of Graphs

Thanks to computer technology, we have entered a second golden age of scientific graphing. We mention only three categories here: interactive graphs, clinical applications, and geographic information systems.

Interactive Graphs  One informative and haunting graph was published online in the New York Times on September 9, 2004, to commemorate the day on which the 1000th U.S. soldier died in Iraq (http://tinyurl.com/55nlu). Titled “A Look at 1000 Who Died,” this beautifully designed tribute is formed by photos of each of the dead servicemen and women. One can view these photographs organized by variables such as last name, where they lived, gender, cause of death, and how old they were when they died. Because the photos are the same size, the stacking of the photos also functions as a bar graph. By clicking on two or more months in a row, or on two or more ages in a row, one can visually compare numbers of deaths among levels of a category.

Yet this interactive graph is even more nuanced, because it provides a glimpse into the life stories of these soldiers. By holding the cursor over a photo that catches your eye, you can learn, for example, that Spencer T. Karol, regular duty in the U.S. Army, from Woodruff, Arizona, died on October 6, 2003, at the age of 20, from hostility-inflicted wounds. A thoughtfully designed interactive graph such as this one holds even more power than a traditional flat graph in how it educates, evokes emotion, and provides details that humanize the stories behind the numbers.

Clinical Applications  Clinical psychology researchers have developed graphing techniques, illustrated in Figure 3-18, to help therapists identify when the therapy process appears to be leading to a poorer-than-expected outcome (Howard, Moras, Brill, Martinovich, & Lutz, 1996). The dependent variable, rate of actual improvement, is graphed as a line that compares it to the rate of expected improvement (based on previous research). If therapy progresses more slowly than expected, the observed discrepancy points out the need to examine why a particular client is not making better progress.

Figure 3-18

Graph as Therapy Tool Some graphs allow therapists to compare the actual rate of a client’s improvement with the expected rate given that client’s characteristics. This client (assessed Mental Health Index in gray) is doing worse than expected (expected treatment response in red) but has improved enough to be above the failure boundary (in yellow).
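
The comparison behind a graph like Figure 3-18 can be sketched as a simple rule applied at each session: is the client's observed score below the failure boundary, below the expected score, or on track? The Python sketch below is a minimal illustration, not the prediction model used by Howard and colleagues (1996). It assumes that higher scores mean better functioning, and the session scores are invented placeholders.

```python
def flag_sessions(actual, expected, failure_boundary):
    """Label each session by comparing the observed score with the expected
    score and the failure boundary (assumes higher scores are better)."""
    labels = []
    for observed, predicted, floor in zip(actual, expected, failure_boundary):
        if observed < floor:
            labels.append("below failure boundary")
        elif observed < predicted:
            labels.append("below expected")
        else:
            labels.append("on track")
    return labels


# Invented scores for four sessions; not data from any real client.
actual = [40, 44, 47, 52]
expected = [40, 48, 54, 58]
failure_boundary = [35, 38, 41, 44]

print(flag_sessions(actual, expected, failure_boundary))
# ['on track', 'below expected', 'below expected', 'below expected']
```

In this made-up case, the client falls behind the expected rate after the first session but never drops below the failure boundary, which is the same pattern described in the caption above.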

Geographic Information Systems (GIS)  Many companies have published software that enables computer programmers to link Internet-based data to Internet-based maps (Markoff, 2005). These visual tools are all variations on geographic information systems (GIS). The APA sponsors an advanced workshop on how to apply GIS to the social sciences.

Sociologists, geographers, political scientists, consumer psychologists, and epidemiologists (who use statistics to track patterns of disease) have already become familiar with GIS in their respective fields. Organizational psychologists, social psychologists, and environmental psychologists can use GIS to organize workflow, assess group dynamics, and study the design of classrooms. Ironically, this advance in computerized mapping is pretty much what John Snow did without a computer in 1854 when he studied the Broad Street cholera outbreak.

Next Steps

Multivariable Graphs

In this chapter, we learned to create graphs with two variables, such as scatterplots and many bar graphs, and with three variables, such as bar graphs that include two independent variables and one dependent variable. As graphing technologies become more advanced, there are increasingly elegant ways to depict multiple variables on a single graph. Using the bubble graph option under “Other Charts” in Microsoft Excel (and even better, downloading Excel templates from sites such as http://juiceanalytics.com/chartchooser), we can create a bubble graph that depicts multiple variables.

In an article titled “Is the Emotion-Health Connection a ‘First-World Problem’?” Sarah Pressman and her colleagues used a more sophisticated version of a bubble graph to display four variables (Pressman, Gallagher, & Lopez, 2013):

  1. Country. Each bubble is one country. For example, the large yellow bubble toward the upper-right-hand corner represents Ireland; the small red bubble toward the lower far left represents Georgia.
  2. Self-reported health. The x-axis indicates self-reported physical health for a country.
  3. Positive emotions. The y-axis indicates reported positive emotions for a country.
  4. Gross domestic product (GDP). Both the size and the color of the bubbles represent a country’s GDP. Smaller and darker red bubbles indicate lower GDP; larger and yellower bubbles indicate higher GDP.

Figure 3-19

A Five-Variable Graph Increasingly sophisticated technology allows us to create increasingly sophisticated graphs. This bubble graph from a study by Sarah Pressman and her colleagues (2013) depicts four variables: country (each bubble), self-reported health (x-axis), positive emotions (y-axis), and gross domestic product (size and color of bubbles). The researchers could have had five variables if they had not used both the size and the color of the bubbles to represent GDP.
Free material from www.gapminder.org

The researchers could have chosen to add a fifth variable by using either size or color to represent GDP, rather than both. They might have used size, for example, to represent GDP, and color to represent the continent for each country.
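
The five-variable version described above amounts to assigning each variable to its own visual channel. The Python sketch below shows one way to set up that assignment before any drawing is done. The function name `encode_bubbles`, the field names, and the three records are all hypothetical; the records are invented and do not come from Pressman and colleagues (2013) or from Gapminder.

```python
def encode_bubbles(records, min_area=50.0, max_area=500.0):
    """Map each record to the visual channels of a bubble graph.

    x <- self-reported health, y <- positive emotions,
    area <- GDP (scaled linearly), color <- continent.
    """
    gdps = [r["gdp"] for r in records]
    low, high = min(gdps), max(gdps)
    span = (high - low) or 1.0  # avoid dividing by zero if all GDPs match
    palette = {}  # continent -> color index, in order of first appearance
    bubbles = []
    for r in records:
        color = palette.setdefault(r["continent"], len(palette))
        area = min_area + (r["gdp"] - low) / span * (max_area - min_area)
        bubbles.append({
            "label": r["country"],
            "x": r["health"],
            "y": r["emotion"],
            "area": area,
            "color": color,
        })
    return bubbles


# Invented records for three unnamed countries.
records = [
    {"country": "A", "health": 0.6, "emotion": 0.7, "gdp": 10, "continent": "X"},
    {"country": "B", "health": 0.8, "emotion": 0.8, "gdp": 30, "continent": "Y"},
    {"country": "C", "health": 0.7, "emotion": 0.6, "gdp": 20, "continent": "X"},
]
for bubble in encode_bubbles(records):
    print(bubble)
```

Here GDP is tied to the area of each bubble rather than to its radius, a common choice because viewers tend to judge bubbles by how much space they cover. That choice is our assumption, not something specified in the study.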

From this graph, we can see a strong relation between physical health and positive emotions. GDP appears to be related to both measures; among both higher-GDP countries (larger, yellower bubbles) and lower-GDP countries (smaller, darker red bubbles), there is a link between emotions and health.

Some interactive versions of a bubble graph have, amazingly, added a sixth variable to the five that are possible on a printed page. For instance, www.gapminder.org/world has fascinating bubble graphs with data on countries around the world. Gapminder allows you to add the variable year by clicking “Play” in the lower left-hand corner; the graph is then animated and can show the movement of countries with respect to a range of variables since 1800!

CHECK YOUR LEARNING

Reviewing the Concept

  • Graphs should be used when they add information to written text or help to clarify difficult material.
  • To decide what kind of graph to use, determine whether the independent variable and the dependent variable are nominal, ordinal, or scale variables.
  • A brief checklist will help you create an understandable graph. Label graphs precisely and avoid chartjunk.
  • In the near future, online interactive graphs; graphs based on sophisticated prediction models such as those that forecast therapy outcomes; and computerized mapping will become increasingly common.

Clarifying the Concepts

  • 3-8 What is chartjunk?

Calculating the Statistics

  • 3-9 Decisions about what kind of graph to use depend largely on how variables are measured. Imagine a researcher is interested in how “quality of sleep” is related to typing performance (measured by the number of errors made). For each of the measures of sleep below, decide which kind of graph to use.
    1. Total minutes slept
    2. Sleep assessed as sufficient or insufficient
    3. Using a scale from 1 (low-quality sleep) to 7 (excellent sleep)

Applying the Concepts

  • 3-10 Imagine that the graph in Figure 3-17 represents data testing the hypothesis that exposure to the sun can impair IQ. Further imagine that the researcher has recruited groups of people and randomly assigned them to different levels of exposure to the sun: 0, 1, 6, and 12 hours per day (enhanced, in all cases, by artificial sunlight when natural light is not available). The mean IQ scores are 142, 125, 88, and 80, respectively. Redesign this chartjunk graph, either by hand or by using software, paying careful attention to the dos and don’ts outlined in this section.

Solutions to these Check Your Learning questions can be found in Appendix D.
