Chapter specifics
• If we have data on a single quantitative variable, we start with a histogram or stemplot to display the distribution. Then we add numbers to describe the center and variability of the distribution.
• There are two common descriptions of center and variability: the five-number summary and the mean and standard deviation.
• The five-number summary consists of the median M, the midpoint of the observations, to measure center, together with the two quartiles Q1 and Q3 and the smallest and largest observations to describe variability.
• A boxplot is a graph of the five-number summary.
• The mean is the average of the observations.
• The standard deviation s measures variability as a kind of average distance from the mean, so use it only with the mean. The variance is the square of the standard deviation.
• The mean and standard deviation can be changed a lot by a few outliers. The mean and median are the same for symmetric distributions, but the mean moves farther toward the long tail of a skewed distribution.
• In general, use the five-number summary to describe most distributions and the mean and standard deviation only for roughly symmetric distributions.
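The points above can be illustrated with a short Python sketch. It is not a definitive implementation: the data values are hypothetical, invented only for illustration, and the helper name five_number_summary is our own. The sketch assumes the quartile rule that takes the median of the observations on each side of the overall median's location, leaving out the median itself when the number of observations is odd; statistical software often uses slightly different quartile rules and may give slightly different values.

```python
from statistics import mean, median, stdev

def five_number_summary(data):
    """Return (min, Q1, M, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # observations left of the median's location
    upper = xs[half + n % 2:]    # observations right of the median's location
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

# Hypothetical data, invented for illustration only.
values = [2, 4, 4, 5, 6, 7, 9]
with_outlier = values + [40]     # same data plus one high outlier

for label, data in (("original", values), ("with outlier", with_outlier)):
    print(label)
    print("  five-number summary:", five_number_summary(data))
    # stdev() is the sample standard deviation s (divides by n - 1)
    print("  mean:", round(mean(data), 2), " s:", round(stdev(data), 2))
```

Adding the single outlier moves the median only from 5 to 5.5, while the mean moves from about 5.29 to 9.625 and the standard deviation grows sharply. This is the sense in which the mean and standard deviation can be changed a lot by a few outliers.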
In Chapter 11, we discussed histograms and stemplots as graphical displays of the distribution of a single quantitative variable. We were interested in the shape, center, and variability of the distribution. In this chapter, we introduce numbers to describe the center and variability. For roughly symmetric distributions, the mean and standard deviation are used to describe the center and variability. For distributions that are not roughly symmetric, we use the five-number summary to describe the center and variability.
In most of the examples, we used graphical displays and numbers to describe the distribution of data on a single quantitative variable. These data are typically a sample from some population. Thus, the numbers that describe features of the distribution are statistics as discussed in Chapter 3. In the next chapter, we begin to think about distributions of populations. Thus, the numbers that describe features of these distributions are parameters. In later chapters, we will use statistics to draw conclusions, or make inferences, about parameters. Drawing conclusions about parameters that describe the center of a distribution of a single quantitative variable will be an important type of inference.
CASE STUDY EVALUATED
Find the data on income by education at the Census Bureau website listed in the Notes and Data Sources section at the end of the book. Use what you have learned in this chapter to answer the following questions.
1. What are the median incomes for people 25 years old and over who are high school graduates only, have some college but no degree, have a bachelor’s degree, have a master’s degree, and have a doctorate degree? At the bottom of the table, you will find median earnings in dollars.
2. From the distribution given in the tables, can you find (approximately) the first and third quartiles?
3. Do people with more education earn more than people with less education? Discuss.
Online Resources
• The StatClips Videos, Summaries of Quantitative Data Example A, Example B, and Example C, describe how to compute the mean, standard deviation, and median of data.
• The StatClips Video, Exploratory Pictures for Quantitative Data Example C, describes how to construct boxplots.
• The Snapshots Video, Summarizing Quantitative Data, discusses the mean, standard deviation, and median of data, as well as boxplots, in the context of a real example.