davis_mcgivney

Chapter 1. Graphical and Numerical Descriptions

Introduction

Once we have collected data, either from a sample or from a population, we have a listing of the values of one or more variables measured on each individual. The following data give information about 8 of the 25 top-grossing movies in the United States as of mid-November 2010.

Some of the top-grossing movies of all time

Name                                           Year  Gross ($M)  Genre      Minutes  Oscars  Rating
E.T.: The Extra-Terrestrial                    1982  435         Family     115      4       PG
Star Wars: Episode I - The Phantom Menace      1999  431         Sci-Fi     133      0       PG
Pirates of the Caribbean: Dead Man's Chest     2006  423         Adventure  130      1       PG13
Toy Story 3                                    2010  415         Animation  103      2*      G
Spider-Man                                     2002  404         Action     121      0       PG13
Transformers: Revenge of the Fallen            2009  402         Action     150      0       PG13
Star Wars: Episode III - Revenge of the Sith   2005  380         Sci-Fi     140      0       PG13
The Lord of the Rings: The Return of the King  2003  377         Fantasy    201      11      PG13

*as of February 2011
Source: Internet Movie Database

Unless our sample or population is quite small, it is difficult to make much sense of such a list. Even when we know what the columns in this table represent (name, year of release, millions of dollars earned, genre, running time in minutes, number of Oscars won, and MPAA rating), it is difficult to draw any conclusions about these movies. In which year were the most top-grossing movies released? What was the running time of a typical movie? Were fantasy movies generally longer than animated ones? What was the average dollar amount earned? To get an overall picture, we need to summarize the data. Our data descriptions can be either graphical (creating a “picture” in the usual sense) or numerical (creating a “picture” in a more abstract sense).
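As a rough illustration of what a numerical description might look like, the short Python sketch below computes a few summaries of the eight movies listed above. The variable names, the tuple layout, and the choice of the mean and the median as summaries are assumptions made for this example only; they are not taken from the text.

```python
from collections import Counter
from statistics import mean, median

# The eight movies from the table, one tuple per movie:
# (name, year, gross in $ millions, genre, running time in minutes,
#  Oscars won, MPAA rating). The layout is an assumption for this sketch.
movies = [
    ("E.T.: The Extra-Terrestrial", 1982, 435, "Family", 115, 4, "PG"),
    ("Star Wars: Episode I - The Phantom Menace", 1999, 431, "Sci-Fi", 133, 0, "PG"),
    ("Pirates of the Caribbean: Dead Man's Chest", 2006, 423, "Adventure", 130, 1, "PG13"),
    ("Toy Story 3", 2010, 415, "Animation", 103, 2, "G"),  # Oscars as of February 2011
    ("Spider-Man", 2002, 404, "Action", 121, 0, "PG13"),
    ("Transformers: Revenge of the Fallen", 2009, 402, "Action", 150, 0, "PG13"),
    ("Star Wars: Episode III - Revenge of the Sith", 2005, 380, "Sci-Fi", 140, 0, "PG13"),
    ("The Lord of the Rings: The Return of the King", 2003, 377, "Fantasy", 201, 11, "PG13"),
]

gross = [m[2] for m in movies]    # millions of dollars earned
minutes = [m[4] for m in movies]  # running times

mean_gross = mean(gross)                        # average amount earned
median_minutes = median(minutes)                # one notion of a "typical" running time
rating_counts = Counter(m[6] for m in movies)   # number of movies per MPAA rating

print(f"Mean gross: {mean_gross} million dollars")   # 408.375
print(f"Median running time: {median_minutes} min")  # 131.5
print(f"Movies per rating: {dict(rating_counts)}")
```

With only eight movies, each released in a different year and most genres appearing once, questions such as the most common release year cannot be answered meaningfully from this excerpt alone; the sketch only shows how a long list collapses into a handful of numbers.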