Once we have collected data, either from a sample or a population, we have a listing of the values of one or more variables measured on the individuals in that sample or population. The following data give information about 8 of the 25 top-grossing movies of all time in the United States, as of mid-November 2010.
Some of the top-grossing movies of all time

Name | Year | Gross (millions of dollars) | Genre | Running time (minutes) | Oscars won | MPAA rating |
---|---|---|---|---|---|---|
E.T.: The Extra-Terrestrial | 1982 | 435 | Family | 115 | 4 | PG |
Star Wars: Episode I - The Phantom Menace | 1999 | 431 | Sci-Fi | 133 | 0 | PG |
Pirates of the Caribbean: Dead Man's Chest | 2006 | 423 | Adventure | 130 | 1 | PG13 |
Toy Story 3 | 2010 | 415 | Animation | 103 | 2* | G |
Spider-Man | 2002 | 404 | Action | 121 | 0 | PG13 |
Transformers: Revenge of the Fallen | 2009 | 402 | Action | 150 | 0 | PG13 |
Star Wars: Episode III - Revenge of the Sith | 2005 | 380 | Sci-Fi | 140 | 0 | PG13 |
The Lord of the Rings: The Return of the King | 2003 | 377 | Fantasy | 201 | 11 | PG13 |
*As of February 2011.
Unless our sample or population is quite small, it is difficult to make much sense of such a list. Even when we know what the columns in this table represent (name, year of release, millions of dollars earned, genre, running time in minutes, number of Oscars won, and MPAA rating), it is difficult to draw any conclusions about these movies. In what year were the most top-grossing movies released? What was the running time of a typical movie? Were fantasy movies generally longer than animated ones? What was the average dollar amount earned? To answer questions like these, we need to summarize the data so that we can see an overall picture. Our data descriptions can be either graphical (creating a “picture” in the usual sense) or numerical (creating a “picture” in a more abstract sense).
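To make the idea of a numerical summary concrete, here is a minimal sketch in Python that computes a few summaries of the eight movies listed in the table. Python and the variable names (`movies`, `mean_gross`, and so on) are our own choices for illustration, not part of the original text, and the median is used here as just one reasonable meaning of a "typical" running time.

```python
from collections import Counter
from statistics import mean, median

# Each row: (name, year, gross in millions of dollars, genre,
#            running time in minutes, Oscars won, MPAA rating)
# Values are copied from the table above.
movies = [
    ("E.T.: The Extra-Terrestrial", 1982, 435, "Family", 115, 4, "PG"),
    ("Star Wars: Episode I - The Phantom Menace", 1999, 431, "Sci-Fi", 133, 0, "PG"),
    ("Pirates of the Caribbean: Dead Man's Chest", 2006, 423, "Adventure", 130, 1, "PG13"),
    ("Toy Story 3", 2010, 415, "Animation", 103, 2, "G"),
    ("Spider-Man", 2002, 404, "Action", 121, 0, "PG13"),
    ("Transformers: Revenge of the Fallen", 2009, 402, "Action", 150, 0, "PG13"),
    ("Star Wars: Episode III - Revenge of the Sith", 2005, 380, "Sci-Fi", 140, 0, "PG13"),
    ("The Lord of the Rings: The Return of the King", 2003, 377, "Fantasy", 201, 11, "PG13"),
]

gross = [row[2] for row in movies]
minutes = [row[4] for row in movies]

# Average dollar amount earned, in millions of dollars
mean_gross = mean(gross)

# A "typical" running time, taken here to be the median
median_minutes = median(minutes)

# How many of the listed movies carry each MPAA rating
rating_counts = Counter(row[6] for row in movies)

print(f"Mean gross: {mean_gross} million dollars")
print(f"Median running time: {median_minutes} minutes")
print(f"Movies per rating: {dict(rating_counts)}")
```

With only eight movies, some of the questions above have limited answers: every movie in the list was released in a different year, and there is just one fantasy movie and one animated movie to compare. Summaries become more informative as the list grows.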
The video StatClips: Basic Principles of Exploring Data gives good examples of why we examine data carefully before drawing any conclusions about them.