Graphical and Numerical Descriptions
Once we have collected data, whether from a sample or from a population, we have a list of the values of one or more variables measured on each individual. The following data describe some of the 25 top-grossing movies in the United States as of May 2013.
Name | Year | Gross (millions of $) | Genre | Running time (min) | Oscars won | MPAA rating |
E.T.: The Extra-Terrestrial | 1982 | 435 | Family | 115 | 4 | PG |
Star Wars: Episode I - The Phantom Menace | 1999 | 431 | Sci-Fi | 133 | 0 | PG |
Pirates of the Caribbean: Dead Man's Chest | 2006 | 423 | Adventure | 130 | 1 | PG13 |
Toy Story 3 | 2010 | 415 | Animation | 103 | 2 | G |
The Hunger Games | 2012 | 408 | Adventure | 142 | 0 | PG13 |
Transformers: Revenge of the Fallen | 2009 | 402 | Action | 150 | 0 | PG13 |
Star Wars: Episode III - Revenge of the Sith | 2005 | 380 | Sci-Fi | 140 | 0 | PG13 |
The Lord of the Rings: The Return of the King | 2003 | 377 | Fantasy | 201 | 11 | PG13 |
Source: Internet Movie Database
Unless our sample or population is quite small, it is difficult to make much sense of such a list. Even when we know what the columns in this table represent (name, year of release, millions of dollars earned, genre, running time in minutes, number of Oscars won, and MPAA rating), it is difficult to draw any conclusions about these movies. In which year were the most top-grossing movies released? What was the running time of a typical movie? Were fantasy movies generally longer than animated ones? What was the average dollar amount earned in the United States? To answer questions like these, we need to summarize the data and get an overall picture. Our descriptions can be either graphical (creating a “picture” in the usual sense) or numerical (creating a “picture” in a more abstract sense).
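To make the idea of summarizing concrete, here is a minimal sketch in Python that computes a few numerical descriptions and draws a crude text "picture" from the eight rows listed in the table. The variable names are our own, and using the mean and the median as "average" and "typical" values is just one reasonable choice. The results describe only these eight movies, not all 25.

```python
from collections import Counter
from statistics import mean, median

# The eight rows shown in the table:
# (name, year, gross in millions of dollars, genre,
#  running time in minutes, Oscars won, MPAA rating)
movies = [
    ("E.T.: The Extra-Terrestrial", 1982, 435, "Family", 115, 4, "PG"),
    ("Star Wars: Episode I - The Phantom Menace", 1999, 431, "Sci-Fi", 133, 0, "PG"),
    ("Pirates of the Caribbean: Dead Man's Chest", 2006, 423, "Adventure", 130, 1, "PG13"),
    ("Toy Story 3", 2010, 415, "Animation", 103, 2, "G"),
    ("The Hunger Games", 2012, 408, "Adventure", 142, 0, "PG13"),
    ("Transformers: Revenge of the Fallen", 2009, 402, "Action", 150, 0, "PG13"),
    ("Star Wars: Episode III - Revenge of the Sith", 2005, 380, "Sci-Fi", 140, 0, "PG13"),
    ("The Lord of the Rings: The Return of the King", 2003, 377, "Fantasy", 201, 11, "PG13"),
]

# Numerical descriptions: one number that summarizes a whole column.
mean_gross = mean(m[2] for m in movies)        # average gross, millions of dollars
median_minutes = median(m[4] for m in movies)  # a "typical" running time

# Counts for a categorical variable (MPAA rating).
rating_counts = Counter(m[6] for m in movies)

print(f"Mean gross: {mean_gross:.1f} million dollars")
print(f"Median running time: {median_minutes} minutes")

# Graphical description: a simple text bar chart of movies per rating.
for rating, count in rating_counts.most_common():
    print(f"{rating:<5} {'*' * count}")
```

For these eight movies, the mean gross is about 408.9 million dollars and the median running time is 136.5 minutes. Five of the eight are rated PG13, which the bar chart makes visible at a glance.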
The video StatClips: Basic Principles of Exploring Data gives good examples of why we examine data carefully before drawing any conclusions about them.