When you complete this section, you will be able to:
Compare the mean and the median for symmetric and skewed distributions.
Sketch a Normal distribution for any given mean and standard deviation.
Apply the 68-95-99.7 rule to find proportions of observations within one, two, and three standard deviations of the mean for any Normal distribution.
Transform values of a variable from a general Normal distribution to the standard Normal distribution.
Compute areas under a Normal curve using software or Table A.
Perform inverse Normal calculations to find values of a Normal variable corresponding to various areas.
Assess the extent to which the distribution of a set of data can be approximated by a Normal distribution.
We now have a kit of graphical and numerical tools for describing distributions. What is more, we have a clear strategy for exploring data on a single quantitative variable:
1. Always plot your data: make a graph, usually a stemplot or a histogram.
2. Look for the overall pattern and for striking deviations such as outliers.
3. Calculate an appropriate numerical summary to briefly describe center and spread.
Technology has expanded the set of graphs that we can choose for Step 1. It is possible, though painful, to make histograms by hand. Software can go further: clever algorithms describe a distribution in a way that is not feasible by hand, fitting a smooth curve to the data in addition to, or instead of, a histogram. The curves used are called density curves. Before we examine density curves in detail, here is an example of what software can do.
Density Curve
A density curve is a curve that
Is always on or above the horizontal axis.
Has area exactly 1 underneath it.
A density curve describes the overall pattern of a distribution. The area under the curve and above any range of values is the proportion of all observations that fall in that range.
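The two defining properties can be checked numerically. The sketch below is only an illustration: it uses the standard Normal curve as the density (an assumption made for this example) and a hypothetical helper, `area_under`, that approximates area with the trapezoid rule.

```python
from statistics import NormalDist


def area_under(density, lo, hi, steps=10_000):
    """Approximate the area under `density` between lo and hi (trapezoid rule)."""
    width = (hi - lo) / steps
    total = 0.5 * (density(lo) + density(hi))
    for i in range(1, steps):
        total += density(lo + i * width)
    return total * width


# Illustrative density: the standard Normal curve (an assumption for this sketch).
curve = NormalDist(mu=0, sigma=1).pdf

# Area under essentially the whole curve; a density curve should give 1.
total_area = area_under(curve, -8, 8)

# Area above the range -1 to 1: the proportion of observations in that range.
middle_share = area_under(curve, -1, 1)

print(f"total area   ~ {total_area:.4f}")
print(f"area -1 to 1 ~ {middle_share:.4f}")
```

Any other density curve could be passed to `area_under` in place of `curve`; the total area should still come out very close to 1.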
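There are other symmetric bell-shaped density curves that are not Normal. The Normal density curves are specified by a particular equation. The height of the density curve at any point x is given by

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

where \mu is the mean and \sigma is the standard deviation of the distribution.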
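The height of a Normal curve can also be evaluated numerically. The sketch below codes the Normal density formula directly and cross-checks it against Python's standard library. The function name `normal_density` and the parameter values are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def normal_density(x, mu, sigma):
    """Height of the N(mu, sigma) density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical mean and standard deviation, chosen only for illustration.
mu, sigma = 64.5, 2.5

peak = normal_density(mu, mu, sigma)            # the curve is highest at the mean
one_sd = normal_density(mu + sigma, mu, sigma)  # height one sigma above the mean
mirror = normal_density(mu - sigma, mu, sigma)  # same height one sigma below

# Cross-check against the standard library's Normal density.
library_peak = NormalDist(mu, sigma).pdf(mu)

print(f"height at the mean: {peak:.4f} (library: {library_peak:.4f})")
print(f"height one sigma away, relative to the peak: {one_sd / peak:.4f}")
```

The equal heights at one standard deviation above and below the mean reflect the symmetry of the curve about its mean.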
EXAMPLE 1.44
Eligibility for aid and practice. What proportion of all students who take the SAT would be eligible to receive athletic scholarships and to practice with the team but would not be eligible to compete in the eyes of the NCAA? That is, what proportion of students have SAT scores between 620 and 800? First, sketch the areas, exactly as in Example 1.41. We again use X as shorthand for an SAT score.
Standardize.
Use the table.
As in Example 1.41, about 13% of students would be eligible to receive athletic scholarships and to practice with the team.
Use Your Knowledge
1.97 Find the proportion. Consider the NAEP scores, which are approximately Normal, N(288, 38). Find the proportion of students who have scores less than 350. Find the proportion of students who have scores greater than or equal to 350. Sketch the relationship between these two calculations using pictures of Normal curves similar to the ones given in Example 1.40 (page xx).
1.98 Find another proportion. Consider the NAEP scores, which are approximately Normal, N(288, 38). Find the proportion of students who have scores between 300 and 350. Use pictures of Normal curves similar to the ones given in Example 1.41 (page xx) to illustrate your calculations.
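If you work these exercises with software instead of Table A, the calculation might look like the following sketch. It uses Python's `statistics.NormalDist` as a stand-in for the table; the variable names are illustrative. Answers can differ slightly from table-based ones because Table A rounds z to two decimal places.

```python
from statistics import NormalDist

# NAEP scores are approximately N(288, 38), as stated in the exercises.
mu, sigma = 288, 38
naep = NormalDist(mu, sigma)
standard = NormalDist(0, 1)

# Step 1: standardize the cutoffs.
z_300 = (300 - mu) / sigma
z_350 = (350 - mu) / sigma

# Step 2: find areas. Like Table A, cdf gives the area to the left of z.
p_below_350 = standard.cdf(z_350)
p_at_least_350 = 1 - p_below_350           # the two areas together make up 1
p_300_to_350 = standard.cdf(z_350) - standard.cdf(z_300)

# The same area found without standardizing first.
direct = naep.cdf(350) - naep.cdf(300)

print(f"z for 300: {z_300:.2f}, z for 350: {z_350:.2f}")
print(f"P(X < 350)        = {p_below_350:.4f}")
print(f"P(X >= 350)       = {p_at_least_350:.4f}")
print(f"P(300 < X < 350)  = {p_300_to_350:.4f}")
```

Because the area under the whole curve is 1, the "less than" and "greater than or equal to" proportions are complements, which is the relationship the sketches in Exercise 1.97 are meant to show.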
Beyond the Basics
Density Estimation
A density curve gives a compact summary of the overall shape of a distribution. Many distributions do not have the Normal shape. There are other families of density curves that are used as mathematical models for various distribution shapes. Modern software offers more flexible options. A density estimator does not start with any specific shape, such as the Normal shape. It looks at the data and draws a density curve that describes the overall shape of the data. Density estimators join stemplots and histograms as useful graphical tools for exploratory data analysis.
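The text does not say which estimator software uses. One common approach is kernel density estimation, which places a small bell-shaped bump at each observation and averages the bumps. The sketch below is a minimal version under stated assumptions: the sample is made up, the function name `gaussian_kde` is hypothetical, and the bandwidth comes from a Normal-reference rule of thumb that is only one of several reasonable choices.

```python
import math
import random
from statistics import stdev


def gaussian_kde(data, bandwidth=None):
    """Return a function that gives a kernel density estimate for `data`."""
    n = len(data)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumption, not the only choice.
        bandwidth = 1.06 * stdev(data) * n ** (-1 / 5)
    scale = n * bandwidth * math.sqrt(2 * math.pi)

    def estimate(x):
        # Average of Normal-shaped bumps centered at each observation.
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / scale

    return estimate


# Hypothetical sample with two clusters, made up for illustration.
random.seed(1)
sample = [random.gauss(50, 5) for _ in range(150)] + [random.gauss(80, 4) for _ in range(100)]

density = gaussian_kde(sample)

# Evaluate the estimate on a grid from 20 to 110 in steps of 0.5.
step = 0.5
grid = [20 + step * i for i in range(181)]
heights = [density(x) for x in grid]

# Like any density curve, the estimate should have area close to 1.
area = sum(heights) * step
print(f"approximate area under the estimate: {area:.3f}")
print(f"height near 50: {density(50):.4f}, near 65: {density(65):.4f}, near 80: {density(80):.4f}")
```

The estimate is higher near each cluster than in the gap between them, a two-peaked shape that no single Normal curve could show.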
Density estimates can capture other unusual features of a distribution. Here is an example.