EXAMPLE 5 Shakespeare’s words

Figure 11.4 shows the distribution of lengths of words used in Shakespeare’s plays. This distribution has a single peak and is somewhat skewed to the right. There are many short words (3 and 4 letters) and few very long words (10, 11, or 12 letters), so the right tail of the histogram extends out farther than the left tail. The center of the distribution is about 4. That is, about half of Shakespeare’s words have four or fewer letters. The variability is described by the range of the data: word lengths run from 1 letter to 12 letters.
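The description above (center and variability) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample is a single well-known line, not the data behind Figure 11.4, the helper name `word_lengths` is hypothetical, and counting runs of letters as words is a simplification that splits contractions such as "o'er".

```python
import re
from statistics import median

def word_lengths(text):
    """Return the number of letters in each word of text.

    Assumption: a "word" is any unbroken run of ASCII letters.
    """
    words = re.findall(r"[A-Za-z]+", text)
    return [len(w) for w in words]

# Tiny illustrative sample, far too small to show the real distribution.
sample = "To be, or not to be, that is the question"
lengths = word_lengths(sample)

center = median(lengths)                        # half the words are this long or shorter
shortest, longest = min(lengths), max(lengths)  # variability, described by the range

print(f"median length: {center}, range: {shortest} to {longest}")
```

Run on a full text, the same two summaries give the "center" and "variability" that the example reads off the histogram.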

Notice that the vertical scale in Figure 11.4 is not the count of words but the percentage of all of Shakespeare’s words that have each length. A histogram of percentages rather than counts is convenient because this is a large data set: the counts would be very large numbers, while the percentages are easy to read and to compare with those of other texts. Different kinds of writing have different distributions of word lengths, but all are right-skewed because short words are common and very long words are rare.
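To make the step from counts to percentages concrete, here is a minimal sketch. The function name `percent_by_length` and the ten word lengths are assumptions chosen for illustration; they are not the data from Figure 11.4.

```python
from collections import Counter

def percent_by_length(lengths):
    """Map each word length to the percentage of words that have it.

    Assumes lengths is a non-empty list of positive integers.
    """
    counts = Counter(lengths)
    total = len(lengths)
    return {length: 100 * counts[length] / total for length in sorted(counts)}

# Hypothetical word lengths for a ten-word sample.
lengths = [2, 2, 2, 3, 2, 2, 4, 2, 3, 8]
percents = percent_by_length(lengths)

# Crude text histogram: one '#' for every 5 percentage points.
for length, pct in percents.items():
    print(f"{length:2d} letters | {'#' * round(pct / 5)} {pct:.0f}%")
```

Because the percentages always sum to 100, the shape of the histogram is the same as it would be for counts; only the vertical scale changes, which is what lets texts of very different sizes be compared on one scale.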