To Teachers: About This Book xi
To Students: What Is Statistics? xix
About the Authors xxii
Data Table Index xxiii
Beyond the Basics Index xxiv
PART I Looking at Data
CHAPTER 1 Looking at Data—Distributions 1
Introduction 1
1.1 Data 2
Key characteristics of a data set 4
Section 1.1 Summary 7
Section 1.1 Exercises 7
1.2 Displaying Distributions with Graphs 8
Categorical variables: Bar graphs and pie charts 9
Quantitative variables: Stemplots and histograms 11
Histograms 14
Data analysis in action: Don’t hang up on me 16
Examining distributions 18
Dealing with outliers 19
Time plots 21
Section 1.2 Summary 23
Section 1.2 Exercises 23
1.3 Describing Distributions with Numbers 27
Measuring center: The mean 28
Measuring center: The median 30
Mean versus median 31
Measuring spread: The quartiles 32
The five-
The 1.5 × IQR rule for suspected outliers 35
Measuring spread: The standard deviation 38
Properties of the standard deviation 40
Choosing measures of center and spread 40
Changing the unit of measurement 43
Section 1.3 Summary 46
Section 1.3 Exercises 47
1.4 Density Curves and Normal Distributions 51
Density curves 53
Measuring center and spread for density curves 54
Normal distributions 56
The 68–
Standardizing observations 59
Normal distribution calculations 61
Using the standard Normal table 63
Inverse Normal calculations 64
Normal quantile plots 66
Beyond the Basics: Density estimation 68
Section 1.4 Summary 69
Section 1.4 Exercises 70
Chapter 1 Exercises 74
CHAPTER 2 Looking at Data—Relationships 79
Introduction 79
2.1 Relationships 79
Examining relationships 81
Section 2.1 Summary 84
Section 2.1 Exercises 84
2.2 Scatterplots 85
Interpreting scatterplots 88
The log transformation 91
Adding categorical variables to scatterplots 93
Scatterplot smoothers 94
Categorical explanatory variables 96
Section 2.2 Summary 96
Section 2.2 Exercises 96
2.3 Correlation 100
The correlation r 101
Properties of correlation 102
Section 2.3 Summary 104
Section 2.3 Exercises 105
2.4 Least-
Fitting a line to data 108
Prediction 110
Least-
Interpreting the regression line 113
Facts about least-
Correlation and regression 115
Another view of r² 117
Section 2.4 Summary 117
Section 2.4 Exercises 118
2.5 Cautions about Correlation and Regression 123
Residuals 123
Outliers and influential observations 127
Beware of the lurking variable 129
Beware of correlations based on averaged data 131
Beware of restricted ranges 132
Beyond the Basics: Data mining 132
Section 2.5 Summary 133
Section 2.5 Exercises 133
2.6 Data Analysis for Two-
The two-
Joint distribution 138
Marginal distributions 139
Describing relations in two-
Conditional distributions 140
Simpson’s paradox 143
Section 2.6 Summary 145
Section 2.6 Exercises 146
2.7 The Question of Causation 148
Explaining association 149
Establishing causation 150
Section 2.7 Summary 152
Section 2.7 Exercises 153
Chapter 2 Exercises 154
CHAPTER 3 Producing Data 163
Introduction 163
3.1 Sources of Data 164
Anecdotal data 164
Available data 165
Sample surveys and experiments 167
Section 3.1 Summary 170
Section 3.1 Exercises 170
3.2 Design of Experiments 171
Comparative experiments 173
Randomization 175
Randomized comparative experiments 177
How to randomize 177
Randomization using software 178
Randomization using random digits 179
Cautions about experimentation 181
Matched pairs designs 182
Block designs 183
Section 3.2 Summary 185
Section 3.2 Exercises 186
3.3 Sampling Design 188
Simple random samples 191
How to select a simple random sample 191
Stratified random samples 194
Multistage random samples 195
Cautions about sample surveys 196
Beyond the Basics: Capture-
Section 3.3 Summary 200
Section 3.3 Exercises 200
3.4 Ethics 203
Institutional review boards 204
Informed consent 205
Confidentiality 206
Clinical trials 207
Behavioral and social science experiments 209
Section 3.4 Summary 211
Section 3.4 Exercises 211
Chapter 3 Exercises 212
PART II Probability and Inference
CHAPTER 4 Probability: The Study of Randomness 215
Introduction 215
4.1 Randomness 215
The language of probability 217
Thinking about randomness 218
The uses of probability 219
Section 4.1 Summary 219
Section 4.1 Exercises 220
4.2 Probability Models 220
Sample spaces 221
Probability rules 223
Assigning probabilities: Finite number of outcomes 225
Assigning probabilities: Equally likely outcomes 227
Independence and the multiplication rule 228
Applying the probability rules 231
Section 4.2 Summary 232
Section 4.2 Exercises 232
4.3 Random Variables 235
Discrete random variables 236
Continuous random variables 239
Normal distributions as probability distributions 242
Section 4.3 Summary 243
Section 4.3 Exercises 244
4.4 Means and Variances of Random Variables 246
The mean of a random variable 246
Statistical estimation and the law of large numbers 250
Thinking about the law of large numbers 251
Beyond the Basics: More laws of large numbers 253
Rules for means 253
The variance of a random variable 255
Rules for variances and standard deviations 257
Section 4.4 Summary 261
Section 4.4 Exercises 262
4.5 General Probability Rules 264
General addition rules 264
Conditional probability 267
General multiplication rules 270
Tree diagrams 271
Bayes’s rule 273
Independence again 274
Section 4.5 Summary 274
Section 4.5 Exercises 275
Chapter 4 Exercises 278
CHAPTER 5 Sampling Distributions 281
Introduction 281
5.1 Toward Statistical Inference 282
Sampling variability 283
Sampling distributions 284
Bias and variability 287
Sampling from large populations 289
Why randomize? 290
Section 5.1 Summary 290
Section 5.1 Exercises 291
5.2 The Sampling Distribution of a Sample Mean 293
The mean and standard deviation of x̄ 296
The central limit theorem 298
A few more facts 304
Beyond the Basics: Weibull distributions 305
Section 5.2 Summary 307
Section 5.2 Exercises 307
5.3 Sampling Distributions for Counts and Proportions 310
The binomial distributions for sample counts 312
Binomial distributions in statistical sampling 314
Finding binomial probabilities 315
Binomial mean and standard deviation 317
Sample proportions 319
Normal approximation for counts and proportions 321
The continuity correction 325
Binomial formula 326
The Poisson distributions 328
Section 5.3 Summary 332
Section 5.3 Exercises 333
Chapter 5 Exercises 338
CHAPTER 6 Introduction to Inference 341
Introduction 341
Overview of inference 342
6.1 Estimating with Confidence 343
Statistical confidence 344
Confidence intervals 346
Confidence interval for a population mean 348
How confidence intervals behave 352
Choosing the sample size 353
Some cautions 355
Section 6.1 Summary 356
Section 6.1 Exercises 357
6.2 Tests of Significance 361
The reasoning of significance tests 361
Stating hypotheses 363
Test statistics 364
P-values 365
Statistical significance 367
Tests for a population mean 371
Two-
The P-value versus a statement of significance 377
Section 6.2 Summary 379
Section 6.2 Exercises 379
6.3 Use and Abuse of Tests 384
Choosing a level of significance 384
What statistical significance does not mean 385
Don’t ignore lack of significance 386
Statistical inference is not valid for all sets of data 387
Beware of searching for significance 388
Section 6.3 Summary 389
Section 6.3 Exercises 389
6.4 Power and Inference as a Decision 391
Power 391
Increasing the power 395
Inference as decision 396
Two types of error 396
Error probabilities 397
The common practice of testing hypotheses 399
Section 6.4 Summary 400
Section 6.4 Exercises 400
Chapter 6 Exercises 402
CHAPTER 7 Inference for Means 407
Introduction 407
7.1 Inference for the Mean of a Population 408
The t distributions 408
The one-
The one-
Matched pairs t procedures 419
Robustness of the t procedures 423
Beyond the Basics: The bootstrap 424
Section 7.1 Summary 425
Section 7.1 Exercises 426
7.2 Comparing Two Means 432
The two-
The two-
The two-
The two-
Robustness of the two-
Inference for small samples 444
Software approximation for the degrees of freedom 447
The pooled two-
Section 7.2 Summary 453
Section 7.2 Exercises 454
7.3 Additional Topics on Inference 460
Choosing the sample size 461
Inference for non-
Section 7.3 Summary 474
Section 7.3 Exercises 474
Chapter 7 Exercises 476
CHAPTER 8 Inference for Proportions 483
Introduction 483
8.1 Inference for a Single Proportion 484
Large-
Beyond the Basics: The plus four confidence interval for a single proportion 489
Significance test for a single proportion 491
Choosing a sample size for a confidence interval 494
Choosing a sample size for a significance test 498
Section 8.1 Summary 500
Section 8.1 Exercises 501
8.2 Comparing Two Proportions 505
Large-
Beyond the Basics: The plus four confidence interval for a difference in proportions 509
Significance test for a difference in proportions 511
Choosing a sample size for two sample proportions 514
Beyond the Basics: Relative risk 518
Section 8.2 Summary 519
Section 8.2 Exercises 520
Chapter 8 Exercises 522
PART III Topics in Inference
CHAPTER 9 Inference for Categorical Data 525
Introduction 525
9.1 Inference for Two-
The hypothesis: No association 532
Expected cell counts 533
The chi-
Computations 536
Computing conditional distributions 537
The chi-
Beyond the Basics: Meta-
Section 9.1 Summary 543
Section 9.1 Exercises 544
9.2 Goodness of Fit 545
Section 9.2 Summary 550
Section 9.2 Exercises 550
Chapter 9 Exercises 551
CHAPTER 10 Inference for Regression 555
Introduction 555
10.1 Simple Linear Regression 556
Statistical model for linear regression 556
Preliminary data analysis and inference considerations 558
Estimating the regression parameters 561
Checking model assumptions 565
Confidence intervals and significance tests 567
Confidence intervals for mean response 570
Prediction intervals 572
Transforming variables 574
Beyond the Basics: Nonlinear regression 576
Section 10.1 Summary 577
Section 10.1 Exercises 578
10.2 More Detail about Simple Linear Regression 582
Analysis of variance for regression 583
The ANOVA F test 585
Calculations for regression inference 588
Inference for correlation 593
Section 10.2 Summary 596
Section 10.2 Exercises 597
Chapter 10 Exercises 598
CHAPTER 11 Multiple Regression 607
Introduction 607
11.1 Inference for Multiple Regression 608
Population multiple regression equation 608
Data for multiple regression 609
Multiple linear regression model 610
Estimation of the multiple regression parameters 611
Confidence intervals and significance tests for regression coefficients 612
ANOVA table for multiple regression 613
Squared multiple correlation R² 615
Section 11.1 Summary 615
Section 11.1 Exercises 616
11.2 A Case Study 618
Preliminary analysis 619
Relationships between pairs of variables 620
Regression on high school grades 622
Interpretation of results 624
Examining the residuals 624
Refining the model 625
Regression on SAT scores 626
Regression using all variables 627
Test for a collection of regression coefficients 631
Beyond the Basics: Multiple logistic regression 632
Section 11.2 Summary 633
Section 11.2 Exercises 634
Chapter 11 Exercises 636
CHAPTER 12 One-Way Analysis of Variance 643
Introduction 643
12.1 Inference for One-
Data for one-
Comparing means 645
The two-
An overview of ANOVA 647
The ANOVA model 651
Estimates of population parameters 653
Testing hypotheses in one-
The ANOVA table 658
The F test 660
Software 663
Beyond the Basics: Testing the equality of spread 665
Section 12.1 Summary 666
Section 12.1 Exercises 667
12.2 Comparing the Means 670
Contrasts 670
Multiple comparisons 677
Power 681
Section 12.2 Summary 685
Section 12.2 Exercises 685
Chapter 12 Exercises 687
CHAPTER 13 Two-Way Analysis of Variance 697
Introduction 697
13.1 The Two-
Advantages of two-
The two-
Main effects and interactions 703
13.2 Inference for Two-
The ANOVA table for two-
Chapter 13 Summary 713
Chapter 13 Exercises 714
Companion Chapters
(on the IPS website www.macmillanhighered.com/
CHAPTER 14 Logistic Regression 14-
Introduction 14-
14.1 The Logistic Regression Model 14-
Binomial distributions and odds 14-
Odds for two groups 14-
Model for logistic regression 14-
Fitting and interpreting the logistic regression model 14-
14.2 Inference for Logistic Regression 14-
Confidence intervals and significance tests 14-
Multiple logistic regression 14-
Chapter 14 Summary 14-
Chapter 14 Exercises 14-
Chapter 14 Notes and Data Sources 14-
CHAPTER 15 Nonparametric Tests 15-
Introduction 15-
15.1 The Wilcoxon Rank Sum Test 15-
The rank transformation 15-
The Wilcoxon rank sum test 15-
The Normal approximation 15-
What hypotheses does Wilcoxon test? 15-
Ties 15-
Rank, t, and permutation tests 15-
Section 15.1 Summary 15-
Section 15.1 Exercises 15-
15.2 The Wilcoxon Signed Rank Test 15-
The Normal approximation 15-
Ties 15-
Testing a hypothesis about the median of a distribution 15-
Section 15.2 Summary 15-
Section 15.2 Exercises 15-
15.3 The Kruskal-
Hypotheses and assumptions 15-
The Kruskal-
Section 15.3 Summary 15-
Section 15.3 Exercises 15-
Chapter 15 Exercises 15-
Chapter 15 Notes and Data Sources 15-
CHAPTER 16 Bootstrap Methods and Permutation Tests 16-
Introduction 16-
Software 16-
16.1 The Bootstrap Idea 16-
The big idea: Resampling and the bootstrap distribution 16-
Thinking about the bootstrap idea 16-
Using software 16-
Section 16.1 Summary 16-
Section 16.1 Exercises 16-
16.2 First Steps in Using the Bootstrap 16-
Bootstrap t confidence intervals 16-
Bootstrapping to compare two groups 16-
Beyond the Basics: The bootstrap for a scatterplot smoother 16-
Section 16.2 Summary 16-
Section 16.2 Exercises 16-
16.3 How Accurate Is a Bootstrap Distribution? 16-
Bootstrapping small samples 16-
Bootstrapping a sample median 16-
Section 16.3 Summary 16-
Section 16.3 Exercises 16-
16.4 Bootstrap Confidence Intervals 16-
Bootstrap percentile confidence intervals 16-
A more accurate bootstrap confidence interval: BCa 16-
Confidence intervals for the correlation 16-
Section 16.4 Summary 16-
Section 16.4 Exercises 16-
16.5 Significance Testing Using Permutation Tests 16-
Using software 16-
Permutation tests in practice 16-
Permutation tests in other settings 16-
Section 16.5 Summary 16-
Section 16.5 Exercises 16-
Chapter 16 Exercises 16-
Chapter 16 Notes and Data Sources 16-
CHAPTER 17 Statistics for Quality: Control and Capability 17-
Introduction 17-
Use of data to assess quality 17-
17.1 Processes and Statistical Process Control 17-
Describing processes 17-
Statistical process control 17-
x̄ charts for process monitoring 17-
s charts for process monitoring 17-
Section 17.1 Summary 17-
Section 17.1 Exercises 17-
17.2 Using Control Charts 17-
x̄ and R charts 17-
Additional out-
Setting up control charts 17-
Comments on statistical control 17-
Don’t confuse control with capability! 17-
Section 17.2 Summary 17-
Section 17.2 Exercises 17-
17.3 Process Capability Indexes 17-
The capability indexes Cp and Cpk 17-
Cautions about capability indexes 17-
Section 17.3 Summary 17-
Section 17.3 Exercises 17-
17.4 Control Charts for Sample Proportions 17-
Control limits for p charts 17-
Section 17.4 Summary 17-
Section 17.4 Exercises 17-
Chapter 17 Exercises 17-
Chapter 17 Notes and Data Sources 17-
Tables T-1
Notes and Data Sources N-1
Index I-1