INDEX
- Alternative hypothesis. See Hypothesis, alternative
- Analysis of variance (ANOVA)
- one-way, 712, Chapter 14
- regression, 520–521, 524, 555–557
- two-way, 712, Chapter 15
- verifying the conditions for, 723
- Analysis of variance table
- one-way, 722, 724–726
- regression, 521–522, 524, 555, 561
- two-way, 15-13–15-14
- Anecdotal data, 124, 127
- Anonymity, 162
- Applet
- central limit theorem, 296–297
- confidence intervals, 305–306, 352
- correlation and regression, 79, 88, 94, 97
- law of large numbers, 178, 223
- mean and median, 26, 37
- Normal curve, 47
- One-Variable Statistical Calculator, 13
- One-Way ANOVA, 727, 729, 757
- probability, 175, 178
- simple random sample, 133, 141, 153, 158
- statistical power, 351
- statistical significance, 335–336
- Statistic, 372
- Assignable cause, 598
- Association, 68, 72, 76
- Autocorrelation, 653, 678
- Autocorrelation function, 651–655, 657
- Autoregressive behavior, 647, 657
- Autoregressive model, 681–690
- Available data, 124–125, 127
- Bar graph, 8, 2, 108
- Bayes, Thomas, 203
- Bayes’s rule, 203–205, 206
- Benford’s law, 184–187, 284
- Bias. See also Unbiased estimator
- in a sample, 131, 140, 276–280, 281
- in an experiment, 145, 281
- of a statistic, 276–280, 281
- Binomial coefficient, 251–252, 263
- Binomial distribution. See Distribution, binomial
- Binomial setting, 245, 263, 418, 17-1
- Block, 155
- Block design. See Experiment, block design
- Bonferroni method, 342, 741–744
- Bootstrap, 372–373, 406, 411
- Boxplot, 29, 34
- modified, 30
- side-by-side, 30
- Buffon, Count, 175, 187
- Capability, 619–622
- Capability indices, 622–625, 626
- Capture-recapture sampling, 139
- Case, 2, 6, 534–535, 545
- Categorical variable. See Variable, categorical
- Causation, 101–102, 103
- Cause-and-effect diagrams, 596, 599
- Cell, 457, 15-3
- Census, 26
- Center of a distribution, 24–27
- Centered moving average, 695
- Central limit theorem, 294–299
- Chi-square distribution. See Distribution, chi-square
- Chi-square statistic, 463
- and the z statistic, 465–466
- Classes, 12
- Clinical trial, 163, 167
- Coefficient, 84
- Coin tossing, 174–176, 178, 179–181, 187–188, 209–210, 220, 224, 237, 240
- Collinearity, 570, 583
- Column variable. See Variable, row and column
- Common cause, 598, 599
- Comparative experiment. See Experiment, comparative
- Complement of an event. See Event, complement of
- Conditional distribution. See Distribution, conditional
- Conditional probability. See Probability, conditional
- Confidence interval, 302–314
- and two-sided tests, 329
- behavior, 309–310
- bootstrap, 372–373, 411
- simultaneous, 744
- t for contrast, 737
- t for difference of means, 381–382, 392
- t for matched pairs, 370
- t for mean response in regression, 511, 516, 554
- t for multiple regression coefficients, 551, 561
- t for one mean, 360–362, 374
- t for regression slope, 496, 503
- z for difference of proportions, 437, 448
- z for one mean, 306–309, 313
- z for one proportion, 420, 421–422, 431
- Confidence level, 304, 313
- Confidentiality, 162–163, 167
- Confounding, 143, 157
- Consumer Expenditure Survey, 375
- Continuity correction, 260–261, 264
- Contrasts, 717, 733–739
- Control, statistical, 597
- Control chart
- c chart, 636–637, 638
- I chart, 616–618, 626
- MR chart, 616–618, 626
- p chart, 631–635, 638
- R chart, 603–612, 626
- s chart, 613–614, 626
- three-sigma, 602
- x̄ chart, 602–614, 626
- Control chart constant, 603
- Control group, 145, 157
- Control limits, 602–604, 613, 616, 631, 636
- Convenience sample, 131
- Correlation, 74–78
- and regression, 87–88, 95
- between random variables, 230–231
- cautions about, 98–103
- inference for, 500–502, 503
- population, 500, 503
- squared, 87–88, 520–522
- squared multiple, 558, 562
- Count, 417
- distribution of, 245–247, 263, 268–271, 273
- Countably infinite, 210
- Critical value
- of chi-square distribution, 464, Table F
- of F distribution, 556, Table E
- of standard Normal distribution, 306, Table A
- of t distribution, 360, Table D
- Cumulative probability. See Probability, cumulative
- Cumulative proportion, 11, 46
- Current Population Survey, 135, 137
- Curved relationships, 515, 569–571
- Cyclic component, 647, 657
- Data mining, 102
- Deciles, 59
- Decision, relation to inference, 346–349
- Degrees of freedom, 31
- approximation for, 380, 392
- for one-way ANOVA, 726–727
- for two-way ANOVA, 15-13
- of chi-square distribution, 464
- of chi-square test, 464
- of F distribution, 556–557
- of noncentral F distribution, 745–746
- of noncentral t distribution, 404
- of t distribution, 359, 374
- of regression ANOVA, 522, 524
- of regression , 496, 503
- of regression , 491, 496, 503, 544
- of two-sample t, 380
- Deming, W. Edwards, 592, 628
- Density curve, 38–39, 213–215, 216
- Density estimation, 54–55
- Design of an experiment, 142–156
- Direction, of a relationship, 67, 72. See also Correlation
- Disjoint events. See Event, disjoint
- Distribution, 18, 21
- binomial, 245–250, 263
- and logistic regression, Chapter 17
- formula, 250–252, 263
- Normal approximation, 256–260, 264, 418, 431
- Poisson approximation, 271
- use in the sign test, 407–408
- categorical variable, 3, 7
- chi-square, 464
- conditional, 107, 113, 458
- F, 556–558, 726, 15-14
- joint, 458
- jointly Normal, 502
- marginal, 105, 113
- noncentral F, 745
- noncentral t, 404, 409
- Normal, 42, 56, 215–216
- standard, 45–46, 57
- Poisson, 268–270, 273
- Normal approximation, 270
- probability. See Probability
- quantitative variable, 3, 12
- sampling. See Sampling distribution
- skewed, 18
- symmetric, 18
- t, 358–360, 374
- trimodal, 55
- uniform, 213–214, 218, 221
- Distribution-free procedure. See Nonparametric procedure
- Environmental Protection Agency (EPA), 375
- Equivalence testing, 370–371
- Estimator, 255–256, 277, 279, 281, 292–293, 299, 302
- Ethics, 160–167
- Excel, 2, 85, 134, 148, 177–180, 237, 268–271, 274, 282, 367, 387, 492, 499, 507, 521, 535, 539, 552, 665–667, 670, 676, 695–697, 700, 702, 709, 730, 15-19
- Expected cell count, 462, 470
- Expected value, 220. See also Mean, of a random variable
- Experiment, 127, 142–157, 713
- behavioral and social science, 165–166
- block design, 155–156, 157
- cautions about, 153
- comparative, 145, 157
- completely randomized, 147
- design of, 142–157
- matched pairs design, 154–155, 157
- principles, 151, 157
- randomized comparative, 146
- Experimental units, 142, 157
- Explanatory variable. See Variable, explanatory
- Exploratory data analysis, 7, 20
- Exponential smoothing model, 699–705, 706
- Extrapolation, 98–99, 103
- Event, 181, 191
- complement of, 182, 191, 205–206
- disjoint, 182, 191, 195, 206
- empty, 196
- independent, 188–189, 191, 205, 206
- intersection, 200, 205
- union, 195, 205
- F distribution. See Distribution, F
- F tests. See Significance test
- Factor, experimental, 143, 157, 712, 15-2
- Factorial, 251–252, 263
- False-positive, 340
- Fisher, R. A., 337, 350, 556
- First difference, 659
- Five-number summary, 29, 34
- Flowchart, 594–595, 599
- Forecast, 656, 657, 686
- Form, of a relationship, 67, 72
- Frequency. See Count
- Friedman, Milton, 500
- Galton, Sir Francis, 500
- General Social Survey, 137
- Goodness of fit, 470–475
- Gosset, William S., 359
- Histogram, 12, 21
- Hot hand, 224
- Hypothesis, 319–320, 332
- alternative, 319, 332
- one-sided, 320, 332
- two-sided, 320, 332
- null, 319, 332
- Hypothesis testing. See Significance test
- Independence
- in two-way tables, 467–468
- of events, 188–189, 191, 205, 206
- of observations, 176, 646, 657
- of random variables, 230, 236
- Inference, statistical. See Statistical inference
- Influential observations, 92–94, 95
- Informed consent, 161–162, 167
- Institutional Review Boards (IRB), 160–161, 167
- Instrument, 5
- Interaction, 15-5, 15-7–15-12, 15-13
- terms in multiple regression, 577–580
- Intercept of a line, 83, 94
- Interquartile range, 35
- Intervention, 127
- Irregular component, 647, 657
- JMP, 84, 110, 177–178, 237, 262–263, 272–273, 275, 282, 301, 353, 355, 366, 367, 387, 406, 422, 425, 430, 439, 443, 446, 447, 459, 493, 540, 553, 607, 653–654, 673, 700, 702, 704–705, 724, 732, 747, Chapter 16
- Kerrich, John, 175, 187, 284
- Kruskal-Wallis test, Chapter 16
- Label, 2, 6
- Lag variable, 652, 681
- Lagging, 651
- Large numbers, law of, 222–224
- Leaf, in a stemplot, 16
- Least-significant difference (LSD) method, 741
- Least squares, 486–488, 538–539
- Least-squares line, 81–82, 483
- Level of a factor, 143, 157, Chapter 14
- Linear relationship, 67, 72
- Log transformation, 68, 72, 486, 536, 662–663, 674–678, 679, 683–686, 689–690
- Logistic regression. See Regression, logistic
- Lurking variable. See Variable, lurking
- Main effects, 15-5, 15-7–15-12, 15-13
- Margin of error, 279
- for a difference in two means, 381, 392–393
- for a difference in two proportions, 437, 448
- for regression response, 511
- for regression slope, 496
- for a single mean, 304–312, 313, 360, 374, 398–399
- for a single proportion, 420, 431
- Marginal means, Chapter 15
- Matched pairs design. See Experiment, matched pairs design
- inference for, 368–371, 374
- Mean, 24, 34
- of binomial distribution, 254, 263
- of density curve, 39, 56
- of difference of sample means, 379
- of difference of sample proportions, 436
- of Normal distribution, 43
- of Poisson distribution, 268, 273
- of a random variable, 219–220, 235
- population, 222
- rules for, 224–226, 236
- sample mean, 292–293, 299
- sample proportion, 256, 418, 431
- Mean absolute deviation, 694
- Mean absolute percentage error, 694
- Mean response, estimated, 510–515
- Mean square, 522
- in forecasting, 694
- in multiple linear regression, 555
- in one-way ANOVA, 725–726
- in simple linear regression, 522, 524
- in two-way ANOVA, Chapter 15
- Median, 24, 34
- inference for, 407–408, Chapter 16
- of density curve, 39, 56
- Meta-analysis, 468–469
- Minimum variance portfolio, 234
- Minitab, 85, 177–178, 237, 248, 274, 282, 301, 346, 353, 355, 366, 387, 400, 403, 422, 425, 430, 439, 443, 446, 459, 473, 493, 511, 514, 517, 524, 527, 536, 537, 541, 552, 607, 653–655, 666, 669, 676, 678, 683–686, 688–689, 700, 702, 704–705, 731, 747, 752, Chapter 15, Chapter 16, Chapter 17
- Model, mathematical, 38, 54
- Model building, 566
- Mosaic plot, 109–110, 461
- Moving average model, 691–693, 706
- Moving range, 615
- Multiple comparisons, 717, 740–743
- Multiple regression. See Regression, multiple
- Naive forecast, 660
- National Longitudinal Survey of Youth (NLSY), 485
- National Science Foundation (NSF), 493–494
- Natural Resources Canada, 510
- Neyman, Jerzy, 350
- Noncentrality parameter, 404–405, 409, 745–746
- Nonlinear regression. See Regression, nonlinear
- Nonparametric procedure, 406, Chapter 16
- Nonresponse, 137
- Nonsense correlation, 101
- Normal distribution. See Distribution, Normal
- Normal probability plot. See Normal quantile plot
- Normal quantile plot, 51
- Normal score, 51
- Null hypothesis. See Hypothesis, null
- Observational study, 127
- Odds, 581, Chapter 17
- Office of the Superintendent of Bankruptcy Canada (OSB), 375
- Out-of-control signal, 602, 607, 610–611, 626, 627
- Outliers, 21, 67, 71, 94, 95, 494
- Overdispersion, 262
- Parameter, 276, 281
- Pareto chart, 10, 21, 596, 599
- Partial autocorrelation function, 683, 690
- Pattern of a distribution, 18, 21
- Pearson, Egon, 350
- Pearson, Karl, 175, 187
- Percent, 8, 417
- Percentile, 28
- Pie chart, 9, 21
- Placebo, 145
- Placebo effect, 145
- Poisson setting, 267, 273
- Pooled estimator
- of population proportion, 441
- of variance in ANOVA, 721, Chapter 15
- of variance in two samples, 386–387, 393, 714
- Population, 126, 127, 129, 276, 418
- Population regression equation, 485, 492, 502, 548–549
- Power, 343–345, 350
- and sample size, 345–346
- and Type II error, 349
- for one-way ANOVA, 745–748
- increasing, 345
- of t test
- one-sample, 402–403, 409
- two-sample, 404–406, 409
- of z test, 343–345
- Power curve, 403, 747
- Prediction, 81, 83, 94
- Prediction interval, 510–511, 516, 554
- Probability, 175, 176
- conditional, 197–199, 205
- cumulative, 269
- equally likely outcomes, 186–187
- finite sample space, 184
- Probability distribution, 210, 215, 216
- mean of, 219–220, 235
- standard deviation of, 229, 231, 236
- variance of, 229–231, 236
- Probability histogram, 211, 216
- Probability model, 179, 191
- Probability rules, 182, 191, 206
- addition, 182, 191, 195, 196, 206
- complement, 182, 191, 206
- multiplication, 187–188, 191, 198, 206
- Probability sample. See Sample, probability
- Process, 592–593, 599
- Proportion
- population, 256
- sample, 244
- P-value, 322, 332
- Quartiles, 27–28, 34
- of a density curve, 39, 41
- Quantitative variable. See Variable, quantitative
- Random digits, 132, 140
- Random digit dialing, 138
- Random phenomenon, 175, 176
- Random process, 646
- Random sample. See Sample
- Random variable, 209–210, 216
- continuous, 213, 216
- discrete, 210, 216
- mean of, 219–220, 235–236
- standard deviation of, 229, 235–236
- variance of, 229, 236
- Random walk, 657–659, 663
- Randomness, 174–176
- Randomize, how to, 132–134, 147–149
- Randomized comparative experiment. See Experiment, randomized comparative
- Range statistic, 602
- Rate, 5, 6
- Rational subgroup, 600–601
- Regression, 80–95
- and correlation, 87–88, 95
- cautions about, 98–103
- conditions for, 494–495
- equation, 83, 95, 485
- interaction terms, 577–580, 582
- least-squares, 81–82, 94
- line, 80–81
- logistic, 580–581, Chapter 17
- multiple, Chapter 11
- multicollinearity, 570
- model building, 566
- nonlinear, 515
- polynomial, 570
- simple linear, Chapter 10
- standardized coefficients, 585
- quadratic, 569, 582
- variable selection methods, 577–580
- variance inflation factor (VIF), 570, 583
- with categorical explanatory variables, 571–574
- Regression fallacy, 500
- Relative risk, 447–448
- Replication, in experiments, 151, 157
- Residual, 88–89, 491, 503, 539, 541–543, 720
- distribution, 92
- plots, 90–91, 494, 503, 542–543, 667, 678, 685
- Resistant measure, 25, 26, 32, 35
- Response bias, 138
- Response rate, 130
- Response variable. See Variable, response
- Risk, investment, 33, 225
- Risk pooling, 233
- Robustness, 371–372, 374, 384, 391, 496
- Rounding, 24
- Row variable. See Variable, row and column
- Run, in coin tossing, 224
- Run chart, 595
- Runs rules for control charts, 627
- Runs test for randomness, 648–650, 657
- Sample, 126, 127, 129, 140, 418
- cautions about, 136–138
- convenience, 131
- design of, 129–140
- frame, 129
- multistage, 135, 140
- probability, 134, 140
- simple random, 132, 140
- stratified, 135, 140
- survey, 126, 127, 129
- systematic, 141
- Sample size, choosing
- confidence interval for a difference in proportions, 444–445, 448
- confidence interval for a mean, 311, 313, 398–402
- confidence interval for a proportion, 426–427, 431
- for a desired power, 345–346
- for one-way ANOVA, 745–748
- significance test for a difference in proportions, 445–446, 448
- significance test for a proportion, 429–430, 432
- Sample space, 179, 191
- Sampling, 129–140
- Sampling distribution, 277, 288
- of difference of means, 379–380
- of difference of two proportions, 436
- of one sample statistic, 372
- of regression estimators, 496
- of sample mean, 294, 299
- of sample proportion, 277
- Sampling frame, 129
- Sampling variability, 276–277
- SAS, 388, 525, 540, 568, 570, 572, 574, 575, 576, 577, 579, 730, 745, 15-20, 17-11, 17-16
- Satterthwaite approximation, 380, 392, 398
- Scatterplot, 65–71
- Scatterplot smoothing, 69, 486–487, 537–538
- Seasonal ratio, 697, 706
- Seasonality, 647, 671–676, 679, 693–699
- Seasonally adjusted, 698–699
- Secrist, Horace, 500
- Shape of a distribution, 18, 19
- Shewhart, Walter, 592, 600
- Sign test. See Significance test, sign test
- Significance level, 324, 337–338
- Significance, statistical, 151–152, 316–332
- and practical significance, 337–338, 341
- and Type I error, 349
- Significance test, 316–333
- chi-square for goodness of fit, 472
- chi-square for two-way table, 463–464, 470
- common practice, 349–350
- F test for a collection of regression coefficients, 559, 562
- F test for one-way ANOVA, 726–727
- F test in multiple regression, 556–558, 561–562
- F test in regression, 522, 524
- F test for two-way ANOVA, 15-14, 15-21
- runs test for randomness, 648–650
- sign test, 407–408, 409
- t test for contrast, 737
- t test for correlation, 501, 503
- t test for matched pairs, 368–370
- t test for multiple regression coefficients, 551, 561
- t test for one mean, 362–363, 373–374
- t test for regression slope, 496, 503
- t test for two means, 383, 392
- using, 336–341
- z test for one mean, 327, 333
- z test for one proportion, 423–424, 431
- z test for two means, 380, 392
- z test for two proportions, 441–442, 448
- Simple linear regression. See Regression, simple linear
- Simple random sample. See Sample, simple random
- Simpson’s paradox, 111–112, 113, 121
- Simulation, 277
- Six-sigma quality, 630
- Skewed distribution. See Distribution, skewed
- Slope, 83, 94
- Small numbers, law of, 224
- Smoothing. See Scatterplot smoothing
- Special cause, 598, 599
- Spread of a distribution, 18, 27–28, 31
- Spreadsheet, 3. See also Excel
- SPSS, 366, 388, 460, 473, 734, 735, 739, 742, 15-17, 15-20
- Standard deviation, 31, 34
- of binomial distribution, 254, 263
- of density curve, 41
- of difference of sample means, 380
- of difference of sample proportions, 441, 448
- of Normal distribution, 42
- of Poisson distribution, 268, 273
- of random variable, 229, 236
- of sample mean, 293, 299
- of sample proportion, 256, 263
- pooling, 386–387, 393, 441, 448, 721, 15-6, 15-13
- rules for pooling, 720
- Standard error, 358
- for regression prediction, 519
- of a contrast, 737
- of a difference of sample proportions, 436, 448
- of a sample mean, 358, 373
- of a sample proportion, 419, 431
- of mean regression response, 518
- of regression intercept and slope, 518
- pooled, 389, 393, 721, 15-6, 15-13
- regression, 492, 503, 544, 546, 551, 561
- Standard Normal distribution. See Distribution, Normal
- Standardized observation, 45, 56, 75
- Statistic, 276, 281
- Statistical inference, 275–281, 288
- Statistical process control, 597–599
- Statistical significance. See Significance, statistical
- Stem-and-leaf plot. See Stemplot
- Stemplot, 15–16, 21
- back-to-back, 17
- rounding, 17
- splitting stems, 17
- Strata, 135, 140
- Strength, of a relationship, 67, 72. See also Correlation
- Subgroup, 600
- Subjects, experimental, 142, 157
- Subpopulation, 484, 549
- Sums of squares, 521
- in multiple linear regression, 555
- in one-way ANOVA, 725
- in simple linear regression, 521
- in two-way ANOVA, 15-7, 15-13
- Survey, 126, 129
- Symmetric distribution. See Distribution, symmetric
- t distribution. See Distribution, t
- t confidence intervals. See Confidence interval, t
- t significance tests. See Significance test, t
- Table
- Table A Standard Normal Probabilities, 48–50
- Table B Random digits, 132, 149
- Table C Binomial probabilities, 248
- Table D t distribution critical values, 360
- Table E F critical values, 556–557
- Table F Chi-square distribution critical values, 464
- Table G Critical values of the correlation, 501
- Test of significance. See Significance test
- Test statistic, 321. See also Significance test
- Testing hypotheses. See Significance test
- Time plot, 19–20, 21, 645
- Time series, Chapter 13
- Transformation, 72
- Treatment, experimental, 127, 142–143, 157
- Tree diagram, 201–202
- Trend, 647, 664–670
- exponential, 669–670
- linear, 665–666
- quadratic, 668–669
- Tuskegee Study, 164
- Two-sample problems, 379
- Two-way table
- data analysis for, 104–113
- hypothesis, 462
- inference for, 455–470
- models for, 466–467
- Type I and II error, 346–347
- Unbiased estimator, 279–280, 281
- Uncountably infinite, 213
- Undercoverage, 137
- Unit, of measurement, 3, 75
- United States Department of Agriculture (USDA), 377, 384
- Value, 2
- Variability of a statistic, 279
- Variable, 2, 6
- categorical, 3, 6, 7, 70, 104–113
- dependent, 63
- explanatory, 63, 71
- independent, 63
- indicator, 572–574
- lurking, 100, 103
- quantitative, 3, 6, 12, 15–16
- response, 63, 71
- row and column, 105, 112
- Variable selection methods, 577–580
- Variance, 31, 34. See also Standard deviation
- of random variable, 229, 236
- rules for, 231, 236
- Variation
- between-group, 724, 733
- within-group, 724, 726, 733
- Venn diagram, 182
- Voluntary response, 130–131, 140
- Wilcoxon rank sum test, Chapter 16
- Wilson estimate, 422, 440
- Wording questions, 138
- z confidence interval. See Confidence interval, z
- z-score, 45
- z significance test. See Significance test, z
- Zero inflation, 272