INDEX

Note: Page numbers in boldface type indicate pages where key terms are defined.


Aaron, Hank, 267, 268–269, 270–271, 272–273, 279, 367

Abecedarian Project, 123

acceptance sampling, 558–560

accuracy

of data, 12

of measurement, 171–174

achievement tests, 169

ACT test. See American College Testing (ACT) test

Adams, Evelyn Marie, 411

Advanced Placement (AP) exams, 169

advertising, gender and, 128

age, height and, 345

aging population statistics, 193, 244, 247, 253–254

alcohol consumption, facial attractiveness and, 141

alternative hypothesis, 525, 525–527, 557–558

American College Testing (ACT) test, 304, 305

American Community Survey (ACS), 12, 68

American Football League (AFL), 339

American Medical Association, 39

American Psychological Association, Ethical Principles of, 152

American Time Use Survey (ATUS), 533

anger, heart disease and, 581–583

anonymity, 145

antidepressants, placebo vs., 551–552

approximate level C confidence interval, 504, 508

approximately Normal, 496, 506

Archaeopteryx fossils, classifying, 321–322, 324, 340–341, 343–344, 347

arithmetic average, 281

Arizona State University, 50

Armed Forces Qualifying Test, 224

asbestos in schools, 417–418

aspirin and reduction of heart attacks, 149

astragali, 408–409

astrological sign, health and, 556–557

atomic clock, 175, 176

authoritarian personality, 177–178

auto manufacturer loans, 188–189

average, 175, 175–176


back-to-back stemplot, 265

Bailar, John, 142

banks, big data and, 250

bar fights, 69

bar graphs, 218–222, 219, 246, 293

baseball

ballpark beer and hot dog prices, 348, 352

divorce and game attendance with spouse, 557

home run statistics, 267–269, 270–271, 272–273, 274, 277–278, 279

player salaries, 367, 372–373

probability of appearing in World Series, 415

.300 hitters, 452

base period, 368, 368–369

Basic and Applied Social Psychology (BASP), 555

basketball

field-goal shots, 473

player salaries, 281–282

run of baskets in, 410

Bayes, Thomas, 417

Bayes’s procedure, 417

Bayes’s theorem, 417

Beardstown Ladies’ Common-Sense Investment Guide, 193

beer prices at the ballpark, 348, 352

Behavioral Risk Factor Surveillance System (BRFSS), 493–494

behavioral science experiments, 151–153

bell curve, 300

Benford’s law, 488

Berra, Yogi, 7

bias

big data and, 354

nonadherers and, 121

reducing, 41, 43, 43–45, 172, 173–176

social desirability, 67

biased sampling, 21–24

Big Bang, 317

big data, 24, 250

correlation, prediction and, 353–355

block design, 127, 127–129

blood pressure of executives, 509

body mass index (BMI), of mothers and daughters, 349–351

body temperature, 293–296

Bonds, Barry, 267–269, 271, 272–273, 274, 277–278, 367

bone marrow treatment (BMT), 151

boxplots, 272–275, 273

Bradley, Tom, 67

Bradley effect, 67

brain size, intelligence and, 163, 174, 319, 322, 327

Broca, Paul, 174

Buffon, Count, 407, 416, 526–528, 552–554

bullying, depression and, 104–105

Bureau of Economic Analysis, 377

Bureau of Justice Statistics, 377

Bureau of Labor Statistics (BLS), 166, 167, 176, 377, 378

Consumer Price Index and, 368, 374–377, 379–380

seasonal adjustments, 225

unemployment rate, 166, 176

use of Internet surveys, 78

burger joints, most popular, 174

burglaries during summer, 192–193

Burt, Cyril, 201

buying power, adjusting for changes in, 367, 371–373, 380


Cadillac brand, 189–190

caffeine dependence, 117

calculator, finding mean and standard deviation on, 277, 279

call-in opinion polls, 21–22, 24

cancer clusters, 412–413

car accidents, risk of, 417–418

car sales, 189–190

Carter, Jimmy, 341–342

categorical variables, 4, 218, 220–221, 223, 231, 243

causation, 348–355

evidence for, 352–353

cause, chance and, 412

cause and effect

direct, 350

experiments and, 13–14

cause-and-effect questions, 104

cell counts, 581

cell phones, telephone surveys and, 76

censuses, 11, 11–12. See also U.S. Census Bureau

center

of density curve, 295–299

of distribution, 248

Centers for Disease Control and Prevention (CDC), 39, 353, 354, 493

central limit theorem, 507, 507–508

cereals, fiber content of, 243

chance, 403, 405–420

ancient history of, 408–409

myths about, 409–415

probability and (See probability(ies))

chart junk, 230

cheating on exams, 533

check fraud, 525

children

number of related in American households, 469

probability of sex of, 414, 452–453, 472

chi-square distributions, 578, 578–579

chi-square statistic, 577, 580

chi-square test, 577–581, 579

using, 581–583

chosen in stages, 73

cigarette smoking, lung cancer and, 348, 352–353

classes, 245–246

Cleveland, William, 246

Cleveland Cavaliers, 281–282

climate change, 93

clinical trials, 97, 102, 120

data ethics and, 147–151

measurement and, 164

minorities in, 120

of Orlistat, 121–122

patient treatment in, 120–122

Clinton, Bill, 148

Clinton, Hillary, 67

clusters, 73

cocaine addiction treatment, 574–576, 577–580

coffee

brewing methods, 126

preference for fresh-brewed, 522–524, 525, 530

cohabitation, 226–227

coincidence, myth of surprising, 411

coin tosses, 405–407, 410, 413, 414–415, 416, 450–451, 526–528, 552–553

College Board, 169, 170, 315

colleges and universities

academic rank and gender, 571

average SAT scores of entering, 188

decline in students’ face-to-face interactions, 521

measuring readiness for, 164–165, 166, 171

race and graduation rates, 572

rankings of, 187

rise in women with degrees, 230

sample surveys and, 379–380

SAT scores and college grades, 351

tuition and fees in Illinois, 248–249, 254–255

column variable, 572

common response, 349, 350–351, 352

comparative studies, 104, 101–106

completely randomized experiment, 124

computation errors, 191–194

computer-assisted interviewing, 66

computers, privacy and confidentiality of data and, 146

confidence

in polls, 80–81

in sample, 30–31

confidence intervals, 493–509

advantages of, 554

estimation and, 494–499

level C, 500, 504, 508

for population mean, 508–509

for population proportion, 502–505, 504

sampling distribution of sample mean and, 505–508

statistical inference and, 548–549, 554

confidence level C, 500, 508

confidence statements, 47–49, 48

confidentiality, 142, 145–147

confounded variables, 96

confounding, 350–351, 352

matching and, 104–105

consistency of data, 188–190

Consumer Expenditure Survey, 374

Consumer Price Index (CPI), 368–377

understanding, 374–377

using, 370–374

Consumer Reports, 127

control

block design and, 127

placebo, 149

control group, 99, 99–100

convenience sampling, 22, 23–24

Coordinated Universal Time, 175

correlation, 323, 323–328

big data and, 353–355

causation and, 352–355

ecological, 338

independence and, 451

nonsense, 349

regression and, 345–348

square of the, 346, 347

cost of living, CPI and, 375–377

counts, 4, 168, 217, 295

cell, 581

data tables and, 245

expected, 575, 576

Crested Butte (Colorado), 188

crime, gun control and, 351–352

critical values, 503, 503–504

Crohn’s disease, 96–97

cth percentile, 304

Current Population Survey (CPS), 9–10, 11, 379

on cohabitation, 226

reduction of bias and, 176

sample design for, 73

on top causes of death, 215–216

unemployment rate and, 166

Curry, Stephen, 473


data, 1–15

accuracy of, 12

big, 24, 250, 353–355

in censuses, 11–12

computation errors and, 191–194

consistency of, 188–190

excessive precision or regularity of, 191

in experiments, 12–14

falsification of, 190, 191

hidden agendas influencing, 194–196

incomplete information about, 187–188

individuals and variables and, 4–6

in observational studies, 7–8

organizing, 213

ownership of published, 148

plausibility of, 190–191

privacy and confidentiality of, 146

quality of, 1

in sample surveys, 8–11

statistical inference and, 548

uses of, 1

data ethics, 141–154

behavioral and social science experiments and, 151–153

clinical trials and, 147–151

confidentiality and, 142, 145–147

informed consent and, 142, 144–145

institutional review boards and, 142, 143

data production design, 548

data source, in tables, 216

data tables, 215–218

day care effects, 123

death, causes of, 215, 216

decision, inference as, 557–562

decision rule, 559

decision theory, 560, 561

Declaration of Helsinki, 143, 148

degrees of freedom, 579

density curves, 295, 295–296, 433, 579

histograms compared with, 295–296

median and mean of, 296–298, 298

Normal (See Normal distributions)

dependent variables, 95

depression, history of bullying and, 104–105

deviations, 224, 247

in scatterplot, 320

dice rolls, 408, 413, 430–431

digits

really random, 447

simulation and, 447, 449–450

direct causation, 350, 352

direction of scatterplot, 320

Dirksen, Everett, 345

discrimination in mortgage lending, 585–586

distributions, 217, 267–285

boxplots and, 272–276, 273

centers of, 248

chi-square, 578, 578–579

five-number summary of, 272, 272–276

mean of, 277, 277–281

median of, 268, 268–272, 269

normal (See Normal distributions)

numerical descriptions of, 281–282

overall pattern of, 224, 247

quartiles of, 268, 268–272

sampling (See sampling distributions)

shape of, 248

skewed to the left, 249, 250–251

skewed to the right, 249, 251–252

standard deviation of, 277, 277–281

symmetric, 248, 249, 251, 252–253, 282

variability of, 248

variance of, 277

domestic violence experiments, 153

double-blind experiments, 97, 118–120, 120

driver fatigue, 168

dropouts from research studies, 121, 121–122

dying, probability of, 408

Dyson vacuum cleaners, 222–223


earnings. See income

ecological correlation, 338

Edmonton Oilers, 170

education. See also colleges and universities

earnings and, 267

grades and video-gaming, 576, 578, 580–581

level attained by adults, 216–217, 218–219

unemployment by level of, 223–225

Einstein, Albert, 408

elderly people in population, 193, 244, 247, 253–254

election polls, 49, 50–51

elections

predicting states’ votes in, 341–342

vote counting and, 347

Electronic Encyclopedia of Statistical Examples and Exercises (EESEE), 117

energy conservation, 100

equations, regression, 342–344

errors. See also bias

computation, 191–194

margin of (See confidence statements; margin of error)

measurement, 172

nonsampling, 64, 66–70, 72

processing, 66, 67

random, 64, 172

response, 66, 66–67

roundoff, 217–218

sampling, 64, 64–65

standard, 496, 496–497, 506

Type I, 560

Type II, 560

estimation

confidence levels and, 494–499

using samples, 40–41, 43

Euclid, 562

evaluation of poll results, 80–81

event, 428

exclusive classes, 244

exercise, weight loss vs., 549

exhaustive classes, 244

exit polls, 50, 50–51

expected counts, 575, 576

expected values, 465–474, 467

finding by simulation, 472–473

law of large numbers and, 469–470

winning systems for gambling and, 471–472

experimental design, 117–118

block, 127, 127–129

completely randomized, 124

logic of, 101–103

matched pairs, 126, 126–127

one-track, 97

randomization in, 101–103

in the real world, 124–126

experiments, 12–14, 13, 93–108

double-blind, 97, 118–120, 120

ethics and (See data ethics)

generalization and, 122–124

nonresponse and, 120–122

observational studies vs., 93–94, 104–106

poorly conducted, 95–98

randomized comparative, 98–100, 99

statistical significance and, 103, 103–104

vocabulary of, 93–95

explanatory variables, 94, 94–95, 317, 318, 325

extrapolation, 345


Facebook, 353

facial attractiveness, alcohol consumption and, 141

falsification of data, 190, 191

Fatality Analysis Reporting System, 165, 167–168

FDA (Food and Drug Administration), 130

Fermat, Pierre de, 409

fiber content of cereals, 243

first quartile Q1, 270, 270–271

Fisher, Ronald A., 529, 558

five-number summary, 272, 272–276, 282, 283

5% significance level, 555–556

fixed market basket price indexes, 369–370, 370, 375

flu trends, 353, 354

food stamp participation, 225–226

football

Pick 4 lottery and, 411

probability of winning Super Bowl, 339, 427, 428, 431

Forbes magazine, 195–196

Ford Motor Company, 189–190

form of scatterplot, 320, 321

fossils, classifying, 321–322, 324, 340–341, 343–344, 347

Fox & Friends, 192

fruit and vegetable intake, 493–494

frustration study, 122

F-scale, 177–178


Gallup Polls, 9

on amount of federal income tax paid, 71

on vaccinations and autism, 30–31, 39–40, 40–41, 45–46

on voting and abortion issue, 47, 49

weighting responses, 72

Well-Being Index, 76

World Poll, 76

Galton, Francis, 300, 344

gambling

ancient history of, 408–409

expected values and, 465–474, 467

legalized, 465, 471

slot machines, 470

teen approval of, 434

winning systems for, 471–472

games of chance, 408–409

gasoline price index number, 368

Gauss, Carl Friedrich, 300

GDP, life expectancy and, 318–319, 320

gender

academic rank and, 571

advertising and, 128

probability of sex of children, 414, 452–453, 472

SAT exam and, 169–171

generalization, from experiments, 122–124

General Motors, 189–190

General Social Survey (GSS), 10, 68, 379–380

Global Positioning System, 174

global warming statistics, 192

Gnedenko, B. V., 428

Goodall, Jane, 7, 12

Google, 250, 353–354

government. See also U.S. Census Bureau

Consumer Price Index and, 368–377

databases maintained by, 146

statistics used by, 377–379

tax revenue breakdown, 221–222

grades, video-gaming and, 576, 578, 580–581

graduation rates, race and, 572

Graphic, Visualization, and Usability Center (GVU), 77

graphs, 215–233

bar, 218–222, 219, 293

constructing effective, 229–231

data tables and, 215–218

histograms, 243, 243–253, 293

line, 223, 223–226

pictograms, 222, 222–223

pie charts, 218, 218–222

scales in, 226–229

stemplots, 253, 253–255

variables and, 217–218

Greenspan, Alan, 376

Gretzky, Wayne, 170

gun control, crime and, 351–352

Gut (journal), 96–97


haphazard, 406–407

Harris Poll Online, 77, 78–79

Hawthorne effect, 128

health, astrological sign and, 556–557

heart attacks, aspirin and, 149

heart disease

anger and incidence of, 581–583

incidence in women, 195

sex bias in treating, 104, 105–106

height

age and, 345

of children vs. parents, 344

heart attack risk and, 315

height distribution, 252–253, 301–302

Helsinki Declaration, 143, 148

Hennekens, Charles, 149

hidden agendas, influencing data, 194–196

Higher Education Research Institute, 521

highway safety, 165, 167–168

Hill, Theodore P., 488

histograms, 243, 243–246, 293

interpreting, 247–253

home run statistics, 267–269, 270–271, 272–273, 274, 277–278, 279, 367

honesty, 145

horse racing

payoff odds, 470

starting position in, 445

hot dog prices, 348, 352

Hubble, Edwin, 317

Hubble’s law, 317

human subjects research. See data ethics

Humphries, Robert, 412

Hurricane Katrina, 190–191

hydroxyurea for sickle-cell patients, 98–99, 103–104

hypotheses, 524–528

alternative, 525, 525–527, 557–558

null, 525, 526, 549, 555, 557, 558–559

testing, 561


incoherent, 431

income

education level and, 267

mean, 282

median annual, 373–374

income distribution, 282, 300

income inequality, 195–196, 269–270, 271–272, 274–276

incomplete information about data, 187–188

independence, 448, 448–452

independent trials, 448

independent variables, 95

index numbers, 368, 368–369

individuals, 4, 4–6

inference. See statistical inference

inflation, 374, 376

informed consent, 142, 144–145, 152

insect repellant effectiveness, 127

Inside Higher Education, 521

institutional review boards (IRBs), 142, 143

instrument, 164

intelligence

brain size and, 163, 174, 319, 322, 327

measurement of, 169, 300, 535

intercept, 343

International Bureau of Weights and Measures (BIPM), 175

International Committee for Weights and Measures, 174–175

Internet surveys, 76–79

InterSurvey, 78

investment returns, 279–280

IQ tests, 169, 300, 535


JAMA (Journal of the American Medical Association), 142–143

James, LeBron, 281–282

Johansson, Mattias Petter, 25


Kerrich, John, 407, 416

kidney transplant, 453–455

knowledge retention, improving, 101


labels, on tables, 216, 229

Lake Murray (South Carolina), elevation levels in, 250–251

Landers, Ann, 23, 30

Landrieu, Mary, 190

Landsberger, Henry A., 128

large populations, sampling from, 49–51

law of averages, 414–415

myth of, 413–414

law of large numbers, 414, 469, 469–470

leaf, 253

Leap Day births, 405

least-squares regression, 346–347

least-squares regression lines, 342, 346

legalization of marijuana, 3, 11, 21

legalized gambling, 465, 471

legends, 229

leukemia, power lines and, 7–8

level C confidence interval, 500, 504, 508

level of confidence, 48

Lewis, C. S., 414

life expectancy

GDP and, 318–319, 320

television set ownership and, 348–349

Lincoln brand, 189–190

line graphs, 223, 223–226

lists, 315

logic, of experimental design, 101–103

Lott, John, 351–352

lotteries

expected values and, 465–468

rigging of, 468

winning, 411–412

Love, Kevin, 281

low-fat food labels, obesity and, 125

lung cancer, cigarette smoking and, 348, 352–353

lurking variables, 96, 101, 102, 585, 586


Major League Baseball. See baseball

mall interviews, 23, 30

margin of error, 45–47, 48, 49, 499–500

sample survey and, 69

marijuana, legalization of, 3, 11, 21

marital status of young women, 427–428, 429

market basket, 369–370, 374–375

market research, 10

Marks, Bruce, 347

Mars Climate Orbiter, 164

matched pairs design, 126, 126–127

matching, 104, 104–105

McNamara, John, 189

mean, 277, 277–281, 281–282

of density curve, 296–298, 298

population, 508–509, 531–535

regression and, 344

sample, 300, 505–508, 506

of sampling distribution, 496

measles outbreaks, 39

measurement, 163–180, 164

accuracy of, 171–174

defining variables and, 164–166

errors in, 172 (See also bias)

in psychology, 176–178

reliability of, 173–176

validity of, 166–171, 167

median, 268, 268–272, 269, 282

of density curve, 296–298, 298

medical helicopters, 584–585

melon field infestation, 191

meta-analysis, 124

Meyer, Eric, 188

midpoint, 248, 281

miles per gallon, 320

minorities, underrepresentation in clinical trials, 120

Misterpoll.com, 77

MLive poll on legalization of marijuana, 3, 21

Mondale, Walter, 341

mortgage lending, discrimination in, 585–586

mountain man price index, 369–370

multiple-choice exams, 129, 533

mutual funds, 547

myths about chance behavior, 409–415


NASDAQ composite stock index, 193–194

National Assessment of Educational Progress (NAEP) scores, 509

National Cancer Institute, 413

National Center for Health Statistics (NCHS), 215, 377, 408

national deficit, predicting, 345

National Football League (NFL), 339

National Health Survey, 66

National Hockey League (NHL), 170

National Household Survey, 12

National Institute of Standards and Technology (NIST), 175, 176

National Opinion Research Center (NORC), 10, 379

natural supplements, placebo effect and, 130

negative association, 320, 321, 325

New England Journal of Medicine, 142, 150

New England Patriots, 427, 431

New York

mean income per person, 282

telephone surveys and, 75

Neyman, Jerzy, 561

Neyman-Pearson theory, 561

Nielsen Media Research, 10

Nielsen TV ratings, 10, 78

95% confidence interval, 495, 501–503

95% confident, 45, 47, 499, 499–500

99% confidence interval, 504–505

nonadherers in research studies, 121

nonrandom samples, inference based on, 535

nonresponse, 67, 67–68

in experiments, 120–122

Internet surveys and, 76–77, 78

nonsampling errors, 64, 66–70, 72

nonsense correlations, 349

Normal curve, 293, 299, 433–435

Normal distributions, 293–308

critical values of, 503–504

density curves and, 295, 295–300

percentiles of, 304, 304–306

68–95–99.7 rule and, 300–302

standard scores and, 302–304, 303

Normal percentiles, 434–435

Nova Southeastern University, 94, 95, 96

(n + 1)/2 rule, 271

null hypothesis, 525, 526, 549, 555–556, 557, 558–559

null hypothesis significance testing procedures (NHSTP), 555

numerical descriptions, choosing, 281–283

numerical variables, 4, 164


Obama, Barack, 49, 67

obesity

low-fat food labels and, 125

in mothers and daughters, 349–351

Orlistat study, 121–122

observational studies, 7–8, 8, 13–14, 104–106

experiments vs., 93–94, 104–106

odds, 418, 431

One Million Random Digits, 447

one-sided alternative, 527, 527–528

one-track experimental design, 97

online learning, 94

online social media, 521

opinion polls, 9, 63

accurate information about samples and, 63, 64–65, 72

call-in, 21–22

write-in, 22–24

Orlistat study, 121–122

outliers, 245, 247

correlation and regression and, 346

population mean and, 508

scatterplots and, 320, 326

standard deviation and, 282–283

overall pattern, 224, 247, 320


parameters, 40, 494

pari-mutuel system, 470

Pascal, Blaise, 409

Pearson, Egon S., 561

Pearson, Karl, 407, 416

percentage change, 194

percentages, 4

error and, 191–194

two-way tables and, 573

percentiles of Normal distributions, 304, 304–306

personal probabilities, 415–417, 416, 431

personal space experiment, 152

Pew Research Center for the People and the Press, 63

Pew Research Center polls

on legalization of marijuana, 11

nonresponse and, 68–69

on right to subpoena phone records, 70–71

use of Internet surveys, 78

Pew Research Internet Project, 354

Pick 4 lottery, 411

pictograms, 222, 222–223, 229

pie charts, 218, 218–222

pig whipworms, 96–97

placebo effect, 77, 97, 97–99, 101–102, 118–120, 130, 149–150, 551–552

plausibility of data, 190–191

Playfair, William, 293

playlist “shuffle” feature, 25

plus four estimate, 519

Point of Purchase Survey, 375

political influence, government statistics and, 378

polls. See also Gallup Polls; opinion polls

election, 47, 49, 50–51

public opinion, 9

telephone, 63, 65, 68–69

population(s), 9, 40, 40–41

elderly people in, 193, 244, 247, 253–254

sampling from large, 49–51

population mean

confidence intervals for, 508–509

significance tests for, 531–535

population proportion

confidence intervals for, 494–499, 495

estimation from sample proportion, 45

positive association, 320, 321, 325

power lines, leukemia and, 7–8

precipitation rates, 346

precision, excessive, of data, 191

prediction

big data and, 353–355

regression and, 344–345

of states’ votes in elections, 341–342

predictive validity, 170, 170–171

preelection polls, 50

pregnancies, length of human, 531–533

price indexes

fixed market basket, 369–370, 370, 375

index number and, 368, 368–369

primary sampling units (PSUs), 73

privacy, 146, 147

probability(ies), 403, 405–408, 407

of dying, 408

odds and, 431

personal, 415–417, 416, 431

of rain, 413

randomness and, 406–407

risk and, 417–418

simulation and, 446–447

probability models, 427–436, 428

rules and, 429–431

for sampling, 432–435

simulation and, 446–447

probability samples, 79, 79–80

probability theory, 409

processing errors, 66, 67

professional athletes’ salaries, 281–282, 367, 372–373

ProFunds Internet Inv Fund, 547

proportion, 295

sample, 494, 494–496, 495

pseudo-random numbers, 28

psychology, measurement and, 176–178

public opinion polls. See opinion polls

P-values, 524–528, 526, 530, 532, 534, 549

calculating, 529–531

naked, 553


quantitative variables, 4, 218, 243

quartiles, 268, 268–272

calculating, 270–272

questions, wording of, 69, 70–72


race and ethnicity

census form categories, 5–6, 12

discrimination in mortgage lending, 585–586

elections and, 67

graduation rates and, 572

nonresponse and, 70

radio format, most popular, 1

rain, probability of, 413

RAND Corporation, 447

random, meaning of, in statistics, 406–407, 407

random digits, 26–30

table of, 27, 27–28

random drawings, 30

random error, 172

randomization in experimental design, 101–103

randomized comparative experiments, 98–100, 99

random samples

simple (See simple random samples (SRSs))

stratified, 73, 73–76, 79–80

systematic, 90

random sampling error, 64

Rasmussen Report Poll, 192

rates, 168, 217

error and, 191

Reagan, Ronald, 341–342

really random digits, 447

real-world sample design, 63, 72–76

reasoning of tests of significance, 522–524

recession velocity, 317–318

recycling, 5

Rees, Martin, 551

refusals, 120–122

regression

correlation and, 345–348

prediction using, 344–345

toward the mean, 344

regression equations, 342–344

regression lines, 340, 340–342, 343

least-squares, 342

regularity

chance and, 409–411

excessive, of data, 191

reliability, 172, 173–176

averages and, 175–176

reporter phone records, right to subpoena, 70–71

Research Randomizer, 27, 30

response error, 66, 66–67

responses, 8

weighting, 72, 79

response variables, 94, 94–95, 317, 325

returns on investments, 279–280

risk

height and heart attack, 315

probability and, 417–418

return on investments and, 279–280

Romney, Mitt, 49

roundoff errors, 217–218

row variables, 572

rules, probability, 429–431

Ruth, Babe, 271, 273, 367


sales tax, 221

sample(s), 9, 9–11, 21–32, 39–52, 40

accuracy of data produced by, 12

confidence statements and, 47–49, 48

estimation using, 40–41, 43

margin of error and, 45–47

population size and, 49–51

size of, 41, 44–45, 47

statistics describing, 40–41

stratified, 129

variability of, 41–45

voluntary response, 22, 22–24

sample errors, random, 64

sample means, 300

sampling distribution of, 505–508, 506

sample proportion, 300, 494, 494–496, 495

sample surveys, 8, 8–11, 48, 63–82

evaluating poll results and, 80–81

Internet surveys and, 76–79

nonsampling errors and, 64, 66–70, 72

probability samples and, 79–80

real-world sample design and, 63, 72–76

sampling errors and, 64, 64–65

university, 379–380

wording of questions and, 69, 70–72

sampling

acceptance, 558–560

biased, 21–24

confidence in, 30–31

convenience, 22, 23–24

from large populations, 49–51

probability models for, 432–435

random (See random samples; simple random samples (SRSs))

sampling distributions, 432–435, 433, 495, 495–497, 526–527, 578

of sample mean, 505–508

standard deviation of, 506

sampling errors, 64, 64–65

sampling frame, 64, 64–65, 79

SAT exams

average scores of entering students, 188

college grades and, 351

as college readiness measure, 164–165, 166, 171

gender gap and, 169–171

percentiles for, 305–306

ranking states and, 315

standard scores and, 302–304

Vietnam effect and, 224

scales, 226–229

scatterplots, 317–328, 318

independence and, 451

schools, asbestos in, 417–418

Science, 191, 193

seasonally adjusted, 225

seasonal variation, 225, 247

second, defined, 175

several-variable data, 315

sex. See gender; women

sexual assault resistance program effects, 94–95

Shakespeare, length of words in, 251, 252

shape of distribution, 248

sickle-cell anemia treatment, 98–99, 103–104

sigma (σ), 505–506

significance, tests of. See tests of significance

significance level, 528, 528–529

Simple Random Sample applet, 75

simple random samples (SRSs), 24–30, 25

choosing in two steps, 29–30

random digits and, 26–30

stratified random samples vs., 73–76

of telephone numbers, 75–76

variability and, 42

Simpson’s paradox, 584–586, 585

simulation, 446, 445–456

finding expected values by, 472–473

independence and, 448–452

probability models and, 446–447

68–95–99.7 rule, 300, 300–302

skewed to the left distributions, 249, 250–251, 283

skewed to the right distributions, 249, 251–252, 283

slope of line, 343

slot machines, 470

Slutsky, Robert, 191

smoking, 499

snowfall amounts, 188

social desirability bias, 67

socializing, decline in face-to-face, 521

social media, online, 521

social science experiments, 151–153

Social Security Administration, privacy policy, 147

social statistics, 379–380

soda consumption, 494–495, 496–497, 499, 500–501, 504–505

software

choosing simple random sample using, 26

Normal curve and, 293

statistical test, 526

Spielberger Trait Anger Scale test, 581

Spotify, 25

square of the correlation, 346, 347

standard deviation, 277, 277–281

Normal curves and, 298

outliers and, 282–283

properties of, 279

of sampling distribution, 496, 506

standard error, 496, 496–497, 506

Standard & Poor’s 500 index, 227–228

standard scores, 302–304, 303

starting value, 194

statistic(s), 4, 40, 494

causation and, 348–352

government, 377–379

social, 379–380

test, 530

Statistical Abstract of the United States, 190, 193, 215, 371

statistical inference, 491, 494, 494–495, 547–563

confidence intervals and, 548–549, 554

data and, 548

as decision, 557–562

limitations of tests and, 550–554

meaning of, 549–550

requirements for, 550

statistical significance and, 549–550, 555–557

for two-way table, 574–576

wise use of, 547–550

statistically significant, 103, 103–104, 122

statistical inference and, 549–550, 555–557

statistical significance at level α, 528–529, 529

Statistics Canada, 377

stem, 253

stemplots, 253, 253–255

back-to-back, 265

Stinson, William, 347

stock prices, 227–229

strata, 73

stratified random samples, 73, 73–76, 79–80, 129

strength of relationship in scatterplot, 320, 321

subjects, 94, 94–95

treatment of medical, 123–124

substitute other households, 72

Sullivan, Robert, 199

Super Bowl, probability of winning, 339, 427, 428, 431

Super Bowl Indicator, 339

Supplemental Nutrition Assistance Program (SNAP), 225–226

surgery, sham, 150

surveys

Internet, 76–79

sample, 8, 8–11

symmetric distributions, 248, 249, 251, 252–253, 282

symmetry

of density curve, 297–298

of Normal curve, 298

systematic random sample, 90


tables

data, 215–218

of random digits, 27, 27–28

three-way, 584–585

two-way, 572, 572–576

taxation

international comparison of, 220, 221–222, 230–231

sales tax, 221

teaching assistants, evaluating, 30

telemarketer’s pause, 48

telephone samples, 75–76

television ratings, 10, 78

television set ownership, life expectancy and, 348–349

testing hypotheses, 561

tests of significance, 521–537, 522

hypotheses and, 524–528

limitations of, 550–554

for population means, 531–535

P-values and, 524–528, 526, 529–531

reasoning of, 522–524

searching for significance and, 555–556

statistical significance and, 528–529

test statistic, 530

text messages sent, 293–294

The Theory of Probability (Gnedenko), 428

third quartile Q3, 270, 270–271

three-way tables, 584–585

time spent eating, 533–535

Town Talk call-in opinion poll, 21–22, 30

treatment, 94, 94–95

tree diagram, 453, 453–454

trend, 224, 247

Tri-State Daily Numbers, 466–468

Tuskegee syphilis study, 148

Tversky, Amos, 476, 489

Twitter, 354

two-sided alternative, 527, 527–528

two-way tables, 572, 572–576

Type I error, 560

Type II error, 560


undercoverage, 64, 65

Internet surveys and, 76–77, 78

unemployment

education level and, 223–225

measuring, 166, 167, 175

units, 164

university sample surveys, 379–380

unmarried couples living together, 226–227

Urban Institute, 13

U.S. Census Bureau, 377, 378

American Community Survey (ACS), 68

income inequality data, 269–270, 271–272, 274–276

income statistics, 196

racial categories, 5–6, 12

voluntary response issue and, 78

U.S. Geological Survey (USGS), 251

U.S. News & World Report, 187

utilities, 560


vaccinations, autism and, 30–31, 39–40, 40–41, 45–46

validity

measurement and, 166–171, 167

predictive, 170, 170–171

Van Buren, Abigail, 23

variability, 43, 43–44

of density curve, 296–298

of distribution, 248

reducing, 44–45

sampling, 41–45

standard deviation and, 279

variables, 4, 4–6

categorical, 4, 218, 243

column, 572

confounded, 96

dependent, 95

explanatory, 94, 94–95, 317, 318, 325

independent, 95

lurking, 96, 101, 102, 585, 586

numerical, 4, 164

quantitative, 4, 218, 243

response, 94, 94–95, 317, 325

row, 572

types of, 218

variance, 173, 277

vehicles per household, 468

video-gaming, grades and, 576, 578, 580–581

Vietnam effect, 224

visual perception, 548

Vitter, David, 190

voluntary response samples, 22, 22–24

Internet surveys and, 76–78


Wainer, Howard, 224

Wald, Abraham, 320

Washington Post/ABC News poll, 70–71

weighting of responses, 72, 79

weightlifting records, 246

weight loss, 549, 554

welfare mothers and employment, 13–14

welfare systems, comparing, 129

winning systems, in gambling, 471–472

women

academic rank and gender, 571

heart disease in, 195

height and risk of heart attack, 315

height distribution for, 301–302

marital status of young, 427–428, 429

obesity in mothers and daughters, 349–351

rise in college-educated, 230

write-in opinion polls, 22–24


Zogby International, 75