corty3e_app2

Appendix B

APPENDIX B

Solutions to Odd-Numbered End-of-Chapter Exercises

Chapter 1

B-1

1.01summarize

1.03cases

1.05(a) X causes Y; (b) Y causes X; (c) some other variable (Z or a confounding variable) causes both X and Y.

1.07random assignment

1.09dependent variable (or outcome variable)

1.11groups

1.13independent = cause; effect = dependent

1.15grouping

1.17NOIR

1.19ordinal

1.21arbitrary; absolute

1.23population

1.25statistic; parameter

1.27descriptive; inferential

1.29two

1.31

Do metal and plastic handcuffs differ in how much abrasion they cause?
Type of handcuff is the explanatory variable; amount of abrasion is the outcome variable.
Experimental
N/A

1.33

Are “conscientious” students smarter than “nonconscientious” students?
Type of student (conscientious vs. nonconscientious) is the explanatory variable; IQ is the outcome variable.
Quasi-experimental
Socioeconomic status may be a confounding variable. Richer students may be able to afford to be conscientious (to buy all the books, not to have to work so they have more time for studying) and richer students may have gone to better schools, received a better education, and ended up with a higher IQ.

1.35

Are a country’s wealth and its greenhouse gas production related?
GDP is the explanatory variable; tons of CO₂ produced is the outcome variable.
Correlational
Population could be a confounding variable. Over the last 50 years, as there were more people in the country, there would be more people to produce and consume goods, increasing both GDP and greenhouse gases.

1.37

Does type of information received affect voting behavior?
Type of information (none vs. positive vs. negative) is the explanatory variable; voting for or against is the outcome variable.
Experimental
N/A

1.39Nominal

1.41Ordinal

1.43Interval

1.45Ordinal

1.47Nominal

1.49

Sample
Parameters
Inferential

1.51Parameter

1.53N = 6

1.55ΣX² = 8² + 9² + 5² + 4² + 7² + 8² = 299.00

1.57ΣX = 13 + 18 + 11 = 42.00

1.59

1.6112.68

1.63121.01

1.6522.47

1.672.53

1.69c

1.71

Do higher levels of physical distress in cities lead to higher levels of social distress?
Amount of graffiti (high, moderate, or low) is the explanatory variable, and teenage pregnancy rate is the outcome variable.
quasi-experimental
The proportion of the city population that has a high school degree could be a confounding variable. Less-educated teenage girls may be more likely to get pregnant. City residents who have not completed high school may be less likely to be employed and more likely to deface property.

B-2

Chapter 2

2.01count

2.03values; frequencies

2.05f_c

2.07nominal

2.09grouped

2.11details

2.13midpoint

2.15how many

2.17nominal; ordinal

2.19in-between (fractional)

2.21interval; ratio

2.23range

2.25the interval width

2.27bar graph

2.29wider; tall

2.31Y

2.33touch

2.35nominal

2.37modality; skewness; kurtosis

2.39bimodal

2.41negative

2.43ungrouped

2.45Don’t forget a title!

Frequency Distribution for Number of Psychiatric Diagnoses in 45 Residents of a Psychiatric Hospital

Number of Diagnoses	Frequency	Cumulative Frequency	Percentage	Cumulative Percentage
5	1	45	2.22	100.00
4	2	44	4.44	97.78
3	9	42	20.00	93.33
2	19	33	42.22	73.33
1	12	14	26.67	31.11
0	2	2	4.44	4.44

2.47

Grouped Frequency Distribution for Number of Chairs in 41 College Classrooms (Interval Width = 20)

Number of Chairs	Interval Midpoint	Frequency	Cumulative Frequency	Percentage	Cumulative Percentage
90–109	99.50	1	41	2.44	100.00
70–89	79.50	1	40	2.44	97.56
50–69	59.50	11	39	26.83	95.12
30–49	39.50	17	28	41.46	68.29
10–29	19.50	11	11	26.83	26.83

2.49

Discrete
Continuous
Discrete
Discrete
Continuous

2.51

	Real Lower Limit	Real Upper Limit	Interval Width	Midpoint
a.	30.50	34.50	4.00	32.50
b.	25.00	45.00	20.00	35.00
c.	2.495	2.995	.500	2.745
d.	10.50	11.50	1.00	11.00

2.53

Since the data are nominal, the order is arbitrary and discussing the shape of the distribution does not make sense.

B-3

2.55 Either a histogram or frequency polygon can be made for these data.

The distribution is unimodal, doesn’t seem too peaked or too flat, and is not skewed.

2.57

The distribution is unimodal, doesn’t seem too peaked or too flat, and is positively skewed.

2.59b

2.61e

2.63Shortest time: 4 minutes: Leave at 10:16:59, return at 10:21:00. 241 seconds, or 4 minutes and one second.

Longest time: 6 minutes: Leave at 10:16:00, return at 10:21:59. 359 seconds, or 5 minutes and 59 seconds.

Chapter 3

3.01central tendency; variability

3.03variability

3.050

3.07less

3.09mean

3.11mode

3.13interval; ratio

3.15is

3.17more

3.19population; sample

3.21variance

3.23cases

3.25tightly

3.27square

3.29σ

3.31corrected; population

3.33deviation score

3.35The data are ratio-level, but there is an outlier, so the median is the most appropriate measure of central tendency. Mdn = 4.00.

3.37They are ratio-level data, and there isn’t anything unusual in terms of skewness or modality, so the mean is the most appropriate measure of central tendency.

3.39

3.41N = 47, so median is the score associated with case number . Case number has a value of 102, so Mdn = 102.00. The value of 102 is also the most commonly occurring score, so mode = 102.00.

3.43(a) Range = 201 – 37 = $164 million. (b) The interquartile range is a better measure of variability to report because it trims off the extreme scores. And the value of $201 million seems to be an outlier, more than $50 million higher than the next value.

3.45

N = 5
Σ (X – M) = (47 – 50.00) + (53 – 50.00) + (67 – 50.00) + (45 – 50.00) + (38 –50.00) = 0.00
Σ (X – M)² = (47 – 50.00)² + (53 – 50.00)² + (67 – 50.00)² + (45 – 50.00)² + (38 – 50.00)² = 476.00

B-4

3.47

3.49

3.51

3.53

3.55d

3.57b

3.59(a) The larger sample, N = 100, will have a smaller correction for s ² to approximate the population variance. (b) As sample size gets larger, the sample probably captures more of the variability that exists in the population, so the formula “corrects” the standard deviation less.

Chapter 4

4.01raw

4.03mean; standard deviation

4.05zero

4.07mean; standard deviation

4.09mean; median

4.11decreases

4.13rare

4.15random

4.170.01

4.19the same

4.21less than

4.23at or below

4.25probability

4.27the total number of possible outcomes

4.29fraction; proportion; percentage

4.311.00; .00 (or, 100%; 0%)

4.33.05; likely

4.353

4.37

4.39X = M + (z × s) = 12.75 + (1.45 × 3.33) = 17.58

4.41

4.43X = M + (z × s) = 100 + (0 × 15) = 100.00

(One could also complete this problem without doing any math as a z score of zero is right at the mean, and the mean for IQ scores is 100.)

4.45

Hillary did better on the math subtest of the SAT than she did on the spelling test.

4.479.01%

4.4980.23%

4.5147.50%

4.53

11.90% of cases in a normal distribution have a z score ≥1.18. Thus, 11.90% of Americans should have high diastolic blood pressure (90 or higher).

4.55PR = 1.70

4.57

PR = 80.78

PR = 42.07

4.59PR of 88.5 = z score of 1.20

SAT = X = M + (z × s) = 500 + (1.20 × 100) = 620.00

4.61

4.63This is 5% on each side of the midpoint. Use column B: ±0.13

4.65First find z scores for middle 84% (42% on each side of the midpoint). Use column B: ±1.41. Then calculate IQ scores: 100 + (–1.41 × 15) = 78.85 and 100 + (1.41 × 15) = 121.15. The middle 84% of IQ scores fall from an IQ score of 78.85 to 121.15. (That could also be written as 100 ± 21.15.)

B-5

4.67That’s 2% in each tail. Use column C: ±2.05

4.69That’s 7.5% in each tail. Using column C, z scores = ±1.44. Then calculate SAT subtest scores: SAT = 500 + (–1.44 × 100) = 356.00 and SAT = 500 + (1.44 × 100) = 644.00. The SAT scores are 356.00 and 644.00.

4.710.135%, using column C

4.73Using column B, 14.80% of scores fall from the mean to a z score of 0.38, so 2 × 14.80 = 29.60% of scores fall within 0.38 standard deviations of the mean.

4.75Using column B, 19.15% of scores fall from the mean to a z score of 0.5, so 2 × 48.21 = 38.30% of scores fall within a half standard deviations of the mean. Thus, p(Not with a half standard deviation) = 1 – .3830 = .6170, or 61.7%.

4.77That’s 5% in each tail. Using column C, the z score should be 1.645.

4.79That’s 5% in the upper tail. Using column C, the z score should be 1.645.

4.811.00

4.83c

4.85z = –5.00. (z scores are deviation scores and the sum of a set of deviation scores is always zero. The first four cases sum to 5, so the one missing case must have a value of –5 in order to bring the sum of all five z scores to zero.)

4.87There are more genius dogs than genius cats. Cats, with a smaller standard deviation, have scores that cluster closer to the mean. Thus, they have fewer scores that fall far away from the mean, which means fewer cats will score in the genius range. (This also means that there are more stupid dogs than stupid cats.)

4.89Since the normal distribution is symmetric, the mean is . Assuming a very, very rare event has a z score of 3, the standard deviation is .

Chapter 5

5.01population; subset

5.03attributes; proportion

5.05convenience

5.07random number table (“computer” is also an acceptable answer)

5.09population

5.11differ

5.1370

5.15random

5.17small; large

5.19standard error of the mean

5.2130

5.23mean

5.25s

5.27range; sample; population

5.291.96

5.31M = 10.5, 95%CI [7.32,13.68]

5.3307, 18, 32, 20, 39, 02, 31, 45, 14, and 24

5.35Consent rate is less than 70%, so the director should not pay attention to the results (or should at least be skeptical).

5.37

5.39

5.4195%CI_μ = M ± (1.96 × σ_M)

= 17.00 ± (1.96 × 4)

= 17.00 ± 7.84

M = 1.700, 95%CI [9.16,24.84]

5.43

5.45c

5.47As the population standard deviation increases, so does the standard error of the mean as long as the sample size is held constant.

B-6

5.49

5.51Increase the sample size, which would reduce the standard error of the mean.

Chapter 6

6.01explanation; facts

6.03hypothesis

6.05H₀; H₁

6.07specific; negative

6.09alternative

6.11disprove

6.13true

6.15sampling error

6.17(1) Pick a test, (2) check the assumptions, (3) state the hypotheses, (4) set the decision rule, (5) calculate the value of the test statistic, (6) interpret the results.

6.19assumptions

6.21nondirectional

6.23one; two

6.25critical value

6.27interpretation

6.29population

6.31influence

6.33population mean

6.35zero

6.37rare zone (or extreme sections)

6.39rare

6.41–1.96; + 1.96

6.43z; + 1.96

6.45α

6.47sample mean; population mean

6.49has rejected

6.51statistically different

6.53p < .05

6.55does not

6.57be rejected

6.59can’t

6.615%

6.63larger; easier

6.65low; reject

6.675% (or .05)

6.69Type II; Type I

6.71it should be rejected

6.73Type I; Type II

6.75a single-sample z test

6.77The random sample assumption is not violated. The independence assumption is not violated. School readiness is a psychological variable and probably is normally distributed. Since no assumptions are violated, it is OK to proceed with the single-sample z test.

6.79H₀: μ = 60

H₁: μ ≠ 60

6.81If z ≤ –1.96 or if z ≥ 1.96, reject H₀.

If –1.96 < z < 1.96, fail to reject H₀.

6.83

6.85

6.87

6.89

2.37 ≥ 1.96, so H₀ rejected;
z (N =23) = 2.37, p < .05

6.91(a) There was a statistically significant difference; (b) the sample mean is higher than the population mean.

6.93A researcher compared the mean weight of American women who want to lose weight (178 pounds) to the average weight of American women (164 pounds). The difference was statistically significant, z (N = 123) = 3.68, p < .05. Women in America who do want to lose weight are heavier than American women in general. In future studies, it would be important to make sure that the sample is representative of women who want to lose weight.

6.95(a) He made a Type I error. (b) He should have concluded that there is not enough evidence to say a difference exists.

6.97power = 1 – β = 1 – .75 = .25

6.99Step 1 Pick a Test

Comparing a sample mean to a population mean where the population standard deviation is known calls for a single-sample z test.

B-7

Step 2 Check the Assumptions

The random samples assumption is not violated. There is no evidence of violation of the independence of observations assumption. The normality assumption is not violated, as it seems reasonable to assume sodium consumption is normally distributed. No assumptions are violated, so it is OK to proceed with the single-sample z test.

Step 3 List the Hypotheses

The hypotheses are nondirectional.

H₀: μ _Dieters = 3,400

H₁: μ _Dieters ≠ 3,400

Step 4 Set the Decision Rule

It is a two-tailed test and the dietician is willing to make a Type I error 5% of the time, so α = .05:

Reject H₀ if z ≤ –1.96 or if z ≥ 1.96.

Fail to reject H₀ if –1.96 < z < 1.96.

Step 5 Calculate the Test Statistic

Step 6 Interpret the Results

A dietitian compared the sodium intake of a random sample of dieting Americans (M = 2,900 mg sodium) to the mean daily sodium intake of Americans (μ = 3,400 mg sodium). The difference was a statistically significant one (z (N = 172) = –24.28, p < .05). Dieters’ sodium intake is lower than for Americans in general. From this study, it is not clear whether this is because they were dieting or if their sodium intake were lower to begin with. Future research should investigate this.

6.101b (If β = .60, then power = .40.)

6.103b (The size of the effect, the difference between the two means, is larger.)

6.105correct

6.107incorrect; Type II error

6.109There is a way to calculate beta, but we haven’t covered it. I hope you’ll take a more advanced statistics class some time and learn how to calculate it. So, why ask this question? Because I don’t want you to assume that β = .05, like α. And, I don’t want you to assume that β = .20.

6.111

This is a one-tailed test because he has made a prediction about the direction of the results before collecting any data.
H₀: μ _Athletes≥ 78

H₁: μ _Athletes < 78

6.113The statistician is comparing the mean of a sample to the mean of a population, and the population standard deviation is known. This sounds appropriate for a single-sample z test. However, before using the test, the assumptions must be checked. The random samples assumption is not violated, and neither is the independence of observations assumption. However, the normality assumption is violated. With random numbers, all numbers are equally likely. So, a 50 is as likely to occur as is a 0 or a 99. This assumption is robust to violations, but this distribution is not even close to normally distributed (it’s flat, not normal), so the single-sample z test should not be done.

Chapter 7

7.01does not

7.03sampling error

7.05is

7.07independent

7.09robust

7.11two

7.13does

7.15≠

7.17increases

7.19tails; I; sample size (or degrees of freedom)

7.21one

7.232.5

7.25N

7.27s_M

7.29facts

7.31effect size; confidence interval

7.33alternative hypothesis

7.35sample mean; population mean

7.37p < .05

7.39not enough evidence

7.41independent variable; dependent variable

B-8

7.43no

7.45decreases

7.47II

7.49outcome variable; explanatory variable

7.5195; population

7.53effect size

7.55small

7.57Single-sample t test

7.59

The random sample assumption is violated. The independence assumption is not violated. The normality assumption is not violated. It is OK to proceed with the hypothesis test since the only assumption violated is a robust one.
With a convenience sample, it is doubtful that the sample is representative of the population at the college. Results can’t be generalized to all students at the college.

7.61H₀: μ _Nuns = 25

H₁: μ _Nuns ≠ 25

7.63H₀: μ _NewTreatment ≥ 6.30

H₁: μ _NewTreatment < 6.30

7.65t_cv = 2.120

7.67If t ≤ –2.000 or if t ≥ 2.000, reject H_0.

If –2.000 < t < 2.000, fail to reject H_0.

7.69

7.71

7.73

7.75Rejected

7.77Not rejected

7.79t(14) = 2.15, p < .05

7.81t(68) = 1.99, p > .05

7.83

7.85

7.8795% CI μ_Diff = (M – μ) ± (t_cv × s_M)

= (45 – 50) ± (2.093 × 2.24)

= [–9.69, –0.31]

7.89A nurse practitioner found that people who are heavy salt users have a statistically significantly higher blood pressure (M = 138) than do people in the general population [μ = 120; t(23) = 5.50, p < .05]. The 18-point increase in blood pressure is a large increase. This study suggests that heavy salt consumption puts one at risk for hypertension. In the larger population from which this sample came, the mean difference might be as small as 11 points or as large as 25 points. Even an 11-point increase in blood pressure has meaningful negative consequences for health. To increase confidence that salt consumption is associated with an increase in blood pressure, it would be a good idea to replicate this study. When doing so, a larger sample would allow the researcher to get a better estimate of the size of the effect in the population.

7.91Step 1 Pick a Test

Comparing a sample mean to a population mean when the population standard deviation is not known calls for a single-sample t test.

Step 2 Check the Assumptions

(1) The random sample assumption is violated, but the single-sample t test is robust to violations. (2) There is no evidence of violation of the independence of observations. (3) For the normality assumption, it seems plausible that hours spent in online social media is normally distributed.

Step 3 List the Hypotheses

The psychologist believes that valedictorians spend less time online in social media, so she has directional hypotheses.

H₀: μ _{Valedictorians} ≥ 18.68

H₁: μ _{Valedictorians} < 18.68

Step 4 Set the Decision Rule

Doing a one-tailed test; using the standard alpha level, .05; df = 31 – 1 = 30. t_cv = –1.697

Reject the null hypothesis if t ≤ –1.697.
Fail to reject the null hypothesis if t > –1.697.

B-9

Step 5 Calculate the Test Statistic

Step 6 Interpret the Results

(1) Was the null hypothesis rejected? As –2.00 ≤ –1.697, the null hypothesis is rejected, and the difference is statistically significant. Valedictorians spend significantly less time in online social media than the average high school student. t (30) = –2.00, p < .05 (one-tailed).

(2) How big is the effect? The d value of –0.36, calculated below, is a weak-to-medium effect:

(3) How wide is the confidence interval?

95% CI μ _Diff = (M – μ) ± (t_cv × s_M)

= (16.24 – 18.68) ± (2.042 × 1.22)

= [–4.93, 0.05]

The confidence interval, calculated above, is roughly from –5 to 0. Thus, it is not clear how strong the effect is. If it is five hours a week less, that seems like a lot of time. Zero, obviously, is no difference.

Note that this confidence interval captures zero, even though the null hypothesis was rejected. How can that be? It is possible because the test was a one-tailed test, but confidence intervals are always two-tailed. When a researcher does a one-tailed test, he or she can change the degree of confidence to fit the test. With this one one-tailed test, all 5% of the rare zone was in one end. If the researcher put 5% in both ends, for a total of 10%, 90% would be left in the middle for a 90% confidence interval. To calculate the 90% confidence interval instead of the 95%, substitute the critical value of t found in Step 4, 1.697, into the confidence interval equation. This gives a confidence interval that doesn’t capture 0, as it ranges from –4.51 to –0.37. Here are the calculations:

90% CI μ _Diff = (M – μ) ± (t_cv × s_M)

= (16.24 – 18.68) ± (1.697 × 1.22)

= [–4.51, –0.37]

Putting it all together: “A psychologist investigating time management in high school students compared the average amount of time spent by valedictorians per week in online social media (M = 16.24) to the known average for the population of high school students in general (μ = 18.68). The difference was statistically significant [t(30) = –2.00, p < .05 (one-tailed)], indicating that students who are doing very well in school, like valedictorians, spend less time with social media and, presumably, more time on school work.

“Unfortunately, the sample size, 31, in this study was relatively small, making the confidence interval relatively wide. To resolve this, I recommend replicating the study with a larger sample of valedictorians, chosen in such a way that they represent the United States.”

7.93The less variability there is, (a) the smaller the standard error of the mean, (b) the larger the t value, (c) the bigger the effect size, (d) and the narrower the confidence interval. Thus, when there is less variability it is easier to reject the null hypothesis, find a large effect, and have a more precise estimate of the population value.

7.95The distance from the sample mean to the population mean has no impact on the standard error of the mean or the width of the confidence interval, but it does have an impact on the t value and the effect size. Having a greater distance between sample and population mean means there is a larger effect and that it is easier to reject the null hypothesis.

7.971.959 is the largest value of t that guarantees failure to reject the null hypothesis. (1.960 is the critical value of t for α = .05, two-tailed, df = ∞.)

7.99For the 90% confidence interval, use Equation 7.4, but instead of using t_cv, two-tailed, α = .05, substitute the critical value of t, two-tailed, α = .10. For the 99% confidence interval, substitute the critical value of t, two-tailed, α = .05. Here are the two equations for N = 21 (i.e., df = 20) and s_M = 1:

90% CI μ _Diff = (M – μ) ± (t_cv × s_M)

= (M – μ) ± (1.725 × 1) = (M – μ) ± 1.725

99% CI μ _Diff = (M – μ) ± (t_cv × s_M)

= (M – μ) ± (2.845 × 1) = (M – μ) ± 2.845

B-10

Chapter 8

8.01mean

8.03sample; population

8.05independent; paired

8.07paired

8.09independent; mean

8.11random samples

8.13normally distributed

8.15non

8.17μ ₁ ≠ μ₂

8.19rare; common; t

8.212

8.23≥

8.25The pooled variance

8.27means

8.29rejected

8.315

8.33p < .05; p > .05

8.350

8.37outcome; explanatory

8.39population means

8.41small

8.43t-statistic

8.45a paired-samples t test (the student and parent are from the same family)

8.47an independent-samples t test (no subject belongs to both groups)

8.49The random samples assumption is violated. The independence assumption is not violated. The normality of distribution assumption is not violated. The homogeneity of variance assumption is violated, so much so that it is not OK to continue with the test.

8.51

H₀: μ ₁ = μ ₂

H₁: μ ₁ ≠ μ ₂
The null hypothesis says that the two population means are the same, that is, there is no difference in effectiveness between the two treatments. Any difference between sample means is due to sampling error. The alternative hypothesis says that the two population means are different because the two populations are differentially effective. The difference between sample means is too large to be due to sampling error.

8.53

H₀: μ _Fluoride ≥ μ _Nonfluoride

H₁: μ _Fluoride < μ _Nonfluoride
The null hypothesis says that the population mean for the number of cavities for the fluoride group will be greater than or the same as that for the nonfluoride group because fluoride has no effect, or a negative effect, on cavities. The alternative hypothesis states that the population mean for the number of cavities for the fluoride group will be less than that for the nonfluoride group because fluoride reduces cavities.

8.55df = n₁ + n₂ – 2 = 223 + 252 – 2 = 473. Use df = 450. t_cv = 1.965.

8.57df = n₁ + n₂ – 2 = 46 + 46 –2 = 90. t_cv = 2.632.

8.59

If t ≤ –2.086 or if t ≥ 2.086, then reject H₀.
If –2.086 < t < 2.086, then fail to reject H₀.

8.61df = n₁ + n₂ – = 12 + 13 – 2 = 23

8.63

8.65

8.67

8.69

B-11

8.71

H₀ rejected
Statistically significant
μ ₂ is probably bigger than μ ₁.

8.73t(21) = 2.07, p > .05

8.75t(8) = 2.31, p < .05

8.77

Very large effect

8.79df = N – 2 = 73 – 2 = 71

8.81

Reject H₀.

8.83(Note: One does not need to use everything that is calculated in an interpretation. Interpretation has a subjective element. What one person chooses to emphasize may differ from what another does.)

An education researcher investigated whether the color used to grade a test has an impact on first graders’ self-esteem. The first graders took a third-grade math test and had 25% of their answers marked wrong, half of the students with a red pen and half of the students with pencil. They then took a self-esteem test. The “pencil” students had higher self-esteem (M = 29.00) than the “red ink” students (M = 23.00). This difference was statistically significant [t(25) = 3.02, p < .05]. The effect is a strong one and suggests that the color used to grade a test has an impact on self-esteem, with an effect of between 1.90 and 10.1 points with 95% confidence. Unfortunately, because there was no control group, it is unclear if grading in pencil raises self-esteem or using red ink lowers it. Future research should use larger sample sizes, include such a control group, and utilize random assignment of the students to the treatment conditions. Until that research is done, it seems advisable for teachers to use pencils, not red ink, to grade tests and papers.

8.85Step 1 Pick a Test.

Comparing the means of two independent samples calls for an independent-samples t test.

Step 2 Check the Assumptions.

Random samples: violated, but robust to violations. Plus, sample is representative of Americans, which is the objective.
Independence of observations: not violated, each participant is in the study only once.
Normality: Cholesterol is a physical characteristic; reasonable to assume it is normally distributed. Plus, robust assumption, especially when N > 50.
Homogeneity of variance: the 2 standard deviations are about equal—not violated.

Step 3 List the Hypotheses.

H₀: μ _American = μ _{Mediterranean}

H₁: μ _American ≠ μ _{Mediterranean}

Step 4 Set the Decision Rule.

Set α = .05, two-tailed, df = 70: t_cv = 1.994.

If t ≤ –1.994 or if t ≥ 1.994, reject H₀.
If –1.994 < t < 1.994, fail to reject H₀.

Step 5 Calculate the Test Statistic.

B-12

Step 6 Interpret the Results.

Was the null hypothesis rejected?

6.78 ≥ 1.994, so H₀ rejected. Results are statistically significant. Cholesterol levels of those who follow a Mediterranean diet are lower than for those who follow an American diet.

How big is the effect?

How wide is the confidence interval?

Write a four-point interpretation:

This study used a representative sample of Americans to compare the cholesterol levels of those who ate a Mediterranean diet to those who ate an American diet. The mean cholesterol level of those who followed the Mediterranean diet was 40 points lower with a mean of 190 compared to a mean cholesterol level of 230 for those following an American diet. This difference was statistically significant [t(70) = 6.78, p < .05] and a 40-point difference is a very meaningful difference. The Mediterranean diet is associated with lower cholesterol, a level sufficiently lower that it means the difference between high cholesterol and normal cholesterol.

This study, because it did not use random assignment, cannot give conclusions about cause and effect. The report suggests that the Mediterranean diet is associated with reduced cholesterol, but doesn’t prove such. Based on this study, it would be reasonable to proceed to a study in which participants are randomly assigned to different diets.

8.87d. There is probably a difference between the two population means.

8.89(a) The first test is an independent-samples t test and the second test is a paired-samples t test. (b) The first test answers the question, “Does one store carry more expensive merchandise than the other?” The second test answers the question, “For the same items, is there a difference in price between the two stores?”

8.91If the confidence interval is a 99% confidence interval but the alpha level for the test is set at .05, which is equivalent to a 95% confidence interval, the confidence interval might capture zero though the null hypothesis was rejected. Another way this could occur would be with a one-tailed test as confidence intervals are two-tailed. A one-tailed test with α = .05 has a rare zone that encompasses the extreme 5%, while a 95% confidence interval reaches out into the tail an additional 2.5%.

8.93

Yes
There was evidence of an effect.
Small.
Because the confidence interval is so narrow, we are fairly certain that effect was small.
Given how large the sample size was, it seems unlikely that a nonrepresentative sample was obtained. Not very worried about an erroneous conclusion being drawn.

Chapter 9

9.01independent; paired-samples t test

9.03confounding

9.05individual differences

9.07higher

9.09sampling error

9.11within; between

9.13populations; samples

9.15negative; small

9.17one

9.19

9.21does not

9.23>

9.25sample

9.27individual differences; dependent variable

9.29confidence interval for the difference between population means

9.31small

9.33paired-samples t test (each subject rated both types of coffee)

9.35single-sample t test (the mean is being compared to the known USDA standard)

9.37It is not OK to proceed with the paired-samples t test.

B-13

The random samples assumption is violated, but this assumption is robust to violation.
The nonrobust independence of observations assumption is violated, so it is not OK to proceed with the test. Why was it violated? All students are in the same classes, so presumably they were assigned the same homework, which would influence the online activity of all students.
For the normality assumption, we can assume that the difference scores are normally distributed.

9.39H₀: μ_School = μ_Social

H₁: μ_School ≠ μ_Social

9.41–5.00; 4.00; –2.00; 1.00; –1.00 or: 5.00; –4.00; 2.00; –1.00; 1.00

9.439

9.452.101

9.471.984

9.49

9.51

9.53

9.55

The null hypothesis is rejected.

9.57t(4) = 3.21, p < .05

9.59t(68) = 1.99, p > .05

9.61μ₂ is probably greater than μ₁.

9.63There is not enough evidence to say a difference exists in the means of the two populations.

9.65

9.67Rejected

9.69A sleep therapist studied whether an herbal tea that was advertised as a sleep aid had any impact on sleep onset for people with insomnia. Using 23 pairs of insomniacs matched on a number of potential confounding variables, he found that those who used the tea went to sleep in 19.70 minutes and those who didn’t took 21.20 minutes. This 1.5-minute difference was not statistically significant [t(22) = 1.32, p > .05], meaning that the study did not provide sufficient evidence that the tea had any impact on sleep onset.

Nonetheless, the study was suggestive that the tea might have some small effect, so it would be reasonable to replicate the study with a larger sample size. It would also be advisable to make the study double blind so that neither experimenter nor participant would know who was receiving the supposed sleep aid.

9.71Step 1 Pick a Test

There are two samples, feet receiving standard treatment and feet receiving experimental treatment, the samples are dependent, and the dependent variable is measured at the ratio level. This calls for a paired-samples t test.

Step 2 Check the Assumptions

Random samples: Violated, but robust to violations.
Independence of observations: Not violated.
Normality: Willing to assume not violated.

Step 3 List the Hypotheses

H₀: μ_E = μ_C
H₁: μ_E ≠ μ_C

Step 4 Set the Decision Rule

df = N – 1 = 30 – 1 = 29

α = .05, two-tailed

t_cv = ±2.045

Decision rule:

If t ≤ –2.045 or if t ≥ 2.045, reject H₀.
If –2.045 < t < 2.045, fail to reject H₀.

Step 5 Calculate the Test Statistic

Step 6 Interpret the Results

Was the null hypothesis rejected?

As 10.13 ≥ 2.045, the null hypothesis is rejected. The difference is statistically significant and is in favor of the new treatment working better. In APA format, the results are t(29) = 10.13, p < .05.

B-14

How wide is the confidence interval? How big is the effect?

Write a four-point interpretation:

A dermatologist evaluated a new treatment for athlete’s foot by taking 30 people with athlete’s foot on both feet and randomly assigning which foot received standard treatment and which received the experimental treatment. He measured outcome as the percentage of reduction of symptoms and found a statistically significant improvement in outcome with the new treatment [t(29) = 10.13, p < .05]. The new treatment led to an 88% reduction in symptoms vs. a 72% reduction in symptoms with the old, standard treatment. The new treatment seems to work substantially better than the standard treatment. But, to increase confidence in robustness of this finding, the study should be replicated.

9.73e (the width does not depend on the differences in the mean)

9.75d (all else being equal, a larger standard error of the mean gives a wider confidence interval)

9.77M₁ – M₂ = –2.00 (that’s the midpoint of the confidence interval)

Chapter 10

10.01two

10.03ANOVA

10.05way; factor

10.07Type I

10.09post-hoc

10.11variability

10.13individual differences

10.15Between-groups

10.17individual differences

10.19large

10.21F ratio

10.23one

10.25t

10.27an independent-samples t test

10.29cannot

10.31homogeneity of variance

10.33non

10.35rejected

10.37between

10.39df_Between; df_Within

10.41sum of squares

10.43SS_Total

10.45treatment effect

10.47ANOVA summary table

10.49mean square

10.51effect

10.53the alternative hypothesis

10.55>

10.57between-groups

10.59100; 0

10.61statistically significant

10.63Type II; Type I (order does matter)

10.65statistically significant

10.67smallest

10.69(e) This is a between-subjects, one-way ANOVA because the means of three or more independent samples are being compared.

10.71(f) This is none of the tests learned about so far. There are three levels of one independent variable being compared on an interval-level dependent variable, but the samples are dependent. This type of test, a repeated-measures ANOVA, is covered in the next chapter.

10.73The random samples assumption is violated. The independence of observations assumption is not violated. There is no reason to think the homogeneity of variance assumption is violated, but it would be nice to know the standard deviation for each group. Because the only violated assumption is robust, we can proceed with the between-subjects, one-way ANOVA.

10.75H₀: μ₁ = μ₂ = μ₃.

H₁: At least one population mean differs from at least one other population mean.

10.77

df_Total = N – 1 = 48 – 1 = 47
df_Between = k – 1 = 4 – 1 = 3
df_Within = N – k = 48 – 4 = 44

10.79

df_Total = N – 1 = 60 – 1 = 59
df_Between = k – 1 = 4 – 1 = 3
df_Within = N – k = 60 – 4 = 56

10.81F_cv = 5.123

10.83df_Between = 4 – 1 = 3; df_Within = 50 – 4 = 46 F_cv = 2.807

B-15

10.85F_cv = 3.232

If F ≥ 3.232, reject H₀. If F < 3.232, fail to reject H₀.

10.87

	Group 1		Group 2		Group 3
	X	X²	X	X²	X	X²
	108	11,664	100	10,000	99	9,801
	102	10,404	105	11,025	95	9,025	Grand
					91	8,281	X	X²
Sum	210	22,068	205	21,025	285	27,1067	700	70,200
n	2		2		3		7

10.89

10.91

10.93

10.95

10.97

10.99

Source of Variability	Sum of Squares	Degrees of Freedom	Mean Square	F ratio
Between groups	172.80	2	86.40	0.16
Within groups	6,410.80	12	534.23
Total	6,583.60	14

10.101 F = 1.96 falls in the common zone, so fail to reject the null hypothesis.

10.1037.64 ≥ 3.467, so reject H₀.

10.105F(3, 17) = 5.34, p < .05

10.107F(3, 44) = 3.28, p < .05

10.109The results are statistically significant, so conclude that it is probable that at least one population has a mean different from at least one other population mean.

10.111

10.113

10.115Yes, complete a post-hoc test because the results were statistically significant.

10.117q = 5.19

10.119

10.12113.09 – 8.89 = 4.20. 4.20 < 4.37, so the difference is not statistically significant. There’s not enough evidence to conclude that these two populations have a mean difference.

10.123

M₁ vs. M₂: 67.86 – 53.56 = 14.30, which is greater than 8.30, so it is a statistically significant difference. The mean of population 1 is probably greater than the mean of population 2.

B-16
M₁ vs. M₃: 67.86 – 61.55 = 6.31, which is less than 8.30, so it is not a statistically significant difference. There is not enough evidence to conclude that these two population means are different.
M₂ vs. M₃: 61.55 – 53.56 = 7.99, which is less than 8.30, so it is not a statistically significant difference. There is not enough evidence to conclude that the two population means are different.

10.125“An addictions researcher obtained samples of alcoholics, smokers, and heroin addicts who were in treatment for the second time. From each participant she found out how long, in months, it took before relapse occurred after the first treatment. The alcoholics relapsed, on average, in 4.63 months, the smokers in 4.91 months, and the heroin addicts in 5.17 months. Using a between-subjects, one-way ANOVA, she found no statistically significant effect for the type of addiction on the mean time to relapse [F(2, 22) = 0.07, p > .05]. These data suggest, for those who fail in treatment, that there is no difference in the addictiveness of these three substances. Future research should use a larger number of participants and follow them prospectively after treatment ends.”

10.127Step 1 Pick a Test

The test compares means of three independent samples, so use a between-subjects, one-way ANOVA.

Step 2 Check the Assumptions

The random samples assumption is violated, but this assumption is robust. Need to be careful about population to which the results are generalized.
Independence of observations is not violated.
Normality is not violated. It’s reasonable to assume that satisfaction scores are normally distributed.
Homogeneity of variance is not violated. All standard deviations are equal.

Only the random samples assumption is violated, but it is robust, so it is OK to proceed with a between-subjects, one-way ANOVA.

Step 3 List the Hypotheses

H₀: μ₁ = μ₂ = μ₃.

H₁: At least one population mean is different from at least one other.

Step 4 Set the Decision Rule

df_Between, numerator df = k – 1 = 3 – 1 = 2

df_Within, denominator df = N – k = 12 – 3 = 9

df_Total = N – 1 = 12 – 1 = 11

Set alpha at .05.

F_cv = 4.256

Decision rule: If F ≥ 4.256, reject H₀; if F < 4.256, fail to reject H₀.

Step 5 Calculate the Test Statistic

Source of Variability	Sum of Squares	Degrees of Freedom	Mean Square	F ratio
Between groups	1,400.00	2	700.00	41.99
Within groups	150.00	9	16.67
Total	1,550.00	11

Step 6 Interpret the Results

Was the null hypothesis rejected?
- Yes, 41.99 ≥ 4.256
- APA format: F(2, 9) = 41.99, p > .05
- Conclude that at least one population mean differs from at least one other mean.
  
  B-17
How big is the effect?
- Big!
Where is the effect?
- q = 3.95
- The premium brand in the premium bottle is statistically significantly more satisfying than the store brand in the store bottle.
- The store brand in the premium bottle is statistically significantly more satisfying than the store brand in the store bottle.
- There is not enough evidence to conclude that there is a difference in satisfaction between the premium brand and the store brand when both are in the premium bottle.
Putting it all together: A consumer researcher investigated the effect of labeling on satisfaction with a product. Comparing a store brand shampoo in a store brand bottle, a premium brand shampoo in a premium brand bottle, and a store brand shampoo in a premium bottle, the mean satisfaction scores were, respectively, 65.00, 90.00, and 85.00. Using a between-subjects, one-way ANOVA, she found a strong and statistically significant effect of labeling [F(2, 9) = 41.99, p < .05]. There was no statistically significant difference between the store brand and the premium brand when both were in the premium bottle, but both were found to be statistically significantly more satisfying than the store shampoo in a store brand bottle. In rating shampoos, consumers seem to be more influenced by their perception of the shampoo from the bottle’s label than they are by the shampoo in the bottle. It would be interesting, in future research, to use the same participants in all three conditions to see if the effect exists when they are directly comparing all three samples.

10.129c. SS_Totalmust be largest; means squares are always no larger than corresponding sum of squares.

10.131Could be true

10.133False

10.135False

10.137Could be true

Chapter 11

11.01independent-samples t test

11.03Repeated-measures

11.05rows; columns

11.07powerful; greater

11.09denominator

11.11a repeated-measures ANOVA; a paired-samples t test

11.13robust

11.15normally distributed; population

11.17is not

11.19greater than or equal to

11.21was

11.23was

11.25treatment; residual

11.27n; k; N

11.29df_Total

11.31subjects

11.33rare

11.35variability

11.37sum of squares; degrees of freedom

11.39MS_Treatment; MS_Residual

11.41decision rule

11.43was

11.45should

11.47was not

11.49eta squared

11.51r squared

11.53medium

11.55rejected

11.57not statistically

11.59Between-subjects, one-way ANOVA (the means of three or more independent samples are being compared)

11.61Independent-samples t test (the means of two independent samples are being compared)

11.63

The random samples assumption is not violated. The independence of observations assumption is violated as participants met together in groups. The normality assumption is not violated.

B-18
Should not proceed with the one-way, repeated-measures ANOVA because the independence of observations assumption is violated.

11.65H₀: All population means are equal.

H₁: At least one population mean is different from at least one other.

11.67df_Subjects = (n – 1) = 8 – 1 = 7

df_Treatment = (k – 1) = 3 – 1 = 2

df_Residual = (n – 1)(k – 1) = (8 – 1)(3 – 1) = 14

df_Total = (N – 1) = (24 – 1) = 23

11.69Remember, N = n × k. So, N = 15 × 4 = 60

df_Subjects = (n – 1) = 15 – 1 = 14

df_Treatment = (k – 1) = 4 – 1 = 3

df_Residual = (n – 1)(k – 1) = (15 – 1)(4 – 1) = 42

df_Total = (N – 1) = (60 – 1) = 59

11.71

F_cv = 4.103

11.73If F ≥ 3.287, reject H₀.

If F < 3.287, fail to reject H₀.

11.75SS_Total = 301.48

Source of Variability	Sum of Squares	Degrees of Freedom	Mean Square	F ratio
Subjects	137.50	11
Treatment	48.48	2	24.24	4.62
Residual	115.50	22	5.25
Total	301.48	35

11.77H₀ was rejected.

11.79Based on df in summary table, F_cv = 3.232. H₀ is rejected.

11.81

F(5, 45) = 7.84, p < .05
Statistically significant

11.83Based on df in summary table, F_cv = 3.555.

F(2, 18) = 11.24, p < .05
Statistically significant

11.85The results of the ANOVA are not statistically significant. There is not enough evidence to conclude that the population means are different.

11.87

Medium

11.89

Medium

11.91

The effect size is small, so there is just the normal level of worry about a Type II error.
Do not recommend replication with a larger sample size.

11.93q = 4.79

11.95

11.97

- M₁ vs. M₂ = 14.98 – 12.83 = 2.15. 2.15 < 3.78: Not statistically significant.
- M₁ vs. M₃ = 12.83 – 8.22 = 4.61. 4.61 ≥ 3.78: Statistically significant.
- M₂ vs. M₃ = 14.98 – 8.22 = 6.76. 6.76 ≥ 3.78: Statistically significant.
- Not enough evidence to conclude there is a difference in means between population 1 and population 2.
- Population 1 probably has a bigger mean than population 3.
- Population 2 probably has a bigger mean than population 3.

11.99“In this study, rats were reared in three different environments—a standard laboratory setting, an enriched laboratory setting, and a mimicked wild setting—to see if environment had an impact on the rats’ intelligence. There was a statistically significant and strong effect of environment on mean intelligence [F(2, 14) = 9.61, p < .05]. Rats reared in the enriched environment earned statistically higher scores (M = 108.00) on a variety of behavioral tasks than either the rats reared in a standard lab setting (M = 95.00) or the rats reared in an environment meant to mimic the wild (M = 94.00). There was no difference in the mean intelligence of these latter two types of rats. An enriched environment has a positive impact on rat intelligence and a standard laboratory environment does not have a negative impact on rat intelligence, as compared to a mimicked wild environment. This research should be replicated to make sure the effect is stable. Future research should attempt to determine what aspects of the enriched environment lead to improved performance.”

B-19

11.101Step 1 Pick a Test

Comparing the means of three dependent samples calls for a one-way, repeated-measures ANOVA.

Step 2 Check the Assumptions

The random samples assumption is violated, but it is robust, so it is OK to proceed.
The independence of observations assumption is not violated.
The normality assumption is not violated. It is reasonable to assume a normal distribution for urges to drink alcohol.

Step 3 List the Hypotheses

H₀: μ₁ = μ₂ = μ₃.

H₁: At least one population mean is different from at least one other.

Step 4 Set the Decision Rule

df_Subjects = n –1 = 10 – 1 = 9

df_Treatment = k – 1 = 3 – 1 = 2

df_Residual=(n – 1)(k – 1) = (10 – 1)(3 – 1) = 18

N = n × k = 10 × 3 = 30

df_Total = N – 1 = 30 – 1 = 29

Set α = .05.

F_cv = 3.555

If F ≥ 3.555, reject H₀.

If F < 3.555, fail to reject H₀.

Step 5 Calculate the Test Statistic

SS_Total = 646.67

Source of Variability	Sum of Squares	Degrees of Freedom	Mean Square	F ratio
Subjects	532.00	9
Treatment	26.67	2	13.34	2.73
Residual	88.00	18	4.89
Total	646.67	29

Step 6 Interpret the Results

2.73 < 3.555, so fail to reject the null hypothesis.

In APA format: F(2, 18) = 2.73, p > .05.

This is more than a small effect, so some concern about Type II error does exist.

A study was conducted to see if different types of treatments for alcoholism were more or less effective in terms of an impact on urges to drink. Thirty alcoholics were matched on the severity of their addiction and then assigned to Alcoholics Anonymous, individual psychotherapy, or anti-urge medication. At the end of treatment the mean number of urges per day for the three treatments, respectively, was 8.00, 6.00, and 6.00. The results were not statistically significant, meaning that there was not enough evidence to show that one treatment was more effective than another in controlling urges [F(2, 18) = 2.73, p > .05.] However, the sample size in this study was small and might not have been sufficient to pick up the effect if one treatment was just slightly better or worse than the others. Thus, it is recommended that the study be replicated with a larger sample size.

11.103 If df_Treatment = 3, then k = 4. Given df_Residual = 45 and df_Treatment = 3, then . If 15 = n – 1, then n = 16. N = k × n = 4 × 16 = 64.

B-20

11.105 There is no need to be concerned about Type II error. The null hypothesis was rejected, so there should be concern about the possibility of Type I error.

11.107 The difference between a HSD value calculated at the α = .01 level and the α = .05 level is the q value. q is bigger at .01 than .05, so HSD is bigger at .01 than .05. As a result of this, two means have to be farther apart to be considered a statistically significant difference at the .01 level. And, this makes sense—farther apart means are more likely to represent a real difference between populations, so there is less chance of a Type I error. And the probability of a Type I error is what alpha represents.

Chapter 12

12.01Two

12.03independent; within

12.05two; one

12.07main; interaction

12.09not parallel

12.11main

12.13independent

12.15independence of observations

12.17three

12.19at least; at least

12.21interaction effect

12.23the number of rows

12.25degrees of freedom; degrees of freedom

12.27between groups; within groups

12.29squares

12.31F_Interaction

12.33statistically significant

12.35.05 (or 5%)

12.37eta squared (η²)

12.399

12.41statistically significant

12.43statistically significant

12.45between-subjects, one-way ANOVA

12.47between-subjects, two-way ANOVA

12.49

12.51

There is no evidence of an interaction.
There appears to be a statistically significant columns main effect.
The main effect of columns should be interpreted.

12.53

The random samples assumption was violated. The independence of observations assumption was not violated. It is assumed that the normality assumption was not violated. The homogeneity of variance assumption was violated.
It is OK to proceed with the test. The random samples assumption for between-subjects, two-way ANOVA is robust to violations. And, the homogeneity of variance assumption is robust to violation as long as the total sample size is large.

B-21

12.55

H_{0 Rows}: All row population means are the same.

H_{1 Rows}: At least one row population mean is different from at least one other row population mean.
H_{0 Columns}: All column population means are the same.

H_{1 Columns}: At least one column population mean is different from at least one other column population mean.
H_{0 Interaction}: There is no interactive effect of the two explanatory variables on the dependent variable in the population.

H_{1 Interaction}: The two explanatory variables, in the population, interact to affect the dependent variable in at least one cell.

12.57

df_Rows = R – 1 = 3 – 1 = 2
df_Columns = C – 1 = 4 – 1 = 3
df_Interaction = df_Rows × df_Columns = 2 × 3 = 6
N = n × R × C = 8 × 3 × 4 = 96

df_Within = N – (R × C) = 96 – (3 × 4) = 84
df_Between = df_Rows + df_Columns + df_Interaction = 2 + 3 + 6 = 11
df_Total = N – 1 = 96 – 1 = 95

12.59

F_{cv Rows} = 3.259
F_{cv Columns} = 3.259
F_{cv Interaction} = 2.634

12.61 If F_Interaction ≥ 2.668, reject H₀.

If F_Interaction < 2.668, fail to reject H₀.

12.63

12.65

12.67 df_Rows = R – 1 = 2 – 1 = 1

df_Columns = C – 1 = 2 – 1 = 1

df_Interaction = df_Rows × df_Columns = 1 × 1 = 1

N = n × R × C = 5 × 2 × 2 = 20

df_Within = N – (R × C) = 20 – (2 × 2) = 16

df_Between = df_Rows + df_Columns + df_Interaction = 1 + 1 + 1 = 3

df_Total = N – 1 = 20 – 1 = 19

Source of Variability	Sum of Squares	Degrees of Freedom	Mean Square	F ratio
Between groups	61.00	3
Rows	15.00	1	15.00	2.00
Columns	12.00	1	12.00	1.60
Interaction	34.00	1	34.00	4.53
Within groups	120.00	16	7.50
Total	181.00	19

12.69The null hypothesis was rejected.

12.71F(2, 66) = 3.70, p < .05

12.73F(3, 168) = 2.45, p > .05

12.75

F(2, 30) = 3.21, p > .05
F(1, 30) = 1.08, p > .05
F(2, 30) = 3.93, p < .05

12.77Interpret the interaction effect.

12.79Interpret the interaction effect.

12.81

This is a medium effect.

12.83

B-22

12.85

q_Rows = 3.44
q_Columns = 2.85
q_Interaction = 4.22

12.87

12.89HSD value should only be calculated for the interaction effect.

12.91

- Row 1 vs. Row 2 (117.66 – 113.63 = 4.03) is not a statistically significant difference.
- Row 1 vs. Row 3 (128.91 – 117.66 = 11.25) is a statistically significant difference.
- Row 2 vs. Row 3 (128.91 – 113.63 = 15.28) is a statistically significant difference.
The mean of Row 3 is higher than that of both Row 1 and Row 2.

12.93To aid in interpretation, make a graph.

Men and women who went shopping for groceries with and without children were surveyed regarding how much they enjoyed the shopping experience. How much shoppers enjoyed the experience depended on the interaction of one’s sex with whether there were children on the shopping trip (F(1, 16) = 41.15, p < .05). Men shopping with children and women shopping by themselves rated their shopping trips, on average as somewhat enjoyable. These ratings were statistically significantly higher than men shopping without children or women shopping with children, who rated, on average, their shopping trips as somewhat less than enjoyable. It is possible that men and women feel differently about shopping with children, but another explanation is that novel situations are more enjoyable. For men, shopping with children tends to be novel and for women shopping without children is more likely to be novel. Future research should determine the novelty of the shopping experience as well as the presence of children.

12.95Step 1 Pick a Test

Comparing the means of four independent samples formed by two crossed independent variables calls for a between-subjects, two-way ANOVA.

Step 2 Check the Assumptions

OK to proceed with the test.

The random samples assumption is violated unless the population is meant to be the local high school.
The independence of observations assumption is not violated.
It is reasonable to assume that this psychological variable is normally distributed, so the normality assumption is not violated.
The homogeneity of variance assumption is not violated.

Step 3 List the Hypotheses

H_{0 Rows}: All row population means are the same.

H_{1 Rows}: At least one row population mean is different from at least one other row population mean.

H_{0 Columns}: All column population means are the same.

H_{1 Columns}: At least one column population mean is different from at least one other column population mean.

H_{0 Interaction}: There is no interactive effect of the two explanatory variables on the dependent variable in the population.

H_{1 Interaction}: The two explanatory variables, in the population, interact to affect the dependent variable in at least one cell.

B-23

Step 4 Set the Decision Rule

df_Rows = 2 – 1 = 1

df_Columns = 2 – 1 = 1

df_Interaction = 1 × 1 = 1

df_Within = 24 – (2 × 2) = 20

All three F ratios have df_numerator = 1 and df_denominator = 20. F_cv = 4.351.

The decision rule for each of the three effects is:

If F ≥ 4.351, reject H₀.
If F < 4.351, fail to reject H₀.

Step 5 Calculate the Value of the Test Statistic

Source of Variability	Sum of Squares	Degrees of Freedom	Mean Square	F ratio
Between groups	447.12	3
Rows	210.04	1	210.04	20.51
Columns	222.04	1	222.04	21.68
Interaction	15.04	1	15.04	1.47
Within groups	204.83	20	10.24
Total	651.95	23

Step 6 Interpret the Results

Calculate the row means and the column means:
- Row 1, positive reinforcement =
- Row 2, punishment =
- Column 1, restricted TV =
- Column 2, unrestricted TV =
Determine statistical significance:
- The rows effect (use of reinforcement vs. punishment) is statistically significant:
  - F(1, 20) = 20.51, p < .05
  - The mean Approval of Violence Scale (AVS) score for teens who were raised with positive reinforcement is 8.25 and is statistically lower than the mean of 14.17 for teens who were raised with punishment.
  - Being raised with punishment leads to a statistically significant higher approval of the use of violence to solve disagreements.
- The columns effect (Restrict TV or not) is statistically significant:
  - F(1, 20) = 21.68, p < .05
  - The mean AVS score for teens who were restricted in what they could watch on TV is 8.17 and is statistically lower than the mean of 14.25 for teens who were unrestricted in their TV watching.
  - Being unrestricted in what one watches on TV leads to a statistically significant higher approval of the use of violence to solve disagreements.
The interaction effect is not statistically significant:
- F(1, 20) = 1.47, p > .05
- There is not enough evidence to conclude that type of discipline and restricting TV watching interact to affect the acceptance of violence as a way to solve disagreements.
- Inspection of the graph to aid in interpretation
  - The lines are mostly parallel, so there is no interaction effect.
  - The “Punishment” line is higher than the “Positive reinforcement” line, suggesting a main effect for type of discipline.
  - The “Unrestricted” dots are higher than the “Restricted” dots, suggesting a main effect for restricted TV.
Calculating effect sizes:
- - This a large effect.
- - This is a large effect.
- - This is a small effect.
  - There is little reason to worry about Type II error for the interaction effect.
Determine whether post-hoc tests are needed:
- There are only two rows, so there is no need to calculate a HSD value for the statistically significant row effect: the positive reinforcement row has a statistically lower mean than the punishment row.
- There are only two columns, so there is no need to calculate a HSD value for the statistically significant column effect: the restricted column has a statistically lower mean than the unrestricted column.
- The interaction effect was not statistically significant, so there is no need to calculate the HSD value of the cells for the interaction effect.

The interpretation:

A developmental psychologist investigated the impact of (a) type of parental discipline, positive reinforcement vs. punishment, and (b) restrictions on television viewing on teenagers’ acceptance of violence as a way to solve disagreements. It was found that each variable by itself had a large and independent effect on acceptance of violence. Teens whose parents used punishment to discipline were statistically significantly more accepting of violence than were teens whose parents used positive reinforcement F(1, 20) = 20.51, p < .05. Teens whose television viewing was unrestricted were more accepting of the use of violence than were teens whose television viewing had been restricted F(1, 20) = 21.68, p < .05. This study suggests using positive reinforcement and restricting television viewing could lead to teenagers who are less likely to approve of the use of violence. But, nothing was manipulated in this study, so no conclusions regarding cause and effect can be drawn. Future research should attempt to manipulate either television viewing or type of discipline to see the impact on violence acceptance.

12.97

- A is given as 4.00.
- If and A = 4.00, then B = 12.00.
- If A = 4.00 and then C = 8.00.
- If B = 12.00 and then D = 6.00.

12.99

- Given A = 1.00 and , then D = 4.00.
- The other cells can’t be calculated.
The grand mean, which is 4.00, can be calculated by finding the average of the row means or the column means.

12.101 e.

12.103

SS_Between = 17.50 + 13.75 + 22.25 = 53.50
SS_Total = 53.50 + 44.75 = 98.25

Chapter 13

13.01two; a

13.03association

13.05systematically

13.07cause and effect

13.09Scatterplots

B-25

13.11X; Y

13.13circular shape

13.15undetermined

13.17.75

13.19quantifies; two variables

13.21weaker

13.23positive (direct)

13.25outlier

13.27unrestricted

13.29can

13.31ρ (rho)

13.33no

13.35alternate

13.37–1; 1; 0

13.39I

13.41reject

13.43scatterplot

13.45decision rule

13.4710

13.49negative; inverse

13.51insufficient; relationship

13.53r ² (coefficient of determination)

13.55100

13.57ρ

13.59was not

13.61should

13.63.54

13.65partial correlation

13.67

13.69independent-samples t test

13.71Pearson r

13.73

Random samples is not violated; independence of observations is not violated; normality is probably violated for age in college students; linearity is not violated.
The normality assumption is robust if the sample size is large, so it is OK to proceed with Pearson r.

13.75H₀: ρ = 0; H₁: ρ ≠ 0

13.77df = N – 2 = 63 – 2 = 61

13.79r_cv = ±.666

13.81r_cv = .291

13.83If r ≤ –.438 or r ≥ .438, reject H₀;

If –.438 < r < .438, fail to reject H₀.

13.85

13.87

X	Y	X Deviation Scores	Y Deviation Scores
12	6	3.00	1.00
8	3	–1.00	–2.00
7	6	–2.00	1.00

13.89

X	Y	X Deviation Scores	Y Deviation Scores	Product of Deviation Scores
10	65	–2.33	11.67	–27.19
18	55	5.67	1.67	9.47
9	40	–3.33	–13.33	44.39
M = 12.33	M = 53.33			Σ =26.67

13.91

X	Y	X Deviation Scores	Y Deviation Scores	Product of Deviation Scores	Squared X Deviation Scores	Squared Y Deviation Scores
21	110	–4.33	10.67	–46.20	18.75	113.85
38	90	12.67	–9.33	–118.21	160.53	87.05
17	98	–8.33	–1.33	11.08	69.39	1.77
M = 25.33	M = 99.33			Σ =–153.33	Σ =248.67	Σ =202.67

B-26

13.93

X	Y	X Deviation Scores	Y Deviation Scores	Product of Deviation Scores	Squared X Deviation Scores	Squared Y Deviation Scores
88	23	1.67	–5.00	–8.35	2.79	25.00
93	28	6.67	0.00	0.00	44.49	0.00
78	33	–8.33	5.00	–41.65	69.39	25.00
M =86.33	M =28.00			Σ =–50.00	Σ =116.67	Σ = 50.00
SS_XSS_Y = 5,833.50

13.95

13.97

13.99

Failed to reject null hypothesis;
Results not statistically significant.

13.101r(20) = –.67, p < .05

13.103

Null hypothesis rejected;
Results statistically significant.

13.105

There is not enough evidence to conclude a relationship exists between X and Y in the population; (b) There is no need to comment on the direction of the relationship because not enough evidence exists to conclude there is a relationship.

13.107

r ² = r ² × 100 = .32² × 100 = 10.24%;
Medium effect.

13.109z_r = –0.50

13.111

13.11395%CIz_r = z_r ± 1.96s_r = -0.34 ± 1.96(.24)

= [-0.81, 0.13]

13.115z_r = 0.60

13.117[–.69, –.39]

13.119

Null hypothesis was not rejected;
Results were not statistically significant;
Relationship in the population could be weak;
Unsure of population value of correlation.

13.121Power > .99; β < .01

13.123120 cases

13.125“In this study, the mental health level of a large, random sample of American men was measured, and the men were surveyed regarding how much aerobic exercise they engaged in each week. A moderately strong, direct, and statistically significant relationship was found between the two variables [r(1,000) = .36, p < .05], indicating that men who had higher levels of mental health also had higher levels of exercise. Because this is a correlational study, it is not clear if being mentally healthy leads to exercise, exercising leads to mental health, or a third variable, like physical health, leads to both. Future research should try to tease apart the direction of this relationship.”

13.127

13.129After adjusting for socioeconomic status, there is a statistically significant, but small, relationship between the IQ of the student and the number of absences.

13.131Step 1 Pick a Test

Examining the relationship between two interval and/or ratio variables: Pearson r.

Step 2 Check the Assumptions

Random samples: Violated.
Independence of observations: Not violated.

B-27
Normality: Willing to assume normality for both variables.
Linearity: Not violated—see the scatterplot below. It is OK to proceed with Pearson r.

Step 3 List the Hypotheses

H₀: ρ = 0.
H₁: ρ ≠ 0.

Step 4 Set the Decision Rule

A nondirectional (two-tailed) test, willing to have a 5% chance of a Type I error, so α = .05, df = N –2 = 6 – 2 = 4, so r_cv = ±.811.

If r ≤ –.811 or r ≥ .811, reject H₀.
If –.811 < r < .811, fail to reject H₀.

Step 5 Calculate the Test Statistic

First, calculate means for X and Y so that deviation scores can be calculated.

Step 6 Interpret the Results

Was the null hypothesis rejected?

–.92 ≤ –.811, so reject H₀.
APA format: r(4) = –.92, p < .05.

How big is the effect?
r ² = r ² × 100 = –.92² × 100 = .8464 × 100 = 84.64%.
This is a very large effect.

How wide is the confidence interval?
z_r = –1.59.
= .5773 = .58
95%CI_zr = z_r ± 1.96s_r = –1.59 ± 1.96(.58) = –1.59 ± 1.1368 = from –2.7268 to –0.4532 = from –2.73 to –0.45.
In r units: from –.99 to –.42.

A four-point interpretation:

This study examined, in a small sample of adults from one dental practice, the relationship between the number of years of fluoride use in childhood and the number of cavities in permanent teeth. There was a very strong, statistically significant, inverse relationship [r(4) = –.92, p < .05]. One plausible interpretation is that the more years of fluoride one has as a child, the fewer cavities one develops. However, as this is a correlational study, other explanations are possible. Perhaps parents who make sure their children receive fluoride also ensure that their children brush their teeth. Though the effect found in this sample was large, the sample size was small and the data derived from only one dental practice. Future research should replicate this study with a much larger and more representative sample.

X	Y	X Deviation Scores	Y Deviation Scores	Product of Deviation Scores	Squared X Deviation Scores	Squared Y Deviation Scores
0	10	–7.50	5.00	–37.50	56.25	25.00
18	1	10.50	–4.00	–42.00	110.25	16.00
2	7	–5.50	2.00	–11.00	30.25	4.00
12	3	4.50	–2.00	–9.00	20.25	4.00
3	5	–4.50	0.00	0.00	20.25	0.00
10	4	2.50	–1.00	–2.50	6.25	1.00
M = 7.50	M = 5.00			Σ = –102.00	Σ = 243.50	Σ = 50.00
				SS_XSS_Y = 243.50 × 50.00
				= 12,175.00

B-28

13.133There is insufficient evidence to conclude that X and Y are related in the population; thus, she should not draw any conclusion regarding the direction of the relationship.

13.135g. r ranges from – 1 to 1. – 1.26 is an impossible value.

13.137A small effect requires at least 800 cases.

A medium effect requires at least 85 cases.

A large effect requires at least 30 cases.

Chapter 14

14.01association; predicting; variable; variable

14.03statistically significant

14.05mean

14.07minimizes; squared errors

14.09error

14.11predicted score; predictor

14.13positive

14.15Y-axis

14.17point

14.19the same

14.21error

14.23width

14.25multiple regression; simple regression

14.27≈45

14.29

14.31Every penny increase in the price of gas was associated with families driving, on average, 35 fewer miles on their summer vacation.

14.33a = M_Y – bM_X = 17.50 – (4.33 × 5.00) = 17.50 – 21.6500 = –4.1500 = –4.15

14.35Y'= bX + a = 12.98 X + (–5.00) = 12.98X – 5.00

14.37First, calculate b and a:

a = M_Y – bM_X = 25.00 – (–0.09 × 55.00) = 25.00 – (–4.9500) = 25.00 + 4.9500 = 29.9500 = 29.95

Then complete the regression equation:

Y' = bX + a = –0.09X + 29.95

14.39Y' = bX + a = 0.37X + 15 = (0.37 × 25) + 15 = 9.2500 + 15 = 24.2500 = 24.25

14.41

Y' values can be calculated for X values from 10 to 80.

14.43First calculate the Y' values for the two endpoints:

Y' = 2.50X – 12.50 = (2.50 × 30) –12.50 = 75.0000 – 12.50 = 62.5000 = 62.50

Y' = 2.50X – 12.50 = (2.50 × 70) –12.50 = 175.0000 – 12.50 = 162.5000 = 162.50

The two endpoints are (30, 62.50) and (70, 162.50)

Then draw the regression line:

14.45Residual = Y – Y’ = 12.96 – 13.43 = –0.4700 = –0.47

B-29

14.47

14.49The range of scores, from 400 to 1600, covers 1,200 points. If the average error is 40 points, that’s off by 3% of the range. This seems to be a relatively small amount of error.

14.51Y' = (3.42 × variable 1) + (–0.76 × variable 2) + 5.74 = (3.42 × 56.66) + (–0.76 × 88.99) + 5.74 = 193.7772 + (–67.6324) + 5.74 = 193.7772 – 67.6324 + 5.74 = 131.8848 = 131.88

14.53b

14.55c

14.57The college applauds Shemekia’s knowledge of statistics. But, the error could just as easily go the other way and Shemekia’s first year GPA would be 2.52, well below their admission cutoff score.

14.59Yes. Start with the value on the y-axis. Then move to the right horizontally until you intersect the regression line. Then move vertically downward until you hit the x-axis. This seems to be your predicted value of X for a given value of Y.

Chapter 15

15.01interval; ratio

15.03parametric

15.05fall back

15.07power; parametric

15.09observed; expected

15.11random sample

15.13sample; population

15.15chi-square value; rare zone

15.17not rejected

15.19N

15.21degrees of freedom

15.23statistically significant

15.25relationship; difference

15.27case; cell

15.29dependent variable; independent variable; independent

15.31rows; columns

15.33100

15.35ranks

15.37samples; group; rank

15.39Examining the relationship between two nominal-level variables: chi-square test of independence. (Or, comparing two independent samples on a nominal-level dependent variable: chi-square test of independence.)

15.41Looking at the relationship between two ordinal-level variables: Spearman r

15.43(a) Random samples assumption is not violated; (b) Independence of observations assumption is not violated; (c) All cells have expected frequencies of at least 5 is violated, so can’t proceed with the planned test.

15.45H₀: In the population of small towns, 2% experienced a murder in the past 10 years.

H₁: In the population of small towns, the percentage experiencing a murder in the past 10 years is not equal to 2%.

15.47df = k – 1 = 4 – 1 = 3

15.49df = (R – 1) × (C – 1) = (3 – 1) × (3 – 1) = 2 × 2 = 4

15.51(a) χ²_cv = 5.991 (b) If χ² ≥ 5.991, reject H₀. If χ² < 5.991, fail to reject H₀.

15.53

15.55

15.57

B-30

15.59χ²(1, N = 43) = 4.81, p < .05

15.61χ²(4, N = 55) = 9.49, p < .05

15.63There are more cases than expected in Category I and fewer than expected in Category II.

15.65“This study investigated if fortunetellers had the extrasensory ability to tell whether a randomly drawn card was red or black. Though the fortunetellers judged the color accurately 58% of the time, there was insufficient evidence to show that they performed better than chance [χ²(1, N =124) = 3.23, p > .05]. The possibility exists that fortune tellers have some slight extrasensory abilities, too slight to be found in this sample. The study should be replicated with a larger sample size in order to have a better chance of finding the effect if it exists.”

15.67Step 1 Pick a Test

Seeing if the selection of one’s favorite cola by the sample matches what would occur if the selection were made randomly. This calls for a chi-square goodness-of-fit test.

Step 2 Check the Assumptions

Random samples is not violated.
Independence of observations is not violated.
All expected frequencies more than 5 is not violated.

Step 3 List the Hypotheses

H₀: The distribution of cola choices in the population is random.
H₁: The distribution of cola choices in the population differs from random.

Step 4 Set the Decision Rule

df = k – 1 = 2 – 1 = 1

χ²_cv = 3.841 with α = .05

If χ² ≥ 3.841, then reject H₀. If χ² < 3.841, then fail to reject H₀.

Step 5 Calculate the Test Statistic

Step 6 Interpret the Results

21.10 ≥ 3.841, so reject H₀.
χ²(1, N = 880) = 21.10, p < .05

Comparing observed frequency to expected frequency shows that more people than expected correctly identified their favorite cola:

The ability of consumers to differentiate their favorite cola from three competing brands was tested in this study. If consumers could not differentiate and were making decisions randomly, then they would have selected their preferred cola 25% of the time. They showed a statistically significant improvement over this [χ²(1, N = 880) = 21.10, p < .05], correctly choosing their cola 32% of the time. Though an effect exists, it doesn’t seem very powerful as the majority of consumers could not correctly identify their preferred cola. It would be interesting to retest this sample and find out whether the people who identified the cola correctly can do so consistently.

15.69

Support Gay Marriage: Yes		Support Gay Marriage: No
20s
40s
60s

B-31

15.71

N_Row1 = 34 + 86 = 120

N_Row2 = 67 + 113 = 180

N_Column1 = 34 + 67 = 101

N_Column2 = 86 + 113 = 199

15.73

15.75

15.77

15.79Type I

15.81Outcome A occurs less frequently than expected for Group I and more frequently than expected for Group II, while Outcome B occurs more frequently than expected for Group I and less frequently than expected for Group II.

15.83

Medium effect
No need to worry about Type II error. (As the null hypothesis was rejected, worry about Type I error.)

15.85“A tennis journalist gathered a random sample of points played at major tournaments in order to investigate whether the server’s advantage diminished as the length of the rally increased. She classified rallies as short (the ball went back and forth no more than 2 times) or long (the ball went back and forth more than 2 times). She found that when the rally was short, the server won the point about 62% of the time. In contrast, the server won only 48% of the time on long rallies. This difference is statistically significant [χ²(1, N = 209) = 3.93, p < .05], and it is a meaningful difference. Future research might examine if the same effect is found in nonprofessional players.”

15.87Step 1 Pick a Test

Comparing a nominal-level dependent variable between two independent samples calls for a chi-square test of independence.

Step 2 Check the Assumptions

Random samples is violated. Can proceed, but limitations on generalizability.
Independence of observations is not violated.

B-32
Expected frequencies ≥5 is not violated.

Step 3 List the Hypotheses

H₀: There is no difference in satisfaction with the scar left by sutures vs. staples.

H₁: There is some difference in satisfaction with the scar left by sutures vs. staples.

Step 4 Set the Decision Rule

df = (R – 1) × (C – 1) = (2 – 1) × (2 – 1) = 1 × 1 = 1

χ²_cv = 3.841 with α = .05

If χ² ≥ 3.841, then reject H₀. If χ² < 3.841, then fail to reject H₀.

Step 5 Calculate the Test Statistic

Step 6 Interpret the Results

0.10 < 3.841; fail to reject H₀.
χ²(1, N = 42) = 0.10, p > .05

In this study, patient satisfaction with scars left by sutures vs. staples was measured. There was no statistical difference in level of satisfaction [χ²(1, N = 42) = 0.10, p > .05], suggesting that either could be used to close incisions, at the preference of the surgeon and/or the patient. A positive of this study is that patients were randomly assigned to receive sutures or staples, but only one surgeon provided both. Thus, it is possible that he is better at one technique over the other. Therefore, this study should be replicated using multiple surgeons and with the quality of the suturing and the stapling assessed.

15.89

12	8	20
14	16	30
26	24

15.91This table can’t be completed. If the marginal frequencies are known, a table can be completed if there are as many cells filled in as degrees of freedom (in this case, 2).

15.93If observed frequencies and expected frequencies are all the same, then χ ² = 0 and the null hypothesis is not rejected.

Chapter 16

16.01describe

16.03two variables

16.05explanatory variable (or independent variable)

16.07context

16.09shape; level of measurement

16.11is

16.13Descriptive

16.15Correlational

16.17Experimental

16.19Descriptive

16.21

Ratio;
Continuous

16.23

Nominal;
Discrete

16.25

Interval;
Continuous

16.27Standard deviation is the best answer. Also acceptable are interquartile range, variance, and range.

16.29Bar graph

16.31chi-square goodness-of-fit test

16.33between-subjects, one-way ANOVA

16.35independent-samples t test

16.37single-sample t test

16.39Pearson r

16.41independent-samples t test

16.43independent-samples t test

16.45No such test appears in the book.

16.47Pearson r

16.49chi-square test of independence