You must read each slide, and complete any questions on the slide, in sequence.
Nonexperimental Design
A design in which there is no control or manipulation of the variables. This design does not seek to establish cause and effect and instead focuses on describing or summarizing what takes place.
Experimental Design
A design in which the experimenter controls and manipulates the independent variable and makes comparisons between the different levels, allowing the establishment of cause-and-effect relationships between the independent and dependent variables.
Two-group Design
A design that compares 2 groups or conditions and is the most basic way to establish cause and effect.
Pretest-posttest Design
A design where participants are measured before and after exposure to a treatment or intervention.
Repeated-measures Design
A design where participants are exposed to each level of the independent variable and are measured on the dependent variable after each level.
Independent Variable
The variable that influences the dependent variable. In experiments the researcher manipulates or controls this variable.
Dependent Variable
The variable measured in association with changes in the independent variable; the outcome or effect.
Baseline Measurement
The participants’ initial assessment at the beginning of a study.
Experimental Realism
The degree to which a study participant becomes engrossed in the manipulation and truly influenced by it.
Mundane Realism
The degree to which a study parallels everyday situations in the real world.
Reliability
The stability or consistency of a measure.
Validity
The degree to which a tool measures what it claims to measure.
Sensitivity
The range of data a researcher can gather from a particular instrument.
Order Effect
A threat to internal validity in a within-subjects design resulting from the influence that the sequence of experimental conditions can have on the dependent variable.
Practice Effect
Changes in a participant’s responses or behavior due to increased experience with the measurement instrument, not the variable under investigation.
Fatigue Effect
Deterioration in measurements due to participants becoming tired, less attentive, or careless during the course of the study.
Carryover Effect
Exposure to earlier experimental conditions influencing responses to subsequent conditions.
Sensitization Effect
Continued exposure to experimental conditions in a within-subjects study increasing the likelihood of hypothesis-guessing, potentially influencing participants’ responses in later experimental conditions.
Counterbalancing
Using all potential treatment sequences in a within-subjects design.
Experimental Hypothesis
A clear and specific prediction of how the independent variable influences the dependent variable.
IRB
A board that reviews the ethical merit of all the human research conducted within an institution.
Descriptive
Describes what is happening.
Inferential
Tests a specific prediction about why something occurs.
Within-Subjects Design
In this activity, you will explore the impact of inclusion and exclusion on self-esteem by creating a design to measure change within individuals.
Dr. Melanie Maggard
Dr. Natalie J. Ciarocco, Monmouth University
Dr. David B. Strohmetz, Monmouth University
Dr. Gary W. Lewandowski, Jr., Monmouth University
Something to Think About…
Scenario: Imagine that you are a child again. You are on the playground surrounded by your peers and it’s time for teams to be chosen for a game of kickball. Your hands start to sweat and your heart races as you think, Please let me get picked. Please let me get the chance to play today. You stand there attentively as the team leaders choose their players, but before you can be picked, they reach the number of players they want. You are devastated! As the teams run off to play, you think, Why didn’t I get picked? Do they not like me? What’s wrong with me?
Something to Think About…
Being excluded from social groups can cause us to reconsider how we feel about ourselves, even if these doubts are temporary. As social beings, we tend to feel better when we are part of a group, not excluded from one. Now, we are going to investigate this concept by exploring the impact that being included in or excluded from playing a game can have on our self-esteem. Maybe it isn’t just kids who can experience the feeling of being left out!
Our Research Question
Based on your experiences with being included in or excluded from groups, you can develop a research study that examines the impact of social exclusion on self-esteem. But first, you will need a framework to help you explore this topic. Research studies all start with a question, so here is your chance to ask one of your own.
Now that you have a research question (“Does being included in or excluded from playing a game with others cause an increase or decrease in young adults’ self-esteem?”), you must decide which type of research design will best answer your research question. To narrow things down, consider the following:
Since comparisons must be made in order to answer your research question (“Does being included in or excluded from playing a game with others cause an increase or decrease in young adults’ self-esteem?”), consider the following types of experimental designs:
Now that you know you have an experimental design that compares the pretest to inclusion in a game to exclusion from a game, you can identify your independent and dependent variables.
Because you have an experiment with 1 independent variable and 3 levels (Pretest/no game vs. Inclusion in game vs. Exclusion from game) that each participant is exposed to, you will use a type of within-subjects design called a repeated-measures design. The pretest/no game level serves as a baseline measurement to which the other measurements can be compared.
Next, we need to operationally define the independent variable (IV) of game condition by determining exactly how we will manipulate it. As we do, we’ll want to be sure our study has a high level of experimental and mundane realism.
It looks like the task that is highest in experimental and mundane realism involves young adults playing a game of “cyberball.” We know that all participants will be measured at 3 points in time: pretest (before the study begins), after being included in the game, and after being excluded from the game. Therefore, we will have the following design:
Summary of Our Within-Subjects Study
Pretest Measure | IV Level 1 | Time 1 Measure | IV Level 2 | Time 2 Measure
Self-esteem at Baseline | Inclusion in Game | Self-esteem after Inclusion | Exclusion from Game | Self-esteem after Exclusion
Operationally Defining the Dependent Variable
You have now established the key comparison between Pretest/No game vs. Inclusion in game vs. Exclusion from game. Next, we need to specify the exact nature of our dependent variable, self-esteem. First, consider the following:
We know we want to use a self-report measure to measure self-esteem. Now it is time to determine which type of self-report measure to use. Keep in mind how many and what types of questions, reliability, validity, and sensitivity would be ideal for young adults.
Since we have the potential for multiple order effects in this study, we must consider how to minimize their impact. Fortunately, we do not think that the potential for practice and sensitization effects will drastically impact the results, so we decide to keep the same measure of self-esteem, the State Self-Esteem Scale (SSES), constant throughout the study. However, we do think it would be worthwhile to reduce the impact of the carryover effect by using counterbalancing.
Let’s update the chart we made earlier to reflect the counterbalancing method we have chosen for this study. Notice how we now have a second sequence that allows us to measure self-esteem after being excluded from a game prior to exposure to the inclusion level, thus covering all possible sequences in our study.
Summary of Our Within-Subjects Study
Sequence | Pretest Measure | IV Level 1 | Time 1 Measure | IV Level 2 | Time 2 Measure
#1 | Self-esteem (SSES) at Baseline | Inclusion in Game | Self-esteem (SSES) after Inclusion | Exclusion from Game | Self-esteem (SSES) after Exclusion
#2 | Self-esteem (SSES) at Baseline | Exclusion from Game | Self-esteem (SSES) after Exclusion | Inclusion in Game | Self-esteem (SSES) after Inclusion
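As a minimal sketch of the counterbalancing summarized above, the following Python generates every possible order of the two game conditions and deals participants across those orders as evenly as possible. The function name, the participant IDs, and the random seed are illustrative assumptions, not part of the study materials.

```python
import itertools
import random


def counterbalance(participant_ids, levels, seed=0):
    """Assign each participant to one of all possible orders of the levels."""
    sequences = list(itertools.permutations(levels))  # every treatment order
    ids = list(participant_ids)
    random.Random(seed).shuffle(ids)  # randomize who receives which order
    # Deal the shuffled participants round-robin across the sequences.
    return {pid: sequences[i % len(sequences)] for i, pid in enumerate(ids)}


levels = ["Inclusion", "Exclusion"]
assignment = counterbalance(range(101, 131), levels)  # 30 hypothetical IDs

# Count how many participants ended up in each sequence.
counts = {}
for order in assignment.values():
    counts[order] = counts.get(order, 0) + 1
print(counts)
```

With 2 levels there are only 2 sequences, so complete counterbalancing is cheap here. With more levels the number of sequences grows quickly (3 levels give 6 orders, 4 levels give 24).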
Determining Your Hypothesis
Now that you have determined what you will manipulate and measure, you must formulate an experimental hypothesis.
Now that you have determined how you will collect your data and your intended sample, you must submit your research procedure to the Institutional Review Board (IRB) for ethical approval. The IRB or ethics board will determine whether or not your study meets all ethical guidelines.
IRB
Each IRB has its own protocol, conforming to the national standard, for researchers submitting proposed research for review. In addition to the appropriate paperwork and other information submitted to the IRB, the board would consider the following description during their evaluation of your proposed experiment:
The purpose of this research is to determine whether being included in or excluded from playing a virtual game of “cyberball” will result in a change to self-esteem. To study this topic, 30 participants will be randomly selected from the research participant pool at the University. Researchers will measure all participants’ self-esteem via the State Self-Esteem Scale (SSES) at the beginning of the study, after being included in a virtual game of “cyberball” for 5 minutes, and after being excluded from a virtual game of “cyberball” for 5 minutes. Counterbalancing will be used such that half of the participants will receive the inclusion-exclusion sequence and half will receive the exclusion-inclusion sequence. Participants will be debriefed at the end of the study.
Responding to the IRB
The IRB reviewed your submission and has 1 concern. Although the study appears to present less than minimal risk to participants, there is no mention of informed consent and voluntary participation.
You must now determine how to respond to the IRB, keeping in mind the ethics of respect for persons and autonomy.
Now that we have secured the IRB’s approval, we should determine what the entire study will look like. Below are the steps of the study; can you place them in the proper order? (Note: The State Self-Esteem Scale is referred to as SSES.)
A. Participants take the SSES for the pretest measure.
B. Obtain informed consent.
C. Participants play their first game of “cyberball,” which is programmed to include them in or exclude them from the game (depending on their sequence).
D. Participants take the SSES after the second game.
E. Debrief the participants.
F. Give participants instructions for how to play “cyberball” and progress through the 2 games.
G. Participants play their second game of “cyberball,” which is programmed to include them in or exclude them from the game (depending on their sequence).
H. Participants take the SSES after the first game.
Collecting Data
Now that you have a sense of how to conduct this study, it is time to see what data from this study might look like.
If you were to run a full version of this study, you would want to have at least 30 participants. Because you have a within-subjects design, each participant will be exposed to all levels of the independent variable.
Example Data Set
This is an example of what your data set would look like. The top row shows the variable names; the other rows display the data for the first 5 participants in each sequence.
In the “Sequence” column, a 1 = Inclusion-Exclusion sequence, and a 2 = Exclusion-Inclusion sequence. The Baseline (Pretest/No game), Inclusion, and Exclusion columns represent a participant’s score measured via the SSES prior to the study, after inclusion in the game, and after exclusion from the game.
Participant Number | Sequence | Baseline | Inclusion | Exclusion
101 | 1 | 58 | 58 | 43
102 | 1 | 91 | 94 | 82
103 | 1 | 75 | 85 | 67
104 | 1 | 23 | 24 | 9
105 | 1 | 65 | 63 | 61
116 | 2 | 81 | 85 | 78
117 | 2 | 61 | 60 | 49
118 | 2 | 53 | 57 | 46
119 | 2 | 20 | 29 | 6
120 | 2 | 80 | 85 | 67
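To get a feel for a data set like this before running any formal test, you can summarize each condition and look at each participant's change from baseline. The sketch below uses only the 10 example rows shown above, so its numbers will not match output based on all 30 participants. The variable names are illustrative.

```python
from statistics import mean, stdev

# (participant, sequence, baseline, inclusion, exclusion) from the example rows
rows = [
    (101, 1, 58, 58, 43), (102, 1, 91, 94, 82), (103, 1, 75, 85, 67),
    (104, 1, 23, 24, 9), (105, 1, 65, 63, 61), (116, 2, 81, 85, 78),
    (117, 2, 61, 60, 49), (118, 2, 53, 57, 46), (119, 2, 20, 29, 6),
    (120, 2, 80, 85, 67),
]

# Column position of each condition within a row.
conditions = {"Baseline": 2, "Inclusion": 3, "Exclusion": 4}

# Mean and sample standard deviation for each condition.
summary = {}
for name, col in conditions.items():
    values = [r[col] for r in rows]
    summary[name] = (mean(values), stdev(values))
    print(f"{name}: M = {summary[name][0]:.2f}, SD = {summary[name][1]:.2f}")

# Change from baseline per participant: (after inclusion, after exclusion).
changes = {r[0]: (r[3] - r[2], r[4] - r[2]) for r in rows}
print(changes[101])
```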
Selecting the Proper Tool
Now that you have collected your data, you must decide the best way to summarize your findings. The decisions you made about how to collect your data dictate the statistics you can use with your data now. First, you need to consider if your study is descriptive or inferential.
The following is an example of output for another 3-level design where participants experienced all 3 conditions in the study. This study was about how hours slept at night (6 hours, 8 hours, and 10 hours) influence self-reported happiness. The notes after the table explain each element of the output.
To report these numbers in a results section, put the numbers in as follows:
F(#, #) = #.##, p = .##, η² = .##.
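As a rough illustration of the reporting pattern above, here is a small Python helper that fills it in. The function names are hypothetical, the example values are taken from the sleep study's output, and printing p to 3 decimals (or as p < .001 when the output shows .000) is an assumption about house style rather than a rule stated in this activity.

```python
def apa_stat(value, digits=2):
    """Format a non-negative statistic bounded by 1 without a leading zero."""
    return f"{value:.{digits}f}".lstrip("0")


def report_anova(f, df_effect, df_error, p, eta_sq):
    """Build a results-section string for a one-way ANOVA."""
    # Output that displays .000 is conventionally reported as p < .001.
    p_text = "p < .001" if p < 0.001 else f"p = {apa_stat(p, 3)}"
    return (f"F({df_effect}, {df_error}) = {f:.2f}, {p_text}, "
            f"η² = {apa_stat(eta_sq)}")


# 0.000 stands in for the ".000" shown in the Sig. column.
print(report_anova(67.230, 2, 58, 0.000, 0.699))
```

The first number in the parentheses is the df for the main effect and the second is the df for error.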
Tests of Within-Subjects Effects
Measure: MEASURE_1
Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared
Hours: Sphericity Assumed | 91.289 | 2 | 45.644 | 67.230 | .000 | .699
Hours: Greenhouse-Geisser | 91.289 | 1.694 | 53.903 | 67.230 | .000 | .699
Hours: Huynh-Feldt | 91.289 | 1.787 | 51.074 | 67.230 | .000 | .699
Hours: Lower-bound | 91.289 | 1.000 | 91.289 | 67.230 | .000 | .699
Error(Hours): Sphericity Assumed | 39.378 | 58 | .679 | | |
Error(Hours): Greenhouse-Geisser | 39.378 | 49.113 | .802 | | |
Error(Hours): Huynh-Feldt | 39.378 | 51.834 | .760 | | |
Error(Hours): Lower-bound | 39.378 | 29.000 | 1.358 | | |
This is the df or degrees of freedom. An ANOVA has 2 dfs, one for the main effect (within-groups) and one for the error (residual).
This is the F statistic. It represents the size of the difference between condition means compared to the size of the residual error.
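This is the p level or the significance level. It represents the probability of obtaining results at least this extreme if there were truly no difference between the conditions, that is, by chance alone. The lower the p level, the less likely it is that chance alone would produce such a result. Because the output shows .000, this would be reported as p < .001 in the results.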
The F score and p level will only tell you whether there is a significant difference. To determine which means are different, and the nature or direction of those differences, you need to look at the means via a post-hoc test.
The eta squared (η²) is the effect size; the output labels it Partial Eta Squared. It tells us the proportion of variability in the dependent variable that is associated with the different levels of the independent variable.
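The columns of the output are tied together by simple arithmetic, which the following sketch checks using the Sphericity Assumed rows of the sleep study table. The variable names are illustrative; the formulas are the standard ones for mean squares, F, and partial eta squared.

```python
# Values copied from the "Sphericity Assumed" rows of the sleep study output.
ss_effect, df_effect = 91.289, 2  # Hours
ss_error, df_error = 39.378, 58   # Error(Hours)

# Mean Square = Sum of Squares / df
ms_effect = ss_effect / df_effect
ms_error = ss_error / df_error

# F compares variability between conditions to residual error.
f_stat = ms_effect / ms_error

# Partial eta squared = effect SS / (effect SS + error SS)
partial_eta_sq = ss_effect / (ss_effect + ss_error)

print(f"MS effect = {ms_effect:.3f}, MS error = {ms_error:.3f}")
print(f"F = {f_stat:.2f}, partial eta squared = {partial_eta_sq:.3f}")
```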
Tutorial: Evaluating Output
Pairwise Comparisons
Measure: MEASURE_1
(I) Hours | (J) Hours | Mean Difference (I-J) | Std. Error | Sig. (b) | 95% Confidence Interval for Difference (b): Lower Bound | Upper Bound
1 | 2 | -2.200* | .222 | .000 | -2.762 | -1.638
1 | 3 | -.133 | .164 | .808 | -.549 | .283
2 | 1 | 2.200* | .222 | .000 | 1.638 | 2.762
2 | 3 | 2.067* | .244 | .000 | 1.448 | 2.685
3 | 1 | .133 | .164 | .808 | -.283 | .549
3 | 2 | -2.067* | .244 | .000 | -2.685 | -1.448
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Sidak.
The results presented here are from the post-hoc test, which compares each of the groups’ means to all of the other groups’ means.
This is the difference between the mean happiness rating for the 6 hours of sleep and 8 hours of sleep conditions.
This is the difference between the mean happiness rating for the 6 hours of sleep and 10 hours of sleep conditions.
This is the difference between the mean happiness rating for the 8 hours of sleep and 10 hours of sleep conditions.
Happiness was different in the 8 hours sleep condition from the 6 hours and 10 hours sleep conditions, which had similar ratings.
The post-hoc test tells us which comparisons between the means were significant. The p level tells us the significance level of that comparison.
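The footnote on the post-hoc table mentions a Sidak adjustment for multiple comparisons. As a hedged sketch, the standard Šidák formula inflates each unadjusted p value according to the number of comparisons, m, using 1 - (1 - p)^m. The unadjusted p values below are hypothetical, and whether your statistics package applies exactly this step is an assumption.

```python
def sidak_adjust(p_values):
    """Apply the Sidak adjustment: p_adj = 1 - (1 - p) ** m."""
    m = len(p_values)  # number of pairwise comparisons
    return [1 - (1 - p) ** m for p in p_values]


# Hypothetical unadjusted p values for 3 pairwise comparisons.
unadjusted = [0.0004, 0.41, 0.0001]
adjusted = sidak_adjust(unadjusted)

for raw, adj in zip(unadjusted, adjusted):
    print(f"unadjusted p = {raw:.4f} -> adjusted p = {adj:.4f}")
```

The adjustment makes each comparison harder to pass, which keeps the overall chance of a false positive across all comparisons near the chosen significance level.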
Tutorial: Evaluating Output
Descriptive Statistics
Condition | Mean | Std. Deviation | N
Six | 1.77 | .728 | 30
Eight | 3.97 | .809 | 30
Ten | 1.90 | .712 | 30
This is the average or mean (M) happiness rating after 6 hours of sleep.
This is the standard deviation (SD) of happiness rating after 6 hours of sleep.
This is the average or mean (M) happiness rating after 8 hours of sleep.
This is the standard deviation (SD) of happiness rating after 8 hours of sleep.
This is the average or mean (M) happiness rating after 10 hours of sleep.
This is the standard deviation (SD) of happiness rating after 10 hours of sleep.
In this case, the means tell us that happiness ratings were highest after 8 hours of sleep. The results from the post-hoc test support the finding that happiness ratings were similar for 6 and 10 hours of sleep, but increased significantly with 8 hours of sleep.
Your Turn: Evaluating Output
Below is the output from your study:
Tests of Within-Subjects Effects
Measure: MEASURE_1
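Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared
GameCondition: Sphericity Assumed | 2081.156 | 2 | 1040.578 | 101.918 | .000 | .778
GameCondition: Greenhouse-Geisser | 2081.156 | 1.816 | 1146.056 | 101.918 | .000 | .778
GameCondition: Huynh-Feldt | 2081.156 | 1.930 | 1078.07 | 101.918 | .000 | .778
GameCondition: Lower-bound | 2081.156 | 1.000 | 2081.156 | 101.918 | .000 | .778
Error(GameCondition): Sphericity Assumed | 592.178 | 58 | 10.210 | | |
Error(GameCondition): Greenhouse-Geisser | 592.178 | 52.662 | 11.245 | | |
Error(GameCondition): Huynh-Feldt | 592.178 | 55.983 | 10.578 | | |
Error(GameCondition): Lower-bound | 592.178 | 29.000 | 20.420 | | |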
Your Turn: Evaluating Output
Below is the output from your study:
Pairwise Comparisons
Measure: MEASURE_1
(I) GameCondition | (J) GameCondition | Mean Difference (I-J) | Std. Error | Sig. (b) | 95% Confidence Interval for Difference (b): Lower Bound | Upper Bound
1 | 2 | -3.133* | .728 | .001 | -4.979 | -1.288
1 | 3 | 8.267* | .788 | .000 | 6.271 | 10.262
2 | 1 | 3.133* | .728 | .001 | 1.288 | 4.979
2 | 3 | 11.400* | .944 | .000 | 9.008 | 13.792
3 | 1 | -8.267* | .788 | .000 | -10.262 | -6.271
3 | 2 | -11.400* | .944 | .000 | -13.792 | -9.008
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Sidak.
Descriptive Statistics
Condition | Mean | Std. Deviation | N
Baseline | 61.10 | 25.694 | 30
Inclusion | 64.23 | 24.694 | 30
Exclusion | 52.83 | 26.478 | 30
Your Turn: Evaluating Output
Based on the results of your statistical analyses on Screens 32 and 33, match the correct number in the “Answer” column to the term requested under “Prompt”:
Prompt | Answer
F for the ANOVA test | 101.918
df for the main effect of condition (within-groups) | 2
df for error (residual) | 58
p for the ANOVA test | 0.00
Mean difference between Baseline and Inclusion | 3.133
p for the difference between Baseline and Inclusion | 0.001
Mean difference between Baseline and Exclusion | 8.267
p for the difference between Baseline and Exclusion | 0.00
Mean difference between Inclusion and Exclusion | 11.400
p for the difference between Inclusion and Exclusion | 0.00
η² | 0.778
Activity: Graphing Results
Based on the data provided, drag each Game Condition bar to the correct Mean SSES Score.
Descriptive Statistics
Condition | Mean | Std. Deviation | N
Baseline | 61.10 | 25.694 | 30
Inclusion | 64.23 | 24.694 | 30
Exclusion | 52.83 | 26.478 | 30
[Bar graph: “Game Condition & Self-Esteem”; y-axis: Mean SSES Score; x-axis: Game Condition]
Your Turn: Results
Now that you have worked with your data, you must determine the best way to express your findings in written form. You must be sure that how you describe your findings accurately represents the data.
You have determined how to express your findings in a scientifically responsible way. Now, you need to be able to talk about what your findings mean in everyday terms so that the world can benefit from your science.