alpha (or alpha level) – the probability of making a Type I error; the probability that a result will fall in the rare zone and the null hypothesis will be rejected when the null hypothesis is true; often called significance level; abbreviated α; usually set at .05 or 5%.
alternative hypothesis – abbreviated H₁; a statement that the explanatory variable has an effect on the outcome variable in the population; usually, a statement of what the researcher believes to be true.
analysis of variance (ANOVA) – a family of statistical tests for comparing the means of two or more groups.
apparent limits – what seem to be the upper and lower bounds of an interval in a grouped frequency distribution.
bar graph – a graph of a frequency distribution for discrete data that uses the heights of bars to indicate frequency; the bars do not touch.
beta – the probability of making a Type II error; abbreviated β.
between-group variability – variability in scores that is primarily due to the different treatments that different groups receive.
between-subjects – ANOVA terminology for independent samples.
between-subjects, one-way ANOVA – a statistical test used to compare the means of two or more independent samples when there is just one explanatory variable.
cases – the participants in or subjects of a study.
central limit theorem – a statement about the shape that a sampling distribution of the mean takes if the size of the samples is large and every possible sample were obtained.
central tendency – a value used to summarize a set of scores; also known as the average.
chi-square goodness-of-fit test – a nonparametric, single-sample test used to compare the distribution of a categorical (nominal- or ordinal-level) outcome variable in a sample to a known population value.
chi-square test of independence – a nonparametric test used to determine whether two or more populations of cases differ on a categorical (nominal- or ordinal-level) outcome variable.
clinical significance (or practical significance) – whether the size of the effect is large enough to say the explanatory variable has a meaningful impact on clinical outcome.
coefficient of determination – formal name for the effect size r².
Cohen’s d – a standardized measure of effect used to measure the difference between means.
common zone – the section of the sampling distribution of a test statistic in which the observed outcome should fall if the null hypothesis is true; typically set to be the middle 95%.
confidence interval – a range within which it is estimated, based on a sample value, that a population value falls.
confounding variable – a third variable in correlational and quasi-experimental designs that is not controlled for and that has an impact on both of the other variables.
consent rate – the percentage of targeted subjects who agree to participate in a study.
contingency table – a table showing the degree to which a case’s value on the outcome variable depends on its category on the explanatory variable.
continuous number – number that answers the question “how much” and can have “in-between” values; the specificity of the number, the number of decimal places reported, depends on the precision of the measuring instrument.
convenience sample – a sampling strategy in which cases are selected for study based on the ease with which they can be obtained.
correlation coefficient – a statistic that summarizes, in a single number, the strength of a relationship between two variables.
correlational design – a scientific study in which the relationship between two variables is examined without any attempt to manipulate or control them.
criterion variable – the outcome variable in a correlational design.
critical value – the value of the test statistic that forms the boundary between the rare zone and the common zone of the sampling distribution of the test statistic.
critical value of t – value of t used to determine whether a null hypothesis is rejected or not; abbreviated t_cv.
crossed – a factorial ANOVA in which each level of each explanatory variable occurs with each level of the other explanatory variable.
cumulative frequency – a count of how often a given value, or a lower value, occurs in a set of data.
cumulative percentage – cumulative frequency expressed as a percentage of the number of cases in the data set.
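The two cumulative entries above can be illustrated with a minimal Python sketch. The function name `cumulative_table` and the scores are hypothetical, chosen only for illustration.

```python
from collections import Counter

def cumulative_table(scores):
    """Return rows of (value, frequency, cumulative frequency, cumulative percentage)."""
    counts = Counter(scores)
    n = len(scores)
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]  # cases at this value or a lower value
        rows.append((value, counts[value], running, 100 * running / n))
    return rows

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]  # hypothetical data set of 10 cases
for row in cumulative_table(scores):
    print(row)
```

The cumulative frequency in the last row always equals the number of cases, so the last cumulative percentage is 100.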
degrees of freedom (df) – the number of values in a sample that are free to vary.
dependent samples – samples in which the selection of cases for one group is related to, influences, or is determined by case selection for another group.
dependent variable – the variable where the effect is measured in an experimental or quasi-experimental study; an outcome variable.
descriptive statistic – a summary statement about a set of cases.
descriptive statistics – statistics used to describe a set of observations.
deviation score – a measure of how far away a score falls from the mean.
difference tests – statistical tests that look for differences among groups of cases.
direct relationship – a relationship in which high scores on X are associated with high scores on Y; also called a positive relationship.
discrete number – number that answers the question “how many,” takes whole-number values, and has no “in-between” values.
effect size – a measure of the degree of impact of the explanatory variable on the outcome variable.
eta squared (η²) – an effect size that calculates the percentage of variability in the outcome variable accounted for by the explanatory variable.
experimental design – a scientific study in which an explanatory variable is manipulated or controlled by the experimenter and the effect that is measured in a dependent variable allows for a cause and effect conclusion.
explanatory variable – the variable that causes, predicts, or explains the outcome variable.
extreme percentage – percentage of the normal distribution that is found in the two tails and is evenly divided between them.
factor – term for an explanatory variable in ANOVA.
factorial ANOVA – an analysis of variance in which there is more than one explanatory variable.
frequency distribution – a tally of how often different values of a variable occur in a set of data.
frequency polygon – a frequency distribution for continuous data, displayed in graphical format, using a line connecting dots above interval midpoints to indicate frequency.
grouped frequency distribution – a count of how often the values of a variable, grouped into intervals, occur in a set of data.
grouping variable – the variable that is the explanatory variable in a quasi-experimental design.
histogram – a frequency distribution for continuous data, displayed in graph form, using the heights of bars to indicate frequency; the bars touch each other.
hypothesis – a proposed explanation for observed facts; a statement or prediction about a population value.
hypothesis testing – a statistical procedure in which data from a sample are used to evaluate a hypothesis about a population.
independence – in probability, when the occurrence of one outcome does not have any impact on the occurrence of a second outcome.
independent samples – when the selection of cases for one sample has no impact on the selection of cases for another sample.
independent-samples t test – an inferential statistical test used to compare two independent samples on an interval- or ratio-level outcome variable.
independent variable – the variable that is controlled by the experimenter in an experimental design.
individual differences – attributes that vary from case to case.
inferential statistic – using observations from a sample to draw a conclusion about a population.
interaction effect – situation, in factorial ANOVA, in which the impact of one explanatory variable on the outcome variable depends on the level of another explanatory variable.
interquartile range – a measure of variability for interval- or ratio-level data; the distance covered by the middle 50% of scores; abbreviated IQR.
interval estimate – an estimate of a population value that says the population value falls somewhere within a range of values.
interval-level numbers – numbers that provide information about how much of an attribute is possessed, as well as information about same/different and more/less; interval-level numbers have equality of units and an arbitrary zero point.
inverse relationship – a relationship in which high scores on X are associated with low scores on Y; also called a negative relationship.
kurtosis – how peaked or flat a frequency distribution is.
least squares criterion – prediction errors are squared and the best-fitting regression line is the one that has the smallest sum of squared errors.
level – ANOVA terminology for a category of an explanatory variable.
linear regression – a predictor variable is used to predict a case’s score on another variable and the prediction equation takes the form of a straight line.
longitudinal research (or repeated-measures design) – a study in which the same participants are measured at two or more points in time.
main effect – the impact of an explanatory variable, by itself, on the outcome variable.
Mann–Whitney U test – a nonparametric test used to compare two independent samples on an ordinal-level outcome variable.
matched pairs – participants are grouped into sets of two based on their being similar on potential confounding variables.
mean – an average calculated for interval- or ratio-level data by summing all the values in a data set and dividing by the number of cases; abbreviated M.
median – an average calculated by finding the score associated with the middle case, the case that separates the top half of scores from the bottom half; abbreviated Mdn; can be calculated for ordinal-, interval-, or ratio-level data.
middle percentage – percentage of the normal distribution found around the midpoint, evenly divided into two parts, one just above the mean and one just below it.
midpoint – the middle of an interval in a grouped frequency distribution.
modality – the number of peaks that exist in a frequency distribution.
mode – the score that occurs with the greatest frequency.
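The three averages defined above (mean, median, mode) can be compared on one small data set. This is a minimal sketch using Python's standard `statistics` module; the scores are hypothetical.

```python
import statistics

scores = [2, 3, 3, 5, 7]  # hypothetical interval-level scores

# Mean: sum all the values and divide by the number of cases.
mean = sum(scores) / len(scores)

# Median: the score associated with the middle case.
median = statistics.median(scores)

# Mode: the score that occurs with the greatest frequency.
mode = statistics.mode(scores)

print(mean, median, mode)
```

For these scores the mean is 4.0 while the median and the mode are both 3; the high score of 7 pulls the mean upward but leaves the other two averages unchanged.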
multiple linear regression – prediction in which multiple predictor variables are combined to predict an outcome variable.
negative relationship – a relationship in which high scores on X are associated with low scores on Y; also called an inverse relationship.
negative skew – an asymmetrical frequency distribution in which the tail extends to the left, in the direction of lower scores.
nominal-level numbers – numbers used to place cases in categories; numbers are arbitrary and only provide information about same/different.
nonparametric test – a statistical test for use with nominal- or ordinal-level outcome variables, and for which assumptions about the shape of the population don’t have to be met.
nonrobust assumption – an assumption for a statistical test that must be met in order to proceed with the test.
normal distribution – also called the normal curve; a specific bell-shaped curve defined by the percentage of cases that fall in specific areas under the curve.
null hypothesis – abbreviated H₀; a statement that in the population the explanatory variable has no impact on the outcome variable.
one-tailed hypothesis test – hypothesis that predicts the explanatory variable has an impact on the outcome variable in a specific direction.
ordinal-level numbers – numbers used to indicate if more or less of an attribute is possessed; numbers provide information about same/different and more/less.
outcome variable – the variable that is caused, predicted, or influenced by the explanatory variable; the variable in a relationship test, Y, that is predicted from the other variable, X; sometimes called the dependent variable.
outlier – an extreme (unusual) score that falls far away from the rest of the scores in a set of data.
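p value – the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true; compared with the alpha level (significance level) to decide whether to reject the null hypothesis.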
paired samples – samples in which case selection for one sample is influenced by, or depends on, the cases selected for another sample.
paired-samples t test – hypothesis test used to compare the means of two dependent samples; also known as dependent-samples t test, correlated-samples t test, related-samples t test, matched-pairs t test, within-subjects t test, or repeated-measures t test.
parameter – a value that summarizes a population.
parametric test – a statistical test for use with interval- or ratio-level outcome variables, and for which assumptions about the shape of the population must be met.
partial correlation – a correlation between two variables from which the influence of a third variable has been mathematically removed.
Pearson correlation coefficient – a statistical test that measures the degree of linear relationship between two interval/ratio-level variables.
percentile rank – percentage of cases with scores at or below a given level in a frequency distribution.
perfect relationship – a relationship between two variables in which the value of one can be exactly predicted from the other.
point estimate – an estimate of a population value that is a single value.
pooled variance – the average variance for two samples.
population – the larger group of cases a researcher is interested in studying.
positive relationship – a relationship in which high scores on X are associated with high scores on Y; also called a direct relationship.
positive skew – an asymmetrical frequency distribution in which the tail extends to the right, in the direction of higher scores.
post-hoc test – a follow-up test to a statistically significant ANOVA, engineered to find out which pairs of means differ while keeping the overall alpha level at the chosen level.
power – the probability of rejecting the null hypothesis when the null hypothesis should be rejected.
practical significance (or clinical significance) – the size of the effect is large enough to say the explanatory variable has a meaningful impact on the outcome variable (or the clinical outcome).
prediction interval – a range around Y′ within which there is some certainty that a case’s real value of Y falls.
predictor variable – the variable in a relationship test, X, that is used to predict the other variable, Y; the explanatory variable in a correlational design.
pre-post design – participants are measured on the dependent variable before and after an intervention or manipulation.
probability – how likely an outcome is; the number of ways a specific outcome can occur, divided by the total number of possible outcomes.
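The counting definition of probability above can be sketched in Python. The sketch assumes every possible outcome is equally likely; the helper `probability` and the two-dice example are hypothetical illustrations.

```python
from fractions import Fraction
from itertools import product

def probability(event, outcomes):
    """Ways the event can occur divided by the total number of possible outcomes.

    Assumes all outcomes are equally likely.
    """
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

# All 36 possible rolls of two six-sided dice.
two_dice = list(product(range(1, 7), repeat=2))

# Probability that the two dice sum to 7 (6 of the 36 rolls).
p_seven = probability(lambda roll: sum(roll) == 7, two_dice)
print(p_seven)
```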
quasi-experimental design – a scientific study in which cases are classified into naturally occurring groups and then compared on a dependent variable.
r² – an effect size that reveals the percentage of variability in one variable that is accounted for by the other variable; formally called coefficient of determination.
random assignment – every case has an equal chance of being assigned to any group in an experiment; random assignment is the hallmark of an experiment.
random sample – a sampling strategy in which each case in the population has an equal chance of being selected.
range – a measure of variability for interval- or ratio-level data; the distance from the lowest score to the highest score.
rare zone – the section of the sampling distribution of a test statistic in which it is unlikely an observed outcome will fall if the null hypothesis is true; typically, 5% of the sampling distribution.
ratio-level numbers – numbers that have all the attributes of interval-level numbers, plus a real zero point; numbers that provide information about same/different, more/less, how much of an attribute is possessed, and that can be used to calculate a proportion.
real limits – what are really the upper and lower bounds of a single continuous number or of an interval in a grouped frequency distribution.
regression line – the best-fitting straight line for predicting Y from X.
relationship tests – statistical tests that determine if two variables in a group of cases covary.
repeated-measures ANOVA – a statistical test used to compare two or more dependent samples on an interval- or ratio-level dependent variable; also called within-subjects ANOVA, dependent-samples ANOVA, or related-samples ANOVA.
repeated-measures design (or longitudinal research) – a study in which the same participants are measured at two or more points in time.
replicate – to repeat a study, usually introducing some change in procedure to make it better.
representative – the attributes of the population are present in the sample in approximately the same proportion as in the population.
residual – the difference between an actual score and a predicted score; the size of the error in prediction.
robust assumption – an assumption for a statistical test that can be violated to some degree and it is still OK to proceed with the test.
sample – a group of cases selected from a population.
sampling distribution – a frequency distribution generated by taking repeated, random samples from a population and generating some value, like a mean, for each sample.
sampling error – discrepancies, due to random factors, between a sample statistic and a population parameter.
self-selection bias – a nonrepresentative sample that may occur when the subjects who agree to participate in a research study differ from those who choose not to participate.
significance level – the probability of Type I error that a researcher is willing to accept; the same as alpha level.
simple linear regression – prediction in which Y′ is predicted from a single predictor variable.
single-sample test – a statistical test used to compare the results in a sample to a known population value or a specified value.
single-sample t test – a statistical test that compares a sample mean to a population mean when the population standard deviation is not known.
skewness – the degree to which a set of scores is not symmetric but tails off in one direction or the other.
slope – the tilt of the line; rise over run; how much up or down change in Y is predicted for each 1-unit change in X.
Spearman rank-order correlation coefficient – a nonparametric test that examines the relationship between two ordinal-level variables or one ordinal and an interval/ratio variable.
standard deviation – a measure of variability for interval- or ratio-level data; the square root of the variance; a measure of the average distance that scores fall from the mean.
standard error of the estimate – the standard deviation of the residual scores, a measure of error in regression.
standard error of the mean – the standard deviation of a sampling distribution of the mean.
standard error of the mean difference for difference scores – the standard deviation of the sampling distribution of difference scores, abbreviated SMD; used as the denominator in the paired-samples t test equation.
standard score – raw score expressed in terms of how many standard deviations it falls away from the mean; also known as a z score.
statistic – a value that summarizes data from a sample.
statistical significance – the observed difference between sample means is large enough to conclude that it represents a difference between population means.
statistically significant – when a researcher concludes that the observed sample results are different from the null-hypothesized population value.
statistics – techniques used to summarize data in order to answer questions.
stem-and-leaf display – a data summary technique that combines features of a table and a graph.
sum of squares – squaring a set of deviation scores and then adding together the squared deviation scores; abbreviated SS.
sum of squares between (SSBetween) – a sum of the squared deviation scores representing the variability between groups.
sum of squares total (SSTotal) – a sum of the squared deviation scores representing all the variability in the scores.
sum of squares within (SSWithin) – a sum of the squared deviation scores representing the variability within groups.
treatment effect – the impact of the explanatory variable on the dependent variable.
two-samples t test – an inferential statistical test used to compare the mean of one sample to the mean of another sample.
two-tailed hypothesis test – hypothesis that predicts the explanatory variable has an impact on the outcome variable, but doesn’t predict the direction of the impact.
Type I error – the error that occurs when the null hypothesis is true but is rejected; p(Type I error) = α.
Type II error – the error that occurs when one fails to reject the null hypothesis but should have rejected it; p(Type II error) = β.
underpowered – term for a study with a sample size too small for the study to have a reasonable chance to reject the null hypothesis given the size of the effect.
ungrouped frequency distribution – a count of how often each individual value of a variable occurs in a set of data.
variability – how much variety (spread or dispersion) there is in a set of scores.
variables – characteristics measured by researchers.
variance – a measure of variability for interval- or ratio-level data; the mean of the squared deviation scores.
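The chain from deviation scores to variance to standard deviation can be traced in a short Python sketch. It follows the definition above literally and divides by the number of cases; note, as an assumption beyond this glossary, that a variance used to estimate a population value from a sample is commonly divided by N − 1 instead. The function names and scores are hypothetical.

```python
import math

def deviation_scores(scores):
    """How far each score falls from the mean."""
    mean = sum(scores) / len(scores)
    return [x - mean for x in scores]

def sum_of_squares(scores):
    """Sum of the squared deviation scores."""
    return sum(d ** 2 for d in deviation_scores(scores))

def variance(scores):
    """Mean of the squared deviation scores (divides by N, per the definition above)."""
    return sum_of_squares(scores) / len(scores)

def standard_deviation(scores):
    """Square root of the variance."""
    return math.sqrt(variance(scores))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores with a mean of 5
print(sum_of_squares(scores), variance(scores), standard_deviation(scores))
```

For these scores the squared deviations sum to 32, so the variance is 32 / 8 = 4 and the standard deviation is 2.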
way – term for an explanatory variable in ANOVA.
within-group variability – variability within a sample of cases, all of which have received the same treatment.
within-subjects – ANOVA terminology for dependent samples.
within-subjects design – the same participants are measured in two or more different situations or under two or more different conditions.
Y-intercept – the spot where the regression line would pass through the Y-axis.
Y prime – the value of Y predicted from X by a regression equation; Y′.
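Several regression terms in this glossary (least squares criterion, slope, Y-intercept, Y prime, residual) fit together in one calculation. The Python sketch below uses the standard least squares formulas for a single predictor, which this glossary does not spell out, so treat the formulas as an assumption; the function name and data are hypothetical.

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least squares regression line for Y from X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-products of deviations over sum of squared X deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # Y-intercept: where the line passes through the Y-axis (X = 0).
    intercept = mean_y - slope * mean_x
    return slope, intercept

x = [1, 2, 3, 4]  # hypothetical predictor scores
y = [2, 4, 5, 7]  # hypothetical outcome scores

slope, intercept = fit_line(x, y)
y_prime = [intercept + slope * xi for xi in x]        # predicted scores, Y'
residuals = [yi - yp for yi, yp in zip(y, y_prime)]   # actual minus predicted
print(slope, intercept, residuals)
```

Here the slope is about 1.6 and the Y-intercept about 0.5, so each 1-unit change in X predicts a 1.6-unit change in Y. The residuals sum to approximately zero, as they do for any least squares line that includes an intercept.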
z score – raw score expressed in terms of how many standard deviations it falls away from the mean; also known as a standard score.
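A z score can be computed directly from its definition: the distance of a raw score from the mean, expressed in standard deviation units. In this minimal Python sketch, the function name and the example values (a mean of 100 and a standard deviation of 15) are hypothetical.

```python
def z_score(raw_score, mean, standard_deviation):
    """Number of standard deviations a raw score falls away from the mean."""
    return (raw_score - mean) / standard_deviation

# Hypothetical test with a mean of 100 and a standard deviation of 15.
print(z_score(130, 100, 15))   # two standard deviations above the mean
print(z_score(85, 100, 15))    # one standard deviation below the mean
```

A positive z score marks a raw score above the mean, a negative z score marks one below it, and a z score of 0 marks a raw score equal to the mean.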