KEY POINTS
Descriptive Statistics
Statistics is a branch of mathematics that researchers use to organize and interpret data.
Descriptive statistics are summaries of data that make the data meaningful and easy to understand. One descriptive statistic is a frequency distribution, which can be presented in the form of a table, a histogram, or a frequency polygon. Some frequency distributions are positively skewed; that is, most of the scores in the distribution pile up at the low end. Distributions that are negatively skewed have mostly high scores. Symmetrical distributions have equal numbers of scores on both sides of the distribution’s midpoint.
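As a rough illustration of the frequency distribution described above, the sketch below tallies a small set of scores and prints a crude text histogram. The scores are hypothetical values invented for this example, not data from the text.

```python
from collections import Counter

# Hypothetical scores, invented purely for illustration.
scores = [2, 3, 3, 4, 4, 4, 5, 7]

# Count how often each score occurs, then arrange the
# score values in order of magnitude.
frequency = Counter(scores)
table = sorted(frequency.items())

# Print one row per score value; each '#' stands for one occurrence.
for score, count in table:
    print(f"{score}: {'#' * count}")
```

Each printed row corresponds to one bar of a histogram turned on its side.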
The mode, the median, and the mean are measures of central tendency of a distribution. The mode is the most frequent score. The median is the middle score in the distribution. The mean is the arithmetic average. To calculate the mean, scores are summed and divided by the total number of scores. The mean is usually the best overall representation of central tendency, but it is strongly influenced by extremely high or extremely low scores.
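The three measures of central tendency can be computed with Python's standard `statistics` module. The sketch below uses hypothetical scores and then replaces one score with an extreme value to show why the mean is the most sensitive of the three.

```python
import statistics

# Hypothetical scores, invented purely for illustration.
scores = [2, 3, 3, 4, 4, 4, 5, 7]

mode = statistics.mode(scores)      # most frequent score
median = statistics.median(scores)  # middle score of the ordered list
mean = statistics.mean(scores)      # sum of scores / number of scores

# Replace the highest score with an extreme value.
with_outlier = [2, 3, 3, 4, 4, 4, 5, 47]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

print(mode, median, mean)
print(median_with_outlier, mean_with_outlier)
```

With the extreme score, the mean rises from 4 to 9 while the median stays at 4, which is the sensitivity the summary describes.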
The range and standard deviation are measures of variability or spread of a distribution. The range is the highest score in the distribution minus the lowest score. The standard deviation is the square root of the average of the squared deviations from the mean.
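The definitions of range and standard deviation can be followed step by step in code. This is a minimal sketch with hypothetical scores. It divides by the number of scores, as the definition above does; many texts and software packages divide by one less than the number of scores when estimating from a sample, which gives a slightly larger value.

```python
import math
import statistics

# Hypothetical scores, invented purely for illustration.
scores = [2, 3, 3, 4, 4, 4, 5, 7]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Standard deviation: square root of the average squared
# deviation from the mean.
mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]
std_dev = math.sqrt(sum(squared_deviations) / len(scores))

# The standard library's population standard deviation
# uses the same divide-by-N definition.
library_value = statistics.pstdev(scores)

print(score_range, std_dev)
```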
A z score expresses a single score’s deviation from the mean of a distribution in standard deviation units.
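A z score is the score's deviation from the mean divided by the standard deviation. The helper below is a hypothetical function written for this example, and the mean of 100 and standard deviation of 15 are assumed values chosen only to keep the arithmetic simple.

```python
def z_score(score, mean, std_dev):
    """Return how many standard deviations `score` lies from `mean`."""
    return (score - mean) / std_dev

# Assumed distribution: mean 100, standard deviation 15.
above = z_score(130, 100, 15)  # two standard deviations above the mean
below = z_score(85, 100, 15)   # one standard deviation below the mean

print(above, below)
```

A positive z score lies above the mean and a negative z score lies below it.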
The standard normal curve is a symmetrical distribution forming a bell-shaped curve in which the mean, median, and mode are all equal and fall in the exact middle. The percentage of cases that fall between any two points on the normal curve is known. Over 95 percent of the cases fall between two standard deviations above the mean and two standard deviations below the mean.
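The claim about two standard deviations can be checked numerically. The sketch below uses `statistics.NormalDist` (available in Python 3.8 and later), whose default is the standard normal curve with mean 0 and standard deviation 1.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Proportion of cases between two points = difference of the
# cumulative proportions at those points.
within_one = standard_normal.cdf(1) - standard_normal.cdf(-1)
within_two = standard_normal.cdf(2) - standard_normal.cdf(-2)

print(f"within 1 SD: {within_one:.4f}")
print(f"within 2 SD: {within_two:.4f}")
```

Roughly 68 percent of cases fall within one standard deviation of the mean and a little more than 95 percent fall within two.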
Correlation refers to the relationship between two variables. A correlation coefficient is a number that indicates the magnitude and direction of such a relationship. A correlation coefficient may range from -1 to +1. The closer the value is to -1 or +1, the stronger the relationship is. Correlations close to 0 indicate little or no linear relationship. A positive correlation coefficient tells us that as one variable increases in size, the second variable also tends to increase. A negative correlation coefficient indicates that as one variable increases in size, the second variable tends to decrease. A correlational relationship may be presented visually in a scatter diagram or scatter plot.
Correlations enable us to predict the value of one variable from knowledge of another variable’s value. However, a correlational relationship is not necessarily a causal relationship.
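The text does not name a specific coefficient. The sketch below assumes the widely used Pearson correlation coefficient and computes it from its definition. The function name and the data are hypothetical, invented for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of paired deviations from each mean.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)

# Hypothetical paired observations.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]

r_positive = pearson_r(hours, scores)        # variables rise together
r_negative = pearson_r(hours, scores[::-1])  # one rises as the other falls

print(round(r_positive, 2), round(r_negative, 2))
```

Reversing the order of one variable flips the sign of the coefficient but not its magnitude. Neither value says anything about whether one variable causes the other.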
Inferential Statistics
Inferential statistics are used to determine whether the outcomes of a study can be legitimately generalized to a larger population. A population is a complete set of something. A sample is a subset of a population. One inferential technique is the t-test, which is used to compare the means of two groups.
Inferential statistics provide information about the probability of a particular result if only chance or random factors are operating. If this probability is small, the findings are said to be statistically significant; that is, they are unlikely to be due to chance alone and are probably due to the researcher’s interventions. Researchers must avoid making decision errors. Erroneously concluding that study results are significant when they are actually due to chance is a Type I error. Failing to find a significant effect that does, in fact, exist is a Type II error.
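The text does not specify which form of t-test it has in mind. The sketch below assumes the independent-samples version with a pooled variance estimate and computes only the t statistic and its degrees of freedom. The function name and the group scores are hypothetical, invented for this example.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Return (t, degrees of freedom) for two independent groups,
    assuming both groups share a similar variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    # statistics.variance divides by n - 1 (sample variance).
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# Hypothetical scores for two groups.
treatment = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]

t, df = pooled_t(treatment, control)
print(t, df)
```

Judging significance then requires comparing t with the t distribution for those degrees of freedom, using a printed table or a statistics library. Deciding the difference is significant when it arose by chance would be a Type I error; missing a real difference would be a Type II error.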
KEY TERMS
correlation: The relationship between two variables.
correlation coefficient: A numerical indication of the magnitude and direction of the relationship (the correlation) between two variables.
descriptive statistics: Mathematical methods used to organize and summarize data.
frequency distribution: A summary of how often various scores occur in a sample of scores. Score values are arranged in order of magnitude, and the number of times each score occurs is recorded.
frequency polygon: A way of graphically representing a frequency distribution; frequency is marked above each score category on the graph's horizontal axis, and the marks are connected by straight lines.
histogram: A way of graphically representing a frequency distribution; a type of bar chart that uses vertical bars that touch.
inferential statistics: Mathematical methods used to determine how likely it is that a study's outcome is due to chance and whether the outcome can be legitimately generalized to a larger population.
mean: The sum of a set of scores in a distribution divided by the number of scores; the mean is usually the most representative measure of central tendency.
measure of central tendency: A single number that presents some information about the "center" of a frequency distribution.
measure of variability: A single number that presents information about the spread of scores in a distribution.
median: The score that divides a frequency distribution exactly in half so that the same number of scores lie on each side of it.
mode: The most frequently occurring score in a distribution.
negative correlation: A finding that two factors vary systematically in opposite directions, one increasing as the other decreases.
population: A complete set of something: people, nonhuman animals, objects, or events.
positive correlation: A finding that two factors vary systematically in the same direction, increasing or decreasing together.
range: A measure of variability; the highest score in a distribution minus the lowest score.
sample: A subset of a population.
scatter diagram or scatter plot: A graph that represents the relationship between two variables.
skewed distribution: An asymmetrical distribution; more scores occur on one side of the distribution than on the other. In a positively skewed distribution, most of the scores are low scores; in a negatively skewed distribution, most of the scores are high scores.
standard deviation: A measure of variability; expressed as the square root of the sum of the squared deviations around the mean divided by the number of scores in the distribution.
standard normal curve or standard normal distribution: A symmetrical distribution forming a bell-shaped curve in which the mean, median, and mode are all equal and fall in the exact middle.
statistics: A branch of mathematics used by researchers to organize, summarize, and interpret data.
symmetrical distribution: A distribution in which scores fall equally on both sides of the graph. The normal curve is an example of a symmetrical distribution.
t-test: Test used to establish whether the means of two groups are statistically different from each other.
Type I error: Erroneously concluding that study results are significant.
Type II error: Failing to find a significant effect that does, in fact, exist.
z score: A number, expressed in standard deviation units, that shows a score's deviation from the mean.