Define a “good” sample and how to obtain one.
A good sample is representative of the larger population, so results can be generalized from the sample to the population. A sample is representative if it contains all the attributes in the population, in the same proportions that they are present in the population.
Random sampling, where all the cases in the population have an equal chance of being selected, makes a representative sample likely but does not guarantee it. Sample size is also important as larger random samples are more likely to be representative.
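The effect of sample size can be illustrated with a small simulation. This is a minimal sketch: the population (10,000 cases, 30% of which have some attribute), the sample sizes, and the function name average_proportion_gap are hypothetical choices made for illustration. It measures how far, on average, the proportion in a random sample falls from the proportion in the population.

```python
import random

def average_proportion_gap(population, n, repetitions, rng):
    """Average absolute gap between the sample and population proportion of 1s."""
    population_proportion = sum(population) / len(population)
    gaps = []
    for _ in range(repetitions):
        sample = rng.sample(population, n)  # every case has an equal chance of selection
        gaps.append(abs(sum(sample) / n - population_proportion))
    return sum(gaps) / repetitions

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: 30% of cases have the attribute (coded 1).
population = [1] * 3_000 + [0] * 7_000

gaps_by_size = {
    n: average_proportion_gap(population, n, 500, rng)
    for n in (10, 100, 1_000)
}
for n, gap in gaps_by_size.items():
    print(f"sample size {n:>5}: average gap from population proportion = {gap:.3f}")
```

The average gap should shrink as the sample size grows, although any single small sample can still land close to the population by chance and any single large sample can still miss.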
Two factors that affect representativeness are self-selection bias and sampling error. Self-selection bias occurs when not all targeted participants agree to participate. Sampling error, which is the result of random factors, causes a sample to differ from the population. Sampling error, thought of as normally distributed, is more likely to be small than large.
List three facts derived from the central limit theorem.
The central limit theorem is based on a sampling distribution of the mean. A sampling distribution of the mean is obtained by (a) taking repeated, random samples of size N from a population; (b) calculating a mean for each sample; and then (c) making a frequency distribution of the means. The spread of this distribution shows how much sampling error exists in the samples.
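Steps (a) through (c) can be sketched in a few lines of code. The population below (10,000 uniformly distributed integer scores), the sample size of 50, and the 2,000 repetitions are assumptions chosen only for illustration, as is the function name.

```python
import random
import statistics

def sampling_distribution_of_mean(population, n, repetitions, seed=0):
    """Return the means of repeated random samples of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(repetitions):
        sample = rng.sample(population, n)      # (a) random sample of size n
        means.append(statistics.mean(sample))   # (b) mean of that sample
    return means                                # (c) values to tabulate or plot

# Hypothetical population of scores; deliberately not normally distributed.
rng = random.Random(42)
population = [rng.randint(0, 100) for _ in range(10_000)]

means = sampling_distribution_of_mean(population, n=50, repetitions=2_000)

population_mean = statistics.mean(population)
mean_of_means = statistics.mean(means)
spread_of_means = statistics.pstdev(means)

print(f"population mean:      {population_mean:.2f}")
print(f"mean of sample means: {mean_of_means:.2f}")
print(f"spread of the means:  {spread_of_means:.2f}")
```

A histogram of means would be the frequency distribution described in step (c). The spread of the means is a direct measure of how much the samples differ from the population because of sampling error.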
The central limit theorem makes three predictions about a sampling distribution of the mean when the number of cases in each sample is large: (1) The sampling distribution is normally distributed; (2) The mean of the sampling distribution is the mean of the population; (3) The standard error of the mean can be calculated if one knows the population standard deviation and the size of the sample.
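Prediction (3) is usually written as the formula below. The symbol \sigma_M for the standard error of the mean is a notational assumption here; \sigma is the population standard deviation and N is the sample size. The worked numbers (\sigma = 15, N = 25) are hypothetical.

```latex
\sigma_M = \frac{\sigma}{\sqrt{N}}
\qquad\text{for example,}\qquad
\sigma_M = \frac{15}{\sqrt{25}} = \frac{15}{5} = 3
```

Because N appears in the denominator, the standard error shrinks as the sample size grows, which is consistent with larger random samples being more likely to be representative.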
Calculate the 95% confidence interval for μ.
A point estimate is a single-value estimate of a population value. A confidence interval is less precise than a point estimate: it is a range, built around a sample value, within which the population value is likely to fall.
The 95% confidence interval for the population mean is built by taking the sample mean and subtracting 1.96 standard errors of the mean from it and adding 1.96 standard errors of the mean to it. An interval constructed this way will capture μ 95% of the time.
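The calculation can be sketched as follows. This sketch assumes the population standard deviation is known, so the standard error is the population standard deviation divided by the square root of the sample size. The function name and the input values (a sample mean of 100, a population standard deviation of 15, a sample size of 25) are hypothetical.

```python
import math

def confidence_interval_95(sample_mean, population_sd, n):
    """Return (lower, upper) limits of the 95% confidence interval for mu."""
    standard_error = population_sd / math.sqrt(n)  # standard error of the mean
    margin = 1.96 * standard_error                 # 1.96 standard errors on each side
    return sample_mean - margin, sample_mean + margin

# Hypothetical values: M = 100, sigma = 15, N = 25
lower, upper = confidence_interval_95(sample_mean=100, population_sd=15, n=25)
print(f"95% CI for mu: [{lower:.2f}, {upper:.2f}]")  # prints: 95% CI for mu: [94.12, 105.88]
```

Here the standard error is 15 / 5 = 3, so the interval extends 1.96 × 3 = 5.88 on either side of the sample mean.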