Statistics is the process of collecting data, describing them, and drawing conclusions from them. Data are observations or measurements made about the people or objects that we are investigating. When we read a study, we want to ask several questions about the data, including the following: What is the question of interest? From what individuals were the data collected? What observations or measurements were made? How were the data presented? What conclusions were reported? We ask these questions to learn more about the population of interest.
The population is the set of individuals that we want to describe or draw conclusions about. In practice, it is usually too difficult to collect data on the entire population, so we instead collect data from a subset of the population, which we call a sample. Collecting data from an individual means that we measure or observe a characteristic of that individual. We call these characteristics variables, and variables can be categorized as quantitative or categorical. A quantitative variable is numerical and represents a characteristic that we can count or measure, such as height or number of siblings, whereas a categorical (or qualitative) variable is typically non-numerical and can generally be described with words rather than numbers, such as eye color. Often we summarize the data gathered on a variable using measures like a mean or a percentage. Numerical summaries of a population are called parameters, while numerical summaries of a sample are called statistics.
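The distinction between a parameter and a statistic can be made concrete with a small simulation. The sketch below is only an illustration: the population of 1,000 students, the variable names `hours` and `housing`, and the sample size of 50 are all hypothetical choices, and the data are randomly generated rather than taken from any real study.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 1,000 students, each with one quantitative
# variable (hours studied per week) and one categorical variable (housing).
population = [
    {
        "hours": random.randint(0, 40),
        "housing": random.choice(["on campus", "off campus"]),
    }
    for _ in range(1000)
]

# A sample is a subset of the population; here, 50 students drawn at random.
sample = random.sample(population, 50)


def summarize(individuals):
    """Return the mean of the quantitative variable and the percentage
    of individuals who fall in the "on campus" category."""
    mean_hours = mean(person["hours"] for person in individuals)
    on_campus = sum(person["housing"] == "on campus" for person in individuals)
    pct_on_campus = 100 * on_campus / len(individuals)
    return mean_hours, pct_on_campus


# Numerical summaries of the population are parameters;
# the same summaries computed from the sample are statistics.
population_parameters = summarize(population)
sample_statistics = summarize(sample)

print("parameters (mean hours, % on campus):", population_parameters)
print("statistics (mean hours, % on campus):", sample_statistics)
```

The two pairs of numbers will generally be close but not identical, which is why a statistic is treated as an estimate of the corresponding parameter rather than as the parameter itself.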
When one or more variables are measured for every individual in a population, the resulting data set is called a census. When researchers do not collect data from the whole population and instead use a representative sample, they typically conduct either an experiment or an observational study. An experiment is a study in which the researchers impose a treatment on the individuals being studied. An observational study involves observing and recording certain variables of interest about the individuals in the study, without imposing a treatment. Polls and surveys are typical examples of observational studies.