Inference for Regression

CHAPTER OUTLINE

  • 10.1 Inference about the Regression Model
  • 10.2 Using the Regression Line
  • 10.3 Some Details of Regression Inference

Introduction

One of the most common uses of statistical methods in business and economics is to predict or forecast a response based on one or several explanatory variables. Here are some examples:

  • Facebook uses the number of friend requests, the number of photographs tagged, and the number of likes in the last month to predict a user’s level of future engagement.
  • Amazon wants to describe the relationship between dollars spent in their Digital Music department and dollars spent in their Electronics and Computers department by 18- to 25-year-olds this past year. This information will be used to determine a new advertising strategy.
  • Panera Bread, when looking for a new store location, develops a model of store profitability using the amount of traffic near the store, the proximity to competitive restaurants, and the average income level in the neighborhood.

Prediction is most straightforward when there is a straight-line relationship between a quantitative response variable and a single quantitative explanatory variable. This is simple linear regression, the topic of this chapter. In Chapter 11, we discuss regression when there is more than one explanatory variable.

As we saw in Chapter 2, when a scatterplot shows a linear relationship between a quantitative explanatory variable $x$ and a quantitative response variable $y$, we can use the least-squares line to predict $y$ for a given value of $x$. Now we want to do tests and confidence intervals in this setting.

Reminder

least-squares line, p. 82

To do this, we will think of the least-squares line, $\hat{y} = b_0 + b_1 x$, as an estimate of a regression line for the population, just as in Chapter 7 where we viewed the sample mean $\bar{x}$ as the estimate of the population mean $\mu$. We write the population regression line as $\beta_0 + \beta_1 x$. The numbers $\beta_0$ and $\beta_1$ are parameters that describe the population. The numbers $b_0$ and $b_1$ are statistics calculated from a sample. The intercept $b_0$ estimates the intercept of the population line $\beta_0$, and the fitted slope $b_1$ estimates the slope of the population line $\beta_1$.

Reminder

parameters and statistics, p. 276
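The difference between the population line and the fitted line can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population intercept, slope, and spread (`BETA0`, `BETA1`, `SIGMA`) are hypothetical values chosen for the example, the deviations from the line are assumed to be Normal, and the helper names `least_squares` and `draw_sample` are ours. Each sample gives a different fitted intercept and slope, all scattered around the fixed population values.

```python
import random


def least_squares(xs, ys):
    """Return (b0, b1), the intercept and slope of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical population parameters, chosen only for illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0


def draw_sample(n, rng):
    """Draw n (x, y) pairs scattered around the assumed population line."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    # Assumption: deviations from the line are Normal with mean 0.
    ys = [BETA0 + BETA1 * x + rng.gauss(0, SIGMA) for x in xs]
    return xs, ys


if __name__ == "__main__":
    rng = random.Random(1)
    for _ in range(3):
        xs, ys = draw_sample(25, rng)
        b0, b1 = least_squares(xs, ys)
        print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

The parameters stay fixed while the statistics vary from sample to sample. That sampling variability is what the confidence intervals and significance tests in this chapter are designed to account for.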

We can give confidence intervals and significance tests for inference about the slope $\beta_1$ and the intercept $\beta_0$. Because regression lines are most often used for prediction, we also consider inference about either the mean response or an individual future observation on $y$ for a given value of the explanatory variable $x$. Finally, we discuss statistical inference about the correlation between two variables $x$ and $y$.