Chapter 13 Appendix

Chapter 13 AppendixTime Series Forecasting with Excel, JMP, Minitab, SPSS, CrunchIt!, R, and TI-83/-84 Calculators

Page TA13-1

Time series models can vary widely depending on the structure of the data and the type of model desired, from a simple regression model to extremely complex ones. The basic data assumption is that all observations were taken at equally spaced intervals.

Time Series Graphs and Simple Trend Models

There are two ways in Excel to create a time series graph and fit a trend line.

You can create a time series graph using Insert ➔ Scatterplot with Straight Lines . The time index (1, 2, …) or date (X) variable must be the left column in the two adjoining columns of data in the spreadsheet (use copy and paste to make the columns adjacent to one another, if needed).

Simple trend models can be fit using linear regression from the Data ➔ Data Analysis menu.
Highlight the column of the time series, then Insert ➔ Line. Add the trend line by right-clicking on a data point and selecting “Add Trendline.”

There are three options to create a time series graph.

Use Graph ➔ Graph Builder and drag a variable with the time index (or date) to the x axis; drag the variable of interest to the y axis. Click the icon, right-clicking on any point, and select “Add trendline” from the pop-up to connect the points.
Use Analyze ➔ Time Series. Click to enter the series of interest into the “Y, Time Series” box. If you have a data or index variable, that can be entered into the “X, Time ID” box.
Graph ➔ Overlay Plot. Select the time series data column into Y, and click “OK.” Connect the points by clicking the red triangle and selecting “Connect Thru Missing.”

Simple trend models can be fit using the time (or date) as $X$ and the variable of interest as $Y$ with Analyze ➔ Fit Y by X.

Page TA13-2

There are three options to create a time series graph.

Use Graph ➔ Time Series Plot. Use the default “Simple” and click “OK.”
1. Click to select and enter the variable of interest into the main box.
2. Minitab will default to using an index $(1, 2, \dots)$ for the x axis. If you want a different axis labeling, click “Time/Scale” and select the radio button for the appropriate option. To start graphing yearly data, for example, click “Calendar” and use the drop-down to select “Year.” Enter a starting year in the box. If the data are annual (increment 1), simply click “OK” to return to the main dialog.
With a time variable (or an index variable), use Graph ➔ Scatterplot.
1. Select “With Connect Line” and click “OK.”
2. Click to enter the variable of interest as $Y$ and the time (or index) as $X$ . Click “OK.”
Use Stat ➔ Time Series ➔ Trend Analysis. Click to enter the column containing the series. For the time variable, you can use the default $(1, 2, \dots, n)$ or click “Time” and change to years, months, etc. Click “OK” for the plot with the trend line.

Simple trend models can be fit using Stat ➔ Regression ➔ Regression ➔ Fit Regression Model with the variable of interest as the response and the time (or index) variable as a continuous predictor.

Assuming you have the correct version of the software (there are several from Base to Standard to Professional), there are two ways to create a time series plot.

Use Graphs ➔ Legacy Dialogs ➔ Line.
1. In the next dialog, Simple should be the default; move the radio button in “Values in Chart are:” to “Values of individual cases.” Click “Define.”
2. Click to enter the variable of interest into the box labeled “Line Represents.” If you have a time variable, move the radio button from “Case Number” to “Variable” and enter that variable name. Click “OK.”
Use Analyze ➔ Forecasting ➔ Sequence Charts (only if you have this module).
1. Click to enter the variable of interest into the box labeled “Variables.” You can leave “Time Axis Labels” blank if you simply want a time index used on the X axis.
2. Click “OK.”

Simple trend models can be fit using Analyze ➔ Regression ➔ Linear if you have a time variable (or index) in the data set. Click to enter the variable of interest as the dependent variable and the time variable as the independent variable.

Use Graphics ➔ Scatterplot. Define the plot with the time variable as the $x$ variable and the statistic as $y$ . Set the “Display” option to either “Line” or “Both.”

Simple trend models can be fit using Statistics ➔ Regression ➔ Simple Linear. Use the time variable as the independent variable and the variable of interest as the dependent variable.

Page TA13-3

You can create the time series plot using the connected Scatterplot ( ) option from $\begin{matrix} 2nd \end{matrix}$ $\begin{matrix} Y= \end{matrix}$ (STAT PLOTS). Define the plot with the time variable as the $x$ variable and the variable of interest as $y$ .

Simple trend models can be fit using $\begin{matrix} STAT \end{matrix}$ ➔ Calc ➔ LinReg(a+bX).

The simplest way to graph a time series is with the time series plot command

> ts.plot(var)

You can also use the regular plot command. The type=“1” part tells R to connect the points. Note that no time variable is needed here.

> plot(var,type=“1”,xlab=“time”,ylab=“Variable Name”)

The simple trend model can be fit using lm (linear model) command as detailed in Chapters 2 and 10.

Trend with Seasonality

These models allow cyclical behavior in the variable of interest. You need one fewer indicator variable than there are periods in the year (3 for quarterly data, for example). They are fit as a multiple regression.

Data files may have dates entered as, for example, 01/2010. We need to separate the month from the date to add the indicator variable while preserving the original for the trend part of the model.

Place the cursor in a blank column on the same row as the first date. Enter the command =text(a2,“MM”) if the first date is in cell A2. Copy the formula down the column to extract the month from the rest of the dates.
Create the indicator variables using the “If logical” function. Supposing the months were placed in column D, this would look like =if(d2=“01”,1,0). Then copy/paste to fill that column.
Repeat step 2 until you have created all the indicator variables needed.
Copy/paste columns as needed until you have all the predictor variables in a contiguous range.
Use Data ➔ Data Analysis ➔ Regression to compute the multiple regression using the original date variable and the indicators as predictors.

Page TA13-4

Data files may have dates entered as, for example, 01/2010. We need to separate the month from the date to add the indicator variable while preserving the original for the trend part of the model. This requires an add-in that can be found at https://community.jmp.com/docs/DOC-7537. You need only install this add-in once. (This procedure also works for quarterly data.)

Right-click in the column header (the variable name). Select New Formula Column ➔ Date Time ➔ Month. This creates a new column with [Month] as part of the variable name.
Right-click in the new column header and select “Column Info” to create a new name without the square brackets.
Click Add-Ins ➔ Make Dummy Variables. Select the variable just created and click “OK.” One dummy variable will be created for each level of the original.
Use Analyze ➔ Fit Model with the original month/year variable and one fewer indicator than there were months (or quarters) to fit the regression model.

Data files may have dates entered as, for example, 01/2010. We need to separate the month from the date to add the indicator variable while preserving the original for the trend part of the model.

Click Data ➔ Date/Time ➔ Extract to Numeric. If you want the month, enter the column with the date to convert in the “Extract from …” box and a new column to receive the month. Check the box labeled “Month” and click “OK.”
Create the indicator variables using Calc ➔ Make Indicator Variables. Enter the column just created as “Indicator variables for.” Minitab will populate a table of variables it intends to create, along with the proposed column names. Click “OK.”
Use Stat ➔ Regression ➔ Regression ➔ Fit Regression Model to compute the multiple regression using the original date variable (as continuous) and the indicators as categorical predictors. Be sure to not use the last indicator.

Click Transform ➔ Compute Variable. Enter a new variable name in the box at upper left. In the “Function Group” box, select “Date Extraction.” In the “Functions” box, select “Xdate.Month.” Double-click to send that shell to the formula box. Click on the original variable to replace the ? in the function with the variable name. Click “OK.”
Now, create the indicator variables. Click Transform ➔ Create Dummy Variables. Click to select and enter the name of the variable just created; give a stub name for the new indicator variables. Click “OK.”
Use Analyze ➔ Regression ➔ Linear to fit the model using one fewer indicator than was just created.

Page TA13-5

> date<-as.Date(Month,format=“%m/%d/%y”)

Next, extract the month from the date

> month2<-format(date,“%m”)

Create the indicator variables

> month.f=factor(month2)

> dummies=model.matrix(~month.f)

Perform the regression

> res = lm(Retail.Sales ~ date + dummies)

Lag Regression Models

These models exploit the relationship between an observation and its previous value. The degree of relationship is measured by the autocorrelation function.

To find the autocorrelations in a time series, copy and paste the column of data (with the variable of interest) so it is offset by 1 observation (paste so the first observation is now paired with the second). Use Data ➔ Data Analysis ➔ Correlation to find the correlation of ( $Y_{t}, Y_{t - 1}$ ).

Use Data ➔ Data Analysis ➔ Regresson to fit the model using the offset column as the predictor (X).

To find the autocorrelation (and partial autocorrelation) in a time series, use Analyze ➔ Time Series (in the student version) or Analyze ➔ Modeling ➔ Time Series (in the full version). Click to enter the variable of interest into the “Y, Time Series” block and (if desired) the dates into the “X, Time ID” block. You can set the number of autocorrelation lags to something like 2 from the default 25. Click “OK.”

To fit the autoregressive model, there are two options.

Lag the series the desired number of lags by selecting the column containing the series and using Columns ➔ Row ➔ Lag. Fill the command shell with the desired number of lags. Use Analyze ➔ Fit Y by X to fit the regression.
In time series mode, click the red triangle at the top of the output and select “ARIMA.” Set the “Autoregressive” order to 1 (or the desired number of lags) and click “Estimate.”

Page TA13-6

Find the autocorrelation (partial autocorrelation) using Stat ➔ Time Series ➔ Autocorrelation (Partial Autocorrelation). Click to select and enter the series variable name. You can also move the radio button to only ask for a reasonable (2 or 3) autocorrelation lags.

To fit the model, you have two options:

Use Stat ➔ Time Series ➔ ARIMA and set the Autoregressive order to 1 (or the desired number of lags),

or
Use Stat ➔ Time Series ➔ Lag and lag the time series by 1 or 2 or whatever lag is of interest. Click to enter the time series name, a column for the new variable and enter the desired lag. Then use Stat ➔ Regression ➔ Regression ➔ Fit regression model. Use the original time series as the response and the lagged series as a continuous predictor.

Use Analyze ➔ Forecasting ➔ (Partial) Autocorrelations to find the (partial) autocorrelation. Click to select and enter the time series as the variable. Uncheck the box for Partial Autocorrelations. To limit the number of autocorrelations shown, click “Options,” then enter a smaller number (like 2 or 3). Click “Continue” to return to the main dialog and click “OK.”

To fit the model, first lag the series. Use Transform ➔ Shift Values. Click to select and enter the series name. Enter a name for the result variable in the box labeled “Name.” Click “Change.” The default is a 1 period lag; if a different lag is desired, enter that. Click “OK.” Use Analyze ➔ Regression ➔ Linear to fit the model using the lagged variable as the predictor.

Create a lagged time series using the following steps:

Place the cursor so it highlights the name of an empty list. Use $\begin{matrix} 2nd \end{matrix}$ to enter the list name you wish to duplicate. The command at the bottom of the screen should look like $L 2 = L 1$ , for example.
The cursor will be on the first value in the new list. Press $\begin{matrix} 2nd \end{matrix}$ $\begin{matrix} DEL \end{matrix}$ = [INS] (Insert).
Delete the entries in the first row of both lists (to eliminate pairing the first observation with the inserted 0), and the last entry in the new list.
If more lags are desired, repeat the insert until you have offset by the desired amount. Be sure to delete entries as needed until the sizes of the lists match.

For the autocorrelation and regression, use $\begin{matrix} STAT \end{matrix}$ , Calc, 8:LinReg a+bx using the original list as $Y$ and the new list as $X$ . For example, LinReg $(a + b x)$ L2,L1.

Page TA13-7

For the autocorrelation (or partial autocorrelation), the command shown below uses “series” as the name of the variable; nn is a reasonable number (usually 2 or 3) for the maximum lag to consider, and plot=F tells R to display the numeric coefficients instead of simply plotting them. For partial autocorrelations, the command is pacf with the same parameters.

> acf(series,lag.max=nn, plot=F)

To create a lagged series by one time step, use the command like

> lagseries <- lag(series,k=-1)

where “series” is the name of the original time series. For the regression, use a command like

> res = lm(Retail.Sales~lagseries)

Moving Average and Smoothing Models

Moving average (MA) models smooth a time series by sliding a window of width $k$ and computing the average of those $k$ previous observations to replace observation $y_{t}$ . Exponential smoothing models use a weighted average of the previous observation and its forecast, which progressively lowers the weight (impact) of observations far distant on the estimate of a current value.

For a moving average, use Data ➔ Data Analysis ➔ Moving Average.

Drag to select the column with the time series.
Enter the value of $k$ in the box labeled “Interval.”
Specify a cell in an empty column for the “Output Range.”
Check the box for “Chart Output,” if desired. Examining the standard errors is beyond the scope of this text, but if you look at them, you’ll notice these are higher on the ends of the series and lower in the center.

For exponential smoothing, use Data ➔ Data Analysis ➔ Exponential Smoothing.

Drag to select the column with the time series.
Enter 1-w in the box labeled “Damping Factor.” This means w will be the weight used on observation $y_{t - 1}$ .
Specify a cell in an empty column for the “Output Range.”
Check the box for “Chart Output,” if desired. Examining the standard errors is beyond the scope of this text, but if you look at them, you’ll notice these are higher on the ends of the series and lower in the center.

Use Analyze ➔ Time Series as discussed earlier.

For moving average smoothing, click the red triangle at the top of the output and select Smoothing Model ➔ Simple Moving Average.

Enter the value of $k$ ,
Click “OK” to select “No centering” (centering replaces observation $y_{t - (k - 1) / 2}$ instead of observation $y_{t}$ as used in your text).

Page TA13-8

For exponential smoothing as described in your text, click the red triangle at the top of the output and select Smoothing Model ➔ Simple Exponential Smoothing.

To specify the weight, select “Custom” from the Constraints drop-down.
Use the drop-down for Level to select “Fixed.”
Enter the weight.
Click “Estimate.”

For a moving average smooth, use Stat ➔ Time Series ➔ Moving Average.

Click to select and enter the time series variable. Enter $k$ in the box labeled “MA length.”
Click “OK.” Minitab will superimpose the smooth on the original time series plot.

For exponential smoothing, use Stat ➔ Time Series ➔ Single Exp Smoothing.

Click to select and enter the time series variable.
For specified weights, move the radio button from “Optimal” to “Use” and enter the specified weight.
Click “OK.” Minitab will superimpose the smooth on the original time series plot.

SPSS does not have a built-in function to do moving average smoothing.

SPSS can do exponential smoothing, but not with specified weights. The system will determine an “optimal” weight or you if you use Analyze ➔ Forecasting ➔ Create Traditional Models and move the drop-down to “Exponential smoothing.”

Using these methods requires we actually declare the variable to be a time series. Use the ts command as illustrated below. The terms “frequency” and “start” are optional, but can be used to specify the time labels. For example, “ $frequency = 12, start = c (2000,) 1$ ” means that the series is monthly and starts in January, 2000.

> Vartimeseries <-ts(var, frequency=nn, start=c(t1,inc)

With this definition, you can also use plot.ts to plot the time series as

> plot.ts(vartimeseries)

The smoothing commands shown below require you to install and load the package TTR. (Click Packages ➔ Set CRAN Mirror and choose a site; Packages ➔ Install Package and locate TTR; Packages ➔ Load Package ➔ TTR.)

A moving average smooth of length $k$ is created with the SMA command. Use plot. ts to see the smoothed time series.

> SMA(vartimeseries,n=k)

The exponential smooth is created with the HoltWinters function. with alpha being the specified weight. Plotting this afterward shows both the smooth and the original series.

> ExpSM <-HoltWinters(vartimeseries,Gamma=F, Beta=F, alpha = wt)