 # Autocorrelation

Quick question: in the Autocorrelations of the Residual section of a model, why are there four lags (pg 369, CFA text)? Does each lag represent a different observation? For example, if there were 359 observations (as there are in this example), are there 359 possible lags (autocorrelations)? I read in the reading that they would typically select four to simulate a quarterly model and 12 for a monthly one, but what happens when the observations occur over several years? Are these four (12) somehow taking an average autocorrelation of all quarterly (monthly) observations for that time period?

I’m horrible at statistics… but my understanding is that with 359 observations, you have 355 lags possible (if you’re lagging 4) i.e. observation #5 corresponds to #1, #6 to #2, and so forth. Therefore, you can only have 355 points of comparison. If I’m way wrong, someone please shed light on it after you’re through pointing and laughing.

Whoa. If you have 359 observations, you could have 358 lags (or maybe 357 for model fit reasons). That is, you could fit the model y(t) = a1*y(t-1) + a2*y(t-2) + … + a357*y(t-357). Since you only observed two points that are 357 lags apart, you are going to get a really terrible fit on a357 (among other reasons). Further, I would never believe an AR(357) model. It will always be an overfit model. Now if we fit four lags, we would have at least 355 observations to fit each parameter estimate, which I would believe a lot more.

So for the 359 example with four lags you could have 355 residuals? And each residual would simply be the correlation run under a different set of observations? For example:

Residual 1 (lag 1) = the error term captured when running the regression with observations 1 & 2.
Residual 2 (lag 2) = the error term captured when running the regression with observations 2 & 3.
Residual 3 (lag 3) = the error term captured when running the regression with observations 3 & 4.
…

TJR Wrote:

> So for the 359 example with four lags you could have 355 residuals?

Yes.

> And each residual would simply be the correlation run under a different set of observations?
>
> For example:
>
> Residual 1 (lag 1) = the error term captured when running the regression with observations 1 & 2. and Residual 2 (lag 2) = the error term captured when running the regression with observations 2 & 3. Residual 3 (lag 3) = the error term captured when running the regression with observations 3 & 4. …

I think you’re missing the big picture here. When you run a regression you use all the data. All the data is used to estimate the parameters, and then the residual of each individual observation relies on the model fit from all the data.

An AR(1) model is used in the problem on page 369. In any regression model, residuals can be examined. If the residuals are e1 … e359, the autocorrelation with lag 1 = correl(e1 … e358, e2 … e359) and the autocorrelation with lag 2 = correl(e1 … e357, e3 … e359). Obviously, the longer the lag, the shorter the vectors used in the autocorrelation calculations. Typically autocorrelation declines as the lag increases, so it’s important to look at autocorrelations with small lags. Four was used just as an example in the problem; the number could’ve been 3 or 5 or 10.
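As a rough illustration of the correl(...) recipe in the post above, here is a minimal Python sketch. The function names `pearson`, `residual_autocorrelation`, and `overlapping_pairs` are made up for this example, and the residuals are assumed to be a plain list of numbers. Note that many textbooks define the sample autocorrelation with the full-sample mean and variance instead, so values can differ slightly from this pairwise version.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists (fails if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def overlapping_pairs(n_residuals, lag):
    """Number of (e_t, e_{t+lag}) pairs available for a given lag."""
    return n_residuals - lag


def residual_autocorrelation(residuals, lag):
    """correl(e_1 .. e_{n-lag}, e_{lag+1} .. e_n), as described in the thread."""
    if not 1 <= lag <= len(residuals) - 2:
        raise ValueError("lag must leave at least two overlapping pairs")
    return pearson(residuals[:-lag], residuals[lag:])


# With 359 residuals, each extra lag shortens the two vectors by one.
for lag in range(1, 5):
    print(lag, overlapping_pairs(359, lag))
```

For 359 residuals this prints 358, 357, 356, and 355 pairs for lags 1 through 4, which matches the counts discussed earlier in the thread.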

TJR Wrote:

> Quick question: In the Autocorrelations of the Residual section of a model why are there four lags (pg 369 CFA text)? Does each lag represent a different observation? For example, if there were 359 observations (as there are in this example) is there 359 possible lags (autocorrelations)?

It sounds like the original question was asking why there were only four lags listed in the “Autocorrelations of the Residual” section of Table 5, Example 6. The example notes that they are only looking at the autocorrelations of the first four lagged variables. If we saw the entire model, we should see 358 lags, as Joey noted above, which would take up an additional 10+ pages in the book for this example.

> I read in the reading that they would typically select four to simulate a quarterly model and 12 for a monthly, but what happens when the observations occur over several years? Are these four (12) somehow taking an average autocorrelation of all quarterly (monthly) observations for that time period?

I don’t really understand this question… It sounds like you are comparing the AR(1) model itself with testing for seasonality in an AR(1) model. In this case, the 4th lag autocorrelation (for quarterly data) and the 12th lag autocorrelation (for monthly data) are each used to test for seasonality. See the second paragraph under “Seasonality in Time-Series Models” on p. 389.

As for why we would only test the first four autocorrelations in any AR(1) model, regardless of the number of observations, to ensure that a model is correctly specified, I could not find this in the reading. It does state that any given autocorrelation shows the correlation of the variable in one period to its occurrence in the previous period. Therefore, if each autocorrelation is dependent on the previous autocorrelation, we should be able to detect serial correlation within the first four autocorrelations for large samples? If anyone can provide clarification on this, that would be great.
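The seasonality check mentioned above comes down to a t-test on a single residual autocorrelation. A minimal sketch follows, assuming the usual large-sample approximation that the standard error of a residual autocorrelation is 1/sqrt(T), with T the number of observations. The function name and the input values are hypothetical and are not taken from Table 5.

```python
import math


def autocorrelation_t_stat(autocorrelation, n_observations):
    """t-statistic for one residual autocorrelation, using standard error 1/sqrt(T)."""
    standard_error = 1.0 / math.sqrt(n_observations)
    return autocorrelation / standard_error


# Hypothetical quarterly example: a lag-4 residual autocorrelation of 0.10 with T = 400.
t_lag4 = autocorrelation_t_stat(0.10, 400)
print(round(t_lag4, 2))
```

A t-statistic well above roughly 2 in absolute value at the 4th lag (quarterly) or 12th lag (monthly) would point toward seasonality that the model has not captured.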

The reason it’s not in the reading is that it is a bit more involved than that. Just having an autocorrelation doesn’t mean that we should include it in the model. For example, if the structure really is AR(1), then X(t) is correlated with X(t-1), which is correlated with X(t-2), so we expect X(t) to be correlated with X(t-2) even though the real structure is AR(1). That’s a pretty easily solvable problem, but it just gets beyond the scope pretty quickly.
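To spell out that chain of correlations, here is a short derivation under stated assumptions: a stationary, zero-mean AR(1) process with coefficient $\phi$ ($|\phi| < 1$) and white-noise errors $\varepsilon_t$. The symbols $\phi$ and $\varepsilon_t$ are introduced here for illustration and do not appear in the thread.

$$
X_t = \phi X_{t-1} + \varepsilon_t = \phi^2 X_{t-2} + \phi\,\varepsilon_{t-1} + \varepsilon_t
$$

Because $\varepsilon_{t-1}$ and $\varepsilon_t$ are uncorrelated with $X_{t-2}$, and stationarity makes the variance the same at every date,

$$
\operatorname{Corr}(X_t, X_{t-2}) = \phi^2, \qquad \text{and in general} \qquad \operatorname{Corr}(X_t, X_{t-k}) = \phi^k .
$$

So a pure AR(1) series shows nonzero autocorrelation at lag 2 and beyond, decaying geometrically, without any lag-2 term in the model.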

Joey, you are right about autocorrelations of the dependent variable. The reading is talking about autocorrelations of the residuals. They use AR(1) and then test whether the assumption of no serial correlation in the residuals is violated. There are two questions worth discussing: how to specify a model (whether it’s AR(1), AR(2) or any other kind of regression), and then how to test whether it’s properly specified or misspecified. The example on page 369 addresses the second question.
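The contrast between the two kinds of autocorrelation can be seen in a small simulation. This is only a sketch on made-up data: the coefficient 0.6, the sample size, the seed, and all function names are assumptions chosen for illustration, and the fit is a plain least-squares regression of x(t) on x(t-1).

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def autocorrelation(values, lag):
    """Correlation between the series and itself shifted by `lag` periods."""
    return pearson(values[:-lag], values[lag:])


def simulate_ar1(n, phi, seed):
    """Generate x(t) = phi * x(t-1) + noise, starting from zero."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, 1.0))
    return series


def fit_ar1(series):
    """Least-squares fit of x(t) on x(t-1); returns intercept, slope, residuals."""
    lagged, current = series[:-1], series[1:]
    n = len(current)
    mean_l = sum(lagged) / n
    mean_c = sum(current) / n
    slope = (sum((l - mean_l) * (c - mean_c) for l, c in zip(lagged, current))
             / sum((l - mean_l) ** 2 for l in lagged))
    intercept = mean_c - slope * mean_l
    residuals = [c - (intercept + slope * l) for l, c in zip(lagged, current)]
    return intercept, slope, residuals


series = simulate_ar1(n=20000, phi=0.6, seed=1)
intercept, slope, residuals = fit_ar1(series)

series_acf = [autocorrelation(series, k) for k in range(1, 5)]
residual_acf = [autocorrelation(residuals, k) for k in range(1, 5)]

for k in range(4):
    print(f"lag {k + 1}: series {series_acf[k]:.3f}, residual {residual_acf[k]:.3f}")
```

When the AR(1) specification is right, the series itself should show sizable autocorrelations at several lags, while the residual autocorrelations should all sit close to zero. That second check is the one the example on page 369 is running.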

It sounds like I need to state my question correctly: how are we able to determine/assume that a model is properly specified and there is no serial correlation by only testing the first four autocorrelations of the residual in a model with a large number of observations? Is this also the case for a model with a small number of observations?

Lisa Marie Wrote:

> It sounds like I need to correctly state my question
>
> How are we able to determine/assume that a model is properly specified and there is no serial correlation by only testing the first four autocorrelations of the residual in a model with a large number of observations? Is this case the same for a model with a small number of observations?

I don’t disagree with you, Lisa Marie. Four autocorrelations of the residuals are not always enough, especially when a model has a large number of observations.