It’s a little embarrassing to ask this, given that I do a fair amount of econometrics, but I’m puzzled by something. I tend to do more cross-sectional analysis than time series analysis, and so some of my knowledge here is rusty.
I am analyzing the historical performance of the Case-Shiller housing index, using data from Robert Shiller’s website:
http://www.econ.yale.edu/~shiller/data.htm
Specifically from this file here:
http://www.econ.yale.edu/~shiller/data/Fig2-1.xls
I’m using R and the “tseries” library. The goal is to create some kind of model for forecasting the CS index over the long term (10+ years). It seems that there really isn’t much that is powerful for explaining past changes at this timeframe, other than inflation, which is a little surprising, but not super-surprising.
–
When I’ve used more recent Case-Shiller data, such as the series on the S&P website, I find that the composite CS-10 index since 1987 (when the series starts) is integrated of order 2. I have to difference twice before the ADF test supports a conclusion that the series is stationary. That’s problematic, because it means there will be a ton of noise by the time one integrates back to the nominal Case-Shiller for forecasting, but of course the world doesn’t have to be simplifiable just because we want it to.
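To make the "difference until the ADF test rejects" procedure concrete, here is a minimal sketch. It runs on simulated data, not the CS-10 series: `x` is a made-up I(2) process built by cumulating white noise twice, and the name `adf_p` is just for illustration. It assumes the default lag choice in adf.test().

```r
library(tseries)  # provides adf.test()

set.seed(1)
# Simulated stand-in for the index (NOT real Case-Shiller data):
# white noise cumulated twice is integrated of order 2 by construction.
x <- cumsum(cumsum(rnorm(300)))

# ADF p-value for the level, the first difference, and the second difference.
adf_p <- sapply(0:2, function(d) {
  y <- if (d == 0) x else diff(x, differences = d)
  # adf.test() warns when the p-value falls outside its tabulated range;
  # the reported value is then clipped, which is fine for this sketch.
  suppressWarnings(adf.test(y)$p.value)
})
names(adf_p) <- c("level", "diff1", "diff2")
print(round(adf_p, 3))
```

The order of integration is read off as the first row whose p-value drops below the chosen threshold. On real data the answer can depend on the lag length passed via the `k` argument.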
–
However, when I run adf.test() on the nominal CS index from Shiller’s much longer-term data (Column i in the Excel link above, starting in 1916, because of other data limitations), I get a p-value of 0.065. Since the ADF null hypothesis is a unit root, this says that we can’t reject nonstationarity at the 5% significance level, but it is pretty darned close (we would reject at the 7% level). This was a surprise, because I had been expecting to find results similar to those I had with the post-1987 data.
But here’s the stranger thing. If you actually plot the value of the nominal CS index over time, it doesn’t look even remotely stationary. There’s a huge trend upwards as 100 years of inflation (among other things) has its effects. I’m very surprised that the ADF test came so close to concluding stationarity, and I’m trying to understand why.
I understand that if something LOOKS stationary, it might not be (that’s part of what the ADF test is for), but I didn’t think that something which - on inspection - looks highly trending would be able to pass it. This is strange to me.
Can any other quanthead explain what is happening? How is it that a highly trending series ends up potentially passing a stationarity test without having been detrended?