Durbin Watson for Serial Correlation, WHY?

I rushed through the quant material when I took the exam last year. I am no quant expert, but this question has been bugging me for some time. 

Why does the CFA curriculum go with Durbin-Watson to test for serial correlation? I thought DW only tests for first-order serial correlation, and that the inconclusive region of the test makes it less effective. Is there a reason behind this?

It does only test for first order serial correlation. It’s a pretty common introductory test for serial correlation. It’s probably to introduce students to the idea of this problem, and if you have some decent wits about you, you’ll ask the question you seem to be getting at…what if there’s something other than first-order autocorrelation?
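
If it helps to see what the statistic actually measures, here is a minimal sketch in Python. The function name and the two residual series are made up for illustration; in practice the residuals would come from your fitted regression. The statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals, so it is roughly 2(1 - r), where r is the lag-1 sample autocorrelation of the residuals. Values near 2 suggest no first-order autocorrelation, values toward 0 suggest positive autocorrelation, and values toward 4 suggest negative autocorrelation.

```python
def durbin_watson(residuals):
    """First-order Durbin-Watson statistic.

    DW = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2
    """
    e = list(residuals)
    numerator = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    denominator = sum(x * x for x in e)
    return numerator / denominator


# Hypothetical residuals, chosen only to show the two extremes.
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]    # neighbours tend to agree
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]   # neighbours tend to disagree

print("persistent :", durbin_watson(persistent))   # well below 2
print("alternating:", durbin_watson(alternating))  # well above 2
```

Note that this only computes the statistic; deciding whether it is "far enough" from 2 still needs the tabulated lower and upper bounds, which is where the inconclusive region comes from.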

There are more tests, of course. Durbin-Watson statistics have also been developed for higher-order autocorrelation: these are generalized DW statistics (analogous to ANOVA being a generalization of the independent-samples t-test, for example). These generalized DW statistics should be used sequentially, though, because each one assumes no lower-order autocorrelation. If you suspect third-order autocorrelation, you need to test first order, then second order (assuming the first-order test showed insufficient evidence of autocorrelation), and only then third order (assuming the second-order test also showed insufficient evidence). Read the PSU master’s in applied statistics course page if you want some more background, and poke around on there; it doesn’t cover higher-order DW, but it discusses other kinds of autocorrelation testing and things to think about. This SAS documentation has some limited background on the generalized DW statistics.
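
As a rough sketch of the higher-order idea, the order-j statistic is usually written as d_j = sum over t of (e_t - e_{t-j})^2, divided by the sum of e_t^2, which is just the ordinary DW with the lag changed from 1 to j. That form is my assumption about the definition being referred to (check the documentation for the exact version and for p-values), and the function name and residuals below are invented for illustration.

```python
def generalized_dw(residuals, order):
    """Order-j Durbin-Watson-type statistic (assumed form):

    d_j = sum_{t=j+1..n} (e_t - e_{t-j})^2 / sum_{t=1..n} e_t^2
    """
    e = list(residuals)
    if not 1 <= order < len(e):
        raise ValueError("order must be between 1 and len(residuals) - 1")
    numerator = sum((e[t] - e[t - order]) ** 2 for t in range(order, len(e)))
    denominator = sum(x * x for x in e)
    return numerator / denominator


# Hypothetical residuals that repeat every four observations,
# e.g. a quarterly pattern the regression failed to capture.
seasonal = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0]

# Work up through the orders in sequence, as described above.
for j in (1, 2, 3, 4):
    print(f"d_{j} = {generalized_dw(seasonal, j):.3f}")
```

On this made-up series the lag-1 statistic is 1.5 while the lag-4 statistic is 0, which is the kind of pattern a first-order-only test can miss. The sketch does not compute critical values, so it shows the statistics, not the test decisions.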

To touch on the “inconclusive” comment: having this region is both good and bad for the test. Really, no statistical test can render a “conclusive” answer. This whole “p<.05” nonsense that people use in research is problematic in that it creates a false dichotomy of “real” or “conclusive” that doesn’t exist; the way most non-statisticians (those without an MS or PhD in statistics) “understand” p-values is not how they were designed to be used. So, the DW inconclusive region is good, because it points out that statistics doesn’t give conclusive answers to anything, but it’s bad because, without deeper knowledge, people might incorrectly conclude that other tests lacking this “inconclusive” region are somehow giving a conclusive answer. In reality, there isn’t a meaningful difference between a p-value of .04 and one of .06; neither is “conclusive”, and both show a similar level of evidence against the null. The end decision of “significant” or not depends on many factors and boils down to the question “do we have sufficient or insufficient evidence, for this case, to make a conclusion?”
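
To put a number on the .04 versus .06 point, here is a small sketch. It assumes a two-sided z-test purely for illustration (nothing above depends on that choice), and the helper name is made up. It converts each p-value back into the size of the test statistic that would produce it.

```python
from statistics import NormalDist


def z_for_two_sided_p(p):
    """|z| that gives a two-sided p-value of p under a standard normal null."""
    return NormalDist().inv_cdf(1 - p / 2)


z_04 = z_for_two_sided_p(0.04)
z_06 = z_for_two_sided_p(0.06)

print(f"p = .04 corresponds to |z| = {z_04:.2f}")
print(f"p = .06 corresponds to |z| = {z_06:.2f}")
print(f"difference in |z|      = {z_04 - z_06:.2f}")
```

Under that assumption the two test statistics come out at about 2.05 and 1.88, so the observed data are nearly equally far from the null in both cases even though they land on opposite sides of the .05 line.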

Also wanted to say that “inconclusive” is a good way to think of “fail to reject H0” in the sense that you’re not saying “no” but rather saying “I’m not comfortable saying yes without more information”…