Quant - Omitting a variable

cvd332 · May 8, 2010, 11:12am

Hi all,

I have a question on quant, multiple linear regression. I know that in a linear regression equation, y = ax1 + bx2 + … + e, we do not want any of the independent variables to be correlated with each other, or else we get multicollinearity. I've got that part.

What surprises me is Schweser page 207, which seems to conflict with page 206 (for me at least, which is why I am asking for help here). On page 206, Schweser says the correct equation is:

R = b0 + b1B + b2LnM + b3LnPB + b4FF + e

It then says LnM is statistically significant at the 1% level, so no doubt this variable needs to be in the equation. Below that, it says that if LnM is correlated with the other independent variables (B, LnPB, FF), the error term is also correlated with those same independent variables, and the resulting regression coefficients are biased and inconsistent.

Question 1: How can LnM be correlated with the other independent variables here, given that LnM is part of the correct model specification? If LnM really is correlated with the other independent variables, shouldn't it be left out of the correct specification, since otherwise we would have multicollinearity?

Question 2: What does the second statement mean, that the error term is also correlated with the same independent variables? Does it mean that if we omit a statistically significant variable, the coefficient estimates will be correlated with the error term, and hence we get conditional heteroskedasticity?

Thanks

cpk123 · May 8, 2010, 3:09pm

The original variables were LnM, Beta, LnPB and FF. They were all found to be statistically significant in the original model.

The explanation on that page is all about what happens when you omit the LnM variable because you found out that it was correlated with B and FF. Based on your new model, you might be interpreting Beta and Free Float as having much more to do with the stock return than they really do. As your original equation indicated, some of that return was actually explained by company size (LnM), which you have now left out. So your model is now misspecified because of the omitted variable: you are assigning too much of the “explanation” of return to FF and B, when some of it is really due to size.

Hope this helps.
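To make the point above concrete, here is a minimal simulation sketch, not taken from Schweser or the CFAI text. It assumes a simplified two-variable version of the model with made-up numbers: `beta` stands in for B, `size` stands in for LnM, and the true coefficients (2.0 and -1.5) and the 0.6 link between `size` and `beta` are purely hypothetical. It fits the full model and the model with `size` omitted, and checks that the error term of the short model is correlated with the remaining regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data: "size" plays the role of LnM and is correlated with "beta".
beta = rng.normal(size=n)
size = 0.6 * beta + rng.normal(size=n)
noise = rng.normal(size=n)

# Assumed true model (illustrative numbers only):
#   ret = 1.0 + 2.0*beta - 1.5*size + noise
ret = 1.0 + 2.0 * beta - 1.5 * size + noise


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


full = ols(ret, beta, size)   # correctly specified model
short = ols(ret, beta)        # model with "size" omitted

# With "size" left out, its effect moves into the error term of the short
# model, so that error term is correlated with the included regressor.
short_error = -1.5 * size + noise
corr = np.corrcoef(beta, short_error)[0, 1]

print("full model coefficients: ", full)
print("short model coefficients:", short)
print("corr(beta, error term of short model):", corr)
```

In this setup the full model should recover slopes close to 2.0 and -1.5, while the slope on `beta` in the short model should drift toward 2.0 + (-1.5)(0.6) = 1.1. Increasing `n` does not fix this, which is what "biased and inconsistent" means in practice.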

cvd332 · May 9, 2010, 2:19am

CPK, I understand your point, but somehow the CFAI EOC question isn't really that way. If you look at EOC Reading 12, no. 22: if an omitted variable is correlated with variables already included in the model, the coefficient estimates will be biased and inconsistent and the standard errors will be inconsistent. This is the correct statement according to CFAI.

What I do not understand is why removing a correlated variable makes our coefficient estimates biased and inconsistent and the standard errors inconsistent. I thought we had just got rid of multicollinearity? Isn't that what we are supposed to do: when we find that one of the variables is correlated with the other independent variables (even though we know that variable is statistically significant), we remove it?

BTW, just to clarify, omitted variable bias happens when we remove a variable with a statistically significant coefficient, right? But does the theory say anything about the omitted variable being correlated with the other independent variables?

Thanks in advance, CPK. You have been very helpful to all of us L2 candidates.
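On the last question, the standard textbook derivation does depend on that correlation. The sketch below is a simplified two-regressor version of the model in this thread (only B and LnM), and the symbols d_0, d_1 and u are introduced here for illustration; they are not from Schweser or the CFAI reading.

```latex
% True model (simplified to two regressors):
R = b_0 + b_1 B + b_2 \,\mathrm{LnM} + e

% Auxiliary relation between the omitted and the included variable:
\mathrm{LnM} = d_0 + d_1 B + u,
\qquad d_1 = \frac{\operatorname{Cov}(B, \mathrm{LnM})}{\operatorname{Var}(B)}

% Substituting gives the model that is actually estimated when LnM is omitted:
R = (b_0 + b_2 d_0) + (b_1 + b_2 d_1)\, B + (b_2 u + e)

% So the estimated slope on B converges to
b_1 + b_2 d_1
% and the bias b_2 d_1 is zero only if b_2 = 0 or Cov(B, LnM) = 0.
```

Read this way, omitting a variable that matters (b_2 not zero) and is correlated with an included variable (d_1 not zero) shifts the coefficient on B, and more data does not remove the shift. Multicollinearity in the full model is a different problem: it inflates standard errors but does not by itself bias the coefficients.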