When we drop a variable to remove multicollinearity, doesn't that lead to model misspecification, where the error term is correlated with the remaining independent variable?
Yes, if the dropped variable belongs in the model, you will violate the zero conditional mean assumption, and OLS will be biased and inconsistent.
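To make the bias concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not something from the curriculum: two regressors with correlation of about 0.9, true slopes of 1 on both, standard normal errors, and the hypothetical helper name `simulate`. With that setup, the slope on x1 when x2 is dropped should land near 1 + 1 × 0.9 = 1.9, while the full regression should recover a slope near 1.

```python
import random

def simulate(n=20000, rho=0.9, seed=0):
    """Compare the slope on x1 with and without the correlated regressor x2."""
    rng = random.Random(seed)
    x1, x2, y = [], [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        # Build x2 so that corr(x1, x2) is approximately rho.
        b = rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        x1.append(a)
        x2.append(b)
        # Assumed true model: both slopes equal 1.
        y.append(1.0 * a + 1.0 * b + rng.gauss(0, 1))

    def cov(u, v):
        mu, mv = sum(u) / len(u), sum(v) / len(v)
        return sum((p - mu) * (q - mv) for p, q in zip(u, v)) / len(u)

    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)

    # Short regression: y on x1 only (x2 is pushed into the error term).
    short_b1 = s1y / s11
    # Full regression: solve the 2x2 normal equations for the slope on x1.
    det = s11 * s22 - s12 ** 2
    full_b1 = (s1y * s22 - s2y * s12) / det
    return short_b1, full_b1

short_b1, full_b1 = simulate()
print(f"x2 dropped: b1 = {short_b1:.3f}; x2 included: b1 = {full_b1:.3f}")
```

The gap between the two estimates does not shrink as n grows, which is what "inconsistent" means here.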
Then why does the curriculum recommend omitting a variable instead of applying a different method (GLS, for example)?
And one unrelated question: if we can't use the Durbin-Watson statistic for AR models, why do they continue to give us the Durbin-Watson stat in the provided table?
Thank you, tickersu.
If two variables are highly correlated, it is redundant to include both, so it is better to drop one of them. You reduce MC but still explain your dependent variable well (assuming the remaining variables are correct for the specification).
About D-W, tickersu is right: more data is given than you need, and you have to select what is necessary from the whole pool.
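For reference, the statistic itself is easy to compute from a residual series, which may help in recognising when the number in the table is relevant. The sketch below uses the standard definition (sum of squared successive differences over sum of squared residuals); the function name and the toy residuals are made up for illustration. Values near 2 suggest no first-order serial correlation, values toward 0 suggest positive correlation, and values toward 4 suggest negative correlation. When the regression includes a lagged dependent variable, as in an AR model, the statistic tends to be pulled toward 2, so it is not a reliable test there.

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # identical residuals: 0.0
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # alternating signs: 3.0
```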
I fully accept what you are saying: less-than-perfect MC does not violate the OLS assumptions, but it does affect the efficiency of the parameter estimators. If you add many correlated variables with the aim of raising explanatory power (assuming the model is correctly specified), you are just blowing up the variance, and you get a marginally better explanation in exchange for a stratospheric increase in variance… I have no experience fixing MC in real life, but I think what you are describing fits a very uncommon case where explanatory power matters much more than the extra variance. In the rest of the cases we can just “drop” that variable and still have a good day.
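To put a rough number on "blowing up the variance": in the two-regressor case, the variance of each slope estimate is multiplied by the variance inflation factor, 1 / (1 − r²), where r is the correlation between the two regressors. The snippet below is only a sketch of that formula; the function name `vif` and the example correlations are chosen for illustration.

```python
def vif(r):
    """Variance inflation factor for a slope when the two regressors have correlation r."""
    return 1.0 / (1.0 - r ** 2)

for r in (0.0, 0.5, 0.9, 0.99):
    # Standard errors scale with the square root of the VIF.
    print(f"r = {r:4.2f}  VIF = {vif(r):6.2f}  SE multiplier = {vif(r) ** 0.5:5.2f}")
```

At r = 0.9 the variance is already more than five times what it would be with uncorrelated regressors, and at r = 0.99 it is about fifty times.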
An omitted variable invalidates OLS because the error term absorbs it, so the errors are no longer uncorrelated with the included regressors; that's fine. However, I don’t see how, when building a model, you could conclude (1) that an X variable highly correlated with another X variable is so necessary that the model is invalid without it (this looks like a coincidence to me; the case is rare); (2) to what extent the omitted variable can really destroy your model, given that the two variables are highly correlated; and (3) how you can rely on parameters that have very wide confidence intervals even at very low alphas.
I personally believe that if you encounter such a case in real life, the most responsible conclusion you can state is that your model has a very wide prediction range, so forecasting would be a headache.