Multicollinearity and omitting a correlated variable - Contradiction

Hello everyone

I am a little confused about the multicollinearity and misspecification thing…

When two independent variables are correlated, it’s multicollinearity, and to correct that the Schweser text says to omit one or more of the correlated variables…


Going forward, when it goes on to explain misspecification, it says that if an independent variable (say A) is omitted which is correlated with another independent variable (say B), then the error term is also correlated with the remaining variable (i.e. B), resulting in biased and inconsistent regression coefficients…!!

Why does excluding a correlated independent variable cause a problem when we were supposed to do exactly that to correct multicollinearity… I might be missing a connection b/w the two? SOMEONE PLZ HELP MEEE!!

You’re not missing anything. It’s just that CFAI is terrible at many basic statistical concepts and the prep providers are usually worse than the CFAI.

In real life, there are many more fixes for multicollinearity, and there are many times where a fix isn’t needed even though multicollinearity is substantially present.

For the test, they likely won’t make you choose between the two, but dropping a correlated independent variable is a possible solution. However, this may cause additional issues. The absence of multicollinearity isn’t an assumption needed for unbiasedness and consistency, whereas you can violate an assumption that is needed (the error term being uncorrelated with the independent variables) by excluding a truly important variable that is correlated with another variable in the model.

For the exam, just recall that dropping a correlated x variable might be reasonable. Also recognize that doing so may cause the model to lose some nice properties. In real life, you need to balance the benefits, but on the test, they probably won’t have you do that.
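
If it helps to see the difference in numbers, here is a minimal simulation sketch in Python with NumPy. Everything in it is made up for illustration: the data-generating process, the true coefficients (2 on A, 3 on B), the correlation of about 0.8 between A and B, and the helper name ols. It is not from the curriculum or from Schweser.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # large sample, so sampling noise is small

# Two correlated regressors (A and B as in the question; values are assumed)
B = rng.normal(size=n)
A = 0.8 * B + 0.6 * rng.normal(size=n)   # var(A) = 1, corr(A, B) about 0.8

# Assumed true model: y depends on BOTH A and B
y = 1.0 + 2.0 * A + 3.0 * B + rng.normal(size=n)

def ols(y, *cols):
    """Least-squares fit with an intercept; returns [intercept, slopes...]."""
    X = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

full = ols(y, A, B)   # correctly specified: both regressors kept
short = ols(y, B)     # A dropped: its effect leaks into the slope on B

print("full model  [const, A, B]:", np.round(full, 2))
print("short model [const, B]   :", np.round(short, 2))
```

With both variables in the model, the slopes should land close to the true 2 and 3 despite the correlation. With A dropped, the slope on B should drift toward 3 + 2 × 0.8 = 4.6, and it stays there no matter how large the sample is. That is the bias and inconsistency the misspecification section is talking about.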

Thanks a lot, tikersu.

Pls help me with one more thing. The text says “even though multicollinearity does not affect the consistency of slope coefficients, such coefficients themselves tend to be unreliable”

what does this hairy sentence mean?

The desirable properties of the coefficient estimator, unbiasedness and consistency, are unaffected by multicollinearity. However, the correlation among the X variables makes it hard to separate each variable’s individual relationship with the dependent variable. This shows up as inflated standard errors, so in any given sample the estimated coefficients may have different signs or magnitudes than they should, purely as an artifact of the interrelatedness of the x-variables.

Here is a quick way to think of it: Imagine you are a teacher in the classroom. If students raise hands and wait to be called on before answering, you can discern who gave what answer and who was correct (less multicollinearity among predictors). If students shout out the answers to questions, it becomes difficult to know who gave the correct answer, but you just know the correct answer was said (relatively more multicollinearity among predictors makes attribution harder).

This is what they mean by “unreliable”: caution should be used when trying to interpret the individual coefficients, but it’s pretty much irrelevant if all you want the regression for is to generate predicted values.
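
A rough way to check the “unbiased but unreliable” idea yourself is a small Monte Carlo experiment. The sketch below is only an illustration under assumed settings: true slopes of 1 on both x1 and x2, samples of 50 observations, 2000 repetitions, and a hypothetical helper called simulate. It re-estimates the same regression many times and looks at how the estimated slope on x1 varies from sample to sample.

```python
import numpy as np

def simulate(rho, n=50, reps=2000, seed=0):
    """Return the estimated slope on x1 from `reps` independent samples,
    where x1 and x2 have population correlation `rho`."""
    rng = np.random.default_rng(seed)
    b1_hats = np.empty(reps)
    for r in range(reps):
        x1 = rng.normal(size=n)
        x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)
        y = 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)  # assumed true slopes: 1 and 1
        X = np.column_stack([np.ones(n), x1, x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        b1_hats[r] = beta[1]
    return b1_hats

low = simulate(rho=0.0)    # uncorrelated predictors
high = simulate(rho=0.99)  # severe multicollinearity

for name, est in [("rho = 0.00", low), ("rho = 0.99", high)]:
    print(name,
          "| mean of estimates:", round(est.mean(), 2),
          "| std of estimates:", round(est.std(), 2),
          "| share with wrong sign:", round((est < 0).mean(), 2))
```

In both cases the average of the estimates should sit near the true value of 1 (unbiasedness is intact). With rho = 0.99, though, the spread across samples should be several times larger and a noticeable share of individual estimates should come out negative, so any single regression can easily give you a misleading coefficient.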

Great analogy… thank you again… much clearer!!

I’m glad you found it useful!

Tikersu, another problem with stats that I have… sorry I am bothering you a lot, but I really like the way you make things clear

In an autoregressive model the time series has to be covariance stationary (constant mean, constant variance and constant covariance)… But when you have these assumptions, why would you then need to forecast using an autoregressive model… Wouldn’t the forecasted value then be similar to the past values, i.e. the forecast will also be the mean… What’s the use of doing the regression then???

Tikersu / smagician pls help me with my previous post