Omitting the variable correlated with another independent variable

Hi,

In the CFAI curriculum there is a short passage about misspecification of the multiple regression model. It’s on page 356, Reading 10, section 5.2. Here it is:

If the omitted variable (X2) is correlated with the remaining variable (X1), then the error term in the model will be correlated with (X1), and the estimated values of the regression coefficients a0 and a1 would be biased and inconsistent. In addition, the estimates of the standard errors of those coefficients will also be inconsistent, so we can use neither the coefficients estimates nor the estimated standard errors to make statistical tests.

(Institute 356)

Institute, CFA. 2018 CFA Program Level II Volume 1 Ethical and Professional Standards, Quantitative Methods, and Economics. CFA Institute, 07/2017. VitalBook file.

OK, from the CFAI curriculum I know that correlation does not mean multicollinearity, but let’s say that correlation raises the chance of encountering multicollinearity. As far as I know, econometricians generally do not welcome two strongly correlated independent variables in the same model. So my questions are:

  1. Why would the error term be correlated with the independent variable?

  2. OK, I understand that the standard errors will then be wrong, but why would the coefficients be biased and inconsistent? I know they would change compared to the model with the omitted variable included, but why would they automatically be wrong?

  3. Wouldn’t we then just have a problem with heteroskedasticity?

Please utilize the search function. I know I have personally addressed this question at least a handful of times.

Technically, any correlation between two or more independent variables is collinearity; the term is generally used, however, only when that correlation is strong enough to be a problem. In other words, multicollinearity is a spectrum, a matter of degree.

The CFAI curriculum is really crappy for most of the statistical topics. Multicollinearity is perfectly acceptable in some cases but can pose a huge issue in need of remedy in other contexts.

The error term absorbs unmeasured variables. If you have two correlated independent variables and you do not include one in the model, it is part of the error term and then the error term is related to the included IV.
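
To make that concrete, here is a rough sketch in symbols. The names b0, b1, b2, ε (the true error) and u (the error of the model you actually fit) are my own notation, not the curriculum's, and I am assuming ε is uncorrelated with X1:

```latex
\begin{align*}
Y &= b_0 + b_1 X_1 + b_2 X_2 + \varepsilon
  && \text{(true model, } \operatorname{Cov}(X_1, \varepsilon) = 0\text{)} \\
Y &= b_0 + b_1 X_1 + u, \quad u = b_2 X_2 + \varepsilon
  && \text{(model with } X_2 \text{ omitted)} \\
\operatorname{Cov}(X_1, u) &= b_2 \operatorname{Cov}(X_1, X_2) + \operatorname{Cov}(X_1, \varepsilon)
  = b_2 \operatorname{Cov}(X_1, X_2)
\end{align*}
```

So whenever X2 actually matters (b2 is not zero) and X1 and X2 are correlated, the error term u of the short model is correlated with X1.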

For 2, review the regression assumptions required for unbiased and consistent estimation. You’ll then realize why this can be an issue (or again, search the forum for more detailed, prior answers). It’s wrong because the true model includes x1 and x2, and you’ve only included x1 in yours.
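
If it helps to see it with numbers, here is a small simulation sketch in Python with NumPy. Everything in it is made up for illustration: the true coefficients (b0, b1, b2), the correlation rho between x1 and x2, and the sample size are assumptions of mine, not values from the curriculum. It fits the model with and without x2 and compares the slope on x1 to the usual omitted-variable result, b1 + b2 * Cov(x1, x2) / Var(x1).

```python
import numpy as np

# Hypothetical "true" model: y = b0 + b1*x1 + b2*x2 + e (values chosen for illustration)
b0, b1, b2 = 1.0, 2.0, 3.0
rho = 0.5          # assumed correlation between x1 and x2
n = 100_000        # large sample, so a persistent gap reflects inconsistency, not noise

rng = np.random.default_rng(0)
x1 = rng.standard_normal(n)
# Build x2 with unit variance and correlation rho with x1
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)
e = rng.standard_normal(n)          # true error, independent of x1 and x2
y = b0 + b1 * x1 + b2 * x2 + e


def ols(y, *regressors):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


full = ols(y, x1, x2)   # correctly specified model
short = ols(y, x1)      # x2 omitted, so it ends up in the error term

# Var(x1) = 1 and Cov(x1, x2) = rho by construction in this sketch
expected_short_slope = b1 + b2 * rho

print("slope on x1, full model :", full[1])
print("slope on x1, x2 omitted :", short[1])
print("b1 + b2*Cov/Var         :", expected_short_slope)
```

The slope in the short model settles near b1 + b2 * rho rather than b1, and collecting more data does not close the gap. That is what "biased and inconsistent" means here.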

The search function is your friend.

thank you @tickersu

Sure thing. Let us know if there are still unanswered questions after your search!