Omitting an independent variable

BB No. 10 from Section Multiple Regression:

There are two ANOVA tables: one with two X variables, the other with one (the second X is omitted).

The conclusion of the example is that the coefficient of X and the intercept changed. These results illustrate that omitting an independent variable can cause the remaining regression coefficients to be inconsistent.

WHY?

Moosey are you studying again or do you just love the level 2 curriculum?

Omitting an independent variable can cause the estimators of the coefficients of the remaining independent variables to be biased and inconsistent.

Bias means the coefficient estimated is not equal to the true coefficient. Consistency means that when the number of observations increases, the coefficient estimated converges to the true coefficient.

I don’t have the Curriculum now, but the fact that the coefficient of X and the intercept changed demonstrates only “the remaining regression coefficients are biased”. And IMHO, it’s a bit hasty to conclude that “the remaining regression coefficients are inconsistent”.

Since no one has answered your question of WHY, I think this does a pretty good job of explaining why the estimate will be inconsistent.

https://en.m.wikipedia.org/wiki/Omitted-variable_bias

You can see in the linear regression example how you are attempting to estimate b but by omitting the variable z you end up estimating b+cf instead.

Now, your estimate of b+cf will not be inconsistent (obviously), but if you are attempting to actually estimate b then with the estimates converging to b+cf instead, you clearly aren’t converging to b no matter how many observations you take; hence, inconsistent.
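To make the convergence argument concrete, here is a minimal simulation sketch. It assumes the setup from the linked article (y = a + bx + cz + u, with z = d + fx + e); the parameter values, the normal error terms, and the function name `short_regression_slope` are illustrative choices of mine, not anything from the Curriculum example.

```python
import numpy as np

def short_regression_slope(n, a=1.0, b=2.0, c=3.0, d=0.5, f=0.7, seed=0):
    """Simulate y = a + b*x + c*z + u with z = d + f*x + e,
    then regress y on x alone (z is omitted)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = d + f * x + rng.normal(size=n)          # z is correlated with x through f
    y = a + b * x + c * z + rng.normal(size=n)  # the "true" relationship
    # np.polyfit with degree 1 returns [slope, intercept]
    slope, _intercept = np.polyfit(x, y, 1)
    return slope

# Assumed values: b = 2.0 and b + c*f = 2.0 + 3.0*0.7 = 4.1
for n in (100, 10_000, 1_000_000):
    print(n, short_regression_slope(n))
```

With these assumed values, the slope should settle near b + cf = 4.1 as n grows rather than near b = 2.0, which is the sense in which the short regression is inconsistent for b.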

Hope that helps!

Careful with your wording; lack of bias means the expected value of the estimator, not the estimate, will equal the true value of the coefficient. Therefore, bias would mean the expected value of the estimator does not equal the true value of the coefficient.
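The estimate/estimator distinction can be sketched with a small Monte Carlo experiment: each simulated sample gives one estimate, and the average over many samples approximates the expected value of the estimator. The data-generating process, the parameter values, and the name `simulate_estimates` below are assumptions made only for illustration.

```python
import numpy as np

def simulate_estimates(n_samples=2000, n=50, b=2.0, c=3.0, f=0.7, seed=1):
    """Draw many samples of size n and return the short-regression slope from each."""
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_samples)
    for i in range(n_samples):
        x = rng.normal(size=n)
        z = f * x + rng.normal(size=n)       # omitted variable, correlated with x
        y = b * x + c * z + rng.normal(size=n)
        estimates[i] = np.polyfit(x, y, 1)[0]  # slope of y on x only
    return estimates

estimates = simulate_estimates()
print("a single estimate:        ", estimates[0])
print("average of the estimates: ", estimates.mean())
print("true b (assumed):         ", 2.0)
```

Individual estimates scatter from sample to sample; bias is a statement about where their average sits relative to the true b, not about any single number.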

Estimate ≠ estimator

Amazing things happen when you look at the mathematics behind the methods, eh? Keep in mind that consistency is about sampling variation of the estimates at a given sample size. We could talk about probability limits and asymptotic unbiasedness, but that might conflate ideas for people on the exam.

You are right. The correct term is “expected value of the estimator”, not “coefficient estimated”. I wanted to make things simple but I made it simplistic.

The wiki demonstration supposes that we know the “true cause-and-effect relationship” is

$$y = a + bx + cz + u,$$

and with this we can show that there is bias in the estimation of the remaining coefficient (on variable X). The bias is quantified by this equation:

$$E[\hat{\beta} \mid X] = \beta + (X'X)^{-1} E[X'Z \mid X]\,\delta = \beta + \text{bias}.$$

But in my opinion, we can never know the expectation of X’Z. We can only estimate it (the expectation) via an estimator, for example (1/n * Sum(xi * zi)) / (1/n * Sum(xi^2)).

The fact that the coefficient changes means the estimator (1/n * Sum(xi * zi)) / (1/n * Sum(xi^2)) is different from zero in this sample, but it doesn’t mean that it converges to something different from 0 when n tends to infinity.

So that’s why I don’t think we can illustrate the consequence of omitting an independent variable in practice, and why I think the Curriculum’s conclusion is hasty. Having only some observations (x1, x2, …, xn), (y1, y2, …, yn) and (z1, z2, …, zn), and the fact that the remaining coefficients change after omitting the variable Z, we can only conclude that the remaining coefficients may be inconsistent.
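The finite-sample point can be illustrated with a short sketch. Here x and z are drawn independently, so their population covariance is zero by construction (an assumption of this example, not something known about the Curriculum data), yet the sample ratio Sum(xi * zi) / Sum(xi^2) is still nonzero at any finite n. The function name `sample_ratio` is hypothetical.

```python
import numpy as np

def sample_ratio(n, seed=2):
    """Sample analogue of (X'X)^{-1} X'Z for a single regressor without intercept."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    z = rng.normal(size=n)  # drawn independently of x: population covariance is zero
    return np.sum(x * z) / np.sum(x ** 2)

for n in (20, 2_000, 200_000):
    print(n, sample_ratio(n))
```

A nonzero value in one small sample is therefore compatible with a limit of zero, which is why a single pair of regression outputs cannot by itself settle the consistency question.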

PS: the remaining coefficient beta is estimated as (estimator of the true beta) + (estimator of the bias), given below:

$$\hat{\beta} = (X'X)^{-1} X'Y,$$

Estimator of the bias (for example): (1/n * Sum(xi * zi)) / (1/n * Sum(xi^2))

I usually dig into the details when I learn something (CFA or otherwise) because I genuinely think that’s the right way to build understanding. That tends to work on the CFA exams although it doesn’t do that well for FRA (since the answers are often “because that’s the rules” which lends itself more to memorizing which I’m not good at).

I’m not really interested in getting into a debate about the mathematics underneath this question (#lazy :) ). I showed you all how it would work; feel free to take from it whatever conclusions you guys want. :)

I agree with you. There is no debate here. What I want to say is that the Curriculum is not sufficiently clear.

The CFAI curriculum is incredibly bad for almost every topic related to statistics. If you really want to learn this stuff pick up a statistics text written by someone with the appropriate academic background such as a PhD or MS in statistics (you could probably even do an econometric book if you want something more focused on econ and finance…econometricians really enjoy the kinds of topics discussed in this thread). It’s uncommon to find people without a formal background in statistics who truly know what they’re doing. Unfortunately, it’s surprisingly common that these people publish “popular” books.

Here is something related to this thread: take the case of a simple linear regression… if you add an independent variable to this regression so that two Xs are present now, and if the coefficient of the original independent variable changes, you can conclude that the variables are correlated (pretty common). If the coefficient doesn’t change between the SLR and the MLR, it implies that the true coefficient of the new variable is zero (no real effect), the correlation between the new and original variable is zero, or that both the coefficient and correlation are zero.
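The SLR versus MLR comparison rests on an in-sample identity: the SLR slope equals the MLR slope on the original X plus the MLR coefficient on the new variable times the slope from regressing the new variable on X. A minimal sketch with made-up data (all values and the helper `ols` are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
z = 0.6 * x + rng.normal(size=n)                   # new variable, correlated with x
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)

def ols(columns, target):
    """Least-squares coefficients [intercept, slope_1, slope_2, ...]."""
    X = np.column_stack([np.ones(len(target))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef

b_short = ols([x], y)[1]            # SLR: y on x
_, b_long, c_long = ols([x, z], y)  # MLR: y on x and z
f_hat = ols([x], z)[1]              # auxiliary regression: z on x

# The two printed numbers agree up to floating-point error
print(b_short, b_long + c_long * f_hat)
```

So the SLR coefficient matches the MLR coefficient exactly when either the estimated coefficient on the new variable or the sample slope of the new variable on X is zero.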

I certainly love it. :-))

I decided that until results come out I will review those parts of the material which I left out due to time constraints, or where I had question marks but did not go into them. Not more than 20 minutes a day.

Thanks guys, it’s starting to be a bit clearer.

This kind of attitude and curiosity deserves to pass. Good for you. I hope you receive a pass when scores are released.