Unfortunately, the book does a pretty poor job of covering statistics (I’ve said this before, and multicollinearity is one of its really weak points). Also, I wouldn’t call it a “residual” factor; it already has an unambiguous name, the Variance Inflation Factor (VIF). It tells you how much the variance of coefficient i is inflated (and its standard error is inflated by the square root of the VIF).

I sent you a PM when I made the first post to try to clarify what the text says. I’ll repost some of what I wrote before and try to explain it further.

**The F-statistic (and R-squared) is not increased or decreased purely due to multicollinearity.** This is because OLS remains unbiased in the presence of multicollinearity (i.e. overall model fit is unaffected). Anyone telling you otherwise is mistaken.

**The coefficient SEs can be inflated because they are multiplied by a factor of sqrt[1/(1 − R-squared_i)], where R-squared_i is the R-squared from a regression in which Xi is the dependent variable and all the other X variables are the independent variables.**

If you are trying to predict Y with X1, X2, and X3, you can assess multicollinearity by running a regression with X1 as the dependent variable and X2 and X3 as the predictor variables:

**X1 = b0 + b2X2 + b3X3**

Get the R-squared value for this regression. This value essentially tells you how much of the variation in X1 is shared with X2 and X3 *together*. Subtracting this value from 1 gives you the variation in X1 that is unique, i.e. not shared with X2 and X3. The more unique information, the less the coefficient variance (and standard error) will be inflated in the original regression. To quantify the inflation, take the reciprocal of (1 − R-squared) from the regression we just ran. This is the VIF (not mentioned in the curriculum, because the curriculum gives insufficient coverage of the topic).
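Here is a minimal NumPy sketch of that calculation, using made-up toy data (the variable names and the degree of correlation are my own assumptions, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: X1 deliberately shares variance with X2 and X3 (assumed for illustration)
n = 500
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.8 * x2 + 0.5 * x3 + rng.normal(size=n)

# Auxiliary regression: X1 on X2 and X3 (with an intercept)
X_aux = np.column_stack([np.ones(n), x2, x3])
beta, *_ = np.linalg.lstsq(X_aux, x1, rcond=None)
resid = x1 - X_aux @ beta

# R-squared of the auxiliary regression: share of X1's variation explained by X2, X3
r2 = 1 - resid.var() / x1.var()

# VIF is the reciprocal of the unique (unshared) fraction of variation
vif = 1 / (1 - r2)
print(f"R-squared_1 = {r2:.3f}, VIF_1 = {vif:.3f}")
```

With these toy coefficients, X2 and X3 explain roughly half of X1’s variation, so the VIF comes out close to 2.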

If the VIF for X1 is 5, for example, then the variance of coefficient b1 will be 5 times as large as it would be if X2 and X3 were not related at all to X1 in the regression of Y with X1, X2, X3 (and the standard error of b1 will be sqrt(5) times larger).

You can (and statistical software does) do the same for X2, with X1 and X3 as predictors, and finally for X3, with X1 and X2 as predictors.
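That loop over predictors can be sketched as a small helper function (again NumPy, again with made-up data; the function name `vif_all` is mine, not from any particular package):

```python
import numpy as np

def vif_all(X):
    """VIF for each column of X, by regressing column i on the remaining columns.

    X is an (n, k) array of predictor values, without an intercept column.
    """
    n, k = X.shape
    vifs = np.empty(k)
    for i in range(k):
        y = X[:, i]                               # treat predictor i as the target
        others = np.delete(X, i, axis=1)          # all remaining predictors
        Z = np.column_stack([np.ones(n), others]) # add an intercept
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        r2 = 1 - ((y - Z @ beta) ** 2).sum() / ((y - y.mean()) ** 2).sum()
        vifs[i] = 1 / (1 - r2)
    return vifs

# Toy usage: three mutually correlated predictors
rng = np.random.default_rng(2)
x2 = rng.normal(size=300)
x3 = 0.5 * x2 + rng.normal(size=300)
x1 = 0.6 * x2 + 0.4 * x3 + rng.normal(size=300)
vifs = vif_all(np.column_stack([x1, x2, x3]))
print(vifs)  # one VIF per predictor, each at least 1
```

Every VIF is at least 1 by construction; a common (somewhat arbitrary) rule of thumb flags values around 5 or 10 as a sign of problematic multicollinearity.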

Let me know if it’s still unclear, and I can probably find another resource. It’s partly that I’m being lazy, and partly that this is a pain to explain using only text.