I came across the following solution to a practice problem:

“In the presence of multicollinearity, the regression coefficients would still be consistent but unreliable.”

***

In that context, which of the following properties of an estimator are we referring to when we say a slope coefficient is “unreliable”? I assume it’s not referring to consistency given the solution text?

*unbiased*: the expected value of the statistic equals the value of the parameter it estimates.

*efficient*: of all unbiased estimators, it has the smallest sampling error.

*consistent*: as the sample size increases, the sampling error decreases.

“Unreliable” means that it might be wrong (i.e., biased).

When multicollinearity exists, the effect of one independent variable might be ascribed to another independent variable; i.e., the slope coefficients for those two variables might both be wrong.

My guess is that their “correct” answer is “unbiased”, but this is factually, 1000000% wrong: multicollinearity does not introduce bias into the estimators. (Based on my past issues with the CFA Institute on this, and the general quality of the stats review materials that prep providers produce for the CFA curriculum, I will be extremely surprised if they answered correctly with “efficient”.)

By process of elimination, and knowing what multicollinearity does, the correct answer is efficiency. The standard errors are inflated, which can vastly increase the sampling variability and make the coefficient estimates “unstable” and “unreliable”; unstable is really the better term. And you can find many references stating that multicollinearity makes the least squares estimators no longer the MVUE (minimum variance unbiased estimator) due to loss of efficiency.

The problem of multicollinearity is that it becomes hard to pin down the individual relationships of each collinear predictor with Y, but as a group the relationship is well known; there is no statistical bias introduced into the individual estimators. This can be shown by derivation and more easily (for many people) by simulation.
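Here is a rough sketch of the kind of simulation I mean. Everything in it is made up for illustration: the true slopes (2 and 3), the sample size, the number of replications, and the correlation levels are all assumptions of mine, not anything from the practice problem. It repeatedly draws samples with two predictors that are either uncorrelated or highly correlated, fits OLS each time, and then looks at the mean and spread of the slope estimates across samples.

```python
import numpy as np

def simulate_slopes(rho, n_obs=50, n_sims=2000, seed=0):
    """Fit OLS on n_sims simulated samples; return the (b1, b2) estimates.

    rho is the population correlation between the two predictors.
    All numeric settings here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, 2.0, 3.0])          # intercept, b1, b2 (assumed true values)
    cov = np.array([[1.0, rho], [rho, 1.0]])  # predictor covariance matrix
    estimates = np.empty((n_sims, 2))
    for i in range(n_sims):
        x = rng.multivariate_normal([0.0, 0.0], cov, size=n_obs)
        X = np.column_stack([np.ones(n_obs), x])          # add intercept column
        y = X @ beta + rng.normal(0.0, 1.0, size=n_obs)   # error sd = 1
        b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS fit
        estimates[i] = b_hat[1:]                          # keep the two slopes
    return estimates

TRUE_SLOPES = np.array([2.0, 3.0])
low = simulate_slopes(rho=0.0)    # no collinearity
high = simulate_slopes(rho=0.99)  # severe collinearity

for label, est in [("rho=0.00", low), ("rho=0.99", high)]:
    print(label)
    print("  mean of slope estimates:", est.mean(axis=0).round(2))
    print("  sd of slope estimates:  ", est.std(axis=0, ddof=1).round(2))
    print("  sd of (b1 + b2):        ", est.sum(axis=1).std(ddof=1).round(2))
```

With settings like these, the average of the estimates should sit close to the true slopes in both cases (no bias), while the spread of the individual slope estimates should be several times larger in the collinear case. The spread of the sum b1 + b2 should stay small, which is the "well known as a group" point.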

I’ll have to disagree; “unreliable” refers to the higher sampling variation of the collinear coefficients (represented by their inflated standard errors), and this explains why the calculated values may differ greatly in magnitude and sign from what theory or other evidence would suggest. It helps to remember that a standard error indicates the sampling variation (the spread of the sampling distribution) of the statistic at the given sample size.

The estimators themselves do not have bias because their expected values still equal the population values. The individual values that we calculate (estimates, not estimators) will (basically) always be “incorrect” (different from the true value), but it is their expected values that tell us about statistical bias, i.e., the difference between the mean (expected) value of the estimator and the population value, which is zero if the estimator is unbiased.
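For anyone who prefers the derivation, here is a short sketch. The notation is mine, and it assumes the usual linear model $y = X\beta + \varepsilon$ with $E[\varepsilon \mid X] = 0$, $\operatorname{Var}(\varepsilon \mid X) = \sigma^2 I$, an intercept in the model, and no perfect collinearity (so $X^\top X$ is invertible).

$$
\hat\beta = (X^\top X)^{-1} X^\top y = \beta + (X^\top X)^{-1} X^\top \varepsilon
\quad\Longrightarrow\quad
E[\hat\beta \mid X] = \beta
$$

$$
\operatorname{Var}(\hat\beta_j \mid X) = \frac{\sigma^2}{\left(1 - R_j^2\right) \sum_i \left(x_{ij} - \bar{x}_j\right)^2}
$$

Here $R_j^2$ is the R-squared from regressing predictor $j$ on the other predictors. The first line does not depend on how correlated the predictors are, so there is no bias. In the second line, as $R_j^2$ approaches 1 the factor $1/(1 - R_j^2)$ (the variance inflation factor) blows up, which is where the inflated standard errors come from.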

I may have given the wrong impression. The quoted statement in my original post was in the answer explanation - I wasn’t sure what they’d meant by “unreliable” and assumed that it must correspond to one of the three properties of an estimator.