Multicollinearity and inflated standard errors

Does anyone know why this tends to be the case in regression models that have multicollinearity?

tickersu does.

He’ll likely chime in here.

A standard error represents the uncertainty surrounding its estimate. For a consistent estimator, more information means less uncertainty. In the case of a regression coefficient, the more we know about the relationship of Y with X1, the smaller the standard error of beta 1 hat will be. If X1 and X2 are related, each one carries less unique information about its own relationship with Y. It becomes harder to disentangle how X1 or X2 relates to Y, so we have less certainty in beta 1 hat or beta 2 hat. That’s the detailed explanation.
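If a numerical illustration helps, here is a minimal simulation sketch (my own, not from the thread; the sample size, correlations, and coefficients are just illustrative choices). It draws X1 and X2 with increasing correlation and shows SE(beta 1 hat) climbing as the predictors become more related:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 500

for rho in (0.0, 0.5, 0.9, 0.99):
    # Draw X1, X2 from a bivariate normal with correlation rho
    cov = [[1.0, rho], [rho, 1.0]]
    X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    # True model: Y = 1 + 2*X1 + 3*X2 + noise
    y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(0.0, 1.0, size=n)

    fit = sm.OLS(y, sm.add_constant(X)).fit()
    # bse[1] is the standard error of beta 1 hat (the X1 coefficient)
    print(f"corr(X1, X2) = {rho:4.2f} -> SE(beta 1 hat) = {fit.bse[1]:.4f}")
```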

The simpler explanation comes from the formula for the standard error for beta i hat.

Suppose we have X1 and X2 to predict Y. The standard error of beta i hat, where i = 1 or 2, is

SE(beta i hat) = { [SS(error)/(n - 3)] / [SSxi * (1 - R2aux)] }^0.5

Where SS(error) is the sum of squared errors of prediction (so SS(error)/(n - 3) is the mean squared error with two predictors and an intercept), n is the sample size, SSxi is the sum of squared deviations of Xi about its mean, and R2aux is the R2 from regressing Xi on the other X variables (including an intercept); with two predictors that means predicting X1 with X2. R2aux tells us the proportion of variation in Xi explained (shared) by the other X variables, so subtracting it from 1 gives the unique (unshared) variation in Xi. The more X1 and X2 are related, the smaller that unique variation is, the smaller the denominator of the standard error becomes, and the bigger the standard error gets.
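To check the formula against software output, here is a short sketch that reuses X, y, and the fitted model `fit` from the last loop iteration in the snippet above (again, my own illustration): the standard error rebuilt piece by piece should match the one statsmodels reports for beta 1 hat.

```python
x1, x2 = X[:, 0], X[:, 1]

mse = fit.ssr / fit.df_resid                  # SS(error) / (n - 3)
ss_x1 = np.sum((x1 - x1.mean()) ** 2)         # sum of squared deviations of X1

# Auxiliary regression: X1 on X2 (with intercept) gives R2aux
aux = sm.OLS(x1, sm.add_constant(x2)).fit()
r2_aux = aux.rsquared

se_beta1 = np.sqrt(mse / (ss_x1 * (1.0 - r2_aux)))
print(se_beta1, fit.bse[1])                   # the two numbers should agree
```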

Another point to note is that 1/(1 - R2aux) is called the Variance Inflation Factor (VIF). It tells us the factor by which the variance of beta i hat is inflated due to Xi’s relationship with the other independent variables in the model. The square root of the VIF tells us how much the standard error of beta i hat is inflated.
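Continuing the same sketch (still reusing the objects above, purely as an illustration): 1/(1 - R2aux) computed by hand should match statsmodels' variance_inflation_factor, and its square root should equal the ratio of SE(beta 1 hat) to the standard error it would have if R2aux were zero, holding MSE fixed.

```python
from statsmodels.stats.outliers_influence import variance_inflation_factor

vif_by_hand = 1.0 / (1.0 - r2_aux)
X_with_const = sm.add_constant(X)
vif_stats = variance_inflation_factor(X_with_const, 1)   # column 1 = X1

se_if_uncorrelated = np.sqrt(mse / ss_x1)                 # the R2aux = 0 case
print(vif_by_hand, vif_stats)                             # should agree
print(fit.bse[1] / se_if_uncorrelated, np.sqrt(vif_by_hand))  # should agree
```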

How can you tell the difference between a trend model and an AR model just by looking at a regression?

Look at the independent variable(s):

  • If it’s t, it’s a trend model
  • If it’s (they’re) lagged values of the dependent variable, it’s an autoregressive model
  • If it’s (they’re) something else, it’s something else
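A minimal sketch of the contrast (my own illustration, with a made-up series): in the trend model the regressor is the time index t, while in the AR(1) model the regressor is the series’ own first lag.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(0.5, 1.0, size=200))   # an arbitrary series to illustrate
t = np.arange(len(y))

# Trend model: y_t = b0 + b1 * t + e_t
trend_fit = sm.OLS(y, sm.add_constant(t)).fit()

# AR(1) model: y_t = b0 + b1 * y_{t-1} + e_t  (regress y on its own lag)
ar_fit = sm.OLS(y[1:], sm.add_constant(y[:-1])).fit()

print(trend_fit.params, ar_fit.params)
```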

Helpful or nah? I can try again if not, or S2k can take a stab at it!