# Detecting Multicollinearity - What's The R^2?

So a high R^2 is a sign of multicollinearity. What is the cutoff for this? Is 0.69 not multicollinearity while 0.70 is?

I thought a high F-stat with low t-stats on the regression coefficients was the sign of multicollinearity.

I think a high R^2 in combination with insignificant coefficients for the independent variables is a red flag for multicollinearity, but doesn't guarantee it. Of course, I'm not positive about this, but I think I remember reading it.

Never mind, I think cpk is correct.

The CFAI Mock afternoon session had exactly this question, and I used cpk's logic to answer it. I didn't even RTFQ. Of course I got it wrong, since there was no F-stat given. Instead, the fine print gave the R^2 between the TWO independent variables as 0.3. The answer claimed that this was low, hence no multicollinearity.

Right, but I saw a question where the t-stats were low and R^2 was 81%. I thought only something over 90% or so counted as high? What's the cutoff?

R^2 is just how well the model as a whole explains the variation in the dependent variable.

What I was talking about was the F-stat for the multiple regression as a whole, while you are mentioning the R^2 between two independent variables. A high R^2 between two independent variables would indicate high correlation between them, and if both are present in the same equation, highly correlated variables cause multicollinearity. Here a separate simple linear regression is being run between the two independent variables.

But in a single-variable regression, R^2 can detect strong correlation. Then, if you use that regression's dependent and independent variables together to explain a third variable, you should flag multicollinearity straightaway, because of the strong correlation in the first equation. Kind of basic stuff.

I don't think that is how R^2 is used. R^2 in a multiple regression is RSS/SST (regression sum of squares over total sum of squares). It measures how much of the variation the model explains. It uses similar inputs to the F-test but is a different measure. Multicollinearity is only tested with a significant F and insignificant t-tests.

Try Q42 in the CFAI Mock 2010 Afternoon. You'll know what I mean. Here's the question:

Because there are only two independent variables in her regression, Hamilton's most appropriate conclusion is that multicollinearity is least likely a problem, based on the observation that the:

A. model R^2 is relatively low.

B. correlation between S&P500 and SPREAD is low.

C. model F-value is high and the p-values for S&P500 and SPREAD are low.

Guess the answer?

Definitive answer from the book: the symptom is a high R^2 and a significant F-statistic even though the t-statistics on the slope coefficients are themselves not significant (indicating inflated standard errors for the slope coefficients). They also specifically write that the magnitude of pairwise correlations between independent variables has occasionally been suggested to assess multicollinearity, but is generally not adequate. It is not necessary that the pairwise correlations be high for there to be a multicollinearity problem. So based on the above, B) is eliminated. For C), the F-value is high but the p-values are low, which means that S&P500 and SPREAD are significant variables, so that too is out. It has to be A) by elimination.
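To make the book's symptom concrete, here is a minimal sketch in plain Python (no libraries). The data are invented for illustration and have nothing to do with the mock: x2 is x1 plus a small wiggle, so the two are almost perfectly correlated. The helper name `ols_two_regressors` is hypothetical.

```python
from math import sqrt

def ols_two_regressors(y, x1, x2):
    """OLS of y on an intercept, x1 and x2. Returns R^2, F and slope t-stats."""
    n, k = len(y), 2
    mean = lambda v: sum(v) / len(v)
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    # Center everything so the intercept drops out of the normal equations.
    yc = [v - mean(y) for v in y]
    c1 = [v - mean(x1) for v in x1]
    c2 = [v - mean(x2) for v in x2]
    s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
    s1y, s2y = dot(c1, yc), dot(c2, yc)
    det = s11 * s22 - s12 ** 2          # near zero when x1 and x2 are collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    resid = [yi - b1 * u - b2 * v for yi, u, v in zip(yc, c1, c2)]
    sst, sse = dot(yc, yc), dot(resid, resid)
    s2 = sse / (n - k - 1)              # residual variance
    se1 = sqrt(s2 * s22 / det)          # standard errors blow up as det shrinks
    se2 = sqrt(s2 * s11 / det)
    return {
        "r2": 1 - sse / sst,
        "F": ((sst - sse) / k) / s2,
        "t1": b1 / se1,
        "t2": b2 / se2,
        "corr_x1_x2": s12 / sqrt(s11 * s22),
    }

# Made-up data: three mutually orthogonal, mean-zero patterns.
a = [1, 1, 1, 1, -1, -1, -1, -1]
b = [1, 1, -1, -1, 1, 1, -1, -1]
c = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = a
x2 = [ai + 0.1 * bi for ai, bi in zip(a, b)]        # almost a copy of x1
y = [u + v + e for u, v, e in zip(x1, x2, c)]       # true slopes are 1 and 1

result = ols_two_regressors(y, x1, x2)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

On this made-up data R^2 is about 0.80 and F is about 10.0 (above the 5% critical value of roughly 5.79 for 2 and 5 degrees of freedom), yet both slope t-stats are only about 0.22, well short of the roughly 2.57 needed with 5 degrees of freedom. The correlation between x1 and x2 is about 0.995.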

I looked it up: the correlation is 0.3, the R^2 is 0.40, and the answer is B. They don't mention R^2 at all in the explanation, so I don't see how it relates to the answer.

Should be B. The definition of multicollinearity is that the independent variables are highly correlated. Another sign is an R^2 that is high with an adjusted R^2 that is extremely low (very rare).

The right answer is B: correlation between S&P500 and SPREAD is low. The fine print gives the correlation between SPREAD and S&P500 as 0.3, which supposedly is low.

If there are only two independent variables, pairwise correlation is all you need. When you add variables past two, you need to look for a significant F with non-significant t-values.
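One way to see why the pairwise correlation settles it with two independent variables: in that case the sampling variance of each slope is scaled up by 1 / (1 - r^2), where r is the correlation between the two independent variables. This factor is usually called the variance inflation factor. That name does not appear in the thread, and the r values other than the mock's 0.3 are just illustrative picks. A rough sketch:

```python
def variance_inflation(r):
    """Factor by which Var(slope) grows in a two-regressor model, given corr r."""
    if abs(r) >= 1:
        raise ValueError("perfectly collinear regressors: slopes are not identified")
    return 1.0 / (1.0 - r ** 2)

# 0.3 is the correlation from the mock; the other values are illustrative.
for r in (0.3, 0.7, 0.995):
    print(f"r = {r}: slope variance multiplied by {variance_inflation(r):.2f}")
```

At r = 0.3 the factor is about 1.10, so the standard errors are barely affected. At r = 0.7 it is about 1.96, and at r = 0.995 it is about 100.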

The book specifically mentions in the section “detecting multicoll.” that going with correlations is not the recommended approach. And I have not done the mock. So this is a -1 for me, definitely.

The book mentions pairwise correlations in the context of multiple variables. Read the fine print on the case of only two independent variables :).

I also believe the pairwise correlation needs to be 0.7 and up to be considered high.

I just looked it over: R^2 in a linear regression with one independent variable is the coefficient of determination. The correlation is the square root of that (taking the sign of the slope) in a simple linear regression model.
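A quick check of that relationship, as a sketch in plain Python with made-up numbers (the helper name `corr_and_r2` is hypothetical):

```python
from math import sqrt

def corr_and_r2(x, y):
    """Return (Pearson correlation, R^2 of the simple regression of y on x)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    # Fit y = intercept + slope * x by least squares.
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    r2 = 1 - sse / syy                  # 1 - SSE/SST
    return r, r2

# Illustrative data only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r, r2 = corr_and_r2(x, y)
print(f"correlation = {r:.2f}, R^2 = {r2:.2f}, correlation squared = {r * r:.2f}")
```

Here the correlation is 0.80 and R^2 is 0.64. Going the other way, the mock's correlation of 0.3 corresponds to an R^2 of 0.09 between the two independent variables.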