Assessment Charlent: Multicollinearity explanation issue

In this assessment, we are given a table with the following:

  • 2 out of 3 independent variables are significant (based on t-tests)

  • F-test is significant

  • High pairwise correlations among the 3 independent variables (from 0.70 to 0.98)


My answer was that the model is correct, because there is no conflict between the F-test and the t-tests (the majority of the slopes are significant), and from what I read in the curriculum, pairwise correlations are only a good indicator when we have exactly 2 independent variables; otherwise they do not imply anything.

Guess what? The answer was that there is multicollinearity, and it was based on the pairwise correlations!

What is weird is that I got the points even though it said Incorrect. I tried resubmitting the assessment with the supposedly right answer, and I also got the points.

Are both answers valid?

If someone could clarify the following, it would be really appreciated:

There is multicollinearity if:

  1. The F-test is significant and ALL t-tests are insignificant, or vice versa;

OR

  2. In the case of a regression with ONLY 2 independent variables, we can look at the pairwise correlation.

The title of this post is equivalent to shining the Batman symbol in the sky for tickersu; he’ll be here shortly.

When there is high correlation between your independent variables, multicollinearity exists. If you do not know the correlations among your independent variables, you can test for the presence of multicollinearity by checking whether all the t-tests are insignificant while the F-test is significant. The reverse is not necessarily true.

When the F-test is significant and one or more t-tests are significant, you cannot conclude that multicollinearity does not exist.

You can only conclude the following: F-test significant and ALL t-tests insignificant => multicollinearity.
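To make that symptom concrete, here is a minimal simulation (a sketch assuming Python with numpy and statsmodels; the data are made up, not from the assessment). Two near-collinear regressors jointly explain y, so the F-test is significant, but the inflated standard errors leave each t-test insignificant:

```python
# Minimal sketch of the classic multicollinearity symptom:
# significant F-test, insignificant individual t-tests. Hypothetical data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 50

x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)  # near-perfect correlation with x1
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

print(f"F-test p-value : {fit.f_pvalue:.4g}")  # typically highly significant
print(f"t-test p-values: {fit.pvalues[1:]}")   # often both insignificant
```

The joint test picks up the shared signal of the two regressors, while neither slope looks significant on its own.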

Thank you for the input !

In this case, shall we say that the answer is “inconclusive,” given that the t-tests are not all insignificant and the F-test is significant (suggesting no multicollinearity), while the correlations indicate otherwise?

Also, do you agree that the pairwise correlation observation is only valid when we have 2 independent variables, and not more?

I wouldn’t call it inconclusive; I would say one cannot conclude that multicollinearity does not exist. The pairwise table given is based on correlations between 2 independent variables at a time: each pairwise correlation is calculated for one pair of variables separately and then placed in a matrix for visualization. Hope that helps :slight_smile:
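In other words, the matrix is just the separate pairwise correlations stacked together. A quick sketch (assuming Python with numpy; made-up data):

```python
# Sketch of how a pairwise correlation matrix is assembled: each entry is
# the correlation of one pair of variables, computed separately.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=100)  # highly correlated with x1
x3 = rng.normal(size=100)                        # roughly independent of both

# np.corrcoef treats each row as one variable and returns the full matrix
print(np.corrcoef([x1, x2, x3]).round(2))
```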

Thanks :slight_smile: That definitely clears things up. I think the point about each correlation being viewed separately is what counts.

Sorry, I was cleaning up the streets-- er, my mock exam wrong answers…

But seriously, this is the question I wrote them about (for a different reason, sort of-- the explanation is still incorrectly stating that MC overstates R-squared and the F-test-- they're probably pretty busy this time of the month). Long story short (there’s a recent thread about it, and an older one), I said MC doesn’t overstate or understate the F-test or R-squared; the curriculum author agreed…

I think it’s a poorly worded question (still). Given that you have a near-perfect correlation between two IVs, you can say collinearity is quite possibly an issue (at least) between those two. The fact that the t-tests are significant for the two in question makes it less condemning, but the sample size isn’t very large (a larger sample would help mitigate any MC to an extent)… I think the Institute should cover VIFs, which would be another tool for assessing MC in multiple regression.
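For the curious, here is a rough sketch of a VIF check (assuming Python with statsmodels; the data are hypothetical, and a common rule of thumb flags VIFs above 5 or 10):

```python
# Sketch: VIFs flag which IVs are collinear with the others. Hypothetical data.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # near-perfect correlation with x1
x3 = rng.normal(size=n)                  # unrelated to x1 and x2

X = sm.add_constant(np.column_stack([x1, x2, x3]))
# Skip index 0 (the constant); large VIFs for x1 and x2 reveal their collinearity
print([round(variance_inflation_factor(X, i), 1) for i in range(1, X.shape[1])])
```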

I think the answer choice regarding the explanatory power of the regression is poorly worded for what they are intending to say, which makes it a possible answer choice. The second regression has a significant F-test, while the first does not (indicating improved model fit, i.e., explanatory power of the IVs for the DV). Additionally, the adjusted R-squared is close to 10 times larger than in the first model. Google the interpretation of R-squared or how you would judge a regression model’s explanatory power-- it commonly refers to R-squared (many textbooks will tell you the same…). Hence, the explanatory power of the regression (in terms of the total variation in the DV) has greatly improved (and in a statistically significant way).
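If it helps, this is how that comparison reads off a fitted model (a sketch assuming Python with statsmodels; the data are invented, not the mock-exam figures):

```python
# Sketch: judging "explanatory power" via the F-test p-value and adjusted
# R-squared of two competing models. Invented data for illustration.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 40
y = rng.normal(size=n)
x_weak = rng.normal(size=n)                    # unrelated to y
x_strong = y + rng.normal(scale=0.5, size=n)   # strongly related to y

for name, x in [("Model 1 (weak IV)", x_weak), ("Model 2 (strong IV)", x_strong)]:
    fit = sm.OLS(y, sm.add_constant(x)).fit()
    print(f"{name}: F p-value = {fit.f_pvalue:.3f}, adj. R2 = {fit.rsquared_adj:.3f}")
```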

The question writer intends for the question to refer to explanatory power in the sense of the relationships between Y and the IVs (can we look at the slope coefficients?). In that sense, the answer is probably not, because MC is possibly an issue and the effects of the IVs appear to negate one another (this is what I gathered from the responses the Institute gave me). If they changed the words “explanatory power” to “explaining the relationships of the DV with the IVs” or something else indicating coefficients, then I would agree with this answer completely-- the 2nd regression isn’t better than the first for examining the slopes.

Thanks! That clears things up. Hopefully they will officially clarify before the exam. Otherwise, we have enough tools to judge on exam day, and I hope there will be no wording confusion there!

I don’t know if they will have anything this unclear on exam day (it’s a low-probability topic, and they’re good on wording for the real deal, from what I hear). They did clarify that MC does not affect the F-statistic (or R-squared, or the SER).