# R-squared and significance

I remember asking a professor about this at university, and they basically just blew me off and said I wasn't looking at it right, so after doing quant I figure I should seek confirmation.

A regression can be highly significant, say p = 0.000000001, so we could unofficially call it “accurate”, yet it can still have an R2 of, say, 10%. In that case it would be “accurate” but useless. Am I correct in seeing it this way?

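A high R2 does not by itself make a relationship significant, just as a low R2 does not make it insignificant. To judge whether an effect is real, you should be focusing on the significance of your variables. R2 is not the line of best fit itself; it measures how closely the data cluster around that line. In other words, it tells you how much of the variance in the dependent variable is explained by the independent variable.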
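For reference, the usual textbook definition of R2 can be written as below. The symbols are my own notation, not something from the question: $y_i$ are the observed values, $\hat{y}_i$ the fitted values from the regression line, and $\bar{y}$ the mean of the observations.

$$
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}
$$

Nothing in this ratio depends on how confident you are that the slope is non-zero, which is why R2 and the p-value can point in different directions.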

Imagine a scatter plot of data points sloping upwards, with the points scattered widely around the trend line. There is clearly a relation between the two variables: as X increases, Y tends to increase. With enough data points your p-value would be significant, but your R2 would be low because the points sit so far from the line.
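If you want to see the scatter-plot picture in numbers, here is a minimal sketch using only the Python standard library. Everything in it is made up for illustration: the slope of 0.35, the noise level of 3, the sample size of 2000, and the helper name `r2_and_p_value` are all my assumptions. The p-value uses a large-sample normal approximation to the t distribution, so treat it as an order-of-magnitude figure rather than an exact one.

```python
import math
import random


def r2_and_p_value(xs, ys):
    """Return (R^2, approximate two-sided p-value) for a simple
    linear regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # In simple regression, R^2 is the squared Pearson correlation.
    r = sxy / math.sqrt(sxx * syy)
    r2 = r * r

    # t statistic for the slope; it grows with the sample size n.
    t_stat = r * math.sqrt((n - 2) / (1.0 - r2))

    # Assumption: n is large, so the t distribution is approximated
    # by a standard normal when computing the two-sided p-value.
    p_value = math.erfc(abs(t_stat) / math.sqrt(2.0))
    return r2, p_value


# Simulated data: a weak upward trend buried in a lot of noise.
rng = random.Random(0)
n = 2000
xs = [rng.uniform(0, 10) for _ in range(n)]
ys = [0.35 * x + rng.gauss(0, 3) for x in xs]

r2, p_value = r2_and_p_value(xs, ys)
print(f"R^2 = {r2:.3f}, p-value = {p_value:.3g}")
```

With these settings R2 should land somewhere around 0.1 while the p-value is far below any conventional threshold. Shrinking `n` to a few dozen leaves the expected R2 roughly where it is but makes the p-value much larger, which shows that the two numbers measure different things.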

Any stats teacher will warn you against using R2 as an indicator of significance. It is relevant when considered along with other metrics, but on its own it has little inferential value.