I think you’re on the right track. For simple regression they are essentially the same thing, but correlation can’t be used as easily to find R squared in multiple regression.
For simple regression you can calculate R squared two ways: 1.) R squared is simply the square of the correlation between the two variables. This sort of makes sense, if you think about it. The only thing we know about two variables and their movements is their correlation, and simple regression uses that same information. The predicted values of Y are just a linear function of X, so the correlation between predicted and actual values of Y is the same (ignoring sign) as the correlation between X and Y. 2.) R squared is the ratio of explained to total variation. We can calculate the correlation between the two variables and the square root of R squared, and they should be the same apart from the sign, since the square root is always positive while the correlation can be negative.
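If it helps to see the two methods side by side, here is a minimal sketch in Python with NumPy. The data values are made up purely for illustration and the variable names are my own, so treat it as a check of the idea rather than a standard recipe.

```python
import numpy as np

# Made-up illustrative data (an assumption, not from any real data set)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.7])

# Method 1: square the correlation between X and Y
r = np.corrcoef(x, y)[0, 1]
r2_from_corr = r ** 2

# Method 2: explained variation divided by total variation
slope, intercept = np.polyfit(x, y, 1)        # least-squares line
y_hat = intercept + slope * x                 # predicted values of Y
explained = np.sum((y_hat - y.mean()) ** 2)   # predicted minus average
total = np.sum((y - y.mean()) ** 2)           # actual minus average
r2_from_variation = explained / total

print(r2_from_corr, r2_from_variation)
```

Both printed numbers should agree up to floating-point rounding, and the square root of either one should match the absolute value of r.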
In multiple regression, only the second method works for determining R squared. This makes sense, because a correlation is only defined between two variables or sets of data. You can’t have one correlation among 4 different independent variables and the dependent variable, so we don’t have a single correlation measure among all the variables (i.e., there is no way to take a correlation between the variables and square it to get R squared).
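Finding R squared in multiple regression is done by dividing the explained variation (i.e., the sum of squared differences between the predicted values and the average) by the total variation (the sum of squared differences between the actual values and the average). Note that predicted minus actual is the unexplained (residual) part, not the explained part. This ratio gives us a measure of overall “fit”. If we take the square root of it, we get the correlation between the predicted and the actual values.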
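Here is a rough sketch of the multiple regression case, again in Python with NumPy. The data are simulated with a fixed seed, and the three predictors and their coefficients are arbitrary assumptions chosen only to have something to fit. The regression includes an intercept, which is what makes explained plus unexplained variation add up to the total.

```python
import numpy as np

# Simulated data: 3 independent variables (arbitrary, for illustration only)
rng = np.random.default_rng(0)
n = 50
X = rng.normal(size=(n, 3))
y = 1.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=n)

# Ordinary least squares with an intercept column
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef                                # predicted values

explained = np.sum((y_hat - y.mean()) ** 2)     # predicted minus average
unexplained = np.sum((y - y_hat) ** 2)          # actual minus predicted
total = np.sum((y - y.mean()) ** 2)             # actual minus average

r_squared = explained / total
multiple_r = np.corrcoef(y_hat, y)[0, 1]        # corr(predicted, actual)

print(r_squared, multiple_r ** 2)
```

The two printed values should match up to rounding, which is the sense in which the square root of R squared is the correlation between predicted and actual.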
Hopefully that sheds some light.