Based on a sample of 12 observations, an analyst is performing a test of the significance (at the 1% level) of a correlation coefficient according to the following set of hypotheses: H0: r = 0 versus Ha: r ≠ 0. The sample correlation coefficient (r) is 0.79. Based on this information, the analyst should:
A) Reject the null since t = 4.075 > t(0.01), 12 df = 2.681
B) Reject the null since t = 4.075 > t(0.005), 10 df = 3.169
C) Fail to reject the null since t = 2.075 < t(0.01), 10 df = 2.764
D) Fail to reject the null since t = 2.075 < t(0.005), 11 df = 3.106

Without the t-table: t-calc = r * sqrt(n-2) / sqrt(1-r^2) = 4.075, and it is a two-tailed test, so you need 0.005 with 10 df. So B.

B. t = 0.79*sqrt(10) / sqrt(1-0.79^2) = 4.07465, and 12 - 2 = 10 df. Since 4.07 > 3.169, reject the null.

t(12-2, 0.01/2) = t(10, 0.005) = 3.169. t-calculated = r*SQRT(n-2) / SQRT(1 - r^2) = 0.79*SQRT(10) / SQRT(0.376) = 4.07465 … This falls in the rejection region, so reject the null, and df = 10 (not 12) … Is the answer B? - Dinesh S
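For anyone who wants to check the arithmetic in the posts above, here is a minimal Python sketch. The function name `correlation_t_stat` is made up for illustration, and the critical value 3.169 is simply the table value quoted in this thread, not something the code computes.

```python
import math

def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for H0: correlation = 0, using n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.79, 12
df = n - 2                      # 10 degrees of freedom, not 12
t_calc = correlation_t_stat(r, n)
t_crit = 3.169                  # two-tailed 1% value for 10 df, as quoted in the thread

print(f"t = {t_calc:.3f}, df = {df}")
print("reject H0" if abs(t_calc) > t_crit else "fail to reject H0")
```

The comparison uses the absolute value of t because the alternative hypothesis is two-sided.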

I have never seen this formula in my studies. I thought t = (X - u) / ((std. deviation) / sqrt(n))? Can someone tell me where I can find this formula in book 1 of Schweser? t-calc = r * sqrt(n-2) / sqrt(1-r^2), so t = 0.79*sqrt(10) / sqrt(1-0.79^2).

parry, test of r (correlation): page 272, and again in the summary on page 289.

Ohh, now I recall seeing this. It’s been too long. Thanks!

Could someone tell me where this formula is found in the Stalla notes? Thanks in advance.

In Stalla it is in the regression chapter… page 69… CP

You don’t even have to calculate. I knew right away that it had to be B. r > 0.7 is almost always significant in questions like this, so the possible answers are A and B. It has to be a two-tailed test (because of Ha), so the only possible answer left is B. Such a question takes a maximum of 30 seconds to answer, so you score and save time without calculating…
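A quick way to sanity-check that shortcut is to invert the t formula used earlier in the thread and ask how large |r| has to be before it is significant. This is only a sketch: `critical_r` is a hypothetical helper name, and 3.169 is again the table value quoted above for 10 df at the two-tailed 1% level.

```python
import math

def critical_r(t_crit: float, df: int) -> float:
    """Smallest |r| that reaches significance, from solving
    t = r * sqrt(df) / sqrt(1 - r^2) for r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)

r_min = critical_r(3.169, 10)   # n = 12, so df = 10
print(f"minimum significant |r| is about {r_min:.3f}")
print("0.79 is significant" if 0.79 > r_min else "0.79 is not significant")
```

For this question the threshold comes out just above 0.7, so the shortcut works here, but the threshold moves with the sample size and the significance level, so it is worth treating as a rule of thumb rather than a guarantee.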

Thanks cp.

definitely B.

When do you use n-1 for degrees of freedom and when do you use n-2 for degrees of freedom? I was all ready to choose D in this case since it had 11 df…

So what are degrees of freedom?
1. Say you are estimating a population mean (known to be 90) from a sample of 10. Since the mean is a known number, you can freely change 9 of the observations, but the 10th is then fixed: it has to make up whatever shortfall is left after summing the 9 you changed. So your degrees of freedom are 9, or (n-1) in this case.
2. For the correlation coefficient, you essentially have to adjust for two estimated quantities rather than one, so (n-2) becomes your degrees of freedom.
I know this is quite crude, and JoeyDVivre might end up throwing more light on this (or maybe Maratikus can), but it should help to drive the point home. CP

To help remember, can you assume in questions like this (testing the significance of the correlation coefficient) that whenever you see the words “correlation coefficient” you should use n-2 degrees of freedom? Will that always be the case?

yep that is about right…

cfaisok: I like your reasoning for this question. In fact, there are several different ways in which one could attempt to answer this question without any further calculation. For example, right off the bat, we know that df = 10. Then all that is left is a one-tailed test (C) and a two-tailed test (B). Obviously it’s a two-tailed test, so the answer must be B. It wasn’t even necessary to know the formula, either way you look at it.

jalmy8, that’s a nice explanation, too. We should give some quant classes together.

You can think of degrees of freedom as the amount of “information” contained by the data you’re working with. You get more degrees of freedom by having a higher N in your sample, and you “use up” degrees of freedom every time you compute a statistic like a mean or a standard deviation (SD actually uses up 2, because you need to compute the mean before you can compute the SD). Calculating a regression coefficient (slope) and an intercept also uses up one df each. Each time you calculate a statistic (a statistic is technically a single number that you calculate by doing operations on the elements of a dataset), you effectively “use up” a degree of freedom, which means that there are fewer additional numbers that you can compute before everything becomes fixed by everything else you know about the data. For example, as above, if you know the mean and 9 out of 10 data points, you can compute the 10th. If you know the mean AND the SD, then you only need 8 out of 10 data points, because the 9th and the 10th have to be numbers that will give you the right mean and SD, etc. It’s good to have lots of degrees of freedom, because the more “unused” degrees of freedom you have left over in your analysis, the more accurately you can estimate the things that you did estimate. This works its way into your calculations in the form of smaller standard errors, tighter confidence intervals, etc. You don’t really need to understand degrees of freedom fully for the exam, or even for most jobs that involve statistical analysis. Just think of dfs as a kind of accounting mechanism for how much information you can squeeze out of a dataset, and then remember how many get used up in each type of test you apply (the last is the most important to remember for the exam).
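The "know the mean and 9 out of 10 data points" example can be tried out directly. A small Python sketch, with made-up sample values and a hypothetical helper name `missing_value`:

```python
def missing_value(known_values, mean, n):
    """Recover the one unknown data point from the sample mean
    and the other n - 1 data points."""
    return mean * n - sum(known_values)

sample = [4, 8, 15, 16, 23, 42, 7, 9, 11, 5]   # hypothetical data, n = 10
mean = sum(sample) / len(sample)

# Hide the last observation and rebuild it from the mean and the other nine.
recovered = missing_value(sample[:-1], mean, len(sample))
print(recovered, sample[-1])
```

Once the mean is fixed, only 9 of the 10 values are free to vary, which is the n-1 in the one-sample case.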

great explanation…thanks =)