Each sample has 34 observations, provided by 34 men and 34 women respectively. (I am pasting these data so that we can analyze them in Excel.) That said, can we assume that the two RVs are independent? Practically, I believe they are independent: no individual provided two observations, and men on Mars are unrelated to women on Venus. However, if I use the identity to calculate the correlation, I find that it is 0.166. Given this, should I consider these data independent for hypothesis testing?
If the samples are drawn from normal distributions, the statistic for testing whether the population correlation is (null hypothesis) equal to zero (covered in Level II) is:
t = [r × √(n – 2)] / √(1 – r²)
This statistic follows a t distribution with n – 2 degrees of freedom.
When r = 0.166 and n = 34, t = 0.9523. As this is less than virtually any critical t value (these are generally around 2), we fail to reject the null hypothesis and conclude that the population correlation could be zero, so the samples could be independent.
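For anyone who wants to check the arithmetic outside Excel, here is a minimal Python sketch of the same calculation. The function name `correlation_t_stat` is just an illustrative choice, and the critical value 2.037 is an assumption taken from a standard t table (two-tailed, 5% significance, 32 degrees of freedom); a different significance level would need a different table value.

```python
import math

def correlation_t_stat(r, n):
    """Return (t, df) for testing H0: population correlation = 0."""
    df = n - 2
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df

# Values from the thread: sample correlation 0.166, 34 paired observations.
t_stat, df = correlation_t_stat(0.166, 34)

# Assumed critical value: two-tailed 5% test with 32 df, from a standard t table.
T_CRIT = 2.037

reject_null = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.4f}, df = {df}, reject H0: {reject_null}")
```

With these inputs the statistic comes out at roughly 0.95, well below the critical value, so the null hypothesis of zero correlation is not rejected.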
I have a follow-up question on your approach. I think that if the two random samples are independent, then we know that the correlation is zero. However, a correlation of zero doesn't establish independence, right? Given this, how do I decide about the independence of the two samples? Should I use a chi-square test? I am a bit confused. I would appreciate your help.
If X and Y are independent, then they are also uncorrelated. However, if X and Y are uncorrelated, they can still be dependent. To see two extreme examples of this, let X be uniformly distributed on the interval [−1, 1]. If X ≤ 0, then Y = −X, while if X is positive, then Y = X; that is, Y = |X|. Here Y is completely determined by X, yet their covariance is zero. The same is true for Y = X² with X on the interval [−1, 1]. A zero covariance implies that no linear relationship exists between the two variables, but that does not mean that they are also independent. Likewise, correlation is a statistic that does not imply causation.
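The two examples can be checked numerically with a short Python sketch. As a simplifying assumption, it uses an evenly spaced symmetric grid on [−1, 1] in place of random uniform draws, so the result is deterministic; `pearson_r` is a hand-written helper, not a library function, and the 0.25 threshold is an arbitrary choice for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation, computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Assumption: a symmetric grid on [-1, 1] stands in for a uniform X.
xs = [i / 100 for i in range(-100, 101)]
ys_abs = [abs(x) for x in xs]   # Y = |X|
ys_sq = [x * x for x in xs]     # Y = X^2

r_abs = pearson_r(xs, ys_abs)
r_sq = pearson_r(xs, ys_sq)

# Dependence check for Y = X^2: knowing X changes the probability of an event on Y.
p_y = sum(y > 0.25 for y in ys_sq) / len(ys_sq)
big_x = [y for x, y in zip(xs, ys_sq) if abs(x) > 0.5]
p_y_given_x = sum(y > 0.25 for y in big_x) / len(big_x)

print(f"corr(X, |X|) = {r_abs:.2e}, corr(X, X^2) = {r_sq:.2e}")
print(f"P(Y > 0.25) = {p_y:.3f}, P(Y > 0.25 | |X| > 0.5) = {p_y_given_x:.3f}")
```

Both correlations come out as zero up to floating-point error, while the conditional probability differs from the unconditional one. That difference is what dependence means, and it is invisible to the correlation coefficient.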