 # JoeyDVivre help

I have read a lot of your comments and you are great, so here is my question. I am confused about when to use n and when to use n-1 in calculating covariance. I know you said that if they give us the mean then we use n, and that if we have to calculate the mean we should use n-1. OK, this goes hand in hand with CFAI, specifically r 53 problem 1c: we had to calculate the mean, so when calculating covariance we used n-1. However, I am using Q-Bank for practice questions and they don't seem to follow this concept. An example would be when they give us a 2-asset portfolio with some returns: we still have to calculate the mean, but they use n. It is getting on my nerves because I keep getting the questions wrong according to Q-Bank, but when looking at CFAI I am right. So is there another, more definitive way to go about this? I seem to be spending 20 seconds on each question just thinking about whether to use n or n-1.

http://en.wikipedia.org/wiki/Sample_mean_and_covariance

If the population mean is known, then use n. If the population mean is unknown and you have to calculate the sample mean, then use n-1.

^ that’s all correct. The only caveat is that if you have the entire population then x-bar = mu and you would use n.

When you say "if the population mean is known", you mean if it is given in the question, right?

Right - it’s not real world. The only time the population mean is really known is if you have the entire population (in which case inferential statistics are not especially useful) or you have some omniscient question writer telling you “the mean is known to be 5”.
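For anyone who wants to see the rule as arithmetic, here is a minimal sketch in Python. The function name `covariance`, its parameters, and the return figures are all made up for illustration; they are not taken from CFAI or Q-Bank. It divides by n when the means are handed to you and by n-1 when they have to be estimated from the data.

```python
def covariance(xs, ys, mean_x=None, mean_y=None):
    """Covariance of two equal-length return series.

    If both means are supplied (treated as known population means),
    divide by n. Otherwise estimate the means from the data and
    divide by n - 1.
    """
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    n = len(xs)
    means_known = mean_x is not None and mean_y is not None
    if not means_known:
        if n < 2:
            raise ValueError("need at least two observations to use n - 1")
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / n if means_known else cross / (n - 1)


# Hypothetical returns (in percent) for a two-asset portfolio.
asset_a = [2.0, 4.0, 6.0, 8.0]
asset_b = [1.0, 3.0, 2.0, 6.0]

sample_cov = covariance(asset_a, asset_b)           # means estimated: 14 / 3
given_cov = covariance(asset_a, asset_b, 5.0, 3.0)  # means given: 14 / 4
print(sample_cov, given_cov)
```

If the four observations really are the entire population, the computed means are the population means, so you would pass them in and get the n version.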

ymc Wrote:
> If the population mean is known, then use n. If the population mean is unknown and you have to calculate the sample mean, then you use n-1

Could you explain why we use n in the case of a population and n-1 with a sample? I posted a question about degrees of freedom earlier because this was confusing me.

It’s about estimating sigma^2. Suppose that you know mu; then your best guess at sigma^2 is sum((X(i) - mu)^2)/n, i.e., the average of the squared deviations. But if you don’t have mu, then you compute sum((X(i) - X-bar)^2) instead. However, X-bar is computed from the X(i)'s, so it sits closer to them than mu does: the sum of the squared deviations about X-bar is never larger than the sum about mu, and on average it is smaller. It turns out that you can exactly account for that by dividing by n-1 instead of n.
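A small exact check of that claim, as a sketch only. The four-value population and the sample size of 3 are invented for illustration, and sampling is assumed to be independent with replacement. The code lists every possible sample and averages each estimator over all of them, using `Fraction` so there is no rounding.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]  # hypothetical, equally likely values
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((Fraction(v) - mu) ** 2 for v in population) / N

n = 3  # sample size
# Every equally likely sample of size n drawn with replacement (4**3 = 64).
samples = list(product(population, repeat=n))


def average(values):
    values = list(values)
    return sum(values) / len(values)


def ss_about(sample, center):
    """Sum of squared deviations of the sample about a given center."""
    return sum((Fraction(x) - center) ** 2 for x in sample)


def x_bar(sample):
    return Fraction(sum(sample), len(sample))


known_mu_over_n = average(ss_about(s, mu) / n for s in samples)
xbar_over_n = average(ss_about(s, x_bar(s)) / n for s in samples)
xbar_over_n_minus_1 = average(ss_about(s, x_bar(s)) / (n - 1) for s in samples)

print(sigma2)               # 5/4
print(known_mu_over_n)      # 5/4
print(xbar_over_n)          # 5/6
print(xbar_over_n_minus_1)  # 5/4
```

Dividing by n about X-bar comes out low by exactly the factor (n-1)/n, and dividing by n-1 removes that shortfall.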

Long story short: sum((X_i - X_bar)^2)/(n-1) is an unbiased estimator for sigma^2, i.e., its expectation is exactly sigma^2 (for independent X_i, each with mean mu and variance sigma^2).

Proof:

E(sum((X_i - X_bar)^2)/(n-1))

= sum(E(X_i^2) - 2*E(X_i*X_bar) + E(X_bar^2))/(n-1)

= sum(sigma^2 + mu^2 - 2*(sigma^2/n + mu^2) + sigma^2/n + mu^2)/(n-1)

= n*(sigma^2 - sigma^2/n)/(n-1)

= (n-1)*sigma^2/(n-1)

= sigma^2

Exercises for you:

1. Prove E(X_i^2) = sigma^2 + mu^2
2. Prove E(X_i*X_bar) = sigma^2/n + mu^2
3. Prove E(X_bar^2) = sigma^2/n + mu^2

kochunni69 Wrote:
> ymc Wrote:
> > If the population mean is known, then use n. If the population mean is unknown and you have to calculate the sample mean, then you use n-1
>
> Could you explain why we use n in the case of population and n-1 with sample. I posted a question about the degree of freedom earlier because this was confusing me.

Do you really think that was likely to be helpful?

JoeyDVivre Wrote:
> Do you really think that was likely to be helpful?

But this does illustrate that the n-1 in the sample variance formula has nothing to do with degrees of freedom.

Couldn’t disagree more - what is a degree of freedom? Check out the Wiki entry on degrees of freedom for a good explanation.
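One way to see the degrees-of-freedom reading, sketched in Python with invented numbers: deviations taken about X-bar always sum to zero, so once you know n-1 of them the last one is forced. Only n-1 of the deviations are free to vary.

```python
data = [3.0, 7.0, 8.0, 10.0]  # hypothetical observations
n = len(data)
x_bar = sum(data) / n

# Deviations about the sample mean.
deviations = [x - x_bar for x in data]

# They sum to zero by construction of x_bar.
total = sum(deviations)

# So the last deviation can be recovered from the first n - 1 alone.
recovered_last = -sum(deviations[:-1])

print(deviations)
print(total)
print(recovered_last, deviations[-1])
```

Deviations about a known mu carry no such constraint, which matches dividing by n in that case.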