Sample Variance

As I understand it… we have two formulas. One is for the “sample variance”, and the other is for the “estimation of a population variance”. (Correct me if I’m wrong here.) In the first one, we divide by “n”. And in the second one, we divide by “n-1”. Can someone explain how the “n-1” gives a better estimate? This seems very bizarre… I don’t see why we treat the sample any differently from the population. Thanks guys!

Use the search function. Was definitely answered very well by someone sitting for the Dec exam last year.

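The sample is drawn from the population and, in many cases, is meant to be representative of it. But a sample tends to miss some of the population's extreme values, and its deviations are measured from the sample mean rather than the true population mean, so dividing by n tends to understate the spread. Dividing by n-1 instead gives a slightly higher variance and standard deviation, which compensates for what the sample doesn't capture. For a given sample, the n-1 figure is always higher than the n figure (unless every observation is identical); that doesn't mean it will be higher than the true population standard deviation in every case.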
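To make the two divisors concrete, here is a small sketch on a made-up sample (the numbers are purely illustrative, not from any exam question). It computes the sum of squared deviations once and divides it both ways, then cross-checks against Python's standard `statistics` module.

```python
import math
import statistics

# Hypothetical sample, chosen only so the arithmetic is easy to follow.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(sample)

mean = sum(sample) / n                             # 40 / 8 = 5.0
sum_sq_dev = sum((x - mean) ** 2 for x in sample)  # 32.0

var_div_n = sum_sq_dev / n                # divide by n
var_div_n_minus_1 = sum_sq_dev / (n - 1)  # divide by n-1, slightly larger

# The standard library offers both: pvariance divides by n, variance by n-1.
assert math.isclose(statistics.pvariance(sample), var_div_n)
assert math.isclose(statistics.variance(sample), var_div_n_minus_1)

print(var_div_n, var_div_n_minus_1)
```

The same sum of squares is used in both cases; only the divisor changes, which is why the n-1 figure can never be the smaller of the two.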

As we say in statistics, dividing by n-1 produces an unbiased estimator of the population variance, whereas dividing by n does not: on average, the n version comes out too low by a factor of (n-1)/n. Daniel
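If you want to see the "unbiased" claim rather than take it on faith, a quick simulation can help. This is only a rough sketch under assumptions of my own choosing: the population is a standard normal (true variance 1), each sample has 5 observations, and we repeat 20,000 times with a fixed seed.

```python
import random

random.seed(0)          # fixed seed so the run is repeatable

TRUE_VARIANCE = 1.0     # assumed population: standard normal
SAMPLE_SIZE = 5         # small n makes the bias easy to see
TRIALS = 20_000

total_div_n = 0.0
total_div_n_minus_1 = 0.0

for _ in range(TRIALS):
    sample = [random.gauss(0.0, 1.0) for _ in range(SAMPLE_SIZE)]
    mean = sum(sample) / SAMPLE_SIZE
    sum_sq_dev = sum((x - mean) ** 2 for x in sample)
    total_div_n += sum_sq_dev / SAMPLE_SIZE
    total_div_n_minus_1 += sum_sq_dev / (SAMPLE_SIZE - 1)

avg_div_n = total_div_n / TRIALS
avg_div_n_minus_1 = total_div_n_minus_1 / TRIALS

print("average when dividing by n:  ", avg_div_n)
print("average when dividing by n-1:", avg_div_n_minus_1)
```

With these settings the first average should land near (n-1)/n = 0.8 and the second near 1.0, the true variance. Any single sample can still be off in either direction; unbiased only means the errors average out over many samples.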