Hi
Can anyone tell me how to determine the degrees of freedom of a population/sample in statistics?
Examples would be helpful.
Thanks in advance.
Birendra
The number of degrees of freedom is the sample/population size minus the number of coefficients/statistics you’ve calculated.
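For example, if you have a sample of 50 data points and you’ve calculated the mean (one statistic), then you have 50 – 1 = 49 degrees of freedom. If a later calculation relies on both the mean and the standard deviation estimated from the same data (two statistics), you have 50 – 2 = 48 degrees of freedom. Note that the sample standard deviation itself is based on 49 degrees of freedom, because the mean is the only statistic it needs as an input.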
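To see why estimating the mean costs one degree of freedom, here is a minimal Python sketch. The data values are made up for illustration and the variable names are my own. Once the mean is fixed, the deviations from it must sum to zero, so only n – 1 of them are free to vary, which is why the sample variance divides by n – 1.

```python
import statistics

# Made-up illustrative data (an assumption, not from any real study).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

mean = statistics.fmean(data)
deviations = [x - mean for x in data]

# The deviations are constrained to sum to zero, so knowing any
# n - 1 of them determines the last one.
deviation_sum = sum(deviations)

# One degree of freedom is used up by the estimated mean.
df = n - 1
sum_of_squares = sum(d * d for d in deviations)
sample_variance = sum_of_squares / df

print("n =", n, "df =", df)
print("sum of deviations:", deviation_sum)
print("sample variance:", sample_variance)
print("statistics.variance:", statistics.variance(data))
```

The hand-computed value should agree with `statistics.variance`, which uses the same n – 1 divisor.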
If you have a multiple regression with 5 independent variables and a sample of 35 data points, and you have computed an intercept plus slope coefficients for all of the independent variables, then you have 6 coefficients (one intercept and 5 slopes), so you have 35 – 6 = 29 degrees of freedom.
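The regression accounting above can be written as a small helper. This is only a sketch; the function name `residual_degrees_of_freedom` and its parameters are hypothetical and chosen for illustration.

```python
def residual_degrees_of_freedom(n_observations, n_slopes, has_intercept=True):
    """Return sample size minus the number of estimated coefficients."""
    n_coefficients = n_slopes + (1 if has_intercept else 0)
    df = n_observations - n_coefficients
    if df < 0:
        raise ValueError("more coefficients than observations")
    return df


# 35 data points, 5 slopes plus an intercept: 35 - 6 = 29.
print(residual_degrees_of_freedom(35, 5))

# 50 data points and only a mean (an intercept-only model): 50 - 1 = 49.
print(residual_degrees_of_freedom(50, 0))
```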
Degrees of freedom measure the number of independent pieces of information. We take away degrees of freedom as we introduce constraints.
For example, a chi-square distribution with n degrees of freedom is the distribution of the sum of the squares of n independent standard normal random variables.
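A quick simulation can make the chi-square definition concrete. The sketch below assumes n = 5 and 20,000 draws (both arbitrary choices of mine) and relies on the standard fact that a chi-square variable with n degrees of freedom has mean n, so the simulated average should land close to 5.

```python
import random


def simulate_chi_square(n, rng):
    """Sum of squares of n independent standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5                   # degrees of freedom (illustrative choice)
draws = [simulate_chi_square(n, rng) for _ in range(20000)]

sample_mean = sum(draws) / len(draws)
print("degrees of freedom:", n)
print("simulated mean:", sample_mean)
```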
Yes, degrees of freedom is a kind of information accounting to figure out how much information (and its reliability) you can extract from your data set.
For each point of data (each case) in your dataset, you gain one degree of freedom. For each statistic that you calculate on the dataset, you lose a degree of freedom. If a statistic requires calculating an intermediate statistic (such as the standard deviation requiring you to calculate the mean first), you subtract one for each intermediate statistic that had to be estimated from the data. Generally, the more degrees of freedom you have left over after that accounting, the more reliable the statistic and the conclusions drawn from it, other things being equal.
For purposes of finding the reliability of a statistic, only the degrees of freedom used up in calculating that statistic are counted. That is, if you calculate the standard deviation of something, the fact that you calculate some other statistic elsewhere is irrelevant, unless it is required as an input for calculating that statistic. In other words, each statistic has its own independent accounting for degrees of freedom. In practice, this means that when you learn how to calculate a statistic, you must also learn how to calculate the degrees of freedom that go along with it.
How many degrees of freedom are left over once you’ve done this accounting tells you something about how reliable your estimates are. For example, if you have two (x,y) data points, you can fit a line through them, but that estimate is not very reliable because you can ALWAYS draw a perfect-fit line through two points: 2 data points minus 2 statistics leaves 0 degrees of freedom. If you have 30 points and choose a best fit line, you are more confident that this is a good fit (or, more accurately, your confidence intervals are smaller), because you have 30 data points and 2 statistics (slope and intercept), and therefore 28 degrees of freedom. Those 28 degrees of freedom are what the global F test uses when checking the reliability of a regression against sampling error (there are other things that can make the regression unreliable too). Depending on how much dispersion there is around the best fit line, the fit may be good or bad, but if you have more degrees of freedom left over, the same degree of dispersion will usually result in a more confident fit.
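The two-point versus 30-point comparison can be sketched in Python with a hand-written least-squares fit. The data here are invented for illustration (a true line of 2 + 0.5x plus random noise is my assumption), and `fit_line` is a hypothetical helper, not a library function.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares line; returns slope, intercept, SSE, residual df."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Two statistics (slope and intercept) are estimated from the data.
    return slope, intercept, sse, n - 2


# Two points: the line passes through both, leaving nothing to judge the fit by.
slope_two, intercept_two, sse_two, df_two = fit_line([1.0, 4.0], [3.0, 11.0])
print("two points: SSE =", sse_two, "df =", df_two)

# Thirty noisy points (made-up data) leave 28 degrees of freedom.
rng = random.Random(1)
xs = [float(i) for i in range(30)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]
slope_30, intercept_30, sse_30, df_30 = fit_line(xs, ys)
residual_variance = sse_30 / df_30  # only defined when df > 0
print("thirty points: df =", df_30, "residual variance =", residual_variance)
```

With two points the residual variance would be 0/0, which is the numerical face of having no degrees of freedom left to assess the fit.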