Quant - SEE df

Can somebody explain to me why the degrees of freedom for the SEE are equal to n-k-1, or, for that matter, why the RSS df is k? I know I could just memorize it, but the fact that I don’t know this already is a testament to the fact that I don’t really understand this topic. Thanks.

I have no idea… I’m just going to try and remember the formulas well enough to pass :-)

This might help:

Degrees of Freedom: The size of the sample used to generate an estimate is often described in terms of the number of degrees of freedom in that sample. There are as many degrees of freedom in a sample for a particular estimate as there are independent terms used to calculate that estimate. For example, for a mean there are N degrees of freedom in a sample of size N; for an estimate of variance there are N-1 degrees of freedom in that same sample. (This is because the mean is used to generate the estimate of the variance, and therefore only N-1 of the data points can vary and still give the same mean.)

This does not explain it for the SEE, though. CP
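A quick numerical sketch of that N-1 idea (Python; the sample values and seed are made up for illustration): once the mean has been computed from the data, dividing the squared deviations by N-1 rather than N is exactly the correction for the one degree of freedom the mean used up.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed
x = rng.normal(loc=5.0, scale=2.0, size=10)  # N = 10 observations

# The mean uses all N independent observations: N degrees of freedom.
mean = x.sum() / len(x)

# The variance estimate reuses that mean, so only N-1 deviations are
# free to vary (the last is pinned down by the mean): N-1 df.
var_n   = ((x - mean) ** 2).sum() / len(x)        # divide by N: biased low
var_nm1 = ((x - mean) ** 2).sum() / (len(x) - 1)  # divide by N-1: unbiased
```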

Thanks cpk. Your example helps. We are getting closer.

I just memorized it.

RSS df = k
SEE df = n-(k+1)
Total df = n-1

MWVT - was that a typo on your part? n-k+1, right? Not minus 1.

No JB, it is not a typo. The formula in the book has it as n-(k+1), which translates to n-k-1. Hope this helps.

Describing exactly how this works is long and beyond the scope, but the gist is that when you are calculating a sum of squares around a mean, you are using it for some variance estimate.

If I told you that the observations were generated from the process b0 + b1*x + b2*z + error and gave you b0, b1, and b2, the df would be n. You would simply be calculating the variability of that error term directly.

In the real world you don’t know b0, b1, b2 (but only because you don’t have complete access to omniscient Joey, who thinks that omniscience trumps good statistics any day). Anyway, you estimate those things using the same data you are about to use to estimate the variance. Since you use the same data for both things, you expect your sum of squared errors to be smaller than it would be if I gave you the coefficients, because you chose those estimators to make the SSE as small as possible. That is, the real coefficients have a worse fit to the data than the estimated ones, because you chose the estimated ones to be the best fit.

There’s a price for doing that, and it’s expressed in losing a degree of freedom for each parameter you estimate. Why you lose exactly one per parameter in least squares regression is an interesting topic that requires a blackboard and a lecture, although it is surely on the internet in 16,000 places.
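Here is a small simulation of that point (Python; the coefficients, sample size, and variable names are all just made up for illustration). With the true coefficients, SSE/n estimates the error variance; the least-squares fit always achieves a smaller SSE on the same data, and dividing it by n-k-1 is what restores an unbiased estimate.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 200, 2            # n observations, k = 2 regressors (x and z)
b0, b1, b2 = 1.0, 2.0, -0.5
sigma = 3.0              # true error standard deviation

x = rng.normal(size=n)
z = rng.normal(size=n)
y = b0 + b1 * x + b2 * z + rng.normal(scale=sigma, size=n)

X = np.column_stack([np.ones(n), x, z])   # design matrix with intercept

# SSE using the TRUE coefficients (the "omniscient Joey" case): n df.
sse_true = ((y - X @ np.array([b0, b1, b2])) ** 2).sum()

# SSE using coefficients ESTIMATED by least squares from the same data.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sse_fit = ((y - X @ beta_hat) ** 2).sum()

print(sse_fit <= sse_true)      # always True: OLS minimizes the SSE
print(sse_true / n)             # ~sigma**2: n df when the betas are known
print(sse_fit / (n - k - 1))    # ~sigma**2: n-k-1 df after estimating 3 params
```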

jbisback Wrote:
> I just memorized it.
> RSS df = k
> SEE df = n-(k+1)
> Total df = n-1
> MWVT - was that a typo on your part? n-k+1 right? not minus 1

They are the same: n-k-1 = n-(k+1).
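A throwaway check with concrete numbers (n = 50 and k = 3 are arbitrary) that the two forms agree, and that the df from jbisback's list partition the total:

```python
n, k = 50, 3              # arbitrary sample size and number of regressors

rss_df = k                # regression (explained) df
see_df = n - (k + 1)      # error df: 46, identical to n - k - 1
total_df = n - 1          # 49

assert see_df == n - k - 1            # n-(k+1) and n-k-1 are the same thing
assert rss_df + see_df == total_df    # the df always add up this way
```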

See… I told you all I was a complete mathematical idiot in my other post. This is evidence.

Nah. That doesn’t mean anything. When I first saw your post I thought I had it wrong too. I actually had to use an example to confirm it to myself. I am almost to the harder (new) stuff in quant now.