C.I., std. dev., s.e., etc

JoeyDVivre · December 5, 2008, 1:27pm

There’s some junk on the LI forum about this stuff. For some reason there seem to be a bunch of people who think they know this stuff and who are drowning out people like bchadwick who really do. So I’m pulling credentials - my Ph.D. in Stats very likely trumps your read of the CFAI material (but you should listen to bchadwick, maratikus, and some other people who know this stuff completely too).

Big question: std/Sqrt(n) vs std

We divide by sqrt(n) when we are estimating a C.I. for a mean or average. We don’t divide by sqrt(n) when we are estimating a C.I. for an individual observation.

Rehashing an example I gave the other day: bchadwick sat at a Walmart shopping aisle on Black Friday and sampled 625 shoppers’ receipts. He found the following statistics for amount of money spent:

X-bar = $200
std = $70

a) Find a 95% C.I. for the MEAN or AVERAGE amount spent by Walmart shoppers.

So think about what you are doing here. bchadwick observed only 625 shoppers out of the gajillions trampling people to get in the stores. We are assuming that he has a random sample (yeah, right) from the entire population. Notice that I haven’t told you about normality, but since we have a big sample we can still do the problem using the CLT. Notice also that the numbers bchadwick collected are sample statistics, not population values. He got them by saying “Excuse me Ma’am, I’m a CFA Charterholder, professor, social scientist type doing research so please hand over your receipt so you don’t bias my sample”. Then he calculated the numbers. No omniscient population numbers here.

Ans: X-bar ± z*s/Sqrt(n) => 200 ± 1.96*70/Sqrt(625)

b) Find a 95% C.I. for the amount spent by an individual shopper assuming that the amount spent by shoppers is normally distributed.

So this is very different. I want a C.I. about a single shopper, not the average of all shoppers. To do that, I need to know the distribution, and the only distribution that makes sense here is the normal distribution.
Now we don’t want that sqrt(n) thing. The only other question that remains here is the t vs z thing. With a sample of 625 this doesn’t make a difference, but with a small sample it does (a t with 624 df is essentially a z). The rule here is that you always use a t unless you know the population std. dev. and don’t have to estimate it. I can’t imagine when you would ever know the pop. std. dev. but be estimating the population mean (I can’t even make up a problem like that from the real world), so you should always use a t unless the df is off the charts of your t-table, in which case you substitute a z.

Ans: X-bar ± t[624 df]*s => (approx) X-bar ± z*s => 200 ± 1.96*70

c) Find a 95% C.I. for the MEAN or AVERAGE amount spent by Walmart shoppers given that the amount spent is normally distributed.

We know more than we knew in a) (the normally distributed part), but the answer is the same. Since we didn’t have to use the CLT, we can say that this answer is “exact” and the one in a) was “approximate”, but on the CFA exam (and everywhere else in life) that is not an important distinction.

d) Find a 95% C.I. for the MEAN or AVERAGE amount spent by Walmart shoppers given that the amount spent is distributed according to a Singh-Maddala distribution.

I used Singh-Maddala distributions on a project once on modelling meteorite composition. Very cool. The answer is the same as a) because CLT beats Singh-Maddala anytime. I still think meteorites are really cool.

e) Find a 95% C.I. for the amount spent by an individual Walmart shopper given that the amount spent is normally distributed and the population std. dev. is 60.

So here we have the omniscient test writer telling you the population std. dev. This is a strange thing and a very artificial problem. Here we abandon the t-distribution because we are not estimating the std. dev. with s.

Ans: X-bar ± z*sigma => 200 ± 1.96*60

f) Find a 95% C.I. for the MEAN or AVERAGE amount spent by Walmart shoppers given that the amount spent is normally distributed and the population std. dev. is 60.
Again, abandon t.

Ans: X-bar ± z*sigma/Sqrt(n) => 200 ± 1.96*60/Sqrt(625)

g) Find a 95% C.I. for the MEAN or AVERAGE amount spent by Walmart shoppers given that the population std. dev. is 60.

This problem actually shows up sometimes and I never quite know what to do with it. On the one hand, you are using the CLT because you don’t know about the distribution, and on the other, the omniscient test writer has given you a population std. dev. A t stat is more conservative, but you are being conservative on something approximate anyway. I would use a z, but reasonable people disagree.
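
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two interval formulas from this post. The function names are made up for illustration, and 1.96 is hard-coded as the 95% z value exactly as in the answers; with a small sample you would pass a t critical value as `crit` instead.

```python
import math

Z_95 = 1.96  # two-sided 95% z value, as used in the post


def ci_for_mean(xbar, sd, n, crit=Z_95):
    """Interval for the MEAN: divide the std. dev. by sqrt(n)."""
    half_width = crit * sd / math.sqrt(n)
    return (xbar - half_width, xbar + half_width)


def ci_for_individual(xbar, sd, crit=Z_95):
    """Interval for a SINGLE observation (post's formula): no sqrt(n)."""
    half_width = crit * sd
    return (xbar - half_width, xbar + half_width)


# Walmart example: n = 625, X-bar = 200, s = 70, or sigma = 60 when given
part_a = ci_for_mean(200, 70, 625)   # mean, sample std. dev.
part_b = ci_for_individual(200, 70)  # individual shopper, sample std. dev.
part_e = ci_for_individual(200, 60)  # individual shopper, population std. dev.
part_f = ci_for_mean(200, 60, 625)   # mean, population std. dev.

for label, (lo, hi) in [("a", part_a), ("b", part_b), ("e", part_e), ("f", part_f)]:
    print(f"{label}) {lo:.2f} to {hi:.2f}")
# a) 194.51 to 205.49
# b) 62.80 to 337.20
# e) 82.40 to 317.60
# f) 195.30 to 204.70
```

The interval for the mean is only about $5.50 either side of $200, while the interval for an individual shopper is about $137 either side. That gap is the whole sqrt(n) question.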

Rolo_Tumassey · December 5, 2008, 1:35pm

you’re an impressive and unselfish presence on this board. props to you.

perdition · December 5, 2008, 1:47pm

Well, that just about clears that issue up then! Tks for the clarification Joey and taking the time out to offer advice over the last few months. If I had not visited AF, I would never have realised how much work was required for level 1. It has been a great resource in my studies. Cheers.

Damil4real · December 29, 2008, 2:07pm

Freakin’ genius you are, Joey, it’s freaky!

Trogdor · December 29, 2008, 2:41pm

All hail Ph.D. wizard! Great job, Joey! There is one trick that helps me analyze stdev’s, maybe it will be useful to others as well. The standard deviation for a single observation is always larger than the one for an average. Why? Because when you average multiple data points, the highs and lows partly cancel out, so averages bunch up closer together than individual observations do. To account for this effect you divide std by sqrt(n).
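
A quick simulation can make this concrete. This is only a sketch under assumed numbers: normally distributed spending with mean 200 and std. dev. 70 (borrowed from Joey’s example), samples of n = 25, and a fixed seed so the run is repeatable. The function name is made up for illustration.

```python
import random
import statistics


def simulate_spread(n=25, trials=2000, mu=200.0, sigma=70.0, seed=0):
    """Return (std. dev. of individual draws, std. dev. of sample means)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    singles = []
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        singles.extend(sample)                  # every individual observation
        means.append(statistics.mean(sample))   # one average per sample
    return statistics.stdev(singles), statistics.stdev(means)


sd_single, sd_mean = simulate_spread()
print(f"std of individual observations: {sd_single:.1f}")  # should be near 70
print(f"std of sample averages:         {sd_mean:.1f}")    # should be near 70/sqrt(25) = 14
```

The averages should come out roughly five times less spread out than the individual observations, which is the sqrt(25) factor.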

Mel_2008 · December 29, 2008, 6:05pm

Joey, could you please take a look at my post today? I am trying to figure out if my question is related to this same thing, but I don’t quite see it. Thanks, I would really appreciate your help.