 # Confidence Intervals

From Schweser:

> Example: Using a 20-year sample, the average return of a mutual fund has been 10.5 percent per year with a standard deviation of 18 percent. What is the 95 percent confidence interval for the mutual fund return next year?
>
> Here the point estimates for [μ] and s are 10.5 percent and 18 percent, respectively. Thus, the 95 percent confidence interval for the return, R, is:
>
> 10.5 ± 1.96(18) = –24.78 percent to 45.78 percent.
>
> Symbolically, this result can be expressed as:
>
> P(–24.78 < R < 45.78) = 0.95 or 95%.

The way I understand this, the confidence interval takes the form:

for a sample: sample statistic +/- reliability factor × standard error of the statistic

for a population: population parameter +/- reliability factor × standard deviation

In the above example, they clearly stated they have a sample, but they went ahead and constructed the confidence interval as if it’s the population… any thoughts?

wait, nevermind, I think I got it…

actually, I didn’t get it…

You used the wrong formula. It should be 10.5 +/- (1.96*18/sqrt(20)), so 2.611% to 18.389%.
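That arithmetic can be checked with a minimal sketch. It assumes the 18% is the sample standard deviation, that the target is the fund's average return, and that a normal (z) multiplier is acceptable. The helper name `mean_ci_z` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_z(mean, sd, n, level=0.95):
    """z-based confidence interval for a population mean (illustrative helper)."""
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error of the sample mean: s / sqrt(n).
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


low, high = mean_ci_z(10.5, 18.0, 20)
print(f"{low:.3f}% to {high:.3f}%")  # roughly 2.611% to 18.389%
```

Note that this is an interval for the average return, which is a different question from the one Schweser asked about next year's return.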

I would interpret their standard deviation of 18% as the standard error because the distribution parameters are estimated (not known a priori). However, instead of 1.96, which is the upper 2.5% quantile of the normal distribution, they should’ve used the 2.5% t-stat with 19 degrees of freedom (equal to 2.093).
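The 2.093 figure can be reproduced without a printed table. The following is only a sketch: it integrates the Student's t density numerically with Simpson's rule and finds the critical value by bisection. The function names are made up for illustration, and a statistics library would normally do this in one call.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """CDF for x >= 0 via Simpson's rule on [0, x], using symmetry about 0."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(df, level=0.95):
    """Two-sided critical value, found by bisection on the CDF."""
    target = 0.5 + level / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(19), 3))  # about 2.093, versus 1.96 for the normal
```

Because `math.gamma` overflows for large arguments, this sketch only works for small degrees of freedom, which is exactly the case where the t and z values differ noticeably.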

ymc, confidence interval is for return, not its average.

Ah, if 10.5% and 18% are from the sample but not from the population, then you should use t-statistics

10.5 +/- 0.688*(18/sqrt(20))

7.73% to 13.27%

ymc, either you make a lot of mistakes inadvertently (we’re all humans) or you’re trying to confuse the hell out of people. I’d rather believe the former. Please take your time before you answer questions, for the sake of us all. just my 2 cents.

Oops, looked up the wrong table…

10.5 +/- 2.093*18/sqrt(20)

2.076% to 18.92%

How come the Schweser notes don’t have to be divided by sqrt(sample size)?

ymc Wrote:
> Ah, if 10.5% and 18% are from the sample but not from the population, then you should use t-statistics
>
> 10.5+/-0.688*(18/sqrt(20))
>
> 7.73% to 13.27%

you have to divide by sqrt(20) if you estimate AVERAGE return.

maratikus Wrote:
> you have to divide by sqrt(20) if you estimate AVERAGE return.

Oh I see. Thanks for correcting me. Isn’t it great to help two people in one thread?

maratikus,

“I would interpret their standard deviation of 18% as the standard error because the distribution parameters are estimated (not known a priori). However, instead of 1.96, which is the upper 2.5% quantile of the normal distribution, they should’ve used the 2.5% t-stat with 19 degrees of freedom (equal to 2.093).”

That’s probably the only way you can make their example sound correct, but therein lies a problem with this assumption. The standard error is the standard deviation of the sample statistic, but from the sound of the question, it sounds very much like that’s just the standard deviation of that particular sample… The standard error should be the deviation of the sample statistic calculated from many different random samples drawn, not the deviation of the members of one sample from the mean of that particular sample…

Here’s another question from Schweser:

Construct a 90 percent confidence interval for the starting salaries of 100 recently hired employees with average starting salaries of \$50,000 and a standard deviation of \$3,000 assuming the population has a normal distribution.

A) 50000 +/- 1.65(300).

B) 50000 +/- 1.65(3000).

C) 30000 +/- 1.65(5000).

D) 50000 +/- 1.65(30000).

Your answer: A was incorrect. The correct answer was B) 50000 +/- 1.65(3000).

90% confidence interval is X ± 1.65s = 50000 ± 1.65(3000) = \$45,050 to \$54,950

The concept of a confidence interval was first introduced in reading 9 (without too much detail), which used the form Xbar +/- reliability factor × standard deviation. Then it was somewhat restated in reading 10 with the standard error replacing the standard deviation… and the questions for reading 9 use the first form, while the questions from reading 10 use the standard error…

I do agree you should use the t-distribution instead though, but this was from reading 9, so they had not introduced it at that time…

“ymc, confidence interval is for return, not its average.”

What do you mean by this? By average, I think they meant the mean of that sample… which is what you can build a confidence interval around, no?

In the above example, if we use 100 employees as the sample and get their average of \$50,000 and standard deviation (standard error) of \$3,000 and use the normal distribution assumption, we know that:

1) average salary has normal distribution with mean \$50,000 and standard deviation of \$3,000/sqrt(100) = \$300

2) salary has normal distribution with mean \$50,000 and standard deviation of \$3,000

Does it make sense?
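The two cases in that post can be put side by side in a short sketch. It uses the rounded 1.65 multiplier from the Schweser question and treats the sample values as if they were the true parameters, as the post does. The variable names are illustrative only.

```python
from math import sqrt

mean, sd, n = 50_000, 3_000, 100
z90 = 1.65  # rounded 90% multiplier used in the Schweser question

# 1) Distribution of the AVERAGE salary: spread is the standard error s/sqrt(n).
se = sd / sqrt(n)
avg_interval = (mean - z90 * se, mean + z90 * se)

# 2) Distribution of an INDIVIDUAL salary: spread is the standard deviation s.
salary_interval = (mean - z90 * sd, mean + z90 * sd)

print(se)               # 300.0
print(avg_interval)     # about (49505, 50495), i.e. answer A
print(salary_interval)  # about (45050, 54950), i.e. answer B
```

The only difference between answers A and B is which of the two spreads is plugged in, so the choice comes down to whether the question is about the average or about a single salary.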

I think I got it. Notice that in the 2 examples I posted, neither asked to construct a confidence interval for the population mean; rather, they were:

1. “What is the 95 percent confidence interval for the mutual fund return next year?”

2. “Construct a 90 percent confidence interval for the starting salaries of 100 recently hired employees”

Basically, they’re asking, given a normal distribution defined by a mean and standard deviation, what the range of possible values is for: 1. the mutual fund return next year, 2. the starting salaries, based on the distribution defined by the mean and standard deviation.

You would only use the standard error if you’re trying to use the statistic to infer something about the population parameter. Neither of the 2 questions above asked for that; that’s why they don’t use the standard error.

liaaba Wrote:
> From Schweser,
>
> Example: Using a 20-year sample, the average return of a mutual fund has been 10.5 percent per year with a standard deviation of 18 percent. What is the 95 percent confidence interval for the mutual fund return next year?
>
> Here the point estimates for [μ] and s are 10.5 percent and 18 percent, respectively. Thus, the 95 percent confidence interval for the return, R, is:
>
> 10.5 ± 1.96(18) = –24.78 percent to 45.78 percent.
>
> Symbolically, this result can be expressed as:
>
> P(–24.78 < R < 45.78) = 0.95 or 95%.

Holy cow - more messed up Schweser. Rule #1 - Nobody at Schweser knows anything about statistics. There are at least four errors there.

1) As was implied above, there is no parameter being estimated by a statistic, so there is no confidence interval. What they are asking for is a “prediction interval”, which afaik is not in any LOS.

2) Their calculations (which are wrong) rely on the distribution being normal. They forgot or something to tell you that the observations were normal.

3) As someone else pointed out, in small samples you would need to use a t instead of a z.

4) As this is a prediction interval question, you have to take into account two different sources of error - the error estimating the mean of the distribution by using X-bar and the error in estimating sigma by using the sample standard deviation. The latter is handled by using the t-distribution instead of the z and the former is handled by adding a factor in the width of the interval. The prediction interval is given by X-bar ± t*s*Sqrt(1 + 1/n) if you know the data is normal.

If Schweser’s answer confused you that’s a good sign.

Could someone who has actually purchased Schweser e-mail them my corrections to the errata dept.? Thanks - B
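The prediction interval formula in point 4, X-bar ± t*s*Sqrt(1 + 1/n), can be sketched as below. This assumes normally distributed data, as the post requires. The function name `prediction_interval` is made up, and the t critical value is passed in by hand rather than computed, so the caller has to look it up for n - 1 degrees of freedom.

```python
from math import sqrt


def prediction_interval(xbar, s, n, t_crit):
    """Interval for one NEW observation from a normal population.

    t_crit is the two-sided t critical value with n - 1 degrees of freedom;
    it is passed in rather than computed, to keep the sketch stdlib-only.
    """
    # sqrt(1 + 1/n) widens the interval for the uncertainty in X-bar.
    half_width = t_crit * s * sqrt(1 + 1 / n)
    return xbar - half_width, xbar + half_width


# 2.093 is the 95% two-sided t value for 19 degrees of freedom quoted in the thread.
low, high = prediction_interval(10.5, 18.0, 20, t_crit=2.093)
print(f"{low:.2f}% to {high:.2f}%")  # roughly -28.10% to 49.10%
```

Applied to the mutual fund example, the result is somewhat wider than Schweser's –24.78% to 45.78%, because both the t multiplier and the Sqrt(1 + 1/n) factor increase the width.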

maratikus Wrote:
> In the above example, if we use 100 employees as the sample and get their average 50,000 and standard deviation (standard error) of \$3,000 and use normal distribution assumption we know that:
>
> 1) average salary has normal distribution with mean \$50,000 and standard deviation of \$3,000/sqrt(100) = \$300
>
> 2) salary has normal distribution with mean \$50,000 and standard deviation of \$3,000
>
> does it make sense?

Not only does it make sense but it’s all correct.

liaaba Wrote:
> I think I got it, notice in the 2 examples I posted, neither asked to construct a confidence for the population mean, rather they were,
>
> 1. “What is the 95 percent confidence interval for the mutual fund return next year?”

Right, but as pointed out above, that’s just nonsense. A C.I. is about estimating a parameter and there is no parameter being estimated.

> 2. “Construct a 90 percent confidence interval for the starting salaries of 100 recently hired employees”

This one is even more nonsensical. Somehow we’re trying to make inferences about a sample that we could pull up in Excel? Statistics is about using data to make inferences about populations. If they left out the 100 employee part and said something like “salaries are normal with mean 50k and std dev 20k, calculate a 90% prediction interval for the salary of a new hire” then you would give their answer. Note you don’t need to adjust for uncertainty in X-bar or s because they have given you population values in that omniscient question writer kinda way.

> Basically, they’re asking about with a normal distribution defined by mean and standard deviation, what was is the range of possible values for: 1. what is the range for mutual fund return next yr, 2. what is the range of starting salaries based on the distribution defined by the mean and standard variation
>
> You would only use standard error if you’re trying to use the statistic to infer about the population parameter, neither of the 2 questions above asked for that, that’s why it doesn’t use the standard error.

That last sentence just warms my heart. Good job.

> 1) As was implied above, there is no parameter being estimated by a statistic so there is no confidence interval. What they are asking for is a “prediction interval” which afaik is not in any LOS.

In reading 9, LOS g states:

“Construct and explain confidence intervals for a normally distributed random variable and interpret the probability that a normally distributed random variable takes its value inside the constructed confidence interval.”

And Schweser had for it:

“The 90% confidence interval for X is [Xbar] +/- 1.65s.

The 95% confidence interval for X is [Xbar] +/- 1.96s.

The 99% confidence interval for X is [Xbar] +/- 2.58s.”

with the example given above…

And the CFAI text had pretty much the same content covering it too… I think they meant to teach prediction intervals, but they put this in instead…

> 4) As this is a prediction interval question, you have to take into account two different sources of error - the error estimating the mean of the distribution by using X-bar and the error in estimating sigma by using the sample standard deviation. The latter is handled by using the t-distribution instead of the z and the former is handled by adding a factor in the width of the interval. The prediction interval is given by X-bar ± t*s*Sqrt(1 + 1/n) if you know the data is normal.

In the CFAI notes, in the explanation of how to calculate the standard error when sigma is not known, they said there’s a finite population correction factor = sqrt[(N-n)/(N-1)] that should be included, but said that practitioners usually do not apply it when n is <5% of N (n is the size of the sample, whereas N is the size of the population)… don’t know if this is the same as the factor you mentioned above…

liaaba Wrote:
> “1) As was implied above, there is no parameter being estimated by a statistic so there is no confidence interval. What they are asking for is a “prediction interval” which afaik is not in any LOS.”
>
> In reading 9, LOS g states
>
> “Construct and explain confidence intervals for a normally distributed random variable and interpret the probability that a normally distributed random variable takes its value inside the constructed confidence interval.”

I would like to give them some benefit of the doubt, but that’s just wrong. Are the readings right?

> And schweser had for it,
>
> “The 90% confidence interval for X is [Xbar] +/- 1.65s.
>
> The 95% confidence interval for X is [Xbar] +/- 1.96s.
>
> The 99% confidence interval for X is [Xbar] +/- 2.58s.”
>
> with the example given above…
>
> And from CFAI text, it had pretty much the same content covering it too…I think they meant to teach prediction interval, but they put this in instead…
>
> “4) As this is a prediction interval question, you have to take into account two different sources of error - the error estimating the mean of the distribution by using X-bar and the error in estimating sigma by using the sample standard deviation. The latter is handled by using the t-distribution instead of the z and the former is handled by adding a factor in the width of the interval. The prediction interval is given by X-bar ± t*s*Sqrt(1 + 1/n) if you know the data is normal.”
>
> In the CFAI notes, for the explanation given to the calculation of the standard error when sigma is not known, they said there’s a finite population correction factor = sqrt[(N-n)/(N-1)] that should be included, but said practitioners usually do not apply it when n is <5% of N (n is size of sample, whereas N is size of distribution)…don’t know if this is the same as the factor you mentioned above…

No, that’s for taking a sample from a finite population when your sample is some significant proportion of the population size.
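To see why the finite population correction sqrt[(N-n)/(N-1)] is usually skipped for small sampling fractions, here is a rough sketch. The population sizes are hypothetical, chosen only to show a large, a 5%, and a negligible sampling fraction, and `fpc` is an illustrative name.

```python
from math import sqrt


def fpc(population_size, sample_size):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((population_size - sample_size) / (population_size - 1))


s, n = 3_000, 100  # sample standard deviation and size from the salary question
for big_n in (200, 2_000, 1_000_000):  # hypothetical population sizes
    # Standard error of the mean, shrunk by the correction factor.
    se = s / sqrt(n) * fpc(big_n, n)
    print(big_n, round(fpc(big_n, n), 4), round(se, 1))
```

When the sample is half the population the factor cuts the standard error substantially. At a 5% sampling fraction it is already close to 1, and for a very large population it is effectively 1. It multiplies the standard error of the mean, so it is unrelated to the Sqrt(1 + 1/n) term in the prediction interval.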

> I would like to give them some benefit of the doubt, but that’s just wrong. Are the readings right?

In reading 9, which is where they first introduce confidence intervals, they first mention:

approximately 68% of observations are between mu +/- 1 sigma

approximately 95% of observations are between mu +/- 2 sigma

…

(which are correct). Then they said:

“In general, we do not observe the population mean or the population standard deviation of a distribution, so we need to estimate them. We estimate the population mean, mu, using the sample mean, xbar, and estimate the population standard deviation, sigma, using the sample standard deviation, s. Using the sample mean and the sample standard deviation to estimate the population mean and population standard deviation, respectively, we can make the following probability statements about a normally distributed random variable X, in which we use the more precise numbers for standard deviation in stating intervals.”

And they basically showed:

> The 90% confidence interval for X is [Xbar] +/- 1.65s.
>
> The 95% confidence interval for X is [Xbar] +/- 1.96s.
>
> The 99% confidence interval for X is [Xbar] +/- 2.58s.

These are ALL from reading 9, where I think they really meant to teach prediction intervals, and they used it for that purpose as well (and they never used standard error). Then in reading 10, they introduce the central limit theorem and standard error, and then they show the correct confidence intervals…
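The coverage figures and multipliers quoted from reading 9 can be checked against the standard normal distribution. This is just a sketch using Python's standard library, and it assumes the mean and standard deviation are known exactly, which is the simplification the reading makes.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Share of a normal population within k standard deviations of the mean.
for k in (1, 2):
    coverage = std_normal.cdf(k) - std_normal.cdf(-k)
    print(f"within {k} sd: {coverage:.4f}")  # about 0.6827 and 0.9545

# Multipliers for central 90%, 95% and 99% intervals.
for level in (0.90, 0.95, 0.99):
    z = std_normal.inv_cdf(0.5 + level / 2)
    print(f"{level:.0%}: {z:.3f}")  # about 1.645, 1.960, 2.576
```

The 1.65 and 2.58 in the text are rounded versions of these values, and "2 sigma" for 95% is itself a rounding of 1.96.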

According to Wikipedia, http://en.wikipedia.org/wiki/Prediction_interval

10.5 +/- 2.093*18*sqrt(1+1/20)

-28.1% to 49.1%

Am I right now?