standard error problem

financedude · December 3, 2008, 4:47pm

An airline was concerned about passengers arriving too late at the airport to allow for the additional security measures. Based on a survey of 1,000 passengers, the mean time from arrival at the airport to reaching the boarding gate was 1 hour, 20 minutes, with a standard deviation of 30 minutes. If the airline wants to make sure at the 95 percent confidence level that passengers have sufficient time to catch their flight, how much time ahead of their flight should passengers be advised to arrive at the airport?

A) One hour, fifty minutes.
B) Two hours, thirty minutes.
C) Two hours, ten minutes.
D) Two hours, forty-five minutes.

Your answer: A was incorrect. The correct answer was C) Two hours, ten minutes.

We can use standard distribution tables because the sample is so large. From a table of area under a normally distributed curve, the Z value corresponding to a 95 percent, one-tail test is: 1.65. (We use a one-tailed test because we are not concerned with passengers arriving too early, only arriving too late.) Here, we do not divide by the standard error, because we are interested in a point estimate of making our flight. The answer is One hour, twenty minutes + 1.65(30 minutes) = 2 hours, 10 minutes.

------------------------------------------

I don’t understand why we don’t divide by the standard error. Can someone tell me when to not divide by the standard error? Why does being interested in a point estimate of making our flight matter? What else would we be interested in?

ludwig.wittgenstein · December 3, 2008, 5:46pm

hi, you are not interested in the standard deviation of the mean, but of the population. you want to be sure that all passengers make it, NOT that the mean passenger is right on time.
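
To put numbers on that distinction, here is a minimal Python sketch using the figures from the question. It assumes, as the answer key does, that the time to reach the gate is normally distributed; the helper name `one_tailed_cutoff` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_cutoff(mean, spread, confidence=0.95):
    """Return mean + z * spread, with z the one-tailed standard normal quantile."""
    z = NormalDist().inv_cdf(confidence)  # about 1.645 for 95%
    return mean + z * spread


mean_minutes = 80.0  # 1 hour, 20 minutes
sd_minutes = 30.0    # spread of individual passengers
n = 1000             # survey size

# Spread of individual passengers: use the standard deviation.
individual = one_tailed_cutoff(mean_minutes, sd_minutes)

# Spread of the survey's mean: use the standard error, sd / sqrt(n).
se_minutes = sd_minutes / sqrt(n)
mean_bound = one_tailed_cutoff(mean_minutes, se_minutes)

print(f"95% of passengers need at most about {individual:.1f} minutes")  # about 129.3
print(f"upper bound on the MEAN time is about {mean_bound:.1f} minutes")  # about 81.6
```

The first number (roughly 2 hours, 10 minutes) answers the airline's question about an individual passenger. The second only says where the average passenger's time probably lies, which is why dividing by sqrt(n) gives a uselessly small buffer here.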

financedude · December 3, 2008, 5:59pm

hmmm…I think I might be getting it…thanks, but is there another way of looking at it? Any other takers on this?

ridgefield · December 3, 2008, 6:00pm

i understand why you don’t use the standard error, but why do you use 1.65 - isn’t that for a 90% CI?

ludwig.wittgenstein · December 3, 2008, 6:02pm

it’s very easy if you think of it. what’s the mean age in this forum? what’s the standard deviation? If you want to be 95% sure that a randomly sampled person is younger than x, which number would you choose?

mu + 1.65*sd
or
mu + 1.65*sd/sqrt(number of persons writing here)

financedude · December 3, 2008, 6:15pm

perfect, thanks!

ludwig.wittgenstein · December 3, 2008, 6:22pm

ridgefield Wrote:
-------------------------------------------------------
> i understand why you don’t use the standard error,
> but why do you use 1.65 - isn’t that for a 90% CI?

1.65 ~ 95%-Quantile of standard normal. meaning 5% prob mass is to the right of 1.65. and 5% is to the left of -1.65, meaning 90% is in between. (so you get your 90%)
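
The one-tail versus two-tail bookkeeping can be checked with Python's standard library. This is only a sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# One-tailed 95%: 5% of the probability mass lies to the right of z.
z_one_tail_95 = std_normal.inv_cdf(0.95)

# Two-tailed 95%: 2.5% in each tail, so look up the 97.5% quantile.
z_two_tail_95 = std_normal.inv_cdf(0.975)

# Mass between -1.65 and +1.65: the familiar 90% two-sided interval.
mass_between = std_normal.cdf(1.65) - std_normal.cdf(-1.65)

print(f"one-tailed 95% z: {z_one_tail_95:.3f}")   # about 1.645
print(f"two-tailed 95% z: {z_two_tail_95:.3f}")   # about 1.960
print(f"mass within +/-1.65: {mass_between:.3f}")
```

So 1.65 serves both as the one-tailed 95% cutoff and as the two-tailed 90% cutoff, while 1.96 is the two-tailed 95% cutoff.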

boston21 · December 3, 2008, 6:25pm

Am I missing something here? Why are we using 1.65 when the interval is 95%? I am assuming there is a typo somewhere…

ludwig.wittgenstein · December 3, 2008, 6:30pm

5+5=10

ridgefield · December 3, 2008, 6:56pm

got it - it’s one tailed

bpdulog · December 3, 2008, 7:54pm

I’m still confused by the logic of using or not using the SE. In the example above, is SE avoided because the mean is a fact and not an estimation of the mean?

revenant · October 12, 2009, 6:27am

Sorry for bringing this back up again. I was searching for threads on standard error when I chanced upon this one. Not wanting to start a new thread, and feeling that this was a great question, I decided to ask my question here.

1) Can someone elaborate on when we divide by SE and when by SD?
2) Isn’t SD of sample = SE?

Thanks.

rus1bus · October 12, 2009, 7:50am

I will answer your 2nd question first and use that explanation to answer your 1st question.

2) Isn’t SD of sample = SE?

The answer is NO. I guess an example would be better to explain this.

There are 1000 L1 candidates in total registered for the Dec 2009 exam. Their Mean age is 28. The age of the youngest candidate is, say, 20 and that of the eldest candidate is, say, 50. So, this is your POPULATION with Mean 28 and SD of 30 (i.e. 50 - 20). (SD is not really a simple subtraction; I am doing it this way just to give an idea of width.)

Next, say you don’t have the time to get the ages of all candidates and instead you decide to take a sample of 100 candidates in the NYC area. Their mean age comes out to be 25. The youngest and eldest in this Sample are 21 and 49. So, this is one SAMPLE with Mean of 25 and SD of 28 (i.e. 49 - 21).

Now, let’s say there are 20 such samples (though you don’t need more than 1 sample). If you plot a distribution of the MEAN Ages from all these 20 Samples, it will have some Mean and some Deviation of its own. The Deviation of such a distribution is known as the STANDARD ERROR. By formula, SE in this case would be 30 / sqrt of 100 = 3. And if the Pop SD was not known, then it would be 28 / sqrt of 100 = 2.8.

So, the answer is: SD of a Population or a Sample is the spread of values in that Population or Sample, whereas SE is the spread of MEANS coming from various Samples from a given Population. Always remember that the Population or Sample Distribution will be a lot wider than the Distribution of MEANS coming from various samples. Hence, SD of Population or Sample will be MUCH MORE than the SE.

Now your 1st question:

1) When do we divide by SE and when by SD?

When you have your Sample Mean and you want to know the extent to which this Sample Mean represents (estimates) the Population Mean, you will use SE. In all other cases, you will use either the SD of the Population or the SD of the Sample, whichever is available.

Hope this clarifies. I will be glad to explain further if it is not so clear.

JoeyDVivre · October 12, 2009, 4:03pm

OK - messed up thread…

0) We have wittgenstein commenting on mathematics, and everyone knows that is just to honk off Bertrand Russell and all other real mathematicians.

1) The first problem is bogus, so if you are having trouble with it, it might be because the problem stinks. The real answer is E) unknown, because we don’t know anything about the distribution of the time it takes a passenger to reach the boarding gate after arrival. The calculation in the answer assumes that it is normally distributed. That doesn’t sound likely to me and would surely need to be stated (for one thing it has a minimum, which is something like the time it would take Usain Bolt to cover the distance unimpeded by security gates and what-not - like OJ in the old Avis commercials that you are all too young to remember).

2) Rus1Bus is on the right track, but his explanation is a little messed up. To fix up the example - forget about a finite or known sized population, except to assume that it is much bigger than the sample size. The standard deviation is always much smaller than the range of the data. In a normal population of size 1000, the standard deviation will be something like 1/5 or 1/6 of the range, so the standard deviation here would be something like 5 or 6, not 30.

So now we take samples of size 100 from the population and take the mean of each sample. The samples are randomly chosen, so the sample means are random. If they are random, they have a distribution, called the sampling distribution. Some math or some careful thought says that if the sample has a standard deviation, then the sampling distribution has a standard deviation. Then some math or a careful look at the CFA notes says that the standard deviation of the sampling distribution is (sd of sample)/sqrt(sample size). That’s the standard error. The really cool thing is that, for a large enough sample, the sampling distribution is approximately normally distributed “regardless” of the distribution of the ages of the candidates - the central limit theorem.
That means that standard error is really valuable for deciding how close your sample mean is likely to be to the actual unknown population mean. The known vs unknown sd is not much of an issue. When would you ever know the population sd but not know the population mean? (only when some oracle of a problem writer tells you)
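
A small simulation can make the sampling distribution concrete. The sketch below is only an illustration under made-up assumptions: it draws from an exponential (heavily skewed, clearly non-normal) population with mean 10, takes 2000 samples of size 100, and compares the spread of the sample means with sd/sqrt(n). The function name and all the numbers are hypothetical and not taken from the posts above.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_means(pop_mean, sample_size, num_samples, seed=0):
    """Draw repeated samples from an exponential population; return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.expovariate(1.0 / pop_mean) for _ in range(sample_size)]
        means.append(mean(sample))
    return means


pop_mean = 10.0
pop_sd = 10.0        # for an exponential distribution, sd equals the mean
sample_size = 100

sample_means = simulate_sample_means(pop_mean, sample_size, num_samples=2000)

empirical_se = stdev(sample_means)             # spread of the sample means
theoretical_se = pop_sd / sqrt(sample_size)    # sd / sqrt(n) = 1.0

print(f"mean of sample means: {mean(sample_means):.2f}")
print(f"empirical SE:         {empirical_se:.2f}")
print(f"sd / sqrt(n):         {theoretical_se:.2f}")
```

Individual draws are spread out with an sd of about 10, while the sample means cluster around 10 with a spread close to 1. That tight cluster is what the standard error measures, and a histogram of `sample_means` should look roughly bell-shaped even though the population is skewed.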