What is the difference between a confidence interval and a prediction interval?

I’m doing the questions from Reading 11, questions 11 and 12. I am wondering what is the difference between a confidence interval and a prediction interval?

I believe prediction intervals are related specifically to regression analysis and confidence intervals relate to a sample.

Nope. As an example, suppose I regress the change in the S&P against the change in oil prices. A confidence interval gives an estimate of a parameter. For example, I might say that if oil prices go up by 5%, the mean change in the S&P is 1% ± 3%. This statement is about an expectation and how well I have done in my estimation problem. A prediction interval is about estimating the variability of an observation. For example, I might say that if oil prices go up by 1%, then I am 95% confident that the S&P will change by 1% ± 10%. This statement is about how well I have done on the estimation-of-the-mean problem combined with how volatile the S&P is around its mean. Here I am making a statement about something I will actually observe. Prediction intervals are always wider than confidence intervals.
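A minimal sketch of that distinction, using hypothetical data (the oil/S&P numbers below are made up for illustration): for the same X value, the confidence interval for the mean E(Y|X) uses only the estimation uncertainty, while the prediction interval for a new observation adds the variance of the observation around its mean, so it is always wider.

```python
import math

# Hypothetical paired observations: x = change in oil price (%), y = change in S&P (%)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [0.8, 1.3, 1.1, 2.0, 2.2, 2.1, 2.9, 3.1]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)

# Ordinary least squares slope and intercept
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar

# Residual variance s^2 with n - 2 degrees of freedom
s2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

x_new = 5.0
y_hat = b0 + b1 * x_new
t_crit = 2.447  # two-tailed 5% critical value at df = 6, from a t-table

# CI for E(Y|X = x_new): uncertainty in the estimated mean only
se_mean = math.sqrt(s2 * (1 / n + (x_new - xbar) ** 2 / sxx))
# PI for a new observation: adds the variance of the observation itself (the "1 +")
se_pred = math.sqrt(s2 * (1 + 1 / n + (x_new - xbar) ** 2 / sxx))

print(f"95% CI for the mean:      {y_hat:.2f} ± {t_crit * se_mean:.2f}")
print(f"95% PI for an observation: {y_hat:.2f} ± {t_crit * se_pred:.2f}")  # always wider
```

The only difference between the two standard errors is the extra "1 +" inside the square root, which is exactly the "variability of the observation" Joey is describing.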

Thanks Joey

Yes, thank you.

Here are my two cents on your question from reading the text. A confidence interval is one of the methods of testing a hypothesis that either of your regression coefficients (slope or intercept) is different from your hypothesized value. (The other method is a t-test.)

For example, if your 62-observation regression gave you a slope of 1.5 with a standard error of 0.2, and you want to test whether this slope is significantly different from 1 at a 5% significance level, you would build a confidence interval and check whether the hypothesized value falls inside it: estimated slope ± (t-critical) × (standard error of the estimated coefficient). (To get t-critical, you look it up in the t-table using df = n − 2, so in this case 60, and an alpha of 2.5% since this is a two-tailed test at a 5% significance level. From the table, t-critical is about 2.) So the confidence interval in this case is 1.5 ± (2)(0.2) = (1.1, 1.9). Since our hypothesized value of 1 falls outside this confidence interval, we can reject the hypothesis that the slope coefficient equals 1.

For prediction intervals, once again we are testing a hypothesis, but instead of testing the regression coefficients, slope (b1) and intercept (b0), we are actually testing whether the predicted value of the regression equation (Y) is different from your hypothesized value. The idea is the same; the only difference is that the interval is predicted value (Y) ± (t-critical) × (standard prediction error), which is that nasty formula (11-15) on page 262 of the first book.

When I started writing this, I thought it would be a little easier to explain, but looking back at it, this looks like a hot mess. I hope it made sense.
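The confidence-interval part of the worked example above can be checked in a few lines (the t-critical value of 2.000 at df = 60 is taken from a t-table, as in the post):

```python
b1 = 1.5       # estimated slope from the example
se_b1 = 0.2    # standard error of the slope
b1_hyp = 1.0   # hypothesized slope
n = 62
df = n - 2     # 60 degrees of freedom for a simple regression
t_crit = 2.000 # two-tailed 5% critical value at df = 60, from a t-table

# Confidence-interval approach: reject H0 if the hypothesized value is outside
lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
print(f"95% CI for the slope: ({lower:.1f}, {upper:.1f})")  # (1.1, 1.9)
print("reject H0:", not (lower <= b1_hyp <= upper))

# Equivalent t-test approach: reject H0 if |t| > t_crit
t_stat = (b1 - b1_hyp) / se_b1
print("t-statistic:", t_stat, "reject H0:", abs(t_stat) > t_crit)
```

Both approaches agree by construction: the hypothesized value lies outside the interval exactly when the t-statistic exceeds the critical value.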

"For prediction intervals, once again we are testing a hypothesis, but instead of testing the regression coefficients, slope (b1) and intercept (b0), we are actually testing whether the predicted value of the regression equation (Y) is different from your hypothesized value. The idea is the same; the only difference is that the interval is predicted value (Y) ± (t-critical) × (standard prediction error), which is that nasty formula (11-15) on page 262 of the first book." Just isn't true. The regression equation gives you an expectation (in fact, a conditional expectation). When you observe a point, you don't think it will fall directly on your regression line, because there is variability in the point. There is also variability in your estimate of that expectation. The prediction interval packages those two together. Further, C.I.'s aren't hypothesis tests, though you can probably use them that way. They are estimators. Hypothesis tests are tools for decision-making.

Joey, I’m just going by what the text is telling me. I agree with your statement about the regression equation: “The regression equation gives you an expectation (in fact, a conditional expectation). When you observe a point, you don’t think it will fall directly on your regression line because there is variability in the point. There is also variability in your estimate of that expectation. The prediction interval packages those two together.”

I still, however, stand by my statement of what a confidence interval and a prediction interval are. If you look at the text, Volume 1, Reading 11, p. 251, under Hypothesis Testing, it says: “We can perform a hypothesis test using the confidence interval approach if we know three things: 1) estimated parameter value, b0 or b1, 2) the hypothesized value of the parameter b0 or b1, and 3) a confidence interval around the estimated parameter. A confidence interval is an interval of values that we believe includes the true parameter value, with a given degree of confidence…” So according to the text, you can perform a hypothesis test using the confidence interval.

As for prediction intervals, according to the text, Volume 1, Reading 11, pp. 263–264, under Prediction Intervals, it says: “…Analysts often want to use regression results to make predictions about a dependent variable (Y)…But we are not merely interested in making these forecasts; we want to know how certain we should be about the forecasts’ results…Therefore, we need to understand how to compute confidence intervals around regression forecasts.” Therefore, just as with confidence intervals, where we test a hypothesis that a regression parameter differs from a hypothesized value at a given significance level, with a prediction interval we are testing whether a regression result differs from a hypothesized value at a given significance level.
So although you are absolutely correct in your definition of the prediction interval in your response, I believe, based on the text, that my explanation is valid as well. I welcome further discussion on this if you feel like I am missing something.

"A prediction interval is about estimating the variability of an observation. For example, I might say that if oil prices go up by 1%, then I am 95% confident that the S&P will change by 1% ± 10%. This statement is about how well I have done on the estimation-of-the-mean problem combined with how volatile the S&P is around its mean. Here I am making a statement about something I will actually observe." Is this an F-test?

I think an F-test is used to see whether all of the slope coefficients in a regression are jointly equal to 0; in other words, whether any of the independent variables have explanatory power over the dependent variable.
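A small sketch of that joint test with hypothetical numbers (the sums of squares and counts below are made up; the 5% critical value for F(2, 62) is taken from an F-table): the F-statistic compares explained to unexplained variation, and a large value means the regression as a whole has explanatory power.

```python
k = 2        # number of independent variables (hypothetical)
n = 65       # number of observations (hypothetical)
rss = 80.0   # regression (explained) sum of squares (hypothetical)
sse = 120.0  # sum of squared errors (hypothetical)

# F = (RSS / k) / (SSE / (n - k - 1)), testing H0: all slopes equal zero
f_stat = (rss / k) / (sse / (n - k - 1))
f_crit = 3.15  # approximate 5% critical value for F(2, 62), from an F-table

print(f"F-statistic: {f_stat:.2f}, reject H0: {f_stat > f_crit}")
```

Rejecting H0 here says only that at least one slope is nonzero; individual t-tests (or confidence intervals) are still needed to say which one.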

Care should be taken with respect to confidence intervals. They really mean nothing. Although some “confidence” may be found in such an interval, the fact of the matter is that we don’t know the value of the parameter being estimated, we don’t know whether it is in the interval, and we can’t assign any real probability to its being in any given interval. It is either in it or not, and we will never know which. On the other hand, we can find some practical use for a prediction interval. Conditional upon your estimate of a certain parameter (assuming it is unbiased), we can use that estimate to make a prediction about the value of a variable that depends on that parameter. These are two completely different concepts.

ylager Wrote: ------------------------------------------------------- > Joey, I’m just going by what the text is telling > me. […] > I welcome further discussion on this if you feel > like I am missing something.

OK, that’s pretty interesting, and I definitely see how you got from the book to “For prediction intervals, once again we are testing a hypothesis, but instead of testing regression coefficients, slope (b1) and intercept (b0), we are actually testing if the predicted value of the regression equation (Y) is different from your hypothesized value.” The problem is that the CFA material is so cavalier with the term “confidence interval”. A C.I. is an interval estimate of a parameter with some probabilities attached. That means, as you say, that you can always use a C.I. for some hypothesis test. So when the book says “Therefore, we need to understand how to compute confidence intervals around regression forecasts”, you logically conclude that you are doing a hypothesis test about a regression forecast. The problem, of course, is that the term C.I. there is inappropriate. A C.I. around a regression forecast is just our usual C.I. around E(Y|X). A prediction interval is not about estimating a parameter but about how much variability is in your prediction of a new observation. Anyway, that’s a fine point, and if you’re reading carefully enough to draw that kind of conclusion, you’re probably doing very well.