Quick question about residuals - I can't seem to grasp the idea intuitively. The way I see it, the residual is the difference between the actual and the predicted values at each point. Right? Also, how come in the quant section in Schweser, sometimes they will add the residual at the end of the equation for the dependent variable, but sometimes they do not? Thanks very much.

If I understood your question correctly, then my answer is that it is not the residual that is added at the end of the equation. It is the stochastic error term that you are talking about. In an SLR (simple linear regression) the general form is Yi = B0 + B1Xi + ei, where ei is the stochastic error term, not the residual. Note that ei is not observable.

HydrogenRainbow Wrote: ------------------------------------------------------- > ...it is not the residual that is added at the end of the equation... Note that ei is not observable. The residual is the sample counterpart of the stochastic error term. The answer to the original question is that the true relationship is 1) Yi = B0 + B1Xi + ei, but we are proposing that the relationship fits some linear model 2) Yi = B0 + B1Xi. We know that it does not exactly follow this, otherwise it would not be a model, but rather a deterministic equation. The true population ei is not observable, but the sample ei (the residual) is. Therefore, 2) is the starting point for the model, 1) is the actual relationship, and we place hats over the parameters in 2) once we have estimated their values.
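To make the distinction above concrete - the population ei are unobservable, but the residuals we compute from a fitted line are observable estimates of them - here is a rough numerical sketch. All parameter values and data are made up for illustration:

```python
import random

# Simulate the "true" relationship Yi = B0 + B1*Xi + ei, fit a line by
# least squares, and check that the observable residuals track the
# unobservable errors ei. (B0_true, B1_true, sigma are made-up values.)
random.seed(42)
B0_true, B1_true, sigma = 2.0, 3.0, 0.5

x = [i / 10 for i in range(1, 51)]               # 50 data points
errors = [random.gauss(0, sigma) for _ in x]     # the unobservable ei
y = [B0_true + B1_true * xi + ei for xi, ei in zip(x, errors)]

# Ordinary least-squares estimates (the "hatted" parameters)
mx, my = sum(x) / len(x), sum(y) / len(y)
b1_hat = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
          / sum((xi - mx) ** 2 for xi in x))
b0_hat = my - b1_hat * mx

# Residuals: what we can actually observe in the sample
residuals = [yi - (b0_hat + b1_hat * xi) for xi, yi in zip(x, y)]

# Point by point, the residuals sit close to the true errors
max_gap = max(abs(r - e) for r, e in zip(residuals, errors))
print(b0_hat, b1_hat, max_gap)
```

The fitted parameters land near the true (2.0, 3.0), and each residual is close to the corresponding true error, which is the sense in which residuals "estimate" the stochastic term.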

Who wants to bet that Schweser doesn’t know what the word ‘stochastic’ means? The language is pretty messed up on this stuff. Normally, when you talk about a residual you are talking about the difference between the predicted value from your estimated model and the measured value of the dependent variable. So if I estimate a regression equation of weight = b0 + b1*height, I do the least squares thing, take a height measurement, estimate a weight, and each person in my data set has a residual. That 6’ 5" guy who weighs 155 has a huge residual. But note that my estimates of b0 and b1 are just estimators of the true relationship. I can write weight = b0 + b1*height + e, where e is an error term reflecting that there is randomness in this relationship (that it is “stochastic”). So each person has some random error, and my residuals are decent estimates of each person’s random error from the true relationship. btw - those other two guys understand this stuff very well, so I’m not suggesting there was anything wrong with their answers.
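A minimal sketch of the weight-on-height idea, with made-up data: most people sit close to a line, and the tall, light outlier (the 6'5" guy) picks up the largest residual in magnitude:

```python
# Hypothetical data: seven people near a linear trend plus one
# 77-inch person weighing 155 lb, well below the trend.
heights = [62, 64, 66, 68, 70, 72, 74, 77]          # inches
weights = [125, 135, 145, 155, 165, 175, 185, 155]  # pounds

n = len(heights)
mh = sum(heights) / n
mw = sum(weights) / n

# "The least squares thing": slope and intercept estimates
b1 = (sum((h - mh) * (w - mw) for h, w in zip(heights, weights))
      / sum((h - mh) ** 2 for h in heights))
b0 = mw - b1 * mh

# Each person's residual: measured weight minus predicted weight
residuals = [w - (b0 + b1 * h) for h, w in zip(heights, weights)]
print(residuals)
```

The residuals sum to (essentially) zero, as they always do in a least-squares fit with an intercept, and the last person's residual is both negative and the biggest in absolute value.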

Also remember that we usually exclude the error term because it’s assumed to be 0 over the entire population. If it is not 0, however, then we need to rethink the model. Think of applying a model to stock returns. If we run the model every day, and over the course of a year the model has, on average, over-predicted the returns by 1%, then the constant term b0 should be adjusted so that the error term is now equal to 0.

clafleur Wrote: ------------------------------------------------------- > Also remember that we usually exclude the error term because it’s assumed to be 0 over the entire population... The error term is not assumed to be zero. It is a random variable with a distribution, whose expected value is zero, and whose variance is assumed to be finite and constant. You are correct that we can modify the model slightly to correct for a bias if the expected value is not zero, but you should be careful mixing up the terms assumption and expectation. They are completely different concepts.


nice catch, my bad

wyantjs Wrote: ------------------------------------------------------- > The error term is not assumed to be zero. It is a random variable with a distribution, whose expected value is zero, and whose variance is assumed to be finite and constant... I think “…error term because it’s assumed to be 0 over the entire population” is also misleading. The assumption is that the average (expected value) of those error terms is zero. In other words, they are random, and the least-squares regression line fits in such a way that roughly 68% of the Y values are found within one standard deviation of the regression line.

Thanks for everyone’s help; I got a lot more info than I bargained for. However, I’m still not sure I can answer my original question. The way I see it, the regression is first modeled with Yi = B0 + B1Xi, and then once we plot it against the true data points, we make it Yi = B0 + B1Xi + ei in order to incorporate the differences between the actual and predicted values?

No. The original model is Yi = B0 + B1Xi + ei, which describes the population. Then when you put in the actual data, you get the estimated equation Yi hat = B0 hat + B1 hat * Xi, where B0 hat and B1 hat are the estimated parameters and Yi hat is the fitted value.

As an example, say you have 3 data points, where the independent variable is height in metres and the dependent variable is weight in kg:

X: 1.4, 1.84, 1.54
Y: 50, 76.3, 55

If you are keen to find the relationship between height and weight, the original model will be Weight = B0 + B1 * height + ei (stochastic error), but note that ei is not observable. You plot a line of best fit, and say you get this: Yi hat = -0.03 + 30 * height. Your estimated value of Y for the 1st data point will be -0.03 + 30(1.4) = 41.97 kg. The residual is Yi - Yi hat = 50 - 41.97 = 8.03. It is not uncommon to denote the residual as e hat.
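A quick check of the arithmetic above, plugging the post's hypothetical fitted line Yi hat = -0.03 + 30 * height into the three data points:

```python
# Coefficients of the hypothetical line of best fit from the post
b0_hat, b1_hat = -0.03, 30.0

heights = [1.4, 1.84, 1.54]   # X, metres
weights = [50, 76.3, 55]      # Y, kg

# Fitted values Yi hat and residuals e hat = Yi - Yi hat
fitted = [b0_hat + b1_hat * h for h in heights]
residuals = [y - yhat for y, yhat in zip(weights, fitted)]

print(round(fitted[0], 2))     # 41.97
print(round(residuals[0], 2))  # 8.03
```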

The previous posters have excellently explained the nitty-gritty. My intuitive understanding: notice that when only Yi = b0 + b1Xi is given (without e), it is given with hats ^ on top to indicate that the equation returns an estimate of Yi, not the exact Yi. The real-world Yi will then be Yi hat + e, which is the predicted value plus the error term.

Everyone is making it harder than it needs to be. In essence, when you are estimating a regression you are not assuming you are going to have a residual; therefore, no error term. The estimated equation is noted with a hat, as stated in a previous comment. When you actually look at the real data you will have an error term, and therefore you add it. I hope this helps

dfmac Wrote: ------------------------------------------------------- > Everyone is making it harder than it needs to be. Well… > In essence when you are estimating a regression > you are not assuming you are going to have a > residual. Therefore no error term. Wrong. > The estimated > equation is noted as a hat as stated in a previous > comment. When you actually look at the real data > you will have an error term and therefore you add > it. > Add it? > I hope this helps Doesn’t

Thanks for adding your insightful comment. When you are using an estimation there is no error term, plain and simple. When you use real data points and compare them to the estimate, there are obviously errors. What don’t you get? Plain and simple.

dfmac, I would like to welcome you to this forum, as I see this is your third post. However, you should be aware of who you are arguing with. I personally am finishing a masters in math, and Joey here has a PhD in Statistics. I am not sure about Hydrogen’s education, but I believe he is a bright person judging from previous posts. You might not want to argue, especially when you are wrong… plain and simple

Thanks for welcoming me. However, the point I make is that when you make an estimation there is no error term… plain and simple. You don’t assume an error in your estimation. Maybe I am coming across wrong, but I guarantee that is right. The error term doesn’t exist until you run the regression and compare the actual data points to the estimation. If there are enough data points, then the central limit theorem will be exhibited in the error term, and the error term approaches 0 as the number of observations increases. I am not trying to argue, and you can correct me if I am wrong on this post, but I guarantee I am not. I am just addressing the original question on this thread.

“However the point I make is when you make an estimation there is no error term… plain and simple.” This is plain, simple, and wrong. If you didn’t assume that there is a stochastic error term, then once you do your regression and get a linear fit, you would be implying that your model is completely deterministic: if I gave you some X, you would produce the dependent variable Y that corresponds to it. What actually happens is that you have a model Y = a + bX + e, where typically e ~ N(0, sigma), i.e. the error term is normally distributed with mean zero and some finite standard deviation. Hence Y ~ N(a + bX, sigma), which means that Y is normally distributed with mean a + bX and standard deviation sigma. So if I give you some X, the corresponding Y is not deterministic but stochastic with some distribution, and you can build a confidence interval around it.

“Unless there are enough data points then the central limit theorem will be exhibited in the error term and the error term approaches 0 as the number of observations increases.” That doesn’t make sense. The error term is a random variable with some distribution. The mean of this distribution is assumed to be zero, and this has nothing to do with the central limit theorem or the number of observations. The central limit theorem here plays a role in relaxing the assumption that the error term is normally distributed.
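The distributional claim above - that for a fixed X, Y ~ N(a + bX, sigma) - can be sketched by simulation. The values of a, b, sigma, and x below are made up purely for illustration:

```python
import random
import statistics

# With Y = a + b*X + e and e ~ N(0, sigma), Y at a fixed X is itself
# a random variable with mean a + b*X and standard deviation sigma.
random.seed(0)
a, b, sigma = 1.0, 2.0, 0.5
x = 3.0

# Many independent draws of Y at the same X
draws = [a + b * x + random.gauss(0, sigma) for _ in range(20_000)]

print(statistics.mean(draws))    # close to a + b*x = 7.0
print(statistics.stdev(draws))   # close to sigma = 0.5
```

Handing the model the same X repeatedly does not return one deterministic Y; it returns a distribution centred at a + bX, which is exactly what a confidence interval for a prediction is built around.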

dfmac Wrote: ------------------------------------------------------- > The error term doesn’t exist until you run the regression and compare the actual data points to the estimation... Huh??? Of course you assume that there will be an error. The entire premise of linear regression is to minimize the error, which asserts from the beginning that it is present. The CLT plays no role whatsoever in the model’s structure, and the error term does not approach zero as n -> infinity. This is simply wrong. The CLT may be invoked to make assumptions about the distribution of the error term, allowing for confidence intervals and significance tests. I suggest you pick up an econometrics textbook. You have an incorrect understanding of this concept.

Yes, you should assume there is going to be an error, but not in your estimation. The CLT more than applies, considering one of the assumptions about the error term is that it is normally distributed, and the justification for assuming a normal distribution is the central limit theorem.