Linear regression

Hello

In a simple linear regression we have

Y = b0 + b1X + ε, where ε is a random variable.

But when we use a sample, we have

Yi = b0 + b1Xi + εi, i = 1, …, n, where the εi are random variables too.

I’m a little bit lost: ε is a random variable, and its occurrences εi are random variables too?

And why would E(ε) = 0 imply E(εi) = 0 for each i?

Suppose that you have three data points:

  • (1, 1)
  • (2, 6)
  • (3, 5)

The regression equation is:

y = 2.0x + 0.0 + ε

In this case:

  • ε1 = −1.0
  • ε2 = 2.0
  • ε3 = −1.0
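As a quick check of the numbers above, here is a minimal sketch that fits the least-squares line to the three points and computes the error for each observation. The function name `fit_simple_ols` is just an illustrative helper, not something from a library. Strictly speaking, the values computed from a fitted line are residuals, that is, the observed counterparts of the error terms.

```python
# Three data points from the post above.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 6.0, 5.0]


def fit_simple_ols(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = fit_simple_ols(xs, ys)

# Error for observation i: observed y minus the value on the fitted line.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

print(b0, b1)      # 0.0 2.0
print(residuals)   # [-1.0, 2.0, -1.0]
```

Each error is "observed minus fitted", which is why the first one is negative: the point (1, 1) lies below the line at x = 1.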

Yes, as S2000 says, the " i " indexes the observations in the data, which can be time-series or cross-sectional. For example, let the variable X be consumption. Then Xi is the " i-th " observation of consumption in your sample:

X1 = 50 dollars, consumption of Harrogath in August 2015

X2 = 75 dollars, consumption of S2000Magician in August 2015

Xn = 62 dollars, consumption of Javad05 in August 2015

This is cross-sectional data for the variable X.

Time-series data for the variable X would be:

X1 = 62 dollars, consumption of Javad05 in August 2015

X2 = 70 dollars, consumption of Javad05 in September 2015

Xn = 128 dollars, consumption of Javad05 in January 2038

Regards

Based on the equation, shouldn’t the signs be flipped on the above ε’s?

(1) = 2.0(1) + 0.0 + ε1

ε1 = −1

(6) = 2.0(2) + 0.0 + ε2

ε2 = +2

Absolutely correct.

Mea culpa.

I’ve corrected it.

Thank you guys

but I was not asking about that!

If you look at the assumptions of linear regression (6):

- The variance of the error term is the same for all observations: E(εi²) = σε², i = 1, …, n.

- The error term, ε, is uncorrelated across observations. Consequently, E(εiεj) = 0 for all i not equal to j.

You can see that the εi are not only values but random variables in their own right. So do I have to dwell on that, or move on? Thank you.
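One way to see that these two assumptions are statements about random variables, not about fixed numbers, is to simulate many samples and average across them. The sketch below assumes independent normal errors with a made-up standard deviation `SIGMA`; the normal distribution, the seed, and all names here are illustrative assumptions, not part of the regression assumptions themselves.

```python
import random

random.seed(0)

SIGMA = 2.0          # hypothetical error standard deviation, so sigma^2 = 4
N_OBS = 3            # observations per sample (i = 1, 2, 3)
N_SAMPLES = 100_000  # number of repeated samples

# Each inner list is one sample's error terms (e1, e2, e3).
draws = [
    [random.gauss(0.0, SIGMA) for _ in range(N_OBS)]
    for _ in range(N_SAMPLES)
]


def mean(values):
    return sum(values) / len(values)


# Estimate E(ei^2) separately for each position i across all samples.
second_moments = [mean([d[i] ** 2 for d in draws]) for i in range(N_OBS)]

# Estimate E(e1 * e2), the cross moment between two different observations.
cross_moment_12 = mean([d[0] * d[1] for d in draws])

print(second_moments)   # each should be close to SIGMA ** 2
print(cross_moment_12)  # should be close to 0
```

The averages are taken over repeated samples for a fixed i, which only makes sense if εi is itself a random variable.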

The outcome of the roll of a die is a random variable, but once you roll the die, it takes on a specific value.

So it goes with linear regression: the error term is a random variable, but once you run the regression with a specific data set, each random error term then takes on a specific value.
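The die analogy can be sketched in code. Before the data are drawn, each εi is a random variable with mean zero; once one sample is drawn, it is just a number. The "true" parameters `B0`, `B1`, `SIGMA` and the normal error distribution below are hypothetical choices for illustration only.

```python
import random

rng = random.Random(42)

B0, B1, SIGMA = 0.0, 2.0, 1.5   # hypothetical "true" parameters
XS = [1.0, 2.0, 3.0]


def draw_errors():
    """Draw one error term per observation, all from the same distribution."""
    return [rng.gauss(0.0, SIGMA) for _ in XS]


# One "roll of the die": a single sample gives three specific error values.
one_sample = draw_errors()
ys = [B0 + B1 * x + e for x, e in zip(XS, one_sample)]
print(one_sample)

# Many rolls: for each position i, average the error across samples.
many = [draw_errors() for _ in range(50_000)]
mean_by_position = [
    sum(sample[i] for sample in many) / len(many)
    for i in range(len(XS))
]
print(mean_by_position)  # each entry should be close to 0
```

Because every εi is drawn from the same distribution as ε, E(ε) = 0 carries over to E(εi) = 0 for each i, even though any single realized value is generally not zero.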