Assumptions of multiple regression

Which one of these is not correct (only one is)? I find them hard to tell apart…

  1. A linear relationship exists between the dependent and independent variables.

  2. The independent variables are random, and there is no exact linear relation between any two or more independent variables.

  3. The expected value of the error term is zero.

  4. The variance of the error terms is constant.

  5. The error for one observation is not correlated with that of another observation.

  6. The error term is normally distributed.

Number 1 is incorrect. We would like for our DV and IVs to be correlated, but this is not an assumption.

The rest are below:

  2. assumes a random sample and no perfect collinearity of x-variables

  3. assumes the model has no omitted terms (i.e. we have squared terms to account for curvature when we need it)

  4. assumes homoscedasticity for the error term

  5. assumes no autocorrelation

  6. normality assumption allows for formal testing (regardless of sample size)

In other words, 3, 4, 5, and 6 are usually given the shorthand e ~ iid N(0, Var(e)):

the errors are independent and identically normally distributed, with a mean of zero and a variance that is constant for all x.
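As a rough illustration of what that shorthand implies, here is a minimal sketch that simulates errors drawn iid from a normal distribution and checks the three properties in statements 3, 4, and 5. The seed, the sample size, the value of `sigma`, and the helper `lag1_autocorrelation` are all assumptions made for this example, not anything from the original question.

```python
import random
import statistics

random.seed(1)   # fixed seed so the run is repeatable
sigma = 2.0      # hypothetical constant standard deviation of the errors
errors = [random.gauss(0.0, sigma) for _ in range(10000)]

# Statement 3: the expected value of the error term is zero.
mean_error = statistics.fmean(errors)

# Statement 4: constant variance. Compare the two halves of the sample.
half = len(errors) // 2
var_first = statistics.pvariance(errors[:half])
var_second = statistics.pvariance(errors[half:])
variance_ratio = var_first / var_second


# Statement 5: errors are uncorrelated across observations.
def lag1_autocorrelation(values):
    """Sample correlation between each value and the one that follows it."""
    m = statistics.fmean(values)
    num = sum((a - m) * (b - m) for a, b in zip(values, values[1:]))
    den = sum((v - m) ** 2 for v in values)
    return num / den


rho = lag1_autocorrelation(errors)

print(mean_error, variance_ratio, rho)
```

With errors generated this way, the mean should land near zero, the variance ratio near one, and the lag-1 autocorrelation near zero. In practice the same checks would be run on regression residuals rather than on simulated errors.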

Sorry, number one is correct :)

So what’s the answer? Number 2?

Yes, number two is incorrect.

Should be

  2. The independent variables are not random, and there is no exact linear relation between any two or more independent variables.
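The "no exact linear relation" half of that statement can be illustrated with a small sketch. When one independent variable is an exact multiple of another, the matrix X'X used in the least-squares solution has a determinant of zero, so it cannot be inverted and the coefficients are not uniquely determined. The data values and the helpers `det3` and `gram_matrix` below are made up for this example.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def gram_matrix(rows):
    """Return X'X for a design matrix given as a list of rows."""
    k = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]

# Case A: x2 is an exact multiple of x1 (perfect collinearity).
x2_collinear = [2.0 * v for v in x1]
# Case B: x2 is related to x1, but not by an exact linear rule.
x2_distinct = [1.0, 4.0, 9.0, 16.0, 25.0]

# Each row of the design matrix is (intercept, x1, x2).
X_collinear = [[1.0, a, b] for a, b in zip(x1, x2_collinear)]
X_distinct = [[1.0, a, b] for a, b in zip(x1, x2_distinct)]

det_collinear = det3(gram_matrix(X_collinear))
det_distinct = det3(gram_matrix(X_distinct))

print(det_collinear)  # 0.0, so X'X is singular
print(det_distinct)   # 700.0, so X'X is invertible
```

Note that Case B is still highly correlated with x1. The assumption only rules out an exact linear relation, not correlation among the independent variables in general.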

Ah yes, my mistake. I zoned in on the statement and took it as a statistically significant relationship rather than _linearity in the parameters_ of the model. I don’t think I noticed they are _not_ random. Thanks for the catch!

Perhaps I should read more thoroughly.

Does anyone know whether it creates any problem if the independent variables are random? Thanks!

If you don’t know the value of your independent variables, how can you create a model to predict the dependent variable based on the independent variables? Sure, you can solve a bunch of equations and write down a formula, but in the end you won’t know whether the noise you see is noise in the inputs or noise in the output.

It seems a good waste of time.
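One way to read the point about noise in the inputs is as a measurement-error problem. The sketch below is an illustration under that assumption, not a general statement about random regressors. It fits a simple regression twice: once on the true input, and once on the same input observed with added noise. The true slope, the noise levels, the seed, and the helper `ols_slope` are all hypothetical choices for the example.

```python
import random


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


random.seed(0)    # fixed seed so the run is repeatable
n = 5000
true_slope = 2.0  # hypothetical value chosen for the illustration

# The input as it really is, and the output it generates (with output noise).
x_true = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [true_slope * v + random.gauss(0.0, 0.5) for v in x_true]

# The same input as we would observe it if it were measured with noise.
x_observed = [v + random.gauss(0.0, 1.0) for v in x_true]

slope_clean = ols_slope(x_true, y)
slope_noisy = ols_slope(x_observed, y)

print(round(slope_clean, 2), round(slope_noisy, 2))
```

With these settings the slope estimated from the noisy input should come out at roughly half the true slope, because the input noise has the same variance as the input itself. The regression has no way to tell that the extra scatter came from the input rather than the output.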

Got it. Thanks magician!

You’re welcome.