Assumptions of multiple regression

FrankCFA · May 13, 2014, 3:15am

Which one is not correct (only one)? I find it hard to distinguish…

1. A linear relationship exists between the dependent and independent variables.
2. The independent variables are random, and there is no exact linear relation between any two or more independent variables.
3. The expected value of the error term is zero.
4. The variance of the error terms is constant.
5. The error for one observation is not correlated with that of another observation.
6. The error term is normally distributed.

tickersu · May 13, 2014, 3:35am

Number 1 is incorrect. We would like for our DV and IVs to be correlated, but this is not an assumption.

The rest are below:

assumes a random sample and no perfect collinearity of x-variables
assumes the model has no omitted terms (i.e. we have squared terms to account for curvature when we need it)
assumes homoscedasticity for the error term
assumes no autocorrelation
normality assumption allows for formal testing (regardless of sample size)

In other words, 3, 4, 5, and 6 are usually given the shorthand e ~ iid N(0, Var(e)):

errors are independently and identically normally distributed, with a mean of zero and a variance that is constant for all x
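
For anyone who wants to see the shorthand in action, here is a minimal sketch in Python with NumPy. It simulates data whose errors really are iid normal, fits the regression by least squares, and runs a few rough checks on the residuals. The coefficients, sample size, seed, and variable names are all made up for illustration, and the checks are informal eyeball diagnostics, not formal tests.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
n = 500

# Two hypothetical independent variables, drawn once and then treated as fixed values
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)

# Errors drawn iid from N(0, 2^2): zero mean, constant variance, no autocorrelation
e = rng.normal(0.0, 2.0, n)

# Dependent variable that is linear in the (made-up) parameters 1.0, 0.5, -1.5
y = 1.0 + 0.5 * x1 - 1.5 * x2 + e

# Ordinary least squares fit with an intercept column
X = np.column_stack([np.ones(n), x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# With an intercept in the model, the residual mean is zero by construction
resid_mean = residuals.mean()

# Rough autocorrelation check: correlation between neighbouring residuals
lag1_corr = np.corrcoef(residuals[:-1], residuals[1:])[0, 1]

# Rough constant-variance check: residual variance for high vs. low x1
split = np.median(x1)
var_ratio = residuals[x1 >= split].var(ddof=1) / residuals[x1 < split].var(ddof=1)

print("estimated coefficients:", beta_hat)
print("residual mean:", resid_mean)
print("lag-1 residual correlation:", lag1_corr)
print("variance ratio (high x1 / low x1):", var_ratio)
```

When the assumptions hold, the lag-1 correlation should be close to 0 and the variance ratio close to 1. Changing the error line to something like `e = rng.normal(0.0, 0.5 * x1 + 0.1)` would be one way to break homoscedasticity and watch the ratio move.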

FrankCFA · May 13, 2014, 6:38am

Sorry, number one is correct

kjsgbp · May 13, 2014, 1:18pm

So what’s the answer? Number 2?

FrankCFA · May 13, 2014, 2:53pm

Yes, number two is incorrect.

Should be

The independent variables are not random, and there is no exact linear relation between any two or more independent variables.
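
The "no exact linear relation" half of that statement can be checked numerically. A minimal sketch, assuming NumPy: stack the independent variables (plus an intercept column) into a matrix and compare its rank with its number of columns. The helper name `has_exact_linear_relation` and the sample data are hypothetical, and the rank test depends on NumPy's default floating-point tolerance, so treat it as an illustration and not a rigorous test.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
n = 50

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - 3.0 * x2  # an exact linear combination of x1 and x2


def has_exact_linear_relation(*columns):
    """Return True if the design matrix (with intercept) is rank deficient."""
    X = np.column_stack([np.ones(len(columns[0])), *columns])
    return bool(np.linalg.matrix_rank(X) < X.shape[1])


print(has_exact_linear_relation(x1, x2))      # x1 and x2 alone
print(has_exact_linear_relation(x1, x2, x3))  # x3 adds an exact linear relation
```

If the second call reports a deficiency, the least-squares coefficients are not uniquely determined, which is why the assumption is needed.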

tickersu · May 13, 2014, 3:27pm

Ah yes, my mistake. I zoned in on the statement and took it as a statistically significant relationship rather than _linearity in the parameters_ of the model. I don’t think I noticed they are _not_ random. Thanks for the catch!

Perhaps I should read more thoroughly

FrankCFA · May 13, 2014, 5:35pm

Does anyone know whether it creates any problem if the independent variables are random? Thanks!

S2000magician · May 13, 2014, 6:32pm

If you don’t know the value of your independent variables, how can you create a model to predict the dependent variable based on the independent variables? Sure, you can solve a bunch of equations and write down a formula, but in the end you won’t know whether the noise you see is noise in the inputs or noise in the output.

It seems a good waste of time.
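
One way to picture the "noise in the inputs" concern is a small simulation. This is only a sketch under an assumption that goes beyond the post above: that "random" here means the independent variable is observed with measurement noise. The true slope, noise levels, and seed are made up, and NumPy's `polyfit` is used for the simple one-variable fit.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
n = 5000

x_true = rng.normal(0.0, 1.0, n)           # the actual input values
y = 2.0 * x_true + rng.normal(0.0, 0.5, n)  # true slope is 2 (made up)

# Same inputs, but observed with added noise of the same variance as x_true
x_noisy = x_true + rng.normal(0.0, 1.0, n)

# np.polyfit returns coefficients from highest degree down, so [0] is the slope
slope_clean = np.polyfit(x_true, y, 1)[0]
slope_noisy = np.polyfit(x_noisy, y, 1)[0]

print("slope using the true inputs: ", slope_clean)
print("slope using the noisy inputs:", slope_noisy)
```

In this setup the slope fitted on the noisy inputs comes out well below the slope fitted on the true inputs, and looking at the scatter alone you cannot tell how much of it came from the inputs and how much from the output.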

FrankCFA · May 14, 2014, 2:39am

Got it. Thanks magician!

S2000magician · May 14, 2014, 3:51am

You’re welcome.