I found a very interesting assumption of a multiple regression model. It states:
“The independent variables are not random, and there is no exact linear relation between any two or more independent variables”
What exactly is this trying to say? It rules out an exact linear relation between the variables, yet it seems to leave room for some linearity… doesn’t that imply multicollinearity is allowed?
I could be wrong, but I think ‘some’ linearity between the independent variables could be due to spurious correlation, i.e. trying to explain the price of paper through annual growth in GDP and annual growth in tree height… GDP, more often than not, grows a little every year, and so does a tree, but neither helps explain the other. Hence, the positive ‘correlation’ between the two is spurious…
Edit: this is a very silly example, I know…
I’m not sure when multicollinearity becomes a problem: when ρ = 0.6? 0.7? 0.8? 0.9?
Clearly, if ρ = 1.0, there’s a problem. There’s likely a problem when ρ is slightly less than 1.0, but it’s hard to make an assumption like, “There’s no approximate linear relation between any two or more independent variables” without having to define what level of approximation is close enough.
I, for one, wouldn’t worry about it. For the exam, at least.
Two or more independent variables are correlated → multicollinearity.
Consequences: the coefficient estimates are consistent (but unreliable); standard errors are inflated (overestimated); and you get too many Type II errors.
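To see that standard-error effect, here’s a rough Python simulation (entirely made-up data, using numpy and statsmodels; not from the curriculum): as the correlation between the two regressors rises, the standard errors on both slope coefficients blow up.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500

for rho in [0.0, 0.6, 0.9, 0.99]:
    # Draw two regressors with correlation rho
    cov = [[1.0, rho], [rho, 1.0]]
    X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1] + rng.normal(size=n)
    fit = sm.OLS(y, sm.add_constant(X)).fit()
    # bse holds the standard errors of [const, b1, b2]
    print(f"rho={rho:4.2f}  se(b1)={fit.bse[1]:.3f}  se(b2)={fit.bse[2]:.3f}")
```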
The statement implies that there is no perfect linear relationship between two variables or among a group of variables (x1 as a function of x2, or x1 as a function of x2, x3, …, xi). If there were such a perfect relationship, the matrix used to estimate the model would be singular, and therefore the model would not be estimable. So this assumption helps to ensure the estimability of the model.
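Here’s a tiny numpy sketch of that singularity point (toy numbers; x2 is exactly 2·x1, so the X′X matrix that OLS needs to invert is rank-deficient):

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = 2.0 * x1                        # exact linear relation: x2 = 2*x1
X = np.column_stack([np.ones_like(x1), x1, x2])

XtX = X.T @ X
print(np.linalg.matrix_rank(XtX))    # 2, not 3 -> singular
try:
    np.linalg.inv(XtX)               # OLS coefficients need this inverse
except np.linalg.LinAlgError as err:
    print("Model not estimable:", err)
```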
Multicollinearity, as another poster said, is any correlation between independent variables. It can present issues at both low and high degrees of correlation, though the problems are much more likely to arise at the high end. Variance inflation factors (VIFs) are commonly reported in regression output, with an informal threshold of 10 for a VIF being “too high” (other metrics and techniques can also be used, since multicollinearity can be problematic at lower correlations too). There are a variety of ways to evaluate multicollinearity, though I’m pretty sure the CFAI doesn’t require candidates to know these methods in much depth.
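If you’re curious, here’s a rough sketch of checking VIFs in Python with statsmodels (the data and column names are invented for illustration):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.normal(size=200),
                   "x2": rng.normal(size=200)})
# x3 is almost a linear combination of x1 and x2 -> expect a huge VIF
df["x3"] = 0.7 * df["x1"] + 0.7 * df["x2"] + 0.1 * rng.normal(size=200)

X = sm.add_constant(df)              # include the intercept, as in the regression
for i, col in enumerate(X.columns):
    if col != "const":
        print(f"VIF({col}) = {variance_inflation_factor(X.values, i):.1f}")
```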
Hope this helps
To be short: there is no hard-and-fast, widely accepted cutoff. The bivariate correlations between the independent variables might be low, but the multivariate correlation, i.e. the R-squared from regressing xi on all the remaining x’s, could still be high.
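That multivariate angle is exactly what the VIF captures: VIF_i = 1 / (1 − R_i²), where R_i² comes from regressing x_i on all the other independent variables. A quick numpy/statsmodels check with invented data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
x2 = rng.normal(size=300)
# x3 depends on x1 and x2 jointly, though each pairwise correlation is modest
x3 = 0.5 * x1 + 0.5 * x2 + 0.5 * rng.normal(size=300)

# Auxiliary regression: x3 on the remaining independent variables
aux = sm.OLS(x3, sm.add_constant(np.column_stack([x1, x2]))).fit()
r2 = aux.rsquared
print(f"R^2 of x3 on x1, x2: {r2:.3f}")
print(f"VIF(x3) = 1/(1 - R^2) = {1.0 / (1.0 - r2):.2f}")
```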
Assessing multicollinearity involves looking at several measures. If that helps at all!