Multicollinearity: Can anyone explain it in their own words better than CFAI?

Explain what multicollinearity is and how to determine if it’s present.

can someone also explain what ARCH is?

What’s ARCH stand for?

The Schweser guy explained it very well in class with an analogy: multicollinearity is when two independent variables are highly correlated and are competing to explain the dependent variable. Example: explain weight (dependent) by looking at height and sex. Obviously, height can explain weight, but so can sex. The two are competing to explain why someone weighs more or less; that’s multicollinearity.

If someone can toss us some bones and give analogies for all the other crazy terms in Quant, I’d appreciate it.

Excellent analogy. Now how is that determined…

It’s present when you have a high R² and a significant F-statistic but insignificant t-statistics on the individual coefficients. I think of it this way: how can you have a good hockey team (high R² and significant F) made up of crappy players (insignificant t-stats)? Answer: you can’t!
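If it helps to see that symptom in numbers, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the variables `x1` and `x2`, the sample size, the noise levels, and the helper `ols_summary` are all made up, not anything from the curriculum. The second regressor is built as a near-copy of the first, so the pair explains `y` well together while neither coefficient is pinned down on its own.

```python
import numpy as np

def ols_summary(X, y):
    """Least-squares fit with intercept; returns R^2, F-statistic,
    slope t-statistics, and slope standard errors."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])           # add the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = resid @ resid                            # unexplained variation
    sst = np.sum((y - y.mean()) ** 2)              # total variation
    r2 = 1.0 - sse / sst
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    sigma2 = sse / (n - k - 1)                     # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return r2, f_stat, (beta / se)[1:], se[1:]

# Hypothetical data: x2 is almost a copy of x1 (assumed for illustration).
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)

r2, f_stat, t_both, se_both = ols_summary(np.column_stack([x1, x2]), y)
_, _, t_single, se_single = ols_summary(x1[:, None], y)

print("R^2:", round(r2, 3), " F:", round(f_stat, 1))
print("t-stats with both regressors:", np.round(t_both, 2))
print("t-stat with x1 alone:", np.round(t_single, 2))
print("SE of x1 coefficient, both vs alone:", se_both[0], se_single[0])
```

With a setup like this, the F-statistic should come out large while the individual t-statistics are usually small, because the standard error on `x1` is inflated many times over once `x2` is in the model. Drop `x2` and the t-statistic on `x1` typically becomes large again.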

Best analogy so far. Let’s keep this going. Analogy for heteroskedasticity…

Conditional, or unconditional?

Answer: the whole is greater than the sum of its parts.

Since unconditional isn’t doing any harm, let’s talk about conditional.

I drive from Orange County, CA to the Mojave Desert, and note how rough the ride is along the way. In Orange County, the roads are all paved, the highways are smooth, so my Honda S2000 can breeze along at 120 mph (Oops . . . did I really write that? Of course, I meant 65 mph.), top down, with no problems. But when I get to the desert, the roads are unpaved, and rough, with bumps and dips and big rocks and so on; I’m lucky if I can drive 10 mph without risking my suspension, and, of course, I have to have the top up because of all of the dust.

If I were to try to characterize the average bumpiness of the ride, I would be way off; how bumpy it is depends heavily on whether I’m at the start of my drive or the end of it.

How’s that?

jeez, so now i know where the prefix S2000 comes from :wink: what about considering changing it to SLKmagician?

OK, to business… not a bad example, but I think it would make more sense to all of us if we could somehow integrate financial terms like the t-statistic or standard error into it.
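One way to tie the road analogy to standard errors and t-statistics is a small simulation. This is only a sketch under made-up assumptions: `x` stands in for how far into the drive you are, and the error spread is assumed to grow in proportion to `x` (smooth early, rough late). It compares the residual variance in the two halves of the trip, then computes the usual OLS standard errors alongside White (HC0) heteroskedasticity-robust ones.

```python
import numpy as np

# Hypothetical data: error spread grows with x (assumed for illustration).
rng = np.random.default_rng(1)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
errors = rng.normal(size=n) * 0.5 * x              # bumpier as x increases
y = 1.0 + 2.0 * x + errors

A = np.column_stack([np.ones(n), x])               # intercept and slope columns
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta

# Residual variance on the "smooth" half versus the "rough" half of the drive.
cutoff = np.median(x)
var_smooth = resid[x < cutoff].var()
var_rough = resid[x >= cutoff].var()

# Conventional standard errors assume one common error variance.
xtx_inv = np.linalg.inv(A.T @ A)
sigma2 = resid @ resid / (n - 2)
se_conventional = np.sqrt(np.diag(sigma2 * xtx_inv))

# White (HC0) standard errors let each observation have its own variance.
meat = A.T @ (A * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(xtx_inv @ meat @ xtx_inv))

print("residual variance, smooth vs rough:", var_smooth, var_rough)
print("slope t-stat, conventional:", beta[1] / se_conventional[1])
print("slope t-stat, robust:", beta[1] / se_robust[1])
```

The single "average bumpiness" number is `sigma2`, and it describes neither half of the trip well. Since the conventional standard errors are built on that one number, the t-statistics that come from them can't be fully trusted when the spread of the errors depends on `x`.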

In 2000, Road & Track magazine did a comparison test amongst a number of roadsters. Their original idea was to include the Porsche Boxster, BMW Z3, Audi TT, and Mercedes-Benz SLK280 (I think it was a 280: the first SLK). They decided that they’d like to have a fifth car in their comparison, so at the last minute they added a Honda S2000.

After all of the testing and comparing – some of it objective (0 – 60 times, lap times, lateral acceleration), some of it subjective (styling, comfort, roominess), the winner was no surprise: the Boxster.

The car they tossed in to round out the field – the S2000 – came in second. _Second._

The margin of victory: the Boxster had more trunk space.

I believe that the SLK came in last, but don’t quote me on that. I’d have to look it up.

I liked their summary of the Honda: If you’re driving this car with the top up, the storm outside had better have a name.

(By the way, mine’s an '01 and it just turned over 200,000 miles. Original engine. I doubt that the German cars would make it that far, and I used to own a 911S.)

Um . . . it’s an analogy.

(You don’t ask for _much_, do you? :wink:)

I don’t think that that’s a very good example. Height and sex would be very appropriate variables to combine into the same model to predict weight.

We run into multicollinearity when two variables are providing nearly redundant information; like combining, say, a person’s 1) height and 2) pants inseam length, into the same model. These variables would have a correlation of perhaps 90%, and using them together is likely to lead to multicollinearity.
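A common way to put a number on that redundancy is the variance inflation factor (VIF): regress each predictor on the others and compute 1 / (1 - R²). Here is a rough sketch on simulated data. The height, inseam, and age figures are invented so that height and inseam correlate at roughly 0.9, and `vif` is a hypothetical helper written for this post, not a library function.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (one column per predictor)."""
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # regress column j on the rest
        beta, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

# Hypothetical sample: inseam tracks height closely, age is unrelated.
rng = np.random.default_rng(2)
n = 500
height = rng.normal(170.0, 10.0, size=n)           # cm, assumed
inseam = 0.45 * height + rng.normal(0.0, 2.2, size=n)
age = rng.normal(40.0, 12.0, size=n)

X = np.column_stack([height, inseam, age])
r = np.corrcoef(height, inseam)[0, 1]
vifs = vif(X)

print("corr(height, inseam):", round(r, 3))
print("VIF height, inseam, age:", np.round(vifs, 2))
```

With only two predictors, the VIF reduces to 1 / (1 - r²), so a correlation of 0.9 gives about 5.3. Rules of thumb vary, but values above roughly 5 to 10 are often treated as a warning sign. Height and sex would land far below that, which is the point being made above.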