Multicollinearity: Can anyone explain it in their own words better than CFAI?

CFACPACFP · May 10, 2013, 4:46pm

Explain what multicollinearity is and how to determine if it’s present.

pierrewoodman_fan · May 10, 2013, 4:53pm

can someone also explain what ARCH is?

CFACPACFP · May 10, 2013, 4:56pm

What’s ARCH stand for?

mfreema2 · May 10, 2013, 5:41pm

The Schweser guy explained it very well in class with an analogy: multicollinearity is when two independent variables are highly correlated and are competing to explain the dependent variable. Example: explain weight (dependent) by looking at height and sex. Obviously, height can explain weight, but so can sex. The two are competing to explain why someone weighs more or less; that’s multicollinearity.

If someone can toss us some bones and give analogies for all the other crazy terms in Quant, I’d appreciate it.

CFACPACFP · May 10, 2013, 5:50pm

Excellent analogy. Now how is that determined…

Gordon_Gecko · May 10, 2013, 5:55pm

It’s present when you have a high R² and a significant F-statistic but insignificant t-statistics. I think of it this way: how can you have a good hockey team (high R² and significant F) made up of crappy players (insignificant t’s)? Answer: you can’t!
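For anyone who wants to see that pattern in numbers, here is a rough sketch on simulated data. Everything in it is made up for illustration: the helper name `ols_summary`, the sample size, the seed, and the two nearly identical predictors `x1` and `x2`. It fits the regression with plain NumPy least squares and reports R², the F-statistic, and the slope t-statistics.

```python
import numpy as np

def ols_summary(X, y):
    """Fit y = b0 + X @ b by least squares.

    Returns R^2, the overall F-statistic, and the t-statistics of the slopes.
    """
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])            # add an intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = resid @ resid                             # unexplained variation
    sst = ((y - y.mean()) ** 2).sum()               # total variation
    r2 = 1.0 - sse / sst
    f_stat = (r2 / k) / ((1.0 - r2) / (n - k - 1))
    sigma2 = sse / (n - k - 1)                      # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)           # coefficient covariance matrix
    t_stats = beta[1:] / np.sqrt(np.diag(cov)[1:])  # slopes only, skip intercept
    return r2, f_stat, t_stats

# Hypothetical data: x2 is almost a copy of x1, so the two "compete".
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)

r2, f_stat, t_stats = ols_summary(np.column_stack([x1, x2]), y)
print(f"R^2 = {r2:.3f}, F = {f_stat:.1f}, slope t-stats = {np.round(t_stats, 2)}")
```

On data built like this you would expect a high R² and a very large F, while the individual t-statistics stay small, because the regression can't tell `x1` and `x2` apart and the standard errors of both slopes blow up.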

mfreema2 · May 10, 2013, 7:27pm

Best analogy so far. Let’s keep this going. Analogy for heteroskedasticity…

S2000magician · May 11, 2013, 5:14am

Conditional, or unconditional?

S2000magician · May 11, 2013, 5:16am

Answer: the whole is greater than the sum of its parts.

jaychou · May 11, 2013, 5:00pm

Since unconditional isn’t doing any harm, let’s talk about conditional.

S2000magician · May 11, 2013, 6:10pm

I drive from Orange County, CA to the Mojave Desert, and note how rough the ride is along the way. In Orange County, the roads are all paved, the highways are smooth, so my Honda S2000 can breeze along at 120 mph (Oops . . . did I really write that? Of course, I meant 65 mph.), top down, with no problems. But when I get to the desert, the roads are unpaved, and rough, with bumps and dips and big rocks and so on; I’m lucky if I can drive 10 mph without risking my suspension, and, of course, I have to have the top up because of all of the dust.

If I were to try to characterize the average bumpiness of the ride, I would be way off; the bumpiness depends heavily on whether I’m at the start of my drive or the end of it.

How’s that?
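To put the road analogy in regression terms, here is a minimal sketch, assuming simulated data in which the spread of the errors grows with the independent variable (the ride gets bumpier the farther along the drive). It uses one common check, the Breusch-Pagan test: regress the squared residuals on the independent variable and compare n × R² from that regression with a chi-square critical value (3.84 for one degree of freedom at the 5% level). The function name `breusch_pagan_lm`, the seed, and all the numbers are hypothetical.

```python
import numpy as np

def breusch_pagan_lm(x, y):
    """Return n * R^2 from regressing squared OLS residuals on x (one regressor)."""
    slope, intercept = np.polyfit(x, y, 1)      # ordinary least-squares line
    resid = y - (intercept + slope * x)
    # With a single regressor, the auxiliary R^2 equals the squared
    # correlation between x and the squared residuals.
    r2_aux = np.corrcoef(x, resid ** 2)[0, 1] ** 2
    return len(x) * r2_aux

rng = np.random.default_rng(1)
n = 500
x = np.linspace(0.0, 1.0, n)    # position along the drive: 0 = start, 1 = desert
shocks = rng.normal(size=n)

y_smooth = 1.0 + 2.0 * x + 1.0 * shocks              # same bumpiness everywhere
y_rough = 1.0 + 2.0 * x + (0.1 + 3.0 * x) * shocks   # bumpiness grows with x

CHI2_CRIT_1DF_5PCT = 3.84       # chi-square critical value, 1 df, 5% level
lm_smooth = breusch_pagan_lm(x, y_smooth)
lm_rough = breusch_pagan_lm(x, y_rough)
print(f"constant variance: LM = {lm_smooth:.2f}")
print(f"variance grows with x: LM = {lm_rough:.2f} (critical value {CHI2_CRIT_1DF_5PCT})")
```

When the statistic is well above the critical value, as it should be for the second series, the error variance is related to the independent variable. That is the conditional kind, and it is the reason the usual standard errors and t-statistics can't be trusted without a correction.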

jaychou · May 11, 2013, 6:17pm

Jeez, so now I know where the S2000 prefix comes from. What about changing it to SLKmagician?

OK, to business… not a bad example, but I think it would make more sense to all of us if we could somehow integrate the financial terms, like t-statistic or standard error, into it.

S2000magician · May 11, 2013, 6:34pm

In 2000, Road & Track magazine did a comparison test amongst a number of roadsters. Their original idea was to include the Porsche Boxster, BMW Z3, Audi TT, and Mercedes-Benz SLK280 (I think it was a 280: the first SLK). They decided that they’d like to have a fifth car in their comparison, so at the last minute they added a Honda S2000.

After all of the testing and comparing, some of it objective (0–60 times, lap times, lateral acceleration), some of it subjective (styling, comfort, roominess), the winner was no surprise: the Boxster.

The car they tossed in to round out the field – the S2000 – came in second. _Second._

The margin of victory: the Boxster had more trunk space.

I believe that the SLK came in last, but don’t quote me on that. I’d have to look it up.

I liked their summary of the Honda: If you’re driving this car with the top up, the storm outside had better have a name.

(By the way, mine’s an '01 and it just turned over 200,000 miles. Original engine. I doubt that the German cars would make it that far, and I used to own a 911S.)

Um . . . it’s an analogy.

(You don’t ask for _much_, do you?)

Wendy · May 12, 2013, 12:41am

I don’t think that that’s a very good example. Height and sex would be very appropriate variables to combine into the same model to predict weight.

We run into multicollinearity when two variables are providing nearly-redundant information; like combining, say, a person’s 1) height and 2) pants inseam length, into the same model. These variables would have a correlation of perhaps 90%, and using them together is likely to lead to multicollinearity.
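A quick way to see the difference between "related" and "nearly redundant" is the variance inflation factor (VIF). Here is a small sketch on simulated data; the helper `vif_two_predictors`, the seed, and every parameter for height, inseam, and sex are invented for illustration, not real measurements. With exactly two predictors in the model, the VIF reduces to 1 / (1 − r²), where r is their correlation.

```python
import numpy as np

def vif_two_predictors(a, b):
    """Return (r, VIF) for a model with exactly two predictors: VIF = 1 / (1 - r^2)."""
    r = np.corrcoef(a, b)[0, 1]
    return r, 1.0 / (1.0 - r ** 2)

rng = np.random.default_rng(2)
n = 1000
male = rng.integers(0, 2, size=n)                        # 0 = female, 1 = male
height = 162.0 + 13.0 * male + rng.normal(0.0, 7.0, n)   # cm, made-up parameters
inseam = 0.45 * height + rng.normal(0.0, 2.0, n)         # cm, made-up parameters

r_sex, vif_sex = vif_two_predictors(height, male)
r_inseam, vif_inseam = vif_two_predictors(height, inseam)
print(f"height vs sex:    r = {r_sex:.2f}, VIF = {vif_sex:.2f}")
print(f"height vs inseam: r = {r_inseam:.2f}, VIF = {vif_inseam:.2f}")
```

Under these assumptions, height and sex are correlated but each still carries its own information, so the VIF stays low; height and inseam are close to the same measurement, so the VIF is several times larger. Common rules of thumb start to worry when a VIF gets above roughly 5 or 10.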