What is covariance and why is it important?

I understand what variance and standard deviation are, and how to calculate them

I understand the theory of what correlation is, and why it ranges from -1 to +1

I can’t seem to figure out what covariance is. Can someone explain it to me?

You can’t calculate correlation directly. You have to calculate covariance first, then divide it by the product of the standard deviation of A and the standard deviation of B to get correlation.
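To make that recipe concrete, here is a minimal Python sketch. The data and the helper names (`covariance`, `std_dev`, `correlation`) are made up for illustration, and it assumes the sample (n - 1) versions of the formulas.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(a, b):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    mean_a, mean_b = mean(a), mean(b)
    cross = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cross / (len(a) - 1)

def std_dev(a):
    # Standard deviation is the square root of the variance,
    # and the variance is the covariance of a with itself.
    return math.sqrt(covariance(a, a))

def correlation(a, b):
    # Covariance divided by the product of the two standard deviations.
    return covariance(a, b) / (std_dev(a) * std_dev(b))

# Hypothetical data, chosen only to keep the arithmetic easy to follow.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 5.0, 4.0, 5.0]

cov_ab = covariance(a, b)
corr_ab = correlation(a, b)
print("covariance:", cov_ab)
print("correlation:", corr_ab)
```

The covariance comes out in "units of A times units of B", while the correlation is a plain number between -1 and +1.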

By itself, covariance is relatively useless.

In a similar vein, should I look at variance vs. standard deviation in the same light?

Pretty much. Variance is important to understand and is used in many calculations, but standard deviation is easier to apply.

Not exactly. However, you should look at variance and covariance in the same light.

You cannot calculate standard deviation without variance, and you cannot calculate correlation without covariance.

Variance and covariance both have that useless unit (unit^2), which cannot be interpreted. The book said covariance shows the direction of the relationship (whether it’s positive or negative), whereas correlation shows both the direction and the intensity of the relationship (how strong it is).

I look at covariance vs. correlation this way…

Covariance between A & B is 10000000000000000000000000000000000000000000.

Correlation between M & N is .90.

We can certainly say that M and N are strongly positively correlated.

However, we can only conclude that A & B tend to move together. We cannot comment on the strength of the relationship between A & B without knowing the correlation between A & B…
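A quick sketch of the same point in Python, with made-up numbers: a huge covariance can go with a weak relationship, and a tiny covariance can go with a very strong one. The variable names and data are purely illustrative.

```python
import math

def covariance(a, b):
    """Sample covariance (divides by n - 1)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cross = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cross / (len(a) - 1)

def correlation(a, b):
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))

x = [1.0, 2.0, 3.0, 4.0, 5.0]

# Loosely related to x, but measured in very large numbers.
y_loose = [3.0e6, 1.0e6, 4.0e6, 1.0e6, 5.0e6]
# Tightly related to x, but measured in very small numbers.
y_tight = [0.0011, 0.0019, 0.0031, 0.0040, 0.0049]

cov_loose, corr_loose = covariance(x, y_loose), correlation(x, y_loose)
cov_tight, corr_tight = covariance(x, y_tight), correlation(x, y_tight)

print("loose:", cov_loose, corr_loose)
print("tight:", cov_tight, corr_tight)
```

Both covariances are positive, so both pairs tend to move together, but only the correlations tell you which relationship is the strong one.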

Hope it helped ;)

Variance is covariance of an object with itself.

Of a _random variable_ with itself.
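This is easy to check numerically. The sketch below uses hypothetical data and a hand-written `covariance` helper (an illustrative name), and compares the result against `statistics.variance` from the Python standard library, which computes the sample variance.

```python
import math
import statistics

def covariance(a, b):
    """Sample covariance (divides by n - 1)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cross = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cross / (len(a) - 1)

# Hypothetical observations of a single random variable.
x = [4.0, 7.0, 1.0, 9.0, 4.0, 6.0]

cov_xx = covariance(x, x)
var_x = statistics.variance(x)

print("cov(x, x):", cov_xx)
print("var(x):   ", var_x)
print("equal:", math.isclose(cov_xx, var_x))
```

In the covariance formula, each cross-deviation (x - mean)(x - mean) is just the squared deviation, which is exactly what variance averages.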

thanks sir…

True if the two random variables have the same units. A better way of saying it is that the units on covariance are the product of the units on the two random variables: if X is measured in feet and Y is measured in lbs., then COV(X, Y) is measured in ft.-lbs.

I always teach covariance as “cross-deviations”, so the units it is denominated in are “cross-units.”

Here’s an example:

You would expect some kind of relationship between a person’s height and their weight. So I went to the roster for the San Antonio Spurs, entered each player’s height (in inches) and weight (in pounds) into Excel, and got the statistics. (E.g., Tim Duncan is 83 inches tall and weighs 250 pounds.)

Covariance was 73, and correlation was .92.


Then, I re-ran the exact same data, but this time I converted the height into feet and the weight into ounces. (E.g., Tim Duncan is 6.92 feet tall and weighs 4000 ounces.)

Covariance was about 97.3, and correlation was .92.


Then I ran the data again, but this time I converted the height into centimeters and the weight into kilograms. (E.g., Tim Duncan is 210.8 cm tall and weighs 113.4 kg.)

Covariance was about 84.1, and correlation was .92.


As you can see, covariance changes every time you change the units. In inch-pounds, it was 73, but in foot-ounces it was about 97.3, and in centimeter-kilograms it was about 84.1. (Not very useful. And what does “inch-pounds” mean, anyway?)

But when you calculate correlation (by dividing covariance by the product of the standard deviations), you get .92 every time. Why? Because you’re using the exact same data, and the conversion factors that rescale the covariance rescale the standard deviations by the same amount, so they cancel.
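If you want to try the experiment yourself without a roster, here is a sketch in Python. The heights and weights below are hypothetical, not the actual Spurs data, and the helper names are illustrative. The conversion factors are the standard ones (12 inches per foot, 16 ounces per pound, 2.54 cm per inch, 0.45359237 kg per pound).

```python
import math

def covariance(a, b):
    """Sample covariance (divides by n - 1)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cross = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cross / (len(a) - 1)

def correlation(a, b):
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))

# Hypothetical players: height in inches, weight in pounds.
heights_in = [75.0, 78.0, 80.0, 83.0, 85.0]
weights_lb = [185.0, 210.0, 220.0, 250.0, 255.0]

# The same players expressed in other units.
heights_ft = [h / 12 for h in heights_in]
weights_oz = [w * 16 for w in weights_lb]
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.45359237 for w in weights_lb]

results = {
    "inch-pounds": (covariance(heights_in, weights_lb),
                    correlation(heights_in, weights_lb)),
    "foot-ounces": (covariance(heights_ft, weights_oz),
                    correlation(heights_ft, weights_oz)),
    "cm-kg": (covariance(heights_cm, weights_kg),
              correlation(heights_cm, weights_kg)),
}

for units, (cov, corr) in results.items():
    print(f"{units:12s} covariance = {cov:10.4f}   correlation = {corr:.4f}")
```

The covariance in each row is the inch-pound covariance multiplied by the two conversion factors (for example, by 16/12 for foot-ounces), while the correlation column stays the same.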

THANK YOU FOR YOUR EFFORTS

Please link your Excel spreadsheet, because I would like to see how you did it. To calculate this, would you need to know the population standard deviation or just the sample standard deviation?

^Don’t know how to link, and it’s been deleted anyway.

Just do it yourself. Pick two random variables and put them into Excel. You can use the =COVAR and =CORREL functions. (Or better yet, do it with your calculator to get some practice.)

And I’m not sure which stdev you need to know. Your book should tell you. But you have to know the variance in order to get the standard deviation, and you need the covariance and both standard deviations to get the correlation.
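On the population vs. sample question: as far as the correlation goes, it should not matter which one you use, as long as you use the same convention for the covariance and for both standard deviations, because the n or n - 1 divisor cancels. Here is a small sketch with hypothetical data. The `ddof` parameter name is borrowed from NumPy's convention and is only illustrative here.

```python
import math

def covariance(a, b, ddof):
    """ddof=0 divides by n (population); ddof=1 divides by n - 1 (sample)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cross = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    return cross / (n - ddof)

def correlation(a, b, ddof):
    # Use the same convention for the covariance and both standard deviations.
    sd_a = math.sqrt(covariance(a, a, ddof))
    sd_b = math.sqrt(covariance(b, b, ddof))
    return covariance(a, b, ddof) / (sd_a * sd_b)

# Hypothetical data.
a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [2.0, 4.0, 5.0, 4.0, 5.0]

pop_cov = covariance(a, b, ddof=0)
sample_cov = covariance(a, b, ddof=1)
pop_corr = correlation(a, b, ddof=0)
sample_corr = correlation(a, b, ddof=1)

print("population covariance:", pop_cov, " correlation:", pop_corr)
print("sample covariance:    ", sample_cov, " correlation:", sample_corr)
```

The two covariances differ, but the two correlations agree. The thing to avoid is mixing conventions, for example a population covariance with sample standard deviations. As far as I know, Excel's =COVAR is the population version (it divides by n), so pair it with population standard deviations if you do the division by hand.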

+1…

thanks greenman sir

Torque.

Covariance is hard to interpret: it can take any value, so you cannot tell from the number alone whether the relationship is strong or weak. Another issue with covariance is its unit. If you are working with two data sets with different units of measurement, the product of the units is hard to understand. To overcome this, we calculate correlation by dividing the covariance by the product of the standard deviations of the two data sets. As a result, we get a standardized, unitless number from -1 to +1.

Green man 72, I read your comment after posting mine and I like the way you proved it. Awesome… You nailed it.

Very nice Greenman!

The same is true if you’re computing the variance of a single random variable (e.g., the height of the players), or the standard deviation. It’s hardly an indictment against covariance; changing the units changes the numbers. So what?
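A short sketch of that point, again with hypothetical heights: converting inches to feet divides the standard deviation by 12 and the variance by 12 squared, so those numbers change with the units just as covariance does.

```python
import math
import statistics

# Hypothetical player heights in inches, and the same heights in feet.
heights_in = [75.0, 78.0, 80.0, 83.0, 85.0]
heights_ft = [h / 12 for h in heights_in]

var_in = statistics.variance(heights_in)   # square inches
var_ft = statistics.variance(heights_ft)   # square feet
sd_in = statistics.stdev(heights_in)       # inches
sd_ft = statistics.stdev(heights_ft)       # feet

print("variance:", var_in, "in^2 vs", var_ft, "ft^2")
print("std dev: ", sd_in, "in vs", sd_ft, "ft")
print("variance ratio:", var_in / var_ft)  # close to 144
print("std dev ratio: ", sd_in / sd_ft)    # close to 12
```

Nobody calls variance or standard deviation useless for that reason; the numbers just have to be read together with their units.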