Lunch Crunch - Quant Jocks!

Correlation and Regression

--------------------------------------------------------------------------------
Question 1 - 86899

A regression between the returns on a stock and its industry index gives the following results:

                 Coefficient   Standard Error
Intercept        2.1           2.01
Industry Index   1.9           0.31

Standard error of estimate = 15.1
Correlation coefficient = 0.849

Part 1) If the return on the industry index is 4%, the stock’s expected return would be:
A) 7.6%.
B) 11.2%.
C) 9.7%.

Part 2) The percentage of the variation in the stock return explained by the variation in the industry index return is closest to:
A) 84.9%.
B) 63.2%.
C) 72.1%.

--------------------------------------------------------------------------------
Question 2 - 86780

Which of the following is least likely an assumption of linear regression?
A) The residuals are normally distributed.
B) The independent variable is correlated with the residuals.
C) The variance of the residuals is constant.

--------------------------------------------------------------------------------
Question 3 - 86837

A variable Y is regressed against a single variable X across 24 observations. The value of the slope is 1.14, and the constant is 1.3. The mean value of X is 1.10, and the mean value of Y is 2.67. The standard deviation of the X variable is 1.10, and the standard deviation of the Y variable is 2.46. The sum of squared errors is 89.7. For an X value of 1.0, what is the 95% confidence interval for the Y value?
A) −1.83 to 6.72.
B) −1.68 to 6.56.
C) 0.59 to 4.30.

--------------------------------------------------------------------------------
Question 4 - 86490

An analyst is regressing fund returns against the return on the Wilshire 5000 to determine whether beta is equal to 1.0. The analyst is trying to determine whether the number of observations should be increased. Which of the following is a reason why the test will have higher power if the number of observations is increased? The:
A) mean squared error of the regression will be lower.
B) standard error of the regression will be lower.
C) estimate of beta will be farther away from 1.0.

--------------------------------------------------------------------------------
Question 5 - 86996

Joe Harris is interested in why the returns on equity differ from one company to another. He chose several company-specific variables to explain the return on equity, including financial leverage and capital expenditures. In his model:
A) return on equity is the independent variable, and financial leverage and capital expenditures are dependent variables.
B) return on equity is the dependent variable, and financial leverage and capital expenditures are independent variables.
C) return on equity, financial leverage, and capital expenditures are all independent variables.

--------------------------------------------------------------------------------
Question 6 - 86787

Thomas Manx is attempting to determine the correlation between the number of times a stock quote is requested on his firm’s website and the number of trades his firm actually processes. He has examined samples from several days trading and quotes and has determined that the covariance between these two variables is 88.6, the standard deviation of the number of quotes is 18, and the standard deviation of the number of trades processed is 14. Based on Manx’s sample, what is the correlation between the number of quotes requested and the number of trades processed?
A) 0.78.
B) 0.35.
C) 0.18.

--------------------------------------------------------------------------------
Question 7 - 86952

A study of a sample of incomes (in thousands of dollars) of 35 individuals shows that income is related to age and years of education. The following table shows the regression results:

                     Coefficient   Standard Error   t-statistic   P-value
Intercept            5.65          1.27             4.44          0.01
Age                  0.53          ?                1.33          0.21
Years of Education   2.32          0.41             ?             0.01

Anova        df   SS       MS   F
Regression   ?    215.10   ?    ?
Error        ?    115.10   ?
Total        ?    ?

Part 1) The standard error for the coefficient of age and t-statistic for years of education are:
A) 0.53; 2.96.
B) 0.40; 5.66.
C) 0.32; 1.65.

Part 2) Mean square regression (MSR) and mean square error (MSE) are:
A) 6.72; 3.58.
B) 102.10; 7.11.
C) 107.55; 3.60.

Part 3) What is the R^2 for the regression?
A) 65%.
B) 76%.
C) 62%.

Part 4) What is the predicted income of a 40-year-old person with 16 years of education?
A) $62,120.
B) $63,970.
C) $74,890.

Part 5) What is the F-value?
A) 29.88.
B) 14.36.
C) 1.88.

--------------------------------------------------------------------------------
Question 8 - 86792

A simple linear regression is run to quantify the relationship between the return on the common stocks of medium sized companies (Mid Caps) and the return on the S&P 500 Index, using the monthly return on Mid Cap stocks as the dependent variable and the monthly return on the S&P 500 as the independent variable. The results of the regression are shown below:

            Coefficient   Standard Error of coefficient   t-Value
Intercept   1.71          2.950                           0.58
S&P 500     1.52          0.130                           11.69

R^2 = 0.599

The strength of the relationship, as measured by the correlation coefficient, between the return on Mid Cap stocks and the return on the S&P 500 for the period under study was:
A) 0.130.
B) 0.774.
C) 0.599.

--------------------------------------------------------------------------------
Question 9 - 86816

Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and commercial construction. The regression uses lumber sales as the dependent variable with housing starts and commercial construction as the independent variables. The results of the regression are:

                          Coefficient   Standard Error   t-statistics
Intercept                 5.37          1.71             3.14
Housing starts            0.76          0.09             8.44
Commercial construction   1.25          0.33             3.78

The level of significance for a 95% confidence level is 1.96

Part 1) Construct a 95% confidence interval for the slope coefficient for Housing Starts.
A) 0.76 ± 1.96(0.09).
B) 0.76 ± 1.96(8.44).
C) 1.25 ± 1.96(0.33).

Part 2) Construct a 95% confidence interval for the slope coefficient for Commercial Construction.
A) 1.25 ± 1.96(0.33).
B) 0.76 ± 1.96(0.09).
C) 1.25 ± 1.96(3.78).

--------------------------------------------------------------------------------
Question 10 - 86975

Which of the following is least likely an assumption of linear regression analysis?
A) The X values are uncorrelated with the error terms.
B) The Y values are all less than 3 standard deviations from the regression line.
C) The error term is normally distributed.

--------------------------------------------------------------------------------
Question 11 - 86859

In order to have a negative correlation between two variables, which of the following is most accurate?
A) The covariance must be negative.
B) Either the covariance or one of the standard deviations must be negative.
C) The covariance can never be negative.

--------------------------------------------------------------------------------
Question 12 - 87004

A study of 40 men finds that their job satisfaction and marital satisfaction scores have a correlation coefficient of 0.52. At 5% level of significance, is the correlation coefficient significantly different from 0?
A) No, t = 1.68.
B) No, t = 2.02.
C) Yes, t = 3.76.

--------------------------------------------------------------------------------
Question 13 - 86737

Jason Brock, CFA, is performing a regression analysis to identify and evaluate any relationship between the common stock of ABT Corp and the S&P 100 index. He utilizes monthly data from the past five years, and assumes that the sum of the squared errors is 0.0039. The calculated standard error of the estimate (SEE) is closest to:
A) 0.0082.
B) 0.0080.
C) 0.0360.

--------------------------------------------------------------------------------
Question 14 - 86788

Unlike the coefficient of determination, the coefficient of correlation:
A) indicates whether the slope of the regression line is positive or negative.
B) indicates the percentage of variation explained by a regression model.
C) measures the strength of association between the two variables more exactly.

my answers
1 - 86899 Part 1) C) 9.7%. Part 2) C) 72.1%.
2 - 86780 B) The independent variable is correlated with the residuals.
3 - 86837 B) -1.68 to 6.56.
4 - 86490 B) standard error of the regression will be lower.
5 - 86996 B) return on equity is the dependent variable, and financial leverage and capital expenditures are independent variables.
6 - 86787 B) 0.35.
7 - 86952 Part 1) B) 0.40; 5.66. Part 2) C) 107.55; 3.60. Part 3) A) 65%. Part 4) B) $63,970. Part 5) A) 29.88.
8 - 86792 B) 0.774.
9 - 86816 Part 1) A) 0.76 ± 1.96(0.09). Part 2) A) 1.25 ± 1.96(0.33).
10 - 86975 B) The Y values are all less than 3 standard deviations from the regression line.
11 - 86859 A) The covariance must be negative.
12 - 87004 C) Yes, t = 3.76.
13 - 86737 A) 0.0082.
14 - 86788 C) measures the strength of association between the two variables more exactly.

and my qbank f’ed up. Now all the questions are randomized. I can look up specific questions if you wish, otherwise providing all the answers will not come easily.

Q1.1. C [2.1 + 1.9*4 = 9.7%]
Q1.2. C [(0.849)^2 = 72.08%]
Q2. B
Q3. B
Q4. B [as n increases, the standard error decreases]
Q5. B [lol]
Q6. B [88.6/(18*14) = 0.351587]
Q7.1. B [0.53/1.33 = 0.398496; 2.32/0.41 = 5.658537]
Q7.2. C [215.10/2 = 107.55; 115.10/32 = 3.596875]
Q7.3. A [R^2 = 215.10/330.2 = 0.6514]
Q7.4. B [5.65 + 40*0.53 + 16*2.32 = 5.65 + 21.2 + 37.12 = 63.97]
Q7.5. A [107.55/3.596875 = 29.900955]
Q8. B [r = SQRT(R^2) = SQRT(0.599) = 0.7739]
Q9.1. A [0.76 ± 1.96(0.09)]
Q9.2. A [1.25 ± 1.96(0.33)]
Q10. B [haha]
Q11. A
Q12. C [t calculated = 3.75278227 > t critical = 2.024, so reject H0; r is statistically significantly different from 0]
Q13. A [SQRT(0.0039/58) = 0.0082]
Q14. C [r = strength of association]
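For anyone who wants to double-check the Question 7, 12 and 13 arithmetic without a calculator, here is a rough Python sketch. The variable and function names are mine, not the qbank's, and it assumes the usual degrees of freedom: n - k - 1 for the regression error and n - 2 for the correlation test.

```python
import math

# Question 7: fill in the ANOVA table from the two sums of squares given.
n, k = 35, 2                      # observations, independent variables
ssr, sse = 215.10, 115.10         # regression and error sums of squares
msr = ssr / k                     # mean square regression
mse = sse / (n - k - 1)           # mean square error, df = 32
f_stat = msr / mse                # F = MSR / MSE
r_squared = ssr / (ssr + sse)     # explained share of total variation

# Missing table entries follow from t = coefficient / standard error.
se_age = 0.53 / 1.33              # standard error of the age coefficient
t_educ = 2.32 / 0.41              # t-statistic for years of education


def correlation_t(r, n_obs):
    """Question 12: t-statistic for H0: correlation = 0, with n - 2 df."""
    return r * math.sqrt(n_obs - 2) / math.sqrt(1 - r ** 2)


def see(sse_value, n_obs, k_vars=1):
    """Question 13: standard error of estimate = sqrt(SSE / (n - k - 1))."""
    return math.sqrt(sse_value / (n_obs - k_vars - 1))


print(round(msr, 2), round(mse, 2), round(f_stat, 2), round(r_squared, 3))
print(round(se_age, 2), round(t_educ, 2))
print(round(correlation_t(0.52, 40), 2), round(see(0.0039, 60), 4))
```

The 60 in the last line is five years of monthly data, as stated in Question 13.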

CPK and I got the exact same answer choices.

Me too… 1a. c 1b. c 2. b 3. b 4. b 5. b 6. b 7a. b 7b. c 7c. a 7d. b 7e. a 8. b 9a. a 9b. a 10. b 11. a 12. c 13. a 14. c

  1. C, C 2. B 3. A 4. B 5. B 6. B 7. B, C, A, B, A 8. B 9. A, A 10. B 11. A 12. C 13. A 14. C

ditch, could you look up in the Qbank Question 3 - 86837 ?

When I redo question Q3 again - I get A now.

A variable Y is regressed against a single variable X across 24 observations. The value of the slope is 1.14, and the constant is 1.3. The mean value of X is 1.10, and the mean value of Y is 2.67. The standard deviation of the X variable is 1.10, and the standard deviation of the Y variable is 2.46. The sum of squared errors is 89.7. For an X value of 1.0, what is the 95% confidence interval for the Y value?
A) −1.83 to 6.72.
B) −1.68 to 6.56.
C) 0.59 to 4.30.

The correct answer was A.

First the standard error of the estimate must be calculated. It is equal to the square root of the mean squared error, which is equal to the sum of squared errors divided by the number of observations minus 2: SEE = (89.7 / 22)^(1/2) = 2.02.

The standard deviation of the prediction is equal to the square root of the squared standard error of the estimate multiplied by a correction term: sf = {SEE^2 × [1 + (1 / n) + (X − Xbar)^2 / ((n − 1) × sx^2)]}^(1/2) = {2.02^2 × [1 + (1 / 24) + (1.0 − 1.1)^2 / (23 × 1.1^2)]}^(1/2) = 2.06.

The prediction value is 1.3 + (1.0 × 1.14) = 2.44. The t-value for 22 degrees of freedom is 2.074. The endpoints of the interval are 2.44 ± 2.074 × 2.06 = −1.83 and 6.72.

Answer is A.
SSE = 89.7
SEE^2 = 89.7/22 = 4.077273
SEE = 2.019226
1/n = 1/24 = 0.041667
X - Xbar = 1 - 1.10 = -0.1, so (X - Xbar)^2 = 0.01
n - 1 = 23
sx^2 = 1.21
(X - Xbar)^2 / ((n - 1) * sx^2) = 0.01/27.83 = 0.000359
sf^2 = 4.077273 * [1 + 0.041667 + 0.000359] = 4.077273 * 1.042026 = 4.248624
sf = 2.061219
tc = 2.074
ycap = 1.3 + 1.14*1 = 2.44
tc*sf = 4.274968
2.44 - 4.274968 = -1.834968
2.44 + 4.274968 = 6.714968
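The Question 3 interval can also be checked with a few lines of Python. This is only a sketch of the standard prediction-interval formula for a simple regression; the function name and argument names are made up for illustration, and the critical t of 2.074 (22 degrees of freedom) is typed in from the table rather than computed.

```python
import math


def prediction_interval(intercept, slope, x, x_mean, s_x, sse, n, t_crit):
    """Return (lower, upper, s_f) for a predicted Y in a simple regression."""
    see_sq = sse / (n - 2)  # squared standard error of estimate
    # Variance of the forecast error: SEE^2 * [1 + 1/n + (x - xbar)^2 / ((n-1) sx^2)]
    s_f = math.sqrt(
        see_sq * (1 + 1 / n + (x - x_mean) ** 2 / ((n - 1) * s_x ** 2))
    )
    y_hat = intercept + slope * x
    return y_hat - t_crit * s_f, y_hat + t_crit * s_f, s_f


# Question 3 inputs, with the table value t = 2.074 for 22 df.
low, high, s_f = prediction_interval(1.3, 1.14, 1.0, 1.10, 1.10, 89.7, 24, 2.074)
print(round(low, 2), round(high, 2), round(s_f, 2))

# Same inputs with a lazy critical value of 2.0, which lands on choice B.
low_2, high_2, _ = prediction_interval(1.3, 1.14, 1.0, 1.10, 1.10, 89.7, 24, 2.0)
print(round(low_2, 2), round(high_2, 2))
```

Swapping the critical value is the only change between the two calls, which is exactly the gap between choices A and B.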

#$%@% SF!!! :)

See, now that’s what I get for assuming a critical t-value of 2 (in which case you get answer B; the only difference between answers A and B is the critical value used, 2 vs. 2.074). Though I think those guys at Schweser are being tricky ^@*#%s, I guess it’s good to throw in there. Lesson learned: don’t be lazy and assume tcrit = 2. On the test they may do something similar, so always find the critical value yourself.

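If you have more than 30 observations, you can approximate the critical t with 1.96, the large-sample value (though at 30 degrees of freedom the table value is still 2.042). Other than this, look it up in the tables.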

Could any one of you explain the logic behind this question? Why is the answer B and not A? I was thinking that since MSE is SSE/(n-k-1), as n is increased, MSE will be lower. Many thanks.

Question 4 - 86490

An analyst is regressing fund returns against the return on the Wilshire 5000 to determine whether beta is equal to 1.0. The analyst is trying to determine whether the number of observations should be increased. Which of the following is a reason why the test will have higher power if the number of observations is increased? The:
A) mean squared error of the regression will be lower.
B) standard error of the regression will be lower.
C) estimate of beta will be farther away from 1.0.

Question 14 - 86788 Unlike the coefficient of determination, the coefficient of correlation: A) indicates whether the slope of the regression line is positive or negative. B) indicates the percentage of variation explained by a regression model. C) measures the strength of association between the two variables more exactly. Answer should be A, unlike what some people have chosen. If X and Y are negatively correlated, it means that when X increases, Y will decrease, meaning that the slope of the line is negative. The coefficient of determination does not tell you whether the relationship is positive or negative because R squared is always >= 0

nimz Wrote:
-------------------------------------------------------
> Could any one of you explain the logic behind this question? Why is the answer B and not A? I was
> thinking that since MSE is SSE/(n-k-1), as n is increased, MSE will be lower.
>
> Many thanks.
>
> Question 4 - 86490
>
> An analyst is regressing fund returns against the return on the Wilshire 5000 to determine whether
> beta is equal to 1.0. The analyst is trying to determine whether the number of observations
> should be increased. Which of the following is a reason why the test will have higher power if the
> number of observations is increased? The:
>
> A) mean squared error of the regression will be lower.
> B) standard error of the regression will be lower.
> C) estimate of beta will be farther away from 1.0.

The smaller the standard error of the regression, the closer the actual data points are to the fitted line, i.e. the fit is very good. The standard error of the regression is given by (RSS/(n-k-1))^0.5, where RSS here means the residual sum of squares (the SSE). When n increases, the standard error of the regression drops, and hence the fit is better.
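One way to see the link between sample size and power is to look at the standard error of the slope itself, since that is what sits in the denominator of the t-test on beta. The sketch below is not from the qbank: it assumes a simple regression where SEE and the standard deviation of X stay fixed as n grows, uses the textbook relation se(b1) = SEE / (sx * sqrt(n - 1)), and all the numbers fed in are hypothetical.

```python
import math


def slope_t_stat(b1, hypothesized, see, s_x, n):
    """t-statistic for H0: slope = hypothesized, plus the slope's standard error."""
    se_b1 = see / (s_x * math.sqrt(n - 1))  # shrinks as n grows
    return (b1 - hypothesized) / se_b1, se_b1


# Hypothetical fund: estimated beta 1.2, testing beta = 1.0,
# with SEE = 2.0 and sx = 1.0 held constant across sample sizes.
results = {n: slope_t_stat(1.2, 1.0, 2.0, 1.0, n) for n in (24, 240, 2400)}
for n, (t_stat, se_b1) in results.items():
    print(n, round(se_b1, 4), round(t_stat, 2))
```

With the same estimated beta, the larger samples give a smaller standard error and a larger t-statistic, so a true departure from 1.0 is more likely to be detected.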

> Question 14 - 86788
>
> Unlike the coefficient of determination, the coefficient of correlation:
> A) indicates whether the slope of the regression line is positive or negative.
> B) indicates the percentage of variation explained by a regression model.
> C) measures the strength of association between the two variables more exactly.
>
> Answer should be A, unlike what some people have chosen. If X and Y are negatively correlated, it
> means that when X increases, Y will decrease, meaning that the slope of the line is negative. The
> coefficient of determination does not tell you whether the relationship is positive or negative
> because R squared is always >= 0

HydrogenRainbow: I do not agree with you. The coefficient of determination tells you nothing about direction because it is r^2; that part is true, it is never negative. The slope of the line = cov(x,y)/var(x), and cov(x,y) = r * sx * sy, so the slope is positive or negative depending on the sign of r. While this is true, the primary use of correlation is “strength of association”, so C would be the better choice, in my mind.
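The relationships quoted above (slope = cov(x,y)/var(x) and cov(x,y) = r * sx * sy) are easy to check on a toy data set. The data below are invented purely for illustration and the helper name is mine.

```python
import math


def corr_and_slope(xs, ys):
    """Return (r, r_squared, slope) using sample covariance and variances."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    var_y = sum((y - mean_y) ** 2 for y in ys) / (n - 1)
    r = cov / math.sqrt(var_x * var_y)   # correlation keeps the sign of cov
    slope = cov / var_x                  # regression slope of Y on X
    return r, r ** 2, slope


# Made-up data where Y falls as X rises.
xs = [1, 2, 3, 4, 5]
ys = [9.8, 8.1, 6.2, 3.9, 2.0]
r, r_squared, slope = corr_and_slope(xs, ys)
print(r < 0, slope < 0, r_squared > 0)
```

r and the slope share the sign of the covariance, while r^2 is positive either way, which is the point of choice A.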

Actually, on the contrary, I’d say A is the better answer, because when you are talking about correlation, you are talking about it in the context of a LINEAR regression, i.e. a straight line. Strength of association between X and Y is not the best answer: if you remember the illustration given in the text about Y=X^2 (I can’t remember whether it is in the Schweser or CFAI text), the curve can go through all the points. The strength of association is 100%, but it is meaningless to find the correlation coefficient because the relationship is not a straight line.
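The Y=X^2 illustration can be reproduced in a couple of lines. This is a sketch with X values I picked myself (symmetric around zero), not the numbers from the text.

```python
import math


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Y is determined exactly by X, yet the linear correlation is zero
# because the X values are symmetric around 0.
xs_sym = [-3, -2, -1, 0, 1, 2, 3]
r_sym = pearson_r(xs_sym, [x ** 2 for x in xs_sym])

# On a one-sided range the same curve gives a high but imperfect r.
xs_pos = [0, 1, 2, 3, 4, 5, 6]
r_pos = pearson_r(xs_pos, [x ** 2 for x in xs_pos])

print(r_sym, round(r_pos, 3))
```

So r only measures the linear part of the relationship, and how much of a curve it picks up depends on which stretch of the curve was sampled.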

I believe it is A because non-linear relationships can exist that aren’t measured by linear regression, as confirmed by HydrogenRainbow.