Lunch Crunch!

Quant

A scatter plot is a collection of points on a graph where each point represents the values of two variables (i.e., an X/Y pair). Suppose that we wish to graphically represent the data for the returns on Stock A and returns on a market index, over the last six months, shown in Figure 1. Figure 2 shows the data graphically with the returns on Stock A shown on the Y-axis and the returns on the market index on the X-axis. Each point of the scatter plot in Figure 2 represents one month of the six in our sample. The rightmost point in the scatter plot is for the month of March, a 2.0 percent return on the market index and a 1.8 percent return on Stock A.

LOS Explanation

Computed correlation coefficients, as well as other sample statistics, may be affected by outliers. Outliers represent a few extreme values for sample observations. Relative to the rest of the sample data, the value of an outlier may be extraordinarily large or small. Outliers can provide statistical evidence that a significant relationship exists when, in fact, there is none, or provide evidence that there is no relationship when, in fact, there is a relationship.

Spurious correlation refers to the appearance of a linear relationship when, in fact, there is no relation. Certain data items may be highly correlated but not necessarily a result of a causal relationship. If you compute the correlation coefficient for historical stock prices and snowfall totals in Minnesota, you will get a statistically significant relationship, especially for the month of January. Since there is no economic explanation for this relationship, this would be considered a spurious correlation.

LOS Explanation

The dependent variable is the variable whose variation is explained by the independent variable. The dependent variable is also referred to as the explained variable, the endogenous variable, or the predicted variable.

The independent variable is the variable used to explain the variation of the dependent variable. The independent variable is also referred to as the explanatory variable, the exogenous variable, or the predicting variable.

Example: Dependent vs. independent variables
Suppose that you want to predict stock returns with GDP growth. Which variable is the independent variable?
Answer: Since GDP is going to be used as a predictor of stock returns, stock returns are being explained by GDP. Hence, stock returns are the dependent (explained) variable, and GDP is the independent (explanatory) variable.

LOS Explanation

The assumptions of linear regression include:
A linear relationship exists between the dependent and independent variables.
The independent variable is uncorrelated with the residuals.
The expected value of the residual term is zero.
The variance of the residual term is constant for all observations.
The residual term is independently distributed; that is, the residual for one observation is not correlated with that of another observation.
The residual term is normally distributed.

LOS Explanation

The linear regression model says that the value of the dependent variable, Y, is equal to the intercept, b0, plus the product of the slope coefficient, b1, and the value of the independent variable, X, plus an error term, ε.

The regression line is chosen so that the sum of the squared differences (vertical distances) between the Y-values predicted by the regression equation and actual Y-values, Yi, is minimized.

The estimated slope coefficient for the regression line describes the change in Y for a one-unit change in X. It can be positive, negative, or zero, depending on the relationship between the regression variables.

The estimated intercept is interpreted as the value of the dependent variable (the Y) if the independent variable (the X) takes on a value of zero.

LOS Explanation

Analysis of variance (ANOVA) is a statistical procedure for analyzing the total variability of a data set.
Output of an ANOVA table consists of:

Total sum of squares (SST) measures the total variation in the dependent variable.
Regression sum of squares (RSS) measures the variation in the dependent variable explained by the independent variable.
Sum of squared errors (SSE) measures the unexplained variation in the dependent variable.

Thus, total variation = explained variation + unexplained variation, or SST = RSS + SSE.

Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square
Regression (explained) | k = 1 | RSS | MSR = RSS/k
Error (unexplained) | n − 2 | SSE | MSE = SSE/(n − 2)
Total | n − 1 | SST |
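For anyone who wants to see the ANOVA pieces fall out of an actual regression, here is a rough sketch in plain Python. The data are an assumption: only the March point (2.0 market, 1.8 Stock A) comes from the text above, and the other five months are made up for illustration. The function names `simple_ols` and `anova_table` are hypothetical helpers, not from any library.

```python
def simple_ols(x, y):
    """Least-squares intercept b0 and slope b1 for one independent variable."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # change in Y for a one-unit change in X
    b0 = y_bar - b1 * x_bar    # value of Y when X is zero
    return b0, b1


def anova_table(x, y):
    """SST, RSS, SSE and mean squares for a simple regression (k = 1)."""
    n = len(x)
    k = 1
    b0, b1 = simple_ols(x, y)
    y_bar = sum(y) / n
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    rss = sum((yh - y_bar) ** 2 for yh in y_hat)           # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    return {
        "b0": b0, "b1": b1,
        "SST": sst, "RSS": rss, "SSE": sse,
        "MSR": rss / k, "MSE": sse / (n - 2),
        "R2": rss / sst,
    }


# Hypothetical monthly returns in percent; only the last pair is from the text.
market = [-1.0, -0.5, 0.3, 0.8, 1.5, 2.0]
stock_a = [-0.8, -0.7, 0.4, 0.5, 1.6, 1.8]

table = anova_table(market, stock_a)
for name, value in table.items():
    print(f"{name}: {value:.4f}")
```

Whatever numbers you feed it, SST should equal RSS + SSE up to floating-point rounding, which is a quick sanity check on the decomposition.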

sorry quant and me don’t mix…I will look at this in May.

Always fun to review early

if i play, get ready for a smokeshow- quant is my worst.

don’t tease me Ditch, where are the questions : )

Determine and interpret the correlation coefficient for the two variables X and Y. The standard deviation of X is 0.05, the standard deviation of Y is 0.08, and their covariance is −0.003.
A) +0.75 and the two variables are positively associated.
B) −1.33 and the two variables are negatively associated.
C) −0.75 and the two variables are negatively associated.

The Y variable is regressed against the X variable resulting in a regression line that is flat with the plot of the paired observations widely dispersed about the regression line. Based on this information, which statement is most accurate?
A) The R² of this regression is close to 100%.
B) X is perfectly positively correlated to Y.
C) The correlation between X and Y is close to zero.

Suppose the covariance between Y and X is 12, the variance of Y is 25, and the variance of X is 36. What is the correlation coefficient (r) between Y and X?
A) 0.160.
B) 0.013.
C) 0.400.

A sample covariance of two random variables is most commonly utilized to:
A) identify and measure strong nonlinear relationships between the two variables.
B) estimate the “pure” measure of the tendency of two variables to move together over a period of time.
C) calculate the correlation coefficient, which is a measure of the strength of their linear relationship.

Consider the case when the Y variable is in U.S. dollars and the X variable is in U.S. dollars. The ‘units’ of the covariance between Y and X are:
A) a range of values from −1 to +1.
B) squared U.S. dollars.
C) U.S. dollars.

Determine and interpret the correlation coefficient for the two variables X and Y. The standard deviation of X is 0.05, the standard deviation of Y is 0.08, and their covariance is −0.003.
A) +0.75 and the two variables are positively associated.
B) −1.33 and the two variables are negatively associated.
C) −0.75 and the two variables are negatively associated.
C, assuming the covariance is negative

The Y variable is regressed against the X variable resulting in a regression line that is flat with the plot of the paired observations widely dispersed about the regression line. Based on this information, which statement is most accurate?
A) The R² of this regression is close to 100%.
B) X is perfectly positively correlated to Y.
C) The correlation between X and Y is close to zero.
C

Suppose the covariance between Y and X is 12, the variance of Y is 25, and the variance of X is 36. What is the correlation coefficient (r) between Y and X?
A) 0.160.
B) 0.013.
C) 0.400.
C

A sample covariance of two random variables is most commonly utilized to:
A) identify and measure strong nonlinear relationships between the two variables.
B) estimate the “pure” measure of the tendency of two variables to move together over a period of time.
C) calculate the correlation coefficient, which is a measure of the strength of their linear relationship.
B

Consider the case when the Y variable is in U.S. dollars and the X variable is in U.S. dollars. The ‘units’ of the covariance between Y and X are:
A) a range of values from −1 to +1.
B) squared U.S. dollars.
C) U.S. dollars.
A

For the case of simple linear regression with one independent variable, which of the following statements about the correlation coefficient is least accurate?
A) If the correlation coefficient is negative, it indicates that the regression line has a negative slope coefficient.
B) If the regression line is flat and the observations are dispersed uniformly about the line, the correlation coefficient will be +1.
C) The correlation coefficient can vary between −1 and +1.

Rafael Garza, CFA, is considering the purchase of ABC stock for a client’s portfolio. His analysis includes calculating the covariance between the returns of ABC stock and the equity market index. Which of the following statements regarding Garza’s analysis is most accurate?
A) The actual value of the covariance is not very meaningful because the measurement is very sensitive to the scale of the two variables.
B) The covariance measures the strength of the linear relationship between two variables.
C) A covariance of +1 indicates a perfect positive covariance between the two variables.

Which of the following statements regarding the coefficient of determination is least accurate? The coefficient of determination:
A) cannot decrease as independent variables are added to the model.
B) may range from −1 to +1.
C) is the percentage of the total variation in the dependent variable that is explained by the independent variable.

Which of the following statements about covariance and correlation is least accurate?
A) There is no relation between the sign of the covariance and the correlation.
B) The covariance and correlation are always the same sign, positive or negative.
C) A zero covariance implies a zero correlation.

  1. if the covariance has a positive sign, A. If a negative sign, C. 2. C 3.C 4. I say B, not sure 5.C

Determine and interpret the correlation coefficient for the two variables X and Y. The standard deviation of X is 0.05, the standard deviation of Y is 0.08, and their covariance is -0.003.
A) +0.75 and the two variables are positively associated.
B) -1.33 and the two variables are negatively associated.
C) -0.75 and the two variables are negatively associated.
Ans C: -0.003 / (0.05 * 0.08) = -0.75

The Y variable is regressed against the X variable resulting in a regression line that is flat with the plot of the paired observations widely dispersed about the regression line. Based on this information, which statement is most accurate?
A) The R² of this regression is close to 100%.
B) X is perfectly positively correlated to Y.
C) The correlation between X and Y is close to zero.
Ans C

Suppose the covariance between Y and X is 12, the variance of Y is 25, and the variance of X is 36. What is the correlation coefficient (r) between Y and X?
A) 0.160.
B) 0.013.
C) 0.400.
Ans C: 12 / (5 * 6) = 0.4

A sample covariance of two random variables is most commonly utilized to:
A) identify and measure strong nonlinear relationships between the two variables.
B) estimate the “pure” measure of the tendency of two variables to move together over a period of time.
C) calculate the correlation coefficient, which is a measure of the strength of their linear relationship.
Ans B

Consider the case when the Y variable is in U.S. dollars and the X variable is in U.S. dollars. The ‘units’ of the covariance between Y and X are:
A) a range of values from -1 to +1.
B) squared U.S. dollars.
C) U.S. dollars.
Ans B

6.B 7.A 8.A 9.C

For the case of simple linear regression with one independent variable, which of the following statements about the correlation coefficient is least accurate?
A) If the correlation coefficient is negative, it indicates that the regression line has a negative slope coefficient.
B) If the regression line is flat and the observations are dispersed uniformly about the line, the correlation coefficient will be +1.
C) The correlation coefficient can vary between −1 and +1.
Ans B

Rafael Garza, CFA, is considering the purchase of ABC stock for a client’s portfolio. His analysis includes calculating the covariance between the returns of ABC stock and the equity market index. Which of the following statements regarding Garza’s analysis is most accurate?
A) The actual value of the covariance is not very meaningful because the measurement is very sensitive to the scale of the two variables.
B) The covariance measures the strength of the linear relationship between two variables.
C) A covariance of +1 indicates a perfect positive covariance between the two variables.
Ans A

Which of the following statements regarding the coefficient of determination is least accurate? The coefficient of determination:
A) cannot decrease as independent variables are added to the model.
B) may range from −1 to +1.
C) is the percentage of the total variation in the dependent variable that is explained by the independent variable.
Ans C

Which of the following statements about covariance and correlation is least accurate?
A) There is no relation between the sign of the covariance and the correlation.
B) The covariance and correlation are always the same sign, positive or negative.
C) A zero covariance implies a zero correlation.
Ans A

Your answer: C was correct! The correlation coefficient is the covariance divided by the product of the two standard deviations, i.e. −0.003 / (0.08 × 0.05).

Your answer: C was correct! Perfect correlation means that the observations fall on the regression line. An R² of 100% means perfect correlation. When there is no correlation, the regression line is flat and the residual standard error equals the standard deviation of Y.

Your answer: C was correct! The correlation coefficient is: r = 12 / (5 × 6) = 0.40

Your answer: A was incorrect. The correct answer was C) calculate the correlation coefficient, which is a measure of the strength of their linear relationship. Since the actual value of a sample covariance can range from negative to positive infinity depending on the scale of the two variables, it is most commonly used to calculate a more useful measure, the correlation coefficient.

Your answer: B was correct! The covariance is in terms of the product of the units of Y and X. It is defined as the average value of the product of the deviations of observations of two variables from their means. The correlation coefficient is a standardized version of the covariance, ranges from −1 to +1, and is much easier to interpret than the covariance.

Your answer: B was correct! Correlation analysis is a statistical technique used to measure the strength of the relationship between two variables. The measure of this relationship is called the coefficient of correlation. If the regression line is flat and the observations are dispersed uniformly about the line, there is no linear relationship between the two variables and the correlation coefficient will be zero. Both of the other choices are TRUE.

Your answer: A was correct! Covariance is a statistical measure of the linear relationship of two random variables, but the actual value is not meaningful because the measure is extremely sensitive to the scale of the two variables. Covariance can range from negative to positive infinity.

Your answer: B was correct! In a simple regression, the coefficient of determination is calculated as the correlation coefficient squared and ranges from 0 to +1.

Your answer: A was correct! The correlation is the ratio of the covariance to the product of the standard deviations of the two variables. Therefore, the covariance and the correlation have the same sign.
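The two correlation calculations in the explanations above are easy to check by machine. This is just a minimal sketch using the numbers from those two questions; the function name `correlation_from_cov` is a made-up helper for illustration.

```python
import math


def correlation_from_cov(cov_xy, sd_x, sd_y):
    """Correlation = covariance / (std dev of X * std dev of Y)."""
    return cov_xy / (sd_x * sd_y)


# First question: standard deviations are given directly.
r_first = correlation_from_cov(-0.003, 0.05, 0.08)   # approximately -0.75

# Third question: variances are given, so take square roots first.
r_third = correlation_from_cov(12, math.sqrt(36), math.sqrt(25))  # approximately 0.40

# In a simple regression the coefficient of determination is r squared.
r_squared_third = r_third ** 2   # approximately 0.16

print(round(r_first, 4), round(r_third, 4), round(r_squared_third, 4))
```

Note that 0.160, one of the wrong choices in the third question, is what you get by squaring r, so it is worth being clear on whether a question asks for r or R².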

One major limitation of the correlation analysis of two random variables is when two variables are highly correlated, but no economic relationship exists. This condition most likely indicates the presence of:
A) outliers.
B) nonlinear relationships.
C) spurious correlation.

Ron James, CFA, computed the correlation coefficient for historical oil prices and the occurrence of a leap year and has identified a statistically significant relationship. Specifically, the price of oil declined every fourth calendar year, all other factors held constant. James has most likely identified which of the following conditions in correlation analysis?
A) Positive correlation.
B) Outliers.
C) Spurious correlation.

One of the limitations of correlation analysis of two random variables is the presence of outliers, which can lead to which of the following erroneous assumptions?
A) The presence of a nonlinear relationship between the two variables, when in fact, there is a linear relationship.
B) The presence of a nonlinear relationship between the two variables, when in fact, there is no relationship whatsoever between the two variables.
C) The absence of a relationship between the two variables, when in fact, there is a linear relationship.

Your answer: C was correct! Spurious correlation occurs when the analysis erroneously indicates a linear relationship between two variables when none exists. There is no economic explanation for this relationship; therefore this would be classified as spurious correlation.

Your answer: C was correct! Outliers represent a few extreme values for sample observations in a correlation analysis. They can either provide statistical evidence that a significant relationship exists, when there is none, or provide evidence that no relationship exists when one does.

The independent variable in a regression equation is called all of the following EXCEPT:
A) predicting variable.
B) predicted variable.
C) explanatory variable.

The purpose of regression is to:
A) get the largest R² possible.
B) explain the variation in the dependent variable.
C) explain the variation in the independent variable.

The covariance between stock A and the market portfolio is 0.05634. The variance of the market is 0.04632. The beta of stock A is:
A) 1.2163.
B) 0.8222.
C) 0.0026.

The capital asset pricing model is given by: Ri = Rf + Beta (Rm − Rf), where Rm = expected return on the market, Rf = risk-free rate and Ri = expected return on a specific firm. The dependent variable in this model is:
A) Rm − Rf.
B) Rf.
C) Ri.

One major limitation of the correlation analysis of two random variables is when two variables are highly correlated, but no economic relationship exists. This condition most likely indicates the presence of:
A) outliers.
B) nonlinear relationships.
C) spurious correlation.
Ans C

Ron James, CFA, computed the correlation coefficient for historical oil prices and the occurrence of a leap year and has identified a statistically significant relationship. Specifically, the price of oil declined every fourth calendar year, all other factors held constant. James has most likely identified which of the following conditions in correlation analysis?
A) Positive correlation.
B) Outliers.
C) Spurious correlation.
Ans C

One of the limitations of correlation analysis of two random variables is the presence of outliers, which can lead to which of the following erroneous assumptions?
A) The presence of a nonlinear relationship between the two variables, when in fact, there is a linear relationship.
B) The presence of a nonlinear relationship between the two variables, when in fact, there is no relationship whatsoever between the two variables.
C) The absence of a relationship between the two variables, when in fact, there is a linear relationship.
Ans B

The independent variable in a regression equation is called all of the following EXCEPT:
A) predicting variable.
B) predicted variable.
C) explanatory variable.
Ans B

The purpose of regression is to:
A) get the largest R² possible.
B) explain the variation in the dependent variable.
C) explain the variation in the independent variable.
Ans B

The covariance between stock A and the market portfolio is 0.05634. The variance of the market is 0.04632. The beta of stock A is:
A) 1.2163.
B) 0.8222.
C) 0.0026.
Ans A

The capital asset pricing model is given by: Ri = Rf + Beta (Rm − Rf), where Rm = expected return on the market, Rf = risk-free rate and Ri = expected return on a specific firm. The dependent variable in this model is:
A) Rm − Rf.
B) Rf.
C) Ri.
Ans C
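A quick check of the beta question, as a sketch: beta is the covariance of the stock with the market divided by the variance of the market. The variable names below are just illustrative.

```python
# Inputs from the beta question above.
cov_stock_market = 0.05634
var_market = 0.04632

# Beta = Cov(stock, market) / Var(market)
beta = cov_stock_market / var_market

# The two wrong choices appear to come from common slips:
inverted = var_market / cov_stock_market   # variance over covariance
product = cov_stock_market * var_market    # multiplying instead of dividing

print(round(beta, 4), round(inverted, 4), round(product, 4))
```

Rounded to four places these give 1.2163, 0.8222 and 0.0026, which line up with choices A, B and C.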

Linear regression is based on a number of assumptions. Which of the following is least likely an assumption of linear regression?
A) Values of the independent variable are not correlated with the error term.
B) There is at least some correlation between the error terms from one observation to the next.
C) The variance of the error terms each period remains the same.

Which of the following is least likely an assumption of linear regression? The:
A) expected value of the residuals is zero.
B) residuals are mean reverting; that is, they tend towards zero over time.
C) residuals are independently distributed.

The assumptions underlying linear regression include all of the following EXCEPT the:
A) independent variable is linearly related to the residuals (or disturbance term).
B) disturbance term is normally distributed with an expected value of 0.
C) disturbance term is homoskedastic and is independently distributed.

Which of the following is least likely an assumption of a simple regression?
A) The expected value of the error term is zero.
B) The error term is normally distributed.
C) The variance of the error term is one.

Which of the following statements about linear regression analysis is most accurate?
A) The coefficient of determination is defined as the strength of the linear relationship between two variables.
B) When there is a strong relationship between two variables we can conclude that a change in one will cause a change in the other.
C) An assumption of linear regression is that the residuals are independently distributed.

(CP gets all the answers right ;-))

10.C 11.C 12.C

Isn't the last one C?

cpk123 Wrote:
-------------------------------------------------------
> One major limitation of the correlation analysis
> of two random variables is when two variables are
> highly correlated, but no economic relationship
> exists. This condition most likely indicates the
> presence of:
> A) outliers.
> B) nonlinear relationships.
> C) spurious correlation.
>
> Ans C
>
> Ron James, CFA, computed the correlation
> coefficient for historical oil prices and the
> occurrence of a leap year and has identified a
> statistically significant relationship.
> Specifically, the price of oil declined every
> fourth calendar year, all other factors held
> constant. James has most likely identified which
> of the following conditions in correlation
> analysis?
> A) Positive correlation.
> B) Outliers.
> C) Spurious correlation.
>
> Ans C
>
>
> One of the limitations of correlation analysis of
> two random variables is the presence of outliers,
> which can lead to which of the following erroneous
> assumptions?
> A) The presence of a nonlinear relationship
> between the two variables, when in fact, there is
> a linear relationship.
> B) The presence of a nonlinear relationship
> between the two variables, when in fact, there is
> no relationship whatsoever between the two
> variables.
> C) The absence of a relationship between the two
> variables, when in fact, there is a linear
> relationship.
>
> Ans B
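On the outlier question quoted above, a small experiment may help. The sketch below uses made-up data (not from the question bank) and a hypothetical helper `pearson_r` to show the two effects described in the notes: one extreme point can make unrelated data look strongly correlated, and one extreme point can hide a perfectly linear relationship.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Case 1: Y just alternates, so there is little linear relationship with X.
x_flat = [1, 2, 3, 4, 5, 6]
y_flat = [2, 1, 2, 1, 2, 1]
r_flat = pearson_r(x_flat, y_flat)
# Adding one extreme point (20, 20) makes the correlation look strong.
r_flat_outlier = pearson_r(x_flat + [20], y_flat + [20])

# Case 2: Y equals X, a perfect linear relationship.
x_line = [1, 2, 3, 4, 5, 6]
y_line = [1, 2, 3, 4, 5, 6]
r_line = pearson_r(x_line, y_line)
# Adding one extreme point (7, -4) pushes the correlation toward zero.
r_line_outlier = pearson_r(x_line + [7], y_line + [-4])

print(round(r_flat, 3), round(r_flat_outlier, 3))
print(round(r_line, 3), round(r_line_outlier, 3))
```

The second case is the situation described in choice C, the apparent absence of a relationship when there is in fact a linear one, which at least fits the explanation posted earlier that outliers can provide evidence that no relationship exists when one does.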