A question about collinearity in joint optimization when hedging multiple currencies

Hi all,

In currency management, when hedging multiple currencies, the practical implication of the minimum-variance hedge ratio is joint optimization.

R_DC = f(R_FC, R_FX), where R_FC is the return on the foreign-currency asset (foreign stocks, in this case) and R_FX is the return on the foreign currency. It is said that a multiple regression of this form also takes the correlation between the two variables (R_FC and R_FX) into account.
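For concreteness, here is a minimal sketch of what fitting that joint regression could look like (simulated, hypothetical return series and parameter values; not the curriculum's exact setup):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 250

# Hypothetical returns: the foreign asset return is built to be correlated
# with the currency return, which is the source of the concern below.
r_fx = rng.normal(0.0, 0.01, size=n)               # foreign currency return
r_fc = 0.5 * r_fx + rng.normal(0.0, 0.02, size=n)  # foreign asset (stock) return
r_dc = (1 + r_fc) * (1 + r_fx) - 1                 # exact domestic-currency return

# Joint regression of R_DC on R_FC and R_FX; the joint fit implicitly
# uses cor(R_FC, R_FX) when estimating the two slopes together.
X = sm.add_constant(np.column_stack([r_fc, r_fx]))
print(sm.OLS(r_dc, X).fit().params)
```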

Will there be multicollinearity in this regression, which would, on the contrary, decrease rather than increase the accuracy of the regression? Thanks.

What do you mean, “…the accuracy of regression?”

Contrary to the poor teaching of the CFAI curriculum, multicollinearity doesn’t really “change” the predictive ability of a model (i.e., it still gives reasonable values of Y compared to the true values), nor does it change model-based statistics like R-squared or the F-statistic.

The issues can arise if you want to look at the individual beta estimates and try to interpret them, because there is often instability and difficulty estimating the correct magnitude and/or direction for the coefficients of the collinear variables. Individually, this could pose an issue since the estimates are “unreliable” in the sense that the weighted sum is “as it should be” but the components may not be “as they should be.”

The following example is not necessarily technically or mathematically accurate but should convey the general idea. Suppose X1 and X2 are collinear with true slopes of 3 and 3 relating each to Y. When we estimate the slopes, we get 8 and -2 (assuming the average X values are 1 each, so predicted Y = 6, ignoring an intercept for the example). We then draw another sample for the same study and find the estimated slopes are now 1 and 5, which would give the same predicted value, but the signs and magnitudes are different from the prior time. Again, this isn’t a technically or mathematically rigorous example, but hopefully it shows the point of what they mean: the predictions are generally unaffected, and R-squared and the F-stat are unchanged by multicollinearity, but interpreting the slopes individually may not be a great idea because they’re unstable. This is not the same as bias or inconsistency.
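To make that concrete, here is a small simulation of my own (made-up numbers: true slopes of 3 and 3 on nearly identical predictors) showing the individual slopes bouncing around from sample to sample while their sum stays near 6:

```python
import numpy as np

def fit_once(seed):
    """Draw one sample with two highly collinear predictors and fit OLS."""
    rng = np.random.default_rng(seed)
    n = 100
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=0.05, size=n)   # x2 ~ x1, nearly collinear
    y = 3 * x1 + 3 * x2 + rng.normal(size=n)   # true slopes are 3 and 3
    X = np.column_stack([np.ones(n), x1, x2])
    return np.linalg.lstsq(X, y, rcond=None)[0]

for seed in range(3):
    b = fit_once(seed)
    # Individual slopes are unstable across samples, but their sum stays near 6.
    print(f"sample {seed}: b1={b[1]: .2f}, b2={b[2]: .2f}, b1+b2={b[1] + b[2]: .2f}")
```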

If you’re looking to build a portfolio after interpreting each beta estimate, then that might not be a great idea, but I’m sure it depends. I haven’t had to do this in practice, but the regression concepts are still the same (and again, the CFAI doesn’t do a great job of conveying accurate information on this stuff).

I mean that, due to multicollinearity, the model risk of the regression model will increase.

I am not sure if I understood your explanation fully and correctly, but I did find it inspiring.

So I personally think:

The regression model R_DC = f(R_FC, R_FX) could be maintained, with the premise that cor(R_FC, R_FX) exists.

To improve the model (since multicollinearity appears):

R_DC = f(R_i) or R_DC = f(R_i, cor(R_FC, R_FX)), where i = FC, FX.

Make sure the correlation effect of the two variables, FC and FX, has been considered while constructing and re-balancing the portfolio, which means, for example, hedging slightly more when losses occur.

Besides, the poor teaching you mentioned is actually in the Quantitative Methods portion of the CFA Level II curriculum.

Thanks.

What do you mean by “the model risk”? Do you mean the variance/standard deviation of the error term (residual variance/standard deviation)? If so, it is not true that multicollinearity increases it (it typically decreases it). It is the variance (and therefore the standard error) of the estimated coefficients of the collinear variables that should increase, but not the overall model error variance. This is easily seen by looking at the formula for each kind of statistic. If this is not what you meant, then what do you mean by “model risk of the regression”?
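For reference, the standard textbook expression for the variance of an individual slope estimate makes the distinction explicit (this is the usual formula from applied regression texts like Wooldridge, not anything CFAI-specific):

```latex
\operatorname{Var}(\hat{\beta}_j) \;=\; \frac{\sigma^2}{\mathrm{SST}_j \,\bigl(1 - R_j^2\bigr)}
```

Here σ² is the residual variance, SST_j is the total variation in X_j, and R_j² is the R-squared from regressing X_j on the other predictors; the factor 1/(1 − R_j²) is the variance inflation factor (VIF). Collinearity pushes R_j² toward 1 and blows up the coefficient’s variance without touching σ² itself.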

What I am saying here is that it may be hard to rely on the estimated coefficients individually in the presence of multicollinearity, but the weighted sum should be okay. So I wonder how this would play out in practice when you pick hedge ratios based on the volatile individual coefficients, because the weighted sum is not necessarily volatile.

Right, the CFAI doesn’t have a specific QM section for LIII, if I recall correctly, but I highly doubt that they suddenly understand their statistical theory and application when they clearly don’t at LI and LII (in general).

I mean that multicollinearity makes the coefficients unreliable. And since the standard errors of the slope coefficients are artificially inflated, there is a greater probability that one will incorrectly conclude that a variable is not statistically significant (the probability of a Type II error increases).

To correct for multicollinearity, one needs either to omit one or more of the correlated independent variables, or to use statistical procedures like stepwise regression, which systematically removes variables from the regression until multicollinearity is minimized.

What I wrote here was nothing more than a sign indicating that one or more variables should be omitted.

They’re only unreliable in the sense that you shouldn’t look at them and try to describe the relationship between that variable and the dependent variable, because they’re unstable estimates; however, they remain unbiased. The loss of efficiency is demonstrated by higher Type II error rates for individual significance tests on the involved variables’ coefficients, as you noted.

I would first say the goal of stepwise is not to minimize multicollinearity (although it can help reduce MC); the stopping criterion is based on p-values, which may relate indirectly to the degree of collinearity, but the variables selected by stepwise are not orthogonal, so MC is not minimized (which is possible with other methods).

If you’re doing this in practice, I would read a real applied regression book (by someone with a PhD in stats); the CFA curriculum is trash. If there is multicollinearity, you need to first determine if it’s even detrimental to your use of the model, which it might not be in your case (i.e., for prediction versus inference; in the case of interpreting beta estimates, then yes, it’s probably important to do something about MC). However, the reason for getting a real stats book is that the CFAI is garbage with stats; there are far better and more appropriate methods for dealing with MC. Dropping one or a few of the highly collinear variables may be appropriate in a few circumstances but is generally a silly approach to blanket as “the answer to MC.”

Stepwise regression is also pretty trash, and there is tons of research that supports this sentiment. The literature also demonstrates why stepwise is basically something that belongs in a museum: fun to look at and understand, but it shouldn’t be used today. Stepwise is incredibly unreliable for picking out the real variables from a list of potential predictors, and this is shown easily with simulations. You mentioned Type II error rates before; stepwise regressions practically guarantee at least one Type I error, and often many Type II errors, and this is evident if you have an understanding of how the stepwise procedures (forward with or without checking, backward, etc.) work to select variables. Almost the only way stepwise can be a useful tool for reducing the number of predictors is by using backward stepwise from a saturated model with alpha = 0.5 (yes, at least 50% for alpha).

There are far better tools; I suggest you read into regularization/penalization methods such as ridge regression or the LASSO for helping select variables, as sketched below. Principal components analysis (very different from factor analysis, despite the common misconception) is also helpful for determining how to handle highly collinear predictors. There is tons of statistical literature now that shows why stepwise is almost always the wrong answer (there are scenarios where it might be useful, but most nonstatisticians don’t use it appropriately).
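As a rough illustration of the regularization route (a sketch on simulated data; the alpha values here are arbitrary, and in practice you would tune them with cross-validation, e.g. scikit-learn’s RidgeCV/LassoCV): ridge keeps both collinear predictors but shrinks and stabilizes their coefficients, while the LASSO will often zero out one of a nearly duplicated pair.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(42)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)   # nearly collinear with x1
y = 3 * x1 + 3 * x2 + rng.normal(size=n)   # true slopes are 3 and 3
X = np.column_stack([x1, x2])

for name, model in [("OLS", LinearRegression()),
                    ("Ridge(alpha=1.0)", Ridge(alpha=1.0)),
                    ("Lasso(alpha=0.1)", Lasso(alpha=0.1))]:
    model.fit(X, y)
    # OLS splits the total effect erratically; the penalized fits stabilize it.
    print(f"{name:18s} coefs = {np.round(model.coef_, 2)}")
```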

“Should” is a strong word, especially considering that omitting the variable is often neither necessary nor the optimal way of addressing the problem (in general terms, because, of course, we can find times when it is the best option).

Even though multicollinearity does not affect the consistency of the slope coefficients, the coefficients themselves tend to be unreliable. And since the standard errors of the slope coefficients are inflated, I don’t think their estimates will be unbiased.

Based only on what I know, principal components analysis should help; I don’t know much about the others you mentioned.

Generally speaking, with multicollinearity there is only a loss of efficiency. The CFAI should be embarrassed about how they “teach” this material, because they have even tried to argue that multicollinearity causes bias in the OLS beta estimates (they also tried to claim bias in the F-stat and R-squared). Only later, after having the author of their QM book weigh in, did they back down and agree with me, despite my having provided an informal proof and several references to papers, texts, and websites (they’re not very bright when it comes to this stuff, it seems).

The “unreliable” nature of the coefficients is because they become unstable; the signs and magnitudes may be different than anticipated, and removing another variable or changing the sample size may cause large shifts in the values of the estimates. However, in the statistical sense of unbiased, the coefficients are still unbiased estimates. There is no calculable bias in the coefficients, because all assumptions required for unbiased and consistent estimation of the betas are still met.

A short resource from Notre Dame: page 3 under “consequences of multicollinearity” might be a good place to start.

Wikipedia has a short note on this as well, toward the end of the “consequences” section.

Picking up any regression text (general statistics, or econometrics to fit the context, such as Wooldridge or Greene) will usually give you a section on the regression assumptions required for consistent and unbiased estimates. You would then see that multicollinearity doesn’t violate any of those assumptions; therefore, assuming those are met elsewhere, the estimation is unbiased and consistent. You could also prove that it’s unbiased (the expected difference between the estimate (beta hat) and the true value of beta is zero). Alternatively, you could simulate some data for a regression model with two collinear independent variables and calculate the average difference between the estimates and the truth you set in the simulation.
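Here’s a minimal sketch of that simulation check (made-up true slopes and collinearity level): average the slope estimates over many replications and compare them to the truth.

```python
import numpy as np

rng = np.random.default_rng(7)
true_beta = np.array([3.0, 3.0])
n, reps = 100, 5000
estimates = np.empty((reps, 2))

for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = x1 + rng.normal(scale=0.05, size=n)    # strong collinearity
    y = x1 * true_beta[0] + x2 * true_beta[1] + rng.normal(size=n)
    X = np.column_stack([np.ones(n), x1, x2])
    estimates[r] = np.linalg.lstsq(X, y, rcond=None)[0][1:]

# Average estimate minus truth is ~0 (unbiased), even though each sample's
# individual estimates are highly variable (inefficient).
print("mean bias:", np.round(estimates.mean(axis=0) - true_beta, 3))
print("std of estimates:", np.round(estimates.std(axis=0), 2))
```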

Well, I never expected this topic could be extended so much. Thank you for your reply; just typing it must have taken you a lot of time. The CFAI puts the cup away after taking only a tiny sip; it doesn’t say much about the quantitative part. Obviously one needs to learn more by self-study.

Exactly that :+1:

Based on previous discussions with the CFAI about some of this material, I’d bet they just don’t know some of it (because they’ve told me as much in those discussions). I agree with you that self-study for statistics is better than relying on the CFAI books. It’s irrelevant that it’s for “econometrics,” because that’s just statistics applied to economics problems and data sets, which you can do when you understand the stats well.