# Thesis question: hedge fund returns data

I have been asked to submit an unofficial thesis proposal which, if accepted, may serve as the basis for my thesis later on. The topic I am thinking about is how hedge fund alphas vary with volatility. The hypothesis would be that hedge funds outperform the markets in times of higher volatility. However, I am unsure whether (if I decide to write my thesis on this topic) I would be able to find enough data on hedge fund returns. The quantitative analysis is what bothers me most. I have intermediate skills in Excel, but I was told that Matlab would be better. I don’t really know how to work with rocket science software, so I have to stick with Excel. I would appreciate it if someone could give me their thoughts on the problem and on how difficult it is to find returns data. What is the best source of hedge fund information that I could use for this purpose? Thanks for the help!

Is the thesis for a bachelor’s or a master’s? Matlab is pretty easy to use once you get the hang of it; you should take the time to learn it. You would also need the econometrics toolbox available here: http://www.spatial-econometrics.com/. You could also do this in Stata or any other econometrics package. One method would be to create 3-5 dummy variables for like high, medium, low volatility and multiply each of them by returns. You could get the Fama/French factors from Ken French’s website and also look at how the funds’ exposure to style factors changes in periods of volatility. Depending on the level of the thesis, I would probably focus on long/short equity hedge funds. To get an alpha you necessarily have to regress the returns against beta factors; if you’re comparing CTA and arb funds, you’re gonna need to regress them against their own style factors as well (there’s a paper on this, but I don’t recall the cite; it has to do with replicating hedge fund returns and how all hedge funds can be modeled by like 7 or 11 factors). That would get too messy. Anyway, there are several hedge fund indices out there and not all of them contain the same funds. There’s plenty of bias in how they are created. You should check more of the literature for what exactly is used. One paper that you should check to start with is “Unbundling hedge fund beta” by Rodrigo, Giamouridis, Mesomeris, and Noorizadeh. Not only can you check the bibliography for some lit suggestions, but it is highly relevant to what you’re looking at. Basically, hedge funds collectively add more beta if they think the market will go up, and cut it if they think it will go down.
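To make the regime idea concrete, here is a minimal sketch in plain Python. It sorts months into low, medium, and high volatility buckets and fits a separate alpha and beta in each bucket, which gives the same coefficients as one regression where regime dummies are interacted with both the intercept and the market return. Everything here is an assumption for illustration: the numbers are made up, a single market factor stands in for the Fama/French factors, and the volatility series could be whatever measure you settle on.

```python
def simple_ols(x, y):
    """Return (intercept, slope) from regressing y on a single regressor x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def alpha_by_regime(vol, mkt, fund, n_regimes=3):
    """Sort months by volatility, cut into equal buckets, fit alpha/beta in each."""
    order = sorted(range(len(vol)), key=lambda i: vol[i])
    size = len(order) // n_regimes
    fits = []
    for k in range(n_regimes):
        start = k * size
        # The last bucket absorbs any leftover months.
        end = len(order) if k == n_regimes - 1 else start + size
        idx = order[start:end]
        fits.append(simple_ols([mkt[i] for i in idx], [fund[i] for i in idx]))
    return fits  # [(alpha, beta) for low vol, ..., (alpha, beta) for high vol]


# Made-up monthly data: volatility level and market excess return.
vol = [0.10, 0.11, 0.12, 0.13, 0.20, 0.21, 0.22, 0.23, 0.35, 0.36, 0.37, 0.38]
mkt = [0.02, -0.01, 0.03, 0.01, 0.04, -0.03, 0.02, -0.02, -0.06, 0.05, -0.04, 0.03]

# Hypothetical "true" (alpha, beta) per regime, used only to build the fund series.
true_params = [(0.001, 0.6)] * 4 + [(0.002, 0.4)] * 4 + [(0.005, 0.2)] * 4
fund = [a + b * m for (a, b), m in zip(true_params, mkt)]

regime_fits = alpha_by_regime(vol, mkt, fund)
for name, (alpha, beta) in zip(["low", "mid", "high"], regime_fits):
    print(f"{name:>4} vol: alpha={alpha:.4f}, beta={beta:.2f}")
```

With real index returns you would want excess returns over the risk-free rate and many more months per bucket than this toy series has.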

There’s plenty of information out there, and MAR or Barclays can get you the data for not too much money. But why would you want to do a statistics thesis when you are bothered by quantitative analysis and using computer programs? I feel like I could dictate that thesis, but you would need to feel comfortable doing statistical modelling. “One method would be to create 3-5 dummy variables for like high, medium, low volatility and multiply each of them by returns” - shouldn’t do that, jmh. Why lose data by categorizing something that has ratio-scale measurements?
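One way to keep volatility on its original scale, sketched below under my own assumptions rather than as anyone’s recommended model: let alpha be a linear function of volatility, so the regression is fund = a0 + a1*vol + beta*mkt + error, and a1 measures how alpha moves with volatility. The data are made up, and the small `ols` helper is just a hand-rolled least-squares solver so the example has no dependencies.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Made-up monthly data: volatility level and market excess return.
vol = [0.10, 0.15, 0.20, 0.30, 0.25, 0.12, 0.40, 0.18]
mkt = [0.02, -0.01, 0.03, -0.05, 0.01, 0.04, -0.08, 0.00]

# Hypothetical fund whose alpha rises with volatility: alpha_t = 0.001 + 0.05 * vol_t.
fund = [0.001 + 0.05 * v + 0.4 * m for v, m in zip(vol, mkt)]

# Regressors: constant, volatility (continuous, no bucketing), market return.
X = [[1.0, v, m] for v, m in zip(vol, mkt)]
a0, a1, beta = ols(X, fund)
print(f"alpha_t = {a0:.4f} + {a1:.4f} * vol_t, beta = {beta:.2f}")
```

The test of the hypothesis is then whether a1 is reliably positive, which on real data needs standard errors that this sketch does not compute.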

Well it depends on what kind of volatility measure you use. You’re probably right.

Thanks, guys, for the input! But I’ve been assigned to write a research proposal on “the interplay between housing and mortgage markets”. It is way less technical than the hedge funds topic. I may continue my thesis on this proposal or write another one, but only after I write this one. If I want to prove that the housing bubble led to a boom in mortgage lending and not the other way round (mortgage lending leading to a boom in the housing market), what do you guys think is the best econometric method to employ? Theoretically, I know the subject matter, but the empirical part is yet to be figured out… I simply don’t know how regression analysis would be used to determine which one is the true dependent variable and which one is the explanatory one.

Well you’d be doing simultaneous equation modeling… So you’d need at least one measurable variable that affects mortgages but not housing and at least one that affects housing but not mortgages. Not sure what those variables would be off hand, but that’s what you’d be looking for. The other way to do it is to look at timing. If mortgages move before housing, that would tend to indicate the direction of causality, and vice versa. The housing data probably doesn’t come at comparable intervals to the mortgage data, though.
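A toy illustration of why that excluded variable matters, using the simplest instrumental-variable estimator (one instrument, one regressor). All series are invented: `z` stands for a hypothetical shifter of mortgage lending that does not touch house prices directly, and `u` is an unobserved shock hitting both markets. This only covers one equation of the system (mortgages to house prices); the reverse direction would need its own instrument.

```python
def cov(a, b):
    """Sample covariance (dividing by n; the scaling cancels in the ratios below)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


# Made-up, demeaned series, built so that z and u are uncorrelated in sample.
z = [1, -1, 1, -1, 1, -1, 1, -1]    # hypothetical instrument for mortgage lending
u = [1, 1, -1, -1, 1, 1, -1, -1]    # common shock (the feedback problem)

mortgages = [zi + ui for zi, ui in zip(z, u)]
true_effect = 2.0                   # assumed effect of mortgages on house prices
house_prices = [true_effect * m + 3 * ui for m, ui in zip(mortgages, u)]

# Plain regression slope: contaminated because u moves both series.
ols_slope = cov(mortgages, house_prices) / cov(mortgages, mortgages)

# IV slope: uses only the variation in mortgages that comes from z.
iv_slope = cov(z, house_prices) / cov(z, mortgages)

print(f"OLS slope: {ols_slope:.2f}, IV slope: {iv_slope:.2f}, truth: {true_effect:.2f}")
```

The hard part in practice is the one named above: finding a real variable that plausibly plays the role of `z`.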

HFRI indices; start there. I work on those every quarter.

They provide monthly data for free. HFR is well recognized in the industry.

I did my thesis on the effect of predatory lending laws on the boom in subprime mortgage lending. I used Loan Performance Corporation data (courtesy of one of my partners who worked at Goldman) by state and by month, and then ran a fixed effects panel regression. Since there is so much variation between states, if you can obtain data on a state-by-state basis and run a panel estimation you’ll probably be better off. One difficulty is that the variations in housing markets are more localized, but the mortgage market is largely a national one. As bchadwick notes, there are feedback loops between the national mortgage market and regional housing markets that could require a simultaneous equations model. I think you need to make sure you get your story straight. A housing boom certainly would be coincident with a boom in mortgage lending, so it would be difficult to tease out which happened first given these feedback loops. My story for the housing crisis is that you have macro factors and what I would call micro factors. The macro factor is that the central bank took interest rates down to 1% following 9/11 and held them too low for too long. This is combined with specific micro factors that directed the investment into the housing sector, such as the tax-deductibility of mortgage interest, the growth in securitization, the easing of lending standards, etc. When it comes down to it, which came first: the boom in mortgage lending or the boom in housing prices? It doesn’t seem that relevant to the most important parts of the story to me. And the causation among the factors in my story isn’t exactly easy to test.
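For anyone unsure what the fixed effects part buys you, here is a stripped-down sketch of the within estimator with state effects only (no month effects, one regressor). It is not the regression from my thesis; the states, numbers, and variable names are invented for illustration. Each state gets its own level, and the slope is estimated only from movements around each state’s own average.

```python
from collections import defaultdict


def within_estimator(groups, x, y):
    """Slope of y on x after subtracting each group's own means (group fixed effects)."""
    by_group = defaultdict(list)
    for g, xi, yi in zip(groups, x, y):
        by_group[g].append((xi, yi))
    sxx = sxy = 0.0
    for obs in by_group.values():
        mean_x = sum(o[0] for o in obs) / len(obs)
        mean_y = sum(o[1] for o in obs) / len(obs)
        for xi, yi in obs:
            sxx += (xi - mean_x) ** 2
            sxy += (xi - mean_x) * (yi - mean_y)
    return sxy / sxx


# Made-up panel: three states, three periods each.
states = ["CA"] * 3 + ["TX"] * 3 + ["OH"] * 3
lending_growth = [0.10, 0.12, 0.14, 0.04, 0.05, 0.07, 0.01, 0.02, 0.03]

# Hypothetical state-specific levels, correlated with how much lending each state sees.
state_effect = {"CA": 0.05, "TX": 0.01, "OH": -0.02}
price_growth = [state_effect[s] + 1.5 * x for s, x in zip(states, lending_growth)]

within_slope = within_estimator(states, lending_growth, price_growth)

# Treating every observation as one group gives the pooled slope for comparison.
pooled_slope = within_estimator(["all"] * len(states), lending_growth, price_growth)

print(f"within (fixed effects) slope: {within_slope:.2f}")
print(f"pooled slope: {pooled_slope:.2f}")
```

In this toy panel the pooled slope overstates the relationship because high-lending states also have high price growth for unrelated reasons, which is exactly the between-state variation the fixed effects absorb.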

I found a paper about the relationship between housing and mortgages, and the authors determined the direction of causality by running VAR models (which, as far as I can remember, are similar to simultaneous equations). It’s not that hard, or at least I think it’s not. Jmh530, I agree, it’s too specific, and I am not sure I’ll be able to deliver 40-50 pages of writing on that stuff (as I see it now, 15-20 pages is the theoretical maximum). In fact, I may try to deviate and extend the coverage to include “the interplay between the macro economy, the housing market and the market for mortgages”, or even the real sector, but with the latter it will get too messy. I also plan to cover the US housing and mortgage markets as a whole without differentiating between states. I realize that choosing a topic is the most difficult part of writing a thesis!
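For what it is worth, the kind of test usually run on top of a VAR is a Granger-style check: does last period’s value of one series help predict the other series beyond that series’ own lag? Below is a one-lag sketch on simulated data where housing is built to lead mortgages. This is my own illustration, not the method of the paper I mentioned, and it only produces F-statistics without p-values.

```python
import random


def ols_ssr(X, y):
    """Fit y on the columns of X by least squares; return the sum of squared residuals."""
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    coef = [A[i][k] / A[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(coef, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def granger_f(cause, effect):
    """F-statistic for adding lag-1 `cause` to a regression of `effect` on its own lag."""
    y = effect[1:]
    restricted = [[1.0, effect[t - 1]] for t in range(1, len(effect))]
    unrestricted = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, len(effect))]
    ssr_r, ssr_u = ols_ssr(restricted, y), ols_ssr(unrestricted, y)
    dof = len(y) - 3  # observations minus parameters in the unrestricted model
    return (ssr_r - ssr_u) / (ssr_u / dof)


# Simulated series in which housing moves first and mortgages follow one period later.
rng = random.Random(0)
n = 200
housing = [rng.gauss(0, 1) for _ in range(n)]
mortgages = [0.0]
for t in range(1, n):
    mortgages.append(0.8 * housing[t - 1] + 0.2 * rng.gauss(0, 1))

f_housing_to_mortgages = granger_f(housing, mortgages)
f_mortgages_to_housing = granger_f(mortgages, housing)
print(f"housing -> mortgages F: {f_housing_to_mortgages:.1f}")
print(f"mortgages -> housing F: {f_mortgages_to_housing:.1f}")
```

A large F in one direction and a small one in the other says that one series helps forecast the other, which is weaker than proving economic causation.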

I think bchadwick means structural equation modelling (like LISREL and all that nonsense), yes? 2x2 - I think you can easily write 40-50 pages on the interrelationship of housing markets and mortgage markets.
1) Bag this idea of establishing causality empirically. It’s a morass, and the best you will get is that a change in one is “more” responsible for a change in the other than vice versa, which is a totally useless result.
2) Divide up “mortgages” into chunks - availability of firsts due to credit, interest rates, structures (e.g., teaser, negative-amortization loans), firsts vs seconds, requirements for down payment or equity, etc.
3) Divide up the housing market into economic chunks - “conforming” properties, > \$10M, > \$1M, house trailers, etc.
4) Divide up housing by geography - states, suburbs vs cities vs rural, …
5) Determine what influences mortgage rates that has nothing to do with housing (bonds, the Fed, credit spreads, govt funding).
6) Determine what influences the value of housing that has nothing to do with mortgages (economic growth, inflation, taxation of housing, availability of rental property).
Frankly, I think I could spew out 40 or 50 pages just by typing.

Joey, thanks for the nice ideas! But let me see if I understand correctly. Do you mean I should determine what influences mortgages (besides housing) and housing markets (besides mortgages), and to what extent, for each of the above-mentioned categories? Something like a comparative analysis?

Good quant research follows from a good narrative story. It supports it and drives it. For example: “Most mortgages are long-term loans. That means that mortgage interest rates only directly affect the value of the housing to the extent that the housing can be (or is) refinanced or is newly financed. Of course, not everybody who can refinance their home does so, because of expense, lack of financial savvy, or limited rewards from refinancing. There are a few easy conclusions from this. First, houses in towns with high turnover should be more affected by mortgage interest rates than houses in towns with low turnover. A liquid housing market with repeated new financing should show the effects of interest rate changes more quickly than towns with low turnover. Here is some analysis to prove it.” “On the other hand, Bubba living in his trailer with the TV he bought at Walmart on layaway can’t refinance and doesn’t even know how he would refinance. Bubba doesn’t know interest rates, and used house trailers are affected much more by the prevalence of tornadoes than by interest rates. Here is the data to show that:” Etc., etc.