I was revisiting the quant topics (Schweser book 1) today and several questions emerged.

1. How do I get P(ABC) without assuming independence? Should I simply break the term up into P(AX), where X is the joint event BC? Or is something more involved needed to get the simultaneous probability of more than two events?

2. Is the expected-value property E(X^2) ≠ [E(X)]^2 (book 1, p. 139) useful for tests of randomness? That is, if the property does not hold, can I assume the variable is random? I somehow thought something like an autocorrelation test would need to be performed…

3. The figures in Schweser (e.g. book 1, p. 219 ff.) lack axis labels. I assume all these density-function graphs use frequency (y-axis) and periods (x-axis), right? Related to that: is the area under the curve produced by these density functions the probability that the event occurs at any given period after t = 0?

4. What exactly are the benefits of generalized Pareto distributions (GPD) over generalized extreme value (GEV) distributions? Is there any accuracy or computational benefit from the linear approximation property mentioned (book 1, p. 225)?

5. Schweser mentions (book 1, p. 225) that new distributions may be created. That's nice. This is essentially the same as "fitting" a model to better match observed events by introducing a new factor into that model, right?

6. Regarding GARCH parameter estimation (book 1, p. 267): how is overfitting controlled in this process? That is, how do I make sure maximum likelihood does not simply match the model perfectly to the past, reducing its predictive power for the future?

7. Is it possible to run GBM, or any Monte Carlo simulation at all, for non-normally distributed variables, and if so, how? Schweser describes the formulas (book 1, p. 272) only for normally distributed random variables. Oh, and what can I do if the variable is not random? Not use statistical simulations?
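To make the first question concrete, the decomposition I have in mind is the chain rule, which needs no independence assumption:

```latex
P(A \cap B \cap C) = P(A)\,P(B \mid A)\,P(C \mid A \cap B)
```

So defining X = B ∩ C and writing P(A ∩ X) = P(X) P(A | X) seems consistent with that, but it still leaves the conditional probabilities to be found.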
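On the second question, here is a quick numerical sketch of what I mean (my own illustration, not from the book). As far as I understand, E(X^2) − [E(X)]^2 is just the variance, so it seems to measure dispersion rather than randomness over time:

```python
import random

# E[X^2] - (E[X])^2 equals Var(X), which is zero only for a constant
# (degenerate) variable -- it says nothing about randomness in the
# time-series sense (that is what an autocorrelation test is for).
random.seed(42)
xs = [random.gauss(0.0, 1.0) for _ in range(100_000)]

e_x = sum(xs) / len(xs)
e_x2 = sum(x * x for x in xs) / len(xs)
sample_var = e_x2 - e_x ** 2   # close to 1.0 for a standard normal

# For a constant, E[X^2] == (E[X])^2 holds exactly.
const = [3.0] * 10
e_c = sum(const) / len(const)
e_c2 = sum(c * c for c in const) / len(const)

print(sample_var, e_c2 - e_c ** 2)
```

So the property failing to hold only tells me the variable is not constant, if I read this right.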
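And for the seventh question, here is a minimal sketch of what I imagine (function and parameter names are my own; Student-t is chosen only as an example of a fat-tailed driver): replace the normal shock in the discretized GBM recursion with a draw from any distribution one can sample from, though strictly speaking the result is then no longer GBM:

```python
import math
import random

def simulate_path(s0, mu, sigma, n_steps, dt, df=5, seed=0):
    """GBM-style price path driven by Student-t shocks instead of
    normal ones. Monte Carlo only needs a way to *sample* the driving
    distribution; the closed-form GBM formulas assume normality, but
    the simulation itself does not."""
    rng = random.Random(seed)
    # Rescale t draws to unit variance so sigma keeps its meaning (df > 2).
    scale = math.sqrt((df - 2) / df)
    s = s0
    path = [s]
    for _ in range(n_steps):
        # Student-t sample: standard normal / sqrt(chi-squared / df)
        z = rng.gauss(0.0, 1.0)
        chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        t_shock = z / math.sqrt(chi2 / df) * scale
        s *= math.exp((mu - 0.5 * sigma ** 2) * dt
                      + sigma * math.sqrt(dt) * t_shock)
        path.append(s)
    return path

path = simulate_path(100.0, 0.05, 0.2, n_steps=252, dt=1 / 252)
print(len(path), min(path) > 0)
```

The exponential update keeps prices positive regardless of the shock distribution, which is the part I would want to preserve.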
Any answers, partial answers, suggestions or references are greatly appreciated.