Point Estimate...

maratikus · October 18, 2007, 9:52pm

Joey, what about a high school problem. My 15-year old brother game it to me a couple of weeks ago. There are 100 white balls in a bucket. On every step one ball is randomly pulled out from the bucket, painted in black and then put back (if the ball is black, it gets painted anyway). What is the distribution of black balls in the bucket after 100 steps. The answer should be presented as a function of k where k is the number of black balls in the bucket. For example, f(1) = 1/(100)^100, f(100) = 100!/(100)^100 You are supposed to come up with the general formula.

JoeyDVivre · October 19, 2007, 1:38am

You’re right - pretty easy, but you have to be a pretty sharp HS student to do this problem. In fact, I would say that it would be a tough CFA problem. For those who want to mess with it, get rid of the 100 and try it with 5. The technique is the same but numbers like 100! are just inherently confusing. Now for a more difficult version of this question - eventually all the balls will be black. What’s the distribution of the number of draws until they are all black? The exact distribution is kind of a pain (but easy to write a program to give you numbers), but the asymptotic distribution is pretty easy (i.e., as # balls original -> infinity).

cfaisok · October 19, 2007, 1:09pm

cfaboston28 Wrote: ------------------------------------------------------- > Isn’t the answer is A? Talking about dreary’s question?

maratikus · October 19, 2007, 1:51pm

Joey, you are right, most people won’t be able to solve this high school problem. My brother does math tournaments, they are given problems a few weeks in advance to sovle them and then present them in debate-type tournaments. Actually the technique for calculating distribution of the number of draws until all balls are black is the same as solving the initial problem. I generalized the initial problem by adding an extra parameter (number of steps): n balls, k black balls after l steps and came up with the formula for number of combinations S(n,k,l). Then the initial problem solution is S(n,k,n)/n^n and yours S(n,n,l)*NormalizationFactor. The difficulty with solving yours is going to be normalizing by the sum of S(n,n,l) for l = n to infinity, so that the combined probability equals one. I might give it a try.

JoeyDVivre · October 19, 2007, 3:07pm

Actually, you can do that but here’s a really cool thing that is more useful in the end. If you let B(k) = # black balls after draw k, then B(k) is a Markov Chain with a really simple transition matrix. Further it’s got an absorbing state which can be reached from any other state (the state where all balls are black). The cool thing is that for any such Markov chain the distribution of time until absorption is asymptotically exponential with mean = 1/r where r is the spectral radius of the transition matrix (pretty sure that mean is right but not certain). That’s a pretty cool theorem because it means that asymptotically, the number of draws you have made has no relevance in computing the number of draws you have left to do. Thus, if we both start with 2000 balls that we are painting black (we oldsters think that red doors should be painted black) and you have been doing it for two days and have not reached the end, you are probabilistically no closer to the end than I am if I just start. That just doesn’t seem right… For you parents out there - games like Chutes and Ladders are absorbing Markov chains which means the time until absorption is exponential. Mathematica can easily work out the spectral radius of the transition matrix which is also very simple. That means that if you have been playing Chutes and Ladders for two hours you are no closer to thye end than you were when you first started. If you’re just doing this to humor the kid, that can be a pretty depressing thought.

maratikus · October 19, 2007, 3:36pm

That is very elegant, Joey! I see what you are saying. If the initial number of balls is n, then Markov chain can be defined as B(k+1) = B(k) with prob B(k)/n B(k+1) = B(k) + 1 with prob (n-B(k))/n It is obvious that number of draws left doesn’t depend on the number of draws you’ve made from the above equation. Why is the distribution of time until absorption is exponential with mean 1/r, could you give me a link refering to that result? I’d like to read more about it, it’s fascinating.

JoeyDVivre · October 19, 2007, 4:36pm

Hmmm… I learned it from http://www.amazon.com/dp/0471842729?tag=worldcat-20&camp=14573&creative=327641&linkCode=as1&creativeASIN=0471842729&adid=0993H2JJP7X2TEJDWGPB& which doesn’t seem to be available but it’s a really fine book.

maratikus · October 19, 2007, 5:22pm

That’s interesting. I might have to review my stochastic analysis lectures, I’ve forgotten quite a bit and there is a lot to re-learn. It was fun, Joey!