Central limit theorem

Submariner · December 28, 2014, 10:28pm

We are given a nonnormal distribution with sample size N=100. We are given a range of dollar figures that a company may spend (0, 100). The expenditures form a continuous uniform random variable. The probability function is a horizontal line with a vertical intercept = 1/100. We are told that all possible outcomes are equally likely for a uniform random variable.

We are told to collect 200 random samples of the expenditures, each from 100 companies (n=100). In each trial, 100 values for capital expenditure are generated from the uniform distribution (0, 100). For each random sample, we compute the sample mean, and we conduct 200 simulation trials in total. Because we have specified the distribution generating the samples, we know that the population mean = (0 + 100)/2 = 50.

The population variance = (100-0)^2 / 12 = 833.33

The population standard deviation = root(833.33) = 28.87

The standard error = 28.87 / root(100) = 2.887
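For anyone checking the arithmetic, here is the chain from variance to standard error written out. It assumes only the figures in the problem statement (endpoints 0 and 100, n = 100) and the standard variance formula for a continuous uniform distribution:

```latex
\begin{align*}
\sigma^2 &= \int_{0}^{100} (x - 50)^2 \cdot \frac{1}{100}\, dx
          = \frac{(100 - 0)^2}{12} \approx 833.33 \\
\sigma &= \sqrt{833.33} \approx 28.87 \\
\sigma_{\bar{X}} &= \frac{\sigma}{\sqrt{n}} = \frac{28.87}{\sqrt{100}} \approx 2.887
\end{align*}
```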

To save time and space: the sample means are grouped into intervals of width 1.5, with lower bounds running from 42.5 up to 56 (so the last interval ends at 57.5).

The frequency of results is given; it approximates a bell-shaped distribution, with the highest frequencies centered around 48.5.
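The experiment described above is easy to reproduce. The following is a rough sketch using only the Python standard library; the function names, the seed, and the assumption that the table uses ten intervals of width 1.5 starting at 42.5 are mine, not from the textbook, and the counts will differ from the published table because the random draws differ.

```python
import random
import statistics

def simulate_sample_means(trials=200, n=100, low=0.0, high=100.0, seed=42):
    """Draw `trials` samples of size `n` from Uniform(low, high); return each sample's mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.uniform(low, high) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

def bin_counts(values, start=42.5, width=1.5, bins=10):
    """Count values in [start + k*width, start + (k+1)*width) for k = 0..bins-1."""
    counts = [0] * bins
    for v in values:
        k = int((v - start) // width)
        if 0 <= k < bins:  # values outside 42.5..57.5 are not counted
            counts[k] += 1
    return counts

if __name__ == "__main__":
    means = simulate_sample_means()
    for k, c in enumerate(bin_counts(means)):
        lower = 42.5 + 1.5 * k
        print(f"{lower:5.1f} to {lower + 1.5:5.1f}: {'#' * c} ({c})")
    print("mean of sample means:", round(statistics.fmean(means), 2))
    print("std dev of sample means:", round(statistics.stdev(means), 3))
```

The printed bars should pile up near 50 and thin out toward 42.5 and 57.5, and the standard deviation of the means should land near the standard error of 2.887 rather than the population figure of 28.87.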

So the following are my questions:

I.) We are told that every expenditure in (0, 100) is equally likely. Why does the absolute frequency table contain only values between 42.5 and 57.5?

II.) We are told that we have a nonnormal distribution, but the sample absolute frequency shows a roughly normal (i.e. bell-shaped) distribution. I understand that this is the case for large sample sizes N, but how does it work?

III.) If the distribution is nonnormal, how can the population mean be simply the range divided by 2? Doesn’t the nonnormal distribution affect the way the mean is calculated?

Thanks!

S2000magician · December 29, 2014, 3:45am

I.) The range from 42.5 to 57.5 is a range of μ ± 2.598σ (where σ is the standard error, 2.887), which encompasses 99.06% of a normal distribution; for all practical purposes, you’re not going to get a sample mean outside that range.
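The 2.598 and 99.06% figures can be checked in a few lines. This is a small sketch using the standard library's error function; `normal_coverage` is just an illustrative helper name, and the inputs are the numbers from the original question.

```python
import math

def normal_coverage(z):
    """P(-z < Z < z) for a standard normal Z, via the error function."""
    return math.erf(z / math.sqrt(2))

pop_sd = math.sqrt((100 - 0) ** 2 / 12)  # population standard deviation, about 28.87
std_error = pop_sd / math.sqrt(100)      # standard error for n = 100, about 2.887
z = (57.5 - 50) / std_error              # half-width of the range, in standard errors

print("z:", round(z, 3))
print("coverage (%):", round(100 * normal_coverage(z), 2))
```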

II.) Remember that they’re not saying that a sample from this nonnormal distribution is (approximately) normal; they’re saying that if you take a bunch of (nonnormal) samples from this nonnormal distribution, calculate the mean for each of those samples, and look at the distribution of just those means, _that_ distribution is (approximately) normal. The distributions of the underlying samples should approximate the distribution of the population.
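One way to see the difference between a raw sample and the sample means is to ask what share of values falls within one standard deviation of 50. For a uniform (0, 100) variable that share is about 57.7% (2 × 28.87 / 100); for a normal variable it is about 68.3%. The sketch below is illustrative only, with made-up helper names and an arbitrary seed:

```python
import random
import statistics

def fraction_within_one_sd(values, center, sd):
    """Share of values lying within one standard deviation of the center."""
    return sum(abs(v - center) <= sd for v in values) / len(values)

def raw_sample_and_means(trials=200, n=100, seed=7):
    """Return one raw sample of size n and the means of `trials` such samples."""
    rng = random.Random(seed)
    samples = [[rng.uniform(0, 100) for _ in range(n)] for _ in range(trials)]
    return samples[0], [statistics.fmean(s) for s in samples]

if __name__ == "__main__":
    raw, means = raw_sample_and_means()
    # Raw expenditures: spread is the population sd, 28.87; expect roughly 0.577.
    print("raw sample:  ", fraction_within_one_sd(raw, 50, 28.87))
    # Sample means: spread is the standard error, 2.887; expect roughly 0.683.
    print("sample means:", fraction_within_one_sd(means, 50, 2.887))
```

With only 100 raw values and 200 means, both shares will wobble from run to run, but the raw sample should stay flat-looking while the means behave like a normal variable.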

III.) If the distribution of the sample means is normal, then it’s symmetric, and the mean should be the average of the extremes (which, of course, should be infinite, but as the sample is finite they’ll only be something like ±4σ). If the maximum is, say, μ + 4σ, then you would expect the minimum to be μ − 4σ; the average of these two numbers is μ.
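The same symmetry argument applies to the uniform population itself, which is why the midpoint formula does not depend on normality: any density that is symmetric about a point has its mean at that point (provided the mean exists). As a quick check, assuming the density 1/100 on (0, 100) from the original question:

```latex
\begin{align*}
E[X] &= \int_{0}^{100} x \cdot \frac{1}{100}\, dx
      = \frac{1}{100} \cdot \frac{100^2}{2}
      = 50 = \frac{0 + 100}{2}
\end{align*}
```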