Math question. Using CDF to find expected value.

Maybe someone here will know.

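If you are watching a soccer game and the bookie lists odds for the number of passes a certain player will make, the figures are usually cumulative: each one is the probability of at least that many passes. (Strictly speaking that is the complementary CDF, 1 minus the cumulative distribution function, but it carries the same information.)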

  • 20+ passes: 73.5%
  • 24+ passes: 55.5%
  • 28+ passes: 40%
  • 32+ passes: 27.7%
  • 36+ passes: 18.18%
  • 40+ passes: 13.3%

From the above you can deduce the following:

  • 1-19 passes: 26.5% (1-.735)
  • 20-23 passes: 18% (.735-.555)
  • 24-27 passes: 15.5% (.555-.40)
  • 28-31 passes: 12.3% (.4-.277)
  • 32-35 passes: 9.52% (.277-.1818)
  • 36-39 passes: 4.88% (.1818-.133)
  • 40+ passes: 13.3%
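
The subtraction above can be sketched in a few lines of Python. This is only an illustration: the name `bucket_probabilities` is made up for this example, and it assumes the lowest bucket starts at 0 passes and that the quoted percentages are true probabilities (no bookmaker margin baked in).

```python
# P(passes >= k) for each threshold k, as quoted in the post.
survival = {20: 0.735, 24: 0.555, 28: 0.40, 32: 0.277, 36: 0.1818, 40: 0.133}

def bucket_probabilities(survival):
    """Turn 'at least k' probabilities into (low, high, probability) buckets.

    high is None for the open-ended top bucket. Assumes the bottom
    bucket starts at 0 passes (an assumption, not stated in the post).
    """
    thresholds = sorted(survival)
    first, last = thresholds[0], thresholds[-1]
    # Everything below the first threshold.
    buckets = [(0, first - 1, 1.0 - survival[first])]
    # Each closed bucket is the drop between two neighbouring thresholds.
    for lo, hi in zip(thresholds, thresholds[1:]):
        buckets.append((lo, hi - 1, survival[lo] - survival[hi]))
    # Open-ended top bucket keeps its quoted probability.
    buckets.append((last, None, survival[last]))
    return buckets

buckets = bucket_probabilities(survival)
for lo, hi, p in buckets:
    label = f"{lo}+" if hi is None else f"{lo}-{hi}"
    print(f"{label} passes: {p:.2%}")
```

The bucket probabilities should add up to 100%, which is a quick sanity check on the quoted odds.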

To find the expected value, would you take the median (midpoint) value of each range and multiply it by the probability, or would you take the left edge value?

For example:

(using median): (10*.265) + (21.5*.18) + (25.5*.155) + (29.5*.123) + (33.5*.0952) + (37.5*.0488) + (40*.133) = 24.44

(using left): (0*.265) + (20*.18) + (24*.155) + (28*.123) + (32*.0952) + (36*.0488) + (40*.133) = 20.89
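
For anyone who wants to check the arithmetic, here is a small Python sketch of both sums. The representative values (10 for the bottom range, 40 for the open-ended 40+ range) are the ones used in the post; they are choices, not facts about the player. The function name `weighted_sum` is just for this example.

```python
# (left edge, midpoint used in the post, probability) for each range.
# The 40+ range has no real midpoint, so 40 is used for both (an assumption).
ranges = [
    (0, 10.0, 0.265),
    (20, 21.5, 0.18),
    (24, 25.5, 0.155),
    (28, 29.5, 0.123),
    (32, 33.5, 0.0952),
    (36, 37.5, 0.0488),
    (40, 40.0, 0.133),
]

def weighted_sum(values, probs):
    """Sum of value * probability, i.e. the expected value of a discrete variable."""
    return sum(v * p for v, p in zip(values, probs))

probs = [p for _, _, p in ranges]
ev_mid = weighted_sum([mid for _, mid, _ in ranges], probs)
ev_left = weighted_sum([left for left, _, _ in ranges], probs)

print(f"midpoint estimate:  {ev_mid:.2f}")
print(f"left-edge estimate: {ev_left:.2f}")
```

This prints about 24.44 for the midpoint version and about 20.89 for the left-edge version (20.8872 before rounding).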

The method using the median value doesn't seem to work out logically… based on the logic, shouldn't the expected value fall just below the 24 mark (since that is where the 55.5% probability is)?

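If there’s a 55.5% chance of 24+ passes, then more than half of the outcomes are at 24 or above, so the median is at least 24. That doesn't strictly pin down the average, but with the 40+ games pulling it up, wouldn’t you expect the average to be above 24 too?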
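
To make that concrete, here is a rough Python sketch that finds which range the median falls in, straight from the "at least k" odds. The helper name `median_bracket` is made up for this example, and it assumes the quoted probabilities never increase as the threshold goes up.

```python
# P(passes >= k) for each threshold k, as quoted in the original post.
survival = {20: 0.735, 24: 0.555, 28: 0.40, 32: 0.277, 36: 0.1818, 40: 0.133}

def median_bracket(survival):
    """Return (lo, hi) such that lo <= median < hi.

    lo is the last threshold still reached at least half the time,
    hi is the first threshold reached less than half the time.
    Either can be None if the median lies outside the quoted thresholds.
    """
    lo = None
    for k in sorted(survival):
        if survival[k] >= 0.5:
            lo = k
        else:
            return lo, k
    return lo, None

median_lo, median_hi = median_bracket(survival)
print(f"median is at least {median_lo} and below {median_hi}")
```

With these odds the median lands in the 24-27 range, since 24+ happens 55.5% of the time but 28+ only 40% of the time.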

Anyways, the first approach (multiply the percentage by the arithmetic average of each range) is definitely the right one, and I guess 24.44 is a good rough approximation of what it is. Mind you, this approach assumes that any integer within a range is equally likely as another integer within the range, which given what we’re talking about, seems like a relatively safe assumption. One caveat: the 40+ range has no upper end, so counting it as exactly 40 makes the estimate a little low.
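
To see how much the open-ended 40+ range matters, here is a quick Python sketch that recomputes the midpoint estimate for a few assumed values of that range. The values 42 and 45 are purely hypothetical, picked for illustration; nothing in the odds says what the real figure is. The name `expected_passes` is also just for this example.

```python
# (midpoint, probability) for the closed ranges, as used in the midpoint sum.
closed_ranges = [
    (10.0, 0.265),
    (21.5, 0.18),
    (25.5, 0.155),
    (29.5, 0.123),
    (33.5, 0.0952),
    (37.5, 0.0488),
]
TAIL_PROB = 0.133  # probability of 40+ passes

def expected_passes(tail_value):
    """Midpoint estimate when the 40+ range is represented by tail_value."""
    closed_part = sum(mid * p for mid, p in closed_ranges)
    return closed_part + tail_value * TAIL_PROB

# 40 reproduces the estimate above; 42 and 45 are hypothetical guesses.
for tail_value in (40, 42, 45):
    print(f"40+ counted as {tail_value}: {expected_passes(tail_value):.2f}")
```

Each extra pass assumed for the 40+ range adds 0.133 to the estimate, so the answer moves slowly but it only moves upward from 24.44.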

Sounds good! thx, that makes a lot of sense :slight_smile: