Q: A bottler of iced tea wishes to ensure that an average of 16 ounces of tea is in each bottle. In order to analyze the accuracy of the bottling process, a random sample of 150 bottles is taken. Using a t-distributed test statistic of -1.09 and a 5% level of significance, the bottler should:

A: H_{o}: µ = 16; H_{a}: µ ≠ 16. Do not reject the null since |t| = 1.09 < 1.96 (the large-sample two-sided critical value at the 5% level).

Why is the null hypothesis H_{o}: µ = 16? Shouldn’t it be the other way around (µ ≠ 16)?

This isn’t their position; it’s simply how calculating a p-value or test statistic works. In Frequentist statistical theory (where significance testing comes from), you need to specify some assumed, reasonably specific state of nature. For logical and calculation purposes, that means setting the null hypothesis (the assumed state of nature) equal to some value or model. Once you do this, you can see how much your observed data contradict the assumed state. This is seen in the formula for any test statistic. In general:

some test statistic = (o - e) / se

where o is the observed statistic calculated from the sample, e is the “null” hypothesis value (our expectation if the sample perfectly aligned with our assumed truth), and se is a measure of how much we expect o to vary if we sampled again and again, and again and again…all at the same sample size of n.

The larger the difference between o and e (relative to se), the more our observation contrasts with our expectation.
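To make the formula concrete, here is a minimal sketch in Python. The fill volumes are made up for illustration (the original question does not give the raw data), and `t_statistic` is just a hypothetical helper name. It takes se to be the sample standard deviation divided by the square root of n, which is the usual choice for a one-sample test of a mean.

```python
import math
import statistics

def t_statistic(sample, null_value):
    """Return (o - e) / se for a one-sample test of the mean."""
    o = statistics.mean(sample)                              # observed statistic
    e = null_value                                           # value assumed under the null
    se = statistics.stdev(sample) / math.sqrt(len(sample))   # standard error of the mean
    return (o - e) / se

# Hypothetical fill volumes in ounces (NOT the bottler's actual data).
fills = [15.8, 16.1, 15.9, 16.0, 15.7, 16.2, 15.9, 15.8]

t = t_statistic(fills, null_value=16.0)
print(f"t = {t:.2f}")
```

Notice that the function needs a single number for `null_value`. That is the practical reason the null is written as an equality.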

There’s not an easy way to do this when the null is “less than”, “greater than”, or “not equal to” some value, because then there is no single e to plug into the formula.

The framework is designed for (indirect) inductive inference: I think the true mean is different from 16, so I should say “well, if the true mean were exactly equal to 16, what would we expect and how do the data at hand compare?” The other point is that if we set the null equal to 16 we can say how much the data disagree with this assumption, but we can’t say anything about how much they support it.
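As a rough illustration of "how much the data disagree", here is a sketch that turns the reported statistic of -1.09 into a two-sided p-value and a decision. It assumes the standard normal is an adequate stand-in for the t distribution at n = 150, which is the same approximation behind the 1.96 critical value; the exact t critical value with 149 degrees of freedom is slightly larger (about 1.98), and the conclusion is the same. The function name `two_sided_test` is my own.

```python
from statistics import NormalDist

def two_sided_test(t_stat, alpha=0.05):
    """Two-sided test using the standard normal as a large-sample approximation."""
    z = NormalDist()                              # mean 0, standard deviation 1
    critical = z.inv_cdf(1 - alpha / 2)           # cutoff for |t| at level alpha
    p_value = 2 * (1 - z.cdf(abs(t_stat)))        # probability of a result at least this extreme under the null
    reject = abs(t_stat) > critical
    return critical, p_value, reject

critical, p_value, reject = two_sided_test(-1.09)
print(f"critical value = {critical:.2f}")
print(f"p-value = {p_value:.3f}")
print("reject H0" if reject else "do not reject H0")
```

The p-value comes out well above 0.05, so the data do not disagree strongly with µ = 16. That is not the same as evidence that µ really is 16.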