“We would interpret the p-value of 0.00236 as the smallest level of significance at which we can reject a null hypothesis that the population value of the coefficient is 0, in a two-sided test.”

I am confused by this interpretation. Isn’t the p-value the largest level of significance at which we can reject the null hypothesis (since for anything larger we fail to reject the null hypothesis)?

I.e., if we have a value larger than 0.00236 for a regression coefficient’s test, we fail to reject the null hypothesis.

What they mean is that the p-value is the smallest α you could have chosen that would result in rejecting the null hypothesis.

Suppose that you get a p-value of 0.00236 (to grab a random number out of the air). If you had chosen α = 0.05 (larger than your calculated p-value of 0.00236), you would reject the null hypothesis. If you had chosen α = 0.01 (also larger than your calculated p-value of 0.00236), you would reject the null hypothesis. If you had chosen α = 0.0024 (still larger than your calculated p-value of 0.00236), you would reject the null hypothesis. But if you had chosen α = 0.0023 (_smaller_ than your calculated p-value of 0.00236), you would _not_ reject the null hypothesis. The smallest α you could have chosen and still have rejected the null hypothesis is α = 0.00236.
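The walkthrough above boils down to a single comparison: reject H0 exactly when the p-value is at or below your chosen α. A minimal Python sketch of that decision rule, using the p-value and α levels from the example:

```python
# Decision rule from the example above: reject H0 at level alpha
# iff the p-value is less than or equal to alpha.
p_value = 0.00236

def reject_h0(p, alpha):
    """Return True if H0 is rejected at significance level alpha."""
    return p <= alpha

# Alphas larger than the p-value lead to rejection; smaller ones do not.
for alpha in (0.05, 0.01, 0.0024, 0.0023):
    verdict = "reject" if reject_h0(p_value, alpha) else "fail to reject"
    print(f"alpha = {alpha}: {verdict} H0")
```

Running through the same four α levels reproduces the reject/fail-to-reject pattern described above, with the crossover sitting exactly at α = 0.00236.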

Great example to clarify the statement. I’d like to point out that the statement “…smallest alpha you could choose and still reject H0…” is pretty barren: it offers little intuition for what the p-value actually is, and it only restates (in a dangerous way) how to make decisions when comparing alpha and p-values (i.e. it addresses “how” in an indirect and incomplete manner, and fails to address “what”).

Recall that alpha should be chosen before you see the data, run any numbers, or otherwise “begin” a study or experiment. The statement they provide (and others do as well) is dangerous because it can encourage cherry picking or post-test alpha selection to force (non)significance.

In other words, you should decide on your research question(s) and select an appropriate alpha level for each test. You see the results and make a comparison, then you make an inference. Their (CFAI) statement leads many to undermine this process and do things out of order; for example, come up with a question, pick alpha 0.01, run the test and get a p-value of 0.02 and decide to switch their alpha to 0.025 or something at least as large as the p-value (in order to claim significance or because the CFAI book [or another] gave them this limited explanation of a p-value). This (cherry picking or changing alpha levels after a test) is unethical at best (especially if unreported), but it also is a misuse of the framework developed by statisticians.

The CFA L2 QM text (at least in 2015) had plenty of errors in it regarding the interpretation of p-values, so this is something to be wary of when you’re reading. One example is something like this…

A researcher does an F-test on coefficients b1, b2, b3, gets a p-value of 0.0001, and rejects the null hypothesis that all of the coefficients are zero. The p-value of 0.0001 indicates a minuscule probability that the real coefficients are all zero!

The problem with this statement is that it implies that the p-value alone can tell us something about the probability of the null or alternative hypothesis, which it absolutely cannot. All the p-value is telling us in the above example is the following: if the null hypothesis is true (i.e. all coefficients equal zero) and if all the assumptions we used are correct, then we will get results this extreme or more extreme with probability 0.0001 (think of it as 0.01% of the time). This is a much different statement than what the text indicated in 2015, and I’m suspicious that they’ve left some of these mistakes unchanged.
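That “results this extreme or more extreme, assuming H0 is true” definition can be made concrete with a toy simulation. This is my own sketch, not from any text; the sample size, one-sample t setup, and “observed” statistic of 2.5 are all made up for illustration:

```python
import random
import statistics

# Toy illustration of what a p-value is: simulate a world where the
# null hypothesis is TRUE (population mean = 0) and count how often a
# test statistic comes out at least as extreme as the one we "observed".
random.seed(42)

def t_statistic(sample):
    """One-sample t statistic against a null mean of 0."""
    n = len(sample)
    return statistics.mean(sample) / (statistics.stdev(sample) / n ** 0.5)

n, trials = 30, 10_000
observed_t = 2.5  # a hypothetical observed statistic

# Two-sided: count draws where |t| >= |observed_t| under the true null.
extreme = sum(
    abs(t_statistic([random.gauss(0, 1) for _ in range(n)])) >= observed_t
    for _ in range(trials)
)
print(f"simulated two-sided p-value ≈ {extreme / trials:.4f}")
```

The simulated fraction approximates the p-value: the long-run frequency of data this extreme _when H0 holds_. Notice the simulation never tells you the probability that H0 itself is true; it conditions on H0 being true from the start, which is exactly the distinction the flawed textbook statement blurs.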

Although I may have diverged from your original topic a bit, I hope I’ve illustrated a few caveats in reading the CFAI books (or others that aren’t real statistics books). You’re more likely, in my experience, to run into unclear or poor explanations in books that only teach statistics as a peripheral topic. (Yes, I am biased on this topic, which is how I went off on this tangent.) Continue to be skeptical and ask questions when you read!

Just be careful not to confuse alpha (chosen or selected significance level) with the p-value (observed significance). Software packages will often report a p-value as “Significance of t” or “significance” which might be confusing at first glance.