4.3 Standards of Proof and Handling of Uncertainty

Proof and Certainty

What constitutes certainty about a given observation? How many times must an event be observed before the observations count as “valid proof” that it occurred? What level of statistical significance is enough to conclude that an event is occurring? Answers to these questions can seem somewhat arbitrary, but what counts as proof in one context may leave a “shadow of doubt” in another. Moreover, being wrong costs more in some cases than in others (as we see in the politics of climate science).

Standards of Proof and Handling of Uncertainties

Standards of proof often incorporate social values. As Anderson writes, “Social scientists reject the null hypothesis (that observed results in a statistical study reflect mere chance variation in the sample) only for P-values < 5%, an arbitrary level of statistical significance. Bayesians and others argue that the level of statistical significance should vary, depending on the relative costs of type I error (believing something false) and type II error (failing to believe something true).

Type I and Type II errors:

Type I error (false positive)

The test produces a positive result when the true state is negative, as when a medical patient tests positive for a disease they do not have. In terms of data analysis, new information wrongly changes previous estimates of uncertainty.

Type II error (false negative)

The test produces a negative result when the true state is positive, as when a patient's ailment goes undetected by one or more tests. In terms of data analysis, new information fails to change previous estimates of the probability of occurrence when it should.

The two types of error carry different costs in different contexts, so deciding which one to guard against more strongly is a choice about values.
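The two error types defined above can be counted directly once the true state is known. The following is a minimal Python sketch and not part of the original text: the function name error_rates and the screening data are hypothetical, chosen only to mirror the medical example.

```python
def error_rates(truth, predicted):
    """Return (type_i_rate, type_ii_rate) for paired lists of booleans.

    truth[i]     -- True if case i is really positive (e.g. has the disease)
    predicted[i] -- True if the test reported case i as positive
    """
    pairs = list(zip(truth, predicted))
    # Type I error: test says positive, reality is negative (false positive)
    false_pos = sum(1 for t, p in pairs if p and not t)
    # Type II error: test says negative, reality is positive (false negative)
    false_neg = sum(1 for t, p in pairs if t and not p)
    negatives = sum(1 for t in truth if not t)
    positives = sum(1 for t in truth if t)
    type_i = false_pos / negatives if negatives else 0.0
    type_ii = false_neg / positives if positives else 0.0
    return type_i, type_ii


# Hypothetical screening results for ten patients (illustrative data only)
has_disease = [True, True, True, True, False, False, False, False, False, False]
test_positive = [True, True, True, False, True, False, False, False, False, False]

type_i, type_ii = error_rates(has_disease, test_positive)
print(f"Type I rate: {type_i:.3f}, Type II rate: {type_ii:.3f}")
```

Here one of six healthy patients is wrongly flagged and one of four sick patients is missed. Which of the two rates matters more depends on what each kind of mistake costs in the setting at hand.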

In medicine, clinical trials are routinely stopped and results accepted as genuine notwithstanding much higher P-values, if the results are dramatic enough and the estimated costs to patients of not acting on them are considered high enough” (Anderson 2009). Type I and Type II errors can have significant impacts in energy applications, and weighing them calls for careful foresight by both researchers and peer reviewers.
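The argument that the significance level should depend on the relative costs of the two errors can be illustrated with a small calculation. The sketch below is an assumption-laden illustration, not a method taken from Anderson or from this text: it assumes a one-sided z-test, a known effect size measured in standard-error units, a prior probability that the null hypothesis is true, and hypothetical cost figures. The names expected_cost and best_alpha are invented for the example.

```python
from statistics import NormalDist


def expected_cost(alpha, effect, cost_type_i, cost_type_ii, prior_null=0.5):
    """Expected cost of testing at significance level alpha.

    Assumes a one-sided z-test where the true effect, if present,
    shifts the test statistic by `effect` standard errors.
    """
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)    # rejection threshold
    beta = std_normal.cdf(critical - effect)    # Type II error probability
    return (prior_null * alpha * cost_type_i
            + (1 - prior_null) * beta * cost_type_ii)


def best_alpha(effect, cost_type_i, cost_type_ii, prior_null=0.5):
    """Grid-search the significance level with the lowest expected cost."""
    candidates = [i / 1000 for i in range(1, 500)]  # 0.001 to 0.499
    return min(
        candidates,
        key=lambda a: expected_cost(a, effect, cost_type_i,
                                    cost_type_ii, prior_null),
    )


# Hypothetical cost ratios: which error is ten times worse than the other?
alpha_false_alarm_costly = best_alpha(2.0, cost_type_i=10, cost_type_ii=1)
alpha_equal_costs = best_alpha(2.0, cost_type_i=1, cost_type_ii=1)
alpha_miss_costly = best_alpha(2.0, cost_type_i=1, cost_type_ii=10)
print(alpha_false_alarm_costly, alpha_equal_costs, alpha_miss_costly)
```

Under these assumptions the preferred significance level becomes stricter when false positives are expensive and looser when false negatives are expensive. This is the same reasoning behind stopping a clinical trial early despite a P-value above 5%.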