If the probability of having one or fewer clean sweep episodes of The Price Is Right out of 6,000 aired shows is a little over one and a half percent (and it is), and if we consider outcomes whose probability is less than five percent to be so unlikely that we can rule them out as happening by chance (and, last time, we did), then there are improbably few episodes where all six contestants came from the same seat in Contestants Row. We can usefully start looking for possible explanations as to why there are so few clean sweeps. At least, that’s the conclusion at our significance level of five percent.
But there’s no law dictating that we pick that five percent significance level. If we picked a one percent significance level, which is still common enough and not too stringent, then we would say this might be fewer clean sweeps than we expected, but it isn’t so drastically few as to raise our eyebrows yet. And we would be correct to do so. Depending on the significance level, what we saw is either so few clean sweeps as to be suspicious, or it’s not. This is why it’s better form to choose the significance level before we know the outcome; doing it the other way around feels like drawing the bullseye after shooting the arrow.
It might seem unfair, even un-mathematical, for the answer to a question like “are there suspiciously few clean sweeps?” to be “yes or no”. But the answer really is “yes, at the five percent significance level” and “no, at the one percent significance level”. At the one percent significance level, we require the outcome to be even less likely, even more remarkable, than we do at the five percent level before we suppose it can’t just be chance. But we run the risk of overlooking outcomes which are not just chance, results that reflect something really happening, just because they aren’t rare enough. At the five percent significance level, we’re more likely to make the opposite mistake, declaring as the result of something suspicious what is actually only chance, but we’re less likely to dismiss an improbable outcome just because it isn’t improbable enough.
That’s part of the human role in statistics. We have to decide whether we would prefer to throw away a weak signal that something is up because it isn’t improbable enough, or whether we would prefer to accept as signals that something is up what is really just chance. There is not a unique and inevitably correct answer to this question. It depends on the field one is studying, the problem one is studying, what other researchers in the field expect to see, and a host of other factors, including how much data one can gather. Really unlikely events require more data before they can be detected.
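The point about needing more data can be sketched with a small calculation. Suppose, as an assumption not taken from this section, that a clean sweep has a one-in-a-thousand chance in any episode. Then we can ask how many episodes we would have to watch before even the most extreme shortage, no clean sweeps at all, became improbable enough to count at a given significance level. The function name `episodes_needed` is hypothetical.

```python
from math import ceil, log

def episodes_needed(alpha, p):
    """Smallest number of episodes n for which the chance of zero
    successes, (1 - p)**n, drops to alpha or below."""
    return ceil(log(alpha) / log(1 - p))

P_SWEEP = 1 / 1000  # ASSUMPTION: chance that any one episode is a clean sweep

for alpha in (0.05, 0.01):
    n = episodes_needed(alpha, P_SWEEP)
    print(f"{alpha:.0%} significance level: at least {n} episodes")
```

With fewer episodes than that, no shortage of clean sweeps could ever look suspicious at that level, however real the cause behind it. The stricter the significance level, or the rarer the event, the more episodes it takes.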
Let’s say, then, that we wanted the five percent significance level all along. At that level, one clean sweep out of 6,000 episodes is too unlikely to result from chance. So, if we suppose it can’t be chance making so few clean sweeps, what was it?