So, we calculated that on any given episode of The Price Is Right there’s around one chance in a thousand of all six winners of the Item Up For Bid coming from the same seat. And we know there have been about six thousand episodes with six Items Up For Bid. So we expect there to have been about six clean sweep episodes; yet if Drew Carey is to be believed, there has been just the one. What’s wrong?
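The arithmetic behind that expectation is short enough to write out. This is only a sketch: the one-in-a-thousand chance is the figure implied by expecting six sweeps in six thousand episodes, and the variable names are made up for the illustration.

```python
from fractions import Fraction

episodes = 6000                 # rough episode count quoted above
p_sweep = Fraction(1, 1000)     # assumed per-episode chance of a clean sweep

# Expected count: number of chances times the probability on each chance.
expected_sweeps = episodes * p_sweep
print(expected_sweeps)          # 6
```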
Possibly, nothing. Just because there is a certain probability of a thing happening does not mean it happens all that often. Consider an analogous situation: a baseball batter might hit safely one time out of every three at-bats; but there would be nothing particularly odd in the batter going hitless in four at-bats during a single game, however much we would expect him to get at least one. There wouldn’t be anything very peculiar in his hitting all four times, either. Our expected value, the number of times something could happen times the probability of it happening each time, is not necessarily what we actually see. (We might get suspicious if we always saw the expected value turn up.)
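To put rough numbers on the batter, here is a small sketch. It assumes every at-bat is independent and carries the same one-in-three chance of a hit; the names `p`, `q`, and `n` are just labels for this example.

```python
from fractions import Fraction

p = Fraction(1, 3)   # assumed chance of a hit on any one at-bat
q = 1 - p            # chance of no hit on that at-bat
n = 4                # at-bats in the game

expected_hits = n * p    # 4/3, which no real game can ever produce
p_no_hits = q ** n       # 16/81, a bit under one chance in five
p_all_hits = p ** n      # 1/81, a bit over one chance in a hundred

print(expected_hits, p_no_hits, p_all_hits)
```

Going hitless turns up about one game in five, so it is hardly cause for alarm, and the expected value of 4/3 hits is a number we never see in any single game.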
Still, there must be some limits. We might accept a batter who hits one time out of every three getting no hits in four at-bats. If he got no hits in four hundred at-bats, we’d be inclined to say he’s not a decent hitter having some bad luck. More likely he’s failing to bring the bat with him to the plate. We need a tool to say whether some particular outcome is tolerably likely or so improbable that something must be up.
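For a sense of how improbable the four-hundred at-bat drought is, we can sketch the same calculation with a bigger exponent. This again assumes independent at-bats with a steady one-in-three chance of a hit, which is an idealization.

```python
from fractions import Fraction

q = Fraction(2, 3)              # assumed chance of no hit on one at-bat
p_hitless_400 = q ** 400        # no hits in 400 independent at-bats

# Convert to a float just to read off the order of magnitude.
print(float(p_hitless_400))     # roughly 3.7e-71
```

A chance that small is the kind of outcome that makes "something must be up" the more sensible explanation.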
One tool we have is the binomial distribution. This is a good way to estimate how likely outcomes are when the outcomes are made up of a bunch of similar little attempts. Each of these little attempts (a single at-bat, a single coin being flipped, a single episode of The Price Is Right) can either succeed or fail to meet whatever outcome we’re interested in. We assume the probability of success is the same on every try, and that each attempt is independent of the others; flipping tails last time doesn’t make it more or less likely this time. We might not actually want the result of a successful outcome (suppose we’re checking the probability of successfully getting stopped at a red light during the morning’s commute), but the difference between successfully satisfying some condition and actually wanting that condition for ourselves we can leave to baffle students learning probability.
Each one of these little attempts is known as a Bernoulli trial, after one or more of the Bernoulli clan. (I jest, slightly; it was named for Jacob Bernoulli, 1654 – 1705, also known as James and Jacques. The Bernoulli of the Bernoulli Principle was his nephew Daniel, 1700 – 1782.) The Bernoullis were a family of mathematicians/physicists/theologians/logicians/philosophers/lawyers (the usual career path for the 17th and 18th century intellectual, an era which, I note without asserting a connection, is also about when the caffeinated drinks of coffee, tea, and chocolate became common in western European intellectual circles) who divided their time between proving everything there was to be proved, forming enough interesting mathematical questions to keep Isaac Newton distracted, and plagiarizing one another until the mathematical historian throws up both hands in disgust and wanders off to some less melodramatic family, like the Borgias or the candy makers Mars.
Anyway, we take the chance of successfully meeting whatever our interesting property is on a single Bernoulli trial to be some constant number, which is usually called p to give it a convenient name. p is some number from 0 to 1, with the smaller numbers representing lower chances of success, and the larger numbers representing greater chances of success.
We will also need, it turns out, the chance of failing to meet whatever the interesting condition is. Remember that we can either succeed or fail to meet the condition; there’s no uncertain area, by assumption. So if the probability of succeeding is p, the probability of failing to meet the condition is exactly 1 – p.
We need to know the number of Bernoulli trials; this we call N, which is a popular name to give any number, particularly if it’s a number we know counts something but we don’t particularly care what its value is. To have some manageable example, let’s imagine there to be four trials, so that N = 4, and that on each of them the chance of success is p = 1/3. This means on each the chance of failure is 1 – p = 2/3. This is our problem of the batter with four at-bats, and the one-in-three chance of hitting on each of them.
So what is the chance of one successful hit in the four at-bats?
The chance of hitting on the first at-bat is p or 1/3. The chance of failing to hit on the second at-bat is 1 – p or 2/3. The chance of failing to hit on the third at-bat is again 1 – p or 2/3. And the chance of failing to hit on the fourth at-bat is once more 1 – p or 2/3. Since the chance of succeeding or failing is, we assume, independent of what happened the last time, the chance of getting a hit the first at-bat and failing to get one in the next three at-bats is p x (1 – p) x (1 – p) x (1 – p), or 1/3 x 2/3 x 2/3 x 2/3 or (1x2x2x2)/(3x3x3x3) or 8/81. That’s a bit less than one chance in ten.
But that isn’t the only way to get one hit in four at-bats. We need to know what the chance is of getting a hit on the second at-bat but missing on the first, third and fourth. The probability of failing to get a hit the first at-bat is 1 – p or 2/3; of getting a hit on the second is p or 1/3; of failing to hit on the third again 1 – p or 2/3; and of failing to get a hit on the fourth 1 – p or 2/3 again. So the chance of hitting on the second and no other at-bat is (1 – p) x p x (1 – p) x (1 – p), or 2/3 x 1/3 x 2/3 x 2/3, which is again 8/81.
This doesn’t exhaust all the possibilities, though. There’s also the chance of getting a hit on the third at-bat, but failing to get a hit on the first, second, and fourth. Would you be willing to believe the probability of this combination is again 8/81? And how about that the chance of failing to get a hit the first three at-bats but getting one successfully the fourth time is again 8/81? Sure. The chance of getting one success and three failures, if the chance doesn’t change between attempts, is the same whether the success is on the first attempt, or the second, or the third, or the fourth.
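If you would rather not take the 8/81 on faith, the four products can be checked mechanically. The sketch below uses a made-up helper, `sequence_probability`, and a made-up convention of writing 'H' for a hit and 'M' for a miss; it assumes independent at-bats with the same p each time.

```python
from fractions import Fraction

def sequence_probability(outcomes, p):
    """Probability of one exact sequence of independent at-bats.

    `outcomes` is a string of 'H' (hit) and 'M' (miss); `p` is the
    chance of a hit on any single at-bat.
    """
    prob = Fraction(1)
    for outcome in outcomes:
        # Multiply by p for a hit and by 1 - p for a miss.
        prob *= p if outcome == "H" else 1 - p
    return prob

p = Fraction(1, 3)
one_hit_sequences = ["HMMM", "MHMM", "MMHM", "MMMH"]
for seq in one_hit_sequences:
    print(seq, sequence_probability(seq, p))   # each line shows 8/81
```

The factors are the same in every case, only shuffled, which is why where the hit lands makes no difference.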
We have four ways to get one hit out of four: the hit is on the first, or the second, or the third, or the fourth at-bat. Any of those ways has a probability of 8/81 of happening. And there’s not any way that two or more of them can happen simultaneously: we can’t have the single hit be at both the first and the third at-bats. This means if we want to know the probability of at least one of them happening we can just add together the chances of each of them. If we have a couple of outcomes which can’t possibly both happen simultaneously, the chance of one or the other of them happening is the sum of the chances of each of the different outcomes happening.
The probability of getting one hit in four at-bats here is 8/81 + 8/81 + 8/81 + 8/81 or 32/81. That’s a chance of just under forty percent, which, since we expected this to be the most likely outcome, seems reasonable. If we had noticed sooner that there were just four ways to get the one success out of four attempts, we could have just figured the probability of one success and three failures and multiplied that by the four ways to distribute one success and three failures among four attempts.
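Both routes to 32/81, adding up the separate cases and multiplying one case by four, can be sketched by brute force: list all sixteen possible games and keep the ones with exactly one hit. The 'H' and 'M' labels are a convention invented for this example.

```python
from fractions import Fraction
from itertools import product

p = Fraction(1, 3)   # chance of a hit on one at-bat
n = 4                # at-bats

ways = 0
total = Fraction(0)
for seq in product("HM", repeat=n):        # all 16 hit/miss sequences
    if seq.count("H") == 1:                # keep the one-hit games
        ways += 1
        total += p * (1 - p) ** (n - 1)    # 8/81 for each such game

print(ways, total)                         # 4 32/81
print(ways * p * (1 - p) ** (n - 1))       # shortcut gives 32/81 too
```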
What about the chance of getting two hits in four at-bats? The chance of successfully getting hits the first and second times will be p x p x (1 – p) x (1 – p) or 1/3 x 1/3 x 2/3 x 2/3 or 4/81. The chance of getting hits the first and third time will be just the same. And also the first and fourth time. Really, the hard part looks to be figuring out all the ways there are to get two successful and two failed at-bats out of four possibilities; we would multiply that by the 4/81 which is the chance of any particular two hits being successfully gotten in four at-bats.
For two successful at-bats out of four that’s not too hard; a little work will find that there are six ways to pick out two things out of four possibilities; we can call the two we picked our successes and the two we don’t pick our failures. Therefore the chance of getting two hits somewhere in the four at-bats is the 4/81 probability of two successes and two failures times the 6 ways to arrange two successes and two failures out of four attempts. So the chance is 6 x 4/81 or 24/81 of two hits in four at-bats, a chance of just under thirty percent. It’s less likely than one hit, but not all that improbable, which seems reasonable.
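The "little work" of finding the six arrangements can be handed to the computer. This sketch leans on `itertools.combinations` from Python's standard library to list which two of the four at-bats hold the hits; the variable names are only for illustration.

```python
from fractions import Fraction
from itertools import combinations

p = Fraction(1, 3)
n, hits = 4, 2

# Every way to choose which two at-bats (numbered 1 to 4) are the hits.
hit_positions = list(combinations(range(1, n + 1), hits))
print(hit_positions)   # (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)

ways = len(hit_positions)                     # 6
p_each = p ** hits * (1 - p) ** (n - hits)    # 4/81 for any one arrangement
p_two_hits = ways * p_each                    # 24/81

print(ways, p_each, p_two_hits)   # Fraction prints 24/81 reduced, as 8/27
```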
For small cases, fiddling around with pencil and paper to list all the combinations of so many successes and failures out of N trials is a fun pastime. But when the numbers get large, as in picking a couple of episodes out of 6,000 possible shows … that’s the sort of problem the word tedious was created to describe. Not only is it a lot of work, but the risk of overlooking some combination gets uncomfortably high. If we’re to do the problem right, we need system; we need method; we need something we don’t have to think too hard about.
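To see how quickly listing everything gets out of hand, consider how many distinct success-and-failure sequences there are for N trials. Each trial has two outcomes, so the count doubles with every trial added. The helper name `count_sequences` is made up for this sketch.

```python
def count_sequences(n_trials):
    """Number of distinct success/failure sequences for n_trials trials."""
    return 2 ** n_trials

for n_trials in (4, 10, 20):
    print(n_trials, count_sequences(n_trials))   # 16, 1024, 1048576

# For 6,000 episodes the count is too long to bother printing;
# just report how many digits it has.
digits = len(str(count_sequences(6000)))
print(digits)   # 1807
```

Sixteen sequences fit on a scrap of paper. A number with 1,807 digits does not fit anywhere.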