## How Interesting Is A Basketball Tournament?

Yes, I can hear people snarking, “not even the tiniest bit”. These are people who think calling all athletic contests “sportsball” is still a fresh and witty insult. No matter; what I mean to talk about applies to anything where there are multiple possible outcomes. If you would rather talk about how interesting the results of some elections are, or whether the stock market rises or falls, whether your preferred web browser gains or loses market share, whatever, read it as that instead. The work is all the same.

To quantify how interesting the outcome of a game (election, trading day, whatever) is, we have to think about what “interesting” means qualitatively. A sure thing, a result that’s bound to happen, is not at all interesting, since we know going in that it’s the result. A result that’s nearly sure but not guaranteed is at least a bit interesting, since after all, it might not happen. An extremely unlikely result *would* be extremely interesting, if it could happen.

In the language of probability, we might describe a result, like Georgetown winning a game, by some shorthand such as G. We write the probability of a result happening as something like p_{G}, or if that seems like too much bother to typeset, p(G). The probability of a particular result is, by definition, some number from 0 (it’s impossible for it to happen) up to 1 (it’s mandatory that it happens), with a larger number corresponding to the result being more likely. We seem to be looking at an idea of “interesting” that starts out at some large number for a probability p_{G} that’s at or near 0 and which decreases until it gets to a minimum when p_{G} is at 1, since a sure thing is a boring result, while an unlikely result is interesting.

That hasn’t pinned down very much, but then we haven’t finished thinking about what a measure for “interesting” ought to be. For example, suppose we have two different results: say, result C, University of Connecticut winning a game, and result B, Baylor losing a game. As long as these results are independent — one happening doesn’t change the probability of the other happening, which is probably the case unless Connecticut is playing Baylor — then our measure for how interesting it is that Connecticut wins *and* Baylor loses should be at least as big as how interesting it is that Connecticut wins, or how interesting it is that Baylor loses. The simplest way to guarantee that is for the interest of the pair to be the sum of the two separate interests.

What this is pointing to is a measure of “interesting” that’s based on a logarithm of the probability of the particular result. The logarithm of 1 — the probability of our sure event — is 0. If the probability of result C is p_{C} and the probability of result B is p_{B}, and C and B are independent results so that one happening doesn’t affect the chance the other happens, then the probability that *both* happen is p_{C} times p_{B}, and the logarithm of the product p_{C} times p_{B} is the logarithm of p_{C} plus the logarithm of p_{B}.

There is one little flaw: the logarithm of a number between 0 and 1 is a negative number, and that’s just the opposite of what we want. That’s easy to fix, though. Just use minus one times the logarithm of the probability, and we have something that matches what we wanted. The minus logarithm of a probability starts out extremely high — infinitely high, in fact — if the probability is 0; as the probability of the result increases, the minus logarithm decreases, dropping to its smallest value of zero when the probability gets to 1; and the minus logarithm of the product of two probabilities is the minus logarithm of the first probability plus the minus logarithm of the second probability. It’s almost perfectly matched to our needs.
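To check those three properties numerically, here is a minimal Python sketch. The function name `minus_log` and the two probabilities are my own illustrative assumptions, not real odds for any team, and I use the natural logarithm since the choice of base hasn’t come up yet.

```python
import math


def minus_log(p: float) -> float:
    """Minus the natural logarithm of a probability p, for 0 < p <= 1."""
    return -math.log(p)


# Hypothetical probabilities, picked only for illustration.
p_c = 0.8   # result C: Connecticut wins
p_b = 0.25  # result B: Baylor loses

# Independent results: the probability of both is the product.
both = minus_log(p_c * p_b)
separately = minus_log(p_c) + minus_log(p_b)

print("C and B together:", both)
print("C plus B        :", separately)
print("sure thing      :", minus_log(1.0))
print("long shot       :", minus_log(0.001))
```

The first two printed numbers should agree up to floating-point rounding, the sure thing should come out as zero, and the long shot should be the biggest of the lot.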

So a logarithm seems the way to go, although that raises a follow-up question: *which* logarithm? The logarithm of a number depends on the base of the logarithm, which can be any positive number other than 1. If you choose a logarithm base of 10, for example, then 100 has a logarithm of 2, because 10^{2} is equal to 100. If you choose a logarithm base of 5, though, then 100 has a logarithm of about 2.8614, since 5^{2.8614} is approximately 100. If you choose a logarithm base of e — about 2.71828, and this is known as the “natural” logarithm — then 100 has a logarithm of about 4.6052, since 2.71828^{4.6052} is just about 100.

So which one to use? In a sense, it doesn’t make a difference: the qualitative results are going to be the same, with the difference just being how big the numbers are. It’s like the difference between measuring a length in inches, meters, miles, or light-years; the results are all going to be compatible, and the only good reason to choose one over the other is what gives convenient numbers.
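If you’d like to see the inches-versus-meters point in numbers, here is a small Python sketch. The helper `log_base` is a name I made up for illustration; it leans on the standard change-of-base identity.

```python
import math


def log_base(x: float, base: float) -> float:
    """Logarithm of x in the given base, computed from natural logarithms."""
    return math.log(x) / math.log(base)


# The logarithm of 100 in base 10, base 5, and base e.
for base in (10, 5, math.e):
    print(base, round(log_base(100, base), 4))

# Changing the base only rescales the answer: the ratio between the base-2
# and base-10 logarithms is the same whatever probability we feed in.
for p in (0.1, 0.5, 0.9):
    print(p, log_base(p, 2) / log_base(p, 10))
```

The second loop should print the same ratio three times (up to rounding), which is all a “unit conversion” between bases amounts to.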

What makes for convenient numbers, then, is the logarithm with a base of 2. We’re looking at results that are well-defined to either happen or not happen, and with a logarithm base 2, a result that has probability of 1/2 of happening has a minus logarithm of 1. This seems neat and clean. I’ll throw in some other reasons to make this sound like it makes sense shortly. And, really, if you don’t want to use the logarithm base 2, use some other one. It’ll change the numbers, but not important things like whether one event is more interesting than another.

But simply taking a minus logarithm of the probability of a result doesn’t quite satisfy our sense of how interesting the outcome of a *game* would be. We can all agree that result S, in which the players for Siena are just about to win when they are struck by rays from a strange magnetic meteorite and transform into superheroes, would be an extraordinarily interesting outcome, but it is so unlikely to happen that the possibility doesn’t make the *game* more interesting. We need to compensate for this, make our measure of how interesting a result is smaller for results that just don’t happen.

The way we finally patch this up is to multiply the probability of the result by this minus logarithm of the probability, which is a result we could derive rigorously, but I suspect that’s not going to be more convincing. To find how interesting the outcome of the game is, we work out all the different results that we’re interested in — call them A, B, C, and so on, up through some highest imaginable letter such as D — and then add up:

$$(-p_{A} \log(p_{A})) + (-p_{B} \log(p_{B})) + (-p_{C} \log(p_{C})) + \cdots + (-p_{D} \log(p_{D}))$$

Here log means the logarithm base 2, but like I said, if you prefer some other base pick that. And yeah, you can remove some of those parentheses, but I didn’t want to write that you were adding together things on a line that didn’t have any plus signs in it.
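The sum described above, each result’s probability times the minus logarithm of that probability, is short enough to sketch in Python. The function name `interest` is my own label, and skipping results with probability 0 is a convention I’m assuming (it matches the limit of p times log p as p shrinks to 0).

```python
import math
from typing import Iterable


def interest(probabilities: Iterable[float]) -> float:
    """Sum of -p * log2(p) over the probabilities of all the results.

    Results with probability 0 are skipped, treating 0 * log(0) as 0.
    """
    total = 0.0
    for p in probabilities:
        if p > 0:
            total += -p * math.log2(p)
    return total


# Four equally likely results.
print(interest([0.25, 0.25, 0.25, 0.25]))

# A wildly unlikely result (the meteorite, say) has a huge minus logarithm
# but contributes almost nothing once it is weighted by its probability.
p_meteor = 1e-9
print(-math.log2(p_meteor), -p_meteor * math.log2(p_meteor))
```

That last line is the whole reason for the weighting: the meteorite result is tremendously interesting on its own, yet it barely moves the total for the game.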

If all you’re interested in is, for a single game, whether let’s say Siena wins or loses, then there’s two results, W and L. But the probability that Siena loses has to be exactly one minus the probability that Siena wins — the rules of the contest rule out ties, and I suppose they have some way of handling interrupted games — so how interesting the result of Siena’s game is ought to be:

$$(-p_{W} \log(p_{W})) + (-(1 - p_{W}) \log(1 - p_{W}))$$

(We could also have worked this out from the perspective of the probability of L happening, Siena losing, but why not take the sunny interpretation? Also they’re not in the 2015 contest anyway.)

Now, if the probability of Siena winning p_{W} is 1 — a sure thing — then the outcome of the game is utterly uninteresting; we know what it is without it being done, and our formula there gives an interest level of 0 (although if you want to show that’s so you will have to do a little bit of work). If the probability of Siena winning is 0 — a sure loss — then again, it’s not an interesting game, and again, it has an interest level of 0. If Siena has a slight chance of losing, say p_{W} is 0.9, then the game is a bit interesting, coming in at about 0.469. If Siena has only a slight chance of winning, say p_{W} is 0.1, then the game is just as interesting since it’s just as not-quite-foregone a conclusion: this measure comes in at 0.469. If Siena has a modest but not really good chance of winning, say p_{W} is 0.35, then the game is still more interesting, about 0.93407. And if its chance of winning were 0.65 instead — a modest but not good chance of losing — the game has an interest of 0.93407 again. If the game is a perfect toss-up, Siena having a chance of 0.5 of winning, then the interest level is exactly 1. (If we’d picked logarithms with another base, the interest level would be some other number here.) I like to think this makes intuitive sense; ultimately, we were curious about *one* thing in the game, the answer to the question, “did Siena win?” 1 seems to me like a reasonable measurement then.
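The figures in that paragraph can be reproduced with a few lines of Python. This is only a sketch: `game_interest` is a name I’ve invented, and it treats a probability of exactly 0 as contributing nothing, which is the convention that makes the sure-win and sure-loss cases come out to 0.

```python
import math


def game_interest(p_win: float) -> float:
    """Interest level, in base-2 units, of a game with only a win or a loss."""
    total = 0.0
    for p in (p_win, 1.0 - p_win):
        if p > 0:  # skip impossible results; 0 * log(0) is taken to be 0
            total -= p * math.log2(p)
    return total


for p_win in (1.0, 0.0, 0.9, 0.1, 0.35, 0.65, 0.5):
    print(p_win, round(game_interest(p_win), 5))
```

The symmetry is worth noticing in the printout: swapping the chance of winning with the chance of losing doesn’t change how interesting the game is.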

So, by this metric, if a team has a fifty percent chance of winning a game, then the game itself has an interest level of 1, which is probably as reasonable as you could hope. And if the outcome of every game is independent — the result of one game doesn’t change the probability of a win or a loss in other games — then we should expect the interest level of 63 separate games to be 1 plus 1 plus 1 et cetera, 63 times over. So if the interest level of a single game, where each team has a 50-50 chance of winning, is 1, then the interest level of the whole tournament, with every team evenly matched, is 63.
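Here is a rough Python sketch of that tally, assuming a 64-team single-elimination bracket whose rounds have 32, 16, 8, 4, 2, and 1 games, with every game an independent toss-up. The names are mine; the cross-check at the end uses the fact that 63 independent coin-flip games give 2^{63} equally likely ways for the tournament to play out.

```python
import math


def game_interest(p_win: float) -> float:
    """Interest level, in base-2 units, of a game with only a win or a loss."""
    total = 0.0
    for p in (p_win, 1.0 - p_win):
        if p > 0:
            total -= p * math.log2(p)
    return total


# Each game knocks out exactly one team, so 64 teams need 63 games.
games_per_round = [32, 16, 8, 4, 2, 1]
total_games = sum(games_per_round)

# Independent games: the interest levels simply add.
tournament_interest = sum(game_interest(0.5) for _ in range(total_games))

# Cross-check: minus log2 of the chance of any one complete set of results.
equally_likely_outcomes = 2 ** total_games
cross_check = math.log2(equally_likely_outcomes)

print(total_games, tournament_interest, cross_check)
```

Changing the `0.5` to something lopsided shows how quickly the total drops once the games stop being even matches.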

So that’s at least one answer to the question of how interesting a March Madness-style 64-team single-elimination tournament is: if every game is a toss-up, it’s 63.

## Angie Mc 9:52 pm

on Friday, 20 March, 2015: I’ll hop in at the Sweet 16. I’ll be in charge of cheering and you can be in charge of the stats :D Super cool post, Joseph!

## Joseph Nebus 8:36 pm

on Sunday, 22 March, 2015: Thanks. I hope to be interesting with this sort of thing. I actually do have a follow-up in the works.

## Angie Mc 6:45 pm

on Tuesday, 24 March, 2015: Great! I caught the Notre Dame vs Butler game…WOW! Excellence right to the end :D

## Joseph Nebus 3:40 am

on Friday, 27 March, 2015: Oh, good, good. Uhm … I called the outcome of that right, if I’m reading my brackets correctly. That seems to have turned out all right.

## Garfield Hug 6:17 am

on Saturday, 21 March, 2015: Math as in stats, calculus or algebra has been my weakest link. Still I managed to do Bs for math subjects in university. I am enjoying the way you use math to define things and just want to say thanks for making math not so complicated. This is your passion in spreading math in your topics of your blog 😊 Thanks for the teaching and I am learning☺

## Joseph Nebus 8:36 pm

on Sunday, 22 March, 2015: Thank you so. I’m glad you’re enjoying. I do want to share the fun I have in mathematics.

## elkement 7:18 pm

on Tuesday, 24 March, 2015: Excellent post!! Are you perhaps planning to write about entropy?

## Joseph Nebus 3:40 am

on Friday, 27 March, 2015: Thank you! I am indeed sidling my way up to that.

## What We Talk About When We Talk About How Interesting What We’re Talking About Is | nebusresearch 7:29 pm

on Saturday, 28 March, 2015: […] When I wrote last weekend’s piece about how interesting a basketball tournament was, I let some terms slide without definition, mostly so I could explain what ideas I wanted to use and how they should relate. My love, for example, read the article and looked up and asked what exactly I meant by “interesting”, in the attempt to measure how interesting a set of games might be, even if the reasoning that brought me to a 63-game tournament having an interest level of 63 seemed to satisfy. […]

## But How Interesting Is A Real Basketball Tournament? | nebusresearch 6:46 pm

on Monday, 30 March, 2015: […] I wrote about how interesting the results of a basketball tournament were, and came to the conclusion that it was … (and filled in that I meant 63 bits of information), I was careful to say that the outcome of a […]

## But How Interesting Is A Basketball Score? | nebusresearch 5:44 pm

on Saturday, 4 April, 2015: […] The answer was given, in embryo, in my first piece about how interesting a game might be. If you can list all the possible outcomes of something that has multiple outcomes, and how probable each of those outcomes is, then you can describe how much information there is in knowing the result. It’s the sum, for all of the possible results, of the quantity negative one times the probability of the result times the logarithm-base-two of the probability of the result. When we were interested in only whether a team won or lost there were just the two outcomes possible, which made for some fairly simple calculations, and indicates that the information content of a game can be as high as 1 — if the team is equally likely to win or to lose — or as low as 0 — if the team is sure to win, or sure to lose. And the units of this measure are bits, the same kind of thing we use to measure (in groups of bits called bytes) how big a computer file is. […]

## How Interesting Can A Basketball Tournament Be? | nebusresearch 3:00 pm

on Thursday, 17 March, 2016: […] How Interesting Is A Basketball Tournament?, the starting point, laying out how we measure one game’s interestingness and go on to 63 games from that. […]
