I’d worked out an estimate of how much information content there is in a basketball score, by which I was careful to say I meant the score that one team manages in a game. Unfortunately I wasn’t able to find out what the actual distribution of real-world scores was like, so I made up a plausible-sounding guess. I supposed that college basketball scores would be distributed among the imaginable numbers (whole numbers from zero through … well, infinitely large numbers, though in practice probably not more than 150) according to a very common distribution called the “Gaussian” or “normal” distribution; that the arithmetic mean score would be about 65; and that the standard deviation, a measure of how spread out the distribution of scores is, would be about 10.
If those assumptions are true, or are at least close enough to true, then there are something like 5.4 bits of information in a single team’s score. Put another way, suppose you were trying to divine the score by asking someone who knew it a series of carefully-chosen questions, like, “is the score less than 65?” or “is the score more than 39?”, with each question, at each stage, equally likely to be answered yes or no. Then you could expect to hit the exact score with usually five, sometimes six, such questions.
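For anyone who wants to check that figure, here is a minimal sketch of the calculation in Python. It is not necessarily how I originally did the arithmetic. It assumes each whole-number score k gets the share of the normal curve between k − 0.5 and k + 0.5, that scores are cut off at 0 and 150, and that the probabilities are rescaled to add up to 1. The function names and the rounding-to-nearest-integer scheme are my own choices for illustration.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def score_entropy_bits(mean=65.0, sd=10.0, max_score=150):
    """Information content, in bits, of one team's score.

    Assumption: the chance of scoring exactly k is the normal-curve
    area between k - 0.5 and k + 0.5, for k from 0 through max_score.
    """
    probs = [
        normal_cdf(k + 0.5, mean, sd) - normal_cdf(k - 0.5, mean, sd)
        for k in range(max_score + 1)
    ]
    # Rescale so the truncated probabilities sum to exactly 1.
    total = sum(probs)
    probs = [p / total for p in probs]
    # Sum of -p * log2(p); outcomes with zero probability contribute nothing.
    return sum(-p * math.log2(p) for p in probs if p > 0)

if __name__ == "__main__":
    print(f"{score_entropy_bits():.2f} bits")
```

With the mean of 65 and standard deviation of 10 guessed at above, this should land close to the 5.4 bits quoted. Making the standard deviation bigger spreads the scores out more and raises the information content; making it smaller does the reverse.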
When I worked out how interesting, in an information-theory sense, a basketball game (and from that, a tournament) might be, I supposed there was only one thing that might be interesting about the game: who won? Or to be exact, “did (this team) win?” But that isn’t everything we might want to know about a game. For example, we might want to know what a team scored. People often do. So how to measure this?
The answer was given, in embryo, in my first piece about how interesting a game might be. If you can list all the possible outcomes of something that has multiple outcomes, and how probable each of those outcomes is, then you can describe how much information there is in knowing the result. It’s the sum, for all of the possible results, of the quantity negative one times the probability of the result times the logarithm-base-two of the probability of the result. When we were interested in only whether a team won or lost there were just the two outcomes possible, which made for some fairly simple calculations. Those calculations show that the information content of a game can be as high as 1, if the team is equally likely to win or to lose, or as low as 0, if the team is sure to win or sure to lose. And the units of this measure are bits, the same kind of thing we use to measure (in groups of bits called bytes) how big a computer file is.
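That recipe is short enough to write out as code. Here is a small sketch in Python; the function name `entropy_bits` is just a label I picked for illustration. An outcome with probability zero is skipped, on the usual convention that something which never happens contributes no information.

```python
import math

def entropy_bits(probabilities):
    """Sum, over all outcomes, of -p * log2(p), measured in bits."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

if __name__ == "__main__":
    # A team equally likely to win or lose: the most interesting case.
    print(entropy_bits([0.5, 0.5]))
    # A team certain to win: nothing to learn from the result.
    print(entropy_bits([1.0, 0.0]))
    # A heavy favorite, somewhere in between.
    print(entropy_bits([0.9, 0.1]))
```

The first two calls give the two extremes described above, 1 bit and 0 bits. The same function works for any number of outcomes, which is what lets it handle a whole range of possible scores and not just win-or-lose.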
Continue reading “But How Interesting Is A Basketball Score?”