After the state pinball championship last month there was a second, side tournament. It was a sort of marathon event in which I played sixteen games in short order. I won three of them and lost thirteen, a disheartening record. The question it raises: was I hopelessly outclassed in the side tournament? Is it plausible that I could do so awfully?
The answer would be “of course not”. I was playing against, mostly, the same people who were in the state finals. (A few who didn’t qualify for the finals joined the side tournament.) In the finals I had done well enough, winning seven of the fifteen games I played. It’s implausible that I got significantly worse at pinball between the main and the side tournament. But can I make a logically sound argument about this?
In full, probably not. It’s too hard. The question is, did I win way too few games compared to what I should have expected? But what should I have expected? I haven’t got any information on how likely it should have been that I’d win any of the games, especially not when I faced something like a dozen different opponents. (I played several opponents twice.)
But we can make a model. Suppose that I had a fifty percent chance of winning each match. This is a lie in detail. The model contains lies; all models do. The lies might let us learn something interesting. Some people there I could only beat with a stroke of luck on my side. Some people there I could fairly often expect to beat. If we pretend I had the same chance against everyone, though, we get something that we can model. It might tell us something about what really happened.
If I play 16 matches, and have a 50 percent chance of winning each of them, then I should expect to win eight matches. But there’s no reason I might not win seven instead, or nine. Might win six, or ten, without that being too implausible. It’s even possible I might not win a single match, or that I might win all sixteen matches. How likely?
This calls for a creature from the field of probability that we call the binomial distribution. It’s “binomial” because it’s about stuff for which there are exactly two possible outcomes. This fits. Each match I can win or I can lose. (If we tie, or if the match is interrupted, we replay it, so there’s not another case.) It’s a “distribution” because we describe, for a set of some number of attempted matches, how the possible outcomes are distributed. The outcomes are: I win none of them. I win exactly one of them. I win exactly two of them. And so on, all the way up to “I win exactly all but one of them” and “I win all of them”.
To answer the question of whether it’s plausible I should have done so badly I need to know more than just how likely it is I would win only three games. I need to also know the chance I’d have done worse. If I had won only two games, or only one, or none at all. Why?
Here I admit: I’m not sure I can give a compelling reason, at least not in English. I’ve been reworking it all week without being happy at the results. Let me try pieces.
One part is that as I put the question — is it plausible that I could do so awfully? — isn’t answered just by checking how likely it is I would win only three games out of sixteen. If that’s awful, then doing even worse must also be awful. I can’t rule out even-worse results from awfulness without losing a sense of what the word “awful” means. Fair enough, to answer that question. But I made up the question. Why did I make up that one? Why not just “is it plausible I’d get only three out of sixteen games”?
Habit, largely. Experience shows that the probability of any one particular result usually turns out to be implausibly low. That isn’t quite the case here; there are only seventeen noticeably different outcomes of playing sixteen games. But there can be so many possible outcomes that even the most likely one isn’t likely.
Take an extreme case. (Extreme cases are often good ways to build an intuitive understanding of things.) Imagine I played 16,000 games, with a 50-50 chance of winning each one of them. It is most likely that I would win 8,000 of the games. But the probability of winning exactly 8,000 games is small: only about 0.6 percent. What’s going on there is that there’s almost the same chance of winning exactly 8,001 or 8,002 games. As the number of games increases, the number of possible different outcomes increases: if there are 16,000 games there are 16,001 possible outcomes, and it’s less likely that any one of them will stand out. What saves our ability to predict the results of things is that the number of plausible outcomes increases more slowly. It’s plausible someone would win exactly three games out of sixteen. It’s effectively impossible that someone would win exactly three thousand games out of sixteen thousand, even though that’s the same ratio of won games.
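Both of those figures can be checked exactly, since Python’s integers have no size limit. A minimal sketch, keeping the ratio exact until the final conversion:

```python
from fractions import Fraction
from math import comb

# Chance of winning exactly 8,000 of 16,000 fair (50-50) games:
# C(16000, 8000) / 2^16000, held as an exact ratio of integers.
p_8000 = float(Fraction(comb(16_000, 8_000), 2 ** 16_000))
print(p_8000)  # about 0.0063, i.e. roughly 0.6 percent

# Winning exactly 3,000 of 16,000 -- the same ratio as 3 of 16 --
# is so unlikely the probability underflows double precision entirely:
p_3000 = float(Fraction(comb(16_000, 3_000), 2 ** 16_000))
print(p_3000)  # 0.0
```

That second number is not literally zero, of course; it is just smaller than anything a floating-point number can represent, which is as good an operational definition of “effectively impossible” as any.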
Card games offer another way to get comfortable with this idea. A bridge hand, for example, is thirteen cards drawn out of fifty-two. But the chance that you were dealt the hand you just got? Impossibly low. Should we conclude from this that all bridge hands are hoaxes? No, but ask my mother sometime about the bridge class she took that one cruise. “Three of sixteen” is too particular; “at best three of sixteen” is a class I can study.
Unconvinced? I don’t blame you. I’m not sure I would be convinced of that, but I might allow the argument to continue. I hope you will. So here are the specifics. For sixteen matches, here is each possible count of wins and the chance of winning exactly that many:
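Those specifics are quick to generate. A minimal sketch, assuming the same 50-50 model:

```python
from math import comb

N = 16  # matches played
# P(exactly k wins out of N, each at 50-50) = C(N, k) / 2^N
for k in range(N + 1):
    prob = comb(N, k) / 2 ** N
    print(f"{k:2d} wins: {prob:8.4%}")

# The tail of interest: three wins or fewer.
tail = sum(comb(N, k) for k in range(4)) / 2 ** N
print(f"three or fewer wins: {tail:.4%}")  # a little above one percent
```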
So the chance of doing as awfully as I had — winning zero or one or two or three games — is pretty dire. It’s a little above one percent.
Is that implausibly low? Is there so small a chance that I’d do so badly that we have to figure I didn’t have a 50-50 chance of winning each game?
I hate to think that. I didn’t think I was outclassed. But here’s a problem. We need some standard for what is “it’s implausibly unlikely that this happened by chance alone”. If there were only one chance in a trillion that someone with a 50-50 chance of winning any game would put in the performance I did, we could suppose that I didn’t actually have a 50-50 chance of winning any game. If there were only one chance in a million of that performance, we might also suppose I didn’t actually have a 50-50 chance of winning any game. But here there was only one chance in a hundred? Is that too unlikely?
It depends. We should have set a threshold for “too implausibly unlikely” before we started the research; it’s bad form to decide afterward. There are some thresholds that are commonly taken. Five percent is often used for stuff where it’s hard to do bigger experiments and the harm of guessing wrong (dismissing the idea I had a 50-50 chance of winning any given game, for example) isn’t so serious. One percent is another common threshold, again in fields like psychology where it’s hard to get more and more data. In a field like physics, where experiments are relatively cheap to keep running, you can gather enough data to insist on fractions of a percent as your threshold.
In my defense, I had guessed (without doing the work) that there was something like a five percent chance of doing that badly by luck alone. The real figure, about one percent, comes in under that threshold. It suggests that I did have a much worse than 50 percent chance of winning any given game.
Is that credible? Well, yeah; I may have been among the top sixteen players in the state. But a lot of those people are incredibly good. Maybe I had only one chance in three of winning any given game, or something like that. That would make the chance I did that poorly something like one in six, likely enough.
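A quick check of that estimate, again on the fiction of one fixed chance per game:

```python
from math import comb

def chance_at_most(n, wins, p):
    """Chance of winning at most `wins` of `n` games, each won with probability p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(wins + 1))

print(chance_at_most(16, 3, 1 / 2))  # about 0.011: the 50-50 model
print(chance_at_most(16, 3, 1 / 3))  # about 0.166: close to one chance in six
```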
And it’s also plausible that the games were not independent, that whether I win one game depends in some way on whether I won or lost the previous one. It does feel like it’s easier to win after a win, or after a close loss, and harder to win after a string of losses. I don’t know that this can be proved, not on the meager evidence I have available. But you can almost always question the independence of a string of events like this. It’s the safe bet.
And now let me close the week with some other evergreen articles. A couple years back I mixed the NCAA men’s basketball tournament with information theory to produce a series of essays that fit the title I’ve given this recap. They also sprawl out into (US) football and baseball. Let me link you to them:
- How Interesting Is A Basketball Tournament? in which I consider 63 single games and whether one team or the other wins, which is what usually happens.
- What We Talk About When We Talk About How Interesting What We’re Talking About Is as I fill in some terminology.
- But How Interesting Is A Real Basketball Tournament? noting that sometimes you know who’s going to win a game before the game even starts.
- But How Interesting Is A Basketball Score? in which I open information theory to points-shaving.
- Doesn’t The Other Team Count? How Much? in which I ponder how to extend the information content of a single score to cover the case of two teams being in the game.
- A Little More Talk About What We Talk About When We Talk About How Interesting What We Talk About Is which fills in some more of the terminology and historical context.
- How Interesting Is A Football Score? as I try to figure out (US) football as an information-theory puzzle.
- How Interesting Is A Baseball Score? Some Partial Results that are based on historic data, so far as I could find, and that should extend to any low-scoring sport.
- How Interesting Is A Baseball Score? Some Further Results as I got some other historical data and refined my estimate based on what scores actually turn up a lot, as opposed to that time a game ended with a score of 49 to 33.
- How Interesting Is A Low-Scoring Game? in which I address baseball and any other low-scoring sport, such as soccer or hockey, by the simple process of making up data and seeing what those imply.
A follow-up for people curious how much I lost at the state pinball championships Saturday: I lost at the state pinball championships Saturday. As I expected I lost in the first round. I did beat my expectations, though. I’d figured I would win one, maybe two games in our best-of-seven contest. As it happened I won three games and I had a fighting chance in game seven.
I’d mentioned in the previous essay how much contingency there is, especially in a short series like this one. To start out, my opponent picked the game I expected she would. She got an awful bounce on the first ball, while I got a very lucky bounce that started multiball on the last. So I won, but not because I was playing better. The seventh game was one that I had figured she might pick if she needed to crush me, and even if I had gotten a better bounce on the first ball I’d still have had an uphill struggle. Just less of one.
After the first round I got into a set of three “tie-breaking” rounds, used to sort out which of the sixteen players ranked as, say, number 10 versus number 11. Each of those was a best-of-three series. I won one series and lost the other two, dropping me into 12th place. Over the three series I had four wins and four losses, so I can’t say I was mismatched there.
Where I might have been mismatched is the side tournament. This was a two-hour marathon of playing a lot of games one after the other. I finished with three wins and thirteen losses, enough to make me wonder whether I somehow went from competent to incompetent in the hour or so between the main and the side tournament. Of course not, based on a record like that, but can I prove it?
Meanwhile a friend pointed out The New York Times covering the New York State pinball championship:
The article is (at least for now) at https://www.nytimes.com/2017/02/12/nyregion/pinball-state-championship.html. What my friend couldn’t have known, and what shows how networked people are, is that I know one of the people featured in the article, Sean “The Storm” Grant. Well, I knew him, back in college. He was an awesome pinball player even then. And he’s only got more awesome since.
How awesome? Let me give you some background. The International Flipper Pinball Association (IFPA) gives players ranking points, gathered by playing in leagues and tournaments. Each league or tournament has a certain point value, divided up among the players according to how they finish. How many points does an event have? That depends on how many people play and what their rankings are. So, yes, how much someone’s IFPA score increases depends on the events they go to, and the events they go to depend on their score. This might sound to you like there’s a differential equation describing all this. You’re close: it’s a difference equation, because these rankings change with the discrete number of events players go to. But there’s an interesting iterative system at work there.
(Points only expire with time. The system is designed to encourage people to play a lot of things and keep playing them. You can’t lose ranking points by playing, although it might hurt your player-versus-player rating. That’s calculated by a formula I don’t understand at all.)
Anyway, Sean Grant plays in the New York Superleague, a crime-fighting band of pinball players who figured out how to game the IFPA rankings system. They figured out how to turn the large number of people who might visit a Manhattan bar and casually play one or two games into a source of ranking points for the serious players. The IFPA, combatting this scheme, just this week recalculated the Superleague values and the rankings of everyone involved in it. It’s fascinating stuff, in that way a heated debate over an issue you aren’t emotionally invested in can be.
Anyway. Grant is such a skilled player that he lost more points in this nerfing than I have gathered in my whole competitive-pinball-playing career.
So while I knew I’d be knocked out in the first round of the Michigan State Championships I’ll admit I had fantasies of having an impossibly lucky run. In that case, I’d have gone to the nationals and been turned into a pale, silverball-covered paste by people like Grant.
Thanks again for all your good wishes, kind readers. Now we start the long road to the 2017 State Championships, to be held in February of next year. I’m already in 63rd place in the state for the year! (There haven’t been many events for the year yet, and the championship and side tournament haven’t posted their ranking scores yet.)
This weekend, all going well, I’ll be going to the Michigan state pinball championship contest. There, I will lose in the first round.
I’m not trying to run myself down. But I know who I’m scheduled to play in the first round, and she’s quite a good player. She’s the state’s highest-ranked woman playing competitive pinball. So she starts off being better than me. And then the venue is one she gets to play in more than I do. Pinball, a physical thing, is idiosyncratic. The reflexes you build practicing on one table can betray you on a strange machine. She’s had more chance to practice on the games we have and that pretty well settles the question. I’m still showing up, of course, and doing my best. Stranger things have happened than my winning a game. But I’m going in with I hope realistic expectations.
That bit about having realistic expectations, though, makes me ask what are realistic expectations. The first round is a best-of-seven match. How many games should I expect to win? And that becomes a probability question. It’s a great question to learn on, too. Our match is straightforward to model: we play up to seven times. Each time we play one or the other wins.
So we can start calculating. There’s some probability I have of winning any particular game. Call that number ‘p’. It’s at least zero (I’m not certain to lose) and less than one (I’m not certain to win). Let’s suppose the probability of my winning never changes over the course of seven games. I will come back to the card I palmed there. If we’re playing 7 games, and I have a chance ‘p’ of winning any one of them, then the number of games I can expect to win is 7 times ‘p’. This is the number of wins you might expect if you were called on in class, had no idea, and bluffed the first thing that came to mind. Sometimes that works.
7 times p isn’t very enlightening. What number is ‘p’, after all? I don’t know exactly. The International Flipper Pinball Association tracks how many times I’ve finished a tournament or league above her and vice-versa. We’ve played in 54 recorded events together, and I’ve won 23 and lost 29 of them. (We’ve tied twice.) But that isn’t all head-to-head play. It counts matches where I’m beaten by someone she goes on to beat as her beating me, and vice-versa. And it includes a lot of playing not at the venue. I lack statistics and must go with my feelings. I’d estimate my chance of beating her at about one in three. Let’s say ‘p’ is 1/3 until we get evidence to the contrary. (Incidentally, it is “Flipper Pinball” because the earliest pinball machines had no flippers. You plunged the ball into play and nudged the machine a little to keep it going somewhere you wanted. The game Simpsons Pinball Party has a moment where Grampa Simpson says, “back in my day we didn’t have flippers”. It’s the best kind of joke, the one that is factually correct.)
Seven times one-third is not a difficult problem. It comes out to two and a third, raising the question of how one wins one-third of a pinball game. The obvious observation is that most games involve playing three rounds, called balls. But this one-third of a game is an average. Imagine the two of us playing three thousand seven-game matches, without either of us getting the least bit better or worse or collapsing of exhaustion. I would expect to win seven thousand of the games in all, or two and a third games per seven-game match.
Ah, but … that’s too high. I would expect to win two and a third games out of seven. But we probably won’t play seven. We’ll stop when she or I gets to four wins. This makes the problem hard. Hard is the wrong word. It makes the problem tedious. At least it threatens to. Things will get easy enough, but we have to go through some difficult parts first.
There are eight different ways that our best-of-seven match can end. She can win in four games. I can win in four games. She can win in five games. I can win in five games. She can win in six games. I can win in six games. She can win in seven games. I can win in seven games. There is some chance of each of those eight outcomes happening. And exactly one of those will happen; it’s not possible that she’ll win in four games and in five games, unless we lose track of how many games we’d played. They give us index cards to write results down. We won’t lose track.
It’s easy to calculate the probability that I win in four games, if the chance of my winning a game is the number ‘p’. The probability is p^4. Similarly it’s easy to calculate the probability that she wins in four games. If I have the chance ‘p’ of winning, then she has the chance ‘1 - p’ of winning. So her probability of winning in four games is (1 - p)^4.
The probability of my winning in five games is more tedious to work out. It’s going to be p^4 times (1 - p) times 4. The 4 here is the number of different ways that she can win exactly one of the first four games: she could win the first game, or the second, or the third, or the fourth. And in the same way the probability she wins in five games is p times (1 - p)^4 times 4.
The probability of my winning in six games is going to be p^4 times (1 - p)^2 times 10. There are ten ways to scatter her two wins among the first five games. The probability of her winning in six games is the strikingly parallel p^2 times (1 - p)^4 times 10.
The probability of my winning in seven games is going to be p^4 times (1 - p)^3 times 20, because there are 20 ways to scatter her three wins among the first six games. And the probability of her winning in seven games is p^3 times (1 - p)^4 times 20.
Add all those probabilities up, no matter what ‘p’ is, and you should get 1. Exactly one of those eight outcomes has to happen. And we can work out the probability that the series will end after four games: it’s the chance she wins in four games plus the chance I win in four games. The probability that the series goes to five games is the probability that she wins in five games plus the probability that I win in five games. And so on for six and for seven games.
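The eight outcomes, and the check that their probabilities add to 1, fit in a few lines. The counts 1, 4, 10, and 20 all come from the same place: the winner’s fourth win arrives in the final game, with the winner’s other three wins scattered among the earlier ones.

```python
from math import comb

def outcome_probs(p):
    """Probability of each of the eight ways a best-of-seven can end,
    if I win any single game with probability p."""
    q = 1 - p
    probs = {}
    for g in range(4, 8):
        # ways to place the winner's first three wins among the g-1 earlier games:
        ways = comb(g - 1, 3)  # 1, 4, 10, 20 for g = 4, 5, 6, 7
        probs[f"I win in {g}"] = ways * p ** 4 * q ** (g - 4)
        probs[f"she wins in {g}"] = ways * q ** 4 * p ** (g - 4)
    return probs

probs = outcome_probs(1 / 3)
print(sum(probs.values()))  # 1.0 up to rounding: exactly one outcome happens
for name, chance in probs.items():
    print(f"{name}: {chance:.4f}")
```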
So that’s neat. We can figure out the probability of the match ending after four games, after five, after six, or after seven. And from that we can figure out the expected length of the match. This is the expectation value. Take the product of ‘4’ and the chance the match ends at four games. Take the product of ‘5’ and the chance the match ends at five games. Take the product of ‘6’ and the chance the match ends at six games. Take the product of ‘7’ and the chance the match ends at seven games. Add all those up. That’ll be, wonder of wonders, the number of games a match like this can be expected to run.
Now it’s a matter of adding together all these combinations of all these different outcomes, and you know what? I’m not doing that. I don’t know what the chance is that I’d do all this arithmetic correctly, but I know there’s no chance I’d do all of it correctly. This is the stuff we pirate Mathematica to do. (Mathematica is supernaturally good at working out mathematical expressions. A personal license costs all the money you will ever have in your life plus ten percent, which it will calculate for you.)
Happily I won’t have to work it out. A person appearing to be a high school teacher named B Kiggins has worked it out already. Kiggins put it and a bunch of other interesting worksheets on the web. (Look for the Voronoi Diagramas!)
There’s a lot of arithmetic involved. But it all simplifies out, somehow. Per Kiggins’ work, the expected number of games in a best-of-seven match, if one of the competitors has the chance ‘p’ of winning any given game, is: 4 + 4p + 4p^2 + 4p^3 - 52p^4 + 60p^5 - 20p^6.
Whatever you want to say about that, it’s a polynomial. And it’s easy enough to evaluate it, especially if you let the computer evaluate it. Oh, I would say it seems like a shame all those coefficients of ‘4’ drop off and we get weird numbers like ’52’ after that. But there’s something beautiful in there being four 4’s, isn’t there? Good enough.
So. If the chance of my winning a game, ‘p’, is one-third, then we’d expect the series to go 5.5 games. This accords well with my intuition. I thought I would be likely to win one game. Winning two would be a moral victory akin to championship.
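A sketch checking that figure two ways: once through the simplified sixth-degree polynomial the arithmetic collapses to, and once by summing game-count times probability over the eight outcomes directly.

```python
from math import comb

def expected_games_poly(p):
    """Expected length of a best-of-seven series, simplified polynomial form."""
    return 4 + 4*p + 4*p**2 + 4*p**3 - 52*p**4 + 60*p**5 - 20*p**6

def expected_games_direct(p):
    """The same expectation, summed over the eight possible outcomes."""
    q = 1 - p
    return sum(
        g * comb(g - 1, 3) * (p**4 * q**(g - 4) + q**4 * p**(g - 4))
        for g in range(4, 8)
    )

print(expected_games_poly(1 / 3))  # about 5.50
print(expected_games_poly(1 / 2))  # 5.8125: evenly matched series run longest
```

The two functions agree for any p between 0 and 1, which is a comforting check that the simplification didn’t drop anything.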
Let me go back to my palmed card. This whole analysis is based on the idea that I have some fixed probability of winning and that it isn’t going to change from one game to the next. If the probability of winning were based entirely on my and my opponent’s abilities this would be fair enough. Neither of us is likely to get significantly more or less skilled over the course of even seven matches. We won’t even play long enough to get fatigued.
But our abilities aren’t everything. We’re going to be playing up to seven different tables. How each table reacts to our play is going to vary. Some tables may treat me better, some tables my opponent. Luck of the draw. And there’s an important psychological component. It’s easy to get thrown and to let a bad ball wreck the rest of one’s game. It’s hard to resist feeling nervous if you go into the last ball from way behind your opponent. And it seems as if a pinball knows you’re nervous and races out of play to help you calm down. (The best pinball players tend to have outstanding last balls, though. They don’t get rattled. And they spend the first several balls building up to high-value shots they can collect later on.) And there will be freak events. Last weekend I was saved from elimination in a tournament by the pinball machine spontaneously resetting. We had to replay the game. I did well in the tournament, but it was the freak event that kept me from being knocked out in the first round.
That’s some complicated stuff to fit together. I suppose with enough data we could model how much the differences between pinball machines affect the outcome. That’s what sabermetrics is all about. But representing how severely a little bad luck compounds into a lot of bad luck? Oh, that’s hard.
Too hard to deal with, at least without much more sports psychology and modelling of pinball players than we have the data to do. The assumption that my chance of winning stays fixed over the course of the match may be false. But we won’t be playing enough games to be able to tell the difference, and the assumption is near enough to get us some useful information. We have to know not to demand too much precision from our model.
And seven games isn’t statistically significant. Not when players are as closely matched as we are. I could be worse and still get a couple wins in when they count; I could play better than my average and still get creamed four games straight. I’ll be trying my best, of course. But I expect my best is one or two wins, then getting to the snack room and waiting for the side tournament to start. Shall let you know if something interesting happens.
The mathematically-themed comic strips of the past week tended to touch on some classic topics and classic motifs. That’s enough for me to declare a title for these comics. Enjoy, won’t you please?
John McPherson’s Close To Home for the 9th uses the classic board full of mathematics to express deep thinking. And it’s deep thinking about sports. Nerds like to dismiss sports as trivial and so we get the punch line out of this. But models of sports have been one of the biggest growth fields in mathematics the past two decades. And they’ve shattered many longstanding traditional understandings of strategy. It’s not proper mathematics on the board, but that’s all right. It’s not proper sabermetrics either.
Vic Lee’s Pardon My Planet for the 10th is your classic joke about putting mathematics in marketable terms. There is an idea that a mathematical idea, to be really good, must be beautiful. And it’s hard to say exactly what beauty is, but “short” and “simple” seem to be parts of it. That’s a fine idea, as long as you don’t forget how context-laden these are. Whether an idea is short depends on what ideas and what concepts you have as background. Whether it’s simple depends on how much you’ve seen similar ideas before. π looks simple. “The smallest positive root of the solution to the differential equation y''(x) = -y(x), where y(0) = 0 and y'(0) = 1” looks hard. But however the rhetoric makes them look, those are the same thing.
Scott Hilburn’s The Argyle Sweater for the 10th is your classic anthropomorphic-numerals joke. Well, anthropomorphic-symbols in this case. But it’s the same genre of joke.
Randy Glasbergen’s Glasbergen Cartoons rerun for the 10th is your classic sudoku-and-arithmetic-as-hard-work joke. And it’s neat to see “programming a VCR” used as an example of the difficult-to-impossible task for a comic strip drawn late enough that it’s into the era of flat-screen, flat-bodied desktop computers.
Bill Holbrook’s On The Fastrack for the 11th is your classic grumbling-about-how-mathematics-is-understood joke. Well, statistics, but most people consider that part of mathematics. (One could mount a strong argument that statistics is as independent of mathematics as physics or chemistry are.) Statistics offers many chances for intellectual mischief, whether deliberately or just from not thinking matters through. That may be inevitable. Sampling, as in political surveys, must talk about distributions, about ranges of possible results. It’s hard to be flawless about that.
That said I’m not sure I can agree with Fi in her example here. Take her example, a political poll with three-point margin of error. If the poll says one candidate’s ahead by three points, Fi asserts, they’ll say it’s tied when it’s as likely the lead is six. I don’t see that’s quite true, though. When we sample something we estimate the value of something in a population based on what it is in the sample. Obviously we’ll be very lucky if the population and the sample have exactly the same value. But the margin of error gives us a range of how far from the sample value it’s plausible the whole population’s value is, or would be if we could measure it. Usually “plausible” means 95 percent; that is, 95 percent of the time the actual value will be within the margin of error of the sample’s value.
So here’s where I disagree with Fi. Let’s suppose that the first candidate, Kirk, polls at 43 percent. The second candidate, Picard, polls at 40 percent. (Undecided or third-party candidates make up the rest.) I agree that Kirk’s true, whole-population, support is equally likely to be 40 percent or 46 percent. But Picard’s true, whole-population, support is equally likely to be 37 percent or 43 percent. Kirk’s lead is actually six points if his support was under-represented in the sample and Picard’s was over-represented, by the same measures. But suppose Kirk was just as over-represented and Picard just as under-represented as they were in the previous case. This puts Kirk at 40 percent and Picard at 43 percent, a Kirk-lead of minus three percentage points.
So what’s the actual chance these two candidates are tied? Well, you have to say what a tie is. It’s vanishingly unlikely they have precisely the same true support, and we can’t really calculate that. Don’t blame statisticians. You tell me an election in which one candidate gets three more votes than the other isn’t really tied, if there are more than seven votes cast. We can work on “what’s the chance their support differs by less than some margin?” And then you’d have all the possible chances where Kirk gets a lower-than-surveyed return while Picard gets a higher-than-surveyed return. I can’t say what that is offhand. We haven’t said what this margin-of-tying is, for one thing.
But it is certainly higher than the chance the lead is actually six; that only happens if the actual vote is different from the poll in one particular way. A tie can happen if the actual vote is different from the poll in many different ways.
Doing a quick and dirty little numerical simulation suggests to me that, if we assume the sampling respects the standard normal distribution, then in this situation Kirk probably is ahead of Picard. Given a three-point lead and a three-point margin for error Kirk would be expected to beat Picard about 92 percent of the time, while Picard would win about 8 percent of the time.
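A simulation in that spirit is simple to sketch. I’m assuming, as one plausible reading, that each candidate’s true support is normally distributed around the polled value, with the 95 percent interval equal to the three-point margin of error; the 1.96 conversion is the standard one for a normal distribution:

```python
import random

random.seed(17)

MARGIN = 3.0           # 95-percent margin of error, in points
SIGMA = MARGIN / 1.96  # standard deviation implied by a 95% normal interval

TRIALS = 200_000
kirk_ahead = 0
for _ in range(TRIALS):
    kirk = random.gauss(43.0, SIGMA)    # polled values from the example above
    picard = random.gauss(40.0, SIGMA)
    if kirk > picard:
        kirk_ahead += 1

print(kirk_ahead / TRIALS)  # about 0.92
```

(Analytically, the difference of two such normals has standard deviation SIGMA times the square root of two, and a three-point lead works out to about a 92 percent chance, matching the simulation.)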
Here I have been making the assumption that Kirk’s and Picard’s support are to an extent independent. That is, a vote might be for Kirk or for Picard or for neither. There’s this bank of voting-for-neither-candidate votes that either could draw on. If there are no undecided voters, so that every voter backs either Kirk or Picard, then all of this breaks down: Kirk can be up by six only if Picard is down by six. But I don’t know of surveys that work like that.
Not to keep attacking this particular strip, which doesn’t deserve harsh treatment, but it gives me so much to think about. Assuming by “they” Fi means news anchors — and from what we get on panel, it’s not actually clear she does — I’m not sure they actually do “say the poll is tied”. What I more often remember hearing is that the difference is equal to, or less than, the survey’s margin of error. That might get abbreviated to “a statistical tie”, a usage that I think is fair. But Fi might mean the candidates or their representatives in saying “they”. I can’t fault the campaigns for interpreting data in ways useful for their purposes. The underdog needs to argue that the election can yet be won. The leading candidate needs to argue against complacency. In either case a tie is a viable selling point and a reasonable interpretation of the data.
Gene Weingarten, Dan Weingarten, and David Clark’s Barney and Clyde for the 12th is a classic use of Einstein and general relativity to explain human behavior. Everyone’s tempted by this. Usually it’s thermodynamics that inspires thoughts that society could be explained mathematically. There’s good reason for this. Thermodynamics builds great and powerful models of complicated systems by supposing that we never know, or need to know, what any specific particle of gas or fluid is doing. We care only about aggregate data. That statistics shows we can understand much about humanity without knowing fine details reinforces this idea. The Weingartens and Clark probably shifted from thermodynamics to general relativity because Einstein is recognizable to normal people. And we’ve all at least heard of mass warping space and can follow the metaphor to money warping law.
In vintage comics, Dan Barry’s Flash Gordon for the 14th (originally run the 28th of November, 1961) uses the classic idea that sufficient mathematics talent will outwit games of chance. Many believe it. I remember my grandmother’s disappointment that she couldn’t bring the underaged me into the casinos in Atlantic City. This did save her the disappointment of learning I haven’t got any gambling skill besides occasionally buying two lottery tickets if the jackpot is high enough. I admit that’s an irrational move on my part, but I can spare two dollars for foolishness once or twice a year. The idea of beating a roulette wheel, at least a fair wheel, isn’t absurd. In principle if you knew enough about how the wheel was set up and how the ball was weighted and how it was launched into the spin you could predict where it would land. In practice, good luck. I wouldn’t be surprised if a good roulette wheel weren’t chaotic, or close to it. If it’s chaotic then while the outcome could be predicted if the wheel’s spin and the ball’s initial speed were known well enough, they can’t be measured well enough for a prediction to be meaningful. The comic also uses the classic word balloon full of mathematical symbols to suggest deep reasoning. I spotted Einstein’s famous quote there.
I’ve found a good way to procrastinate on the next essay in the Why Stuff Can Orbit series. (I’m considering explaining all of differential calculus, or as much as anyone really needs, to save myself a little work later on.) In the meanwhile, though, here’s some interesting reading that’s come to my attention the last few weeks and that you might procrastinate your own projects with. (Remember Benchley’s Principle!)
First is Jeremy Kun’s essay Habits of highly mathematical people. I think it’s right in describing some of the worldview mathematics training instills, or that encourages people to become mathematicians. It does seem to me, though, that most everything Kun describes is also true of philosophers. I’m less certain, but I strongly suspect, that it’s also true of lawyers. These concentrations all tend to encourage thinking about what we mean by things, and to test those definitions by thought experiments. If we suppose this to be true, then what implications would it have? What would we have to conclude is also true? Does it include anything that would be absurd to say? And are the results useful enough that we can accept a bit of apparent absurdity?
New York magazine had an essay: Jesse Singal’s How Researchers Discovered the Basketball “Hot Hand”. The “Hot Hand” phenomenon is one every sports enthusiast, and most casual fans, know: sometimes someone is just playing really, really well. The problem has always been figuring out whether it exists. Do anything that isn’t a sure bet long enough and there will be streaks. There’ll be a stretch where it always happens; there’ll be a stretch where it never does. That’s how randomness works.
But it’s hard to show that. The messiness of the real world interferes. A chance of making a basketball shot is not some fixed thing over the course of a career, or over a season, or even over a game. Sometimes players do seem to be hot. Certainly anyone who plays anything competitively experiences a feeling of being in the zone, during which stuff seems to just keep going right. It’s hard to disbelieve something that you witness, even experience.
So the essay describes some of the challenges of this: coming up with a definition of a “hot hand”, for one. Coming up with a way to test whether a player has a hot hand. Seeing whether they’re observed in the historical record. Singal’s essay writes about some of the history of studying hot hands. There is a lot of probability, and of psychology, and of experimental design in it.
And then there’s this intriguing question Analysis Fact Of The Day linked to: did Gaston Julia ever see a computer-generated image of a Julia Set? There are many Julia Sets; they and their relative, the Mandelbrot Set, became trendy in the fractals boom of the 1980s. If you knew a mathematics major back then, there was at least one on her wall. It typically looks like a craggly, lightning-rimmed cloud. Its shapes are not easy to imagine. It’s almost designed for the computer to render. Gaston Julia died in March of 1978. Could he have seen a depiction?
It’s not clear. The linked discussion digs up early computer renderings. It also brings up an example of a late-19th-century hand-drawn depiction of a Julia-like set, and compares it to a modern digital rendition of the thing. Numerical simulation saves a lot of tedious work; but it’s always breathtaking to see how much can be done by reason.
And now to close out the rest of last week’s comics, those from between the 1st and the 6th of the month. It’s a smaller set. Take it up with the traffic division of Comic Strip Master Command.
Mason Mastroianni, Mick Mastroianni, and Perri Hart’s B.C. for the 2nd is mostly a word problem joke. It’s boosted some by melting into it a teacher complaining about her pay. It does make me think some about what the point of a story problem is. That is, why is the story interesting? Often it isn’t. The story is just an attempt to make a computation problem look like the sort of thing someone might wonder in the real world. This is probably why so many word problems are awful as stories and as incentive to do a calculation. There’s a natural interest that one might have in, say, the total distance travelled by a rubber ball dropped and bouncing until it finally comes to a rest. But that’s only really good for testing how one understands a geometric series. It takes more storytelling to work out why you might want to find a cube root of x² minus eight.
Dave Whamond’s Reality Check for the 3rd uses mathematics on the blackboard as symbolic for all the problems one might have. Also a solution, if you call it that. It wouldn’t read so clearly if Ms Haversham had an English problem on the board.
Mark Anderson’s Andertoons for the 5th keeps getting funnier to me. At first reading I didn’t connect the failed mathematics problem of 2 x 0 with the caption. Once I did, I realized how snugly fit the comic is.
Greg Curfman’s Meg Classics for the 5th ran originally the 23rd of May, 1998. The application of mathematics to everyday sports was a much less developed thing back then. It’s often worthwhile to methodically study what you do, though, to see what affects the results. Here Mike has found the team apparently makes twelve missed shots for each goal. This might not seem like much of a formula, but these are kids. We shouldn’t expect formulas with a lot of variables under consideration. Since Meg suggests Mike needed to account for “the whiff factor” I have to suppose she doesn’t understand the meaning of the formula. Or perhaps she wonders why missed kicks before getting to the goal don’t matter. Well, every successful model starts out as a very simple thing to which we add complexity, and realism, as we’re able to handle them. If lucky we end up with a good balance between a model that describes what we want to know and yet is simple enough to understand.
While researching for my post about the information content of baseball scores I found some tantalizing links. I had wanted to know how often each score came up. From this I could calculate the entropy, the amount of information in the score. That’s the sum, taken over every outcome, of minus one times the frequency of that outcome times the base-two logarithm of that frequency. And I couldn’t find that.
An article in The Washington Post had a fine lead, though. It offers, per the title, “the score of every basketball, football, and baseball game in league history visualized”. And as promised it gives charts of how often each number of runs has turned up in a game. The most common single-team score in a game is 3, with 4 and 2 almost as common. I’m not sure of the date range for these scores. The chart says it includes (and highlights) data from “a century ago”. And as the article was posted in December 2014 it can hardly use data from after that. I can’t imagine that the 2015 season has changed much, though. And whether they start their baseball statistics at 1871, 1876, 1883, 1891, or 1901 (each a defensible choice) should only change details.
Which is fine. I can’t get precise frequency data from the chart. The chart offers how many thousands of times a particular score has come up. But there aren’t reference lines to say definitively whether a zero was scored closer to 21,000 or 22,000 times. I will accept a rough estimate, since I can’t do any better.
I made my best guess at the frequency, from the chart. And then made a second-best guess. My best guess gave the information content of a single team’s score as a touch more than 3.5 bits. My second-best guess gave the information content as a touch less than 3.5 bits. So I feel safe in saying a single team’s score is about three and a half bits of information.
So the score of a baseball game, with two teams scoring, is probably somewhere around twice that, or about seven bits of information.
I have to say “around”. This is because the two teams aren’t scoring runs independently of one another. Baseball doesn’t allow for tie games except in rare circumstances. (It would usually be a game interrupted for some reason, and then never finished because the season ended with neither team in a position where winning or losing could affect their standing. I’m not sure that would technically count as a “game” for Major League Baseball statistical purposes. But I could easily see a roster of game scores counting that.) So if one team’s scored three runs in a game, we have the information that the other team almost certainly didn’t score three runs.
This estimate, though, does fit within my range estimate from 3.76 to 9.25 bits. And as I expected, it’s closer to nine bits than to four bits. The entropy seems to be a bit less than that of (American) football scores — somewhere around 8.7 bits — and of college basketball — probably somewhere around 10.8 bits — which is probably fair. There are a lot of numbers that make for plausible college basketball scores. There are slightly fewer pairs of numbers that make for plausible football scores. There are fewer still pairs of scores that make for plausible baseball scores. So there’s less information conveyed in knowing what the game’s score is.
I’ve hardly stopped reading the comics. I doubt I could even if I wanted to at this point. But all the comics in this bunch are from GoComics, which as far as I’m aware doesn’t turn off access to comic strips after a couple of weeks. So I don’t quite feel justified including the images of the comics when you can just click links to them instead.
It feels a bit barren, I admit. I wonder if I shouldn’t commission some pictures so I have something for visual appeal. There’s people I know who do comics online. They might be able to think of something to go alongside every “Student has snarky answer for a word problem” strip.
Brian and Ron Boychuk’s The Chuckle Brothers for the 8th of May drops in an absolute zero joke. Absolute zero’s a neat concept. People became aware of it partly by simple extrapolation. Given that the volume of a gas drops as the temperature drops, is there a temperature at which the volume drops to zero? (It’s complicated. But that’s the thread I use to justify pointing out this strip here.) And people also expected there should be an absolute temperature scale because it seemed like we should be able to describe temperature without tying it to a particular method of measuring it. That is, it would be a temperature “absolute” in that it’s not explicitly tied to what’s convenient for Western Europeans in the 19th century to measure. That zero and that instrument-independent temperature idea get conflated, and reasonably so. Hasok Chang’s Inventing Temperature: Measurement and Scientific Progress is well worth the read for people who want to understand absolute temperature better.
Gene Weingarten, Dan Weingarten & David Clark’s Barney and Clyde for the 9th is another strip that seems like it might not belong here. While it’s true that accidents sometimes lead to great scientific discoveries, what has that to do with mathematics? One thread is that there are mathematical accidents and empirical discoveries. Many of them are computer-assisted. There is something that feels experimental about doing a simulation. Modern chaos theory, the study of deterministic yet unpredictable systems, has at its founding myth Edward Lorenz discovering that tiny changes in a crude weather simulation program mattered almost right away. (By founding myth I don’t mean that it didn’t happen. I just mean it’s become the stuff of mathematics legend.)
But there are other ways that “accidents” can be useful. Monte Carlo methods are often used to find extreme — maximum or minimum — solutions to complicated systems. These are good if it’s hard to find a best possible answer, but it’s easy to compare whether one solution is better or worse than another. We can get close to the best possible answer by picking an answer at random, and fiddling with it at random. If we improve things, good: keep the change. You can see why this should get us pretty close to a best-possible-answer soon enough. And if we make things worse then … usually, but not always, we reject the change. Sometimes we take this “accident”. And that’s because if we only take improvements we might get caught at a local extreme. An even better extreme might be available but only by going down an initially unpromising direction. So it’s worth allowing for some “mistakes”.
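The scheme is easy to sketch on a toy problem. What follows is a simulated-annealing-style search; the bumpy function and every tuning number here are invented for illustration:

```python
import math
import random

random.seed(42)

def f(x):
    # A bumpy cost function: a quadratic with ripples, so there are
    # local minimums a keep-only-improvements search can get caught in.
    return x * x + 4.0 * math.sin(3.0 * x)

x = 8.0            # start far from the best answer
temperature = 3.0  # how willing we are to keep a worse answer
for _ in range(50_000):
    candidate = x + random.uniform(-0.5, 0.5)  # fiddle at random
    change = f(candidate) - f(x)
    if change < 0:
        x = candidate  # an improvement: keep the change
    elif random.random() < math.exp(-change / temperature):
        x = candidate  # a "mistake": sometimes keep it anyway
    temperature *= 0.9998  # grow pickier as the search goes on

print(x, f(x))
```

The occasionally accepted “mistake” is what lets the search climb out of the ripples, instead of settling in the first local minimum it stumbles into.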
Mark Anderson’s Andertoons for the 10th of May is some wordplay on volume. The volume of boxes is an easy formula to remember and maybe it’s a boring one. It’s enough, though. You can work out the volume of any shape using just the volume of boxes. But you do need integral calculus to tell how to do it. So maybe it’s easier to memorize the formula for volumes of a pyramid and a sphere.
Berkeley Breathed’s Bloom County for the 10th of May is a rerun from 1981. And it uses a legitimate bit of mathematics for Milo to insult Freida. He calls her a “log 10 times 10 to the derivative of 10,000”. The “log 10” is going to be 1. A reference to logarithm, without a base attached, means either base ten or base e. “log” by itself used to invariably mean base ten, back when logarithms were needed to do ordinary multiplication and division and exponentiation. Now that we have calculators for this mathematicians have started reclaiming “log” to mean the natural logarithm, base e, which is normally written “ln”, but that’s still an eccentric use. Anyway, the logarithm base ten of ten is 1: 10 is equal to 10 to the first power.
10 to the derivative of 10,000 … well, that’s 10 raised to whatever number “the derivative of 10,000” is. Derivatives take us into calculus. They describe how much a quantity changes as one or more variables change. 10,000 is just a number; it doesn’t change. It’s called a “constant”, in another bit of mathematics lingo that reminds us not all mathematics lingo is hard to understand. Since it doesn’t change, its derivative is zero. As anything else changes, the constant 10,000 does not. So the derivative of 10,000 is zero. 10 to the zeroth power is 1.
So, one times one is … one. And it’s rather neat that kids Milo’s age understand derivatives well enough to calculate that.
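The insult’s arithmetic checks out mechanically, too, if we let a finite difference stand in for the derivative:

```python
import math

def numerical_derivative(f, x, h=1e-6):
    # Finite-difference approximation to the derivative of f at x.
    return (f(x + h) - f(x)) / h

log_part = math.log10(10)                               # log, base ten, of 10 is 1
exponent = numerical_derivative(lambda x: 10_000, 0.0)  # a constant's derivative is 0
power_part = 10 ** exponent                             # 10 to the zeroth power is 1

print(log_part * power_part)  # 1.0
```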
Ruben Bolling’s Super-Fun-Pak Comix rerun for the 10th happens to have a bit of graph theory in it. One of Uncle Cap’n’s Puzzle Pontoons is a challenge to trace out a figure without retracing a line or lifting your pencil. You can’t, not this figure. One of the first things you learn in graph theory teaches how to tell, and why. And thanks to a Twitter request I’m figuring to describe some of that for the upcoming Theorem Thursdays project. Watch this space!
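The test graph theory teaches is short enough to write out: a connected figure can be traced, covering every line exactly once without lifting the pencil, just when zero or two of its vertices have an odd number of lines meeting there. A sketch, with example figures of my own invention rather than the strip’s actual puzzle:

```python
from collections import Counter

def traceable(edges):
    # Count how many lines meet at each vertex; a connected figure is
    # traceable exactly when zero or two vertices have an odd count.
    # (Connectivity is assumed here, not checked.)
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    odd_vertices = sum(1 for d in degree.values() if d % 2 == 1)
    return odd_vertices in (0, 2)

# A square with an X through its middle: each corner has three lines
# meeting, so four odd vertices, so the figure can't be traced.
square_with_x = [(0, 1), (1, 2), (2, 3), (3, 0),
                 (0, 4), (1, 4), (2, 4), (3, 4)]
print(traceable(square_with_x))  # False

# The classic "open envelope" figure has exactly two odd vertices.
envelope = [(0, 1), (1, 2), (2, 0), (0, 3), (1, 3)]
print(traceable(envelope))  # True
```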
Charles Schulz’s Peanuts Begins for the 11th, a rerun from the 6th of February, 1952, is cute enough. It’s one of those jokes about how a problem seems intractable until you’ve found the right way to describe it. I can’t fault Charlie Brown’s thinking here. Figuring out a way to make the problems familiar and easy is great.
Shaenon K Garrity and Jeffrey C Wells’s Skin Horse for the 12th is a “see, we use mathematics in the real world” joke. In this case it’s triangles and triangulation. That’s probably the part of geometry it’s easiest to demonstrate a real-world use for, and that makes me realize I don’t remember mathematics class making use of that. I remember it coming up some, particularly in what must have been science class when we built and launched model rockets. We used a measurement of the angle the rocket reached at its highest point, and knowledge of how far the observing station was from the launchpad. But that wasn’t mathematics class for some reason, which is peculiar.
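The triangulation itself is a single line of trigonometry: the rocket’s peak height is the distance to the launchpad times the tangent of the measured angle of elevation, assuming the rocket goes straight up. A sketch with made-up numbers:

```python
import math

distance_m = 100.0    # observer's distance from the launchpad (made up)
elevation_deg = 40.0  # measured angle of elevation at the peak (made up)

# Height of the opposite side of the right triangle the observer,
# launchpad, and rocket form.
height_m = distance_m * math.tan(math.radians(elevation_deg))
print(height_m)  # roughly 84 meters
```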
Meanwhile I have the slight ongoing quest to work out the information-theory content of sports scores. For college basketball scores I made up some plausible-looking score distributions and used that. For professional (American) football I found a record of all the score outcomes that’ve happened, and how often. I could use experimental results. And I’ve wanted to do other sports. Soccer was asked for. I haven’t been able to find the scoring data I need for that. Baseball, maybe the supreme example of sports as a way to generate statistics … has been frustrating.
The raw data is available. Retrosheet.org has logs of pretty much every baseball game, going back to the forming of major leagues in the 1870s. What they don’t have, as best I can figure, is a list of all the times each possible baseball score has turned up. That I could probably work out, when I feel up to writing the scripts necessary, but “work”? Ugh.
Some people have done the work, although they haven’t shared all the results. I don’t blame them; the full results make for a boring sort of page. “The Most Popular Scores In Baseball History”, at ValueOverReplacementGrit.com, reports the top ten most common scores from 1871 through 2010. The essay also mentions that as of then there were 611 unique final scores. And that lets me give some partial results, if we trust that blog posts from people I never heard of before are accurate and true. I will make that assumption over and over here.
There’s, in principle, no limit to how many scores are possible. Baseball contains many implied infinities, and it’s not impossible that a game could end, say, 580 to 578. But it seems likely that after 139 seasons of play there can’t be all that many more scores practically achievable.
Suppose then there are 611 possible baseball score outcomes, and that each of them is equally likely. Then the information-theory content of a score’s outcome is negative one times the logarithm, base two, of 1/611. That’s a number a little bit over nine and a quarter. You could deduce the score for a given game by asking usually nine, sometimes ten, yes-or-no questions from a source that knew the outcome. That’s a little higher than the 8.7 I worked out for football. And it’s a bit less than the 10.8 I estimate for college basketball.
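That nine-and-a-quarter is nothing deeper than a base-two logarithm, and the 1,026 football outcomes mentioned in another of these posts work the same way:

```python
import math

# With N equally likely outcomes the information content is log2(N):
# roughly how many yes-or-no questions pin down which outcome happened.
print(math.log2(611))   # baseball: a little over nine and a quarter
print(math.log2(1026))  # football: a touch over ten
```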
And there’s obvious rubbish there. In no way are all 611 possible outcomes equally likely. “The Most Popular Scores In Baseball History” says that right there in the essay title. The most common outcome was a score of 3-2, with 4-3 barely less popular. Meanwhile it seems only once, on the 28th of June, 1871, has a baseball game ended with a score of 49-33. Some scores are so rare we can ignore them as possibilities.
(You may wonder how incompetent baseball players of the 1870s were that a game could get to 49-33. Not so bad as you imagine. But the equipment and conditions they were playing with were unspeakably bad by modern standards. Notably, the playing field couldn’t be counted on to be flat and level and well-mowed. There would be unexpected divots or irregularities. This makes even simple ground balls hard to field. The baseball, instead of being replaced with every batter, would stay in the game. It would get beaten until it was a little smashed shell of unpredictable dynamics and barely any structural integrity. People were playing without gloves. If a game ran long enough, they would play at dusk, without lights, with a muddy ball on a dusty field. And sometimes you just have four innings that get out of control.)
What’s needed is a guide to what are the common scores and what are the rare scores. And I haven’t found that, nor worked up the energy to make the list myself. But I found some promising partial results. In a September 2008 post on Baseball-Fever.com, user weskelton listed the 24 most common scores and their frequency. This was for games from 1993 to 2008. One might gripe that the list only covers fifteen years. True enough, but if the years are representative that’s fine. And the top scores for the fifteen-year survey look to be pretty much the same as the 139-year tally. The 24 most common scores add up to just over sixty percent of all baseball games, which leaves a lot of scores unaccounted for. I am amazed that about three in five games will have a score that’s one of these 24 choices though.
But that’s something. We can calculate the information content for the 25 outcomes, one each of the 24 particular scores and one for “other”. This will under-estimate the information content. That’s because “other” is any of 587 possible outcomes that we’re not distinguishing. But if we have a lower bound and an upper bound, then we’ve learned something about what the number we want can actually be. The upper bound is that 9.25, above.
The information content, the entropy, we calculate from the probability of each outcome. We don’t know what that is. Not really. But we can suppose that the frequency of each outcome is close to its probability. If there’ve been a lot of games played, then the frequency of a score and the probability of a score should be close. At least they’ll be close if games are independent, if the score of one game doesn’t affect another’s. I think that’s close to true. (Some games at the end of pennant races might affect each other: why try so hard to score if you’re already out for the year? But there are few of them.)
The entropy then we find by calculating, for each outcome, a product. It’s minus one times the probability of that outcome times the base-two logarithm of the probability of that outcome. Then add up all those products. There are good reasons for doing it this way, and in the college-basketball link above I give some rough explanations of what the reasons are. Or you can just trust that I’m not lying or getting things wrong on purpose.
So let’s suppose I have calculated this right, using the 24 distinct outcomes and the one “other” outcome. That makes out the information content of a baseball score’s outcome to be a little over 3.76 bits.
As said, that’s a low estimate. Lumping about two-fifths of all games into the single category “other” drags the entropy down.
But that gives me a range, at least. A baseball game’s score seems to be somewhere between about 3.76 and 9.25 bits of information. I expect that it’s closer to nine bits than it is to four bits, but will have to do a little more work to make the case for it.
Last month, Sarcastic Goat asked me how interesting a soccer game was. This is “interesting” in the information theory sense. I describe what that is in a series of posts, linked to from above. That had been inspired by the NCAA “March Madness” basketball tournament. I’d been wondering about the information-theory content of knowing the outcome of the tournament, and of each game.
This measure, called the entropy, we can work out from knowing how likely all the possible outcomes of something — anything — are. If there’s two possible outcomes and they’re equally likely, the entropy is 1. If there’s two possible outcomes and one is a sure thing while the other can’t happen, the entropy is 0. If there’s four possible outcomes and they’re all equally likely, the entropy is 2. If there’s eight possible outcomes, all equally likely, the entropy is 3. If there’s eight possible outcomes and some are likely while some are long shots, the entropy is … smaller than 3, but bigger than 0. The entropy grows with the number of possible outcomes and shrinks with the number of unlikely outcomes.
But it’s easy to calculate. List all the possible outcomes. Find the probability of each of those possible outcomes happening. Then, calculate minus one times the probability of each outcome times the logarithm, base two, of that probability. Do this for each outcome, so yes, this might take a while. Then add up all those products.
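Those steps translate nearly word for word into code, and they reproduce the equally-likely examples above:

```python
import math

def entropy(probabilities):
    # Minus one times each probability times the base-two logarithm
    # of that probability, summed over every outcome. Outcomes with
    # zero probability contribute nothing to the sum.
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))   # two equally likely outcomes: 1 bit
print(entropy([1.0, 0.0]))   # a sure thing: 0 bits
print(entropy([0.25] * 4))   # four equally likely outcomes: 2 bits
print(entropy([0.125] * 8))  # eight equally likely outcomes: 3 bits
print(entropy([0.7, 0.1, 0.1, 0.05, 0.05]))  # long shots drag it down
```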
I’d estimated the outcome of the 63-game basketball tournament was somewhere around 48 bits of information. There’s a fair number of foregone, or almost foregone, conclusions in the game, after all. And I guessed, based on a toy model of what kinds of scores often turn up in college basketball games, that the game’s score had an information content of a little under 11 bits of information.
Sarcastic Goat, as I say, asked about soccer scores. I don’t feel confident that I could make up a plausible model of soccer score distributions. So I went looking for historical data. Surely, a history of actual professional soccer scores over a couple decades would show all the possible, plausible, outcomes and how likely each was to turn out.
I didn’t find one. My search for soccer scores kept getting contaminated with (American) football scores. But that turned up something interesting anyway. Sports Reference LLC has a table which purports to list the results of all professional football games played from 1920 through the start of 2016. There’ve been, apparently, some 1,026 different score outcomes, from 0-0 through to 73-0.
As you’d figure, there are a lot of freakish scores; only once in professional football history has the game ended 62-28. (Although it’s ended 62-14 twice, somehow.) There hasn’t been a 2-0 game since the second week of the 1938 season. Some scores turn up a lot; 248 games (as of this writing) have ended 20-17. That’s the most common score, in its records. 27-24 and 17-14 are the next most common scores. If I’m not making a dumb mistake, 7-0 is the 21st most common score. 93 games have ended with that tally. But it hasn’t actually been a game’s final score since the 14th week of the 1983 season, somehow. 98 games have ended 21-17; only ten have ended 21-18. Weird.
Anyway, there’s 1,026 recorded outcomes. That’s surely as close to “all the possible outcomes” as we can expect to get, at least until the Jets manage to lose 74-0 in their home opener. If all 1,026 outcomes were equally likely then the information content of the game’s score would be a touch over 10 bits. But these outcomes aren’t all equally likely. It’s vastly more likely that a game ended 16-13 than that it ended 16-8.
Let’s suppose I didn’t make any stupid mistakes in working out the frequency of all the possible outcomes. Then the information content of a football game’s outcome is a little over 8.72 bits.
Don’t be too hypnotized by the digits past the decimal. It’s approximate. But it suggests that if you were asking a source that would only answer ‘yes’ or ‘no’, then you could expect to get the score for any particular football game with about nine well-chosen questions.
I’m not surprised this is less than my estimated information content of a basketball game’s score. I think basketball games see a wider range of likely scores than football games do.
If someone has a reference for the outcomes of soccer games — or other sports — over a reasonably long time please let me know. I can run the same sort of calculation. We might even manage the completely pointless business of ranking all major sports by the information content of their scores.
John Zakour and Scott Roberts’s Maria’s Day is going to Sunday-only publication. A shame, but I understand Zakour and Roberts choosing to focus their energies on better-paying venues. That those venues are “writing science fiction novels” says terrifying things about the economic logic of web comics.
This installment, from the 23rd, is a variation on the joke about the lawyer, or accountant, or consultant, or economist, who carefully asks “what do you want the answer to be?” before giving it. Sports are a rich mine of numbers, though. Mostly they’re statistics, and we might wonder: why does anyone care about sports statistics? Once the score of a game is counted, what else matters? A sociologist and a sports historian are probably needed to give true, credible answers. My suspicion is that it amounts to money, as it ever does. If one wants to gamble on the outcomes of sporting events, one has to have a good understanding of what is likely to happen, and how likely it is to happen. And I suppose if one wants to manage a sporting event, one wants to spend money and time and other resources to best effect. That requires data, and that we see in numbers. And there are so many things that can be counted in any athletic event, aren’t there? All those numbers carry with them a hypnotic pull.
In Darrin Bell’s Candorville for the 24th of October, Lemont mourns how he’s forgotten how to do long division. It’s an easy thing to forget. For one, we have calculators, as Clyde points out. For another, long division ultimately requires we guess at and then try to improve an answer. It can’t be reduced to an operation that will never require back-tracking and trying some part of it again. That back-tracking — say, trying to put 28 into the number seven times, and finding it actually goes at least eight times — feels like a mistake. It feels like the sort of thing a real mathematician would never do.
And that’s completely wrong. Trying an answer, and finding it’s not quite right, and improving on it is perfectly sound mathematics. Arguably it’s the whole field of numerical mathematics. Perhaps students would find long division less haunting if they were assured that it is fine to get a wrong-but-close answer as long as you make it better.
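Long division’s guess-and-improve structure can even be written out as such. In this sketch each quotient digit starts as a deliberately middling guess and is corrected until it fits; the starting guess of 4 is arbitrary:

```python
def long_division(dividend, divisor):
    # Digit-by-digit long division, with the back-tracking made
    # explicit: guess a quotient digit, then improve the guess.
    quotient = 0
    remainder = 0
    for digit in str(dividend):
        remainder = remainder * 10 + int(digit)
        guess = 4  # an arbitrary middling first guess
        while guess > 0 and guess * divisor > remainder:
            guess -= 1  # guessed too high: back off
        while (guess + 1) * divisor <= remainder:
            guess += 1  # guessed too low: push up
        quotient = quotient * 10 + guess
        remainder -= guess * divisor
    return quotient, remainder

print(long_division(2044, 28))  # (73, 0)
print(long_division(100, 7))    # (14, 2)
```

The wrong-but-close guesses never hurt the final answer; they just cost a correction step, which is the point.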
John Graziano’s Ripley’s Believe It or Not for the 25th of October talks about the Rubik’s Cube, and all the ways it can be configured. I grant that 43,252,003,274,489,856,000 sounds like a suspiciously high count of possible combinations. But it is about what I hear from proper mathematics texts, the ones that talk about group theory, so let’s let it pass.
The Rubik’s Cube gets talked about in group theory, the study of things that work kind of like arithmetic. In this case, turning one of the faces — well, one of the thirds of a face — clockwise or counterclockwise by 90 degrees, so the whole thing stays a cube, works like adding or subtracting one, modulo 4. That is, we pretend the only numbers are 0, 1, 2, and 3, and the numbers wrap around. 3 plus 1 is 0; 3 plus 2 is 1. 1 minus 2 is 3; 1 minus 3 is 2. There are several separate rotations that can be done, each turning a third of each face of the cube. That each face of the cube starts a different color means it’s easy to see how these different rotations interact and create different color patterns. And rotations look easy to understand. We can at least imagine rotating most anything. In the Rubik’s Cube we can look at a lot of abstract mathematics in a handheld and friendly-looking package. It’s a neat thing.
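The wrapping-around arithmetic is tiny to demonstrate:

```python
def turn(a, b):
    # Quarter-turns of a face compose like addition modulo 4: the only
    # numbers are 0, 1, 2, and 3, and they wrap around.
    return (a + b) % 4

print(turn(3, 1))   # 3 plus 1 is 0
print(turn(3, 2))   # 3 plus 2 is 1
print(turn(1, -2))  # 1 minus 2 is 3
print(turn(1, -3))  # 1 minus 3 is 2
```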
Scott Hilburn’s The Argyle Sweater for the 26th of October is really a physics joke. But it uses (gibberish) mathematics as the signifier of “a fully thought-out theory” and that’s good enough for me. Also the talk of a “big boing” made me giggle and I hope it does you too.
Izzy Ehnes’s The Best Medicine Cartoon makes, I believe, its debut for Reading the Comics posts with its entry for the 26th. It’s also the anthropomorphic-numerals joke for the week.
Frank Page’s Bob the Squirrel is struggling under his winter fur this week. On the 27th Bob tries to work out the Newtonian forces involved in rolling about in his condition. And this gives me the chance to share a traditional mathematicians joke and a cliche punchline.
The story goes that a dairy farmer knew he could be milking his cows better. He could surely get more milk, and faster, if only the operations of his farm were arranged better. So he hired a mathematician, to find the optimal way to configure everything. The mathematician toured every part of the pastures, the milking barn, the cows, everything relevant. And then the mathematician set to work devising a plan for the most efficient possible cow-milking operation. The mathematician declared, “First, assume a spherical cow.”
The punch line has become a traditional joke in the mathematics and science fields. As a joke it comments on the folkloric disconnection between mathematicians and practicality. It also comments on the absurd assumptions that mathematicians and scientists will make for the sake of producing a model, and for getting an answer.
The joke within the joke is that it’s actually fine to make absurd assumptions. We do it all the time. All models are simplifications of the real world, tossing away things that may be important to the people involved but that just complicate the work we mean to do. We may assume cows are spherical because that reflects, in a not too complicated way, that while they might choose to get near one another they will also, given the chance, leave one another some space. We may pretend a fluid has no viscosity, because we are interested in cases where the viscosity does not affect the behavior much. We may pretend people are fully aware of the costs, risks, and benefits of any action they wish to take, at least when they are trying to decide which route to take to work today.
That an assumption is ridiculous does not mean the work built on it is ridiculous. We must defend why we expect those assumptions to make our work practical without introducing too much error. We must test whether the conclusions drawn from the assumption reflect what we wanted to model reasonably well. We can still learn something from a spherical cow. Or a spherical squirrel, if that’s the case.
Keith Tutt and Daniel Saunders’s Lard’s World Peace Tips for the 28th of October is a binary numbers joke. It’s the other way to tell the joke about there being 10 kinds of people in the world. (I notice that joke made in the comments on Gocomics.com. That was inevitable.)
Eric the Circle for the 29th of October, this one by “Gilly” again, jokes about mathematics being treated as if quite subject to law. The truth of mathematical facts isn’t subject to law, of course. But the use of mathematics is. It’s obvious, for example, in the setting of educational standards. What things a member of society must know to be a functioning part of it are, western civilization has decided, a subject governments may speak about. Thus what mathematics everyone should know is a subject of legislation, or at least legislation in the attenuated form of regulated standards.
But mathematics is subject to parliament (or congress, or the diet, or what have you) in subtler ways. Mathematics is how we measure debt, that great force holding society together. And measurement again has been (at least in western civilization) a matter for governments. We accept the principle that a government may establish a fundamental unit of weight or fundamental unit of distance. So too may it decide what is a unit of currency, and into how many pieces the unit may be divided. And from this it can decide how to calculate with that currency: if the “proper” price of a thing would be, say, five-ninths of the smallest available bit of currency, then what should the buyer give the seller?
Who cares, you might ask, and fairly enough. I can’t get worked up about the risk that I might overpay four-ninths of a penny for something, nor feel bad that I might cheat a merchant out of five-ninths of a penny. But consider: when Arabic numerals first made their way to the west they were viewed with suspicion. Everyone at the market or the moneylenders’ knew how Roman numerals worked, and could follow addition and subtraction with ease. Multiplication was harder, but it could be followed. Division was a disaster and I wouldn’t swear that anyone has ever successfully divided using Roman numerals, but at least everything else was nice and familiar.
But then suddenly there was this influx of new symbols, only one of them something that had ever been a number before. One of them at least looked like the letter O, but it was supposed to represent a missing quantity. And every calculation on this was some strange gibberish where one unfamiliar symbol plus another unfamiliar symbol turned into yet another unfamiliar symbol or maybe even two symbols. Sure, the merchant or the moneylender said it was easier, once you learned the system. But they were also the only ones who understood the system, and the ones who would profit by making “errors” that could not be detected.
Thus we see governments, even in worldly, trade-friendly city-states like Venice, prohibiting the use of Arabic numerals. Roman numerals may be inferior by every measure, but they were familiar. They stood at least until enough generations passed that the average person could feel “1 + 1 = 2” contained no trickery.
If one sees in this parallels to the problem of reforming mathematics education, all I can offer is that people are absurd, and we must love the absurdness of them.
One last note, so I can get this essay above two thousand words somehow. In the 1910s Alfred North Whitehead and Bertrand Russell published the awesome and menacing Principia Mathematica. This was a project to build arithmetic, and all mathematics, on sound logical grounds utterly divorced from the great but fallible resource of human intuition. They did probably as well as human beings possibly could. They used a bewildering array of symbols and such a high level of abstraction that a needy science fiction movie could put up any random page of the text and pass it off as Ancient High Martian.
But they were mathematicians and philosophers, and so could not avoid a few wry jokes, and one of them comes in Volume II, around page 86 (it’ll depend on the edition you use). There, in Proposition *110.643, Whitehead and Russell establish “1 + 1 = 2” and remark, “the above proposition is occasionally useful”. They note at least three uses in their text alone. (Of course this took so long because they were building a lot of machinery before getting to mere work like this.)
Back in my days as a graduate student I thought it would be funny to put up a mock political flyer, demanding people say “NO ON PROP *110.643”. I was wrong. But the joke is strong enough if you don’t go to the trouble of making up the sign. I didn’t make up the sign anyway.
And to murder my own weak joke: arguably “1 + 1 = 2” is established much earlier, around page 380 of the first volume, in proposition *54.43. The thing is, that proposition warns that “it will follow, when mathematical addition has been defined”, which it hasn’t been at that point. But if you want to say it’s Proposition *54.43 instead go ahead; it will not get you any better laugh.
If you’d like to see either proof rendered as non-head-crushingly as possible, the Metamath Proof Explorer shows the reasoning for Proposition *54.43 as well as that for *110.643. And it contains hyperlinks so that you can try to understand the exact chain of reasoning which comes to that point. Good luck. I come from a mathematical heritage that looks at the Principia Mathematica and steps backward, quickly, before it has the chance to notice us and attack.
There was a neat little fluke in baseball the other day. All fifteen of the Major League Baseball games on Tuesday were won by the home team. This appears to be the first time it’s happened since the league expanded to thirty teams in 1998. As best as the Elias Sports Bureau can work out, the last time every game was won by the home team was on the 23rd of May, 1914, when all four games in each of the National League, American League, and Federal League were home-team wins.
This produced talk about the home field advantage never having it so good, naturally. Also at least one article claimed the odds of fifteen home-team wins were one in 32,768. I can’t find that article now that I need it; please just trust me that it existed.
The thing is this claim is correct, if you assume there is no home-field advantage. That is, if you suppose the home team has exactly one chance in two of winning, then the chance of fifteen home teams winning is one-half raised to the fifteenth power. And that is one in 32,768.
This also assumes the games are independent, that is, that the outcome of one has no effect on the outcome of another. This seems likely, at least as long as we’re far enough away from the end of the season. In a pennant race a team might credibly relax once another game decided whether they had secured a position in the postseason. That might affect whether they win the game under way. Whether results are independent is always important for a probability question.
But stadium designers and the groundskeeping crew would not be doing their job if the home team had an equal chance of winning as the visiting team does. It’s been understood since the early days of organized professional baseball that the state of the field can offer advantages to the team that plays most of its games there.
Jack Jones, at Betfirm.com, estimated that for the five seasons from 2010 to 2014, the home team won about 53.7 percent of all games. Suppose we take this as accurate and representative of the home field advantage in general. Then the chance of fifteen home-team wins is 0.537 raised to the fifteenth power. That is approximately one divided by 11,230.
That’s a good bit more probable than the one in 32,768 you’d expect from the home team having exactly a 50 percent chance of winning. I think that’s a dramatic difference considering the home team wins a bit less than four percent more often than 50-50.
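If you’d like to check those numbers, the arithmetic fits in a few lines of Python (my own computation; the 53.7 percent figure is Jack Jones’s estimate from above):

```python
# Chance that all fifteen home teams win on one day, under two models.
fair = 0.5 ** 15       # no home-field advantage: each game a coin flip
skewed = 0.537 ** 15   # home team wins each game 53.7 percent of the time

print(1 / fair)        # 32768.0, the one-in-32,768 figure
print(1 / skewed)      # roughly 11,230, a noticeably better chance
```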
The follow-up question and one that’s good for a probability homework would be to work out what are the odds that we’d see one day with fifteen home-team wins in the mere eighteen years since it became possible.
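I can’t resist sketching that homework problem, though only under loudly-stated assumptions of my own: that each season offers about 180 days on which fifteen games get played (an overestimate, since many days have fewer games scheduled), and that the home team wins each game independently with probability 0.537.

```python
# A rough sketch of the follow-up question, under the assumptions above.
p_sweep = 0.537 ** 15                     # chance a given 15-game day sweeps
days = 18 * 180                           # assumed eligible days in 18 years
p_at_least_once = 1 - (1 - p_sweep) ** days
print(p_at_least_once)                    # roughly a one-in-four chance
```

Under those (generous) assumptions the fluke is not even especially surprising; tighten the day count and the chance shrinks accordingly.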
I haven’t had the chance to read the Gocomics.com comics yet today, but I’d had enough strips to bring up anyway. And I might need something to talk about on Tuesday. Two of today’s strips are from the legacy of Johnny Hart. Hart’s last decades, especially at B.C., when he most often wrote about his fundamentalist religious views, hurt his reputation and obscured the fact that his comics were really, really funny when they started. His heirs and successors have been doing fairly well at reviving the deliberately anachronistic and lightly satirical edge that made the strips funny to begin with, and one of them’s a perennial around here. The other, Wizard of Id Classics, is literally reprints from the earliest days of the comic strip’s run. That shows the strip when it was earning its place on every comics page everywhere, and made a good case for it.
Mason Mastroianni, Mick Mastroianni, and Perri Hart’s B.C. (July 8) shows how a compass, without straightedge, can be used to ensure one’s survival. I suppose it’s really only loosely mathematical but I giggled quite a bit.
Ken Cursoe’s Tiny Sepuku (July 9) talks about luck as being just the result of probability. That’s fair enough. Random chance will produce strings of particularly good, or bad, results. Those strings of results can look so long or impressive that we suppose they have to represent something real. Look to any sport and the argument about whether there are “hot hands” or “clutch performers”. And Maneki-Neko is right that a probability manipulator would help. You can get a string of ten tails in a row on a fair coin, but you’ll get many more if the coin has an eighty percent chance of coming up tails.
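If you doubt the coin claim, a little simulation of my own devising shows it. Nothing here comes from the comic; the flip count and the random seed are arbitrary choices.

```python
import random

def count_tail_streaks(p_tails, flips=100_000, streak=10, seed=42):
    """Count how many (overlapping) runs of at least ten tails turn up."""
    rng = random.Random(seed)
    run, count = 0, 0
    for _ in range(flips):
        if rng.random() < p_tails:
            run += 1
            if run >= streak:
                count += 1
        else:
            run = 0
    return count

fair = count_tail_streaks(0.5)
biased = count_tail_streaks(0.8)
print(fair, biased)   # the biased coin produces vastly more long runs
```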
Brant Parker and Johnny Hart’s Wizard of Id Classics (July 9, rerun from July 12, 1965) is a fun bit of volume-guessing and logic. So, yes, I giggled pretty solidly at both B.C. and The Wizard of Id this week.
Mell Lazarus’s Momma (July 11) identifies “long division” as the first thing a person has to master to be an engineer. I don’t know that this is literally true. It’s certainly true that liking doing arithmetic helps one in a career that depends on calculation, though. But you can be a skilled songwriter without being any good at writing sheet music. I wouldn’t be surprised if there are skilled engineers who are helpless at dividing fourteen into 588.
Bunny Hoest and John Reiner’s Lockhorns (July 12) includes an example of using “adding up” to mean “make sense”. It’s a slight thing. But the same idiom was used last week, in Eric Teitelbaum and Bill Teitelbaum’s Bottomliners. I don’t think Comic Strip Master Command is ordering this punch line yet, but you never know.
And finally, I do want to try something a tiny bit new, and explicitly invite you-the-readers to say what strip most amused you. Please feel free to comment about your choices, or warn me that I set up the poll wrong. I haven’t tried this before.
I’d worked out an estimate of how much information content there is in a basketball score, by which I was careful to say the score that one team manages in a game. I wasn’t able to find out what the actual distribution of real-world scores was like, unfortunately, so I made up a plausible-sounding guess. I supposed that college basketball scores would be distributed among the imaginable numbers (whole numbers from zero on up, though in practice probably not more than 150) according to a very common distribution called the “Gaussian” or “normal” distribution; that the arithmetic mean score would be about 65; and that the standard deviation, a measure of how spread out the distribution of scores is, would be about 10.
If those assumptions are true, or are at least close enough to true, then there are something like 5.4 bits of information in a single team’s score. Put another way, if you were trying to divine the score by asking someone who knew it a series of carefully-chosen questions, like, “is the score less than 65?” or “is the score more than 39?”, with at each stage each question equally likely to be answered yes or no, you could expect to hit the exact score with usually five, sometimes six, such questions.
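Here’s that 5.4-bit figure checked by direct computation, in a Python sketch of my own; the mean of 65, standard deviation of 10, and the 0-to-150 range of whole-number scores are the assumptions from above.

```python
from math import erf, log2, sqrt

# Entropy, in bits, of a score modeled as a Gaussian with mean 65 and
# standard deviation 10, rounded to whole numbers from 0 through 150.
mean, sd = 65.0, 10.0

def cdf(x):
    """Cumulative probability of the assumed Gaussian up to x."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

bits = 0.0
for score in range(0, 151):
    p = cdf(score + 0.5) - cdf(score - 0.5)   # chance of this exact score
    if p > 0:
        bits -= p * log2(p)
print(bits)   # about 5.37 bits, the "something like 5.4" quoted above
```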
When I worked out how interesting, in an information-theory sense, a basketball game — and from that, a tournament — might be, I supposed there was only one thing that might be interesting about the game: who won? Or to be exact, “did (this team) win”? But that isn’t everything we might want to know about a game. For example, we might want to know what a team scored. People often do. So how to measure this?
The answer was given, in embryo, in my first piece about how interesting a game might be. If you can list all the possible outcomes of something that has multiple outcomes, and how probable each of those outcomes is, then you can describe how much information there is in knowing the result. It’s the sum, for all of the possible results, of the quantity negative one times the probability of the result times the logarithm-base-two of the probability of the result. When we were interested in only whether a team won or lost there were just the two outcomes possible, which made for some fairly simple calculations, and indicates that the information content of a game can be as high as 1 — if the team is equally likely to win or to lose — or as low as 0 — if the team is sure to win, or sure to lose. And the units of this measure are bits, the same kind of thing we use to measure (in groups of bits called bytes) how big a computer file is.
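That sum is simple enough to write out in code. Here is a minimal Python version of my own, with the two extreme cases from the paragraph above:

```python
from math import log2

def entropy_bits(probabilities):
    """The sum, over all outcomes, of negative one times the probability
    of the outcome times the logarithm base two of that probability."""
    return sum(-p * log2(p) for p in probabilities if p > 0)

print(entropy_bits([0.5, 0.5]))  # 1.0 bit: an evenly-matched game
print(entropy_bits([1.0, 0.0]))  # 0.0 bits: a foregone conclusion
```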
When I wrote about how interesting the results of a basketball tournament were, and came to the conclusion that it was 63 (and filled in that I meant 63 bits of information), I was careful to say that the outcome of a basketball game between two evenly-matched opponents has an information content of 1 bit. If the game is a foregone conclusion, then the game hasn’t got so much information about it. If the game really is foregone, the information content is 0 bits; you already know what the result will be. If the game is an almost sure thing, there’s very little information to be had from actually seeing the game. An upset might be thrilling to watch, but you would hardly count on that, if you’re being rational. But most games aren’t sure things; we might expect the higher-seed to win, but it’s plausible they don’t. How does that affect how much information there is in the results of a tournament?
Last year, the NCAA College Men’s Basketball tournament inspired me to look up what the outcomes of various types of matches were, and which teams were more likely to win than others. If some person who wrote something for statistics.about.com is correct, based on 27 years of March Madness outcomes, the play between a number one and a number 16 seed is a foregone conclusion — the number one seed always wins — while number two versus number 15 is nearly sure. So while the first round of play will involve 32 games — four regions, each region having eight games — there’ll be something less than 32 bits of information in all these games, since many of them are so predictable.
If we take the results from that statistics.about.com page as accurate and reliable as a way of predicting the outcomes of various-seeded teams, then we can estimate the information content of the first round of play at least.
Here’s how I work it out, anyway:
| Contest | Probability the Higher Seed Wins | Information Content of this Outcome |
| --- | --- | --- |
| #1 seed vs #16 seed | 100% | 0 bits |
| #2 seed vs #15 seed | 96% | 0.2423 bits |
| #3 seed vs #14 seed | 85% | 0.6098 bits |
| #4 seed vs #13 seed | 79% | 0.7415 bits |
| #5 seed vs #12 seed | 67% | 0.9149 bits |
| #6 seed vs #11 seed | 67% | 0.9149 bits |
| #7 seed vs #10 seed | 60% | 0.9710 bits |
| #8 seed vs #9 seed | 47% | 0.9974 bits |
So if the eight contests in a single region were all evenly matched, the information content of that region would be 8 bits. But there’s one sure and one nearly-sure game in there, and there’s only a couple games where the two teams are close to evenly matched. As a result, I make out the information content of a single region to be about 5.392 bits of information. Since there’s four regions, that means the first round of play — the first 32 games — have altogether about 21.567 bits of information.
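Here is the arithmetic, if you’d like to check it, as a short Python sketch of my own; the probabilities are the ones from the table above.

```python
from math import log2

# Chances, from the table above, that the higher seed wins each of the
# eight first-round games in a single region.
win_chances = [1.00, 0.96, 0.85, 0.79, 0.67, 0.67, 0.60, 0.47]

def game_bits(p):
    """Information content, in bits, of a game won with probability p."""
    bits = 0.0
    for q in (p, 1 - p):
        if q > 0:
            bits -= q * log2(q)
    return bits

region = sum(game_bits(p) for p in win_chances)
print(region)      # about 5.392 bits for one region
print(4 * region)  # about 21.567 bits for all four regions
```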
A statistical analysis of the tournaments which I dug up last year indicated that in the last three rounds — the Elite Eight, Final Four, and championship game — the higher- and lower-seeded teams are equally likely to win, and therefore those games have an information content of 1 bit per game. The last three rounds therefore have 7 bits of information total.
Unfortunately, experimental data seems to fall short for the second round — 16 games, where the 32 winners in the first round play, producing the Sweet Sixteen teams — and the third round — 8 games, producing the Elite Eight. If someone’s done a study of how often the higher-seeded team wins I haven’t run across it.
There are six of these games in each of the four regions, for 24 games total. Presumably the higher-seeded team is more likely than the lower-seeded to win, but I don’t know how much more probable it is the higher seed will win. I can come up with some bounds: the 24 games total in the second and third rounds must have an information content greater than 0 bits, since they’re not all foregone conclusions. The higher-ranked seed won’t win all the time. And they can’t have an information content of more than 24 bits, since that’s how much there would be if the games were perfectly even matches.
So, then: the first round carries about 21.567 bits of information. The second and third rounds carry between 0 and 24 bits. The fourth through sixth rounds (the sixth round is the championship game) carry seven bits. Overall, the 63 games of the tournament carry between 28.567 and 52.567 bits of information. I would expect that many of the second-round and most of the third-round games are pretty close to even matches, so I would expect the higher end of that range to be closer to the true information content.
Let me make the assumption that in this second and third round the higher-seed has roughly a chance of 75 percent of beating the lower seed. That’s a number taken pretty arbitrarily as one that sounds like a plausible but not excessive advantage the higher-seeded teams might have. (It happens it’s close to the average you get of the higher-seed beating the lower-seed in the first round of play, something that I took as confirming my intuition about a plausible advantage the higher seed has.) If, in the second and third rounds, the higher-seed wins 75 percent of the time and the lower-seed 25 percent, then the outcome of each game is about 0.8113 bits of information. Since there are 24 games total in the second and third rounds, that suggests the second and third rounds carry about 19.471 bits of information.
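And the arithmetic behind those last two numbers, as a couple of lines of Python (my computation, with the 75 percent figure being the arbitrary assumption described above):

```python
from math import log2

# Assume the higher seed wins each second- and third-round game with
# probability 0.75, the lower seed with probability 0.25.
p = 0.75
game = -(p * log2(p) + (1 - p) * log2(1 - p))
print(game)        # about 0.8113 bits per game
print(24 * game)   # about 19.471 bits for the 24 games of rounds two and three
```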
Taking all these numbers, though — the first round with its something like 21.567 bits of information; the second and third rounds with something like 19.471 bits; the fourth through sixth rounds with 7 bits — the conclusion is that the win/loss results of the entire 63-game tournament are about 48 bits of information. It’s a bit higher the more unpredictable the games involving the final 32 and the Sweet 16 are; it’s a bit lower the more foregone those conclusions are. But 48 bits sounds like a plausible enough answer to me.
Yes, I can hear people snarking, “not even the tiniest bit”. These are people who think calling all athletic contests “sportsball” is still a fresh and witty insult. No matter; what I mean to talk about applies to anything where there are multiple possible outcomes. If you would rather talk about how interesting the results of some elections are, or whether the stock market rises or falls, whether your preferred web browser gains or loses market share, whatever, read it as that instead. The work is all the same.
To talk about quantifying how interesting the outcome of a game (election, trading day, whatever) means we have to think about what “interesting” qualitatively means. A sure thing, a result that’s bound to happen, is not at all interesting, since we know going in that it’s the result. A result that’s nearly sure but not guaranteed is at least a bit interesting, since after all, it might not happen. An extremely unlikely result would be extremely interesting, if it could happen.
The Prior Probability blog points out an interesting graph, showing the most common scores for basketball teams, based on the final scores of every NBA game. It’s actually got three sets of data there, one for all basketball games, one for games this decade, and one for basketball games of the 1950s. Unsurprisingly there’s many more results for this decade — the seasons are longer, and there are thirty teams in the league today, as opposed to eight or nine in 1954. (The Baltimore Bullets played fourteen games before folding, and the games were expunged from the record. The league dropped from eleven teams in 1950 to eight for 1954-1959.)
I’m fascinated by this just as a depiction of probability distributions: any team can, in principle, reach most any non-negative score in a game, but it’s most likely to be around 102. Surely there’s a maximum possible score, based on the fact a team has to get the ball and get into position before it can score; I’m a little curious what that would be.
Prior Probability itself links to another blog which reviews the distribution of scores for other major sports, and the interesting result of what the most common basketball score has been, per decade. It’s increased from the 1940s and 1950s, but it’s considerably down from the 1960s.
You can see the most common scores in such sports as basketball, football, and baseball in Philip Bump’s fun Wonkblog post here. Mr Bump writes: “Each sport follows a rough bell curve … Teams that regularly fall on the left side of that curve do poorly. Teams that land on the right side do well.” Read more about Gaussian distributions here.