The answer would be “of course not”. I was playing against, mostly, the same people who were in the state finals. (A few who didn’t qualify for the finals joined the side tournament.) In that I had done well enough, winning seven games in all out of fifteen played. It’s implausible that I got significantly worse at pinball between the main and the side tournament. But can I make a logically sound argument about this?
In full, probably not. It’s too hard. The question is, did I win way too few games compared to what I should have expected? But what should I have expected? I haven’t got any information on how likely it should have been that I’d win any of the games, especially not when I faced something like a dozen different opponents. (I played several opponents twice.)
But we can make a model. Suppose that I had a fifty percent chance of winning each match. This is a lie in detail. The model contains lies; all models do. The lies might let us learn something interesting. Some people there I could only beat with a stroke of luck on my side. Some people there I could fairly often expect to beat. If we pretend I had the same chance against everyone, though, we get something that we can model. It might tell us something about what really happened.
If I play 16 matches, and have a 50 percent chance of winning each of them, then I should expect to win eight matches. But there’s no reason I might not win seven instead, or nine. Might win six, or ten, without that being too implausible. It’s even possible I might not win a single match, or that I might win all sixteen matches. How likely?
This calls for a creature from the field of probability that we call the binomial distribution. It’s “binomial” because it’s about stuff for which there are exactly two possible outcomes. This fits. Each match I can win or I can lose. (If we tie, or if the match is interrupted, we replay it, so there’s not another case.) It’s a “distribution” because we describe, for a set of some number of attempted matches, how the possible outcomes are distributed. The outcomes are: I win none of them. I win exactly one of them. I win exactly two of them. And so on, all the way up to “I win exactly all but one of them” and “I win all of them”.
To answer the question of whether it’s plausible I should have done so badly I need to know more than just how likely it is I would win only three games. I need to also know the chance I’d have done worse. If I had won only two games, or only one, or none at all. Why?
Here I admit: I’m not sure I can give a compelling reason, at least not in English. I’ve been reworking it all week without being happy at the results. Let me try pieces.
One part is that the question as I put it (is it plausible that I could do so awfully?) isn’t answered just by checking how likely it is I would win only three games out of sixteen. If that’s awful, then doing even worse must also be awful. I can’t rule out even-worse results from awfulness without losing a sense of what the word “awful” means. Fair enough, to answer that question. But I made up the question. Why did I make up that one? Why not just “is it plausible I’d get only three out of sixteen games”?
Habit, largely. Experience shows me that the probability of any particular result turns out to be implausibly low. It isn’t quite that case here; there are only seventeen noticeably different possible outcomes of playing sixteen games. But there can be so many possible outcomes that even the most likely one isn’t likely.
Take an extreme case. (Extreme cases are often good ways to build an intuitive understanding of things.) Imagine I played 16,000 games, with a 50-50 chance of winning each one of them. It is most likely that I would win 8,000 of the games. But the probability of winning exactly 8,000 games is small: only about 0.6 percent. What’s going on there is that there’s almost the same chance of winning exactly 8,001 or 8,002 games. As the number of games increases the number of possible different outcomes increases. If there are 16,000 games there are 16,001 possible outcomes. It’s less likely that any of them will stand out. What saves our ability to predict the results of things is that the number of plausible outcomes increases more slowly. It’s plausible someone would win exactly three games out of sixteen. It’s impossible that someone would win exactly three thousand games out of sixteen thousand, even though that’s the same ratio of won games.
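Those figures can be checked directly. This is a minimal Python sketch, assuming the same 50-50 chance per game as the model; the function names are made up for illustration. The last line reports a base-10 logarithm because the chance of exactly 3,000 wins is too small to hold in an ordinary floating-point number.

```python
import math

def prob_exact(n, k):
    """Chance of exactly k wins in n games, each a 50-50 proposition."""
    # Python integers are exact, and int / int division rounds correctly
    # even when both operands are far too large to be floats themselves.
    return math.comb(n, k) / 2**n

def log10_prob_exact(n, k):
    """Base-10 logarithm of the same chance, for values too tiny for a float."""
    return math.log10(math.comb(n, k)) - n * math.log10(2)

print(f"exactly 8,000 of 16,000: {prob_exact(16000, 8000):.4%}")
print(f"exactly 8,001 of 16,000: {prob_exact(16000, 8001):.4%}")
print(f"exactly 3 of 16:         {prob_exact(16, 3):.4%}")
print(f"exactly 3,000 of 16,000: about 10^{log10_prob_exact(16000, 3000):.0f}")
```

The first two lines should come out nearly equal, which is the point about neighboring outcomes crowding each other.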
Card games offer another way to get comfortable with this idea. A bridge hand, for example, is thirteen cards drawn out of fifty-two. But the chance that you were dealt the hand you just got? Impossibly low. Should we conclude from this all bridge hands are hoaxes? No, but ask my mother sometime about the bridge class she took that one cruise. “Three of sixteen” is too particular; “at best three of sixteen” is a class I can study.
Unconvinced? I don’t blame you. I’m not sure I would be convinced of that, but I might allow the argument to continue. I hope you will. So here are the specifics. These are the possible counts of wins, and the chance of having exactly that many wins, for sixteen matches:
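The chances for every count of wins can be recomputed from the binomial distribution. What follows is a minimal Python sketch, not necessarily how the original numbers were produced; it assumes sixteen independent matches at a 50-50 chance each, and `binomial_pmf` is a name invented here.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Return [P(exactly k wins) for k = 0..n] as exact fractions."""
    p = Fraction(p)
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

pmf = binomial_pmf(16, Fraction(1, 2))
print("wins  chance")
for wins, chance in enumerate(pmf):
    print(f"{wins:4d}  {float(chance):8.4%}")
```

Working in exact fractions means the seventeen chances add up to exactly one, with no rounding drift.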
So the chance of doing as awfully as I had — winning zero or one or two or three games — is pretty dire. It’s a little above one percent.
Is that implausibly low? Is there so small a chance that I’d do so badly that we have to figure I didn’t have a 50-50 chance of winning each game?
I hate to think that. I didn’t think I was outclassed. But here’s a problem. We need some standard for what is “it’s implausibly unlikely that this happened by chance alone”. If there were only one chance in a trillion that someone with a 50-50 chance of winning any game would put in the performance I did, we could suppose that I didn’t actually have a 50-50 chance of winning any game. If there were only one chance in a million of that performance, we might also suppose I didn’t actually have a 50-50 chance of winning any game. But here there was only one chance in a hundred? Is that too unlikely?
It depends. We should have set a threshold for “too implausibly unlikely” before we started research. It’s bad form to decide afterward. There are some thresholds that are commonly taken. Five percent is often useful for stuff where it’s hard to do bigger experiments and the harm of guessing wrong (dismissing the idea I had a 50-50 chance of winning any given game, for example) isn’t so serious. One percent is another common threshold, again common in stuff like psychological studies where it’s hard to get more and more data. In a field like physics, where experiments are relatively cheap to keep running, you can gather enough data to insist on fractions of a percent as your threshold.
In my defense, I thought (without doing the work) that I probably had something like a five percent chance of doing that badly by luck alone. The actual figure, about one percent, suggests that I did have a much worse than 50 percent chance of winning any given game.
Is that credible? Well, yeah; I may have been in the top sixteen players in the state. But a lot of those people are incredibly good. Maybe I had only one chance in three, or something like that. That would make the chance I did that poorly something like one in six, likely enough.
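Both tail figures, the one for a 50-50 chance and the one for a one-in-three chance, can be checked with a short sum. This is a sketch under the model's assumptions (sixteen independent games, the same chance of winning each); `chance_at_most` is a hypothetical helper name.

```python
from math import comb

def chance_at_most(wins, games, p):
    """Chance of winning `wins` or fewer of `games`, each won with chance p."""
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins + 1))

even_odds = chance_at_most(3, 16, 0.5)       # the 50-50 model
one_in_three = chance_at_most(3, 16, 1 / 3)  # the "one chance in three" guess
print(f"p = 1/2: {even_odds:.4f}")
print(f"p = 1/3: {one_in_three:.4f}")
```

The first value is 697/65536, a little above one percent; the second lands close to one in six.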
It’s also plausible that games are not independent, that whether I win one game depends in some way on whether I won or lost the previous one. It does feel like it’s easier to win after a win, or after a close loss. And it feels harder to win a game after a string of losses. I don’t know that this can be proved, not on the meager evidence I have available. And you can almost always question the independence of a string of events like this. It’s the safe bet.
I’d mentioned in the previous essay how much contingency there is, especially in a short series like this one. My opponent started out by picking the game I expected she would. And she got an awful bounce on the first ball, while I got a very lucky bounce that started multiball on the last. So I won, but not because I was playing better. The seventh game was one that I had figured she might pick if she needed to crush me, and if I had gotten a better bounce on the first ball I’d still have had an uphill struggle. Just less of one.
After the first round I got into a set of three “tie-breaking” rounds, used to sort out which of the sixteen players ranked as number 11 versus number 10. Each of those was a best-of-three series. I won one series and lost two others, dropping me into 12th place. Over the three series I had four wins and four losses, so I can’t say that I was mismatched there.
Where I might have been mismatched is the side tournament. This was a two-hour marathon of playing a lot of games one after the other. I finished with three wins and 13 losses, enough to make me wonder whether I somehow went from competent to incompetent in the hour or so between the main and the side tournament. Of course not, based on a record like that, but — can I prove it?
Meanwhile a friend pointed out The New York Times covering the New York State pinball championship:
The article is (at least for now) at https://www.nytimes.com/2017/02/12/nyregion/pinball-state-championship.html. What my friend couldn’t have known, and what shows how networked people are, is that I know one of the people featured in the article, Sean “The Storm” Grant. Well, I knew him, back in college. He was an awesome pinball player even then. And he’s only got more awesome since.
How awesome? Let me give you some background. The International Flipper Pinball Association (IFPA) gives players ranking points. These points are gathered by playing in leagues and tournaments. Each league or tournament has a certain point value. That point value is divided up among the players, in descending order from how they finish. How many points do the events have? That depends on how many people play and what their ranking is. So, yes, how much someone’s IFPA score increases depends on the events they go to, and the events they go to depend on their score. This might sound to you like there’s a differential equation describing all this. You’re close: it’s a difference equation, because these rankings change with the discrete number of events players go to. But there’s an interesting and iterative system at work there.
(Points only expire with time. The system is designed to encourage people to play a lot of things and keep playing them. You can’t lose ranking points by playing, although it might hurt your player-versus-player rating. That’s calculated by a formula I don’t understand at all.)
Anyway, Sean Grant plays in the New York Superleague, a crime-fighting band of pinball players who figured out how to game the IFPA rankings system. They figured out how to turn the large number of people who might visit a Manhattan bar and casually play one or two games into a source of ranking points for the serious players. The IFPA, combatting this scheme, just this week recalculated the Superleague values and the rankings of everyone involved in it. It’s fascinating stuff, in that way a heated debate over an issue you aren’t emotionally invested in can be.
Anyway. Grant is such a skilled player that he lost more points in this nerfing than I have gathered in my whole competitive-pinball-playing career.
So while I knew I’d be knocked out in the first round of the Michigan State Championships I’ll admit I had fantasies of having an impossibly lucky run. In that case, I’d have gone to the nationals and been turned into a pale, silverball-covered paste by people like Grant.
Thanks again for all your good wishes, kind readers. Now we start the long road to the 2017 State Championships, to be held in February of next year. I’m already in 63rd place in the state for the year! (There haven’t been many events for the year yet, and the championship and side tournament haven’t posted their ranking scores yet.)
And now to start my second week of this summer mathematics A to Z challenge. This time I’ve got another word that just appears all over the mathematics world.
The word “dual” turns up in a lot of fields. The details of what the dual is depend on which field of mathematics we’re talking about. But the general idea is the same. Start with some mathematical construct. The dual is some new mathematical thing, which is based on the thing you started with.
For example, for the box (or die) you create the dual this way. At the center of each of the flat surfaces (the faces, in the lingo) put a dot. That’s a corner (a vertex) of a new shape. You should have six of them when you’re done. Now imagine drawing in new edges between the corners. The rule is that you put an edge in from one corner to another only if the surfaces those corners come from were adjacent. And on your new shape you put in a surface, a face, between the new edges if the old edges shared a corner. If you’ve done this right, you should get out of it an eight-sided shape, with triangular surfaces, and six corners. It’s known as an octahedron, although you might know it better as an eight-sided die.
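The recipe above can be followed mechanically. Here is a minimal Python sketch; it assumes a cube centered at the origin with corners at (±1, ±1, ±1), and all the variable names are invented for the illustration. A dot product of zero between two face centers means the faces are adjacent, and a dot product of one between a face center and a corner means the face touches that corner.

```python
from itertools import combinations, product

# Each face of the cube, named by the point at its center.
face_centers = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
# The eight corners of the original cube.
cube_corners = list(product((-1, 1), repeat=3))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# New corners: one per old face.
dual_vertices = face_centers
# New edges: join two new corners only if their old faces were adjacent.
dual_edges = [(f, g) for f, g in combinations(face_centers, 2) if dot(f, g) == 0]
# New faces: one per old corner, spanning the old faces that met there.
dual_faces = [tuple(f for f in face_centers if dot(f, corner) == 1)
              for corner in cube_corners]

print(len(dual_vertices), "corners,", len(dual_edges), "edges,", len(dual_faces), "faces")
```

Every entry of `dual_faces` has three corners, so the eight faces are triangles, as an octahedron's should be.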
A friend who’s also into The Price Is Right claimed to have noticed something peculiar about the “Any Number” game. Let me give context before the peculiarity.
This pricing game is the show’s oldest — it was actually the first one played when the current series began in 1972, and also the first pricing game won — and it’s got a wonderful simplicity: four digits from the price of a car (the first digit, nearly invariably a 1 or a 2, is given to the contestant and not part of the game), three digits from the price of a decent but mid-range prize, and three digits from a “piggy bank” worth up to $9.87 are concealed. The contestant guesses digits from zero through nine inclusive, and they’re revealed in the three prices. The contestant wins whichever prize has its price fully revealed first. This is a steadily popular game, and one of the rare Price games which guarantees the contestant wins something.
A couple things probably stand out. The first is that if you’re very lucky (or unlucky) you can win with as few as three digits called, although it might be the piggy bank for a measly twelve cents. (Past producers have said they’d never let the piggy bank hold less than $1.02, which still qualifies as “technically something”.) The other is that no matter how bad you are, you can’t take more than eight digits to win something, though it might still be the piggy bank.
What my friend claimed to notice was that these “Any Number” games went on to the last possible digit “all the time”, and he wanted to know, why?
My first reaction was: “all” the time? Well, at least it happened an awful lot of the time. But I couldn’t think of a particular reason that they should so often take the full eight digits needed, or whether they actually did; it’s extremely easy to fool yourself about how often events happen when there’s a complicated set of possible events. But stipulating that eight digits were often needed, then, why should they be needed? (For that matter, trusting the game not to be rigged, and United States televised game shows are by legend extremely sensitive to charges of rigging, how could they be needed?) Could I explain why this happened? And he asked again, enough times that I got curious myself.
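One baseline for “how often” is a contestant who calls digits in a completely random order. The Python sketch below enumerates that case exactly. It rests on two assumptions that should be read as assumptions: each digit from zero through nine appears exactly once among the ten concealed spots, and every calling order is equally likely. Real contestants do not guess at random, so this is only a reference point. The function name is made up.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def digits_needed_distribution():
    """Exact chance the game ends on the k-th digit called, assuming the
    ten digits are called in a uniformly random order."""
    counts = Counter()
    positions = list(range(10))              # call order, 0 = first digit called
    for car in combinations(positions, 4):   # when the car's four digits come up
        rest = [pos for pos in positions if pos not in car]
        for prize in combinations(rest, 3):  # when the prize's three come up
            piggy = [pos for pos in rest if pos not in prize]
            # A price is fully revealed when its last digit is called;
            # the game stops at the earliest such moment.
            finish = min(max(car), max(prize), max(piggy)) + 1
            counts[finish] += 1
    total = sum(counts.values())
    return {k: Fraction(n, total) for k, n in sorted(counts.items())}

dist = digits_needed_distribution()
for k, chance in dist.items():
    print(f"ends on digit {k}: {chance} (about {float(chance):.1%})")
```

Under those assumptions the game goes the full eight digits with chance 3/10, and ends on the third digit with chance 1/60.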
So here’s my homework problem: On the original WiiFit there were five activities for testing mental and physical agility, one of which I really disliked. Two of the five were chosen at random each day. On WiiFitPlus, there are two sets of five activities each, with one exercise drawn at random from each of the two disparate sets, each of which has a test I really dislike. Am I more likely under the WiiFit or under the WiiFitPlus routine to get a day with one of the tests I can’t stand? Here, my reasoning.
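The two routines can be compared by listing every possible day. This is a minimal Python sketch under assumptions the problem statement only implies: WiiFit picks two different tests out of five with every pair equally likely, and WiiFitPlus picks one test from each set independently and uniformly. Labeling the disliked test as activity 0 in each set is purely for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

# Hypothetical labels: activity 0 in each set is the disliked test.
wiifit_days = list(combinations(range(5), 2))  # two different tests out of five
p_wiifit = Fraction(sum(0 in day for day in wiifit_days), len(wiifit_days))

plus_days = list(product(range(5), range(5)))  # one test from each of two sets
p_plus = Fraction(sum(a == 0 or b == 0 for a, b in plus_days), len(plus_days))

print("WiiFit:    ", p_wiifit)
print("WiiFitPlus:", p_plus)
```

Under those assumptions the chances are 2/5 and 9/25, so the original routine serves up a disliked test slightly more often.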
I’m sorry to go another day without following up the essay I meant to follow up, but it’s been a frantically busy week on a frantically busy month and something has to give somewhere. But before I return the Symbolic Logic book to the library — Project Gutenberg has the first part of it, but the second is soundly in copyright, I would expect (its first publication in a recognizable form was in the 1970s) — I wanted to pick some more stuff out of the second part.