When Is Thanksgiving Most Likely To Happen?


I thought I had written this up before, and it turns out I had. Which is good, because I didn’t want to spend the energy redoing these calculations.

Thanksgiving, as observed in the United States, is the fourth Thursday of November. So it might happen anytime from the 22nd through the 28th. But because of the quirks of the Gregorian calendar, a particular date, like the 23rd of November, can be more or less likely to be a Thursday than other dates are.

So here are the results of which days are most and least likely to be Thanksgiving. It turns out the 23rd, this year’s candidate, is tied for the rarest of Thanksgiving days. It’s not that rare, in comparison. It happens only two fewer times every 400 years than do Thanksgivings on the 22nd of November, the (tied) most common day.
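If you want to check the tally yourself, here’s a minimal sketch in Python. It leans on the standard datetime module and on the fact that the Gregorian calendar repeats exactly every 400 years; it’s one convenient route to the numbers, not necessarily how the original calculations were done.

```python
from datetime import date
from collections import Counter

# The Gregorian calendar repeats exactly every 400 years, so one full
# cycle gives the true long-run frequency of each Thanksgiving date.
thanksgivings = Counter()
for year in range(2000, 2400):
    for day in range(22, 29):
        if date(year, 11, day).weekday() == 3:  # 3 means Thursday
            thanksgivings[day] += 1

for day, count in sorted(thanksgivings.items()):
    print(f"November {day}: Thanksgiving in {count} of 400 years")
```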


Reading the Comics, November 18, 2017: Story Problems and Equation Blackboards Edition


It was a normal-paced week at Comic Strip Master Command. It was also one of those weeks that didn’t have anything from Comics Kingdom or Creators.Com. So I’m afraid you’ll all just have to click the links for strips you want to actually see. Sorry.

Bill Amend’s FoxTrot for the 12th has Jason and Marcus creating “mathic novels”. They, being a couple of mathematically-gifted smart people, credit mathematics knowledge with smartness. A “chiliagon” is a thousand-sided regular polygon that’s mostly of philosophical interest. A regular polygon with a thousand equal sides and a thousand equal angles looks like a circle. There’s really no way to draw one so that the human eye could see the whole figure and tell it apart from a circle. But if you can understand the idea of a regular polygon it seems like you can imagine a chiliagon and see how that’s not a circle. So there are some really easy geometric things that can’t be visualized, or at least not truly visualized, and just have to be reasoned with.
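To put a number on how circle-like the chiliagon is, here’s a quick computation (my aside, not anything in the strip) of how far the perimeter of a regular chiliagon inscribed in a unit circle falls short of the circle’s circumference:

```python
import math

n = 1000  # number of sides
# Each side of a regular n-gon inscribed in a unit circle has length
# 2*sin(pi/n), so the whole perimeter is:
perimeter = n * 2 * math.sin(math.pi / n)
circumference = 2 * math.pi

print(circumference - perimeter)  # about 1e-5, far below what an eye can see
```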

Rick Detorie’s One Big Happy for the 12th is a story-problem-subversion joke. The joke’s good enough as it is, but the supposition of the problem is that the driving does cover fifty miles in an hour. This may not be the speed the car travels at the whole time of the problem. Mister Green is maybe speeding to make up for all the time spent travelling slower.

Brandon Sheffield and Dami Lee’s Hot Comics for Cool People for the 13th uses a blackboard full of equations to represent the deep thinking being done on a silly subject.

Shannon Wheeler’s Too Much Coffee Man for the 15th also uses a blackboard full of equations to represent the deep thinking being done on a less silly subject. It’s a really good-looking blackboard full of equations, by the way. Beyond the appearance of our old friend E = mc^2 there’s a lot of stuff that looks like legitimate quantum mechanics symbols there. They’re at least not obvious nonsense, as best I can tell without the ability to zoom the image in. I wonder if Wheeler didn’t find a textbook and use some problems from it for the feeling of authenticity.

Samson’s Dark Side of the Horse for the 16th is a story-problem subversion joke.

Jef Mallett’s Frazz for the 18th talks about making a bet on the World Series, which wrapped up a couple weeks ago. It raises the question: can you bet on an already known outcome? Well, sure, you can bet on anything you like, given a willing partner. But there does seem to be something fundamentally different between betting on something whose outcome isn’t in principle knowable, such as the winner of the next World Series, and betting on something that could be known but happens not to be, such as the winner of the last. We see this expressed in questions like “is it true the 13th of a month is more likely to be Friday than any other day of the week?” If you know which month and year is under discussion the chance the 13th is Friday is either 1 or 0. But we mean something more like, if we don’t know what month and year it is, what’s the chance this is a month with a Friday the 13th? Something like this is at work in this World Series bet. (The Astros won the recently completed World Series.)
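That Friday-the-13th question yields to the same sort of exhaustive counting as the Thanksgiving question above, by the way. A sketch, run over the Gregorian calendar’s full 400-year cycle:

```python
from datetime import date
from collections import Counter

# On which weekday does the 13th of a month land most often?
weekdays = Counter()
for year in range(2000, 2400):
    for month in range(1, 13):
        weekdays[date(year, month, 13).strftime("%A")] += 1

print(weekdays.most_common())
# Friday leads, at 688 of the 4800 months in a cycle.
```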

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 18th is also featured on some underemployed philosopher’s “Reading the Comics” WordPress blog and fair enough. Utilitarianism exists in an odd triple point, somewhere on the borders of ethics, economics, and mathematics. The idea that one could quantify the good or the utility or the happiness of society, and study how actions affect it, is a strong one. It fits very well the modern mindset that holds everything can be quantified even if we don’t know how to do it well just yet. And it appeals strongly to a mathematically-minded person since it sounds like pure reason. It’s not, of course, any more than any ethical scheme can be. But it sounds like the ethics a Vulcan would come up with and that appeals to a certain kind of person. (The comic is built on one of the implications of utilitarianism that makes it seem like the idea’s gone off the rails.)

There’s some mathematics symbols on The Utilitarian’s costume. The capital U on his face is probably too obvious to need explanation. The \sum u on his chest relies on some mathematical convention. For a couple of centuries now mathematicians have been using the capital sigma to mean “take a sum of things”. The things are whatever the expression after that symbol is. Usually, the sigma will have something below and above it which carries meaning: what the index for the thing after the symbol is, and what the bounds of the index are. Here, those aren’t set. This is common enough, though, when they’re understood from context, or when they’re obvious. The small ‘u’ to the right suggests the utility of whatever’s thought about. (“Utility” being the name for the thing measured and maximized; it might be happiness, it might be general well-being, it might be the number of people alive.) So the symbols would suggest “take the sum of all the relevant utilities”. Which is the calculation that would be done in this case.
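Written out in full, the convention looks something like this (the names here are my illustration, not anything in the comic):

\sum_{i = 1}^{N} u_i = u_1 + u_2 + \cdots + u_N

That would be the total utility of everyone affected, persons 1 through N. The Utilitarian’s \sum u just leaves the index and its bounds implicit: add up everybody’s utility.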

Reading the Comics, September 29, 2017: Anthropomorphic Mathematics Edition


The rest of last week had more mathematically-themed comic strips than Sunday alone did. As sometimes happens, I noticed an objectively unimportant detail in one of the comics and got to thinking about it. Whether I could solve the equation as posted, or whether at least part of it made sense as a mathematics problem. Well, you’ll see.

Patrick McDonnell’s Mutts for the 25th of September I include because it’s cute and I like when I can feature some comic in these roundups. Maybe there’s some discussion that could be had about what “equals” means in ordinary English versus what it means in mathematics. But I admit that’s a stretch.

Professor Earl's Math Class. (Earl is the dog.) 'One belly rub equals two pats on the head!'
Patrick McDonnell’s Mutts for the 25th of September, 2017. I should be interested in other people’s research on this. My love’s parents’ dogs are the ones I’ve had the most regular contact with the last few years, and the dogs have all been moderately to extremely alarmed by my doing suspicious things, such as existing or being near them or being away from them or reaching a hand to them or leaving a treat on the floor for them. I know this makes me sound worrisome, but my love’s parents are very good about taking care of dogs others would consider just too much trouble.

Olivia Walch’s Imogen Quest for the 25th uses, and describes, the mathematics of a famous probability problem. This is the surprising result of how few people you need to have a 50 percent chance that some pair of people have a birthday in common. It then goes over to some other probability problems. The examples are silly. But the reasoning is sound. And the approach is useful. To find the chance something happens it’s often easiest to work out the chance it doesn’t. Which is as good as knowing the chance it does, since a thing can either happen or not happen. At least in probability problems, which define “thing” and “happen” so there’s no ambiguity about whether it happened or not.
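Here’s a minimal sketch of that complement trick applied to the birthday problem, assuming 365 equally likely birthdays (which real birthdays only approximate):

```python
# Chance that at least two of n people share a birthday, found by
# computing the complementary chance that all n birthdays differ.
def shared_birthday_chance(n, days=365):
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (days - k) / days
    return 1.0 - p_all_distinct

# The chance first clears 50 percent at n = 23.
for n in (10, 20, 23, 30):
    print(n, round(shared_birthday_chance(n), 4))
```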

Piers Baker’s Ollie and Quentin rerun for the 26th I’m pretty sure I’ve written about before, although back before I included pictures of the Comics Kingdom strips. (The strip moved from Comics Kingdom over to GoComics, which I haven’t caught removing old comics from their pages.) Anyway, it plays on a core piece of probability. It sets out the world as things, “events”, that can have one of multiple outcomes, and which must have one of those outcomes. Coin tossing is taken to mean, by default, an event that has exactly two possible outcomes, each equally likely. And that is near enough true for real-world coin tossing. But there is a little gap between “near enough” and “true”.

Rick Stromoski’s Soup To Nutz for the 27th is your standard sort of Dumb Royboy joke, in this case about him not knowing what percentages are. You could do the same joke about fractions, including with the same breakdown of what part of the mathematics geek population ruins it for the remainder.

Nate Fakes’s Break of Day for the 28th is not quite the anthropomorphic-numerals joke for the week. Anthropomorphic mathematics problems, anyway. The intriguing thing is that the difficult, calculus, problem looks almost legitimate to me. On the right-hand side of the first two lines, for example, the calculation goes from

\int -8 e^{-\frac{\ln 3}{14} t}

to
-8 \cdot -\frac{14}{\ln 3} e^{-\frac{\ln 3}{14} t}

This is a little sloppy. The first line ought to end in a ‘dt’, and the second ought to have a constant of integration. If you don’t know what these calculus things are let me explain: they’re calculus things. You need to include them to express the work correctly. But if you’re just doing a quick check of something, the mathematical equivalent of a very rough preliminary sketch, it’s common enough to leave that out.
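Tidied up, the step would read:

\int -8 e^{-\frac{\ln 3}{14} t} \, dt = -8 \cdot \left(-\frac{14}{\ln 3}\right) e^{-\frac{\ln 3}{14} t} + C = \frac{112}{\ln 3} e^{-\frac{\ln 3}{14} t} + C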

It doesn’t quite parse or mean anything precisely as it is. But it looks like the sort of thing that some context would make meaningful. That there are repeated appearances of - \frac{\ln 3}{14} , or - \frac{14}{\ln 3} , particularly makes me wonder if Fakes used a problem he (or a friend) was doing for some reason.

Mark Anderson’s Andertoons for the 29th is a welcome reassurance that something like normality still exists. Something something student blackboard story problem something.

Anthony Blades’s Bewley rerun for the 29th depicts a parent once again too eager to help with arithmetic homework.

Maria Scrivan’s Half Full for the 29th gives me a proper anthropomorphic numerals panel for the week, and none too soon.

The Summer 2017 Mathematics A To Z: Sárközy’s Theorem


Gaurish, of For the love of Mathematics, gives me another chance to talk number theory today. Let’s see how that turns out.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Sárközy’s Theorem.

I have two pieces to assemble for this. One is in factors. We can take any counting number, a positive whole number, and write it as the product of prime numbers. 2038 is equal to the prime 2 times the prime 1019. 4312 is equal to 2 raised to the third power times 7 raised to the second times 11. 1040 is 2 to the fourth power times 5 times 13. 455 is 5 times 7 times 13.

There are many ways to divide up numbers like this. Here’s one. Is there a square number among its factors? 2038 and 455 don’t have any. They’re each a product of prime numbers that are never repeated. 1040 has a square among its factors. 2 times 2 divides into 1040. 4312, similarly, has a square: we can write it as 2 squared times 2 times 7 squared times 11. So that is my first piece. We can divide counting numbers into squarefree and not-squarefree.

The other piece is in binomial coefficients. These are numbers, often quite big numbers, that get dumped on the high school algebra student as she tries to work with some expression like (a + b)^n . They’re also dumped on the poor student in calculus, as something about Newton’s binomial coefficient theorem. Which we hear is something really important. In my experience it wasn’t explained why this should rank up there with, like, the differential calculus. (Spoiler: it’s because of polynomials.) But it’s got some great stuff to it.

Binomial coefficients are among those utility players in mathematics. They turn up in weird places. In dealing with polynomials, of course. They also turn up in combinatorics, and through that, probability. If you run, for example, 10 experiments each of which could succeed or fail, the chance you’ll get exactly five successes is going to be proportional to one of these binomial coefficients. That they touch on polynomials and probability is a sign we’re looking at a thing woven into the whole universe of mathematics. We saw them some in talking, last A-To-Z around, about Yang Hui’s Triangle. That’s also known as Pascal’s Triangle. It has more names too, since it’s been found many times over.

The theorem under discussion is about central binomial coefficients. These are one specific coefficient in each row: the ones that appear, in the triangle, along the line of symmetry. They’re easy to describe in formulas. For a whole number ‘n’ that’s greater than or equal to zero, evaluate what we call 2n choose n:

{{2n} \choose{n}} =  \frac{(2n)!}{(n!)^2}

If ‘n’ is zero, this number is \frac{0!}{(0!)^2} or 1. If ‘n’ is 1, this number is \frac{2!}{(1!)^2} or 2. If ‘n’ is 2, this number is \frac{4!}{(2!)^2} or 6. If ‘n’ is 3, this number is (sparing the formula) 20. The numbers keep growing: 70, 252, 924, 3432, 12870, and so on.

So. 1 and 2 and 6 are squarefree numbers. Not much arguing that. But 20? That’s 2 squared times 5. 70? 2 times 5 times 7. 252? 2 squared times 3 squared times 7. 924? That’s 2 squared times 3 times 7 times 11. 3432? 2 cubed times 3 times 11 times 13; there’s a 2 squared in there. 12870? 2 times 3 squared times it doesn’t matter anymore. It’s not a squarefree number.

There’s a bunch of not-squarefree numbers in there. The question: do we ever stop seeing squarefree numbers here?
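Here’s a minimal sketch of that experiment in Python. It tests squarefreeness by trial division, which is fine for numbers this small, though hopeless at the scales discussed below:

```python
from math import comb, isqrt

def is_squarefree(n):
    # A number is squarefree when no perfect square bigger than 1
    # divides it, which is to say no prime appears twice as a factor.
    return all(n % (p * p) != 0 for p in range(2, isqrt(n) + 1))

for n in range(15):
    c = comb(2 * n, n)
    print(n, c, "squarefree" if is_squarefree(c) else "not squarefree")
# Only n = 0, 1, 2, and 4 (coefficients 1, 2, 6, and 70) pass.
```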

So here’s Sárközy’s Theorem. It says that this central binomial coefficient {{2n} \choose{n}} is never squarefree as long as ‘n’ is big enough. András Sárközy showed in 1985 that this was true. How big is big enough? … We have a bound, at least, for this theorem. If ‘n’ is larger than the number 2^{8000} then the corresponding coefficient can’t be squarefree. It might not surprise you that the formulas involved here feature the Riemann Zeta function. That always seems to turn up for questions about large prime numbers.

That’s a common state of affairs for number theory problems. Very often we can show that something is true for big enough numbers. I’m not sure there’s a clear reason why. When numbers get large enough it can be more convenient to deal with their logarithms, I suppose. And those look more like the real numbers than the integers. And real numbers are typically easier to prove stuff about. Maybe that’s it. This is vague, yes. But to ask ‘why’ some things are easy and some are hard to prove is a hard question. What is a satisfying ’cause’ here?

It’s tempting to say that since we know this is true for all ‘n’ above a bound, we’re done. We can just test all the numbers below that bound, and the rest is done. You can do a satisfying proof this way: show that eventually the statement is true, and show all the special little cases before it is. This particular result is kind of useless, though. 2^{8000} is a number that’s something like 2,400 digits long. For comparison, the total number of things in the universe is something like a number about 80 digits long. Certainly not more than 90. It’d take too long to test all those cases.

That’s all right. Since Sárközy’s proof in 1985 there’ve been other breakthroughs. In 1988 P Goetgheluck proved it was true for a big range of numbers: every ‘n’ that’s larger than 4 and less than 2^{42,205,184} . That’s a number something more than 12 million digits long. In 1991 I Vardi proved we had no squarefree central binomial coefficients for ‘n’ greater than 4 and less than 2^{774,840,978} , which is a number about 233 million digits long. And then in 1996 Andrew Granville and Olivier Ramaré showed directly that this was so for all ‘n’ larger than 4.

So that 70 that turned up just a few lines in is the last squarefree one of these coefficients.

Is this surprising? Maybe, maybe not. I’ll bet most of you didn’t have an opinion on this topic twenty minutes ago. Let me share something that did surprise me, and continues to surprise me. In 1974 David Singmaster proved that any integer divides almost all the binomial coefficients out there. “Almost all” is here a term of art, but it means just about what you’d expect. Imagine the giant list of all the numbers that can be binomial coefficients. Then pick any positive integer you like. The number you picked will divide into so many of the giant list that the exceptions won’t be noticeable. So that square numbers like 4 and 9 and 16 and 25 should divide into most binomial coefficients? … That’s to be expected, suddenly. Into the central binomial coefficients? That’s not so obvious to me. But then so much of number theory is strange and surprising and not so obvious.

The Summer 2017 Mathematics A To Z: Quasirandom numbers


Gaurish, host of For the love of Mathematics, gives me the excuse to talk about amusement parks. You may want to brace yourself. Yes, this essay includes a picture. It would have included a video if I had enough WordPress privileges for that.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Quasirandom numbers.

Think of a merry-go-round. Or carousel, if you prefer. I will venture a guess. You might like merry-go-rounds. They’re beautiful. They can evoke happy thoughts of childhood when they were a big ride it was safe to go on. But they don’t often make one think of thrills. They’re generally sedate things. They don’t need to be. There’s no great secret to making a carousel a thrill ride. They knew it a century ago, when all the great American carousels were carved. It’s simple. Make the thing spin fast enough, at the five or six rotations per minute the ride was made for. There are places that do this yet. There’s the Cedar Downs ride at Cedar Point, Sandusky, Ohio. There’s the antique carousel at Crossroads Village, a historical village/park just outside Flint, Michigan. There’s the Derby Racer at Playland in Rye, New York. There’s the carousel in the Merry-Go-Round Museum in Sandusky, Ohio. Any of them are great rides. Two of them have a special edge. I’ll come back to them.

Playland's Derby Racer in motion, at night, featuring a ride operator leaning maybe twenty degrees inward.
Rye (New York) Playland Amusement Park’s Derby Racer is the fastest carousel I’m aware of running. Riders are warned ahead of time to sit so they’re leaning to the left, and the ride will not get up to full speed until the ride operator checks everyone during the ride. To get some idea of its speed, notice the ride operator on the left and how far she leans. She’s not being dramatic; that’s the natural stance. Also the tilt in the carousel’s floor is not camera trickery; it does lean like that. If you have a spare day in the New York City area and any interest in classic amusement parks, this is worth the trip.

Randomness is a valuable resource. We know it’s key to many things. We have major fields of mathematics built on it. We can understand the behavior of variables without ever knowing what value they have. All we need to know is the chance they might be in some particular range. This makes possible all kinds of problems too complicated to do otherwise. We know it’s critical. Quantum mechanics would not work without randomness. Without quantum mechanics, matter doesn’t work. And that’s true randomness, the kind where something is unpredictable. It’s not the kind of randomness we talk about when we ask, say, what’s the chance someone was born on a Tuesday. That’s mere hidden information: if we knew the month and date and year of a person’s birth we would know whether they were born Tuesday or not. We need more.

So the trouble is actually getting a random number. Well, a sequence of randomly drawn numbers. We rarely need this if we’re doing analysis. We can understand how some process changes the shape of a distribution without ever using the distribution. We can take derivatives of a function without ever evaluating the original function, after all.

But we do need randomly drawn numbers. We do too much numerical work with them. For example, it’s impossible to exactly integrate most functions. Numerical methods can take a ferociously long time to evaluate. A family of methods called Monte Carlo rely on randomly-drawn values to estimate the integral. The results are strikingly good for the work required. But they must have random numbers. The name “Monte Carlo” is not some cryptic code. It is an expression of how randomly drawn numbers make the tool work.
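A minimal sketch of the idea, on an integral easy to check against the exact answer:

```python
import random

# Monte Carlo estimate of the integral of x^2 over [0, 1].
# The exact value is 1/3; the estimate wobbles around it.
def monte_carlo(f, samples=100_000):
    return sum(f(random.random()) for _ in range(samples)) / samples

print(monte_carlo(lambda x: x * x))  # about 0.333, varying run to run
```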

It’s hard to get random numbers. Consider: we can’t write an algorithm to do it. If we were to write one, then we’d be able to predict what the sequence of numbers was. We have some recourse. We could set up instruments to rely on the randomness that seems to be in the world. Thermal fluctuations, for example, created by processes outside any computer’s control, can give us a pleasant dose of randomness. If we need higher-quality random numbers than that we can go to exotic equipment. Geiger counters watching the decay of a not-alarmingly-radioactive sample. Cosmic ray detectors watching the sky.

Or we can write something that produces numbers that look random enough. They won’t really be random, and if we wait long enough we’ll notice the sequence repeats itself. But if we only need, say, ten numbers, who cares if the sequence will repeat after ten million numbers? (We’ll surely need more than ten numbers. But we can postpone the repetition until we’ve drawn far more than ten million numbers.)
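The classic way to write such a thing is a linear congruential generator: each number is a fixed arithmetic function of the one before, so the sequence is fully determined and must eventually cycle. A minimal sketch, using the well-known “minimal standard” constants (my choice of example, nothing specific to this essay):

```python
def lcg(seed, a=16807, modulus=2**31 - 1):
    # Park-Miller "minimal standard" generator. Deterministic, and the
    # sequence repeats after at most modulus - 1 draws.
    x = seed
    while True:
        x = (a * x) % modulus
        yield x / modulus  # scaled into (0, 1)

gen = lcg(seed=12345)
print([round(next(gen), 4) for _ in range(5)])
```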

Two of the carousels I’ve mentioned have an astounding property. The horses in a file move. I mean, relative to each other. Some horse will start the race in front of its neighbors; some will start behind. The four horses in a file move forward and back thanks to a mechanism of, I am assured, staggering complexity. There are only three carousels in the world that have it. There’s Cedar Downs at Cedar Point in Sandusky, Ohio; the Racing Downs at Playland in Rye, New York; and the Derby Racer at Blackpool Pleasure Beach in Blackpool, England. The mechanism in Blackpool’s hasn’t operated in years. The one at Playland had not run in years, but was restored for the 2017 season. My love and I made a trip specifically to ride that. (You may have heard of a fire at the carousel in Playland this summer. This was part of the building for their other, non-racing, antique carousel. My last information was that the carousel itself was all right.)

These racing derbies have the horses in a file move forward and back in a “random” way. It’s not truly random. If you knew exactly which gears were underneath each horse, and where in their rotations they were, you could say which horse was about to gain on its partners and which was about to fall back. But all that is concealed from the rider. The horse patterns will eventually, someday, repeat. If the gear cycles aren’t interrupted by maintenance or malfunctions. But nobody’s going to ride any horse long enough to notice. We have in these rides a randomness as good as what your computer makes, at least for the purpose it serves.

Cedar Point's Cedar Downs during the race, showing the blur of the ride's motion.
The racing nature of Playland’s and Cedar Point’s derby racers means that every ride includes exciting extra moments of overtaking or falling behind your partners to the side. It also means quarreling with your siblings about who really won the race because your horse started like four feet behind your sister’s and it ended only two feet behind so hers didn’t beat yours and, long story short, there was some punching, there was some spitting, and now nobody is gonna be allowed to get ice cream at the Carvel’s (for Playland) or cheese on a stick (for Cedar Point). The photo is of the Cedar Downs ride at Cedar Point, and focuses on the poles that move the horses.

What does it mean to look random? Some things seem obvious. All the possible numbers ought to come up, sooner or later. Any particular possible number shouldn’t repeat too often. Any particular possible number shouldn’t go too long without repeating. There shouldn’t be clumps of numbers; if, say, ‘4’ turns up, we shouldn’t see ‘5’ turn up right away all the time.

We can make the idea of “looking” random quite literal. Suppose we’re selecting numbers from 0 through 9. We can draw the random numbers we’ve picked. Use the numbers as coordinates. Say we pick four digits: 1, 3, 9, and 0. Then draw the point that’s at x-coordinate 13, y-coordinate 90. Then the next four digits. Let’s say they’re 4, 2, 3, and 8. Then draw the point that’s at x-coordinate 42, y-coordinate 38. And repeat. What will this look like?
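A sketch of that construction, leaving the actual scatter plot to whatever tool you have handy:

```python
import random

# Turn a stream of random decimal digits into (x, y) points
# on a 100-by-100 grid, four digits per point.
digits = [random.randrange(10) for _ in range(4000)]
points = [
    (10 * digits[i] + digits[i + 1], 10 * digits[i + 2] + digits[i + 3])
    for i in range(0, len(digits), 4)
]
print(points[:3])  # plot all of them and look for clumps and gaps
```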

If it clumps up, we probably don’t have good random numbers. If we see lines that points collect along, or avoid, there’s a good chance our numbers aren’t very random. If there’s whole blocks of space that they occupy, and others they avoid, we may have a defective source of random numbers. We should expect the points to cover a space pretty uniformly. (There are more rigorous, logically sound, methods. The eye can be fooled easily enough. But it’s the same principle. We have some test that notices clumps and gaps.) But …

The thing is, there’s always going to be some clumps. There’ll always be some gaps. Part of randomness is that it forms patterns, or at least things that look like patterns to us. We can describe how big a clump (or gap; it’s the same thing, really) is for any particular quantity of randomly drawn numbers. If we see clumps bigger than that we can throw out the numbers as suspect. But … still …

Toss a coin fairly twenty times, and there’s no reason it can’t turn up tails sixteen times. This doesn’t happen often, but it will happen sometimes. Just luck. This surplus of tails should evaporate as we take more tosses. That is, we most likely won’t see 160 tails out of 200 tosses. We certainly will not see 1,600 tails out of 2,000 tosses. We know this as the Law of Large Numbers. Wait long enough and weird fluctuations will average out.

What if we don’t have time, though? For coin-tossing that’s silly; of course we have time. But for Monte Carlo integration? It could take too long to be confident we haven’t got too-large gaps or too-tight clusters.

This is why we take quasi-random numbers. We begin with what randomness we’re able to manage. But we massage it. Imagine our coins example. Suppose after ten fair tosses we noticed there had been eight tails turn up. Then we would start tossing less fairly, trying to make heads more common. We would be happier if there were 12 rather than 16 tails after twenty tosses.

Draw the results. We get now a pattern that looks still like randomness. But it’s a finer sorting; it looks like static tidied up some. The quasi-random numbers are not properly random. Knowing that, say, the last several numbers were odd means the next one is more likely to be even, the Gambler’s Fallacy put to work. But in aggregate, we trust, we’ll be able to enjoy the speed and power of randomly-drawn numbers. It shows its strengths when we don’t know just how finely we must sample a range of numbers to get good, reliable results.
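Low-discrepancy sequences, the usual formal name for quasirandom numbers, do this massaging systematically. The classic example is the van der Corput sequence, which is short enough to sketch here (a standard construction, not anything particular to this essay):

```python
def van_der_corput(n, base=2):
    # Reflect the base-2 digits of n about the "decimal" point:
    # 1, 2, 3, 4, ... become 1/2, 1/4, 3/4, 1/8, ... and fill the
    # interval [0, 1) about as evenly as possible at every stage.
    value, denominator = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denominator *= base
        value += digit / denominator
    return value

print([van_der_corput(n) for n in range(1, 9)])
# [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625]
```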

To carousels. I don’t know whether the derby racers have quasirandom outcomes. I would find believable someone telling me that all the possible orderings of the four horses in any file are equally likely. To know would demand detailed knowledge of how the gearing works, though. Also probably simulations of how the system would work if it ran long enough. It might be easier to watch the ride for a couple of days and keep track of the outcomes. If someone wants to sponsor me doing a month-long research expedition to Cedar Point, drop me a note. Or just pay for my season pass. You folks would do that for me, wouldn’t you? Thanks.

The Summer 2017 Mathematics A To Z: Benford's Law


Today’s entry in the Summer 2017 Mathematics A To Z is one for myself. I couldn’t post this any later.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Benford’s Law.

My car’s odometer first read 9 on my final test drive before buying it, in June of 2009. It flipped over to 10 barely a minute after that, somewhere near Jersey Freeze ice cream parlor at what used to be the Freehold Traffic Circle. Ask a Central New Jersey person of sufficient vintage about that place. Its odometer read 90 miles sometime that weekend, I think while I was driving to The Book Garden on Route 537. Ask a Central New Jersey person of sufficient reading habits about that place. It’s still there. It flipped over to 100 sometime when I was driving back later that day.

The odometer read 900 about two months after that, probably while I was driving to work, as I had a longer commute in those days. It flipped over to 1000 a couple days after that. The odometer first read 9,000 miles sometime in spring of 2010 and I don’t remember what I was driving to for that. It flipped over from 9,999 to 10,000 miles several weeks later, as I pulled into the car dealership for its scheduled servicing. Yes, this kind of impressed the dealer that I got there exactly on the round number.

The odometer first read 90,000 in late August of last year, as I was driving to some competitive pinball event in western Michigan. It’s scheduled to flip over to 100,000 miles sometime this week as I get to the dealer for its scheduled maintenance. While cars have gotten to be much more reliable and durable than they used to be, the odometer will never flip over to 900,000 miles. At least I can’t imagine owning it long enough, at my rate of driving the past eight years, that this would ever happen. It’s hard to imagine living long enough for the car to reach 900,000 miles. Thursday or Friday it should flip over to 100,000 miles. The leading digit on the odometer will be 1 or, possibly, 2 for the rest of my association with it.

The point of this little autobiography is this observation. Imagine all the days that I have owned this car, from sometime in June 2009 to whatever day I sell, lose, or replace it. Pick one. What is the leading digit of my odometer on that day? It could be anything from 1 to 9. But it’s more likely to be 1 than to be 9. Right now it’s as likely to be any of the digits. But after this week the chance of ‘1’ being the leading digit will rise, and become quite a bit more likely than ‘9’. And it’ll never lose that edge.

This is a reflection of Benford’s Law. It is named, as most mathematical things are, imperfectly. The law-namer was Frank Benford, a physicist, who in 1938 published a paper The Law Of Anomalous Numbers. It confirmed the observation of Simon Newcomb. Newcomb was a 19th century astronomer and mathematician of an exhausting number of observations and developments. Newcomb noticed something about the logarithm tables that anyone who needed to compute referred to often. The earlier pages were more worn-out and dirty and damaged than the later pages. People worked with numbers that start with ‘1’ more than they did numbers starting with ‘2’. And more those that start ‘2’ than start ‘3’. More that start with ‘3’ than start with ‘4’. And on. Benford showed this was not some fluke of calculations. It turned up in bizarre collections of data. The surface areas of rivers. The populations of thousands of United States municipalities. Molecular weights. The digits that turned up in an issue of Reader’s Digest. There is a bias in the world toward numbers that start with ‘1’.

And this is, prima facie, crazy. How can the surface areas of rivers somehow prefer to be, say, 100-199 hectares instead of 500-599 hectares? A hundred is a human construct. (Indeed, it’s many human constructs.) That we think ten is an interesting number is an artefact of our society. To think that 100 is a nice round number and that, say, 81 or 144 are not is a cultural choice. Grant that the digits of street addresses of people listed in American Men of Science — one of Benford’s data sources — have some cultural bias. How can another of his sources, molecular weights, possibly?

The bias sneaks in subtly. Don’t they all? It lurks at the edge of the table of data. The table header, perhaps, where it says “River Name” and “Surface Area (sq km)”. Or at the bottom where it says “Length (miles)”. Or it’s never explicit, because I take for granted people know my car’s mileage is measured in miles.

What would be different in my introduction if my car were Canadian, and the odometer measured kilometers instead? … Well, I’d not have driven the 9th kilometer; someone else doing a test-drive would have. The 90th through 99th kilometers would have come a little earlier that first weekend. The 900th through 999th kilometers too. I would have passed the 99,999th kilometer years ago. In kilometers my car has been in the 100,000s for something like four years now. It’s less absurd that it could reach the 900,000th kilometer in my lifetime, but that still won’t happen.

What would be different is the precise dates about when my car reached its milestones, and the amount of days it spent in the 1’s and the 2’s and the 3’s and so on. But the proportions? What fraction of its days it spends with a 1 as the leading digit versus a 2 or a 5? … Well, that’s changed a little bit. There is some final mile, or kilometer, my car will ever register and it makes a little difference whether that’s 239,000 or 385,000. But it’s only a little difference. It’s the difference in how many times a tossed coin comes up heads on the first 1,000 flips versus the second 1,000 flips. They’ll be different numbers, but not that different.

What’s the difference between a mile and a kilometer? A mile is longer than a kilometer, but that’s it. They measure the same kinds of things. You can convert a measurement in miles to one in kilometers by multiplying by a constant. We could as well measure my car’s odometer in meters, or inches, or parsecs, or lengths of football fields. The difference is what number we multiply the original measurement by. We call this “scaling”.

Whatever we measure, in whatever unit we measure, has to have a leading digit of something. So it’s got to have some chance of starting out with a ‘1’, some chance of starting out with a ‘2’, some chance of starting out with a ‘3’, and so on. But that chance can’t depend on the scale. Measuring something in smaller or larger units doesn’t change the proportion of how often each leading digit is there.

These facts combine to imply that leading digits follow a logarithmic-scale law. The leading digit should be a ‘1’ something like 30 percent of the time. And a ‘2’ about 18 percent of the time. A ‘3’ about one-eighth of the time. And it decreases from there. ‘9’ gets to take the lead a meager 4.6 percent of the time.
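As a formula, the chance that the leading digit is d comes out to

P(d) = \log_{10}\left( 1 + \frac{1}{d} \right), \qquad d = 1, 2, \ldots, 9

Put d equal to 1 and you get the logarithm base ten of 2, the 30 percent figure; d equal to 9 gives the meager 4.6 percent.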

Roughly. It’s not going to be so all the time. Measure the heights of humans in meters and there’ll be far more leading digits of ‘1’ than we should expect, as most people are between 1 and 2 meters tall. Measure them in feet and ‘5’ and ‘6’ take a great lead. The law works best when data can sprawl over many orders of magnitude. If we lived in a world where people could as easily be two inches as two hundred feet tall, Benford’s Law would make more accurate predictions about their heights. That something is a mathematical truth does not mean it’s independent of all reason.

For example, the reader thinking back some may be wondering: granted that atomic weights and river areas and populations carry units with them that create this distribution, how do street addresses, one of Benford’s observed sources, carry any unit? Well, street addresses are, at least in the United States custom, a loose measure of distance. The 100 block (for example) of a street is within one … block … from whatever the more important street or river crossing that street is. The 900 block is farther away.

This extends further. Block numbers are proxies for distance from the major cross feature. House numbers on the block are proxies for distance from the start of the block. We have a better chance to see street number 418 than 1418, to see 418 than 488, or to see 418 than to see 1488. We can look at Benford’s Law in the second and third and other minor digits of numbers. But we have to be more cautious. There is more room for variation and quirk events. A block-filling building in the downtown area can take whatever street number the owners think most auspicious. Smaller samples of anything are less predictable.

Nevertheless, Benford’s Law has become famous to forensic accountants the past several decades, if we allow the use of the word “famous” in this context. But its fame is thanks to the economist Hal Varian and the accountancy scholar Mark Nigrini. They observed that real-world financial data should be expected to follow this same distribution. If they don’t, then there might be something suspicious going on. This is not an ironclad rule. There might be good reasons for the discrepancy. If your work trips are always to the same location, and always for one week, and there’s one hotel it makes sense to stay at, and you always learn you’ll need to make the trips about one month ahead of time, of course the hotel bill will be roughly the same. Benford’s Law is a simple, rough tool, a way to decide what data to scrutinize for mischief. With this in mind I trust none of my readers will make the obvious leading-digit mistake when padding their expense accounts anymore.

Since I’ve done you that favor, anyone out there think they can pick me up at the dealer’s Thursday, maybe Friday? Thanks in advance.

Reading the Comics, April 29, 2017: The Other Half Of The Week Edition


I’d been splitting Reading the Comics posts between Sunday and Thursday to better space them out. But I’ve got something prepared that I want to post Thursday, so I’ll bump this up. Also I had it ready to go anyway, so I don’t gain anything putting it off another two days.

Bill Amend’s FoxTrot Classics for the 27th reruns the strip for the 4th of May, 2006. It’s another probability problem, in its way. Assume Jason is honest in reporting whether Paige has picked his number correctly. Assume that Jason picked a whole number. (This is, I think, the weakest assumption. I know Jason Fox’s type and he’s just the sort who’d pick an obscure transcendental number. They’re all obscure after π and e.) Assume that Jason is equally likely to pick any of the whole numbers from 1 to one billion. Then, knowing nothing about what numbers Jason is likely to pick, Paige would have one chance in a billion of picking his number too. Might as well call it certainty that she’ll pay her dollar to play the game and get nothing back. How much would she have to get, in case of getting the number right, to come out even or ahead? … And now we know why Paige is still getting help on probability problems in the 2017 strips.
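To spell out that break-even calculation: if winning pays W dollars, happens with probability one in a billion, and costs a dollar to try, then Paige comes out even or ahead, on average, only when

\frac{1}{10^9} W - 1 \geq 0 \quad \Longrightarrow \quad W \geq 10^9

She’d need Jason to stake at least a billion dollars for the game to be fair.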

Jeff Stahler’s Moderately Confused for the 27th gives me a bit of a break by just being a snarky word problem joke. The student doesn’t even have to resist it any.

The Venn Diagram of Maintenance. 12 days after cut and color, color still fresh, bluntness of cut relaxed. Same-day mani-pedi, no chips in polish. Ten days after eyebrow tint, faded to look normal. After two weeks of religiously following salt-free diet, bloating at minimum. One day after gym workout, fresh-faced vitality from exercise. The intersection the one perfect day where it all comes together.
Sandra Bell-Lundy’s Between Friends for the 29th of April, 2017. And while it’s not a Venn Diagram I’m not sure of a better way to visually represent what the cartoonist is going for. I suppose the intended meaning comes across cleanly enough and that’s the most important thing. It’s a strange state of affairs is all.

Sandra Bell-Lundy’s Between Friends for the 29th also gives me a bit of a break by just being a Venn Diagram-based joke. At least it’s using the shape of a Venn Diagram to deliver the joke. It’s not really got the right content.

Harley Schwadron’s 9 to 5 for the 29th is this week’s joke about arithmetic versus propaganda. It’s a joke we’re never really going to be without again.

Reading the Comics, April 24, 2017: Reruns Edition


I went a little wild explaining the first of last week’s mathematically-themed comic strips. So let me split the week between the strips that I know to have been reruns and the ones I’m not so sure were.

Bill Amend’s FoxTrot for the 23rd — not a rerun; the strip is still new on Sundays — is a probability question. And a joke about story problems with relevance. Anyway, the question uses the binomial distribution. I know that because the question is about doing a bunch of things, homework questions, each of which can turn out one of two ways, right or wrong. It’s supposed to be equally likely to get the question right or wrong. It’s a little tedious but not hard to work out the chance of getting exactly six problems right, or exactly seven, or exactly eight, or so on. To work out the chance of getting six or more questions right — the problem given — there’s two ways to go about it.

One is the conceptually easy but tedious way. Work out the chance of getting exactly six questions right. Work out the chance of getting exactly seven questions right. Exactly eight questions. Exactly nine. All ten. Add these chances up. You’ll get to a number slightly below 0.377. That is, Mary Lou would have just under a 37.7 percent chance of passing. The answer’s right and it’s easy to understand how it’s right. The only drawback is it’s a lot of calculating to get there.

So here’s the conceptually harder but faster way. It works because the problem says Mary Lou is as likely to get a problem wrong as right. So she’s as likely to get exactly ten questions right as exactly ten wrong. And as likely to get at least nine questions right as at least nine wrong. To get at least eight questions right as at least eight wrong. You see where this is going: she’s as likely to get at least six right as to get at least six wrong.

There’s exactly three possibilities for a ten-question assignment like this. She can get four or fewer questions right (six or more wrong). She can get exactly five questions right. She can get six or more questions right. The chance of the first case and the chance of the last have to be the same.

So, take 1 — the chance that one of the three possibilities will happen — and subtract the chance she gets exactly five problems right, which is a touch over 24.6 percent. So there’s just under a 75.4 percent chance she does not get exactly five questions right. It’s equally likely to be four or fewer, or six or more. Just-under-75.4 divided by two is just under 37.7 percent, which is the chance she’ll pass as the problem’s given. It’s trickier to see why that’s right, but it’s a lot less calculating to do. That’s a common trade-off.
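Both routes are easy to check with a few lines of Python, since math.comb does the counting of ways:

```python
from math import comb

# Tedious way: add up the chances of exactly 6, 7, 8, 9, or 10 right
# out of 10 questions, each with a 1/2 chance of being right.
tedious = sum(comb(10, k) for k in range(6, 11)) / 2**10

# Faster way: remove the chance of exactly 5 right, then halve the rest.
faster = (1 - comb(10, 5) / 2**10) / 2

print(tedious, faster)  # both print 0.376953125, just under 37.7 percent
```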

Ruben Bolling’s Super-Fun-Pax Comix rerun for the 23rd is an aptly titled installment of A Million Monkeys At A Million Typewriters. It reminds me that I don’t remember if I’d retired the monkeys-at-typewriters motif from Reading the Comics collections. If I haven’t I probably should, at least after making a proper essay explaining what the monkeys-at-typewriters thing is all about.

'This new math teacher keeps shakin' us down every morning, man ... what's she looking for, anyway?' 'Pocket calculators.'
Ted Shearer’s Quincy from the 28th of February, 1978. So, that FoxTrot problem I did? The conceptually-easy-but-tedious way is not too hard to do if you have a calculator. It’s a bunch of typing but nothing more. If you don’t have a calculator, though, the desire not to do a whole bunch of calculating could drive you to the conceptually-harder-but-less-work answer. Is that a good thing? I suppose; insight is a good thing to bring. But the less-work answer only works because of a quirk in the problem, that Mary Lou is supposed to have a 50 percent chance of getting a question right. The low-insight-but-tedious approach will always work. Why pass up having something to do the tedious part?

Ted Shearer’s Quincy from the 28th of February, 1978 reveals to me that pocket calculators were a thing much earlier than I realized. Well, I was too young to be allowed near stuff like that in 1978. I don’t think my parents got their first credit-card-sized, solar-powered calculator that kind of worked for another couple years after that. Kids, ask about them. They looked like good ideas, but you could use them for maybe five minutes before the things came apart. Your cell phone is so much better.

Bill Watterson’s Calvin and Hobbes rerun for the 24th can be classed as a resisting-the-word-problem joke. It’s so not about that, but who am I to slow you down from reading a Calvin and Hobbes story?

Garry Trudeau’s Doonesbury rerun for the 24th started a story about high school kids and their bad geography skills. I rate it as qualifying for inclusion here because it’s a mathematics teacher deciding to include more geography in his course. I was amused by the week’s jokes anyway. There’s no hint given what mathematics Gil teaches, but given the links between geometry, navigation, and geography there is surely something that could be relevant. It might not help with geographic points like which states are in New England and where they are, though.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 24th is built on a plot point from Carl Sagan’s science fiction novel Contact. In it, a particular “message” is found in the digits of π. (By “message” I mean a string of digits that are interesting to us. I’m not sure that you can properly call something a message if it hasn’t got any sender and if there’s not obviously some intended receiver.) In the book this is an astounding thing because the message can’t be; any reasonable explanation for how it should be there is impossible. But short “messages” are going to turn up in π also, as per the comic strips.

I assume the peer review would correct the cartoon mathematicians’ unfortunate spelling of understanding.

What Is The Most Probable Date For Easter? What Is The Least?


If I’d started pondering the question a week earlier I’d have a nice timely post. Too bad. Shouldn’t wait nearly a year to use this one, though.

My love and I got talking about early and late Easters. We know that we’re all but certainly not going to be alive to see the earliest possible Easter, at least not unless the rule for setting the date of Easter changes. Easter can be as early as the 22nd of March or as late as the 25th of April. Nobody presently alive has seen a 22nd of March Easter; the last one was in 1818. Nobody presently alive will; the next will be 2285. The last time Easter was its latest date was 1943; the next time will be 2038. I know people who’ve seen the one in 1943 and hope to make it at least through 2038.

But that invites the question: what dates are most likely to be Easter? What ones are least? In a sense the question is nonsense. The rules establishing Easter and the Gregorian calendar are known. To speak of the “chance” of a particular day being Easter is like asking the probability that Grover Cleveland was president of the United States in 1894. Technically there’s a probability distribution there. But it’s different in some way from asking the chance of rolling at least a nine on a pair of dice.

But as with the question about what day is most likely to be Thanksgiving we can make the question sensible. We have to take the question to mean “given a month and day, and no information about what year it is, what is the chance that this is Easter?” (I’m still not quite happy with that formulation. I’d be open to a more careful phrasing, if someone’s got one.)

When we’ve got that, though, we can tackle the problem. We could do as I did for working out what days are most likely to be Thanksgiving. Run through all the possible configurations of the calendar, tally how often each of the days in the range is Easter, and see what comes up most often. There’s a hassle here. Working out the date of Easter follows a rule, yes. The rule is that it’s the first Sunday after the first full moon after the spring equinox. There are wrinkles, mostly because the Moon is complicated. A notional Moon that’s a little more predictable gets used instead. There are algorithms you can use to work out when Easter is. They all look like some kind of trick being used to put something over on you. No matter. They seem to work, as far as we know. I found some Matlab code that uses the Easter-computing routine that Carl Friedrich Gauss developed and that’ll do.
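I don’t have that Matlab code to share here, but for the curious, here’s a sketch of the same computation in Python. It uses the anonymous Gregorian algorithm (sometimes called the Meeus/Jones/Butcher algorithm), a descendant of Gauss’s routine that’s easier to state without the corrections Gauss’s original needs:

```python
from collections import Counter

def easter(year):
    # Anonymous Gregorian (Meeus/Jones/Butcher) Easter algorithm.
    # Returns (month, day) for the given Gregorian year.
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return month, day + 1

print(easter(2017))  # (4, 16): the 16th of April

tally = Counter(easter(year) for year in range(2000, 5001))
print(tally.most_common(3))  # (4, 15) leads, as in the table below
```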

Problem. The Moon and the Earth follow cycles around the sun, yes. Wait long enough and the positions of the Earth and Moon and Sun relative to one another repeat. This takes 532 years and is known as the Paschal Cycle. In the Julian calendar Easter this year is the same date it was in the year 1485, and the same it will be in 2549. It’s no particular problem to set a computer program to run a calculation, even a tedious one, 532 times. But it’s not meaningful like that either.

The problem is the Julian calendar repeats itself every 28 years, which fits nicely with the Paschal Cycle. The Gregorian calendar, with different rules about how to handle century years like 1900 and 2100, repeats itself only every 400 years. So it takes much longer to complete the cycle and get Earth, Moon, and calendar date back to the same position. To fully account for all the related cycles would take 5,700,000 years, estimates Duncan Steel in Marking Time: The Epic Quest To Invent The Perfect Calendar.

Write code to calculate Easter on a range of years and you can do that, of course. It’s no harder to calculate the dates of Easter for six million years than it is for six hundred years. It just takes longer to finish. The problem is that it is meaningless to do so. Over the course of a mere(!) 26,000 years the precession of the Earth’s axis will change the times of the seasons completely. If we still use the Gregorian calendar there will be a time that late September is the start of the Northern Hemisphere’s spring, and another time that early February is the heart of the Canadian summer. Within five thousand years we will have to change the calendar, change the rule for computing Easter, or change the idea of it as happening in Europe’s early spring. To calculate a date for Easter of the year 5,002,017 is to waste energy.

We probably don’t need it anyway, though. The differences between any blocks of 532 years are, I’m going to guess, minor things. I would be surprised if the frequency of any date’s appearance changed more than a quarter of a percent. That might scramble the rankings of dates if we have several nearly-as-common dates, but it won’t be much.

So let me do that. Here’s a table of how often each particular calendar date appears as Easter from the years 2000 to 5000, inclusive. And I don’t believe that by the year we would call 5000 we’ll still have the same calendar and Easter and expectations of Easter all together, so I’m comfortable overlooking that. Indeed, I expect we’ll have some different calendar or Easter or expectation of Easter by the year 4985 at the latest.

For this enormous date range, though, here’s the frequency of Easters on each possible date:

Date Number Of Occurrences, 2000 – 5000 Probability Of Occurrence
22 March 12 0.400%
23 March 17 0.566%
24 March 41 1.366%
25 March 74 2.466%
26 March 75 2.499%
27 March 68 2.266%
28 March 90 2.999%
29 March 110 3.665%
30 March 114 3.799%
31 March 99 3.299%
1 April 87 2.899%
2 April 83 2.766%
3 April 106 3.532%
4 April 112 3.732%
5 April 110 3.665%
6 April 92 3.066%
7 April 86 2.866%
8 April 98 3.266%
9 April 112 3.732%
10 April 114 3.799%
11 April 96 3.199%
12 April 88 2.932%
13 April 90 2.999%
14 April 108 3.599%
15 April 117 3.899%
16 April 104 3.466%
17 April 90 2.999%
18 April 93 3.099%
19 April 114 3.799%
20 April 116 3.865%
21 April 93 3.099%
22 April 60 1.999%
23 April 46 1.533%
24 April 57 1.899%
25 April 29 0.966%
Bar chart representing the data in the table above.
Dates of Easter from 2000 through 5000. Computed using Gauss’s algorithm.

If I haven’t missed anything, this indicates that the 15th of April is the most likely date for Easter, with the 20th close behind and the 10th and 14th hardly rare. The least probable date is the 22nd of March, with the 23rd of March and the 25th of April almost as unlikely.

And since the date range does affect the results, here’s a smaller sampling, one a closer fit to the lifetimes of anyone alive to read this as I publish. For the years 1925 through 2100 the appearances of each Easter date are:

Date Number Of Occurrences, 1925 – 2100 Probability Of Occurrence
22 March 0 0.000%
23 March 1 0.568%
24 March 1 0.568%
25 March 3 1.705%
26 March 6 3.409%
27 March 3 1.705%
28 March 5 2.841%
29 March 6 3.409%
30 March 7 3.977%
31 March 7 3.977%
1 April 6 3.409%
2 April 4 2.273%
3 April 6 3.409%
4 April 6 3.409%
5 April 7 3.977%
6 April 7 3.977%
7 April 4 2.273%
8 April 4 2.273%
9 April 6 3.409%
10 April 7 3.977%
11 April 7 3.977%
12 April 7 3.977%
13 April 4 2.273%
14 April 6 3.409%
15 April 7 3.977%
16 April 6 3.409%
17 April 7 3.977%
18 April 6 3.409%
19 April 6 3.409%
20 April 6 3.409%
21 April 7 3.977%
22 April 5 2.841%
23 April 2 1.136%
24 April 2 1.136%
25 April 2 1.136%
Bar chart representing the data in the table above.
Dates of Easter from 1925 through 2100. Computed using Gauss’s algorithm.

If we take this as the “working lifespan” of our common experience then the 22nd of March is the least likely Easter we’ll see, as we never do. The 23rd and 24th are the next least likely Easters. There’s a ten-way tie for the most common date of Easter, if I haven’t missed one or more: the 30th and 31st of March, and the 5th, 6th, 10th, 11th, 12th, 15th, 17th, and 21st of April each turn up seven times in this range.

The Julian calendar Easter dates are different and perhaps I’ll look at that sometime.

Did This German Retiree Solve A Decades-Old Conjecture?


And then this came across my desktop (my iPad’s too old to work with the Twitter client anymore):

The underlying news is that one Thomas Royen, a Frankfurt (Germany)-area retiree, seems to have proven the Gaussian Correlation Inequality. It wasn’t a conjecture that sounded familiar to me, but the sidebar (on the Quanta Magazine article to which I’ve linked there) explains it and reminds me that I had heard about it somewhere or other. It’s about random variables. That is, things that can take on one of a set of different values. If you think of them as the measurements of something that’s basically consistent but never homogenous you’re doing well.

Suppose you have two random variables, two things that can be measured. There’s a probability the first variable is in a particular range, greater than some minimum and less than some maximum. There’s a probability the second variable is in some other particular range. What’s the probability that both variables are simultaneously in these particular ranges? This is easy to answer for some specific cases. For example if the two variables have nothing to do with each other then everybody who’s taken a probability class knows: the probability of both variables being in their ranges is the probability the first is in its range times the probability the second is in its range. The challenge is what to say when the variables are related to each other. The conjecture asserts that, for normally distributed variables and for ranges symmetric around their means, the probability of both variables being in their ranges is never less than that product of the separate probabilities.
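In symbols: for a Gaussian random vector X centered at the origin, and for ranges K and L symmetric about the origin (more generally, symmetric convex sets), the claim is

P(X \in K \cap L) \geq P(X \in K) \cdot P(X \in L)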

The article (and pop reporting on this) is largely about how the proof has gone unnoticed. There’s some interesting social dynamics going on there. Royen published in an obscure-for-the-field journal, one he was an editor for; this makes it look dodgy, at least. And the conjecture’s drawn “proofs” that were just wrong; this discourages people from looking for obscurely-published proofs.

Some of the articles I’ve seen on this make Royen out to be an amateur. And I suppose there is a bias against amateurs in professional mathematics. There is in every field. It’s true that mathematics doesn’t require professional training the way that, say, putting out oil rig fires does. Anyone capable of thinking through an argument rigorously is capable of doing important original work. But there are a lot of tricks to thinking an argument through that are important, and I’d bet on the person with training.

In any case, Royen isn’t a newcomer to the field who just heard of an interesting puzzle. He’d been a statistician, first for a pharmaceutical company and then for a technical university. He may not have a position or tie to a mathematics department or a research organization but he’s someone who would know roughly what to do.

So did he do it? I don’t know; I’m not versed enough in the field to say. It will be interesting to see whether he has.

Reading the Comics, April 6, 2017: Abbreviated Week Edition


I’m writing this a little bit early because I’m not able to include the Saturday strips in the roundup. There won’t be enough to make a split week edition; I’ll just add the Saturday strips to next week’s report. In the meanwhile:

Mac King and Bill King’s Magic in a Minute for the 2nd is a magic trick, as the name suggests. It figures out a card by way of shuffling a (partial) deck and getting three (honest) answers from the other participant. If I’m not counting wrongly, you could do this trick with up to 27 cards and still get the right card after three answers. I feel like there should be a way to explain this that’s grounded in information theory, but I’m not able to put that together. I leave the suggestion here for people who see the obvious before I get to it.
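Here is as much of the information-theory grounding as I can offer: each answer names one of three piles, so three answers can distinguish 3^3 = 27 cards. And here’s a sketch of one common way this kind of trick gets done — I can’t swear it’s exactly King and King’s method — where restacking with the named pile in the middle walks the card to the center of the deck:

```python
# Deal 27 cards into three piles, ask which pile holds the chosen card,
# restack with that pile in the middle, and repeat three times.
def locate(cards, chosen, rounds=3):
    for _ in range(rounds):
        piles = [cards[i::3] for i in range(3)]          # deal round-robin
        holder = next(p for p in piles if chosen in p)   # the honest answer
        others = [p for p in piles if p is not holder]
        cards = others[0] + holder + others[1]           # named pile goes middle
    return cards.index(chosen)

deck = list(range(27))
# After three answers the chosen card always sits dead center, at index 13.
assert all(locate(deck, card) == 13 for card in deck)
```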

Bil Keane and Jeff Keane’s Family Circus (probable) rerun for the 6th reassured me that this was not going to be a single-strip week. And a dubiously included single strip at that. I’m not sure that lotteries are the best use of the knowledge of numbers, but they’re a practical use anyway.

Dolly holds up pads of paper with numbers on them. 'C'mon, PJ, you hafta learn your numbers or else you'll never win the lottery.'
Bil Keane and Jeff Keane’s Family Circus for the 6th of April, 2017. I’m not familiar enough with the evolution of the Family Circus style to say whether this is a rerun, a newly-drawn strip, or an old strip with a new caption. I suppose there is a certain timelessness to it, at least once we get into the era when states sported lotteries again.

Bill Bettwy’s Take It From The Tinkersons for the 6th is part of the universe of students resisting class. I can understand the motivation problem in caring about numbers of apples that satisfy some condition. In the role of distinct objects whose number can be counted or deduced cards are as good as apples. In the role of things to gamble on, cards open up a lot of probability questions. Counting cards is even about how the probability of future events changes as information about the system changes. There’s a lot worth learning there. I wouldn’t try teaching it to elementary school students.

The teacher: 'How many apples will be left, Tillman?' 'When are we going to start counting things more exciting than fruit?' 'What would you like to count, Tillman?' 'Cards.'
Bill Bettwy’s Take It From The Tinkersons for the 6th of April, 2017. That tree in the third panel is a transplant from a Slylock Fox six-differences panel. They’ve been trying to rebuild the population of trees that are sometimes three triangles and sometimes four triangles tall.
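Speaking of how information changes probability, here’s a toy example of mine, not the strip’s. The chance the next card off a fresh deck is an ace shifts the moment you see any card dealt, and tracking shifts like that is all card-counting is:

```python
from fractions import Fraction

deck_size, aces = 52, 4
p_fresh = Fraction(aces, deck_size)               # 4/52 before any card is seen
p_after_ace = Fraction(aces - 1, deck_size - 1)   # 3/51 once an ace is dealt
p_after_other = Fraction(aces, deck_size - 1)     # 4/51 once a non-ace is dealt
print(p_fresh, p_after_ace, p_after_other)
```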

Jeffrey Caulfield and Alexandre Rouillard’s Mustard and Boloney for the 6th uses mathematics as the stuff know-it-alls know. At least I suppose it is; Doctor Know It All speaks of “the pathagorean principle”. I’m assuming that’s meant to be the Pythagorean theorem, although the talk about “in any right triangle the area … ” skews things. You can get to stuff about areas of triangles from the Pythagorean theorem. One of the shorter proofs of it depends on the areas of the squares of the three sides of a right triangle. But it’s not what people typically think of right away. Still, he wouldn’t be the first know-it-all to start blathering on the assumption that people aren’t really listening. It’s common enough to suppose someone who speaks confidently and at length must know something.

Dave Whamond’s Reality Check for the 6th is a welcome return to anthropomorphic-numerals humor. Been a while.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 6th builds on the form of a classic puzzle, about a sequence indexed to the squares of a chessboard. The story being riffed on is a bit of mathematical legend. The King offered the inventor of chess any reward. The inventor asked for one grain of wheat for the first square, two grains for the second square, four grains for the third square, eight grains for the fourth square, and so on, through all 64 squares. An extravagant reward, but surely one within the king’s power to grant, right? And of course not: by the 64th doubling the amount of wheat involved is so enormous it’s impossibly great wealth.

The father’s offer is meant to evoke that. But he phrases it in a deceptive way, “one penny for the first square, two for the second, and so on”. That “and so on” is the key. Listing a sequence and ending “and so on” is incomplete. The sequence can go in absolutely any direction after the given examples and not be inconsistent. There is no way to pick a single extrapolation as the only logical choice.

We do it anyway, though. Even mathematicians say “and so on”. This is because we usually stick to a couple popular extrapolations. We suppose things follow a couple common patterns. They’re polynomials. Or they’re exponentials. Or they’re sine waves. If they’re polynomials, they’re lower-order polynomials. Things like that. Most of the time we’re not trying to trick our fellow mathematicians. Or we know we’re modeling things with some physical base and we have reason to expect some particular type of function.

In this case, the $1.27 total is consistent with getting two cents for every chess square after the first. There are infinitely many other patterns that would work, and the kid would have been wise to ask what precisely “and so on” meant before choosing.
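Both readings of the offer are quick to check, if you’d like (a sketch of mine):

```python
# The legend's doubling: one grain on the first square, twice as many
# on each square after, through all 64 squares.
wheat = sum(2 ** square for square in range(64))  # 1 + 2 + 4 + ... + 2^63
print(wheat)        # 18446744073709551615 grains, which is 2^64 - 1

# The father's stingier reading: one penny for the first square, then a
# flat two cents for each of the remaining 63 squares.
print((1 + 2 * 63) / 100)   # 1.27, the $1.27 the kid actually got
```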

Berkeley Breathed’s Bloom County 2017 for the 7th is the climax of a little story in which Oliver Wendell Jones has been annoying people by shoving scientific explanations of things into their otherwise pleasant days. It’s a habit some scientifically-minded folks have, and it’s an annoying one. Many of us outgrow it. Anyway, this strip is about the curious evidence suggesting that the universe is not just expanding, but accelerating its expansion. There are mathematical models which allow this to happen. When developing General Relativity, Albert Einstein included a Cosmological Constant for little reason besides that without it his model would suggest the universe was of a finite age and had expanded from an infinitesimally small origin. He had grown up without anyone knowing of any evidence that the size of the universe was a thing that could change.

Anyway, the Cosmological Constant is a puzzle. We can find values that seem to match what we observe, but we don’t know of a good reason it should be there. We sciencey types like to have models that match data, but we appreciate more knowing why the models look like that and not anything else. So it’s a good problem some of the cosmologists have been working on. But we’ve been here before. A great deal of physics, especially in the 20th Century, has been driven by looking for reasons behind what look like arbitrary points in a successful model. If Oliver were better-versed in the history of science — something scientifically minded people are often weak on, myself included — he’d be less easily taunted by Opus.

Mikael Wulff and Anders Morgenthaler’s TruthFacts for the 7th thinks that we forgot they ran this same strip back on the 17th of March. I spotted it, though. Nyah.

How Much Might I Have Lost At Pinball?


After the state pinball championship last month there was a second, side tournament. It was a sort-of marathon event in which I played sixteen games in short order. I won three of them and lost thirteen, a disheartening record. The question I can draw from this: was I hopelessly outclassed in the side tournament? Is it plausible that I could do so awfully?

The answer would be “of course not”. I was playing against, mostly, the same people who were in the state finals. (A few who didn’t qualify for the finals joined the side tournament.) In that I had done well enough, winning seven games in all out of fifteen played. It’s implausible that I got significantly worse at pinball between the main and the side tournament. But can I make a logically sound argument about this?

In full, probably not. It’s too hard. The question is, did I win way too few games compared to what I should have expected? But what should I have expected? I haven’t got any information on how likely it should have been that I’d win any of the games, especially not when I faced something like a dozen different opponents. (I played several opponents twice.)

But we can make a model. Suppose that I had a fifty percent chance of winning each match. This is a lie in detail. The model contains lies; all models do. The lies might let us learn something interesting. Some people there I could only beat with a stroke of luck on my side. Some people there I could fairly often expect to beat. If we pretend I had the same chance against everyone, though, we get something that we can model. It might tell us something about what really happened.

If I play 16 matches, and have a 50 percent chance of winning each of them, then I should expect to win eight matches. But there’s no reason I might not win seven instead, or nine. Might win six, or ten, without that being too implausible. It’s even possible I might not win a single match, or that I might win all sixteen matches. How likely?

This calls for a creature from the field of probability that we call the binomial distribution. It’s “binomial” because it’s about stuff for which there are exactly two possible outcomes. This fits. Each match I can win or I can lose. (If we tie, or if the match is interrupted, we replay it, so there’s not another case.) It’s a “distribution” because we describe, for a set of some number of attempted matches, how the possible outcomes are distributed. The outcomes are: I win none of them. I win exactly one of them. I win exactly two of them. And so on, all the way up to “I win exactly all but one of them” and “I win all of them”.
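The formula behind the distribution is compact enough to write out. This is my own sketch, nothing official:

```python
from math import comb

def binom_pmf(k, n, p):
    """Chance of exactly k wins in n matches, each won with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

print(binom_pmf(3, 16, 0.5))   # about 0.00854: the 0.854 percent in the table below
```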

To answer the question of whether it’s plausible I should have done so badly I need to know more than just how likely it is I would win only three games. I need to also know the chance I’d have done worse. If I had won only two games, or only one, or none at all. Why?

Here I admit: I’m not sure I can give a compelling reason, at least not in English. I’ve been reworking it all week without being happy at the results. Let me try pieces.

One part is that the question as I put it — is it plausible that I could do so awfully? — isn’t answered just by checking how likely it is I would win only three games out of sixteen. If that’s awful, then doing even worse must also be awful. I can’t rule out even-worse results from awfulness without losing a sense of what the word “awful” means. Fair enough, to answer that question. But I made up the question. Why did I make up that one? Why not just “is it plausible I’d get only three out of sixteen games”?

Habit, largely. Experience shows me that the probability of any particular result turns out to be implausibly low. It isn’t quite the case here; there are only seventeen possible noticeably different outcomes of playing sixteen games. But there can be so many possible outcomes that even the most likely one isn’t likely.

Take an extreme case. (Extreme cases are often good ways to build an intuitive understanding of things.) Imagine I played 16,000 games, with a 50-50 chance of winning each one of them. It is most likely that I would win 8,000 of the games. But the probability of winning exactly 8,000 games is small: only about 0.6 percent. What’s going on there is that there’s almost the same chance of winning exactly 8,001 or 8,002 games. As the number of games increases the number of possible different outcomes increases. If there are 16,000 games there are 16,001 possible outcomes. It’s less likely that any of them will stand out. What saves our ability to predict the results of things is that the number of plausible outcomes increases more slowly. It’s plausible someone would win exactly three games out of sixteen. It’s all but impossible that someone would win exactly three thousand games out of sixteen thousand, even though that’s the same ratio of won games.
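If you don’t trust that 0.6 percent figure, exact integer arithmetic will confirm it:

```python
from math import comb

# Python's integers are exact, so the enormous binomial coefficient is
# no trouble; the chance of exactly 8,000 wins in 16,000 fair games:
print(comb(16000, 8000) / 2 ** 16000)   # about 0.0063, the 0.6 percent above
```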

Card games offer another way to get comfortable with this idea. A bridge hand, for example, is thirteen cards drawn out of fifty-two. But the chance that you were dealt the hand you just got? Impossibly low. Should we conclude from this all bridge hands are hoaxes? No, but ask my mother sometime about the bridge class she took that one cruise. “Three of sixteen” is too particular; “at best three of sixteen” is a class I can study.

Unconvinced? I don’t blame you. I’m not sure I would be convinced of that, but I might allow the argument to continue. I hope you will. So here are the specifics. For sixteen matches, each a 50-50 proposition, here is the chance of winning exactly each possible number of games:

Wins Percentage
0 0.002 %
1 0.024 %
2 0.183 %
3 0.854 %
4 2.777 %
5 6.665 %
6 12.219 %
7 17.456 %
8 19.638 %
9 17.456 %
10 12.219 %
11 6.665 %
12 2.777 %
13 0.854 %
14 0.183 %
15 0.024 %
16 0.002 %

So the chance of doing as awfully as I had — winning zero or one or two or three games — is pretty dire. It’s a little above one percent.
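That tail is quick to check:

```python
from math import comb

# The chance of three or fewer wins in sixteen 50-50 matches:
print(sum(comb(16, k) for k in range(4)) / 2 ** 16)   # 697/65536, about 0.0106
```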

Is that implausibly low? Is there so small a chance that I’d do so badly that we have to figure I didn’t have a 50-50 chance of winning each game?

I hate to think that. I didn’t think I was outclassed. But here’s a problem. We need some standard for what is “it’s implausibly unlikely that this happened by chance alone”. If there were only one chance in a trillion that someone with a 50-50 chance of winning any game would put in the performance I did, we could suppose that I didn’t actually have a 50-50 chance of winning any game. If there were only one chance in a million of that performance, we might also suppose I didn’t actually have a 50-50 chance of winning any game. But here there was only one chance in a hundred? Is that too unlikely?

It depends. We should have set a threshold for “too implausibly unlikely” before we started research. It’s bad form to decide afterward. There are some thresholds that are commonly taken. Five percent is often useful for stuff where it’s hard to do bigger experiments and the harm of guessing wrong (dismissing the idea I had a 50-50 chance of winning any given game, for example) isn’t so serious. One percent is another common threshold, again common in stuff like psychological studies where it’s hard to get more and more data. In a field like physics, where experiments are relatively cheap to keep running, you can gather enough data to insist on fractions of a percent as your threshold.

In my defense, I had thought (without doing the work) that there was something like a five percent chance of my doing that badly by luck alone, which would have been just barely plausible. That the chance is nearer one percent suggests that I did have a much worse than 50 percent chance of winning any given game.

Is that credible? Well, yeah; I may have been in the top sixteen players in the state. But a lot of those people are incredibly good. Maybe I had only one chance in three, or something like that. That would make the chance I did that poorly something like one in six, likely enough.
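And that one-in-six figure checks out the same way:

```python
from math import comb

# The same awful-or-worse tail if my real chance each game were one in three:
p = 1 / 3
print(sum(comb(16, k) * p**k * (1 - p)**(16 - k) for k in range(4)))  # about 0.166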

And it’s also plausible that games are not independent, that whether I win one game depends in some way on whether I won or lost the previous. It does feel like it’s easier to win after a win, or after a close loss, and harder to win after a string of losses. I don’t know that this can be proved, not on the meager evidence I have available. But you can almost always question the independence of a string of events like this. It’s the safe bet.

Reading the Comics, March 6, 2017: Blackboards Edition


I can’t say there’s a compelling theme to the first five mathematically-themed comics of last week. Screens full of mathematics turned up in a couple of them, so I’ll run with that. There were also just enough strips that I’m splitting the week again. It seems fair to me, and it gives me something to rush to complete come Wednesday night.

Jimmy Hatlo’s Little Iodine for the 1st of January, 1956 was rerun on the 5th of March. The setup demands Little Iodine pester her father for help with the “hard homework” and of course it’s arithmetic that gets to play the part of hard work. It’s a word problem in terms of who has how many apples, as you might figure. Don’t worry about Iodine’s father getting fired; Little Iodine gets her father fired every week. It’s their schtick.

Little Iodine wakes her father early after a night at the lodge. 'You got to help me with my [hard] homework.' 'Ooh! My head! Wha'?' 'The first one is, if John has twice as many apples as Tom and Sue put together ... ' 'Huh? kay! Go on, let's get this over with.' They work through to morning. Iodine's teacher sees her asleep in class and demands she bring 'a note from your parents as to why you sleep in school instead of at home!' She goes to her father's office where her father's boss is saying, 'Well, Tremblechin, wake up! The hobo hotel is three blocks south and PS: DON'T COME BACK!'
Jimmy Hatlo’s Little Iodine for the 1st of January, 1956. I guess class started right back up the 2nd, but it would’ve avoided so much trouble if she’d done her homework sometime during the winter break. That said, I never did.

Dana Simpson’s Phoebe and her Unicorn for the 5th mentions the “most remarkable of unicorn confections”, a sugar dodecahedron. Dodecahedrons have long captured human imaginations, as one of the Platonic Solids. The Platonic Solids are one of the ways we can make a solid-geometry analogue to a regular polygon. Phoebe’s other mentioned shape of cubes is another of the Platonic Solids, but that one’s common enough to encourage no sense of mystery or wonder. The cube’s the only one of the Platonic Solids that will fill space, though, that you can put into stacks that don’t leave gaps between them. Sugar cubes, Wikipedia tells me, have been made only since the 19th century; the Moravian sugar factory director Jakub Kryštof Rad got a patent for cutting block sugar into uniform pieces in 1843. I can’t dispute the fun of “dodecahedron” as a word to say. Many solid-geometric shapes have names that are merely descriptive, but which are rendered with Greek or Latin syllables so as to sound magical.

Bud Grace’s Piranha Club for the 6th started a sequence in which the Future Disgraced Former President needs the most brilliant person in the world, Bud Grace. A word balloon full of mathematics is used as a symbol for this genius. I feel compelled to point out Bud Grace was a physics major. But while Grace could as easily have used something from the physics department to show his deep-thinking abilities, that would all but certainly have been rendered as equations and graphs, the stuff of mathematics again.

At the White Supremacist House: 'I have the smartest people I could find to help me run this soon-to-be-great-again country, but I'm worried that they're NOT SMART ENOUGH! I want the WORLD'S SMARTEST GENIUS to be my SPECIAL ADVISOR!' Meanwhile, cartoonist Bud Grace thinks of stuff like A = pi*r^2 and a^2 + b^2 = c^2 and tries working out 241 times 635, 'carry the one ... hmmmm ... '
Bud Grace’s Piranha Club for the 6th of March, 2017. 241 times 635 is 153,035 by the way. I wouldn’t work that out in my head if I needed the number. I might work out an estimate of how big it was, in which case I’d do this: 241 is about 250, which is one-quarter of a thousand. One-quarter of 635 is something like 150, which times a thousand is 150,000. If I needed it exactly I’d get a calculator. Unless I just needed something to occupy my mind without having any particular emotional charge.

Scott Meyer’s Basic Instructions rerun for the 6th is aptly titled, “How To Unify Newtonian Physics And Quantum Mechanics”. Meyer’s advice is not bad, really, although generic enough it applies to any attempts to reconcile two different models of a phenomenon. Also there’s not particularly a problem reconciling Newtonian physics with quantum mechanics. It’s general relativity and quantum mechanics that are so hard to reconcile.

Still, Basic Instructions is about how you can do a thing, or learn to do a thing. It’s not about how to allow anything to be done for the first time. And it’s true that, per quantum mechanics, we can’t predict exactly what any one particle will do at any time. We can say what possible things it might do and how relatively probable they are. But big stuff, the stuff for which Newtonian physics is relevant, involves so many particles that the unpredictability becomes too small to notice. We can see this as the Law of Large Numbers. That’s the probability rule that tells us we can’t predict any coin flip, but we know that a million fair tosses of a coin will not turn up 800,000 tails. There’s more to it than that (there’s always more to it), but that’s a starting point.
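You can watch that smoothing-out happen yourself, if you like:

```python
import random

# A million fair tosses: the tails count strays from 500,000 on the order
# of one standard deviation, sqrt(1000000)/2 = 500 -- nowhere near 800,000.
random.seed(2017)
print(sum(random.random() < 0.5 for _ in range(1_000_000)))
```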

Michael Fry’s Committed rerun for the 6th features Albert Einstein as the icon of genius. Natural enough. And it reinforces this with the blackboard full of mathematics. I’m not sure if that blackboard note of “E = md^3” is supposed to be a reference to the famous Far Side panel of Einstein hearing the maid talk about everything being squared away. I’ll take it as such.

How Much I Did Lose In Pinball


A follow-up for people curious how much I lost at the state pinball championships Saturday: I lost at the state pinball championships Saturday. As I expected I lost in the first round. I did beat my expectations, though. I’d figured I would win one, maybe two games in our best-of-seven contest. As it happened I won three games and I had a fighting chance in game seven.

I’d mentioned in the previous essay how much contingency there is, especially in a short series like this one. My opponent started out by picking the game I had expected she would. And she got an awful bounce on the first ball, while I got a very lucky bounce that started multiball on the last. So I won, but not because I was playing better. The seventh game was one that I had figured she might pick if she needed to crush me, and if I had gotten a better bounce on the first ball I’d still have had an uphill struggle. Just less of one.

After the first round I got into a set of three “tie-breaking” rounds, used to sort out which of the sixteen players ranked as number 11 versus number 10. Each of those was a best-of-three series. I won one series and lost the two others, dropping me into 12th place. Over the three series I had four wins and four losses, so I can’t say that I was mismatched there.

Where I might have been mismatched is the side tournament. This was a two-hour marathon of playing a lot of games one after the other. I finished with three wins and 13 losses, enough to make me wonder whether I somehow went from competent to incompetent in the hour or so between the main and the side tournament. Of course not, based on a record like that, but — can I prove it?

Meanwhile a friend pointed out The New York Times covering the New York State pinball championship:

The article is (at least for now) at https://www.nytimes.com/2017/02/12/nyregion/pinball-state-championship.html. What my friend couldn’t have known, and what shows how networked people are, is that I know one of the people featured in the article, Sean “The Storm” Grant. Well, I knew him, back in college. He was an awesome pinball player even then. And he’s only got more awesome since.

How awesome? Let me give you some background. The International Flipper Pinball Association (IFPA) gives players ranking points. These points are gathered by playing in leagues and tournaments. Each league or tournament has a certain point value. That point value is divided up among the players, in descending order from how they finish. How many points do the events have? That depends on how many people play and what their ranking is. So, yes, how much someone’s IFPA score increases depends on the events they go to, and the events they go to depend on their score. This might sound to you like there’s a differential equation describing all this. You’re close: it’s a difference equation, because these rankings change with the discrete number of events players go to. But there’s an interesting and iterative system at work there.

(Points only expire with time. The system is designed to encourage people to play a lot of things and keep playing them. You can’t lose ranking points by playing, although it might hurt your player-versus-player rating. That’s calculated by a formula I don’t understand at all.)
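To give the flavor of a difference equation, here’s a toy model. I stress it is entirely my own invention, not the real IFPA formula: points fade a little between events, while each event adds some share of its value, a value which in turn depends on who entered.

```python
# Toy ranking update, NOT the actual IFPA calculation.
def next_ranking(points, event_value, finish_share, decay=0.95):
    # decay: hypothetical fading of old points between events
    # finish_share: share of the event's value your finish earns
    return decay * points + event_value * finish_share

points = 0.0
for value, share in [(100, 0.50), (150, 0.25), (80, 1.00)]:
    points = next_ranking(points, value, share)
    print(round(points, 2))   # the ranking after each successive event
```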

Anyway, Sean Grant plays in the New York Superleague, a crime-fighting band of pinball players who figured out how to game the IFPA rankings system. They figured out how to turn the large number of people who might visit a Manhattan bar and casually play one or two games into a source of ranking points for the serious players. The IFPA, combatting this scheme, just this week recalculated the Superleague values and the rankings of everyone involved in it. It’s fascinating stuff, in that way a heated debate over an issue you aren’t emotionally invested in can be.

Anyway. Grant is such a skilled player that he lost more points in this nerfing than I have gathered in my whole competitive-pinball-playing career.

So while I knew I’d be knocked out in the first round of the Michigan State Championships I’ll admit I had fantasies of having an impossibly lucky run. In that case, I’d have gone to the nationals and been turned into a pale, silverball-covered paste by people like Grant.

Thanks again for all your good wishes, kind readers. Now we start the long road to the 2017 State Championships, to be held in February of next year. I’m already in 63rd place in the state for the year! (There haven’t been many events for the year yet, and the championship and side tournament haven’t posted their ranking scores yet.)

How Much Can I Expect To Lose In Pinball?


This weekend, all going well, I’ll be going to the Michigan state pinball championship contest. There, I will lose in the first round.

I’m not trying to run myself down. But I know who I’m scheduled to play in the first round, and she’s quite a good player. She’s the state’s highest-ranked woman playing competitive pinball. So she starts off being better than me. And then the venue is one she gets to play in more than I do. Pinball, a physical thing, is idiosyncratic. The reflexes you build practicing on one table can betray you on a strange machine. She’s had more chance to practice on the games we have and that pretty well settles the question. I’m still showing up, of course, and doing my best. Stranger things have happened than my winning a game. But I’m going in with I hope realistic expectations.

That bit about having realistic expectations, though, makes me ask what are realistic expectations. The first round is a best-of-seven match. How many games should I expect to win? And that becomes a probability question. It’s a great question to learn on, too. Our match is straightforward to model: we play up to seven times. Each time we play one or the other wins.

So we can start calculating. There’s some probability I have of winning any particular game. Call that number ‘p’. It’s more than zero (I’m not sure to lose) but it’s less than one (I’m not sure to win). Let’s suppose the probability of my winning never changes over the course of seven games. I will come back to the card I palmed there. If we’re playing 7 games, and I have a chance ‘p’ of winning any one of them, then the number of games I can expect to win is 7 times ‘p’. This is the number of wins you might expect if you were called on in class and had no idea and bluffed the first thing that came to mind. Sometimes that works.

7 times p isn’t very enlightening. What number is ‘p’, after all? And I don’t know exactly. The International Flipper Pinball Association tracks how many times I’ve finished a tournament or league above her and vice-versa. We’ve played in 54 recorded events together, and I’ve won 23 and lost 29 of them. (We’ve tied twice.) But that isn’t all head-to-head play. It counts matches where I’m beaten by someone she goes on to beat as her beating me, and vice-versa. And it includes a lot of playing not at the venue. I lack statistics and must go with my feelings. I’d estimate my chance of beating her at about one in three. Let’s say ‘p’ is 1/3 until we get evidence to the contrary. It is “Flipper Pinball” because the earliest pinball machines had no flippers. You plunged the ball into play and nudged the machine a little to keep it going somewhere you wanted. (The game Simpsons Pinball Party has a moment where Grampa Simpson says, “back in my day we didn’t have flippers”. It’s the best kind of joke, the one that is factually correct.)

Seven times one-third is not a difficult problem. It comes out to two and a third, raising the question of how one wins one-third of a pinball game. (The obvious observation is that most games involve playing three rounds, called balls.) But this one-third of a game is an average. Imagine the two of us playing three thousand seven-game matches, without either of us getting the least bit better or worse or collapsing of exhaustion. I would expect to win seven thousand of the games, or two and a third games per seven-game match.

Ah, but … that’s too high. I would expect to win two and a third games out of seven. But we probably won’t play seven. We’ll stop when she or I gets to four wins. This makes the problem hard. Hard is the wrong word. It makes the problem tedious. At least it threatens to. Things will get easy enough, but we have to go through some difficult parts first.

There are eight different ways that our best-of-seven match can end. She can win in four games. I can win in four games. She can win in five games. I can win in five games. She can win in six games. I can win in six games. She can win in seven games. I can win in seven games. There is some chance of each of those eight outcomes happening. And exactly one of those will happen; it’s not possible that she’ll win in four games and in five games, unless we lose track of how many games we’d played. They give us index cards to write results down. We won’t lose track.

It’s easy to calculate the probability that I win in four games, if the chance of my winning a game is the number ‘p’. The probability is p^4. Similarly it’s easy to calculate the probability that she wins in four games. If I have the chance ‘p’ of winning, then she has the chance ‘1 – p’ of winning. So her probability of winning in four games is (1 – p)^4.

The probability of my winning in five games is more tedious to work out. It’s going to be p^4 times (1 – p) times 4. The 4 here is the number of different ways that she can win one of the first four games. Turns out there’s four ways to do that. She could win the first game, or the second, or the third, or the fourth. And in the same way the probability she wins in five games is p times (1 – p)^4 times 4.

The probability of my winning in six games is going to be p^4 times (1 – p)^2 times 10. There are ten ways to scatter her two wins among the first five games. The probability of her winning in six games is the strikingly parallel p^2 times (1 – p)^4 times 10.

The probability of my winning in seven games is going to be p^4 times (1 – p)^3 times 20, because there are 20 ways to scatter her three wins among the first six games. And the probability of her winning in seven games is p^3 times (1 – p)^4 times 20.

Add all those probabilities up, no matter what ‘p’ is, and you should get 1. Exactly one of those eight outcomes has to happen. And we can work out the probability that the series will end after four games: it’s the chance she wins in four games plus the chance I win in four games. The probability that the series goes to five games is the probability that she wins in five games plus the probability that I win in five games. And so on for six and for seven games.

So that’s neat. We can figure out the probability of the match ending after four games, after five, after six, or after seven. And from that we can figure out the expected length of the match. This is the expectation value. Take the product of ‘4’ and the chance the match ends at four games. Take the product of ‘5’ and the chance the match ends at five games. Take the product of ‘6’ and the chance the match ends at six games. Take the product of ‘7’ and the chance the match ends at seven games. Add all those up. That’ll be, wonder of wonders, the number of games a match like this can be expected to run.

Now it’s a matter of adding together all these combinations of all these different outcomes and you know what? I’m not doing that. I don’t know what the chance is that I’d do all this arithmetic correctly, but I know there’s no chance I’d do all this arithmetic correctly. This is the stuff we pirate Mathematica to do. (Mathematica is supernaturally good at working out mathematical expressions. A personal license costs all the money you will ever have in your life plus ten percent, which it will calculate for you.)

Happily I won’t have to work it out. A person appearing to be a high school teacher named B Kiggins has worked it out already. Kiggins put it and a bunch of other interesting worksheets on the web. (Look for the Voronoi Diagramas!)

There’s a lot of arithmetic involved. But it all simplifies out, somehow. Per Kiggins’ work, the expected number of games in a best-of-seven match, if one of the competitors has the chance ‘p’ of winning any given game, is:

E(p) = 4 + 4\cdot p + 4\cdot p^2 + 4\cdot p^3 - 52\cdot p^4 + 60\cdot p^5 - 20\cdot p^6

Whatever you want to say about that, it’s a polynomial. And it’s easy enough to evaluate it, especially if you let the computer evaluate it. Oh, I would say it seems like a shame all those coefficients of ‘4’ drop off and we get weird numbers like ’52’ after that. But there’s something beautiful in there being four 4’s, isn’t there? Good enough.

So. If the chance of my winning a game, ‘p’, is one-third, then we’d expect the series to go 5.5 games. This accords well with my intuition. I thought I would be likely to win one game. Winning two would be a moral victory akin to championship.
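Here’s a check of that, enumerating the series lengths directly and comparing against Kiggins’ polynomial. A sketch of mine, assuming the combinatorics above:

```python
from math import comb

def length_distribution(p):
    """Chance a best-of-seven ends after exactly n games, n = 4 through 7.
    The winner takes game n and exactly 3 of the first n - 1 games."""
    q = 1 - p
    return {n: comb(n - 1, 3) * (p**4 * q**(n - 4) + q**4 * p**(n - 4))
            for n in range(4, 8)}

p = 1 / 3
expected = sum(n * prob for n, prob in length_distribution(p).items())
poly = 4 + 4*p + 4*p**2 + 4*p**3 - 52*p**4 + 60*p**5 - 20*p**6
print(round(expected, 2), round(poly, 2))   # both about 5.5
```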

Let me go back to my palmed card. This whole analysis is based on the idea that I have some fixed probability of winning and that it isn’t going to change from one game to the next. If the probability of winning is entirely based on my and my opponents’ abilities this is fair enough. Neither of us is likely to get significantly more or less skilled over the course of even seven matches. We won’t even play long enough to get fatigued.

But our abilities aren’t everything. We’re going to be playing up to seven different tables. How each table reacts to our play is going to vary. Some tables may treat me better, some tables my opponent. Luck of the draw. And there’s an important psychological component. It’s easy to get thrown and to let a bad ball wreck the rest of one’s game. It’s hard to resist feeling nervous if you go into the last ball from way behind your opponent. And it seems as if a pinball knows you’re nervous and races out of play to help you calm down. (The best pinball players tend to have outstanding last balls, though. They don’t get rattled. And they spend the first several balls building up to high-value shots they can collect later on.) And there will be freak events. Last weekend I was saved from elimination in a tournament by the pinball machine spontaneously resetting. We had to replay the game. I did well in the tournament, but it was the freak event that kept me from being knocked out in the first round.

That’s some complicated stuff to fit together. I suppose with enough data we could possibly model how much the differences between pinball machines affect the outcome. That’s what sabermetrics is all about. Representing how a little bad luck compounds into a lot of bad luck? Oh, that’s hard.

Too hard to deal with, at least without much more sports psychology and modelling of pinball players than we have data to do. The assumption that my chance of winning doesn’t change over the course of the match may be false. But it’s near enough, and it gets us some useful information. We have to know not to demand too much precision from our model.

And seven games isn’t statistically significant. Not when players are as closely matched as we are. I could be worse and still get a couple wins in when they count; I could play better than my average and still get creamed four games straight. I’ll be trying my best, of course. But I expect my best is one or two wins, then getting to the snack room and waiting for the side tournament to start. Shall let you know if something interesting happens.

Reading the Comics, January 7, 2016: Just Before GoComics Breaks Everything Edition


Most of the comics I review here are printed on GoComics.com. Well, most of the comics I read online are from there. But even so I think they have more comic strips that mention mathematical themes. Anyway, they’re unleashing a complete web site redesign on Monday. I don’t know just what the final version will look like. I know that the beta versions included the incredibly useful, that is to say dumb, feature where if a particular comic you do read doesn’t have an update for the day — and many of them don’t, as they’re weekly or three-times-a-week or so — then it’ll show some other comic in its place. I mean, the idea of encouraging people to find new comics is a good one. To some extent that’s what I do here. But the beta made no distinction between “comic you don’t read because you never heard of Microcosm” and “comic you don’t read because glancing at it makes your eyes bleed”. And on an idiosyncratic note, I read a lot of comics. I don’t need to see Dude and Dude reruns in fourteen spots on my daily comics page, even if I didn’t mind it to start.

Anyway. I am hoping, desperately hoping, that with the new site all my old links to comics are going to keep working. If they don’t then I suppose I’m just ruined. We’ll see. My suggestion is if you’re at all curious about the comics you read them today (Sunday) just to be safe.

Ashleigh Brilliant’s Pot-Shots is a curious little strip I never knew of until GoComics picked it up a few years ago. Its format is compellingly simple: a little illustration alongside a wry, often despairing, caption. I love it, but I also understand why it was the subject of endless queries to the Detroit Free Press (Or Whatever) about why this thing was taking up newspaper space. The strip rerun the 31st of December is a typical example of the strip and amuses me at least. And it uses arithmetic as the way to communicate reasoning, both good and bad. Brilliant’s joke does address something that logicians have to face, too. Whether an argument is logically valid depends entirely on its structure. If the form is correct the reasoning may be excellent. But to be sound an argument must be valid and must also have its assumptions be true. We can separate whether an argument is right from whether it could ever possibly be right. If you don’t see the value in that, you have never participated in an online debate about where James T Kirk was born and whether Spock was the first Vulcan in Star Fleet.

Thom Bluemel’s Birdbrains for the 2nd of January, 2017, is a loaded-dice joke. Is this truly mathematics? Statistics, at least? Close enough for the start of the year, I suppose. Working out whether a die is loaded is one of the things any gambler would like to know, and that mathematicians might be called upon to identify or exploit. (I had a grandmother unshakably convinced that I would have some natural ability to beat the Atlantic City casinos if she could only sneak the underaged me in. I doubt I could do anything of value there besides see the stage magic show.)

Jack Pullan’s Boomerangs rerun for the 2nd is built on the one bit of statistical mechanics that everybody knows, that something or other about entropy always increasing. It’s not a quantum mechanics rule, but it’s a natural confusion. Quantum mechanics has the reputation as the source of all the most solid, irrefutable laws of the universe’s working. Statistical mechanics and thermodynamics have this musty odor of 19th-century steam engines, no matter how much there is to learn from there. Anyway, the collapse of systems into disorder is not an irrevocable thing. It takes only energy or luck to overcome disorderliness. And in many cases we can substitute time for luck.

Scott Hilburn’s The Argyle Sweater for the 3rd is the anthropomorphic-geometry-figure joke that I’ve been waiting for. I had thought Hilburn did this all the time, although a quick review of Reading the Comics posts suggests he’s been more about anthropomorphic numerals the past year. This is why I log even the boring strips: you never know when I’ll need to check the last time Scott Hilburn used “acute” to mean “cute” in reference to triangles.

Mike Thompson’s Grand Avenue uses some arithmetic as the visual cue for “any old kind of schoolwork, really”. Steve Breen’s name seems to have gone entirely from the comic strip. On Usenet group rec.arts.comics.strips Brian Henke found that Breen’s name hasn’t actually been on the comic strip since May, and D D Degg found a July 2014 interview indicating Thompson had mostly taken the strip over from originator Breen.

Mark Anderson’s Andertoons for the 5th is another name-drop that doesn’t have any real mathematics content. But come on, we’re talking Andertoons here. If I skipped it the world might end or something untoward like that.

'Now for my math homework. I've got a comfortable chair, a good light, plenty of paper, a sharp pencil, a new eraser, and a terrific urge to go out and play some ball.'
Ted Shearer’s Quincy for the 14th of November, 1977, and reprinted the 7th of January, 2017. I kind of remember having a lamp like that. I don’t remember ever sitting down to do my mathematics homework with a paintbrush.

Ted Shearer’s Quincy for the 14th of November, 1977, doesn’t have any mathematical content really. Just a mention. But I need some kind of visual appeal for this essay and Shearer is usually good for that.

Corey Pandolph, Phil Frank, and Joe Troise’s The Elderberries rerun for the 7th is also a very marginal mention. But, what the heck, it’s got some of your standard wordplay about angles and it’ll get this week’s essay that much closer to 800 words.

Reading the Comics, December 17, 2016: Sleepy Week Edition


Comic Strip Master Command sent me a slow week in mathematical comics. I suppose they knew I was somehow on a busier schedule than usual and couldn’t spend all the time I wanted just writing. I appreciate that but don’t want to see another of those weeks when nothing qualifies. Just a warning there.

'Dadburnit! I ain't never gonna git geometry!' 'Bah! Don't fret, Jughaid --- I never understood it neither! But I still manage to work all th' angles!'
John Rose’s Barney Google and Snuffy Smith for the 12th of December, 2016. I appreciate the desire to pay attention to continuity that makes Rose draw in the coffee cup both panels, but Snuffy Smith has to swap it from one hand to the other to keep it in view there. Not implausible, just kind of busy. Also I can’t fault Jughaid for looking at two pages full of unillustrated text and feeling lost. That’s some Bourbaki-grade geometry going on there.

John Rose’s Barney Google and Snuffy Smith for the 12th is a bit of mathematical wordplay. It does use geometry as the “hard mathematics we don’t know how to do”. That’s a change from the usual algebra. And that’s odd considering the joke depends on an idiom that is actually used by real people.

Patrick Roberts’s Todd the Dinosaur for the 12th uses mathematics as the classic impossibly hard subject a seven-year-old can’t be expected to understand. The worry about fractions seems age-appropriate. I don’t know whether it’s fashionable to give elementary school students experience thinking of ‘x’ and ‘y’ as numbers. I remember that as a time when we’d get a square or circle and try to figure what number fits in the gap. It wasn’t a 0 or a square often enough.

'Teacher! Todd just passed out! But he's waring one of those medic alert bracelets! ... Do not expose the wearer of this bracelet to anything mathematical, especially x's and y's, fractions, or anything that he should remember for a test!' 'Amazing how much writing they were able to fit on a little ol' T-Rex wrist!'
Patrick Roberts’s Todd the Dinosaur for the 12th of December, 2016. Granting that Todd’s a kid dinosaur and that T-Rexes are not renowned for the hugeness of their arms, wouldn’t that still be enough space for a lot of text to fit around? I would have thought so anyway. I feel like I’m pluralizing ‘T-Rex’ wrong, but what would possibly be right? ‘Ts-rex’? Don’t make me try to spell tyrannosaurus.

Jef Mallett’s Frazz for the 12th uses one of those great questions I think every child has. And it uses it to question how we can learn things from statistical study. This is circling around the “Bayesian” interpretation of probability, of what odds mean. It’s a big idea and I’m not sure I’m competent to explain it. It amounts to asking what explanations would be plausibly consistent with observations. As we get more data we may be able to rule some cases in or out. It can be unsettling. It demands we accept right up front that we may be wrong. But it lets us find reasonably clean conclusions out of the confusing and muddy world of actual data.
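A tiny worked example of the Bayesian update, my own rather than the strip’s: two candidate explanations of a coin — fair, or loaded to land heads two-thirds of the time — start out equally plausible, and then one toss comes up heads.

```python
from fractions import Fraction

prior = {"fair": Fraction(1, 2), "loaded": Fraction(1, 2)}
p_heads = {"fair": Fraction(1, 2), "loaded": Fraction(2, 3)}

# Reweight each explanation by how well it predicted the observation.
evidence = sum(prior[h] * p_heads[h] for h in prior)
posterior = {h: prior[h] * p_heads[h] / evidence for h in prior}
print(posterior)   # fair drops to 3/7, loaded rises to 4/7
```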

Sam Hepburn’s Questionable Quotebook for the 14th illustrates an old observation about the hypnotic power of decimal points. I think Hepburn’s gone overboard in this, though: six digits past the decimal in this percentage is too many. It draws attention to the fakeness of the number. One, two, maybe three digits past the decimal would have a more authentic ring to them. I had thought the John Allen Paulos tweet above was about this comic, but it’s mere coincidence. Funny how that happens.

When Is Thanksgiving Most Likely To Happen?


So my question from last Thursday nagged at my mind. And I learned that Octave (a Matlab clone that’s rather cheaper) has a function that calculates the day of the week for any given day. And I spent longer than I would have expected fiddling with the formatting to get what I wanted to know.

It turns out there are some days in November more likely to be the fourth Thursday than others are. (This is the current standard for Thanksgiving Day in the United States.) And as I’d suspected without being able to prove, this doesn’t quite match the breakdown of which months are more likely to have Friday the 13ths. That is, it’s more likely that an arbitrarily selected month will start on Sunday than any other day of the week. It’s least likely that an arbitrarily selected month will start on a Saturday or Monday. The difference is extremely tiny; there are only four more Sunday-starting months than there are Monday-starting months over the course of 400 years.

But an arbitrary month is different from an arbitrary November. It turns out Novembers are most likely to start on a Sunday, Tuesday, or Thursday. And that makes the 26th, 24th, and 22nd the most likely days to be Thanksgiving. The 23rd and 25th are the least likely days to be Thanksgiving. Here’s the full roster, if I haven’t made any serious mistakes with it:

November Date Times It Is Thanksgiving In 400 Years
22 58
23 56
24 58
25 56
26 58
27 57
28 57

I don’t pretend there’s any significance to this. But it is another of those interesting quirks of probability. What you would say the probability is of a month starting on a Sunday — equivalently, of having a Friday the 13th, or a fourth Thursday of the month that’s the 26th — depends on how much you know about the month. If you know only that it’s a month on the Gregorian calendar it’s one thing (specifically, it’s 688/4800, or about 0.14333). If you know only that it’s a November then it’s another (58/400, or 0.145). If you know only that it’s a month in 2016 then it’s another yet (1/12, or about 0.08333). If you know that it’s November 2016 then the probability is 0. Information does strange things to probability questions.
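If you’d like to check the Thanksgiving table without Octave, here’s a Python rendering of the same computation:

```python
import datetime

# Count which November date is the fourth Thursday across one full
# 400-year Gregorian cycle.
counts = {day: 0 for day in range(22, 29)}
for year in range(2000, 2400):   # any 400 consecutive years gives the same tally
    thursdays = [d for d in range(1, 31)
                 if datetime.date(year, 11, d).weekday() == 3]  # 3 is Thursday
    counts[thursdays[3]] += 1    # the fourth Thursday of that November
print(counts)   # {22: 58, 23: 56, 24: 58, 25: 56, 26: 58, 27: 57, 28: 57}
```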

Reading the Comics, November 26, 2016: What is Pre-Algebra Edition


Here I’m just closing out last week’s mathematically-themed comics. The new week seems to be bringing some more in at a good pace, too. Should have stuff to talk about come Sunday.

Darrin Bell and Theron Heir’s Rudy Park for the 24th brings out the ancient question, why do people need to do mathematics when we have calculators? As befitting a comic strip (and Sadie’s character) the question goes unanswered. But it shows off the understandable confusion people have between mathematics and calculation. Calculation is a fine and necessary thing. And it’s fun to do, within limits. And someone who doesn’t like to calculate probably won’t be a good mathematician. (Or will become one of those master mathematicians who sees ways to avoid calculations in getting to an answer!) But put aside the obvious point that we need mathematics to know what calculations to do, or to tell whether a calculation, once done, makes sense. Much of what’s interesting about mathematics isn’t a calculation. Geometry, for an example that people in primary education will know, doesn’t need more than slight bits of calculation. Group theory swipes a few nice ideas from arithmetic and builds its own structure. Knot theory uses polynomials — everything does — but more as a way of naming structures. There aren’t things to do that a calculator would recognize.

Richard Thompson’s Poor Richard’s Almanac for the 25th I include because I’m a fan, and on the grounds that the Summer Reading includes the names of shapes. And I’ve started to notice how often “rhomboid” is used as a funny word. Those who search for the evolution and development of jokes, take heed.

John Atkinson’s Wrong Hands for the 25th is the awaited anthropomorphic-numerals and symbols joke for this past week. I enjoy the first commenter’s suggestion that they should have stayed in unknown territory.

'Can you help me with my math, Grandma?' 'Let me see.' 'It's pre-algebra.' 'Oh, darn!' 'What's wrong?' 'I'm post-algebra.'
Rick Kirkman and Jerry Scott’s Baby Blues for the 26th of November, 2016. I suppose Kirkman and Scott know their characters better than I do but isn’t Zoe like nine or ten? Isn’t pre-algebra more a 7th or 8th grade thing? I can’t argue Grandma being post-algebra but I feel like the punch line was written and then retrofitted onto the characters.

Rick Kirkman and Jerry Scott’s Baby Blues for the 26th does a little wordplay built on pre-algebra. I’m not sure that Zoe is quite old enough to take pre-algebra. But I also admit not being quite sure what pre-algebra is. The central idea of (primary school) algebra — that you can do calculations with a number without knowing what the number is — certainly can use some preparatory work. It’s a dazzling idea and needs plenty of introduction. But my dim recollection of taking it was that it was a bit of a subject heap, with some arithmetic, some number theory, some variables, some geometry. It’s all stuff you’ll need once algebra starts. But it is hard to say quickly what belongs in pre-algebra and what doesn’t.

Art Sansom and Chip Sansom’s The Born Loser for the 26th uses two ancient staples of jokes, probabilities and weather forecasting. It’s a hard joke not to make. The prediction for something is that it’s very unlikely, and it happens anyway? We all laugh at people being wrong, which might be our whistling past the graveyard of knowing we will be wrong ourselves. It’s hard to prove that a probability is wrong, though. A fairly tossed die may have only one chance in six of turning up a ‘4’. But there’s no reason to think it won’t, and nothing inherently suspicious in it turning up ‘4’ four times in a row.

We could do it, though. If the die turned up ‘4’ four hundred times in a row we would no longer call it fair. (This even if examination proved the die really was fair after all!) Or if it just turned up a ‘4’ significantly more often than it should; if it turned up two hundred times out of four hundred rolls, say. But one or two events won’t tell us much of anything. Even the unlikely happens sometimes.

Even the impossibly unlikely happens if given enough attempts. If we do not understand that instinctively, we realize it when we ponder that someone wins the lottery most weeks. Presumably the comic’s weather forecaster supposed the chance of snow was so small it could be safely rounded down to zero. But even something with literally zero percent chance of happening might.

Imagine tossing a fair coin. Imagine tossing it infinitely many times. Imagine it coming up tails every single one of those infinitely many times. Impossible: the chance that at least one toss of a fair coin will turn up heads, eventually, is 1. 100 percent. The chance heads never comes up is zero. But why could it not happen? What law of physics or logic would it defy? It challenges our understanding of ideas like “zero” and “probability” and “infinity”. But we’re well-served to test those ideas. They hold surprises for us.

A Thanksgiving Thought Fresh From The Shower


It’s well-known, at least in calendar-appreciation circles, that the 13th of a month is more likely to be Friday than any other day of the week. That’s on the Gregorian calendar, which has some funny rules about whether a century year — 1900, 2000, 2100 — will be a leap year. Three of them aren’t in every four centuries. The result is the pattern of dates on the calendar is locked into this 400-year cycle, instead of the 28-year cycle you might imagine. And this makes some days of the week more likely for some dates than they otherwise might be.

This got me wondering. Does the 13th being slightly more likely imply that the United States Thanksgiving is more likely to be on the 26th of the month? The current rule is that Thanksgiving is the fourth Thursday of November. We’ll pretend that’s an unalterable fact of nature for the sake of having a problem we can solve. So if the 13th is more likely to be a Friday than any other day of the week, isn’t the 26th more likely to be a Thursday than any other day of the week?

And that’s so, but I’m not quite certain yet. What’s got me pondering this in the shower is that the 13th is more likely a Friday for an arbitrary month. That is, if I think of a month and don’t tell you anything about what it is, all we can say is that the chance of the 13th being a Friday is such-and-such. But if I pick a particular month — say, November 2017 — things are different. The chance the 13th of November, 2017 is a Friday is zero. So the chance the 26th of November, 2017 is a Thursday is zero. Our calendar system sets rules. We’ll pretend that’s an unalterable fact of nature for the sake of having a problem we can solve, too.

So: does knowing that I am thinking of November, rather than a completely unknown month, change the probabilities? And I don’t know. My gut says “it’s plausible the dates of Novembers are different from the dates of arbitrary months”. I don’t know a way to argue this purely logically, though. It might have to be tested by going through 400 years of calendars and counting when the fourth Thursdays are. (The problem isn’t so tedious as that. There are formulas computers are good at which can do this pretty well.)

But I would like to know if it can be argued there’s a difference, or that there isn’t.