## Reading the Comics, May 12, 2018: New Nancy Artist Edition

And now, closer to deadline than I like, let me wrap up last week’s mathematically-themed comic strips. I had a lot happening, that’s all I can say.

Glenn McCoy and Gary McCoy’s The Flying McCoys for the 10th is another tragic moment in the mathematics department. I’m amused that white lab coats are taken to read as “mathematician”. There are mathematicians who work in laboratories, naturally. Many interesting problems are about real-world things that can be modelled and tested and played with. It’s hardly the mathematics-department uniform, but then, I’m not sure mathematicians have a uniform. We just look like academics is all.

It also shows off that motif of mathematicians as doing anything with numbers in a more complicated way than necessary. I can’t imagine anyone in an emergency trying to evoke 9-1-1 by solving any kind of puzzle. But comic strip characters are expected to do things at least a bit ridiculously. I suppose.

Mark Litzler’s Joe Vanilla for the 11th is about random numbers. We need random numbers; they do so much good. Getting them is hard. People are pretty lousy at picking random numbers in their head. We can say what “lousy” random numbers look like. They look wrong. There’s digits that don’t get used as much as the others do. There’s strings of digits that don’t get used as much as other strings of the same length do. There are patterns, and they can be subtle ones, that just don’t look right.

And yet we have a terrible time trying to say what good random numbers look like. Suppose we want to have a string of random zeroes and ones: is 101010 better or worse than 110101? Or 000111? Well, for a string of digits that short there’s no telling. It’s in big batches that we should expect to see no big patterns. … Except that occasionally randomness should produce patterns. How often should we expect patterns, and of what size? This seems to depend on what patterns we’ve found interesting enough to look for. But how can the cultural quirks that make something seem interesting be a substantial mathematical property?

Olivia Jaimes’s Nancy for the 11th uses mathematics-assessment tests for its joke. It’s of marginal relevance, yes, but it does give me a decent pretext to include the new artist’s work here. I don’t know how long the Internet is going to be interested in Nancy. I have to get what attention I can while it lasts.

Scott Hilburn’s The Argyle Sweater for the 12th is the anthropomorphic-geometry joke for the week. Unless there was one I already did Sunday that I already forgot. Oh, no, that was anthropomorphic-numerals. It’s easy to see why a circle might be labelled irrational: either its radius or its area has to be. Both can be. The triangle, though …

Well, that’s got me thinking. Obviously all the sides of a triangle can be rational, and so its perimeter can be too. But … the area of an equilateral triangle is $\frac{1}{4}\sqrt{3}$ times the square of the length of any side. It can have a rational side and an irrational area, or vice-versa. Just as the circle can. If it’s not an equilateral triangle?

Can you have a triangle that has three rational sides and a rational area? And yes, you can. Take the right triangle that has sides of length 5, 12, and 13. Or any scaling of that, larger or smaller. There is indeed a whole family of triangles, the Heronian Triangles. All their sides are integers, and their areas are integers too. (Sides and areas rational are just as good as sides and areas integers. If you don’t see why, now you see why.) So there’s that at least. The name derives from Heron/Hero, the ancient Greek mathematician whom we credit with that snappy formula that tells us, based on the lengths of the three sides, what the area of the triangle is. Not the Pythagorean formula, although you can get the Pythagorean formula from it.
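Heron’s formula is easy to try out. Here is a minimal sketch; the function name `heron_area` is just a label picked for this example, and the 13-14-15 triangle is a second Heronian triangle added to show the right angle isn’t needed.

```python
import math

def heron_area(a, b, c):
    """Area of a triangle from its three side lengths (Heron's formula)."""
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

# The 5-12-13 right triangle: should agree with half of 5 times 12.
print(heron_area(5, 12, 13))

# 13-14-15 has integer sides and an integer area, with no right angle.
print(heron_area(13, 14, 15))
```

For a right triangle with legs a and b the formula has to agree with half of a times b, which is the thread connecting it back to the Pythagorean relation.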

Still, I’m going to bet that there’s some key measure of even a Heronian Triangle that ends up being irrational. Interior angles, most likely. And there are many ways to measure triangles; they can’t all end up being rational at once. There are over two thousand ways to define a “center” of a triangle, for example. The odds of hitting a rational number on all of them at once? (Granted, most of these triangle centers are unknown except to the center’s discoverer/definer and that discoverer’s proud but baffled parents.)

Carla Ventresca and Henry Beckett’s On A Claire Day for the 12th mentions taking classes in probability and statistics. They’re the classes nobody doubts are useful in the real world. It’s easy to figure probability is more likely to be needed than functional analysis on some ordinary day outside the university. I can’t even compose that last sentence without the language of probability.

I’d kind of agree with calling the courses intense, though. Well, “intense” might not be the right word. But challenging. Not that you’re asked to prove anything deep. The opposite, really. An introductory course in either provides a lot of tools. Many of them require no harder arithmetic work than multiplication, division, and the occasional square root. But you do need to learn which tool to use in which scenario. And there’s often not the sorts of proofs that make it easy to understand which tool does what. Doing the proofs would require too much fussing around. Many of them demand settling finicky little technical points that take you far from the original questions. But that leaves the course as this archipelago of small subjects, each easy in itself. But the connections between them are obscured. Is that better or worse? It must depend on the person hoping to learn.

## Someone Else’s Homework: A Probability Question

My friend’s finished the last of the exams and been happy with the results. And I’m stuck thinking harder about a little thing that came across my Twitter feed last night. So let me share a different problem that we had discussed over the term.

It’s a probability question. Probability’s a great subject. So much of what people actually do involves estimating probabilities and making judgements based on them. In real life, yes, but also for fun. Like a lot of probability questions, this one is abstracted into a puzzle that’s nothing like anything anybody does for fun. But that makes it practical, anyway.

So. You have a bowl with fifteen balls inside. Five of the balls are labelled ‘1’. Five of the balls are labelled ‘2’. Five of the balls are labelled ‘3’. The balls are well-mixed, which is how mathematicians say that all of the balls are equally likely to be drawn out. Three balls are picked out, without being put back in. What’s the probability that the three balls have values which, together, add up to 6?

My friend’s instincts about this were right, knowing what things to calculate. There was part of actually doing one of these calculations that went wrong. And was complicated by my making a dumb mistake in my arithmetic. Fortunately my friend wasn’t shaken by my authority, and we got to what we’re pretty sure is the right answer.
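If you’d rather check an answer by brute force than by trusting anyone’s arithmetic, the bowl is small enough to enumerate. This is only a sketch: it treats the fifteen balls as distinguishable and every three-ball draw as equally likely, which is what “well-mixed” is taken to mean here. The variable names are made up for the example.

```python
from fractions import Fraction
from itertools import combinations

# Fifteen balls: five labelled 1, five labelled 2, five labelled 3.
balls = [1] * 5 + [2] * 5 + [3] * 5

# Every way to pull three distinct balls out of the bowl, order ignored.
draws = list(combinations(range(len(balls)), 3))

# Count the draws whose labels add up to 6.
hits = sum(1 for draw in draws if sum(balls[i] for i in draw) == 6)

prob_sum_six = Fraction(hits, len(draws))
print(hits, "of", len(draws), "draws, or", prob_sum_six)
```

Only two kinds of draw can add to 6: one ball of each label, or three balls labelled ‘2’. That makes the count easy to confirm by hand.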

## Did The Greatest Generation Hosts Get As Drunk As I Expected?

I finally finished listening to Benjamin Ahr Harrison and Adam Pranica’s Greatest Generation podcast reviews of the first season of Star Trek: Deep Space Nine. (We’ve had fewer long car trips for this.) So I can return to my projection of how their drinking game would turn out.

Their plan was to make the discussion of some of Deep Space Nine’s episodes more exciting by recording their reviews while drinking a lot. The plan was, for the fifteen episodes they had in the season, there would be a one-in-fifteen chance of doing any particular episode drunk. So how many drunk episodes would you expect to get, on this basis?

It’s a well-formed expectation value problem. There could be as few as zero or as many as fifteen, but some cases are more likely than others. Each episode could be recorded drunk or not-drunk. There’s an equal chance of each episode being recorded drunk. Whether one episode is drunk or not doesn’t depend on whether the one before was, and doesn’t affect whether the next one is. (I’ll come back to this.)

The most likely case was for there to be one drunk episode. The probability of exactly one drunk episode was a little over 38%. No drunk episodes was also a likely outcome. There was a better than 35% chance it would never have turned up. The chance of exactly two drunk episodes was about 19%. Three drunk episodes had a slightly less than 6% chance of happening. Four drunk episodes had a slightly more than 1% chance of happening. And after that you get into the deeply unlikely cases.
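Those percentages come from the binomial rule for fifteen independent one-in-fifteen chances. As a worked check, with $P(k)$ as shorthand introduced just here for the chance of exactly $k$ drunk episodes:

$P(k) = \binom{15}{k} \left(\frac{1}{15}\right)^k \left(\frac{14}{15}\right)^{15 - k}$

$P(0) = \left(\frac{14}{15}\right)^{15} \approx 0.355, \qquad P(1) = 15 \cdot \frac{1}{15} \cdot \left(\frac{14}{15}\right)^{14} \approx 0.381, \qquad P(2) = 105 \cdot \frac{1}{225} \cdot \left(\frac{14}{15}\right)^{13} \approx 0.190$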

As the Deep Space Nine season turned out, this one-in-fifteen chance came up twice. It turned out they sort of did three drunk episodes, though. One of the drunk episodes turned out to be the first of two they planned to record that day. I’m not sure why they didn’t just swap what episode they recorded first, but I trust they had logistical reasons. As often happens with probability questions, the independence of events — whether a success for one affects the outcome of another — changes calculations.

There’s not going to be a second-season update to this. They’ve chosen to make a more elaborate recording game of things. They’ve set up a modified Snakes and Ladders type board with a handful of spots marked for stunts. Some sound like fun, such as recording without taking any notes about the episode. Some are, yes, drinking episodes. But this is all a very different and more complicated thing to project. If I were going to tackle that it’d probably be by running a bunch of simulations and taking averages from that.

Also I trust they’ve been warned about the episode where Quark has a sex change so he can meet a top Ferengi soda magnate after accidentally giving his mother a heart attack because gads but that was a thing that happened somehow.

I was all set to say how complaining about GoComics.com’s pages not loading had gotten them fixed. But they only worked for Monday alone; today they’re broken again. Right. I haven’t tried sending an error report again; we’ll see if that works. Meanwhile, I’m still not through last week’s comic strips, and I had just enough from one day to nearly justify an installment for that day alone. Should finish off the rest of the week next essay, probably in time for next week.

Mark Leiknes’s Cow and Boy rerun for the 23rd circles around some of Zeno’s Paradoxes. At the heart of some of them is the question of whether a thing can be divided infinitely many times, or whether there must be some smallest amount of a thing. Zeno wonders about space and time, but you can do as well with substance, with matter. Mathematics majors like to say the problem is easy; Zeno just didn’t realize that a sum of infinitely many things could be a finite and nonzero number. This misses the good question of how the sum of infinitely many things, none of which are zero, can be anything but infinitely large? Or, put another way, what’s different in adding $\frac11 + \frac12 + \frac13 + \frac14 + \cdots$ and adding $\frac11 + \frac14 + \frac19 + \frac{1}{16} + \cdots$ that the one is infinitely large and the other not?
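One way to feel the difference is to watch the partial sums. This is a rough numerical sketch, not a proof; `partial_sum` is a helper named for this example, and the comparison value $\frac{\pi^2}{6}$ is the known limit of the sum of inverse squares.

```python
import math

def partial_sum(term, n_terms):
    """Add up term(1) + term(2) + ... + term(n_terms)."""
    return sum(term(k) for k in range(1, n_terms + 1))

for n in (10, 1_000, 100_000):
    harmonic = partial_sum(lambda k: 1 / k, n)
    squares = partial_sum(lambda k: 1 / k**2, n)
    print(f"{n:>7} terms: harmonic = {harmonic:.4f}, inverse squares = {squares:.6f}")

# The inverse-squares sum creeps up on this value and never passes it.
print(f"pi^2 / 6 = {math.pi**2 / 6:.6f}")
```

The harmonic column keeps climbing, if slowly. The inverse-squares column stalls out after the first few digits.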

Or how about this. Pick your favorite string of digits. 23. 314. 271828. Whatever. Add together the series $\frac11 + \frac12 + \frac13 + \frac14 + \cdots$ except that you omit any terms that have your favorite string there. So, if you picked 23, don’t add $\frac{1}{23}$, or $\frac{1}{123}$, or $\frac{1}{802301}$ or such. That depleted series does converge. The heck is happening there? (Here’s why it’s true for a single digit being thrown out. Showing it’s true for longer strings of digits takes more work but not really different work.)

J C Duffy’s Lug Nuts for the 23rd is, I think, the first time I have to give a content warning for one of these. It’s a porn-movie advertisement spoof. But it mentions Einstein and Pi and has the tagline “she didn’t go for eggheads … until he showed her a new equation!”. So, you know, it’s using mathematics skill as a signifier of intelligence and riffing on the idea that nerds like sex too.

John Graziano’s Ripley’s Believe It or Not for the 23rd has a piece of trivia that made me initially think “not”. It notes Vince Parker, Senior and Junior, of Alabama were both born on Leap Day, the 29th of February. I’ll accept this without further proof because of the very slight harm that would befall me were I to accept this wrongly. But it also asserted this was a 1-in-2.1-million chance. That sounded wrong. Whether it is depends on what you think the chance is of.

Because what’s the remarkable thing here? That a father and son have the same birthday? Surely the chance of that is 1 in 365. The father could be born any day of the year; the son, also any day. Trusting there’s no influence of the father’s birthday on the son’s, then, 1 in 365 it is. Or, well, 1 in about 365.25, since there are leap days. There’s approximately one leap day every four years, so, surely that, right?

And not quite. In four years there’ll be 1,461 days. Four of them will be the 29th of January and four the 29th of September and four the 29th of August and so on. So if the father was born any day but leap day (a “non-bissextile day”, if you want to use a word that starts a good fight in a Scrabble match), the chance the son’s birth is the same is 4 chances in 1,461. 1 in 365.25. If the father was born on Leap Day, then the chance the son was born the same day is only 1 chance in 1,461. Still way short of 1-in-2.1-million. So, Graziano’s Ripley’s is wrong if that’s the chance we’re looking at.

Ah, but what if we’re looking at a different chance? What if we’re looking for the chance that the father is born the 29th of February and the son is also born the 29th of February? There’s a 1-in-1,461 chance the father’s born on Leap Day. And a 1-in-1,461 chance the son’s born on Leap Day. And if those events are independent, the father’s birth date not influencing the son’s, then the chance of both those together is indeed 1 in 2,134,521. So Graziano’s Ripley’s is right if that’s the chance we’re looking at.
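The arithmetic for both readings fits in a few lines. This sketch makes the same simplifying assumptions as the reasoning above: a plain four-year cycle of 1,461 days (no century exceptions), births equally likely on every day, and the son’s birthday independent of the father’s. The variable names are invented for the example.

```python
from fractions import Fraction

DAYS_IN_CYCLE = 4 * 365 + 1  # 1,461 days in a simple four-year cycle

p_leap_day = Fraction(1, DAYS_IN_CYCLE)       # born on the 29th of February
p_ordinary_date = Fraction(4, DAYS_IN_CYCLE)  # born on any one other given date

# Reading 1: the father's birthday is given; chance the son matches it.
print("son matches an ordinary date:", p_ordinary_date)
print("son matches Leap Day:", p_leap_day)

# Reading 2: father AND son both land on Leap Day.
p_both_leap_day = p_leap_day * p_leap_day
print("both born on Leap Day:", p_both_leap_day)

# Extra (not in the strip): chance a father and son share any birthday at all.
p_share_any = 365 * p_ordinary_date**2 + p_leap_day**2
print("share some birthday:", p_share_any, "or about 1 in", round(1 / p_share_any, 2))
```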

Which is a good reminder: if you want to work out the probability of some event, work out precisely what the event is. Ordinary language is ambiguous. This is usually a good thing. But it’s fatal to discussing probability questions sensibly.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 23rd presents his mathematician discovering a new set of numbers. This will happen. Mathematics has had great success, historically, finding new sets of things that look only a bit like numbers as they were understood. And showing that if they follow rules that are, as much as possible, like the old numbers, we get useful stuff out of them. The mathematician claims to be a formalist, in the punch line. This is a philosophy that considers mathematical results to be the things you get by starting with some symbols and some rules for manipulating them. What this stuff means, and whether it reflects anything of interest in the real world, isn’t of interest. We can know the results are good because they follow the rules.

This sort of approach can be fruitful. It can force you to accept results that are true but intuition-defying. And it can give results impressive confidence. You can even, at least in principle, automate the creating and the checking of logical proofs. The disadvantages are that it takes forever to get anything done. And it’s hard to shake the idea that we ought to have some idea what any of this stuff means.

## Reading the Comics, December 9, 2017: Zach Weinersmith Wants My Attention Edition

If anything dominated the week in mathematically-themed comic strips it was Zach Weinersmith’s Saturday Morning Breakfast Cereal. I don’t know how GoComics selects the strips to (re?)print on their site. But there were at least four that seemed on-point enough for me to mention. So, okay. He’s got my attention. What’s he do with it?

On the 3rd of December is a strip I can say is about conditional probability. The mathematician might be right that the chance someone will be murdered by a serial killer is less than one in ten million. But that is the chance of someone drawn from the whole universe of human experiences. There are people who will never be near a serial killer, for example, or who never come to his attention or who evade his interest. But if we know someone is near a serial killer, or does attract his interest? The information changes the probability. And this is where you get all those counter-intuitive and somewhat annoying logic puzzles about, like, the chance someone’s other child is a girl if the one who just walked in was, and how that changes if you’re told whether the girl who just entered was the elder.
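To see information changing a probability, here is a sketch of one textbook version of the two-child puzzle. It is an assumed variant, not necessarily the one any particular puzzle book uses: girls and boys are equally likely, the two children are independent, and we compare being told “at least one is a girl” against being told “the elder is a girl”. The helper name `chance_both_girls` is made up.

```python
from fractions import Fraction
from itertools import product

# All (elder, younger) combinations, assumed equally likely.
families = list(product("GB", repeat=2))

def chance_both_girls(condition):
    """Chance of two girls among the families consistent with what we're told."""
    matching = [family for family in families if condition(family)]
    both_girls = [family for family in matching if family == ("G", "G")]
    return Fraction(len(both_girls), len(matching))

at_least_one_girl = chance_both_girls(lambda family: "G" in family)
elder_is_girl = chance_both_girls(lambda family: family[0] == "G")

print(at_least_one_girl, elder_is_girl)  # 1/3 1/2
```

Same families, same children, and the answer moves from one-in-three to one-in-two only because what we’ve been told is different.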

On the 5th is a strip about sequences. And built on the famous example of exponential growth from doubling a reward enough times. Well, you know these things never work out for the wise guy. The “Fibonacci Spiral” spoken of in the next-to-last panel is a spiral, like you figure. The dimensions of the spiral are based on those of golden-ratio rectangles. It looks a great deal like a logarithmic spiral to the untrained eye. Also to the trained eye, but you knew that. I think it’s supposed to be humiliating that someone would call such a spiral “random”. But I admit I don’t get that part.

The strip for the 6th has a more implicit mathematical content. It hypothesizes that mathematicians, given the chance, will be more interested in doing recreational puzzles than even in eating and drinking. It’s amusing, but I’ll admit I’ve found very few puzzles all that compelling. This isn’t to say there aren’t problems I keep coming back to because I’m curious about them, just that they don’t overwhelm my common sense. Don’t ask me when I last received actual pay for doing something mathematical.

And then on the 9th is one more strip, about logicians. And logic puzzles, such as you might get in a Martin Gardner collection. The problem is written out on the chalkboard with some shorthand logical symbols. And they’re symbols both philosophers and mathematicians use. The letter that looks like a V with a crossbar means “for all”. (The mnemonic I got was “it’s an A-for-all, upside-down”. This paired with the other common symbol, which looks like a backwards E and means “there exists”: “E-for-exists, backwards”. Later I noticed upside-down A and backwards E could both be just 180-degree-rotated A and E. But try saying “180-degree-rotated” in a quick way.) The curvy E between the letters ‘x’ and ‘S’ means “belongs to the set”. So that first line says “for all x that belong to the set S this follows”. Writing out “isLiar(x)” instead of, say, “L(x)”, is more a philosopher’s thing than a mathematician’s. But it wouldn’t throw anyone. And the T just means emphasizing that this is true.

And that is as much about Saturday Morning Breakfast Cereal as I have to say this week.

Sam Hurt’s Eyebeam for the 4th tells a cute story about twins trying to explain infinity to one another. I’m not sure I can agree with the older twin’s assertion that infinity means there’s no biggest number. But that’s just because I worry there’s something imprecise going on there. I’m looking forward to the kids learning about negative numbers, though, and getting to wonder what’s the biggest negative real number.

Percy Crosby’s Skippy for the 4th starts with Skippy explaining a story problem. One about buying potatoes, in this case. I’m tickled by how cranky Skippy is about boring old story problems. Motivation is always a challenge. The strip originally ran the 7th of October, 1930.

Dave Whamond’s Reality Check for the 6th uses a panel of (gibberish) mathematics as an example of an algorithm. Algorithms are mathematical, in origin at least. The word comes to us from the 9th century Persian mathematician Al-Khwarizmi’s text about how to calculate. The modern sense of the word comes from trying to describe the methods by which a problem can be solved. So, legitimate use of mathematics to show off the idea. The symbols still don’t mean anything.

Rick Detorie’s One Big Happy for the 7th has Joe trying to get his mathematics homework done at the last minute. … And it’s caused me to reflect on how twenty multiplication problems seems like a reasonable number to do. But there’s only fifty multiplications to even do, at least if you’re doing the times tables up to the 10s. No wonder students get so bored seeing the same problems over and over. It’s a little less dire if you’re learning times tables up to the 12s, but not that much better. Yow.

Olivia Walch’s Imogen Quest for the 8th looks pretty legitimate to me. It’s going to read as gibberish to people who haven’t done parametric functions, though. Start with the plane and the familiar old idea of ‘x’ and ‘y’ representing how far one is along a horizontal and a vertical direction. Here, we’re given a dummy variable ‘t’, and functions to describe a value for ‘x’ and ‘y’ matching each value of ‘t’. The plot then shows all the points that ever match a pair of ‘x’ and ‘y’ coordinates for some ‘t’. The top drawing is a shape known as the cardioid, because it kind of looks like a Valentine-heart. The lower figure is a much more complicated parametric equation. It looks more anatomically accurate,

Still no sign of Mark Anderson’s Andertoons and the drought is worrying me, yes.

But they’re still going on the cartoonist’s web site, so there’s that.

## How Drunk Can We Expect The Greatest Generation Podcast Hosts To Get?

Among my entertainments is listening to the Greatest Generation podcast, hosted by Benjamin Ahr Harrison and Adam Pranica. They recently finished reviewing all the Star Trek: The Next Generation episodes, and have started Deep Space Nine. To add some fun and risk to episode podcasts the hosts proposed to record some episodes while drinking heavily. I am not a fan of recreational over-drinking, but I understand their feelings. There’s an episode where Quark has a sex-change operation because he gave his mother a heart attack right before a politically charged meeting with a leading Ferengi soda executive. Nobody should face that mess sober.

At the end of the episode reviewing “Babel”, Harrison proposed: there’s 15 episodes left in the season. Use a random number generator to pick a number from 1 to 15; if it’s one, they do the next episode (“Captive Pursuit”) drunk. And it was; what are the odds? One in fifteen. I just said.

The question: how many episodes would they be doing drunk? As they discussed in the next episode, this would imply they’d always get smashed for the last episode of the season. This is a straightforward expectation-value problem. The expectation value of a thing is the sum of all the possible outcomes times the chance of each outcome. Here, the possible outcome is adding 1 to the number of drunk episodes. The chance of any particular episode being a drunk episode is 1 divided by ‘N’, if ‘N’ is the number of episodes remaining. So the next-to-the-last episode has 1 chance in 2 of being drunk. The second-from-the-last has 1 chance in 3 of being drunk. And so on.

This expectation value isn’t hard to calculate. If we start counting from the last episode of the season, then it’s easy. Add up $1 + \frac12 + \frac13 + \frac14 + \frac15 + \frac16 + \cdots$, ending when we get up to one divided by the number of episodes in the season. 25 or 26, for most seasons of Deep Space Nine. 15, from when they counted here. This is the start of the harmonic series.

The harmonic series gets taught in sequences and series in calculus because it does some neat stuff if you let it go on forever. For example, every term in this sequence gets smaller and smaller. (The “sequence” is the terms that go into the sum: $1, \frac12, \frac13, \frac14, \cdots, \frac{1}{1054}, \cdots, \frac{1}{2038}$, and so on. The “series” is the sum of a sequence, a single number. I agree it seems weird to call a “series” that sum, but it’s the word we’re stuck with. If it helps, consider: when we talk about “a TV series” we usually mean the whole body of work, not individual episodes.) You can pick any number, however tiny you like. I can then respond with the last term in the sequence bigger than your number. Infinitely many terms in the sequence will be smaller than your pick. And yet: you can pick any number you like, however big. And I can take a finite number of terms in this sequence to make a sum bigger than whatever number you liked. The sum will eventually be bigger than 10, bigger than 100, bigger than a googolplex. These two facts are easy to prove, but they seem like they ought to be contradictory. You can see why infinite series are fun and produce much screaming on the part of students.

No Star Trek show has a season with infinitely many episodes, though, however long the second season of Enterprise seemed to drag out. So we don’t have to worry about infinitely many drunk episodes.

Since there were 15 episodes up for drunkenness in the first season of Deep Space Nine the calculation’s easy. I still did it on the computer. For the first season we could expect $1 + \frac12 + \frac13 + \cdots + \frac{1}{15}$ drunk episodes. This is a number a little bigger than 3.318. So, most likely three drunk episodes, with four being plausible. For the 25-episode seasons (seasons four and seven, if I’m reading this right), we could expect $1 + \frac12 + \frac13 + \cdots + \frac{1}{25}$ or just under 3.816 drunk episodes. Likely four, maybe three. For the 26-episode seasons (seasons two, five, and six), we could expect $1 + \frac12 + \frac13 + \cdots + \frac{1}{26}$ drunk episodes. That’s just over 3.854.
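Letting the computer do it can look something like this. It’s a minimal sketch; the function name `harmonic` is just what I’d call it, and exact fractions are used so no rounding creeps in until the final print.

```python
from fractions import Fraction

def harmonic(n):
    """Sum 1 + 1/2 + 1/3 + ... + 1/n exactly."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Expected drunk episodes under the 1-in-(episodes remaining) rule.
for episodes in (15, 25, 26):
    print(f"{episodes} episodes: {float(harmonic(episodes)):.4f}")
```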

The number of drunk episodes to expect keeps growing. The harmonic series grows without bound. But it keeps growing slower, compared to the number of terms you add together. You need a 31-episode season to be able to expect four drunk episodes. To expect five drunk episodes you’d need an 83-episode season. If the guys at Worst Episode Ever, reviewing The Simpsons, did all 625-so-far episodes by this rule we could only expect seven drunk episodes.
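Those season lengths can be found by adding terms until the running total reaches the target. A sketch, with `episodes_needed` as a name invented for the purpose; it uses ordinary floating point, which is plenty accurate at these sizes.

```python
def episodes_needed(target):
    """Smallest season length whose harmonic sum reaches `target`."""
    total, n = 0.0, 0
    while total < target:
        n += 1
        total += 1 / n
    return n

for drunk_episodes in (4, 5, 7):
    print(drunk_episodes, "expected drunk episodes needs", episodes_needed(drunk_episodes), "episodes")
```

Each extra expected drunk episode costs a bit under three times as many episodes as the one before, which is the slow growth at work.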

Still, three, maybe four, drunk episodes of the 15 remaining in the first season is a fair number. They aren’t likely to be evenly spaced. The chance of a drunk episode rises the closer they get to the end of the season. Expected length between drunk episodes is interesting but I don’t want to deal with that. I’ll just say that it probably isn’t the five episodes that the quickest, easiest estimate, taking 15 divided by 3, would suggest.
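For the curious, the whole spread of outcomes under this shrinking-odds rule can be tallied exactly, one episode at a time, rather than guessed at. This is a sketch under the assumption that each episode’s draw is independent of the others; `drunk_count_distribution` is a name made up for the example.

```python
from fractions import Fraction

def drunk_count_distribution(chances):
    """Chance of each total drunk count, given independent per-episode chances."""
    dist = [Fraction(1)]  # before any episode: zero drunk episodes, for certain
    for p in chances:
        updated = [Fraction(0)] * (len(dist) + 1)
        for count, prob in enumerate(dist):
            updated[count] += prob * (1 - p)   # this episode recorded sober
            updated[count + 1] += prob * p     # this episode recorded drunk
        dist = updated
    return dist

# With n episodes remaining, the chance of a drunk episode is 1-in-n.
chances = [Fraction(1, remaining) for remaining in range(15, 0, -1)]
dist = drunk_count_distribution(chances)

for count, prob in enumerate(dist):
    print(count, f"{float(prob):.3f}")
```

The zero-drunk-episodes row comes out as exactly zero, since the last episode of the season is a 1-in-1 chance.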

And it’s moot anyway. The hosts discussed it just before starting “Captive Pursuit”. Pranica pointed out, for example, the smashed-last-episode problem. What they decided they meant was there would be a 1-in-15 chance of recording each episode this season drunk. For the 25- or 26-episode seasons, each episode would get its 1-in-25 or 1-in-26 chance.

That changes the calculations. Not in spirit: that’s still the same. Count the number of possible outcomes and the chance of each one being a drunk episode and add that all up. But the work gets simpler. Each episode has a 1-in-15 chance of adding 1 to the total of drunk episodes. So the expected number of drunk episodes is the number of episodes (15) times the chance each is a drunk episode (1 divided by 15). We should expect 1 drunk episode. The same reasoning holds for all the other seasons; we should expect 1 drunk episode per season.

Still, since each episode gets an independent draw, there might be two drunk episodes. Could be three. There’s no reason that all 15 couldn’t be drunk. (Except that at the end of reviewing “Captive Pursuit” they drew for the next episode and it’s not to be a drunk one.) What are the chances there’s no drunk episodes? What are the chances there’s two, or three, or eight drunk episodes?

There’s a rule for this. This kind of problem is a mathematically-famous one. We get our results from the “binomial distribution”. This applies whenever there’s a bunch of attempts at something. And each attempt can either clearly succeed or clearly fail. And the chance of success (or failure) on each attempt is always the same. That’s what applies here. If there’s ‘N’ episodes, and the chance is ‘p’ that any one will be drunk, then we get the chance ‘y’ of turning up exactly ‘k’ drunk episodes by the formula:

$y = \frac{N!}{k! \cdot \left(N - k\right)!} p^k \left(1 - p\right)^{N - k}$

That looks a bit ugly, yeah. (I don’t like using ‘y’ as the name for a probability. I ran out of good letters and didn’t want to do subscripts.) It’s just tedious to calculate is all. Factorials and everything. Better to let the computer work it out. There is a formula that’s easy enough to work with, though. That’s because the chance of a drunk episode is the same each episode. I don’t know a formula to get the chance of exactly zero or one or four drunk episodes with the first, one-in-N chance. Probably the only thing to do is run a lot of simulations and trust that’s approximately right.

But for this rule it’s easy enough. There’s this formula, like I said. I figured out the chance of all the possible drunk episode combinations for the seasons. I mean I had the computer work it out. All I figured out was how to make it give me the results in a format I liked. Here’s what I got.

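The chance of this many drunk episodes in a 15-episode season is:

| Drunk episodes | Chance |
| --- | --- |
| 0 | 0.355 |
| 1 | 0.381 |
| 2 | 0.190 |
| 3 | 0.059 |
| 4 | 0.013 |
| 5 | 0.002 |
| 6 | 0.000 |
| 7 | 0.000 |
| 8 | 0.000 |
| 9 | 0.000 |
| 10 | 0.000 |
| 11 | 0.000 |
| 12 | 0.000 |
| 13 | 0.000 |
| 14 | 0.000 |
| 15 | 0.000 |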

Sorry it’s so dull, but the chance of a one-in-fifteen event happening 15 times in a row? You’d expect that to be pretty small. It’s got a probability of something like 0.000 000 000 000 000 002 28 of happening. Not technically impossible, but yeah, impossible.
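For anyone who wants to have their own computer work it out, here is a short sketch of the binomial formula. The function name `binomial_pmf` is my label for it, and `math.comb` needs Python 3.8 or later.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Chance of exactly k successes in n independent tries, each with chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# One-in-n chance per episode, for each season length discussed here.
for n in (15, 25, 26):
    p = 1 / n
    row = "  ".join(f"{binomial_pmf(k, n, p):.3f}" for k in range(7))
    print(f"{n} episodes, 0 through 6 drunk: {row}")

# All fifteen episodes drunk: technically possible.
print(binomial_pmf(15, 15, 1 / 15))
```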

How about for the 25- and 26-episode seasons? Here’s the chance of all the outcomes:

The chance of this many drunk episodes in a 25-episode season is:

| Drunk episodes | Chance |
| --- | --- |
| 0 | 0.360 |
| 1 | 0.375 |
| 2 | 0.188 |
| 3 | 0.060 |
| 4 | 0.014 |
| 5 | 0.002 |
| 6 | 0.000 |
| 7 | 0.000 |
| 8 or more | 0.000 |

And things are a tiny bit different for a 26-episode season.

The chance of this many drunk episodes in a 26-episode season is:

| Drunk episodes | Chance |
| --- | --- |
| 0 | 0.361 |
| 1 | 0.375 |
| 2 | 0.188 |
| 3 | 0.060 |
| 4 | 0.014 |
| 5 | 0.002 |
| 6 | 0.000 |
| 7 | 0.000 |
| 8 or more | 0.000 |

Yes, there’s a greater chance of no drunk episodes. The difference is really slight. It only looks so big because of rounding. A no-drunk 25 episode season has a chance of about 0.3604, while a no-drunk 26 episodes season has a chance of about 0.3607. The difference comes from the chance of lots of drunk episodes all being even worse somehow.

And there’s some neat implications through this. There’s a slightly better than one in three chance that each of the second through seventh seasons won’t have any drunk episodes. We could expect two dry seasons, hopefully not the one with Quark’s sex-change episode. We can reasonably expect at least one season with two drunk episodes. There’s a slightly more than 40 percent chance that some season will have three drunk episodes. There’s just under a 10 percent chance some season will have four drunk episodes.

There’s no guarantees, though. Probability has a curious blend. There’s no predicting when any drunk episode will come. But we can make meaningful predictions about groups of episodes. These properties seem like they should be contradictions. And they’re not, and that’s wonderful.

## Reading the Comics, November 25, 2017: Shapes and Probability Edition

This week was another average-grade week of mathematically-themed comic strips. I wonder if I should track them and see what spurious correlations between events and strips turn up. That seems like too much work and there’s better things I could do with my time, so it’s probably just a few weeks before I start doing that.

Ruben Bolling’s Super-Fun-Pax Comics for the 19th is an installment of A Voice From Another Dimension. It’s in that long line of mathematics jokes that are riffs on Flatland, and how we might try to imagine spaces other than ours. They’re taxing things. We can understand some of the rules of them perfectly well. Does that mean we can visualize them? Understand them? I’m not sure, and I don’t know a way to prove whether someone does or does not. This wasn’t one of the strips I was thinking of when I tossed “shapes” into the edition title, but you know what? It’s close enough to matching.

Olivia Walch’s Imogen Quest for the 20th (and I haven’t looked, but it feels to me like I’m always featuring Imogen Quest lately) riffs on the Monty Hall Problem. The problem is based on a game never actually played on Monty Hall’s Let’s Make A Deal, but very like ones they did play. There’s many kinds of games there, but most of them amount to the contestant making a choice, and then being asked to second-guess the choice. In this case, pick a door and then second-guess whether to switch to another door. The Monty Hall Problem is a great one for Internet commenters to argue about while the rest of us do something productive. The trouble (well, one trouble) is that whether switching improves your chance to win the car depends on the rules of the game. It’s not stated, for example, whether the host must open a door showing a goat behind it. It’s not stated that the host certainly knows which doors have goats and so chooses one of those. It’s not certain the contestant even wants a car when, hey, goats. What assumptions you make about these issues affect the outcome.

If you take the assumptions that I would, given the problem — the host knows which door the car’s behind, and always offers the choice to switch, and the contestant would rather have a car, and such — then Walch’s analysis is spot on.
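Those assumptions are easy enough to put to a test. Here’s a minimal simulation sketch, assuming exactly the rules I just listed: the host knows where the car is, always opens a goat door the contestant didn’t pick, and always offers the switch. The function names, the trial count, and the seed are my own inventions for illustration.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumed rule: the host knows where the car is and always opens
    # a goat door that the contestant did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the first pick nor the opened one.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

stay_rate = win_rate(switch=False)
switch_rate = win_rate(switch=True)
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

Under these rules staying should win about a third of the time and switching about two-thirds. Change how `opened` gets chosen, say by letting the host open a door at random, and the answer changes, which is the whole argument.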

Jonathan Mahood’s Bleeker: The Rechargeable Dog for the 20th features a pretend virtual reality arithmetic game. The strip is of incredibly low mathematical value, but it’s one of those comics I like that I never hear anyone talking about, so, here.

Richard Thompson’s Cul de Sac rerun for the 20th talks about shapes. And the names for shapes. It does seem like mathematicians have a lot of names for slightly different quadrilaterals. In our defense, if you’re talking about these a lot, it helps to have more specific names than just “quadrilateral”. Rhombuses are those parallelograms which have all four sides the same length. A parallelogram has to have two pairs of equal-sized legs, but the two pairs’ sizes can be different. Not so a rhombus. Mathworld says a rhombus with a narrow angle that’s 45 degrees is sometimes called a lozenge, but I say they’re fibbing. They make even more preposterous claims on the “lozenge” page.

Todd Clark’s Lola for the 20th does the old “when do I need to know algebra” question and I admit getting grumpy like this when people ask. Do French teachers have to put up with this stuff?

Brian Fies’s Mom’s Cancer rerun for the 23rd is from one of the delicate moments in her story. Fies’s mother just learned the average survival rate for her cancer treatment is about five percent and, after months of things getting haltingly better, is shaken. But as with most real-world probability questions, context matters. The five-percent chance is, as described, the chance someone who’d just been diagnosed in the state she’d been diagnosed in would survive. The information that she’s already survived months of radiation and chemical treatment and physical therapy means they’re now looking at a different question. What is the chance she will survive, given that she has survived this far with this care?

Mark Anderson’s Andertoons for the 24th is the Mark Anderson’s Andertoons for the week. It’s a protesting-student kind of joke. For the student’s question, I’m not sure how many sides a polygon has before we can stop memorizing them. I’d say probably eight. Maybe ten. Of the shapes whose names people actually care about, mm. Circle, triangle, a bunch of quadrilaterals, pentagons, hexagons, octagons, maybe decagon and dodecagon. No, I’ve never met anyone who cared about nonagons. I think we could drop heptagons without anyone noticing either. Among quadrilaterals, ugh, let’s see. Square, rectangle, rhombus, parallelogram, trapezoid (or trapezium), and I guess diamond although I’m not sure what that gets you that rhombus doesn’t already. Toss in circles, ellipses, and ovals, and I think that’s all the shapes whose names you use.

Stephan Pastis’s Pearls Before Swine for the 25th does the rounding-up joke that’s been going around this year. It’s got a new context, though.

## When Is Thanksgiving Most Likely To Happen?

I thought I had written this up. Which is good because I didn’t want to spend the energy redoing these calculations.

Thanksgiving, as observed in the United States, is the fourth Thursday of November. So it might happen anytime from the 22nd through the 28th. But because of the quirks of the Gregorian calendar, it can happen that a particular date, like the 23rd of November, is more or less likely to be a Thursday than some other day of the week.

So here’s the results of what days are most and least likely to be Thanksgiving. It turns out the 23rd, this year’s candidate, is tied for the rarest of Thanksgiving days. It’s not that rare, in comparison. It happens only two fewer times every 400 years than do Thanksgivings on the 22nd of November, the (tied) most common day.
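If you’d like to redo the count yourself, here’s a rough sketch. It assumes the current fourth-Thursday rule applies to every year of a 400-year Gregorian cycle, which is a convenience of mine and not a claim about history. The range of years and the function name are arbitrary choices.

```python
import datetime
from collections import Counter

THURSDAY = 3  # datetime.date.weekday() counts Monday as 0, so Thursday is 3

def thanksgiving_day(year):
    """Day of November that is the fourth Thursday of that year."""
    first_weekday = datetime.date(year, 11, 1).weekday()
    first_thursday = 1 + (THURSDAY - first_weekday) % 7
    return first_thursday + 21  # three weeks after the first Thursday

# Any 400 consecutive years cover one full cycle of the Gregorian calendar.
counts = Counter(thanksgiving_day(year) for year in range(2001, 2401))
for day in sorted(counts):
    print(day, counts[day])
```

The seven counts have to add up to 400, and the gap between the most and least common dates is the two-per-400-years difference mentioned above.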

## Reading the Comics, November 18, 2017: Story Problems and Equation Blackboards Edition

It was a normal-paced week at Comic Strip Master Command. It was also one of those weeks that didn’t have anything from Comics Kingdom or Creators.Com. So I’m afraid you’ll all just have to click the links for strips you want to actually see. Sorry.

Bill Amend’s FoxTrot for the 12th has Jason and Marcus creating “mathic novels”. They, being a couple of mathematically-gifted smart people, credit mathematics knowledge with smartness. A “chiliagon” is a thousand-sided regular polygon that’s mostly of philosophical interest. A regular polygon with a thousand equal sides and a thousand equal angles looks like a circle. There’s really no way to draw one so that the human eye could see the whole figure and tell it apart from a circle. But if you can understand the idea of a regular polygon it seems like you can imagine a chiliagon and see how that’s not a circle. So there’s some really easy geometry things that can’t be visualized, or at least not truly visualized, and just have to be reasoned with.

Rick Detorie’s One Big Happy for the 12th is a story-problem-subversion joke. The joke’s good enough as it is, but the supposition of the problem is that the driver does cover fifty miles in an hour. This may not be the speed the car travels at the whole time of the problem. Mister Green is maybe speeding to make up for all the time spent travelling slower.

Brandon Sheffield and Dami Lee’s Hot Comics for Cool People for the 13th uses a blackboard full of equations to represent the deep thinking being done on a silly subject.

Shannon Wheeler’s Too Much Coffee Man for the 15th also uses a blackboard full of equations to represent the deep thinking being done on a less silly subject. It’s a really good-looking blackboard full of equations, by the way. Beyond the appearance of our old friend $E = mc^2$ there’s a lot of stuff that looks like legitimate quantum mechanics symbols there. They’re at least not obvious nonsense, as best I can tell without the ability to zoom the image in. I wonder if Wheeler didn’t find a textbook and use some problems from it for the feeling of authenticity.

Samson’s Dark Side of the Horse for the 16th is a story-problem subversion joke.

Jef Mallett’s Frazz for the 18th talks about making a bet on the World Series, which wrapped up a couple weeks ago. It raises the question: can you bet on an already known outcome? Well, sure, you can bet on anything you like, given a willing partner. But there does seem to be something fundamentally different between betting on something whose outcome isn’t in principle knowable, such as the winner of the next World Series, and betting on something that could be known but happens not to be, such as the winner of the last. We see this expressed in questions like “is it true the 13th of a month is more likely to be Friday than any other day of the week?” If you know which month and year is under discussion the chance the 13th is Friday is either 1 or 0. But we mean something more like, if we don’t know what month and year it is, what’s the chance this is a month with a Friday the 13th? Something like this is at work in this World Series bet. (The Astros won the recently completed World Series.)

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 18th is also featured on some underemployed philosopher’s “Reading the Comics” WordPress blog and fair enough. Utilitarianism exists in an odd triple point, somewhere on the borders of ethics, economics, and mathematics. The idea that one could quantify the good or the utility or the happiness of society, and study how actions affect it, is a strong one. It fits very well the modern mindset that holds everything can be quantified even if we don’t know how to do it well just yet. And it appeals strongly to a mathematically-minded person since it sounds like pure reason. It’s not, of course, any more than any ethical scheme can be. But it sounds like the ethics a Vulcan would come up with and that appeals to a certain kind of person. (The comic is built on one of the implications of utilitarianism that makes it seem like the idea’s gone off the rails.)

There’s some mathematics symbols on The Utilitarian’s costume. The capital U on his face is probably too obvious to need explanation. The $\sum u$ on his chest relies on some mathematical convention. For maybe a quarter-millennium now mathematicians have been using the capital sigma to mean “take a sum of things”. The things are whatever the expression after that symbol is. Usually, the sigma will have something below and above which carries meaning. It says what the index is for the thing after the symbol, and what the bounds of the index are. Here, it’s not set. This is common enough, though, if this is understood from context. Or if it’s obvious. The small ‘u’ to the right suggests the utility of whatever’s thought about. (“Utility” being the name for the thing measured and maximized; it might be happiness, it might be general well-being, it might be the number of people alive.) So the symbols would suggest “take the sum of all the relevant utilities”. Which is the calculation that would be done in this case.
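For comparison, here is roughly what the fully dressed version of that sum would look like. The index $i$, the upper bound $N$, and the subscripted $u_i$ are names I’m making up for illustration; nothing like them appears on the costume.

$\sum_{i = 1}^{N} u_i = u_1 + u_2 + \cdots + u_N$

The part below the sigma names the index and says where it starts. The part above says where it stops. Leave both off, as the costume does, and the reader is trusted to sum over everything relevant.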

## Reading the Comics, September 29, 2017: Anthropomorphic Mathematics Edition

The rest of last week had more mathematically-themed comic strips than Sunday alone did. As sometimes happens, I noticed an objectively unimportant detail in one of the comics and got to thinking about it. Whether I could solve the equation as posted, or whether at least part of it made sense as a mathematics problem. Well, you’ll see.

Patrick McDonnell’s Mutts for the 25th of September I include because it’s cute and I like when I can feature some comic in these roundups. Maybe there’s some discussion that could be had about what “equals” means in ordinary English versus what it means in mathematics. But I admit that’s a stretch.

Olivia Walch’s Imogen Quest for the 25th uses, and describes, the mathematics of a famous probability problem. This is the surprising result of how few people you need to have a 50 percent chance that some pair of people have a birthday in common. It then goes over to some other probability problems. The examples are silly. But the reasoning is sound. And the approach is useful. To find the chance something happens it’s often easiest to work out the chance it doesn’t. Which is as good as knowing the chance it does, since a thing can either happen or not happen. At least in probability problems, which define “thing” and “happen” so there’s not ambiguity about whether it happened or not.
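The work-out-the-chance-it-doesn’t trick fits in a few lines. This is a sketch under the usual textbook assumptions, which are mine and not necessarily the strip’s: 365 equally likely birthdays, no leap days, and nobody’s birthday depending on anybody else’s.

```python
def chance_shared_birthday(people, days=365):
    """Chance that at least two of `people` share a birthday."""
    # Easier to find the chance that everyone's birthday is different,
    # then subtract that from 1.
    p_all_different = 1.0
    for k in range(people):
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different

# Smallest group with at least an even chance of a shared birthday.
smallest = next(n for n in range(1, 366) if chance_shared_birthday(n) >= 0.5)
print(smallest, round(chance_shared_birthday(smallest), 4))
```

The group size this turns up is the famous 23, where the chance of a match just clears one-half.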

Piers Baker’s Ollie and Quentin rerun for the 26th I’m pretty sure I’ve written about before, although back before I included pictures of the Comics Kingdom strips. (The strip moved from Comics Kingdom over to GoComics, which I haven’t caught removing old comics from their pages.) Anyway, it plays on a core piece of probability. It sets out the world as things, “events”, that can have one of multiple outcomes, and which must have one of those outcomes. Coin tossing is taken to mean, by default, an event that has exactly two possible outcomes, each equally likely. And that is near enough true for real-world coin tossing. But there is a little gap between “near enough” and “true”.

Rick Stromoski’s Soup To Nutz for the 27th is your standard sort of Dumb Royboy joke, in this case about him not knowing what percentages are. You could do the same joke about fractions, including with the same breakdown of what part of the mathematics geek population ruins it for the remainder.

Nate Fakes’s Break of Day for the 28th is not quite the anthropomorphic-numerals joke for the week. Anthropomorphic mathematics problems, anyway. The intriguing thing to me is that the difficult, calculus, problem looks almost legitimate to me. On the right-hand-side of the first two lines, for example, the calculation goes from

$\int -8 e^{-\frac{\ln 3}{14} t}$

to

$-8 -\frac{14}{\ln 3} e^{-\frac{\ln 3}{14} t}$

This is a little sloppy. The first line ought to end in a ‘dt’, and the second ought to have a constant of integration. If you don’t know what these calculus things are let me explain: they’re calculus things. You need to include them to express the work correctly. But if you’re just doing a quick check of something, the mathematical equivalent of a very rough preliminary sketch, it’s common enough to leave that out.
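For what it’s worth, here is how that step would read with the missing pieces put back. This is my reconstruction, assuming the panel means an ordinary antiderivative in $t$ and that the two factors on the second line are multiplied together; the constant $C$ is the constant of integration.

$\int -8 e^{-\frac{\ln 3}{14} t} \, dt = -8 \cdot \left( -\frac{14}{\ln 3} \right) e^{-\frac{\ln 3}{14} t} + C = \frac{112}{\ln 3} e^{-\frac{\ln 3}{14} t} + C$

The $-\frac{14}{\ln 3}$ is what you get from dividing by the coefficient of $t$ in the exponent, which is the standard move for integrating an exponential.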

It doesn’t quite parse or mean anything precisely as it is. But it looks like the sort of thing that some context would make meaningful. That there’s repeated appearances of $- \frac{\ln 3}{14}$, or $- \frac{14}{\ln 3}$, particularly makes me wonder if Fakes used a problem he (or a friend) was doing for some reason.

Mark Anderson’s Andertoons for the 29th is a welcome reassurance that something like normality still exists. Something something student blackboard story problem something.

Anthony Blades’s Bewley rerun for the 29th depicts a parent once again too eager to help with arithmetic homework.

Maria Scrivan’s Half Full for the 29th gives me a proper anthropomorphic numerals panel for the week, and none too soon.

## The Summer 2017 Mathematics A To Z: Sárközy’s Theorem

Gaurish, of For the love of Mathematics, gives me another chance to talk number theory today. Let’s see how that turns out.

# Sárközy’s Theorem.

I have two pieces to assemble for this. One is in factors. We can take any counting number, a positive whole number, and write it as the product of prime numbers. 2038 is equal to the prime 2 times the prime 1019. 4312 is equal to 2 raised to the third power times 7 raised to the second times 11. 1040 is 2 to the fourth power times 5 times 13. 455 is 5 times 7 times 13.

There are many ways to divide up numbers like this. Here’s one. Is there a square number among its factors? 2038 and 455 don’t have any. They’re each a product of prime numbers that are never repeated. 1040 has a square among its factors. 2 times 2 divides into 1040. 4312, similarly, has a square: we can write it as 2 squared times 2 times 7 squared times 11. So that is my first piece. We can divide counting numbers into squarefree and not-squarefree.
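Sorting numbers into these two piles is mechanical enough to hand to a computer. Here’s a minimal sketch using plain trial division, which is fine for numbers this size and hopeless for really big ones. The function name is my own.

```python
def is_squarefree(n):
    """True if no square larger than 1 divides the counting number n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False  # d squared divides n, so n is not squarefree
        d += 1
    return True

results = {n: is_squarefree(n) for n in (2038, 4312, 1040, 455)}
for n, verdict in results.items():
    print(n, "squarefree" if verdict else "not squarefree")
```

It checks composite values of `d` along with the primes, which is wasted effort but harmless: if the square of a composite number divides `n` then so does the square of one of its prime factors, and that one gets caught first.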

The other piece is in binomial coefficients. These are numbers, often quite big numbers, that get dumped on the high school algebra student as she tries to work with some expression like $(a + b)^n$. They’re also dumped on the poor student in calculus, as something about Newton’s binomial coefficient theorem. Which we hear is something really important. In my experience it wasn’t explained why this should rank up there with, like, the differential calculus. (Spoiler: it’s because of polynomials.) But it’s got some great stuff to it.

Binomial coefficients are among those utility players in mathematics. They turn up in weird places. In dealing with polynomials, of course. They also turn up in combinatorics, and through that, probability. If you run, for example, 10 experiments each of which could succeed or fail, the chance you’ll get exactly five successes is going to be proportional to one of these binomial coefficients. That they touch on polynomials and probability is a sign we’re looking at a thing woven into the whole universe of mathematics. We saw them some in talking, last A-To-Z around, about Yang Hui’s Triangle. That’s also known as Pascal’s Triangle. It has more names too, since it’s been found many times over.

The theorem under discussion is about central binomial coefficients. These are one specific coefficient in a row. The ones that appear, in the triangle, along the line of symmetry. They’re easy to describe in formulas. For a whole number ‘n’ that’s greater than or equal to zero, evaluate what we call 2n choose n:

${{2n} \choose{n}} = \frac{(2n)!}{(n!)^2}$

If ‘n’ is zero, this number is $\frac{0!}{(0!)^2}$ or 1. If ‘n’ is 1, this number is $\frac{2!}{(1!)^2}$ or 2. If ‘n’ is 2, this number is $\frac{4!}{(2!)^2}$ or 6. If ‘n’ is 3, this number is (sparing the formula) 20. The numbers keep growing. 70, 252, 924, 3432, 12870, and so on.

So. 1 and 2 and 6 are squarefree numbers. Not much arguing that. But 20? That’s 2 squared times 5. 70? 2 times 5 times 7. 252? 2 squared times 3 squared times 7. 924? That’s 2 squared times 3 times 7 times 11. 3432? 2 cubed times 3 times 11 times 13; there’s a 2 squared in there. 12870? 2 times 3 squared times it doesn’t matter anymore. It’s not a squarefree number.

There’s a bunch of not-squarefree numbers in there. The question: do we ever stop seeing squarefree numbers here?

So here’s Sárközy’s Theorem. It says that this central binomial coefficient ${{2n} \choose{n}}$ is never squarefree as long as ‘n’ is big enough. András Sárközy showed in 1985 that this was true. How big is big enough? … We have a bound, at least, for this theorem. If ‘n’ is larger than the number $2^{8000}$ then the corresponding coefficient can’t be squarefree. It might not surprise you that the formulas involved here feature the Riemann Zeta function. That always seems to turn up for questions about large prime numbers.

That’s a common state of affairs for number theory problems. Very often we can show that something is true for big enough numbers. I’m not sure there’s a clear reason why. When numbers get large enough it can be more convenient to deal with their logarithms, I suppose. And those look more like the real numbers than the integers. And real numbers are typically easier to prove stuff about. Maybe that’s it. This is vague, yes. But to ask ‘why’ some things are easy and some are hard to prove is a hard question. What is a satisfying ’cause’ here?

It’s tempting to say that since we know this is true for all ‘n’ above a bound, we’re done. We can just test all the numbers below that bound, and the rest is done. You can do a satisfying proof this way: show that eventually the statement is true, and show all the special little cases before it is. This particular result is kind of useless, though. $2^{8000}$ is a number that’s something like 2,400 digits long. For comparison, the total number of things in the universe is something like a number about 80 digits long. Certainly not more than 90. It’d take too long to test all those cases.

That’s all right. Since Sárközy’s proof in 1985 there’ve been other breakthroughs. In 1988 P. Goetgheluck proved it was true for a big range of numbers: every ‘n’ that’s larger than 4 and less than $2^{42,205,184}$. That’s a number something more than 12 million digits long. In 1991 I. Vardi proved we had no squarefree central binomial coefficients for ‘n’ greater than 4 and less than $2^{774,840,978}$, which is a number about 233 million digits long. And then in 1996 Andrew Granville and Olivier Ramaré showed directly that this was so for all ‘n’ larger than 4.
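How long these powers of two are is something you can work out without ever writing the numbers down. A power of two, $2^k$, has one more decimal digit than the whole-number part of $k \log_{10} 2$. Here’s a small sketch of that arithmetic; the function name is mine.

```python
import math

def digits_of_power_of_two(exponent):
    """Number of decimal digits in 2 raised to `exponent`."""
    return math.floor(exponent * math.log10(2)) + 1

for exponent in (8000, 42_205_184, 774_840_978):
    print(exponent, digits_of_power_of_two(exponent))
```

For the smallest of these you can cross-check by brute force with `len(str(2 ** 8000))`. For the larger two I’d trust the logarithm.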

So that 70 that turned up just a few lines in is the last squarefree one of these coefficients.
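You can watch the squarefree ones run out for small ‘n’. This sketch leans on one fact: every prime dividing 2n choose n is no bigger than 2n, since the coefficient is built out of the factors of (2n)!. So it’s enough to test the squares of those primes. The function names and the cutoff of 100 are arbitrary choices of mine, and checking a hundred cases is of course no substitute for the theorem.

```python
from math import comb

def primes_up_to(limit):
    """All primes no larger than limit, by simple trial division."""
    return [p for p in range(2, limit + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def central_is_squarefree(n):
    """True if the central binomial coefficient (2n choose n) is squarefree."""
    c = comb(2 * n, n)
    # Every prime factor of c is at most 2n, so only those squares need testing.
    return all(c % (p * p) != 0 for p in primes_up_to(2 * n))

squarefree_n = [n for n in range(0, 100) if central_is_squarefree(n)]
print(squarefree_n)
```

The list comes back with just 0, 1, 2, and 4 in it, matching the coefficients 1, 2, 6, and 70.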

Is this surprising? Maybe, maybe not. I’ll bet most of you didn’t have an opinion on this topic twenty minutes ago. Let me share something that did surprise me, and continues to surprise me. In 1974 David Singmaster proved that any integer divides almost all the binomial coefficients out there. “Almost all” is here a term of art, but it means just about what you’d expect. Imagine the giant list of all the numbers that can be binomial coefficients. Then pick any positive integer you like. The number you picked will divide into so many of the giant list that the exceptions won’t be noticeable. So that square numbers like 4 and 9 and 16 and 25 should divide into most binomial coefficients? … That’s to be expected, suddenly. Into the central binomial coefficients? That’s not so obvious to me. But then so much of number theory is strange and surprising and not so obvious.

## The Summer 2017 Mathematics A To Z: Quasirandom numbers

Gaurish, host of For the love of Mathematics, gives me the excuse to talk about amusement parks. You may want to brace yourself. Yes, this essay includes a picture. It would have included a video if I had enough WordPress privileges for that.

# Quasirandom numbers.

Think of a merry-go-round. Or carousel, if you prefer. I will venture a guess. You might like merry-go-rounds. They’re beautiful. They can evoke happy thoughts of childhood when they were a big ride it was safe to go on. But they don’t often make one think of thrills. They’re generally sedate things. They don’t need to be. There’s no great secret to making a carousel a thrill ride. They knew it a century ago, when all the great American carousels were carved. It’s simple. Make the thing spin fast enough, at the five or six rotations per minute the ride was made for. There are places that do this yet. There’s the Cedar Downs ride at Cedar Point, Sandusky, Ohio. There’s the antique carousel at Crossroads Village, a historical village/park just outside Flint, Michigan. There’s the Derby Racer at Playland in Rye, New York. There’s the carousel in the Merry-Go-Round Museum in Sandusky, Ohio. Any of them are great rides. Two of them have a special edge. I’ll come back to them.

Randomness is a valuable resource. We know it’s key to many things. We have major fields of mathematics built on it. We can understand the behavior of variables without ever knowing what value they have. All we need is to know the chance they might be in some particular range. This makes possible all kinds of problems too complicated to do otherwise. We know it’s critical. Quantum mechanics would not work without randomness. Without quantum mechanics, matter doesn’t work. And that’s true randomness, the kind where something is unpredictable. It’s not the kind of randomness we talk about when we ask, say, what’s the chance someone was born on a Tuesday. That’s mere hidden information: if we knew the month and date and year of a person’s birth we would know whether they were born Tuesday or not. We need more.

So the trouble is actually getting a random number. Well, a sequence of randomly drawn numbers. We rarely need this if we’re doing analysis. We can understand how some process changes the shape of a distribution without ever using the distribution. We can take derivatives of a function without ever evaluating the original function, after all.

But we do need randomly drawn numbers. We do too much numerical work with them. For example, it’s impossible to exactly integrate most functions. Numerical methods can take a ferociously long time to evaluate. A family of methods called Monte Carlo relies on randomly-drawn values to estimate the integral. The results are strikingly good for the work required. But they must have random numbers. The name “Monte Carlo” is not some cryptic code. It is an expression of how randomly drawn numbers make the tool work.
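The simplest member of that family fits in a few lines. This is a minimal sketch, not anybody’s production method: pick points at random in the interval, average the function’s values there, and multiply by the interval’s length. The integrand, the sample count, and the seed are all arbitrary picks of mine. I chose $e^{-x^2}$ because it has no antiderivative in elementary functions but does have a reference value, by way of the error function, to compare against.

```python
import math
import random

def monte_carlo_integral(f, a, b, samples, rng):
    """Estimate the integral of f from a to b by averaging f at random points."""
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples

rng = random.Random(2017)  # fixed seed so the run can be repeated
estimate = monte_carlo_integral(lambda x: math.exp(-x * x), 0.0, 1.0, 100_000, rng)

# Reference value: the integral of exp(-x^2) from 0 to 1 is sqrt(pi)/2 * erf(1).
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)
print(estimate, reference)
```

The error in this kind of estimate shrinks roughly like one over the square root of the number of samples. That’s slow, but it doesn’t much care how complicated the function is.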

It’s hard to get random numbers. Consider: we can’t write an algorithm to do it. If we were to write one, then we’d be able to predict what the sequence of numbers was. We have some recourse. We could set up instruments to rely on the randomness that seems to be in the world. Thermal fluctuations, for example, created by processes outside any computer’s control, can give us a pleasant dose of randomness. If we need higher-quality random numbers than that we can go to exotic equipment. Geiger counters watching the decay of a not-alarmingly-radioactive sample. Cosmic ray detectors watching the sky.

Or we can write something that produces numbers that look random enough. They won’t really be random, and if we wait long enough we’ll notice the sequence repeats itself. But if we only need, say, ten numbers, who cares if the sequence will repeat after ten million numbers? (We’ll surely need more than ten numbers. But we can postpone the repetition until we’ve drawn far more than ten million numbers.)
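To make the repetition visible, here’s a toy sketch of one classic scheme for this, a linear congruential generator. The constants are ones I picked to make the cycle short enough to see, sixteen numbers long; they’re not what any real library uses, and a real generator would have a vastly longer period.

```python
from itertools import islice

def lcg(seed, a=5, c=3, m=16):
    """Toy linear congruential generator: x -> (a*x + c) mod m, forever."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

values = list(islice(lcg(seed=7), 32))
print(values[:16])
print(values[16:])
```

The first sixteen numbers look reasonably shuffled. The next sixteen are the same sixteen in the same order, and so on forever.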

Two of the carousels I’ve mentioned have an astounding property. The horses in a file move. I mean, relative to each other. Some horse will start the race in front of its neighbors; some will start behind. The four move forward and back thanks to a mechanism of, I am assured, staggering complexity. There are only three carousels in the world that have it. There’s Cedar Downs at Cedar Point in Sandusky, Ohio; the Derby Racer at Playland in Rye, New York; and the Derby Racer at Blackpool Pleasure Beach in Blackpool, England. The mechanism in Blackpool’s hasn’t operated in years. The one at Playland’s had not run in years, but was restored for the 2017 season. My love and I made a trip specifically to ride that. (You may have heard of a fire at the carousel in Playland this summer. This was in part of the building for their other, non-racing, antique carousel. My last information was that the carousel itself was all right.)

These racing derbies have the horses in a file move forward and back in a “random” way. It’s not truly random. If you knew exactly which gears were underneath each horse, and where in their rotations they were, you could say which horse was about to gain on its partners and which was about to fall back. But all that is concealed from the rider. The horse patterns will eventually, someday, repeat. If the gear cycles aren’t interrupted by maintenance or malfunctions. But nobody’s going to ride any horse long enough to notice. We have in these rides a randomness as good as what your computer makes, at least for the purpose it serves.

What does it mean to look random? Some things seem obvious. All the possible numbers ought to come up, sooner or later. Any particular possible number shouldn’t repeat too often. Any particular possible number shouldn’t go too long without repeating. There shouldn’t be clumps of numbers; if, say, ‘4’ turns up, we shouldn’t see ‘5’ turn up right away all the time.

We can make the idea of “looking” random quite literal. Suppose we’re selecting numbers from 0 through 9. We can draw the random numbers we’ve picked. Use the numbers as coordinates. Say we pick four digits: 1, 3, 9, and 0. Then draw the point that’s at x-coordinate 13, y-coordinate 90. Then the next four digits. Let’s say they’re 4, 2, 3, and 8. Then draw the point that’s at x-coordinate 42, y-coordinate 38. And repeat. What will this look like?
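The bookkeeping for that recipe is short. Here’s a sketch of turning a stream of digits into points to plot; the function name is mine, and what to do with leftover digits at the end (I drop them) is my own choice.

```python
def digits_to_points(digits):
    """Group digits four at a time into (x, y) points, two digits per coordinate."""
    points = []
    for i in range(0, len(digits) - 3, 4):
        d0, d1, d2, d3 = digits[i:i + 4]
        points.append((10 * d0 + d1, 10 * d2 + d3))
    return points

points = digits_to_points([1, 3, 9, 0, 4, 2, 3, 8])
print(points)
```

Feed the resulting points to whatever plotting tool you like and look at the picture.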

If it clumps up, we probably don’t have good random numbers. If we see lines that points collect along, or avoid, there’s a good chance our numbers aren’t very random. If there’s whole blocks of space that they occupy, and others they avoid, we may have a defective source of random numbers. We should expect the points to cover a space pretty uniformly. (There are more rigorous, logically sound, methods. The eye can be fooled easily enough. But it’s the same principle. We have some test that notices clumps and gaps.) But …

The thing is, there’s always going to be some clumps. There’ll always be some gaps. Part of randomness is that it forms patterns, or at least things that look like patterns to us. We can describe how big a clump (or gap; it’s the same thing, really) is for any particular quantity of randomly drawn numbers. If we see clumps bigger than that we can throw out the numbers as suspect. But … still …

Toss a coin fairly twenty times, and there’s no reason it can’t turn up tails sixteen times. This doesn’t happen often, but it will happen sometimes. Just luck. This surplus of tails should evaporate as we take more tosses. That is, we most likely won’t see 160 tails out of 200 tosses. We almost certainly will not see 1,600 tails out of 2,000 tosses. We know this as the Law of Large Numbers. Wait long enough and weird fluctuations will average out.
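How unlikely each of those is can be worked out exactly. A sketch, assuming a perfectly fair coin and independent tosses: the chance of exactly k tails in n tosses is the binomial coefficient n choose k, divided by $2^n$. The function name is mine.

```python
from fractions import Fraction
from math import comb

def chance_of_tails(tails, tosses):
    """Exact chance of getting exactly `tails` tails in `tosses` fair tosses."""
    return Fraction(comb(tosses, tails), 2 ** tosses)

for tails, tosses in [(16, 20), (160, 200), (1600, 2000)]:
    print(tails, tosses, float(chance_of_tails(tails, tosses)))
```

Same proportion of tails every time, and a chance that collapses as the number of tosses grows. That collapse is the Law of Large Numbers at work.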

What if we don’t have time, though? For coin-tossing that’s silly; of course we have time. But for Monte Carlo integration? It could take too long to be confident we haven’t got too-large gaps or too-tight clusters.

This is why we take quasi-random numbers. We begin with what randomness we’re able to manage. But we massage it. Imagine our coins example. Suppose after ten fair tosses we noticed there had been eight tails turn up. Then we would start tossing less fairly, trying to make heads more common. We would be happier if there were 12 rather than 16 tails after twenty tosses.

Draw the results. We get now a pattern that looks still like randomness. But it’s a finer sorting; it looks like static tidied up some. The quasi-random numbers are not properly random. Knowing that, say, the last several numbers were odd means the next one is more likely to be even, the Gambler’s Fallacy put to work. But in aggregate, we trust, we’ll be able to enjoy the speed and power of randomly-drawn numbers. It shows its strengths when we don’t know just how finely we must sample a range of numbers to get good, reliable results.
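I should be plain that many quasirandom sequences in actual use aren’t made by nudging coin tosses. They’re fully deterministic recipes built to fill a range evenly. Here’s a sketch of one standard example, the van der Corput sequence, which I’m swapping in as an illustration and which isn’t the coin-massaging scheme described above. It writes the index in some base and mirrors the digits around the decimal point.

```python
def van_der_corput(index, base=2):
    """The index-th van der Corput point in [0, 1), for index >= 1."""
    value, denominator = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denominator *= base
        value += digit / denominator  # mirror this digit across the radix point
    return value

points = [van_der_corput(i) for i in range(1, 9)]
print(points)
```

Each new point lands in one of the biggest gaps left by the points before it. That’s the tidied-up static: no clumps, no neglected stretches, and no surprises about what comes next.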

To carousels. I don’t know whether the derby racers have quasirandom outcomes. I would find believable someone telling me that all the possible orderings of the four horses in any file are equally likely. To know would demand detailed knowledge of how the gearing works, though. Also probably simulations of how the system would work if it ran long enough. It might be easier to watch the ride for a couple of days and keep track of the outcomes. If someone wants to sponsor me doing a month-long research expedition to Cedar Point, drop me a note. Or just pay for my season pass. You folks would do that for me, wouldn’t you? Thanks.

## The Summer 2017 Mathematics A To Z: Benford's Law

Today’s entry in the Summer 2017 Mathematics A To Z is one for myself. I couldn’t post this any later.

# Benford’s Law.

My car’s odometer first read 9 on my final test drive before buying it, in June of 2009. It flipped over to 10 barely a minute after that, somewhere near Jersey Freeze ice cream parlor at what used to be the Freehold Traffic Circle. Ask a Central New Jersey person of sufficient vintage about that place. Its odometer read 90 miles sometime that weekend, I think while I was driving to The Book Garden on Route 537. Ask a Central New Jersey person of sufficient reading habits about that place. It’s still there. It flipped over to 100 sometime when I was driving back later that day.

The odometer read 900 about two months after that, probably while I was driving to work, as I had a longer commute in those days. It flipped over to 1000 a couple days after that. The odometer first read 9,000 miles sometime in spring of 2010 and I don’t remember what I was driving to for that. It flipped over from 9,999 to 10,000 miles several weeks later, as I pulled into the car dealership for its scheduled servicing. Yes, this kind of impressed the dealer that I got there exactly on the round number.

The odometer first read 90,000 in late August of last year, as I was driving to some competitive pinball event in western Michigan. It’s scheduled to flip over to 100,000 miles sometime this week as I get to the dealer for its scheduled maintenance. While cars have gotten to be much more reliable and durable than they used to be, the odometer will never flip over to 900,000 miles. At least I can’t imagine owning it long enough, at my rate of driving the past eight years, that this would ever happen. It’s hard to imagine living long enough for the car to reach 900,000 miles. Thursday or Friday it should flip over to 100,000 miles. The leading digit on the odometer will be 1 or, possibly, 2 for the rest of my association with it.

The point of this little autobiography is this observation. Imagine all the days that I have owned this car, from sometime in June 2009 to whatever day I sell, lose, or replace it. Pick one. What is the leading digit of my odometer on that day? It could be anything from 1 to 9. But it’s more likely to be 1 than it is 9. Right now it’s as likely to be any of the digits. But after this week the chance of ‘1’ being the leading digit will rise, and become rather more likely than that of ‘9’. And it’ll never lose that edge.

This is a reflection of Benford’s Law. It is named, as most mathematical things are, imperfectly. The law-namer was Frank Benford, a physicist, who in 1938 published a paper The Law Of Anomalous Numbers. It confirmed the observation of Simon Newcomb. Newcomb was a 19th century astronomer and mathematician of an exhausting number of observations and developments. Newcomb observed the logarithm tables that anyone who needed to compute referred to often. The earlier pages were more worn-out and dirty and damaged than the later pages. People worked with numbers that start with ‘1’ more than they did numbers starting with ‘2’. And more those that start ‘2’ than start ‘3’. More that start with ‘3’ than start with ‘4’. And on. Benford showed this was not some fluke of calculations. It turned up in bizarre collections of data. The surface areas of rivers. The populations of thousands of United States municipalities. Molecular weights. The digits that turned up in an issue of Reader’s Digest. There is a bias in the world toward numbers that start with ‘1’.

And this is, prima facie, crazy. How can the surface areas of rivers somehow prefer to be, say, 100-199 hectares instead of 500-599 hectares? A hundred is a human construct. (Indeed, it’s many human constructs.) That we think ten is an interesting number is an artefact of our society. To think that 100 is a nice round number and that, say, 81 or 144 are not is a cultural choice. Grant that the digits of street addresses of people listed in American Men of Science — one of Benford’s data sources — have some cultural bias. How can another of his sources, molecular weights, possibly?

The bias sneaks in subtly. Don’t they all? It lurks at the edge of the table of data. The table header, perhaps, where it says “River Name” and “Surface Area (sq km)”. Or at the bottom where it says “Length (miles)”. Or it’s never explicit, because I take for granted people know my car’s mileage is measured in miles.

What would be different in my introduction if my car were Canadian, and the odometer measured kilometers instead? … Well, I’d not have driven the 9th kilometer; someone else doing a test-drive would have. The 90th through 99th kilometers would have come a little earlier that first weekend. The 900th through 999th kilometers too. I would have passed the 99,999th kilometer years ago. In kilometers my car has been in the 100,000s for something like four years now. It’s less absurd that it could reach the 900,000th kilometer in my lifetime, but that still won’t happen.

What would be different is the precise dates about when my car reached its milestones, and the amount of days it spent in the 1’s and the 2’s and the 3’s and so on. But the proportions? What fraction of its days it spends with a 1 as the leading digit versus a 2 or a 5? … Well, that’s changed a little bit. There is some final mile, or kilometer, my car will ever register and it makes a little difference whether that’s 239,000 or 385,000. But it’s only a little difference. It’s the difference in how many times a tossed coin comes up heads on the first 1,000 flips versus the second 1,000 flips. They’ll be different numbers, but not that different.

What’s the difference between a mile and a kilometer? A mile is longer than a kilometer, but that’s it. They measure the same kinds of things. You can convert a measurement in miles to one in kilometers by multiplying by a constant. We could as well measure my car’s odometer in meters, or inches, or parsecs, or lengths of football fields. The difference is what number we multiply the original measurement by. We call this “scaling”.

Whatever we measure, in whatever unit we measure, has to have a leading digit of something. So it’s got to have some chance of starting out with a ‘1’, some chance of starting out with a ‘2’, some chance of starting out with a ‘3’, and so on. But that chance can’t depend on the scale. Measuring something in smaller or larger units doesn’t change the proportion of how often each leading digit is there.

These facts combine to imply that leading digits follow a logarithmic-scale law. The leading digit should be a ‘1’ something like 30 percent of the time. And a ‘2’ about 18 percent of the time. A ‘3’ about one-eighth of the time. And it decreases from there. ‘9’ gets to take the lead a meager 4.6 percent of the time.
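For anyone who wants the rest of those percentages, here is a minimal sketch. It assumes the usual statement of the logarithmic law, that the chance of leading digit d is log10(1 + 1/d); I haven't written that formula out above, and the function name `benford_probability` is just one I made up for the illustration.

```python
import math


def benford_probability(digit):
    """Chance that `digit` (1 through 9) is the leading digit,
    assuming the logarithmic law P(d) = log10(1 + 1/d)."""
    return math.log10(1 + 1 / digit)


# Print the chance for each possible leading digit.
for d in range(1, 10):
    print(f"{d}: {benford_probability(d):.1%}")
```

The nine chances have to add up to 1, since every number has some leading digit, and with this formula they do: the sum telescopes to log10(10).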

Roughly. It’s not going to be so all the time. Measure the heights of humans in meters and there’ll be far more leading digits of ‘1’ than we should expect, as most people are between 1 and 2 meters tall. Measure them in feet and ‘5’ and ‘6’ take a great lead. The law works best when data can sprawl over many orders of magnitude. If we lived in a world where people could as easily be two inches as two hundred feet tall, Benford’s Law would make more accurate predictions about their heights. That something is a mathematical truth does not mean it’s independent of all reason.

For example, the reader thinking back some may be wondering: granted that atomic weights and river areas and populations carry units with them that create this distribution. How do street addresses, one of Benford’s observed sources, carry any unit? Well, street addresses are, at least in the United States custom, a loose measure of distance. The 100 block (for example) of a street is within one … block … from whatever the more important street or river crossing that street is. The 900 block is farther away.

This extends further. Block numbers are proxies for distance from the major cross feature. House numbers on the block are proxies for distance from the start of the block. We have a better chance to see street number 418 than 1418, to see 418 than 488, or to see 418 than to see 1488. We can look at Benford’s Law in the second and third and other minor digits of numbers. But we have to be more cautious. There is more room for variation and quirk events. A block-filling building in the downtown area can take whatever street number the owners think most auspicious. Smaller samples of anything are less predictable.

Nevertheless, Benford’s Law has become famous to forensic accountants the past several decades, if we allow the use of the word “famous” in this context. Its fame is thanks to the economists Hal Varian and Mark Nigrini. They observed that real-world financial data should be expected to follow this same distribution. If they don’t, then there might be something suspicious going on. This is not an ironclad rule. There might be good reasons for the discrepancy. If your work trips are always to the same location, and always for one week, and there’s one hotel it makes sense to stay at, and you always learn you’ll need to make the trips about one month ahead of time, of course the hotel bill will be roughly the same. Benford’s Law is a simple, rough tool, a way to decide what data to scrutinize for mischief. With this in mind I trust none of my readers will make the obvious leading-digit mistake when padding their expense accounts anymore.

Since I’ve done you that favor, anyone out there think they can pick me up at the dealer’s Thursday, maybe Friday? Thanks in advance.

## Reading the Comics, April 29, 2017: The Other Half Of The Week Edition

I’d been splitting Reading the Comics posts between Sunday and Thursday to better space them out. But I’ve got something prepared that I want to post Thursday, so I’ll bump this up. Also I had it ready to go anyway so I don’t gain anything putting it off another two days.

Bill Amend’s FoxTrot Classics for the 27th reruns the strip for the 4th of May, 2006. It’s another probability problem, in its way. Assume Jason is honest in reporting whether Paige has picked his number correctly. Assume that Jason picked a whole number. (This is, I think, the weakest assumption. I know Jason Fox’s type and he’s just the sort who’d pick an obscure transcendental number. They’re all obscure after π and e.) Assume that Jason is equally likely to pick any of the whole numbers from 1 to one billion. Then, knowing nothing about what numbers Jason is likely to pick, Paige would have one chance in a billion of picking his number too. Might as well call it certainty that she’ll pay a dollar to play the game. How much would she have to get, in case of getting the number right, to come out even or ahead? … And now we know why Paige is still getting help on probability problems in the 2017 strips.

Jeff Stahler’s Moderately Confused for the 27th gives me a bit of a break by just being a snarky word problem joke. The student doesn’t even have to resist it any.

Sandra Bell-Lundy’s Between Friends for the 29th also gives me a bit of a break by just being a Venn Diagram-based joke. At least it’s using the shape of a Venn Diagram to deliver the joke. It’s not really got the right content.

Harley Schwadron’s 9 to 5 for the 29th is this week’s joke about arithmetic versus propaganda. It’s a joke we’re never really going to be without again.

## Reading the Comics, April 24, 2017: Reruns Edition

I went a little wild explaining the first of last week’s mathematically-themed comic strips. So let me split the week between the strips that I know to have been reruns and the ones I’m not so sure were.

Bill Amend’s FoxTrot for the 23rd — not a rerun; the strip is still new on Sundays — is a probability question. And a joke about story problems with relevance. Anyway, the question uses the binomial distribution. I know that because the question is about doing a bunch of things, homework questions, each of which can turn out one of two ways, right or wrong. It’s supposed to be equally likely to get the question right or wrong. It’s a little tedious but not hard to work out the chance of getting exactly six problems right, or exactly seven, or exactly eight, or so on. To work out the chance of getting six or more questions right — the problem given — there’s two ways to go about it.

One is the conceptually easy but tedious way. Work out the chance of getting exactly six questions right. Work out the chance of getting exactly seven questions right. Exactly eight questions. Exactly nine. All ten. Add these chances up. You’ll get to a number slightly below 0.377. That is, Mary Lou would have just under a 37.7 percent chance of passing. The answer’s right and it’s easy to understand how it’s right. The only drawback is it’s a lot of calculating to get there.

So here’s the conceptually harder but faster way. It works because the problem says Mary Lou is as likely to get a problem wrong as right. So she’s as likely to get exactly ten questions right as exactly ten wrong. And as likely to get at least nine questions right as at least nine wrong. To get at least eight questions right as at least eight wrong. You see where this is going: she’s as likely to get at least six right as to get at least six wrong.

There’s exactly three possibilities for a ten-question assignment like this. She can get four or fewer questions right (six or more wrong). She can get exactly five questions right. She can get six or more questions right. The chance of the first case and the chance of the last have to be the same.

So, take 1 — the chance that one of the three possibilities will happen — and subtract the chance she gets exactly five problems right, which is a touch over 24.6 percent. So there’s just under a 75.4 percent chance she does not get exactly five questions right. It’s equally likely to be four or fewer, or six or more. Just-under-75.4 divided by two is just under 37.7 percent, which is the chance she’ll pass as the problem’s given. It’s trickier to see why that’s right, but it’s a lot less calculating to do. That’s a common trade-off.
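If you'd rather have a computer do the tedious part, here is a small sketch of both routes. It assumes what the problem does: ten questions, each as likely right as wrong. The names `QUESTIONS` and `chance_exactly` are mine, made up for the example, and I use exact fractions so that the two answers can be compared without rounding getting in the way.

```python
from fractions import Fraction
from math import comb

QUESTIONS = 10  # ten homework questions, each right or wrong with equal chance


def chance_exactly(right):
    """Chance of getting exactly `right` of the questions right."""
    return Fraction(comb(QUESTIONS, right), 2 ** QUESTIONS)


# The tedious way: add up the chances of exactly 6, 7, 8, 9, and 10 right.
tedious = sum(chance_exactly(k) for k in range(6, QUESTIONS + 1))

# The faster way: everything except "exactly five right", split evenly in two.
shortcut = (1 - chance_exactly(5)) / 2

print(float(chance_exactly(5)), float(tedious), float(shortcut))
```

Both routes land on the same fraction, 386 out of 1024, which is the just-under-37.7 percent from above.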

Ruben Bolling’s Super-Fun-Pax Comix rerun for the 23rd is an aptly titled installment of A Million Monkeys At A Million Typewriters. It reminds me that I don’t remember if I’d retired the monkeys-at-typewriters motif from Reading the Comics collections. If I haven’t I probably should, at least after making a proper essay explaining what the monkeys-at-typewriters thing is all about.

Ted Shearer’s Quincy from the 28th of February, 1978 reveals to me that pocket calculators were a thing much earlier than I realized. Well, I was too young to be allowed near stuff like that in 1978. I don’t think my parents got their first credit-card-sized, solar-powered calculator that kind of worked for another couple years after that. Kids, ask about them. They looked like good ideas, but you could use them for maybe five minutes before the things came apart. Your cell phone is so much better.

Bill Watterson’s Calvin and Hobbes rerun for the 24th can be classed as a resisting-the-word-problem joke. It’s so not about that, but who am I to slow you down from reading a Calvin and Hobbes story?

Garry Trudeau’s Doonesbury rerun for the 24th started a story about high school kids and their bad geography skills. I rate it as qualifying for inclusion here because it’s a mathematics teacher deciding to include more geography in his course. I was amused by the week’s jokes anyway. There’s no hint given what mathematics Gil teaches, but given the links between geometry, navigation, and geography there is surely something that could be relevant. It might not help with geographic points like which states are in New England and where they are, though.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 24th is built on a plot point from Carl Sagan’s science fiction novel Contact. In it, a particular “message” is found in the digits of π. (By “message” I mean a string of digits that are interesting to us. I’m not sure that you can properly call something a message if it hasn’t got any sender and if there’s not obviously some intended receiver.) In the book this is an astounding thing because the message can’t be; any reasonable explanation for how it should be there is impossible. But short “messages” are going to turn up in π also, as per the comic strips.

I assume the peer review would correct the cartoon mathematicians’ unfortunate spelling of understanding.

## What Is The Most Probable Date For Easter? What Is The Least?

If I’d started pondering the question a week earlier I’d have a nice timely post. Too bad. Shouldn’t wait nearly a year to use this one, though.

My love and I got talking about early and late Easters. We know that we’re all but certainly not going to be alive to see the earliest possible Easter, at least not unless the rule for setting the date of Easter changes. Easter can be as early as the 22nd of March or as late as the 25th of April. Nobody presently alive has seen a 22nd of March Easter; the last one was in 1818. Nobody presently alive will; the next will be 2285. The last time Easter was its latest date was 1943; the next time will be 2038. I know people who’ve seen the one in 1943 and hope to make it at least through 2038.

But that invites the question: what dates are most likely to be Easter? What ones are least? In a sense the question is nonsense. The rules establishing Easter and the Gregorian calendar are known. To speak of the “chance” of a particular day being Easter is like asking the probability that Grover Cleveland was president of the United States in 1894. Technically there’s a probability distribution there. But it’s different in some way from asking the chance of rolling at least a nine on a pair of dice.

But as with the question about what day is most likely to be Thanksgiving we can make the question sensible. We have to take the question to mean “given a month and day, and no information about what year it is, what is the chance that this is Easter?” (I’m still not quite happy with that formulation. I’d be open to a more careful phrasing, if someone’s got one.)

When we’ve got that, though, we can tackle the problem. We could do as I did for working out what days are most likely to be Thanksgiving. Run through all the possible configurations of the calendar, tally how often each of the days in the range is Easter, and see what comes up most often. There’s a hassle here. Working out the date of Easter follows a rule, yes. The rule is that it’s the first Sunday after the first full moon after the spring equinox. There are wrinkles, mostly because the Moon is complicated. A notional Moon that’s a little more predictable gets used instead. There are algorithms you can use to work out when Easter is. They all look like some kind of trick being used to put something over on you. No matter. They seem to work, as far as we know. I found some Matlab code that uses the Easter-computing routine that Carl Friedrich Gauss developed and that’ll do.
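To give a flavor of what such a tally looks like, here is a rough sketch in Python. It is not the Matlab code I used, and it is not Gauss's routine either: it swaps in the widely published "anonymous Gregorian" (Meeus/Jones/Butcher) algorithm, which should give the same Gregorian dates. The function names `gregorian_easter` and `tally_easters` are made up for the illustration.

```python
from collections import Counter


def gregorian_easter(year):
    """Return (month, day) of Gregorian Easter for `year`, using the
    anonymous Gregorian algorithm (an assumption; not Gauss's routine)."""
    a = year % 19                      # position in the 19-year lunar cycle
    b, c = divmod(year, 100)           # century and year within the century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return month, day + 1


def tally_easters(first_year, last_year):
    """Count how often each (month, day) is Easter in the inclusive range."""
    return Counter(gregorian_easter(y) for y in range(first_year, last_year + 1))


tally = tally_easters(1925, 2100)
for (month, day), count in sorted(tally.items()):
    name = "March" if month == 3 else "April"
    print(f"{day} {name}: {count}")
```

As a spot check, this puts Easter on the 22nd of March in 1818 and on the 25th of April in 1943 and 2038, matching the dates mentioned above.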

Problem. The Moon and the Earth follow cycles around the sun, yes. Wait long enough and the positions of the Earth and Moon and Sun repeat. This takes 532 years and is known as the Paschal Cycle. In the Julian calendar Easter this year is the same date it was in the year 1485, and the same it will be in 2549. It’s no particular problem to set a computer program to run a calculation, even a tedious one, 532 times. But it’s not meaningful like that either.

The problem is the Julian calendar repeats itself every 28 years, which fits nicely with the Paschal Cycle. The Gregorian calendar, with different rules about how to handle century years like 1900 and 2100, repeats itself only every 400 years. So it takes much longer to complete the cycle and get Earth, Moon, and calendar date back to the same position. To fully account for all the related cycles would take 5,700,000 years, estimates Duncan Steel in Marking Time: The Epic Quest To Invent The Perfect Calendar.

Write code to calculate Easter on a range of years and you can do that, of course. It’s no harder to calculate the dates of Easter for six million years than it is for six hundred years. It just takes longer to finish. The problem is that it is meaningless to do so. Over the course of a mere(!) 26,000 years the precession of the Earth’s axis will change the times of the seasons completely. If we still use the Gregorian calendar there will be a time that late September is the start of the Northern Hemisphere’s spring, and another time that early February is the heart of the Canadian summer. Within five thousand years we will have to change the calendar, change the rule for computing Easter, or change the idea of it as happening in Europe’s early spring. To calculate a date for Easter of the year 5,002,017 is to waste energy.

We probably don’t need it anyway, though. The differences between any blocks of 532 years are, I’m going to guess, minor things. I would be surprised if the frequency of any date’s appearance changed more than a quarter of a percent. That might scramble the rankings of dates if we have several nearly-as-common dates, but it won’t be much.

So let me do that. Here’s a table of how often each particular calendar date appears as Easter from the years 2000 to 5000, inclusive. And I don’t believe that by the year we would call 5000 we’ll still have the same calendar and Easter and expectations of Easter all together, so I’m comfortable overlooking that. Indeed, I expect we’ll have some different calendar or Easter or expectation of Easter by the year 4985 at the latest.

For this enormous date range, though, here’s the frequency of Easters on each possible date:

| Date | Number Of Occurrences, 2000 – 5000 | Probability Of Occurrence |
|---|---|---|
| 22 March | 12 | 0.400% |
| 23 March | 17 | 0.566% |
| 24 March | 41 | 1.366% |
| 25 March | 74 | 2.466% |
| 26 March | 75 | 2.499% |
| 27 March | 68 | 2.266% |
| 28 March | 90 | 2.999% |
| 29 March | 110 | 3.665% |
| 30 March | 114 | 3.799% |
| 31 March | 99 | 3.299% |
| 1 April | 87 | 2.899% |
| 2 April | 83 | 2.766% |
| 3 April | 106 | 3.532% |
| 4 April | 112 | 3.732% |
| 5 April | 110 | 3.665% |
| 6 April | 92 | 3.066% |
| 7 April | 86 | 2.866% |
| 8 April | 98 | 3.266% |
| 9 April | 112 | 3.732% |
| 10 April | 114 | 3.799% |
| 11 April | 96 | 3.199% |
| 12 April | 88 | 2.932% |
| 13 April | 90 | 2.999% |
| 14 April | 108 | 3.599% |
| 15 April | 117 | 3.899% |
| 16 April | 104 | 3.466% |
| 17 April | 90 | 2.999% |
| 18 April | 93 | 3.099% |
| 19 April | 114 | 3.799% |
| 20 April | 116 | 3.865% |
| 21 April | 93 | 3.099% |
| 22 April | 60 | 1.999% |
| 23 April | 46 | 1.533% |
| 24 April | 57 | 1.899% |
| 25 April | 29 | 0.966% |

If I haven’t missed anything, this indicates that the 15th of April is the most likely date for Easter, with the 20th close behind and the 10th and 14th hardly rare. The least probable date is the 22nd of March, with the 23rd of March and the 25th of April almost as unlikely.

And since the date range does affect the results, here’s a smaller sampling, one that fits more closely the lifetimes of anyone alive to read this as I publish. For the years 1925 through 2100 the appearances of each Easter date are:

| Date | Number Of Occurrences, 1925 – 2100 | Probability Of Occurrence |
|---|---|---|
| 22 March | 0 | 0.000% |
| 23 March | 1 | 0.568% |
| 24 March | 1 | 0.568% |
| 25 March | 3 | 1.705% |
| 26 March | 6 | 3.409% |
| 27 March | 3 | 1.705% |
| 28 March | 5 | 2.841% |
| 29 March | 6 | 3.409% |
| 30 March | 7 | 3.977% |
| 31 March | 7 | 3.977% |
| 1 April | 6 | 3.409% |
| 2 April | 4 | 2.273% |
| 3 April | 6 | 3.409% |
| 4 April | 6 | 3.409% |
| 5 April | 7 | 3.977% |
| 6 April | 7 | 3.977% |
| 7 April | 4 | 2.273% |
| 8 April | 4 | 2.273% |
| 9 April | 6 | 3.409% |
| 10 April | 7 | 3.977% |
| 11 April | 7 | 3.977% |
| 12 April | 7 | 3.977% |
| 13 April | 4 | 2.273% |
| 14 April | 6 | 3.409% |
| 15 April | 7 | 3.977% |
| 16 April | 6 | 3.409% |
| 17 April | 7 | 3.977% |
| 18 April | 6 | 3.409% |
| 19 April | 6 | 3.409% |
| 20 April | 6 | 3.409% |
| 21 April | 7 | 3.977% |
| 22 April | 5 | 2.841% |
| 23 April | 2 | 1.136% |
| 24 April | 2 | 1.136% |
| 25 April | 2 | 1.136% |

If we take this as the “working lifespan” of our common experience then the 22nd of March is the least likely Easter we’ll see, as we never do. The 23rd and 24th are the next least likely Easters. There’s a ten-way tie for the most common date of Easter, if I haven’t missed one or more: the 30th and 31st of March, and the 5th, 6th, 10th, 11th, 12th, 15th, 17th, and 21st of April each turn up seven times in this range.

The Julian calendar Easter dates are different and perhaps I’ll look at that sometime.

## Did This German Retiree Solve A Decades-Old Conjecture?

And then this came across my desktop (my iPad’s too old to work with the Twitter client anymore):

The underlying news is that one Thomas Royen, a Frankfurt (Germany)-area retiree, seems to have proven the Gaussian Correlation Inequality. It wasn’t a conjecture that sounded familiar to me, but the sidebar (on the Quanta Magazine article to which I’ve linked there) explains it and reminds me that I had heard about it somewhere or other. It’s about random variables. That is, things that can take on one of a set of different values. If you think of them as the measurements of something that’s basically consistent but never homogenous you’re doing well.

Suppose you have two random variables, two things that can be measured. There’s a probability the first variable is in a particular range, greater than some minimum and less than some maximum. There’s a probability the second variable is in some other particular range. What’s the probability that both variables are simultaneously in these particular ranges? This is easy to answer for some specific cases. For example if the two variables have nothing to do with each other then everybody who’s taken a probability class knows. The probability of both variables being in their ranges is the probability the first is in its range times the probability the second is in its range. The challenge is telling whether it’s always true, whether the variables are related to each other or not. Or telling when it’s true if it isn’t always.

The article (and pop reporting on this) is largely about how the proof has gone unnoticed. There’s some interesting social dynamics going on there. Royen published in an obscure-for-the-field journal, one he was an editor for; this makes it look dodgy, at least. And the conjecture’s drawn “proofs” that were just wrong; this discourages people from looking for obscurely-published proofs.

Some of the articles I’ve seen on this make Royen out to be an amateur. And I suppose there is a bias against amateurs in professional mathematics. There is in every field. It’s true that mathematics doesn’t require professional training the way that, say, putting out oil rig fires does. Anyone capable of thinking through an argument rigorously is capable of doing important original work. But there are a lot of tricks to thinking an argument through that are important, and I’d bet on the person with training.

In any case, Royen isn’t a newcomer to the field who just heard of an interesting puzzle. He’d been a statistician, first for a pharmaceutical company and then for a technical university. He may not have a position or tie to a mathematics department or a research organization but he’s someone who would know roughly what to do.

So did he do it? I don’t know; I’m not versed enough in the field to say. It’ll be interesting to see if he has.

## Reading the Comics, April 6, 2017: Abbreviated Week Edition

I’m writing this a little bit early because I’m not able to include the Saturday strips in the roundup. There won’t be enough to make a split week edition; I’ll just add the Saturday strips to next week’s report. In the meanwhile:

Mac King and Bill King’s Magic in a Minute for the 2nd is a magic trick, as the name suggests. It figures out a card by way of shuffling a (partial) deck and getting three (honest) answers from the other participant. If I’m not counting wrongly, you could do this trick with up to 27 cards and still get the right card after three answers. I feel like there should be a way to explain this that’s grounded in information theory, but I’m not able to put that together. I leave the suggestion here for people who see the obvious before I get to it.

Bil Keane and Jeff Keane’s Family Circus (probable) rerun for the 6th reassured me that this was not going to be a single-strip week. And a dubiously included single strip at that. I’m not sure that lotteries are the best use of the knowledge of numbers, but they’re a practical use anyway.

Bill Bettwy’s Take It From The Tinkersons for the 6th is part of the universe of students resisting class. I can understand the motivation problem in caring about numbers of apples that satisfy some condition. In the role of distinct objects whose number can be counted or deduced cards are as good as apples. In the role of things to gamble on, cards open up a lot of probability questions. Counting cards is even about how the probability of future events changes as information about the system changes. There’s a lot worth learning there. I wouldn’t try teaching it to elementary school students.

Jeffrey Caulfield and Alexandre Rouillard’s Mustard and Boloney for the 6th uses mathematics as the stuff know-it-alls know. At least I suppose it is; Doctor Know It All speaks of “the pathagorean principle”. I’m assuming that’s meant to be the Pythagorean theorem, although the talk about “in any right triangle the area … ” skews things. You can get to stuff about areas of triangles from the Pythagorean theorem. One of the shorter proofs of it depends on the areas of the squares of the three sides of a right triangle. But it’s not what people typically think of right away. But he wouldn’t be the first know-it-all to start blathering on the assumption that people aren’t really listening. It’s common enough to suppose someone who speaks confidently and at length must know something.

Dave Whamond’s Reality Check for the 6th is a welcome return to anthropomorphic-numerals humor. Been a while.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 6th builds on the form of a classic puzzle, about a sequence indexed to the squares of a chessboard. The story being riffed on is a bit of mathematical legend. The King offered the inventor of chess any reward. The inventor asked for one grain of wheat for the first square, two grains for the second square, four grains for the third square, eight grains for the fourth square, and so on, through all 64 squares. An extravagant reward, but surely one within the king’s power to grant, right? And of course not: by the 64th doubling the amount of wheat involved is so enormous it’s impossibly great wealth.

The father’s offer is meant to evoke that. But he phrases it in a deceptive way, “one penny for the first square, two for the second, and so on”. That “and so on” is the key. Listing a sequence and ending “and so on” is incomplete. The sequence can go in absolutely any direction after the given examples and not be inconsistent. There is no way to pick a single extrapolation as the only logical choice.

We do it anyway, though. Even mathematicians say “and so on”. This is because we usually stick to a couple popular extrapolations. We suppose things follow a couple common patterns. They’re polynomials. Or they’re exponentials. Or they’re sine waves. If they’re polynomials, they’re lower-order polynomials. Things like that. Most of the time we’re not trying to trick our fellow mathematicians. Or we know we’re modeling things with some physical base and we have reason to expect some particular type of function.

In this case, the \$1.27 total is consistent with getting two cents for every chess square after the first. There are infinitely many other patterns that would work, and the kid would have been wise to ask for what precisely “and so on” meant before choosing.
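To make the point concrete, here is a short sketch comparing three ways that "one for the first square, two for the second, and so on" could continue. The third pattern, plain counting, is my own addition as one more candidate; it isn't in the strip. The variable names are made up for the example, and everything is counted in cents.

```python
SQUARES = 64  # squares on a chessboard

# Doubling, as in the legend: 1, 2, 4, 8, ...
doubling = sum(2 ** n for n in range(SQUARES))

# Two cents for every square after the first: 1, 2, 2, 2, ...
flat_two = 1 + 2 * (SQUARES - 1)

# Plain counting (a hypothetical third reading): 1, 2, 3, 4, ...
counting = sum(range(1, SQUARES + 1))

for label, cents in [("doubling", doubling),
                     ("two per square", flat_two),
                     ("counting", counting)]:
    print(f"{label}: {cents} cents")
```

All three start "one, two" and all three are fair readings of "and so on". Only the two-per-square pattern comes to 127 cents; the doubling one comes to 2 to the 64th power, minus one.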

Berkeley Breathed’s Bloom County 2017 for the 7th is the climax of a little story in which Oliver Wendell Jones has been annoying people by shoving scientific explanations of things into their otherwise pleasant days. It’s a habit some scientifically-minded folks have, and it’s an annoying one. Many of us outgrow it. Anyway, this strip is about the curious evidence suggesting that the universe is not just expanding, but accelerating its expansion. There are mathematical models which allow this to happen. When developing General Relativity, Albert Einstein included a Cosmological Constant for little reason besides that without it, his model would suggest the universe was of a finite age and had expanded from an infinitesimally small origin. He had grown up without anyone knowing of any evidence that the size of the universe was a thing that could change.

Anyway, the Cosmological Constant is a puzzle. We can find values that seem to match what we observe, but we don’t know of a good reason it should be there. We sciencey types like to have models that match data, but we appreciate more knowing why the models look like that and not anything else. So it’s a good problem some of the cosmologists have been working on. But we’ve been here before. A great deal of physics, especially in the 20th Century, has been driven by looking for reasons behind what look like arbitrary points in a successful model. If Oliver were better-versed in the history of science — something scientifically minded people are often weak on, myself included — he’d be less easily taunted by Opus.

Mikael Wulff and Anders Morgenthaler’s TruthFacts for the 7th thinks that we forgot they ran this same strip back on the 17th of March. I spotted it, though. Nyah.

## How Much Might I Have Lost At Pinball?

After the state pinball championship last month there was a second, side tournament. It was a sort-of marathon event in which I played sixteen games in short order. I won three of them and lost thirteen, a disheartening record. The question I can draw from this: was I hopelessly outclassed in the side tournament? Is it plausible that I could do so awfully?

The answer would be “of course not”. I was playing against, mostly, the same people who were in the state finals. (A few who didn’t qualify for the finals joined the side tournament.) In that I had done well enough, winning seven games in all out of fifteen played. It’s implausible that I got significantly worse at pinball between the main and the side tournament. But can I make a logically sound argument about this?

In full, probably not. It’s too hard. The question is, did I win way too few games compared to what I should have expected? But what should I have expected? I haven’t got any information on how likely it should have been that I’d win any of the games, especially not when I faced something like a dozen different opponents. (I played several opponents twice.)

But we can make a model. Suppose that I had a fifty percent chance of winning each match. This is a lie in detail. The model contains lies; all models do. The lies might let us learn something interesting. Some people there I could only beat with a stroke of luck on my side. Some people there I could fairly often expect to beat. If we pretend I had the same chance against everyone, though, we get something that we can model. It might tell us something about what really happened.

If I play 16 matches, and have a 50 percent chance of winning each of them, then I should expect to win eight matches. But there’s no reason I might not win seven instead, or nine. Might win six, or ten, without that being too implausible. It’s even possible I might not win a single match, or that I might win all sixteen matches. How likely?

This calls for a creature from the field of probability that we call the binomial distribution. It’s “binomial” because it’s about stuff for which there are exactly two possible outcomes. This fits. Each match I can win or I can lose. (If we tie, or if the match is interrupted, we replay it, so there’s not another case.) It’s a “distribution” because we describe, for a set of some number of attempted matches, how the possible outcomes are distributed. The outcomes are: I win none of them. I win exactly one of them. I win exactly two of them. And so on, all the way up to “I win exactly all but one of them” and “I win all of them”.
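For reference, the standard formula for this distribution can be written out. The symbols here are labels introduced for illustration, not ones used elsewhere in this essay: $n$ is the number of matches played, $k$ the number won, and $p$ the assumed chance of winning any one match, with every match taken to be independent of the others.

$$P(\text{exactly } k \text{ wins}) = \binom{n}{k} \, p^{k} \, (1 - p)^{n - k}$$

Under the fifty-percent model, with $n = 16$ and $p = \tfrac{1}{2}$, this reduces to

$$P(\text{exactly } k \text{ wins}) = \binom{16}{k} \left(\tfrac{1}{2}\right)^{16} = \frac{1}{65536} \binom{16}{k}$$

and the expected number of wins is $n p = 16 \cdot \tfrac{1}{2} = 8$, matching the eight matches mentioned above.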

To answer the question of whether it’s plausible I should have done so badly I need to know more than just how likely it is I would win only three games. I need to also know the chance I’d have done worse. If I had won only two games, or only one, or none at all. Why?

Here I admit: I’m not sure I can give a compelling reason, at least not in English. I’ve been reworking it all week without being happy at the results. Let me try pieces.

One part is that the question as I put it (is it plausible that I could do so awfully?) isn’t answered just by checking how likely it is I would win only three games out of sixteen. If that’s awful, then doing even worse must also be awful. I can’t rule out even-worse results from awfulness without losing a sense of what the word “awful” means. Fair enough, to answer that question. But I made up the question. Why did I make up that one? Why not just “is it plausible I’d get only three out of sixteen games”?

Habit, largely. Experience shows me that the probability of any particular result often turns out to be implausibly low. It isn’t quite that case here; there are only seventeen possible noticeably different outcomes of playing sixteen games. But there can be so many possible outcomes that even the most likely one isn’t likely.

Take an extreme case. (Extreme cases are often good ways to build an intuitive understanding of things.) Imagine I played 16,000 games, with a 50-50 chance of winning each one of them. It is most likely that I would win 8,000 of the games. But the probability of winning exactly 8,000 games is small: only about 0.6 percent. What’s going on there is that there’s almost the same chance of winning exactly 8,001 or 8,002 games. As the number of games increases the number of possible different outcomes increases. If there are 16,000 games there are 16,001 possible outcomes. It’s less likely that any of them will stand out. What saves our ability to predict the results of things is that the number of plausible outcomes increases more slowly. It’s plausible someone would win exactly three games out of sixteen. It’s all but impossible that someone with a 50-50 chance each game would win exactly three thousand games out of sixteen thousand, even though that’s the same ratio of won games.
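The extreme case can be checked directly. This is a minimal sketch, assuming the same model as above: every game is an independent 50-50 proposition. The function names are made up for this illustration. The second function works with base-10 logarithms because the chance of exactly 3,000 wins is too small for an ordinary floating-point number to hold.

```python
from math import comb, log10


def fair_chance_exactly(wins, games):
    """Chance of exactly `wins` wins in `games` independent 50-50 games."""
    # Python integers are exact, so the huge binomial coefficient is safe here.
    return comb(games, wins) / 2**games


def fair_log10_chance_exactly(wins, games):
    """Base-10 logarithm of the same chance, for values too tiny for a float."""
    return log10(comb(games, wins)) - games * log10(2)


# Most likely outcome of 16,000 games: still well under one percent.
print(f"exactly 8,000 of 16,000: {fair_chance_exactly(8000, 16000):.2%}")

# Three of sixteen: unlikely, but not absurd.
print(f"exactly 3 of 16: {fair_chance_exactly(3, 16):.3%}")

# Same ratio at a thousand times the scale: a negative exponent in the thousands.
print(f"log10 of exactly 3,000 of 16,000: {fair_log10_chance_exactly(3000, 16000):.0f}")
```

The same ratio of wins goes from merely unlucky to effectively never as the number of games grows, which is the point of the paragraph above.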

Card games offer another way to get comfortable with this idea. A bridge hand, for example, is thirteen cards drawn out of fifty-two. But the chance that you were dealt the hand you just got? Impossibly low. Should we conclude from this all bridge hands are hoaxes? No, but ask my mother sometime about the bridge class she took that one cruise. “Three of sixteen” is too particular; “at best three of sixteen” is a class I can study.
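To put a number on “impossibly low”: a short sketch counting the distinct thirteen-card hands in a standard fifty-two-card deck, assuming the order of cards within a hand doesn’t matter and every hand is equally likely. The function name is invented for this example.

```python
from math import comb


def distinct_hands(hand_size=13, deck_size=52):
    """Number of different hands, ignoring the order the cards arrive in."""
    return comb(deck_size, hand_size)


hands = distinct_hands()
print(f"{hands:,} possible bridge hands")
print(f"chance of any one particular hand: {1 / hands:.3e}")
```

That comes to 635,013,559,600 hands, so any one particular hand has less than two chances in a trillion of being dealt, and yet some hand gets dealt every time.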

Unconvinced? I don’t blame you. I’m not sure I would be convinced of that, but I might allow the argument to continue. I hope you will. So here are the specifics. These are the possible counts of wins, and the chance of having exactly that many wins, for sixteen matches at a 50-50 chance each:

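| Wins | Percentage |
|------|------------|
| 0 | 0.002 % |
| 1 | 0.024 % |
| 2 | 0.183 % |
| 3 | 0.854 % |
| 4 | 2.777 % |
| 5 | 6.665 % |
| 6 | 12.219 % |
| 7 | 17.456 % |
| 8 | 19.638 % |
| 9 | 17.456 % |
| 10 | 12.219 % |
| 11 | 6.665 % |
| 12 | 2.777 % |
| 13 | 0.854 % |
| 14 | 0.183 % |
| 15 | 0.024 % |
| 16 | 0.002 % |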

So the chance of doing as awfully as I had — winning zero or one or two or three games — is pretty dire. It’s a little above one percent.
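These figures can be reproduced in a few lines. A sketch under the model’s assumptions (sixteen independent matches, a 50-50 chance each); the names `GAMES`, `chance_of_exactly`, and `chance_of_at_most` are made up for this example.

```python
from math import comb

GAMES = 16  # matches played in the side tournament


def chance_of_exactly(wins, games=GAMES):
    """Chance of exactly `wins` wins, assuming independent 50-50 matches."""
    return comb(games, wins) / 2**games


def chance_of_at_most(wins, games=GAMES):
    """Chance of `wins` or fewer wins: the 'this bad or worse' figure."""
    return sum(chance_of_exactly(k, games) for k in range(wins + 1))


# One row per possible count of wins, zero through sixteen.
for wins in range(GAMES + 1):
    print(f"{wins:2d}  {chance_of_exactly(wins):7.3%}")

# Zero, one, two, or three wins, added together.
print(f"three or fewer: {chance_of_at_most(3):.3%}")
```

The last line adds up the first four rows: 697 of the 65,536 equally likely win-loss sequences, or about 1.06 percent.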

Is that implausibly low? Is there so small a chance that I’d do so badly that we have to figure I didn’t have a 50-50 chance of winning each game?

I hate to think that. I didn’t think I was outclassed. But here’s a problem. We need some standard for what is “it’s implausibly unlikely that this happened by chance alone”. If there were only one chance in a trillion that someone with a 50-50 chance of winning any game would put in the performance I did, we could suppose that I didn’t actually have a 50-50 chance of winning any game. If there were only one chance in a million of that performance, we might also suppose I didn’t actually have a 50-50 chance of winning any game. But here there was only one chance in a hundred? Is that too unlikely?

It depends. We should have set a threshold for “too implausibly unlikely” before we started research. It’s bad form to decide afterward. There are some thresholds that are commonly taken. Five percent is often useful for stuff where it’s hard to do bigger experiments and the harm of guessing wrong (dismissing the idea I had a 50-50 chance of winning any given game, for example) isn’t so serious. One percent is another common threshold, again used in stuff like psychological studies where it’s hard to get more and more data. In a field like physics, where experiments are relatively cheap to keep running, you can gather enough data to insist on fractions of a percent as your threshold.

In my defense, I thought (without doing the work) that I probably had something like a five percent chance of doing that badly by luck alone. The actual figure, a little above one percent, suggests that I did have a much worse than 50 percent chance of winning any given game.

Is that credible? Well, yeah; I may have been in the top sixteen players in the state. But a lot of those people are incredibly good. Maybe I had only one chance in three, or something like that. That would make the chance I did that poorly something like one in six, likely enough.
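The one-in-six figure can be checked the same way. This sketch assumes independent matches with the same chance of winning each one; `chance_of_at_most` is a name invented here, and `Fraction` is used only so that one chance in three is represented exactly.

```python
from fractions import Fraction
from math import comb


def chance_of_at_most(wins, games, p):
    """Chance of `wins` or fewer wins in `games` independent matches,
    each won with probability `p`."""
    return sum(
        comb(games, k) * p**k * (1 - p) ** (games - k)
        for k in range(wins + 1)
    )


one_in_three = chance_of_at_most(3, 16, Fraction(1, 3))
even_odds = chance_of_at_most(3, 16, Fraction(1, 2))

print(f"one chance in three: {float(one_in_three):.1%}")
print(f"fifty-fifty:         {float(even_odds):.1%}")
```

With one chance in three per game, three or fewer wins comes out near 16.6 percent, close to one in six, against roughly one percent for the fifty-fifty model.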

And it’s also plausible that games are not independent, that whether I win one game depends in some way on whether I won or lost the previous. It does feel like it’s easier to win after a win, or after a close loss. And it feels harder to win a game after a string of losses. I don’t know that this can be proved, not on the meager evidence I have available. But you can almost always question the independence of a string of events like this. It’s the safe bet.

## Reading the Comics, March 6, 2017: Blackboards Edition

I can’t say there’s a compelling theme to the first five mathematically-themed comics of last week. Screens full of mathematics turned up in a couple of them, so I’ll run with that. There were also just enough strips that I’m splitting the week again. It seems fair to me and gives me something to remember Wednesday night that I have to rush to complete.

Jimmy Hatlo’s Little Iodine for the 1st of January, 1956 was rerun on the 5th of March. The setup demands Little Iodine pester her father for help with the “hard homework” and of course it’s arithmetic that gets to play hard work. It’s a word problem in terms of who has how many apples, as you might figure. Don’t worry about Iodine’s father getting fired; Little Iodine gets her father fired every week. It’s their schtick.

Dana Simpson’s Phoebe and her Unicorn for the 5th mentions the “most remarkable of unicorn confections”, a sugar dodecahedron. Dodecahedrons have long captured human imaginations, as one of the Platonic Solids. The Platonic Solids are one of the ways we can make a solid-geometry analogue to a regular polygon. The other shape Phoebe mentions, the cube, is another of the Platonic Solids, but that one’s common enough to encourage no sense of mystery or wonder. The cube’s the only one of the Platonic Solids that will fill space, though, that you can put into stacks that don’t leave gaps between them. Sugar cubes, Wikipedia tells me, have been made only since the 19th century; the Moravian sugar factory director Jakub Kryštof Rad got a patent for cutting block sugar into uniform pieces in 1843. I can’t dispute the fun of “dodecahedron” as a word to say. Many solid-geometric shapes have names that are merely descriptive, but which are rendered with Greek or Latin syllables so as to sound magical.

Bud Grace’s Piranha Club for the 6th started a sequence in which the Future Disgraced Former President needs the most brilliant person in the world, Bud Grace. A word balloon full of mathematics is used as a symbol for this genius. I feel compelled to point out Bud Grace was a physics major. But while Grace could as easily have used something from the physics department to show his deep thinking abilities, that would all but certainly have been rendered as equations and graphs, the stuff of mathematics again.

Scott Meyer’s Basic Instructions rerun for the 6th is aptly titled, “How To Unify Newtonian Physics And Quantum Mechanics”. Meyer’s advice is not bad, really, although generic enough it applies to any attempts to reconcile two different models of a phenomenon. Also there’s not particularly a problem reconciling Newtonian physics with quantum mechanics. It’s general relativity and quantum mechanics that are so hard to reconcile.

Still, Basic Instructions is about how you can do a thing, or learn to do a thing. It’s not about how to allow anything to be done for the first time. And it’s true that, per quantum mechanics, we can’t predict exactly what any one particle will do at any time. We can say what possible things it might do and how relatively probable they are. But big stuff, the stuff for which Newtonian physics is relevant, involves so many particles that the unpredictability becomes too small to notice. We can see this as the Law of Large Numbers. That’s the probability rule that tells us we can’t predict any coin flip, but we know that a million fair tosses of a coin will not turn up 800,000 tails. There’s more to it than that (there’s always more to it), but that’s a starting point.
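How firmly can we say a million fair tosses will not turn up 800,000 tails? Here is a rough sketch, assuming independent tosses of a perfectly fair coin. The function name is invented for this example. It works with logarithms, through the standard library’s `lgamma`, because the probability itself is far too small to store as a floating-point number.

```python
from math import lgamma, log


def log10_chance_exactly(tails, tosses):
    """Base-10 logarithm of the chance of exactly `tails` tails
    in `tosses` independent fair coin tosses."""
    # lgamma(n + 1) is the natural log of n!, so this is ln of the binomial coefficient.
    log_ways = lgamma(tosses + 1) - lgamma(tails + 1) - lgamma(tosses - tails + 1)
    return (log_ways - tosses * log(2)) / log(10)


# An exactly even split is the most likely outcome, and is itself uncommon.
print(f"log10 chance of exactly 500,000 tails: {log10_chance_exactly(500_000, 1_000_000):.2f}")

# 800,000 tails: the exponent runs to tens of thousands.
print(f"log10 chance of exactly 800,000 tails: {log10_chance_exactly(800_000, 1_000_000):.0f}")
```

Each count of tails beyond 800,000 is less than a quarter as likely as the one before it, so the chance of 800,000 or more is less than four-thirds of the chance of exactly 800,000. “Will not” is a fair summary.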

Michael Fry’s Committed rerun for the 6th features Albert Einstein as the icon of genius. Natural enough. And it reinforces this with the blackboard full of mathematics. I’m not sure if that blackboard note of “E = md³” is supposed to be a reference to the famous Far Side panel of Einstein hearing the maid talk about everything being squared away. I’ll take it as such.