## The Summer 2017 Mathematics A To Z: Sárközy’s Theorem

Gaurish, of For the love of Mathematics, gives me another chance to talk number theory today. Let’s see how that turns out.

# Sárközy’s Theorem.

I have two pieces to assemble for this. One is in factors. We can take any counting number, a positive whole number, and write it as the product of prime numbers. 2038 is equal to the prime 2 times the prime 1019. 4312 is equal to 2 raised to the third power times 7 raised to the second times 11. 1040 is 2 to the fourth power times 5 times 13. 455 is 5 times 7 times 13.

There are many ways to divide up numbers like this. Here’s one. Is there a square number among its factors? 2038 and 455 don’t have any. They’re each a product of prime numbers that are never repeated. 1040 has a square among its factors. 2 times 2 divides into 1040. 4312, similarly, has a square: we can write it as 2 squared times 2 times 7 squared times 11. So that is my first piece. We can divide counting numbers into squarefree and not-squarefree.
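
The sorting above is easy to check by machine. Here is a minimal Python sketch, assuming plain trial division is fast enough for numbers this small. The function names `prime_factors` and `is_squarefree` are my own labels, not anything standard.

```python
def prime_factors(n: int) -> dict[int, int]:
    """Return {prime: exponent} for a counting number n, by trial division."""
    if n < 1:
        raise ValueError("n must be a positive whole number")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever is left over is itself a prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_squarefree(n: int) -> bool:
    """True when no prime appears more than once in the factorization."""
    return all(exponent == 1 for exponent in prime_factors(n).values())


for n in (2038, 4312, 1040, 455):
    print(n, prime_factors(n), is_squarefree(n))
```

Run on the four examples, this should report 2038 and 455 as squarefree and 4312 and 1040 as not.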

The other piece is in binomial coefficients. These are numbers, often quite big numbers, that get dumped on the high school algebra student as she tries to work with some expression like $(a + b)^n$. They’re also dumped on the poor student in calculus, as something about Newton’s binomial coefficient theorem. Which we hear is something really important. In my experience it wasn’t explained why this should rank up there with, like, the differential calculus. (Spoiler: it’s because of polynomials.) But it’s got some great stuff to it.

Binomial coefficients are among those utility players in mathematics. They turn up in weird places. In dealing with polynomials, of course. They also turn up in combinatorics, and through that, probability. If you run, for example, 10 experiments each of which could succeed or fail, the chance you’ll get exactly five successes is going to be proportional to one of these binomial coefficients. That they touch on polynomials and probability is a sign we’re looking at a thing woven into the whole universe of mathematics. We saw them some in talking, last A-To-Z around, about Yang Hui’s Triangle. That’s also known as Pascal’s Triangle. It has more names too, since it’s been found many times over.
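
To make the ten-experiments example concrete, here is a small Python sketch. It assumes the experiments are independent and each succeeds with the same chance `p`; the helper name `binomial_pmf` is my own.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Chance of exactly k successes in n independent tries, each with chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The binomial coefficient "10 choose 5" counts the ways to place the successes.
print(comb(10, 5))                # 252
# With a fair 50-50 chance per experiment this is 252 / 1024.
print(binomial_pmf(5, 10, 0.5))   # 0.24609375
```

The coefficient 252 is the "proportional to" part; the powers of `p` and `1 - p` supply the rest.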

The theorem under discussion is about central binomial coefficients. These are one specific coefficient in a row. The ones that appear, in the triangle, along the line of symmetry. They’re easy to describe in formulas. For a whole number ‘n’ that’s greater than or equal to zero, evaluate what we call 2n choose n:

${{2n} \choose{n}} = \frac{(2n)!}{(n!)^2}$

If ‘n’ is zero, this number is $\frac{0!}{(0!)^2}$ or 1. If ‘n’ is 1, this number is $\frac{2!}{(1!)^2}$ or 2. If ‘n’ is 2, this number is $\frac{4!}{(2!)^2}$ or 6. If ‘n’ is 3, this number is (sparing the formula) 20. The numbers keep growing. 70, 252, 924, 3432, 12870, and so on.

So. 1 and 2 and 6 are squarefree numbers. Not much arguing that. But 20? That’s 2 squared times 5. 70? 2 times 5 times 7. 252? 2 squared times 3 squared times 7. 924? That’s 2 squared times 3 times 7 times 11. 3432? 2 cubed times 3 times 11 times 13; there’s a 2 squared in there. 12870? 2 times 3 squared times it doesn’t matter anymore. It’s not a squarefree number.
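
The list above can be extended by machine. This is a rough Python sketch, assuming `math.comb` (Python 3.8 or later); `has_square_factor` is a name I made up for the test. It only checks a finite range of ‘n’, so it illustrates the pattern rather than proving anything.

```python
from math import comb


def has_square_factor(m: int) -> bool:
    """True if some prime squared divides m."""
    p = 2
    while p * p <= m:
        if m % (p * p) == 0:
            return True
        if m % p == 0:      # p divides m exactly once; strip it and move on
            m //= p
        p += 1
    return False


central = [comb(2 * n, n) for n in range(9)]
print(central)  # [1, 2, 6, 20, 70, 252, 924, 3432, 12870]

# Which n below 100 give a squarefree central binomial coefficient?
squarefree_n = [n for n in range(100) if not has_square_factor(comb(2 * n, n))]
print(squarefree_n)  # [0, 1, 2, 4]
```

The stripping step keeps the search short: every prime factor of 2n choose n is at most 2n, so the loop never has to wander far.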

There’s a bunch of not-squarefree numbers in there. The question: do we ever stop seeing squarefree numbers here?

So here’s Sárközy’s Theorem. It says that this central binomial coefficient ${{2n} \choose{n}}$ is never squarefree as long as ‘n’ is big enough. András Sárközy showed in 1985 that this was true. How big is big enough? … We have a bound, at least, for this theorem. If ‘n’ is larger than the number $2^{8000}$ then the corresponding coefficient can’t be squarefree. It might not surprise you that the formulas involved here feature the Riemann Zeta function. That always seems to turn up for questions about large prime numbers.

That’s a common state of affairs for number theory problems. Very often we can show that something is true for big enough numbers. I’m not sure there’s a clear reason why. When numbers get large enough it can be more convenient to deal with their logarithms, I suppose. And those look more like the real numbers than the integers. And real numbers are typically easier to prove stuff about. Maybe that’s it. This is vague, yes. But to ask ‘why’ some things are easy and some are hard to prove is a hard question. What is a satisfying ‘cause’ here?

It’s tempting to say that since we know this is true for all ‘n’ above a bound, we’re done. We can just test all the numbers below that bound, and the rest is done. You can do a satisfying proof this way: show that eventually the statement is true, and show all the special little cases before it is. This particular result is kind of useless, though. $2^{8000}$ is a number that’s 2,409 digits long. For comparison, the total number of things in the universe is something like a number about 80 digits long. Certainly not more than 90. It’d take too long to test all those cases.

That’s all right. Since Sárközy’s proof in 1985 there’ve been other breakthroughs. In 1988 P. Goetgheluck proved it was true for a big range of numbers: every ‘n’ that’s larger than 4 and less than $2^{42,205,184}$. That’s a number something more than 12 million digits long. In 1991 I. Vardi proved we had no squarefree central binomial coefficients for ‘n’ greater than 4 and less than $2^{774,840,978}$, which is a number about 233 million digits long. And then in 1996 Andrew Granville and Olivier Ramaré showed directly that this was so for all ‘n’ larger than 4.
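
The sizes of these bounds are easy to estimate without writing the numbers out. The sketch below assumes the usual fact that a positive whole number N has floor(log10(N)) + 1 digits; the function name is my own. It works in floating point, so it could be off by one if the logarithm landed extremely close to a whole number, which does not happen for these three exponents.

```python
from math import floor, log10


def digits_in_power_of_two(k: int) -> int:
    """Number of decimal digits of 2**k, estimated from k * log10(2)."""
    return floor(k * log10(2)) + 1


print(digits_in_power_of_two(8000))         # 2409
print(digits_in_power_of_two(42_205_184))   # a bit over 12.7 million
print(digits_in_power_of_two(774_840_978))  # a bit over 233 million
```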

So that 70 that turned up just a few lines in is the last squarefree one of these coefficients.

Is this surprising? Maybe, maybe not. I’ll bet most of you didn’t have an opinion on this topic twenty minutes ago. Let me share something that did surprise me, and continues to surprise me. In 1974 David Singmaster proved that any integer divides almost all the binomial coefficients out there. “Almost all” is here a term of art, but it means just about what you’d expect. Imagine the giant list of all the numbers that can be binomial coefficients. Then pick any positive integer you like. The number you picked will divide into so many of the giant list that the exceptions won’t be noticeable. So that square numbers like 4 and 9 and 16 and 25 should divide into most binomial coefficients? … That’s to be expected, suddenly. Into the central binomial coefficients? That’s not so obvious to me. But then so much of number theory is strange and surprising and not so obvious.

• #### gaurish 2:56 pm on Tuesday, 12 September, 2017

Nice exposition, like always :-) Another place where this central binomial coefficient appears is in Paul Erdős’s proof of Bertrand’s postulate: https://en.wikipedia.org/wiki/Proof_of_Bertrand%27s_postulate

• #### Joseph Nebus 1:39 am on Friday, 15 September, 2017

Thank you. And I’m not sure how I overlooked that, since Bertrand’s Postulate is such a nice, easy-to-understand result. (The Postulate, which can be proven, is that there is always at least one prime between a whole number ‘n’ and its double, ‘2n’. With Sarkozy’s Theorem you can show this has to be true for numbers larger than 468. For the numbers from 1 up to 468, you can just check each case. It’s time-consuming but not hard.)

## The Summer 2017 Mathematics A To Z: Quasirandom numbers

Gaurish, host of For the love of Mathematics, gives me the excuse to talk about amusement parks. You may want to brace yourself. Yes, this essay includes a picture. It would have included a video if I had enough WordPress privileges for that.

# Quasirandom numbers.

Think of a merry-go-round. Or carousel, if you prefer. I will venture a guess. You might like merry-go-rounds. They’re beautiful. They can evoke happy thoughts of childhood when they were a big ride it was safe to go on. But they don’t often make one think of thrills. They’re generally sedate things. They don’t need to be. There’s no great secret to making a carousel a thrill ride. They knew it a century ago, when all the great American carousels were carved. It’s simple. Make the thing spin fast enough, at the five or six rotations per minute the ride was made for. There are places that do this yet. There’s the Cedar Downs ride at Cedar Point, Sandusky, Ohio. There’s the antique carousel at Crossroads Village, a historical village/park just outside Flint, Michigan. There’s the Derby Racer at Playland in Rye, New York. There’s the carousel in the Merry-Go-Round Museum in Sandusky, Ohio. Any of them are great rides. Two of them have a special edge. I’ll come back to them.

Rye (New York) Playland Amusement Park’s is the fastest carousel I’m aware of running. Riders are warned ahead of time to sit so they’re leaning to the left, and the ride will not get up to full speed until the ride operator checks everyone during the ride. To get some idea of its speed, notice the ride operator on the left and how far she leans. She’s not being dramatic; that’s the natural stance. Also the tilt in the carousel’s floor is not camera trickery; it does lean like that. If you have a spare day in the New York City area and any interest in classic amusement parks, this is worth the trip.

Randomness is a valuable resource. We know it’s key to many things. We have major fields of mathematics built on it. We can understand the behavior of variables without ever knowing what value they have. All we need is to know the chance they might be in some particular range. This makes possible all kinds of problems too complicated to do otherwise. We know it’s critical. Quantum mechanics would not work without randomness. Without quantum mechanics, matter doesn’t work. And that’s true randomness, the kind where something is unpredictable. It’s not the kind of randomness we talk about when we ask, say, what’s the chance someone was born on a Tuesday. That’s mere hidden information: if we knew the month and date and year of a person’s birth we would know whether they were born Tuesday or not. We need more.

So the trouble is actually getting a random number. Well, a sequence of randomly drawn numbers. We rarely need this if we’re doing analysis. We can understand how some process changes the shape of a distribution without ever using the distribution. We can take derivatives of a function without ever evaluating the original function, after all.

But we do need randomly drawn numbers. We do too much numerical work with them. For example, it’s impossible to exactly integrate most functions. Numerical methods can take a ferociously long time to evaluate. A family of methods called Monte Carlo rely on randomly-drawn values to estimate the integral. The results are strikingly good for the work required. But they must have random numbers. The name “Monte Carlo” is not some cryptic code. It is an expression of how randomly drawn numbers make the tool work.
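
Here is a bare-bones Python sketch of the idea, not a production integrator. It assumes the simplest Monte Carlo scheme: average the function at uniformly drawn points and scale by the width of the interval. The test function, the seed, and the name `monte_carlo_integral` are all my own choices for illustration.

```python
import math
import random


def monte_carlo_integral(f, a, b, samples, rng):
    """Estimate the integral of f from a to b by averaging f at random points."""
    total = sum(f(rng.uniform(a, b)) for _ in range(samples))
    return (b - a) * total / samples


# exp(-x^2) has no elementary antiderivative; its integral from 0 to 1
# is about 0.746824.
rng = random.Random(2017)  # fixed seed so the run is repeatable
estimate = monte_carlo_integral(lambda x: math.exp(-x * x), 0.0, 1.0, 100_000, rng)
print(estimate)
```

The error shrinks roughly like one over the square root of the number of samples, which is slow in one dimension but does not get worse as the number of dimensions grows.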

It’s hard to get random numbers. Consider: we can’t write an algorithm to do it. If we were to write one, then we’d be able to predict what the sequence of numbers would be. We have some recourse. We could set up instruments to rely on the randomness that seems to be in the world. Thermal fluctuations, for example, created by processes outside any computer’s control, can give us a pleasant dose of randomness. If we need higher-quality random numbers than that we can go to exotic equipment. Geiger counters watching the decay of a not-alarmingly-radioactive sample. Cosmic ray detectors watching the sky.

Or we can write something that produces numbers that look random enough. They won’t really be random, and if we wait long enough we’ll notice the sequence repeats itself. But if we only need, say, ten numbers, who cares if the sequence will repeat after ten million numbers? (We’ll surely need more than ten numbers. But we can postpone the repetition until we’ve drawn far more than ten million numbers.)
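
One classic way to do this is a linear congruential generator. The sketch below is a toy, with a deliberately tiny modulus of 16 so the repetition is visible; the constants are picked only for illustration, and real generators use far larger ones. The names `lcg` and `period` are mine.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield x -> (a*x + c) mod m forever. Looks random-ish, is fully determined."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def period(seed, a, c, m):
    """Count how many steps pass before the generator revisits a value."""
    seen = {}
    for step, value in enumerate(lcg(seed, a, c, m)):
        if value in seen:
            return step - seen[value]
        seen[value] = step


print(list(islice(lcg(1, 5, 3, 16), 16)))  # every value from 0 to 15, once each
print(period(1, 5, 3, 16))                 # 16
```

After sixteen draws this toy starts over exactly where it began. Scaling the modulus up pushes that repetition out past anything we would ever draw.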

Two of the carousels I’ve mentioned have an astounding property. The horses in a file move. I mean, relative to each other. Some horse will start the race in front of its neighbors; some will start behind. The four move forward and back thanks to a mechanism of, I am assured, staggering complexity. There are only three carousels in the world that have it. There’s Cedar Downs at Cedar Point in Sandusky, Ohio; the Derby Racer at Playland in Rye, New York; and the Derby Racer at Blackpool Pleasure Beach in Blackpool, England. The mechanism in Blackpool’s hasn’t operated in years. The one at Playland had not run in years, but was restored for the 2017 season. My love and I made a trip specifically to ride that. (You may have heard of a fire at the carousel in Playland this summer. This was part of the building for their other, non-racing, antique carousel. My last information was that the carousel itself was all right.)

These racing derbies have the horses in a file move forward and back in a “random” way. It’s not truly random. If you knew exactly which gears were underneath each horse, and where in their rotations they were, you could say which horse was about to gain on its partners and which was about to fall back. But all that is concealed from the rider. The horse patterns will eventually, someday, repeat. If the gear cycles aren’t interrupted by maintenance or malfunctions. But nobody’s going to ride any horse long enough to notice. We have in these rides a randomness as good as what your computer makes, at least for the purpose it serves.

The racing nature of Playland’s and Cedar Point’s derby racers means that every ride includes exciting extra moments of overtaking or falling behind your partners to the side. It also means quarreling with your siblings about who really won the race because your horse started like four feet behind your sister’s and it ended only two feet behind so hers didn’t beat yours and, long story short, there was some punching, there was some spitting, and now nobody is gonna be allowed to get ice cream at the Carvel’s (for Playland) or cheese on a stick (for Cedar Point). This is the Cedar Downs ride at Cedar Point, and focuses on the poles that move the horses.

What does it mean to look random? Some things seem obvious. All the possible numbers ought to come up, sooner or later. Any particular possible number shouldn’t repeat too often. Any particular possible number shouldn’t go too long without repeating. There shouldn’t be clumps of numbers; if, say, ‘4’ turns up, we shouldn’t see ‘5’ turn up right away all the time.

We can make the idea of “looking” random quite literal. Suppose we’re selecting numbers from 0 through 9. We can draw the random numbers we’ve picked. Use the numbers as coordinates. Say we pick four digits: 1, 3, 9, and 0. Then draw the point that’s at x-coordinate 13, y-coordinate 90. Then the next four digits. Let’s say they’re 4, 2, 3, and 8. Then draw the point that’s at x-coordinate 42, y-coordinate 38. And repeat. What will this look like?

If it clumps up, we probably don’t have good random numbers. If we see lines that points collect along, or avoid, there’s a good chance our numbers aren’t very random. If there’s whole blocks of space that they occupy, and others they avoid, we may have a defective source of random numbers. We should expect the points to cover a space pretty uniformly. (There are more rigorous, logically sound, methods. The eye can be fooled easily enough. But it’s the same principle. We have some test that notices clumps and gaps.) But …
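
The digits-to-points recipe and a crude clump test can be sketched in a few lines of Python. This is only an illustration of the principle: it bins the points into ten-by-ten blocks and reports the emptiest and fullest block. The function names and the block size are my own assumptions, and a serious test would use proper statistics rather than eyeballing two counts.

```python
import random
from collections import Counter


def digits_to_points(digits):
    """Group digits four at a time: two make the x-coordinate, two the y."""
    points = []
    for i in range(0, len(digits) - 3, 4):
        x = 10 * digits[i] + digits[i + 1]
        y = 10 * digits[i + 2] + digits[i + 3]
        points.append((x, y))
    return points


def cell_counts(points, cell=10):
    """Count how many points fall in each cell-by-cell block of the square."""
    return Counter((x // cell, y // cell) for x, y in points)


print(digits_to_points([1, 3, 9, 0, 4, 2, 3, 8]))  # [(13, 90), (42, 38)]

rng = random.Random(0)
digits = [rng.randrange(10) for _ in range(40_000)]
counts = cell_counts(digits_to_points(digits))
# 10,000 points over 100 blocks: each block should hold roughly 100.
print(min(counts.values()), max(counts.values()))
```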

The thing is, there’s always going to be some clumps. There’ll always be some gaps. Part of randomness is that it forms patterns, or at least things that look like patterns to us. We can describe how big a clump (or gap; it’s the same thing, really) is for any particular quantity of randomly drawn numbers. If we see clumps bigger than that we can throw out the numbers as suspect. But … still …

Toss a coin fairly twenty times, and there’s no reason it can’t turn up tails sixteen times. This doesn’t happen often, but it will happen sometimes. Just luck. This surplus of tails should evaporate as we take more tosses. That is, we most likely won’t see 160 tails out of 200 tosses. We certainly will not see 1,600 tails out of 2,000 tosses. We know this as the Law of Large Numbers. Wait long enough and weird fluctuations will average out.
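
How unusual is that surplus? Here is a short Python sketch that counts outcomes exactly, assuming a fair coin and independent tosses. The function name `chance_at_least` is mine.

```python
from fractions import Fraction
from math import comb


def chance_at_least(k_min: int, n: int) -> Fraction:
    """Exact chance of k_min or more tails in n fair, independent tosses."""
    favorable = sum(comb(n, k) for k in range(k_min, n + 1))
    return Fraction(favorable, 2**n)


print(float(chance_at_least(16, 20)))      # about 0.0059, roughly 1 in 170
print(float(chance_at_least(160, 200)))    # vanishingly small
print(float(chance_at_least(1600, 2000)))  # smaller still
```

Strictly, none of these chances is zero. They just shrink so fast that "certainly will not" is a fair summary.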

What if we don’t have time, though? For coin-tossing that’s silly; of course we have time. But for Monte Carlo integration? It could take too long to be confident we haven’t got too-large gaps or too-tight clusters.

This is why we take quasi-random numbers. We begin with what randomness we’re able to manage. But we massage it. Imagine our coins example. Suppose after ten fair tosses we noticed there had been eight tails turn up. Then we would start tossing less fairly, trying to make heads more common. We would be happier if there were 12 rather than 16 tails after twenty tosses.

Draw the results. We get now a pattern that looks still like randomness. But it’s a finer sorting; it looks like static tidied up some. The quasi-random numbers are not properly random. Knowing that, say, the last several numbers were odd means the next one is more likely to be even, the Gambler’s Fallacy put to work. But in aggregate, we trust, we’ll be able to enjoy the speed and power of randomly-drawn numbers. It shows its strengths when we don’t know just how finely we must sample a range of numbers to get good, reliable results.
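
The coin-massaging picture is informal. One standard construction in the same spirit, which I am swapping in here as an illustration, is the van der Corput sequence: write the counting numbers in base 2 and mirror their digits across the radix point. The sketch below compares it with ordinary pseudorandom draws on a simple integral. The test function and sample size are arbitrary choices of mine.

```python
import random


def van_der_corput(i: int, base: int = 2) -> float:
    """i-th term: the base-b digits of i, reflected about the radix point."""
    q, denom = 0.0, 1.0
    while i > 0:
        i, digit = divmod(i, base)
        denom *= base
        q += digit / denom
    return q


print([van_der_corput(i) for i in range(1, 8)])
# [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]

# Estimate the integral of x^2 on [0, 1], which is exactly 1/3.
n = 1024
quasi_estimate = sum(van_der_corput(i) ** 2 for i in range(1, n + 1)) / n
rng = random.Random(2017)
pseudo_estimate = sum(rng.random() ** 2 for _ in range(n)) / n
print(abs(quasi_estimate - 1 / 3), abs(pseudo_estimate - 1 / 3))
```

Each new point lands in the biggest gap left by the earlier ones, so the sequence fills the interval evenly at every stage. It is also completely predictable, which is the price paid.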

To carousels. I don’t know whether the derby racers have quasirandom outcomes. I would find believable someone telling me that all the possible orderings of the four horses in any file are equally likely. To know would demand detailed knowledge of how the gearing works, though. Also probably simulations of how the system would work if it ran long enough. It might be easier to watch the ride for a couple of days and keep track of the outcomes. If someone wants to sponsor me doing a month-long research expedition to Cedar Point, drop me a note. Or just pay for my season pass. You folks would do that for me, wouldn’t you? Thanks.

• #### gaurish 6:55 pm on Wednesday, 6 September, 2017

• #### Joseph Nebus 1:22 am on Friday, 8 September, 2017

This was actually an analogy I had waiting to be unleashed. I’d been thinking about using the racing derbies as an exciting case for pseudorandom numbers for ages, and this gave me the excuse to actually do it.

If I figure out how to upload videos I might do another essay about making pseudorandom sequences of numbers. I’ve got the movie footage of the Cedar Point and the Playland derbies. (Blackpool’s I visited with a barely-functional camera; it had gotten soaked in heavy rains a few days earlier. So I have precious few pictures of Blackpool Pleasure Beach and d’Efteling in the Netherlands. But that just gives me a pretext to go back and revisit both places.)

## The Summer 2017 Mathematics A To Z: Benford's Law

Today’s entry in the Summer 2017 Mathematics A To Z is one for myself. I couldn’t post this any later.

# Benford’s Law.

My car’s odometer first read 9 on my final test drive before buying it, in June of 2009. It flipped over to 10 barely a minute after that, somewhere near Jersey Freeze ice cream parlor at what used to be the Freehold Traffic Circle. Ask a Central New Jersey person of sufficient vintage about that place. Its odometer read 90 miles sometime that weekend, I think while I was driving to The Book Garden on Route 537. Ask a Central New Jersey person of sufficient reading habits about that place. It’s still there. It flipped over to 100 sometime when I was driving back later that day.

The odometer read 900 about two months after that, probably while I was driving to work, as I had a longer commute in those days. It flipped over to 1000 a couple days after that. The odometer first read 9,000 miles sometime in spring of 2010 and I don’t remember what I was driving to for that. It flipped over from 9,999 to 10,000 miles several weeks later, as I pulled into the car dealership for its scheduled servicing. Yes, this kind of impressed the dealer that I got there exactly on the round number.

The odometer first read 90,000 in late August of last year, as I was driving to some competitive pinball event in western Michigan. It’s scheduled to flip over to 100,000 miles sometime this week as I get to the dealer for its scheduled maintenance. While cars have gotten to be much more reliable and durable than they used to be, the odometer will never flip over to 900,000 miles. At least I can’t imagine owning it long enough, at my rate of driving the past eight years, that this would ever happen. It’s hard to imagine living long enough for the car to reach 900,000 miles. Thursday or Friday it should flip over to 100,000 miles. The leading digit on the odometer will be 1 or, possibly, 2 for the rest of my association with it.

The point of this little autobiography is this observation. Imagine all the days that I have owned this car, from sometime in June 2009 to whatever day I sell, lose, or replace it. Pick one. What is the leading digit of my odometer on that day? It could be anything from 1 to 9. But it’s more likely to be 1 than it is 9. Right now it’s as likely to be any of the digits. But after this week the chance of ‘1’ being the leading digit will rise, and become rather more likely than that of ‘9’. And it’ll never lose that edge.

This is a reflection of Benford’s Law. It is named, as most mathematical things are, imperfectly. The law-namer was Frank Benford, a physicist, who in 1938 published a paper The Law Of Anomalous Numbers. It confirmed the observation of Simon Newcomb. Newcomb was a 19th century astronomer and mathematician of an exhausting number of observations and developments. Newcomb observed the logarithm tables that anyone who needed to compute referred to often. The earlier pages were more worn-out and dirty and damaged than the later pages. People worked with numbers that start with ‘1’ more than they did numbers starting with ‘2’. And more those that start ‘2’ than start ‘3’. More that start with ‘3’ than start with ‘4’. And on. Benford showed this was not some fluke of calculations. It turned up in bizarre collections of data. The surface areas of rivers. The populations of thousands of United States municipalities. Molecular weights. The digits that turned up in an issue of Reader’s Digest. There is a bias in the world toward numbers that start with ‘1’.

And this is, prima facie, crazy. How can the surface areas of rivers somehow prefer to be, say, 100-199 hectares instead of 500-599 hectares? A hundred is a human construct. (Indeed, it’s many human constructs.) That we think ten is an interesting number is an artefact of our society. To think that 100 is a nice round number and that, say, 81 or 144 are not is a cultural choice. Grant that the digits of street addresses of people listed in American Men of Science — one of Benford’s data sources — have some cultural bias. How can another of his sources, molecular weights, possibly?

The bias sneaks in subtly. Don’t they all? It lurks at the edge of the table of data. The table header, perhaps, where it says “River Name” and “Surface Area (sq km)”. Or at the bottom where it says “Length (miles)”. Or it’s never explicit, because I take for granted people know my car’s mileage is measured in miles.

What would be different in my introduction if my car were Canadian, and the odometer measured kilometers instead? … Well, I’d not have driven the 9th kilometer; someone else doing a test-drive would have. The 90th through 99th kilometers would have come a little earlier that first weekend. The 900th through 999th kilometers too. I would have passed the 99,999th kilometer years ago. In kilometers my car has been in the 100,000s for something like four years now. It’s less absurd that it could reach the 900,000th kilometer in my lifetime, but that still won’t happen.

What would be different is the precise dates about when my car reached its milestones, and the amount of days it spent in the 1’s and the 2’s and the 3’s and so on. But the proportions? What fraction of its days it spends with a 1 as the leading digit versus a 2 or a 5? … Well, that’s changed a little bit. There is some final mile, or kilometer, my car will ever register and it makes a little difference whether that’s 239,000 or 385,000. But it’s only a little difference. It’s the difference in how many times a tossed coin comes up heads on the first 1,000 flips versus the second 1,000 flips. They’ll be different numbers, but not that different.

What’s the difference between a mile and a kilometer? A mile is longer than a kilometer, but that’s it. They measure the same kinds of things. You can convert a measurement in miles to one in kilometers by multiplying by a constant. We could as well measure my car’s odometer in meters, or inches, or parsecs, or lengths of football fields. The difference is what number we multiply the original measurement by. We call this “scaling”.

Whatever we measure, in whatever unit we measure, has to have a leading digit of something. So it’s got to have some chance of starting out with a ‘1’, some chance of starting out with a ‘2’, some chance of starting out with a ‘3’, and so on. But that chance can’t depend on the scale. Measuring something in smaller or larger units doesn’t change the proportion of how often each leading digit is there.

These facts combine to imply that leading digits follow a logarithmic-scale law. The leading digit should be a ‘1’ something like 30 percent of the time. And a ‘2’ about 18 percent of the time. A ‘3’ about one-eighth of the time. And it decreases from there. ‘9’ gets to take the lead a meager 4.6 percent of the time.
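
The logarithmic law is usually written as: the chance of leading digit d is log10(1 + 1/d). Here is a small Python sketch that tabulates it and compares it against the leading digits of the first thousand powers of 2. The powers of 2 are my own choice of test data, picked because they sprawl over many orders of magnitude.

```python
from collections import Counter
from math import log10


def benford_probability(d: int) -> float:
    """Chance that the leading digit is d under Benford's Law."""
    return log10(1 + 1 / d)


# Leading digits of 2, 4, 8, ..., 2**1000.
observed = Counter(int(str(2**k)[0]) for k in range(1, 1001))

for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}", f"{observed[d] / 1000:.1%}")
```

The first column of percentages runs from about 30.1 percent for ‘1’ down to about 4.6 percent for ‘9’, and the powers of 2 track it closely.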

Roughly. It’s not going to be so all the time. Measure the heights of humans in meters and there’ll be far more leading digits of ‘1’ than we should expect, as most people are between 1 and 2 meters tall. Measure them in feet and ‘5’ and ‘6’ take a great lead. The law works best when data can sprawl over many orders of magnitude. If we lived in a world where people could as easily be two inches as two hundred feet tall, Benford’s Law would make more accurate predictions about their heights. That something is a mathematical truth does not mean it’s independent of all reason.

For example, the reader thinking back some may be wondering: granted that molecular weights and river areas and populations carry units with them that create this distribution. How do street addresses, one of Benford’s observed sources, carry any unit? Well, street addresses are, at least in United States custom, a loose measure of distance. The 100 block (for example) of a street is within one … block … from whatever the more important street or river crossing that street is. The 900 block is farther away.

This extends further. Block numbers are proxies for distance from the major cross feature. House numbers on the block are proxies for distance from the start of the block. We have a better chance to see street number 418 than 1418, to see 418 than 488, or to see 418 than to see 1488. We can look at Benford’s Law in the second and third and other minor digits of numbers. But we have to be more cautious. There is more room for variation and quirk events. A block-filling building in the downtown area can take whatever street number the owners think most auspicious. Smaller samples of anything are less predictable.

Nevertheless, Benford’s Law has become famous to forensic accountants the past several decades, if we allow the use of the word “famous” in this context. But its fame is thanks to the economists Hal Varian and Mark Nigrini. They observed that real-world financial data should be expected to follow this same distribution. If they don’t, then there might be something suspicious going on. This is not an ironclad rule. There might be good reasons for the discrepancy. If your work trips are always to the same location, and always for one week, and there’s one hotel it makes sense to stay at, and you always learn you’ll need to make the trips about one month ahead of time, of course the hotel bill will be roughly the same. Benford’s Law is a simple, rough tool, a way to decide what data to scrutinize for mischief. With this in mind I trust none of my readers will make the obvious leading-digit mistake when padding their expense accounts anymore.
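
A rough version of that screening can be sketched in Python. Everything here is an assumption for illustration: the chi-square style score is one common way to measure the mismatch, not necessarily what any accountant uses, and both data sets are synthetic, generated in the code rather than taken from real books.

```python
from collections import Counter
from math import log10


def leading_digit(x: float) -> int:
    """First significant digit of a nonzero number."""
    x = abs(x)
    if x == 0:
        raise ValueError("zero has no leading digit")
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)


def mismatch_with_benford(values) -> float:
    """Chi-square style score: larger means further from Benford's Law."""
    counts = Counter(leading_digit(v) for v in values)
    n = len(values)
    score = 0.0
    for d in range(1, 10):
        expected = n * log10(1 + 1 / d)
        score += (counts[d] - expected) ** 2 / expected
    return score


# Synthetic, made-up data: steady 3 percent growth sprawls over many
# orders of magnitude, while the second list huddles between 400 and 499.
growing = [100 * 1.03**k for k in range(500)]
huddled = [400 + k % 100 for k in range(500)]
print(mismatch_with_benford(growing), mismatch_with_benford(huddled))
```

A high score is a reason to look closer, nothing more. The hotel-bill example above would score badly and be perfectly innocent.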

Since I’ve done you that favor, anyone out there think they can pick me up at the dealer’s Thursday, maybe Friday? Thanks in advance.

• #### ivasallay 6:12 pm on Wednesday, 2 August, 2017

Fascinating. I’ve never given this much thought, but it makes sense. Clearly, given any random whole number greater than 9, there will be at least as many numbers less than it that start with a 1 than any other number, too.

Back to your comment about odometers. We owned a van until it started costing more in repairs than most people pay in car payments. The odometer read something like 97,000 miles. We should have suspected from the beginning that it wasn’t made to last because IF it had made it to 99,999, it would then start over at 00,000.

• #### Joseph Nebus 6:40 pm on Wednesday, 2 August, 2017

Thank you. This is one of my favorite little bits of mathematics because it is something lurking around us all the time, just waiting to be discovered, and it’s really there once we try measuring things.

I’m amused to hear of a car with that short an odometer reel. I do remember thinking as a child that there was trouble if a car’s odometer rolled past 999,999. My father I remember joking that when that happened you had a brand-new car. I also remember hearing vaguely of flags that would drop beside the odometer reels if that ever happened.

Electromechanical and early solid-state pinball machines, with scoring reels or finitely many digits to display a score, can have this problem happen. Some of them handle it by having a light turn on to show, say, ‘100,000’ above the score and which does nothing to help with someone who rolls the score twice. Some just shrug and give up; when I’ve rolled our home Tri-Zone machine, its score just goes back to the 000,000 mark. Some of the pinball machines made by European manufacturer Zaccaria in the day would have the final digit — fixed at zero by long pinball custom — switch to a flashing 1, or (I trust) 2, or 3, or so on. It’s a bit odd to read at first, but it’s a good way to make the rollover problem a much better one to have.

## Reading the Comics, April 29, 2017: The Other Half Of The Week Edition

I’d been splitting Reading the Comics posts between Sunday and Thursday to better space them out. But I’ve got something prepared that I want to post Thursday, so I’ll bump this up. Also I had it ready to go anyway so I don’t gain anything putting it off another two days.

Bill Amend’s FoxTrot Classics for the 27th reruns the strip for the 4th of May, 2006. It’s another probability problem, in its way. Assume Jason is honest in reporting whether Paige has picked his number correctly. Assume that Jason picked a whole number. (This is, I think, the weakest assumption. I know Jason Fox’s type and he’s just the sort who’d pick an obscure transcendental number. They’re all obscure after π and e.) Assume that Jason is equally likely to pick any of the whole numbers from 1 to one billion. Then, knowing nothing about what numbers Jason is likely to pick, Paige would have one chance in a billion of picking his number too. Might as well call it certainty that she’ll pay a dollar to play the game. How much would she have to get, in case of getting the number right, to come out even or ahead? … And now we know why Paige is still getting help on probability problems in the 2017 strips.

Jeff Stahler’s Moderately Confused for the 27th gives me a bit of a break by just being a snarky word problem joke. The student doesn’t even have to resist it any.

Sandra Bell-Lundy’s Between Friends for the 29th of April, 2017. And while it’s not a Venn Diagram I’m not sure of a better way to visually represent what the cartoonist is going for. I suppose the intended meaning comes across cleanly enough and that’s the most important thing. It’s a strange state of affairs is all.

Sandra Bell-Lundy’s Between Friends for the 29th also gives me a bit of a break by just being a Venn Diagram-based joke. At least it’s using the shape of a Venn Diagram to deliver the joke. It’s not really got the right content.

Harley Schwadron’s 9 to 5 for the 29th is this week’s joke about arithmetic versus propaganda. It’s a joke we’re never really going to be without again.

## Reading the Comics, April 24, 2017: Reruns Edition

I went a little wild explaining the first of last week’s mathematically-themed comic strips. So let me split the week between the strips that I know to have been reruns and the ones I’m not so sure were.

Bill Amend’s FoxTrot for the 23rd — not a rerun; the strip is still new on Sundays — is a probability question. And a joke about story problems with relevance. Anyway, the question uses the binomial distribution. I know that because the question is about doing a bunch of things, homework questions, each of which can turn out one of two ways, right or wrong. It’s supposed to be equally likely to get the question right or wrong. It’s a little tedious but not hard to work out the chance of getting exactly six problems right, or exactly seven, or exactly eight, or so on. To work out the chance of getting six or more questions right — the problem given — there’s two ways to go about it.

One is the conceptually easy but tedious way. Work out the chance of getting exactly six questions right. Work out the chance of getting exactly seven questions right. Exactly eight questions. Exactly nine. All ten. Add these chances up. You’ll get to a number slightly below 0.377. That is, Mary Lou would have just under a 37.7 percent chance of passing. The answer’s right and it’s easy to understand how it’s right. The only drawback is it’s a lot of calculating to get there.

So here’s the conceptually harder but faster way. It works because the problem says Mary Lou is as likely to get a problem wrong as right. So she’s as likely to get exactly ten questions right as exactly ten wrong. And as likely to get at least nine questions right as at least nine wrong. To get at least eight questions right as at least eight wrong. You see where this is going: she’s as likely to get at least six right as to get at least six wrong.

There’s exactly three possibilities for a ten-question assignment like this. She can get four or fewer questions right (six or more wrong). She can get exactly five questions right. She can get six or more questions right. The chance of the first case and the chance of the last have to be the same.

So, take 1 — the chance that one of the three possibilities will happen — and subtract the chance she gets exactly five problems right, which is a touch over 24.6 percent. So there’s just under a 75.4 percent chance she does not get exactly five questions right. It’s equally likely to be four or fewer, or six or more. Just-under-75.4 divided by two is just under 37.7 percent, which is the chance she’ll pass as the problem’s given. It’s trickier to see why that’s right, but it’s a lot less calculating to do. That’s a common trade-off.
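Both routes can be checked with a few lines of Python. This is a sketch under the problem's own assumptions (ten questions, each equally likely to be right or wrong); the function and variable names are just labels chosen here.

```python
from math import comb

N_QUESTIONS = 10   # ten homework questions, each right or wrong with chance 1/2

def chance_exactly(k, n=N_QUESTIONS):
    """Chance of exactly k right out of n when each is a fair coin flip."""
    return comb(n, k) / 2 ** n

# The tedious way: add up the chances of exactly 6, 7, 8, 9, and 10 right.
tedious = sum(chance_exactly(k) for k in range(6, N_QUESTIONS + 1))

# The symmetric way: remove the "exactly five" case and split what's left in half.
shortcut = (1 - chance_exactly(5)) / 2

print(f"exactly five right:       {chance_exactly(5):.4%}")
print(f"six or more, summed:      {tedious:.4%}")
print(f"six or more, by symmetry: {shortcut:.4%}")
```

The two results agree only because the chance of a right answer is one-half; change that and the symmetry argument stops working while the sum still does.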

Ruben Bolling’s Super-Fun-Pax Comix rerun for the 23rd is an aptly titled installment of A Million Monkeys At A Million Typewriters. It reminds me that I don’t remember if I’d retired the monkeys-at-typewriters motif from Reading the Comics collections. If I haven’t I probably should, at least after making a proper essay explaining what the monkeys-at-typewriters thing is all about.

Ted Shearer’s Quincy from the 28th of February, 1978. So, that FoxTrot problem I did? The conceptually-easy-but-tedious way is not too hard to do if you have a calculator. It’s a bunch of typing but nothing more. If you don’t have a calculator, though, the desire not to do a whole bunch of calculating could drive you to the conceptually-harder-but-less-work answer. Is that a good thing? I suppose; insight is a good thing to bring. But the less-work answer only works because of a quirk in the problem, that Mary Lou is supposed to have a 50 percent chance of getting a question right. The low-insight-but-tedious approach will always work. Why skip on having something to do the tedious part?

Ted Shearer’s Quincy from the 28th of February, 1978 reveals to me that pocket calculators were a thing much earlier than I realized. Well, I was too young to be allowed near stuff like that in 1978. I don’t think my parents got their first credit-card-sized, solar-powered calculator that kind of worked for another couple years after that. Kids, ask about them. They looked like good ideas, but you could use them for maybe five minutes before the things came apart. Your cell phone is so much better.

Bill Watterson’s Calvin and Hobbes rerun for the 24th can be classed as a resisting-the-word-problem joke. It’s so not about that, but who am I to slow you down from reading a Calvin and Hobbes story?

Garry Trudeau’s Doonesbury rerun for the 24th started a story about high school kids and their bad geography skills. I rate it as qualifying for inclusion here because it’s a mathematics teacher deciding to include more geography in his course. I was amused by the week’s jokes anyway. There’s no hint given what mathematics Gil teaches, but given the links between geometry, navigation, and geography there is surely something that could be relevant. It might not help with geographic points like which states are in New England and where they are, though.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 24th is built on a plot point from Carl Sagan’s science fiction novel Contact. In it, a particular “message” is found in the digits of π. (By “message” I mean a string of digits that are interesting to us. I’m not sure that you can properly call something a message if it hasn’t got any sender and if there’s not obviously some intended receiver.) In the book this is an astounding thing because the message can’t be; any reasonable explanation for how it should be there is impossible. But short “messages” are going to turn up in π also, as per the comic strips.

I assume the peer review would correct the cartoon mathematicians’ unfortunate spelling of understanding.

## What Is The Most Probable Date For Easter? What Is The Least?

If I’d started pondering the question a week earlier I’d have a nice timely post. Too bad. Shouldn’t wait nearly a year to use this one, though.

My love and I got talking about early and late Easters. We know that we’re all but certainly not going to be alive to see the earliest possible Easter, at least not unless the rule for setting the date of Easter changes. Easter can be as early as the 22nd of March or as late as the 25th of April. Nobody presently alive has seen a 22nd of March Easter; the last one was in 1818. Nobody presently alive will; the next will be 2285. The last time Easter was its latest date was 1943; the next time will be 2038. I know people who’ve seen the one in 1943 and hope to make it at least through 2038.

But that invites the question: what dates are most likely to be Easter? What ones are least? In a sense the question is nonsense. The rules establishing Easter and the Gregorian calendar are known. To speak of the “chance” of a particular day being Easter is like asking the probability that Grover Cleveland was president of the United States in 1894. Technically there’s a probability distribution there. But it’s different in some way from asking the chance of rolling at least a nine on a pair of dice.

But as with the question about what day is most likely to be Thanksgiving we can make the question sensible. We have to take the question to mean “given a month and day, and no information about what year it is, what is the chance that this is Easter?” (I’m still not quite happy with that formulation. I’d be open to a more careful phrasing, if someone’s got one.)

When we’ve got that, though, we can tackle the problem. We could do as I did for working out what days are most likely to be Thanksgiving. Run through all the possible configurations of the calendar, tally how often each of the days in the range is Easter, and see what comes up most often. There’s a hassle here. Working out the date of Easter follows a rule, yes. The rule is that it’s the first Sunday after the first full moon after the spring equinox. There are wrinkles, mostly because the Moon is complicated. A notional Moon that’s a little more predictable gets used instead. There are algorithms you can use to work out when Easter is. They all look like some kind of trick being used to put something over on you. No matter. They seem to work, as far as we know. I found some Matlab code that uses the Easter-computing routine that Karl Friedrich Gauss developed and that’ll do.
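The run-through-and-tally idea can be sketched in Python. To be clear about what is assumed: this is not the Matlab code mentioned above, and it is not Gauss's routine. It swaps in the widely published "anonymous Gregorian" computus (often credited to Meeus, Jones, and Butcher), and the names `gregorian_easter` and `tally_easters` are invented for this sketch.

```python
from collections import Counter

def gregorian_easter(year):
    """Return (month, day) of Gregorian Easter for the given year.

    This is the "anonymous Gregorian" (Meeus/Jones/Butcher) computus, used
    here as a stand-in; it is not the Gauss routine the post relied on.
    """
    a = year % 19                        # position in the 19-year lunar cycle
    b, c = divmod(year, 100)             # century, and year within the century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30   # fixes the notional full moon
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7  # fixes the Sunday that follows it
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return month, day + 1

def tally_easters(first_year, last_year):
    """Count how often each (month, day) is Easter in the inclusive year range."""
    return Counter(gregorian_easter(y) for y in range(first_year, last_year + 1))

counts = tally_easters(2000, 5000)
total = sum(counts.values())
for (month, day), n in sorted(counts.items()):
    name = "March" if month == 3 else "April"
    print(f"{day:2d} {name}: {n:4d}  {n / total:.3%}")
```

Changing the two arguments to `tally_easters` is all it takes to try a different range of years.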

Problem. The Moon and the Earth follow cycles around the Sun, yes. Wait long enough and the positions of the Earth and Moon and Sun repeat. This takes 532 years and is known as the Paschal Cycle. In the Julian calendar Easter this year is the same date it was in the year 1485, and the same it will be in 2549. It’s no particular problem to set a computer program to run a calculation, even a tedious one, 532 times. But it’s not meaningful like that either.

The problem is the Julian calendar repeats itself every 28 years, which fits nicely with the Paschal Cycle. The Gregorian calendar, with different rules about how to handle century years like 1900 and 2100, repeats itself only every 400 years. So it takes much longer to complete the cycle and get Earth, Moon, and calendar date back to the same position. To fully account for all the related cycles would take 5,700,000 years, estimates Duncan Steel in Marking Time: The Epic Quest To Invent The Perfect Calendar.

Write code to calculate Easter on a range of years and you can do that, of course. It’s no harder to calculate the dates of Easter for six million years than it is for six hundred years. It just takes longer to finish. The problem is that it is meaningless to do so. Over the course of a mere(!) 26,000 years the precession of the Earth’s axis will change the times of the seasons completely. If we still use the Gregorian calendar there will be a time that late September is the start of the Northern Hemisphere’s spring, and another time that early February is the heart of the Canadian summer. Within five thousand years we will have to change the calendar, change the rule for computing Easter, or change the idea of it as happening in Europe’s early spring. To calculate a date for Easter of the year 5,002,017 is to waste energy.

We probably don’t need it anyway, though. The differences between any blocks of 532 years are, I’m going to guess, minor things. I would be surprised if the frequency of any date’s appearance changed more than a quarter of a percent. That might scramble the rankings of dates if we have several nearly-as-common dates, but it won’t be much.

So let me do that. Here’s a table of how often each particular calendar date appears as Easter from the years 2000 to 5000, inclusive. And I don’t believe that by the year we would call 5000 we’ll still have the same calendar and Easter and expectations of Easter all together, so I’m comfortable overlooking that. Indeed, I expect we’ll have some different calendar or Easter or expectation of Easter by the year 4985 at the latest.

For this enormous date range, though, here’s the frequency of Easters on each possible date:

| Date | Number Of Occurrences, 2000 – 5000 | Probability Of Occurrence |
|---|---|---|
| 22 March | 12 | 0.400% |
| 23 March | 17 | 0.566% |
| 24 March | 41 | 1.366% |
| 25 March | 74 | 2.466% |
| 26 March | 75 | 2.499% |
| 27 March | 68 | 2.266% |
| 28 March | 90 | 2.999% |
| 29 March | 110 | 3.665% |
| 30 March | 114 | 3.799% |
| 31 March | 99 | 3.299% |
| 1 April | 87 | 2.899% |
| 2 April | 83 | 2.766% |
| 3 April | 106 | 3.532% |
| 4 April | 112 | 3.732% |
| 5 April | 110 | 3.665% |
| 6 April | 92 | 3.066% |
| 7 April | 86 | 2.866% |
| 8 April | 98 | 3.266% |
| 9 April | 112 | 3.732% |
| 10 April | 114 | 3.799% |
| 11 April | 96 | 3.199% |
| 12 April | 88 | 2.932% |
| 13 April | 90 | 2.999% |
| 14 April | 108 | 3.599% |
| 15 April | 117 | 3.899% |
| 16 April | 104 | 3.466% |
| 17 April | 90 | 2.999% |
| 18 April | 93 | 3.099% |
| 19 April | 114 | 3.799% |
| 20 April | 116 | 3.865% |
| 21 April | 93 | 3.099% |
| 22 April | 60 | 1.999% |
| 23 April | 46 | 1.533% |
| 24 April | 57 | 1.899% |
| 25 April | 29 | 0.966% |

Dates of Easter from 2000 through 5000. Computed using Gauss’s algorithm.

If I haven’t missed anything, this indicates that the 15th of April is the most likely date for Easter, with the 20th close behind and the 10th and 14th hardly rare. The least probable date is the 22nd of March, with the 23rd of March and the 25th of April almost as unlikely.

And since the date range does affect the results, here’s a smaller sampling, one that more closely fits the lifespans of anyone alive to read this as I publish. For the years 1925 through 2100 the number of appearances of each Easter date is:

| Date | Number Of Occurrences, 1925 – 2100 | Probability Of Occurrence |
|---|---|---|
| 22 March | 0 | 0.000% |
| 23 March | 1 | 0.568% |
| 24 March | 1 | 0.568% |
| 25 March | 3 | 1.705% |
| 26 March | 6 | 3.409% |
| 27 March | 3 | 1.705% |
| 28 March | 5 | 2.841% |
| 29 March | 6 | 3.409% |
| 30 March | 7 | 3.977% |
| 31 March | 7 | 3.977% |
| 1 April | 6 | 3.409% |
| 2 April | 4 | 2.273% |
| 3 April | 6 | 3.409% |
| 4 April | 6 | 3.409% |
| 5 April | 7 | 3.977% |
| 6 April | 7 | 3.977% |
| 7 April | 4 | 2.273% |
| 8 April | 4 | 2.273% |
| 9 April | 6 | 3.409% |
| 10 April | 7 | 3.977% |
| 11 April | 7 | 3.977% |
| 12 April | 7 | 3.977% |
| 13 April | 4 | 2.273% |
| 14 April | 6 | 3.409% |
| 15 April | 7 | 3.977% |
| 16 April | 6 | 3.409% |
| 17 April | 7 | 3.977% |
| 18 April | 6 | 3.409% |
| 19 April | 6 | 3.409% |
| 20 April | 6 | 3.409% |
| 21 April | 7 | 3.977% |
| 22 April | 5 | 2.841% |
| 23 April | 2 | 1.136% |
| 24 April | 2 | 1.136% |
| 25 April | 2 | 1.136% |

Dates of Easter from 1925 through 2100. Computed using Gauss’s algorithm.

If we take this as the “working lifespan” of our common experience then the 22nd of March is the least likely Easter we’ll see, as we never do. The 23rd and 24th of March are the next least likely Easters. There’s a ten-way tie for the most common date of Easter, if I haven’t missed one or more: the 30th and 31st of March, and the 5th, 6th, 10th, 11th, 12th, 15th, 17th, and 21st of April each turn up seven times in this range.

The Julian calendar Easter dates are different and perhaps I’ll look at that sometime.

• #### ksbeth 7:34 pm on Tuesday, 18 April, 2017

Very interesting

• #### Joseph Nebus 3:31 am on Wednesday, 19 April, 2017

Thank you!

• #### mx. fluffy 💜 (@fluffy) 11:51 pm on Thursday, 20 April, 2017

I’m surprised there’s such a periodicity in the modal peaks! What happens if you extend the computations out for a few more millennia? Do they even out or get even more pronounced?

• #### Joseph Nebus 2:59 am on Tuesday, 25 April, 2017

I’m surprised by it too, yes. If we pretend that the current scheme for calculating Easter would be meaningful, then, extended over the full 5,700,000-year cycle … the peaks don’t disappear. The 19th of April turns up as Easter about 3.9 percent of the time. Next most likely are the 18th, 17th, 15th, 12th, and 10th of April.

I don’t know just what causes this. I suspect it’s some curious interaction between the 19-year Metonic cycle of the lunar behavior and the very slight asymmetries of the Gregorian calendar. The 21st of March is a tiny bit more likely to be a Tuesday, Wednesday, or Sunday than it is any other day of the week. My hunch is these combine to make the little peaks that linger.

The 22nd of March and 25th of April are the least common Easters; the 23rd and 24th of March, then 24th of April, come slightly more commonly.

## Did This German Retiree Solve A Decades-Old Conjecture?

And then this came across my desktop (my iPad’s too old to work with the Twitter client anymore):

The underlying news is that one Thomas Royen, a Frankfurt (Germany)-area retiree, seems to have proven the Gaussian Correlation Inequality. It wasn’t a conjecture that sounded familiar to me, but the sidebar (on the Quanta Magazine article to which I’ve linked there) explains it and reminds me that I had heard about it somewhere or other. It’s about random variables. That is, things that can take on one of a set of different values. If you think of them as the measurements of something that’s basically consistent but never homogenous you’re doing well.

Suppose you have two random variables, two things that can be measured. There’s a probability the first variable is in a particular range, greater than some minimum and less than some maximum. There’s a probability the second variable is in some other particular range. What’s the probability that both variables are simultaneously in these particular ranges? This is easy to answer for some specific cases. For example if the two variables have nothing to do with each other then everybody who’s taken a probability class knows the answer. The probability of both variables being in their ranges is the probability the first is in its range times the probability the second is in its range. The challenge is telling whether it’s always true, whether the variables are related to each other or not. Or telling when it’s true if it isn’t always.

The article (and pop reporting on this) is largely about how the proof has gone unnoticed. There’s some interesting social dynamics going on there. Royen published in an obscure-for-the-field journal, one he was an editor for; this makes it look dodgy, at least. And the conjecture’s drawn “proofs” that were just wrong; this discourages people from looking for obscurely-published proofs.

Some of the articles I’ve seen on this make Royen out to be an amateur. And I suppose there is a bias against amateurs in professional mathematics. There is in every field. It’s true that mathematics doesn’t require professional training the way that, say, putting out oil rig fires does. Anyone capable of thinking through an argument rigorously is capable of doing important original work. But there are a lot of tricks to thinking an argument through that are important, and I’d bet on the person with training.

In any case, Royen isn’t a newcomer to the field who just heard of an interesting puzzle. He’d been a statistician, first for a pharmaceutical company and then for a technical university. He may not have a position or tie to a mathematics department or a research organization but he’s someone who would know roughly what to do.

So did he do it? I don’t know; I’m not versed enough in the field to say. It’s interesting to see if he has.

• #### mathtuition88 4:29 am on Thursday, 13 April, 2017

He seems to have a PhD earned in 1975. (http://www.genealogy.ams.org/id.php?id=134663).

• #### Joseph Nebus 5:32 am on Friday, 14 April, 2017

Ah, thank you! I appreciate the reassurance that he wasn’t wholly an amateur or someone whose expertise came from on-the-job training.

## Reading the Comics, April 6, 2017: Abbreviated Week Edition

I’m writing this a little bit early because I’m not able to include the Saturday strips in the roundup. There won’t be enough to make a split week edition; I’ll just add the Saturday strips to next week’s report. In the meanwhile:

Mac King and Bill King’s Magic in a Minute for the 2nd is a magic trick, as the name suggests. It figures out a card by way of shuffling a (partial) deck and getting three (honest) answers from the other participant. If I’m not counting wrongly, you could do this trick with up to 27 cards and still get the right card after three answers. I feel like there should be a way to explain this that’s grounded in information theory, but I’m not able to put that together. I leave the suggestion here for people who see the obvious before I get to it.
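One counting argument for the 27: if each answer names one of three piles, three answers can tell apart at most 3 × 3 × 3 = 27 cards. The sketch below checks that in Python, and it rests on a loud assumption: that the trick works like the classic three-pile deal, where the cards are dealt round-robin into three piles, the participant says which pile holds the card, and that pile is gathered up between the other two. The strip's actual method may differ, and `deal_and_gather` and `final_positions` are names invented here.

```python
def deal_and_gather(deck, chosen_card):
    """Deal round-robin into three piles, then restack with the chosen card's pile in the middle."""
    stacks = [deck[i::3] for i in range(3)]
    with_card = next(s for s in stacks if chosen_card in s)
    others = [s for s in stacks if s is not with_card]
    return others[0] + with_card + others[1]

def final_positions(deck_size, rounds=3):
    """Set of positions the chosen card can end in, over every possible choice."""
    positions = set()
    for chosen in range(deck_size):
        deck = list(range(deck_size))
        for _ in range(rounds):
            deck = deal_and_gather(deck, chosen)
        positions.add(deck.index(chosen))
    return positions

# Deck sizes are multiples of three so the piles come out even.
for size in (21, 27, 30):
    print(size, "cards ->", sorted(final_positions(size)))
```

Under that assumption, 21 or 27 cards always leave the chosen card in a single known position after three answers, while 30 cards leave more than one candidate.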

Bil Keane and Jeff Keane’s Family Circus (probable) rerun for the 6th reassured me that this was not going to be a single-strip week. And a dubiously included single strip at that. I’m not sure that lotteries are the best use of the knowledge of numbers, but they’re a practical use anyway.

Bil Keane and Jeff Keane’s Family Circus for the 6th of April, 2017. I’m not familiar enough with the evolution of the Family Circus style to say whether this is a rerun, a newly-drawn strip, or an old strip with a new caption. I suppose there is a certain timelessness to it, at least once we get into the era when states sported lotteries again.

Bill Bettwy’s Take It From The Tinkersons for the 6th is part of the universe of students resisting class. I can understand the motivation problem in caring about numbers of apples that satisfy some condition. In the role of distinct objects whose number can be counted or deduced cards are as good as apples. In the role of things to gamble on, cards open up a lot of probability questions. Counting cards is even about how the probability of future events changes as information about the system changes. There’s a lot worth learning there. I wouldn’t try teaching it to elementary school students.

Bill Bettwy’s Take It From The Tinkersons for the 6th of April, 2017. That tree in the third panel is a transplant from a Slylock Fox six-differences panel. They’ve been trying to rebuild the population of trees that are sometimes three triangles and sometimes four triangles tall.

Jeffrey Caulfield and Alexandre Rouillard’s Mustard and Boloney for the 6th uses mathematics as the stuff know-it-alls know. At least I suppose it is; Doctor Know It All speaks of “the pathagorean principle”. I’m assuming that’s meant to be the Pythagorean theorem, although the talk about “in any right triangle the area … ” skews things. You can get to stuff about areas of triangles from the Pythagorean theorem. One of the shorter proofs of it depends on the areas of the squares of the three sides of a right triangle. But it’s not what people typically think of right away. But he wouldn’t be the first know-it-all to start blathering on the assumption that people aren’t really listening. It’s common enough to suppose someone who speaks confidently and at length must know something.

Dave Whamond’s Reality Check for the 6th is a welcome return to anthropomorphic-numerals humor. Been a while.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 6th builds on the form of a classic puzzle, about a sequence indexed to the squares of a chessboard. The story being riffed on is a bit of mathematical legend. The King offered the inventor of chess any reward. The inventor asked for one grain of wheat for the first square, two grains for the second square, four grains for the third square, eight grains for the fourth square, and so on, through all 64 squares. An extravagant reward, but surely one within the king’s power to grant, right? And of course not: by the 64th doubling the amount of wheat involved is so enormous it’s impossibly great wealth.

The father’s offer is meant to evoke that. But he phrases it in a deceptive way, “one penny for the first square, two for the second, and so on”. That “and so on” is the key. Listing a sequence and ending “and so on” is incomplete. The sequence can go in absolutely any direction after the given examples and not be inconsistent. There is no way to pick a single extrapolation as the only logical choice.

We do it anyway, though. Even mathematicians say “and so on”. This is because we usually stick to a couple popular extrapolations. We suppose things follow a couple common patterns. They’re polynomials. Or they’re exponentials. Or they’re sine waves. If they’re polynomials, they’re lower-order polynomials. Things like that. Most of the time we’re not trying to trick our fellow mathematicians. Or we know we’re modeling things with some physical base and we have reason to expect some particular type of function.

In this case, the \$1.27 total is consistent with getting two cents for every chess square after the first. There are infinitely many other patterns that would work, and the kid would have been wise to ask for what precisely “and so on” meant before choosing.
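The gap between the two readings of “and so on” is easy to see in a short Python sketch. The doubling rule is the one from the legend; the two-cents-per-square rule is just the one extrapolation named above that fits the total, not necessarily what the strip's father had in mind.

```python
SQUARES = 64

# The legend: one grain on the first square, doubling on every square after.
doubling = [2 ** n for n in range(SQUARES)]

# One reading that fits the strip's total: one cent, then two cents per square.
flat = [1] + [2] * (SQUARES - 1)

# Both sequences open "1, 2", so the first two terms cannot tell them apart.
assert doubling[:2] == flat[:2] == [1, 2]

print("grains of wheat:", f"{sum(doubling):,}")
print("cents from the father:", sum(flat))
```

The first total is 2 to the 64th power minus one; the second is 127 cents.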

Berkeley Breathed’s Bloom County 2017 for the 7th is the climax of a little story in which Oliver Wendell Holmes has been annoying people by shoving scientific explanations of things into their otherwise pleasant days. It’s a habit some scientifically-minded folks have, and it’s an annoying one. Many of us outgrow it. Anyway, this strip is about the curious evidence suggesting that the universe is not just expanding, but accelerating its expansion. There are mathematical models which allow this to happen. When developing General Relativity, Albert Einstein included a Cosmological Constant for little reason besides that without it, his model would suggest the universe was of a finite age and had expanded from an infinitesimally small origin. He had grown up without anyone knowing of any evidence that the size of the universe was a thing that could change.

Anyway, the Cosmological Constant is a puzzle. We can find values that seem to match what we observe, but we don’t know of a good reason it should be there. We sciencey types like to have models that match data, but we appreciate more knowing why the models look like that and not anything else. So it’s a good problem some of the cosmologists have been working on. But we’ve been here before. A great deal of physics, especially in the 20th Century, has been driven by looking for reasons behind what look like arbitrary points in a successful model. If Oliver were better-versed in the history of science — something scientifically minded people are often weak on, myself included — he’d be less easily taunted by Opus.

Mikael Wulff and Anders Morgenthaler’s TruthFacts for the 7th thinks that we forgot they ran this same strip back on the 17th of March. I spotted it, though. Nyah.

## How Much Might I Have Lost At Pinball?

After the state pinball championship last month there was a second, side tournament. It was a sort-of marathon event in which I played sixteen games in short order. I won three of them and lost thirteen, a disheartening record. The question I can draw from this: was I hopelessly outclassed in the side tournament? Is it plausible that I could do so awfully?

The answer would be “of course not”. I was playing against, mostly, the same people who were in the state finals. (A few who didn’t qualify for the finals joined the side tournament.) In that I had done well enough, winning seven games in all out of fifteen played. It’s implausible that I got significantly worse at pinball between the main and the side tournament. But can I make a logically sound argument about this?

In full, probably not. It’s too hard. The question is, did I win way too few games compared to what I should have expected? But what should I have expected? I haven’t got any information on how likely it should have been that I’d win any of the games, especially not when I faced something like a dozen different opponents. (I played several opponents twice.)

But we can make a model. Suppose that I had a fifty percent chance of winning each match. This is a lie in detail. The model contains lies; all models do. The lies might let us learn something interesting. Some people there I could only beat with a stroke of luck on my side. Some people there I could fairly often expect to beat. If we pretend I had the same chance against everyone, though, we get something that we can model. It might tell us something about what really happened.

If I play 16 matches, and have a 50 percent chance of winning each of them, then I should expect to win eight matches. But there’s no reason I might not win seven instead, or nine. Might win six, or ten, without that being too implausible. It’s even possible I might not win a single match, or that I might win all sixteen matches. How likely?

This calls for a creature from the field of probability that we call the binomial distribution. It’s “binomial” because it’s about stuff for which there are exactly two possible outcomes. This fits. Each match I can win or I can lose. (If we tie, or if the match is interrupted, we replay it, so there’s not another case.) It’s a “distribution” because we describe, for a set of some number of attempted matches, how the possible outcomes are distributed. The outcomes are: I win none of them. I win exactly one of them. I win exactly two of them. And so on, all the way up to “I win exactly all but one of them” and “I win all of them”.

To answer the question of whether it’s plausible I should have done so badly I need to know more than just how likely it is I would win only three games. I need to also know the chance I’d have done worse. If I had won only two games, or only one, or none at all. Why?

Here I admit: I’m not sure I can give a compelling reason, at least not in English. I’ve been reworking it all week without being happy at the results. Let me try pieces.

One part is that as I put the question — is it plausible that I could do so awfully? — isn’t answered just by checking how likely it is I would win only three games out of sixteen. If that’s awful, then doing even worse must also be awful. I can’t rule out even-worse results from awfulness without losing a sense of what the word “awful” means. Fair enough, to answer that question. But I made up the question. Why did I make up that one? Why not just “is it plausible I’d get only three out of sixteen games”?

Habit, largely. Experience shows me that the probability of any particular result turns out to be implausibly low. It isn’t quite that case here; there’s only seventeen possible noticeably different outcomes of playing sixteen games. But there can be so many possible outcomes that even the most likely one isn’t likely.

Take an extreme case. (Extreme cases are often good ways to build an intuitive understanding of things.) Imagine I played 16,000 games, with a 50-50 chance of winning each one of them. It is most likely that I would win 8,000 of the games. But the probability of winning exactly 8,000 games is small: only about 0.6 percent. What’s going on there is that there’s almost the same chance of winning exactly 8,001 or 8,002 games. As the number of games increases the number of possible different outcomes increases. If there are 16,000 games there are 16,001 possible outcomes. It’s less likely that any of them will stand out. What saves our ability to predict the results of things is that the number of plausible outcomes increases more slowly. It’s plausible someone would win exactly three games out of sixteen. It’s impossible that someone would win exactly three thousand games out of sixteen thousand, even though that’s the same ratio of won games.
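Here is a rough Python sketch of that extreme case, assuming the same 50-50 model. The exact chance of 3,000 wins in 16,000 games is too small for an ordinary floating-point number, so the sketch reports its base-ten logarithm instead; `chance_of_exactly` and `log10_chance` are names made up for this example.

```python
from math import comb, log10

def chance_of_exactly(wins, games):
    """Chance of exactly `wins` wins in `games` fair 50-50 games."""
    return comb(games, wins) / 2 ** games

def log10_chance(wins, games):
    """Base-ten logarithm of that chance, usable when the chance itself underflows."""
    return log10(comb(games, wins)) - games * log10(2)

# The single most likely outcome gets less likely as the number of games grows.
for games in (16, 160, 1_600, 16_000):
    half = games // 2
    print(f"{games:6d} games: exactly {half} wins has chance "
          f"{chance_of_exactly(half, games):.3%}")

# Same ratio of wins, very different plausibility.
print(f"3 of 16: {chance_of_exactly(3, 16):.3%}")
print(f"3,000 of 16,000: about 10 to the power {log10_chance(3_000, 16_000):.0f}")
```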

Card games offer another way to get comfortable with this idea. A bridge hand, for example, is thirteen cards drawn out of fifty-two. But the chance that you were dealt the hand you just got? Impossibly low. Should we conclude from this all bridge hands are hoaxes? No, but ask my mother sometime about the bridge class she took that one cruise. “Three of sixteen” is too particular; “at best three of sixteen” is a class I can study.

Unconvinced? I don’t blame you. I’m not sure I would be convinced of that, but I might allow the argument to continue. I hope you will. So here are the specifics. These are the possible counts of wins, and the chance of having exactly that many wins, for sixteen matches:

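| Wins | Percentage |
|---|---|
| 0 | 0.002 % |
| 1 | 0.024 % |
| 2 | 0.183 % |
| 3 | 0.854 % |
| 4 | 2.777 % |
| 5 | 6.665 % |
| 6 | 12.219 % |
| 7 | 17.456 % |
| 8 | 19.638 % |
| 9 | 17.456 % |
| 10 | 12.219 % |
| 11 | 6.665 % |
| 12 | 2.777 % |
| 13 | 0.854 % |
| 14 | 0.183 % |
| 15 | 0.024 % |
| 16 | 0.002 % |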

So the chance of doing as awfully as I had — winning zero or one or two or three games — is pretty dire. It’s a little above one percent.
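The whole distribution, and the three-or-fewer total, can be reproduced with a short Python sketch under the 50-50 model. It counts win/loss sequences exactly and only divides at the end; the variable names are labels chosen here.

```python
from math import comb

GAMES = 16
OUTCOMES = 2 ** GAMES   # equally likely win/loss sequences under the 50-50 model

# Number of win/loss sequences with exactly k wins, for k = 0 .. 16.
ways = [comb(GAMES, k) for k in range(GAMES + 1)]

for k, w in enumerate(ways):
    print(f"{k:2d} wins: {w / OUTCOMES:7.3%}")

# "At best three of sixteen": add up the cases of 0, 1, 2, and 3 wins.
at_most_three = sum(ways[:4]) / OUTCOMES
print(f"three or fewer wins: {at_most_three:.3%}")
```

That is 697 sequences out of 65,536, a little over one percent.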

Is that implausibly low? Is there so small a chance that I’d do so badly that we have to figure I didn’t have a 50-50 chance of winning each game?

I hate to think that. I didn’t think I was outclassed. But here’s a problem. We need some standard for what is “it’s implausibly unlikely that this happened by chance alone”. If there were only one chance in a trillion that someone with a 50-50 chance of winning any game would put in the performance I did, we could suppose that I didn’t actually have a 50-50 chance of winning any game. If there were only one chance in a million of that performance, we might also suppose I didn’t actually have a 50-50 chance of winning any game. But here there was only one chance in a hundred? Is that too unlikely?

It depends. We should have set a threshold for “too implausibly unlikely” before we started research. It’s bad form to decide afterward. There are some thresholds that are commonly taken. Five percent is often useful for stuff where it’s hard to do bigger experiments and the harm of guessing wrong (dismissing the idea I had a 50-50 chance of winning any given game, for example) isn’t so serious. One percent is another common threshold, again common in stuff like psychological studies where it’s hard to get more and more data. In a field like physics, where experiments are relatively cheap to keep running, you can gather enough data to insist on fractions of a percent as your threshold. Setting the threshold after is bad form.

In my defense, I thought (without doing the work) that I probably had something like a five percent chance of doing that badly by luck alone. It suggests that I did have a much worse than 50 percent chance of winning any given game.

Is that credible? Well, yeah; I may have been in the top sixteen players in the state. But a lot of those people are incredibly good. Maybe I had only one chance in three, or something like that. That would make the chance I did that poorly something like one in six, likely enough.
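Here is a rough Python sketch of how that one-in-six figure could be checked, and how the chance of a three-or-worse record shifts with the per-game chance of winning. It assumes independent games with a fixed chance `p`. The function name and the particular values of `p` tried are my own choices for illustration.

```python
from math import comb


def chance_at_most(wins: int, games: int, p: float) -> float:
    """Chance of `wins` or fewer wins in `games` independent games,
    each won with probability `p`."""
    return sum(comb(games, k) * p ** k * (1 - p) ** (games - k)
               for k in range(wins + 1))


# Three or fewer wins out of sixteen, for a few guesses at my true chance.
for label, p in (("1/2", 1 / 2), ("1/3", 1 / 3), ("1/4", 1 / 4)):
    print(f"p = {label}: {100 * chance_at_most(3, 16, p):.1f} %")
```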

And it’s also plausible that games are not independent, that whether I win one game depends in some way on whether I won or lost the previous. It does feel like it’s easier to win after a win, or after a close loss. And it feels harder to win a game after a string of losses. I don’t know that this can be proved, not on the meager evidence I have available. And you can almost always question the independence of a string of events like this. It’s the safe bet.

## Reading the Comics, March 6, 2017: Blackboards Edition

I can’t say there’s a compelling theme to the first five mathematically-themed comics of last week. Screens full of mathematics turned up in a couple of them, so I’ll run with that. There were also just enough strips that I’m splitting the week again. It seems fair to me and gives me something to remember Wednesday night that I have to rush to complete.

Jimmy Hatlo’s Little Iodine for the 1st of January, 1956 was rerun on the 5th of March. The setup demands Little Iodine pester her father for help with the “hard homework” and of course it’s arithmetic that gets to play hard work. It’s a word problem in terms of who has how many apples, as you might figure. Don’t worry about Iodine’s father getting fired; Little Iodine gets her father fired every week. It’s their schtick.

Jimmy Hatlo’s Little Iodine for the 1st of January, 1956. I guess class started right back up the 2nd, but it would’ve avoided so much trouble if she’d done her homework sometime during the winter break. That said, I never did.

Dana Simpson’s Phoebe and her Unicorn for the 5th mentions the “most remarkable of unicorn confections”, a sugar dodecahedron. Dodecahedrons have long captured human imaginations, as one of the Platonic Solids. The Platonic Solids are one of the ways we can make a solid-geometry analogue to a regular polygon. Phoebe’s other mentioned shape of cubes is another of the Platonic Solids, but that one’s common enough to encourage no sense of mystery or wonder. The cube’s the only one of the Platonic Solids that will fill space, though, that you can put into stacks that don’t leave gaps between them. Sugar cubes, Wikipedia tells me, have been made only since the 19th century; the Moravian sugar factory director Jakub Kryštof Rad got a patent for cutting block sugar into uniform pieces in 1843. I can’t dispute the fun of “dodecahedron” as a word to say. Many solid-geometric shapes have names that are merely descriptive, but which are rendered with Greek or Latin syllables so as to sound magical.

Bud Grace’s Piranha Club for the 6th started a sequence in which the Future Disgraced Former President needs the most brilliant person in the world, Bud Grace. A word balloon full of mathematics is used as symbol for this genius. I feel compelled to point out Bud Grace was a physics major. But while Grace could as easily have used something from the physics department to show his deep thinking abilities, that would all but certainly have been rendered as equations and graphs, the stuff of mathematics again.

Bud Grace’s Piranha Club for the 6th of March, 2017. 241 times 635 is 153,035 by the way. I wouldn’t work that out in my head if I needed the number. I might work out an estimate of how big it was, in which case I’d do this: 241 is about 250, which is one-quarter of a thousand. One-quarter of 635 is something like 150, which times a thousand is 150,000. If I needed it exactly I’d get a calculator. Unless I just needed something to occupy my mind without having any particular emotional charge.

Scott Meyer’s Basic Instructions rerun for the 6th is aptly titled, “How To Unify Newtonian Physics And Quantum Mechanics”. Meyer’s advice is not bad, really, although generic enough it applies to any attempts to reconcile two different models of a phenomenon. Also there’s not particularly a problem reconciling Newtonian physics with quantum mechanics. It’s general relativity and quantum mechanics that are so hard to reconcile.

Still, Basic Instructions is about how you can do a thing, or learn to do a thing. It’s not about how to allow anything to be done for the first time. And it’s true that, per quantum mechanics, we can’t predict exactly what any one particle will do at any time. We can say what possible things it might do and how relatively probable they are. But big stuff, the stuff for which Newtonian physics is relevant, involves so many particles that the unpredictability becomes too small to notice. We can see this as the Law of Large Numbers. That’s the probability rule that tells us we can’t predict any coin flip, but we know that a million fair tosses of a coin will not turn up 800,000 tails. There’s more to it than that (there’s always more to it), but that’s a starting point.

Michael Fry’s Committed rerun for the 6th features Albert Einstein as the icon of genius. Natural enough. And it reinforces this with the blackboard full of mathematics. I’m not sure if that blackboard note of “E = md3” is supposed to be a reference to the famous Far Side panel of Einstein hearing the maid talk about everything being squared away. I’ll take it as such.

## How Much I Did Lose In Pinball

A follow-up for people curious how much I lost at the state pinball championships Saturday: I lost at the state pinball championships Saturday. As I expected I lost in the first round. I did beat my expectations, though. I’d figured I would win one, maybe two games in our best-of-seven contest. As it happened I won three games and I had a fighting chance in game seven.

I’d mentioned in the previous essay how much contingency there is, especially in a short series like this one. My opponent picked the game I expected she would to start out. And she got an awful bounce on the first ball, while I got a very lucky bounce that started multiball on the last. So I won, but not because I was playing better. The seventh game was one that I had figured she might pick if she needed to crush me, and if I had gotten a better bounce on the first ball I’d still have had an uphill struggle. Just less of one.

After the first round I got into a set of three “tie-breaking” rounds, used to sort out which of the sixteen players ranked as number 11 versus number 10. Each of those was a best-of-three series. I did win one series and lost two others, dropping me into 12th place. Over the three series I had four wins and four losses, so I can’t say that I was mismatched there.

Where I might have been mismatched is the side tournament. This was a two-hour marathon of playing a lot of games one after the other. I finished with three wins and 13 losses, enough to make me wonder whether I somehow went from competent to incompetent in the hour or so between the main and the side tournament. Of course not, based on a record like that, but — can I prove it?

Meanwhile a friend pointed out The New York Times covering the New York State pinball championship:

The article is (at least for now) at https://www.nytimes.com/2017/02/12/nyregion/pinball-state-championship.html. What my friend couldn’t have known, and what shows how networked people are, is that I know one of the people featured in the article, Sean “The Storm” Grant. Well, I knew him, back in college. He was an awesome pinball player even then. And he’s only got more awesome since.

How awesome? Let me give you some background. The International Flipper Pinball Association (IFPA) gives players ranking points. These points are gathered by playing in leagues and tournaments. Each league or tournament has a certain point value. That point value is divided up among the players, in descending order from how they finish. How many points do the events have? That depends on how many people play and what their ranking is. So, yes, how much someone’s IFPA score increases depends on the events they go to, and the events they go to depend on their score. This might sound to you like there’s a differential equation describing all this. You’re close: it’s a difference equation, because these rankings change with the discrete number of events players go to. But there’s an interesting and iterative system at work there.

(Points only expire with time. The system is designed to encourage people to play a lot of things and keep playing them. You can’t lose ranking points by playing, although it might hurt your player-versus-player rating. That’s calculated by a formula I don’t understand at all.)

Anyway, Sean Grant plays in the New York Superleague, a crime-fighting band of pinball players who figured out how to game the IFPA rankings system. They figured out how to turn the large number of people who might visit a Manhattan bar and casually play one or two games into a source of ranking points for the serious players. The IFPA, combatting this scheme, just this week recalculated the Superleague values and the rankings of everyone involved in it. It’s fascinating stuff, in that way a heated debate over an issue you aren’t emotionally invested in can be.

Anyway. Grant is such a skilled player that he lost more points in this nerfing than I have gathered in my whole competitive-pinball-playing career.

So while I knew I’d be knocked out in the first round of the Michigan State Championships I’ll admit I had fantasies of having an impossibly lucky run. In that case, I’d have gone to the nationals and been turned into a pale, silverball-covered paste by people like Grant.

Thanks again for all your good wishes, kind readers. Now we start the long road to the 2017 State Championships, to be held in February of next year. I’m already in 63rd place in the state for the year! (There haven’t been many events for the year yet, and the championship and side tournament haven’t posted their ranking scores yet.)

## How Much Can I Expect To Lose In Pinball?

This weekend, all going well, I’ll be going to the Michigan state pinball championship contest. There, I will lose in the first round.

I’m not trying to run myself down. But I know who I’m scheduled to play in the first round, and she’s quite a good player. She’s the state’s highest-ranked woman playing competitive pinball. So she starts off being better than me. And then the venue is one she gets to play in more than I do. Pinball, a physical thing, is idiosyncratic. The reflexes you build practicing on one table can betray you on a strange machine. She’s had more chance to practice on the games we have and that pretty well settles the question. I’m still showing up, of course, and doing my best. Stranger things have happened than my winning a game. But I’m going in with I hope realistic expectations.

That bit about having realistic expectations, though, makes me ask what are realistic expectations. The first round is a best-of-seven match. How many games should I expect to win? And that becomes a probability question. It’s a great question to learn on, too. Our match is straightforward to model: we play up to seven times. Each time we play one or the other wins.

So we can start calculating. There’s some probability I have of winning any particular game. Call that number ‘p’. It’s more than zero (I’m not sure to lose) but it’s less than one (I’m not sure to win). Let’s suppose the probability of my winning never changes over the course of seven games. I will come back to the card I palmed there. If we’re playing 7 games, and I have a chance ‘p’ of winning any one of them, then the number of games I can expect to win is 7 times ‘p’. This is the number of wins you might expect if you were called on in class and had no idea and bluffed the first thing that came to mind. Sometimes that works.

7 times p isn’t very enlightening. What number is ‘p’, after all? And I don’t know exactly. The International Flipper Pinball Association tracks how many times I’ve finished a tournament or league above her and vice-versa. We’ve played in 54 recorded events together, and I’ve won 23 and lost 29 of them. (We’ve tied twice.) But that isn’t all head-to-head play. It counts matches where I’m beaten by someone she goes on to beat as her beating me, and vice-versa. And it includes a lot of playing not at the venue. I lack statistics and must go with my feelings. I’d estimate my chance of beating her at about one in three. Let’s say ‘p’ is 1/3 until we get evidence to the contrary.

A note on the name: it is “Flipper Pinball” because the earliest pinball machines had no flippers. You plunged the ball into play and nudged the machine a little to keep it going somewhere you wanted. (The game Simpsons Pinball Party has a moment where Grampa Simpson says, “back in my day we didn’t have flippers”. It’s the best kind of joke, the one that is factually correct.)

Seven times one-third is not a difficult problem. It comes out to two and a third, raising the question of how one wins one-third of a pinball game. Most games involve playing three rounds, called balls, is the obvious observation. But this one-third of a game is an average. Imagine the two of us playing three thousand seven-game matches, without either of us getting the least bit better or worse or collapsing of exhaustion. I would expect to win seven thousand of the games, or two and a third games per seven-game match.

Ah, but … that’s too high. I would expect to win two and a third games out of seven. But we probably won’t play seven. We’ll stop when she or I gets to four wins. This makes the problem hard. Hard is the wrong word. It makes the problem tedious. At least it threatens to. Things will get easy enough, but we have to go through some difficult parts first.

There are eight different ways that our best-of-seven match can end. She can win in four games. I can win in four games. She can win in five games. I can win in five games. She can win in six games. I can win in six games. She can win in seven games. I can win in seven games. There is some chance of each of those eight outcomes happening. And exactly one of those will happen; it’s not possible that she’ll win in four games and in five games, unless we lose track of how many games we’d played. They give us index cards to write results down. We won’t lose track.

It’s easy to calculate the probability that I win in four games, if the chance of my winning a game is the number ‘p’. The probability is $p^4$. Similarly it’s easy to calculate the probability that she wins in four games. If I have the chance ‘p’ of winning, then she has the chance ‘1 – p’ of winning. So her probability of winning in four games is $(1 - p)^4$.

The probability of my winning in five games is more tedious to work out. It’s going to be $p^4$ times $(1 - p)$ times 4. The 4 here is the number of different ways that she can win one of the first four games. Turns out there’s four ways to do that. She could win the first game, or the second, or the third, or the fourth. And in the same way the probability she wins in five games is $p$ times $(1 - p)^4$ times 4.

The probability of my winning in six games is going to be $p^4$ times $(1 - p)^2$ times 10. There are ten ways to scatter two wins by her among the first five games. The probability of her winning in six games is the strikingly parallel $p^2$ times $(1 - p)^4$ times 10.

The probability of my winning in seven games is going to be $p^4$ times $(1 - p)^3$ times 20, because there are 20 ways to scatter three wins among the first six games. And the probability of her winning in seven games is $p^3$ times $(1 - p)^4$ times 20.
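The pattern in those cases can be gathered into one formula. This is my own summary rather than anything from Kiggins, but it agrees with each case above: to win in exactly $n$ games I have to win game $n$ and exactly three of the $n - 1$ games before it.

$P(\text{I win in } n \text{ games}) = \binom{n - 1}{3} \cdot p^4 \cdot (1 - p)^{n - 4}, \quad n = 4, 5, 6, 7$

Swap $p$ and $(1 - p)$ to get the chance that she wins in $n$ games. The binomial coefficient $\binom{n - 1}{3}$ is 1, 4, 10, and 20 for $n$ of 4, 5, 6, and 7, matching the counts worked out above.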

Add all those probabilities up, no matter what ‘p’ is, and you should get 1. Exactly one of those eight outcomes has to happen. And we can work out the probability that the series will end after four games: it’s the chance she wins in four games plus the chance I win in four games. The probability that the series goes to five games is the probability that she wins in five games plus the probability that I win in five games. And so on for six and for seven games.
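The claim that the outcomes add up to 1 is easy to spot-check by computer. This is a minimal Python sketch under the same assumption as the text, a fixed chance `p` of my winning each game. The function name and the `("me", 4)`-style labels are invented for illustration.

```python
from math import comb


def match_outcomes(p: float, wins_needed: int = 4) -> dict[tuple[str, int], float]:
    """Map (winner, games played) to its probability in a first-to-`wins_needed` match.

    `p` is my chance of winning any single game; games are assumed independent.
    """
    q = 1 - p
    outcomes = {}
    for n in range(wins_needed, 2 * wins_needed):
        # The winner takes game n and exactly wins_needed - 1 of the n - 1 before it.
        ways = comb(n - 1, wins_needed - 1)
        outcomes[("me", n)] = ways * p ** wins_needed * q ** (n - wins_needed)
        outcomes[("her", n)] = ways * q ** wins_needed * p ** (n - wins_needed)
    return outcomes


outcomes = match_outcomes(1 / 3)
for (winner, games), chance in sorted(outcomes.items(), key=lambda item: item[0][1]):
    print(f"{winner:>3} in {games}: {100 * chance:6.2f} %")
print("total:", sum(outcomes.values()))
```

Floating-point rounding can leave the total a hair away from exactly 1, so it is best compared with a small tolerance.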

So that’s neat. We can figure out the probability of the match ending after four games, after five, after six, or after seven. And from that we can figure out the expected length of the match. This is the expectation value. Take the product of ‘4’ and the chance the match ends at four games. Take the product of ‘5’ and the chance the match ends at five games. Take the product of ‘6’ and the chance the match ends at six games. Take the product of ‘7’ and the chance the match ends at seven games. Add all those up. That’ll be, wonder of wonders, the number of games a match like this can be expected to run.

Now it’s a matter of adding together all these combinations of all these different outcomes and you know what? I’m not doing that. I don’t know what the chance I’d do all this arithmetic correctly is, but I know there’s no chance I’d do all this arithmetic correctly. This is the stuff we pirate Mathematica to do. (Mathematica is supernaturally good at working out mathematical expressions. A personal license costs all the money you will ever have in your life plus ten percent, which it will calculate for you.)

Happily I won’t have to work it out. A person appearing to be a high school teacher named B Kiggins has worked it out already. Kiggins put it and a bunch of other interesting worksheets on the web. (Look for the Voronoi Diagramas!)

There’s a lot of arithmetic involved. But it all simplifies out, somehow. Per Kiggins’ work, the expected number of games in a best-of-seven match, if one of the competitors has the chance ‘p’ of winning any given game, is:

$E(p) = 4 + 4\cdot p + 4\cdot p^2 + 4\cdot p^3 - 52\cdot p^4 + 60\cdot p^5 - 20\cdot p^6$

Whatever you want to say about that, it’s a polynomial. And it’s easy enough to evaluate it, especially if you let the computer evaluate it. Oh, I would say it seems like a shame all those coefficients of ‘4’ drop off and we get weird numbers like ’52’ after that. But there’s something beautiful in there being four 4’s, isn’t there? Good enough.

So. If the chance of my winning a game, ‘p’, is one-third, then we’d expect the series to go 5.5 games. This accords well with my intuition. I thought I would be likely to win one game. Winning two would be a moral victory akin to championship.
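If you don’t want to take the polynomial on faith, here is a short Python sketch comparing it against the direct sum over the eight outcomes. It assumes a fixed per-game chance `p`, the same as the rest of this analysis, and both function names are mine.

```python
from math import comb


def expected_length_direct(p: float) -> float:
    """Expected games in a best-of-seven: sum of (length) times (chance of that length)."""
    q = 1 - p
    return sum(
        n * comb(n - 1, 3) * (p ** 4 * q ** (n - 4) + q ** 4 * p ** (n - 4))
        for n in range(4, 8)
    )


def expected_length_polynomial(p: float) -> float:
    """The simplified polynomial form quoted from Kiggins's worksheet."""
    return 4 + 4 * p + 4 * p ** 2 + 4 * p ** 3 - 52 * p ** 4 + 60 * p ** 5 - 20 * p ** 6


for label, p in (("1/3", 1 / 3), ("1/2", 1 / 2)):
    print(f"p = {label}: direct {expected_length_direct(p):.4f}, "
          f"polynomial {expected_length_polynomial(p):.4f}")
```

The two agree at every value I tried, including the ends: if ‘p’ is 0 or 1 the match is a sweep and both give exactly four games.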

Let me go back to my palmed card. This whole analysis is based on the idea that I have some fixed probability of winning and that it isn’t going to change from one game to the next. If the probability of winning is entirely based on my and my opponent’s abilities this is fair enough. Neither of us is likely to get significantly more or less skilled over the course of even seven games. We won’t even play long enough to get fatigued.

But our abilities aren’t everything. We’re going to be playing up to seven different tables. How each table reacts to our play is going to vary. Some tables may treat me better, some tables my opponent. Luck of the draw. And there’s an important psychological component. It’s easy to get thrown and to let a bad ball wreck the rest of one’s game. It’s hard to resist feeling nervous if you go into the last ball from way behind your opponent. And it seems as if a pinball knows you’re nervous and races out of play to help you calm down. (The best pinball players tend to have outstanding last balls, though. They don’t get rattled. And they spend the first several balls building up to high-value shots they can collect later on.) And there will be freak events. Last weekend I was saved from elimination in a tournament by the pinball machine spontaneously resetting. We had to replay the game. I did well in the tournament, but it was the freak event that kept me from being knocked out in the first round.

That’s some complicated stuff to fit together. I suppose with enough data we could possibly model how much the differences between pinball machines affect the outcome. That’s what sabermetrics is all about. Representing how severely I’ll build a little bad luck into a lot of bad luck? Oh, that’s hard.

Too hard to deal with, at least without much more sports psychology and modelling of pinball players than we have data to do. The supposition that my chance of winning is fixed for the duration of the match may not be true. But we won’t be playing enough games to be able to tell the difference, and the assumption is near enough that it gets us some useful information. We have to know not to demand too much precision from our model.

And seven games isn’t statistically significant. Not when players are as closely matched as we are. I could be worse and still get a couple wins in when they count; I could play better than my average and still get creamed four games straight. I’ll be trying my best, of course. But I expect my best is one or two wins, then getting to the snack room and waiting for the side tournament to start. Shall let you know if something interesting happens.

• #### ksbeth 6:03 pm on Friday, 10 February, 2017

Woo hoo! Good luck )

• #### Joseph Nebus 4:43 am on Saturday, 11 February, 2017

Thank you! I’m feeling good heading into tomorrow.

• #### vagabondurges 7:33 pm on Friday, 10 February, 2017

Best of luck! I am loving these pinball posts! And there’s a pinball place in Alameda, CA that you’ve just inspired me to visit again.

• #### Joseph Nebus 4:45 am on Saturday, 11 February, 2017

Thank you! I’m sorry I don’t find more excuses to write about pinball, since there’s so much about it I do like. And I’m glad you’re feeling inspired; I hope it’s a good visit.

The secrets are: plunge the ball softly, let the ball bounce back and forth on the flippers until it’s moving slowly, and hold the flipper up until the ball comes to a rest so you can aim. So much of pinball is about letting things calm down so you can understand what’s going on and what you want to do next.

• #### mathtuition88 12:32 am on Saturday, 11 February, 2017

Good luck and all the best!

• #### Joseph Nebus 4:47 am on Saturday, 11 February, 2017

Thank you! I shall be doing what I can.

• #### davekingsbury 3:52 pm on Saturday, 11 February, 2017

All this work and you’ll tell me you’re not a betting man …

• #### Joseph Nebus 11:05 pm on Thursday, 16 February, 2017

I honestly am not. The occasional lottery ticket is my limit. But probability questions are so hard to resist. They usually involve very little calculation but demand thoughtful analysis. It’s great.

## Reading the Comics, January 7, 2017: Just Before GoComics Breaks Everything Edition

Most of the comics I review here are printed on GoComics.com. Well, most of the comics I read online are from there. But even so I think they have more comic strips that mention mathematical themes. Anyway, they’re unleashing a complete web site redesign on Monday. I don’t know just what the final version will look like. I know that the beta versions included the incredibly useful, that is to say dumb, feature where if a particular comic you do read doesn’t have an update for the day — and many of them don’t, as they’re weekly or three-times-a-week or so — then it’ll show some other comic in its place. I mean, the idea of encouraging people to find new comics is a good one. To some extent that’s what I do here. But the beta made no distinction between “comic you don’t read because you never heard of Microcosm” and “comic you don’t read because glancing at it makes your eyes bleed”. And on an idiosyncratic note, I read a lot of comics. I don’t need to see Dude and Dude reruns in fourteen spots on my daily comics page, even if I didn’t mind it to start.

Anyway. I am hoping, desperately hoping, that with the new site all my old links to comics are going to keep working. If they don’t then I suppose I’m just ruined. We’ll see. My suggestion is if you’re at all curious about the comics you read them today (Sunday) just to be safe.

Ashleigh Brilliant’s Pot-Shots is a curious little strip I never knew of until GoComics picked it up a few years ago. Its format is compellingly simple: a little illustration alongside a wry, often despairing, caption. I love it, but I also understand why it was the subject of endless queries to the Detroit Free Press (Or Whatever) about why this thing was taking up newspaper space. The strip rerun the 31st of December is a typical example of the strip and amuses me at least. And it uses arithmetic as the way to communicate reasoning, both good and bad. Brilliant’s joke does address something that logicians have to face, too. Whether an argument is logically valid depends entirely on its structure. If the form is correct the reasoning may be excellent. But to be sound an argument has to be valid and must also have its assumptions be true. We can separate whether an argument is right from whether it could ever possibly be right. If you don’t see the value in that, you have never participated in an online debate about where James T Kirk was born and whether Spock was the first Vulcan in Star Fleet.

Thom Bluemel’s Birdbrains for the 2nd of January, 2017, is a loaded-dice joke. Is this truly mathematics? Statistics, at least? Close enough for the start of the year, I suppose. Working out whether a die is loaded is one of the things any gambler would like to know, and that mathematicians might be called upon to identify or exploit. (I had a grandmother unshakably convinced that I would have some natural ability to beat the Atlantic City casinos if she could only sneak the underaged me in. I doubt I could do anything of value there besides see the stage magic show.)

Jack Pullan’s Boomerangs rerun for the 2nd is built on the one bit of statistical mechanics that everybody knows, that something or other about entropy always increasing. It’s not a quantum mechanics rule, but it’s a natural confusion. Quantum mechanics has the reputation as the source of all the most solid, irrefutable laws of the universe’s working. Statistical mechanics and thermodynamics have this musty odor of 19th-century steam engines, no matter how much there is to learn from there. Anyway, the collapse of systems into disorder is not an irrevocable thing. It takes only energy or luck to overcome disorderliness. And in many cases we can substitute time for luck.

Scott Hilburn’s The Argyle Sweater for the 3rd is the anthropomorphic-geometry-figure joke that I’ve been waiting for. I had thought Hilburn did this all the time, although a quick review of Reading the Comics posts suggests he’s been more about anthropomorphic numerals the past year. This is why I log even the boring strips: you never know when I’ll need to check the last time Scott Hilburn used “acute” to mean “cute” in reference to triangles.

Mike Thompson’s Grand Avenue uses some arithmetic as the visual cue for “any old kind of schoolwork, really”. Steve Breen’s name seems to have gone entirely from the comic strip. On Usenet group rec.arts.comics.strips Brian Henke found that Breen’s name hasn’t actually been on the comic strip since May, and D D Degg found a July 2014 interview indicating Thompson had mostly taken the strip over from originator Breen.

Mark Anderson’s Andertoons for the 5th is another name-drop that doesn’t have any real mathematics content. But come on, we’re talking Andertoons here. If I skipped it the world might end or something untoward like that.

Ted Shearer’s Quincy for the 14th of November, 1977, and reprinted the 7th of January, 2017. I kind of remember having a lamp like that. I don’t remember ever sitting down to do my mathematics homework with a paintbrush.

Ted Shearer’s Quincy for the 14th of November, 1977, doesn’t have any mathematical content really. Just a mention. But I need some kind of visual appeal for this essay and Shearer is usually good for that.

Corey Pandolph, Phil Frank, and Joe Troise’s The Elderberries rerun for the 7th is also a very marginal mention. But, what the heck, it’s got some of your standard wordplay about angles and it’ll get this week’s essay that much closer to 800 words.

## Reading the Comics, December 17, 2016: Sleepy Week Edition

Comic Strip Master Command sent me a slow week in mathematical comics. I suppose they knew I was somehow on a busier schedule than usual and couldn’t spend all the time I wanted just writing. I appreciate that but don’t want to see another of those weeks when nothing qualifies. Just a warning there.

John Rose’s Barney Google and Snuffy Smith for the 12th of December, 2016. I appreciate the desire to pay attention to continuity that makes Rose draw in the coffee cup in both panels, but Snuffy Smith has to swap it from one hand to the other to keep it in view there. Not implausible, just kind of busy. Also I can’t fault Jughaid for looking at two pages full of unillustrated text and feeling lost. That’s some Bourbaki-grade geometry going on there.

John Rose’s Barney Google and Snuffy Smith for the 12th is a bit of mathematical wordplay. It does use geometry as the “hard mathematics we don’t know how to do”. That’s a change from the usual algebra. And that’s odd considering the joke depends on an idiom that is actually used by real people.

Patrick Roberts’s Todd the Dinosaur for the 12th uses mathematics as the classic impossibly hard subject a seven-year-old can’t be expected to understand. The worry about fractions seems age-appropriate. I don’t know whether it’s fashionable to give elementary school students experience thinking of ‘x’ and ‘y’ as numbers. I remember that as a time when we’d get a square or circle and try to figure what number fits in the gap. It wasn’t a 0 or a square often enough.

Patrick Roberts’s Todd the Dinosaur for the 12th of December, 2016. Granting that Todd’s a kid dinosaur and that T-Rexes are not renowned for the hugeness of their arms, wouldn’t that still be enough space for a lot of text to fit around? I would have thought so anyway. I feel like I’m pluralizing ‘T-Rex’ wrong, but what would possibly be right? ‘Ts-rex’? Don’t make me try to spell tyrannosaurus.

Jef Mallett’s Frazz for the 12th uses one of those great questions I think every child has. And it uses it to question how we can learn things from statistical study. This is circling around the “Bayesian” interpretation of probability, of what odds mean. It’s a big idea and I’m not sure I’m competent to explain it. It amounts to asking what explanations would be plausibly consistent with observations. As we get more data we may be able to rule some cases in or out. It can be unsettling. It demands we accept right up front that we may be wrong. But it lets us find reasonably clean conclusions out of the confusing and muddy world of actual data.

Sam Hepburn’s Questionable Quotebook for the 14th illustrates an old observation about the hypnotic power of decimal points. I think Hepburn’s gone overboard in this, though: six digits past the decimal in this percentage is too many. It draws attention to the fakeness of the number. One, two, maybe three digits past the decimal would have a more authentic ring to them. I had thought the John Allen Paulos tweet above was about this comic, but it’s mere coincidence. Funny how that happens.

## When Is Thanksgiving Most Likely To Happen?

So my question from last Thursday nagged at my mind. And I learned that Octave (a Matlab clone that’s rather cheaper) has a function that calculates the day of the week for any given day. And I spent longer than I would have expected fiddling with the formatting to get what I wanted to know.

It turns out there are some days in November more likely to be the fourth Thursday than others are. (This is the current standard for Thanksgiving Day in the United States.) And as I’d suspected without being able to prove, this doesn’t quite match the breakdown of which months are more likely to have Friday the 13ths. That is, it’s more likely that an arbitrarily selected month will start on Sunday than any other day of the week. It’s least likely that an arbitrarily selected month will start on a Saturday or Monday. The difference is extremely tiny; there are only four more Sunday-starting months than there are Monday-starting months over the course of 400 years.

But an arbitrary month is different from an arbitrary November. It turns out Novembers are most likely to start on a Sunday, Tuesday, or Thursday. And that makes the 26th, 24th, and 22nd the most likely days to be Thanksgiving. The 23rd and 25th are the least likely days to be Thanksgiving. Here’s the full roster, if I haven’t made any serious mistakes with it:

| November | Times it will be Thanksgiving in 400 years |
|---|---|
| 22 | 58 |
| 23 | 56 |
| 24 | 58 |
| 25 | 56 |
| 26 | 58 |
| 27 | 57 |
| 28 | 57 |
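
For anyone who would like to check the roster without Octave, here is a minimal sketch in Python. It assumes the standard `datetime` module's Gregorian calendar and an arbitrary 400-year window of 2001 through 2400. The function names are my own inventions for illustration, not anything from the original calculation.

```python
from collections import Counter
from datetime import date

def thanksgiving_day(year):
    """Return the day of November that is the fourth Thursday of that year."""
    first_weekday = date(year, 11, 1).weekday()    # Monday=0 ... Thursday=3
    first_thursday = 1 + (3 - first_weekday) % 7   # date of the first Thursday
    return first_thursday + 21                     # three weeks after that

def thanksgiving_counts(start_year=2001, span=400):
    """Tally how often each date is Thanksgiving over one full Gregorian cycle."""
    return Counter(thanksgiving_day(y) for y in range(start_year, start_year + span))

counts = thanksgiving_counts()
for day in sorted(counts):
    print(day, counts[day])
```

Any 400 consecutive years should give the same tallies, since the Gregorian calendar repeats exactly on that cycle.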

I don’t pretend there’s any significance to this. But it is another of those interesting quirks of probability. What you would say the probability is of a month starting on a Sunday (equivalently, of having a Friday the 13th, or a Fourth Thursday of the Month that’s the 26th) depends on how much you know about the month. If you know only that it’s a month on the Gregorian calendar it’s one thing (specifically, it’s 688/4800, or about 0.14333). If you know only that it’s a November then it’s another (58/400, or 0.145). If you know only that it’s a month in 2016 then it’s another yet (1/12, or about 0.08333). If you know that it’s November 2016 then the probability is 0. Information does strange things to probability questions.
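
Those four probabilities can be recomputed by counting. What follows is only a sketch, assuming Python's `datetime` calendar and, again, the arbitrary window 2001 through 2400. The helper `chance_starts_on_sunday` is a name made up for this example.

```python
from datetime import date
from fractions import Fraction

SUNDAY = 6  # datetime's weekday() counts Monday=0 ... Sunday=6

def chance_starts_on_sunday(months):
    """Fraction of the given (year, month) pairs whose 1st is a Sunday."""
    months = list(months)
    hits = sum(1 for y, m in months if date(y, m, 1).weekday() == SUNDAY)
    return Fraction(hits, len(months))

cycle = range(2001, 2401)  # any 400 consecutive Gregorian years will do
any_month = chance_starts_on_sunday((y, m) for y in cycle for m in range(1, 13))
any_november = chance_starts_on_sunday((y, 11) for y in cycle)
month_in_2016 = chance_starts_on_sunday((2016, m) for m in range(1, 13))
november_2016 = chance_starts_on_sunday([(2016, 11)])

print(any_month, any_november, month_in_2016, november_2016)
```

The fractions print in lowest terms, so 688/4800 and 58/400 will show up in reduced form. The question is the same each time; only the pool of months being counted changes.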

## Reading the Comics, November 26, 2016: What is Pre-Algebra Edition

Here I’m just closing out last week’s mathematically-themed comics. The new week seems to be bringing some more in at a good pace, too. Should have stuff to talk about come Sunday.

Darrin Bell and Theron Heir’s Rudy Park for the 24th brings out the ancient question, why do people need to do mathematics when we have calculators? As befitting a comic strip (and Sadie’s character) the question goes unanswered. But it shows off the understandable confusion people have between mathematics and calculation. Calculation is a fine and necessary thing. And it’s fun to do, within limits. And someone who doesn’t like to calculate probably won’t be a good mathematician. (Or will become one of those master mathematicians who sees ways to avoid calculations in getting to an answer!) But put aside the obvious point that we need mathematics to know what calculations to do, or to tell whether a calculation done makes sense. Much of what’s interesting about mathematics isn’t a calculation. Geometry, for an example that people in primary education will know, doesn’t need more than slight bits of calculation. Group theory swipes a few nice ideas from arithmetic and builds its own structure. Knot theory uses polynomials (everything does) but more as a way of naming structures. There aren’t things to do that a calculator would recognize.

Richard Thompson’s Poor Richard’s Almanac for the 25th I include because I’m a fan, and on the grounds that the Summer Reading includes the names of shapes. And I’ve started to notice how often “rhomboid” is used as a funny word. Those who search for the evolution and development of jokes, take heed.

John Atkinson’s Wrong Hands for the 25th is the awaited anthropomorphic-numerals and symbols joke for this past week. I enjoy the first commenter’s suggestion that they should have stayed in unknown territory.

Rick Kirkman and Jerry Scott’s Baby Blues for the 26th of November, 2016. I suppose Kirkman and Scott know their characters better than I do but isn’t Zoe like nine or ten? Isn’t pre-algebra more a 7th or 8th grade thing? I can’t argue Grandma being post-algebra but I feel like the punch line was written and then retrofitted onto the characters.

Rick Kirkman and Jerry Scott’s Baby Blues for the 26th does a little wordplay built on pre-algebra. I’m not sure that Zoe is quite old enough to take pre-algebra. But I also admit not being quite sure what pre-algebra is. The central idea of (primary school) algebra — that you can do calculations with a number without knowing what the number is — certainly can use some preparatory work. It’s a dazzling idea and needs plenty of introduction. But my dim recollection of taking it was that it was a bit of a subject heap, with some arithmetic, some number theory, some variables, some geometry. It’s all stuff you’ll need once algebra starts. But it is hard to say quickly what belongs in pre-algebra and what doesn’t.

Art Sansom and Chip Sansom’s The Born Loser for the 26th uses two ancient staples of jokes, probabilities and weather forecasting. It’s a hard joke not to make. The prediction for something is that it’s very unlikely, and it happens anyway? We all laugh at people being wrong, which might be our whistling past the graveyard of knowing we will be wrong ourselves. It’s hard to prove that a probability is wrong, though. A fairly tossed die may have only one chance in six of turning up a ‘4’. But there’s no reason to think it won’t, and nothing inherently suspicious in it turning up ‘4’ four times in a row.

We could do it, though. If the die turned up ‘4’ four hundred times in a row we would no longer call it fair. (This even if examination proved the die really was fair after all!) Or if it just turned up a ‘4’ significantly more often than it should; if it turned up two hundred times out of four hundred rolls, say. But one or two events won’t tell us much of anything. Even the unlikely happens sometimes.
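
To put rough numbers on those cases, here is a small sketch using exact fractions. It assumes a fair six-sided die and independent rolls, and the function names are hypothetical ones chosen for this example.

```python
from fractions import Fraction
from math import comb

P_FOUR = Fraction(1, 6)  # chance a fair die shows a '4' on one roll

def chance_of_run(length):
    """Chance that `length` consecutive rolls all show a '4'."""
    return P_FOUR ** length

def chance_at_least(successes, rolls):
    """Chance of at least `successes` fours in `rolls` rolls (binomial tail)."""
    return sum(
        comb(rolls, k) * P_FOUR**k * (1 - P_FOUR)**(rolls - k)
        for k in range(successes, rolls + 1)
    )

print(chance_of_run(4))                  # four fours in a row: 1/1296
print(float(chance_at_least(200, 400)))  # 200 or more fours in 400 rolls
```

Four in a row is about one chance in thirteen hundred, which is unusual but hardly damning. The second number is so small that no sensible person would go on calling the die fair.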

Even the impossibly unlikely happens if given enough attempts. If we do not understand that instinctively, we realize it when we ponder that someone wins the lottery most weeks. Presumably the comic’s weather forecaster supposed the chance of snow was so small it could be safely rounded down to zero. But even something with literally zero percent chance of happening might.

Imagine tossing a fair coin. Imagine tossing it infinitely many times. Imagine it coming up tails every single one of those infinitely many times. Impossible: the chance that at least one toss of a fair coin will turn up heads, eventually, is 1. 100 percent. The chance heads never comes up is zero. But why could it not happen? What law of physics or logic would it defy? It challenges our understanding of ideas like “zero” and “probability” and “infinity”. But we’re well-served to test those ideas. They hold surprises for us.

• #### Matthew Wright 6:55 pm on Tuesday, 29 November, 2016

‘Rhomboid’ is a wonderful word. Always makes me think of British First World War tanks.

• #### Joseph Nebus 9:30 pm on Wednesday, 30 November, 2016

It is a great word and you’re right; it’s perfectly captured by British First World War tanks.

• #### Matthew Wright 6:09 am on Thursday, 1 December, 2016

A triumph of mathematics on the part of Sir Eustace Tennyson-d’Eyncourt and his colleagues – as I understand it the shape was calculated to match the diameter of a 60-foot wheel as a trench-crossing mechanism, but without the radius (well, a triumph of geometry, which isn’t exactly mathematical in the pure sense…). I probably should stop making appalling puns now…

• #### Joseph Nebus 4:46 pm on Friday, 9 December, 2016

• #### davekingsbury 5:35 pm on Wednesday, 30 November, 2016

Your comments about tossing a coin suggest to me that working out probability is probably an inherited instinct, which is probably why it’s so tempting to enter a betting shop. (Do you guys have betting shops over the Pond?)

• #### Joseph Nebus 9:40 pm on Wednesday, 30 November, 2016

I think we don’t have any instinct for probability. There’s maybe a vague idea but it’s just awful for any but the simplest problems. Which is fair enough; for most of our existence probability questions were relatively straightforward things. But it took a generation of mathematicians to work out whether you were more likely to roll a 9 or a 10 on tossing three dice.

There are some betting parlors in the United States, mostly under the name Off-Track Betting shops. I don’t think there’s really a culture of them, though, at least not away from the major horse-racing tracks. I may be mistaken though; it’s not a hobby I’ve been interested in. I believe they’re all limited to horse- and greyhound-racing, though. There are many places that sell state-sponsored lotteries but that isn’t really what I understand betting shops to be about. And lottery tickets are just sidelines from some more reputable concern like being a convenience store.

• #### davekingsbury 1:37 am on Thursday, 1 December, 2016

Our betting shops are plentiful, several on every high street, and they are full of FOBTs – fixed odds betting terminals – which are a prime source of problem gambling in poorer communities. Looking this up, I’ve just watched a worrying clip of somebody gambling while convincing themselves erroneously that they’re on the verge of a big win … it’s been described as the crack cocaine of gambling and there are 35,000 machines in the UK. If we have any instinct for probability, it’s being abused …

• #### Joseph Nebus 4:45 pm on Friday, 9 December, 2016

I suspect the fixed odds betting terminals translate in the United States to ordinary slot machines. They’ve been creeping over the United States as Native American nations realize they can license casinos as they are, theoretically, sovereigns on the territory reserved to them. (The state and federal governments get very upset when Native Americans do anything that brings them too much prosperity, though, so casinos get a lot of scrutiny.) But they similarly are all about having a lot of machines, making a lot of noise, and making a huge payout seem imminent and making a small payout seem huge.

Of course, my favorite hobby is pinball, which uses nearly all the same tricks and is the nearly-reputable cousin of slot machines. Pinball machines were banned in many United States municipalities for decades as gambling machines, and it’s a fair cop. Occasionally there’ll be a bit of human-interest news about a city getting around to repealing its pinball-machine ban, and everybody thinks it a hilariously quaint bit about how square, say, Oakland, California, used to be. But the ban was for legitimate reasons, even if they’re now obsolete.

• #### davekingsbury 8:00 pm on Friday, 9 December, 2016

Fascinating historical perspectives here and I’m completely with you on the thrills of pinball – the virtual versions don’t have the physicality of the real machines, do they, especially that bit where you jerk the machine to wrench back control? My favourite was table football, though, which helped me waste hours as an undergraduate – my defence game was pretty nigh impossible to get round! Of course, it’s all gone downhill since …

• #### Joseph Nebus 5:33 am on Saturday, 17 December, 2016

The virtual machines have gotten to be really, really good. But yes, there’s this lack of physicality that’s important. Part of it is just the table getting worn and dirty and a little unresponsive, which is so key to actual play and competitive play. The app for Zaccaria Pinball machines allows you to include simulated grime on the playfield, making things play less well and more realistically; it’s a great addition. But the abstraction of nudging really makes a difference. Giving the table just the right shove is one of the big, essential skills on a pinball game and I just haven’t seen anything that gets the physics of it right.

We have table football at several of the bars with pinball machines where we play, but almost never see anyone using them. The nearest hipster bar even had a bumper pool table for months, but since nobody ever knew what the rules of bumper pool were it didn’t get much use. I printed out a set of rules I found on the Internet somewhere and left it on the table, but failed to laminate it or anything and the rules were discarded or lost after about a month. A relatively busy month for game play, too.

• #### davekingsbury 11:21 am on Saturday, 17 December, 2016

If one wanted a reason to reject the virtual world altogether, it could be the ‘clean’ aspect of the experience – perhaps we could throw in photography while we’re at it, and its dubious relationship with truth … or am I just being a grumpy old fart? Lifting the table in table football was a key tactic, as I recall …

• #### Joseph Nebus 6:35 am on Wednesday, 21 December, 2016

The clean aspect is a fair reason, yes. Part of the fun of real-world things is that while they can be predictable they’re never perfectly consistent. And there is some definite skill in recovering from stuff that isn’t working quite right.

• #### davekingsbury 3:56 pm on Wednesday, 21 December, 2016

And learning to grin and bear it when the recovery doesn’t occur!!

• #### Joseph Nebus 5:02 am on Thursday, 5 January, 2017

Oh, my yes. Learning what to do when recovery isn’t working is a big challenge.

• #### davekingsbury 9:50 am on Thursday, 5 January, 2017

Character-forming … 67 and still waiting! ;)

## A Thanksgiving Thought Fresh From The Shower

It’s well-known, at least in calendar-appreciation circles, that the 13th of a month is more likely to be Friday than any other day of the week. That’s on the Gregorian calendar, which has some funny rules about whether a century year — 1900, 2000, 2100 — will be a leap year. Three of them aren’t in every four centuries. The result is the pattern of dates on the calendar is locked into this 400-year cycle, instead of the 28-year cycle you might imagine. And this makes some days of the week more likely for some dates than they otherwise might be.

This got me wondering. Does the 13th being slightly more likely to be a Friday imply that the United States Thanksgiving is more likely to be on the 26th of the month? The current rule is that Thanksgiving is the fourth Thursday of November. We’ll pretend that’s an unalterable fact of nature for the sake of having a problem we can solve. So if the 13th is more likely to be a Friday than any other day of the week, isn’t the 26th more likely to be a Thursday than any other day of the week?

And that’s so, but I’m not quite certain yet. What’s got me pondering this in the shower is that the 13th is more likely a Friday for an arbitrary month. That is, if I think of a month and don’t tell you anything about what it is, all we can say is its chance of the 13th being a Friday is such-and-such. But if I pick a particular month (say, November 2017) things are different. The chance the 13th of November, 2017 is a Friday is zero. So the chance the 26th of November, 2017 is a Thursday is zero. Our calendar system sets rules. We’ll pretend that’s an unalterable fact of nature for the sake of having a problem we can solve, too.

So: does knowing that I am thinking of November, rather than a completely unknown month, change the probabilities? And I don’t know. My gut says “it’s plausible the dates of Novembers are different from the dates of arbitrary months”. I don’t know a way to argue this purely logically, though. It might have to be tested by going through 400 years of calendars and counting when the fourth Thursdays are. (The problem isn’t so tedious as that. There are formulas computers are good at which can do this pretty well.)
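
One such formula is Zeller's congruence. I offer it only as an example of the kind of thing a computer can grind through, not as the method any particular program uses. The sketch below assumes the Gregorian calendar throughout, and the function name is one I made up for the purpose.

```python
DAY_NAMES = ["Saturday", "Sunday", "Monday", "Tuesday",
             "Wednesday", "Thursday", "Friday"]

def zeller_weekday(year, month, day):
    """Day of the week for a Gregorian date, by Zeller's congruence."""
    if month < 3:            # January and February count as months 13 and 14
        month += 12          # of the previous year
        year -= 1
    century, year_of_century = divmod(year, 100)
    h = (day
         + (13 * (month + 1)) // 5
         + year_of_century
         + year_of_century // 4
         + century // 4
         + 5 * century) % 7
    return DAY_NAMES[h]      # Zeller's numbering starts the week on Saturday

print(zeller_weekday(2017, 11, 13))
print(zeller_weekday(2017, 11, 26))
```

Running a formula like this over every November in a 400-year cycle is the "going through 400 years of calendars" without the tedium.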

But I would like to know if it can be argued there’s a difference, or that there isn’t.

## Reading the Comics, November 12, 2016: Frazz and Monkeys Edition

Two things made repeat appearances in the mathematically-themed comics this week. They’re the comic strip Frazz and the idea of having infinitely many monkeys typing. Well, silly answers to word problems also turned up, but that’s hard to say many different things about. Here’s what I make the week in comics out to be.

Sandra Bell-Lundy’s Between Friends for the 6th of November, 2016. I’m surprised Bell-Lundy used the broader space of a Sunday strip for a joke that doesn’t need that much illustration, but I understand sometimes you just have to go with the joke that you have. And it isn’t as though Sunday comics get that much space anymore either. Anyway, I suppose we have all been there, although for me that’s more often because I used to have a six-digit pin, and a six-digit library card pin, and those were just close enough to each other that I could never convince myself I was remembering the right one in context, so I would guess wrong.

Sandra Bell-Lundy’s Between Friends for the 6th introduces the infinite monkeys problem. I wonder sometimes why the monkeys-on-typewriters thing has so caught the public imagination. And then I remember it encourages us to stare directly into infinity and its intuition-destroying nature from the comfortable furniture of the mundane — typewriters, or keyboards, for goodness’ sake — with that childish comic dose of monkeys. Given that it’s a wonder we ever talk about anything else, really.

Monkeys writing Shakespeare has for over a century stood as a marker for what’s possible but incredibly improbable. I haven’t seen it compared to finding a four-digit PIN. It has got me wondering about the chance that four randomly picked letters will be a legitimate English word. I’m sure the chance is more than the one-in-ten-thousand chance someone would guess a randomly drawn four-digit PIN correctly on one try. More than one in a hundred? I’m less sure. The easy-to-imagine thing to do is set a computer to try out all 456,976 possible sets of four letters and check them against a dictionary. The number of hits divided by the number of possibilities would be the chance of drawing a legitimate word. If I had a less capable computer, or were checking even longer words, I might instead draw some set number of words, never minding that I didn’t get every possibility. The fraction of successful words in my sample would be something close to the chance of drawing any legitimate word.

If I thought a little deeper about the problem, though, I’d just count how many four-letter words are already in my dictionary and divide that by 456,976. It’s always a mistake to start programming before you’ve thought the problem out. The trouble is not being able to tell when that thinking-out is done.
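
Both approaches, the direct count and the random sampling, fit in a few lines. This is a sketch under loud assumptions: the "dictionary" is a tiny made-up list standing in for a real word file, only the 26 unaccented letters count, and all the function names are hypothetical.

```python
import random
import string

TOTAL = 26 ** 4  # 456,976 possible four-letter strings

def four_letter_words(words):
    """Keep the distinct words made of exactly four letters a-z."""
    letters = set(string.ascii_lowercase)
    return {w.lower() for w in words
            if len(w) == 4 and set(w.lower()) <= letters}

def exact_chance(words):
    """Count the dictionary's four-letter words and divide by 26**4."""
    return len(four_letter_words(words)) / TOTAL

def sampled_chance(words, draws=100_000, seed=0):
    """Estimate the same chance by drawing random four-letter strings."""
    valid = four_letter_words(words)
    rng = random.Random(seed)
    hits = sum(
        "".join(rng.choices(string.ascii_lowercase, k=4)) in valid
        for _ in range(draws)
    )
    return hits / draws

# Hypothetical stand-in for a real dictionary file.
toy_dictionary = ["math", "word", "tree", "Frazz", "pi", "Math"]
print(exact_chance(toy_dictionary))
```

With a real word list in place of `toy_dictionary`, the exact count answers the question directly. The sampling version would only earn its keep for longer words, where listing every possibility gets expensive.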

Richard Thompson’s Poor Richard’s Almanac for the 7th is the other comic strip to mention infinite monkeys. Well, chimpanzees in this case. But for the mathematical problem they’re not different. I’ve featured this particular strip before. But I’m a Thompson fan. And goodness but look at the face on the T S Eliot fan in the lower left corner there.

Jef Mallett’s Frazz for the 6th gives Caulfield one of those flashes of insight that seems like it should be something but doesn’t mean much. He’s had several of these lately, as mentioned here last week. As before this is a fun discovery about Roman Numerals, but it doesn’t seem like it leads to much. Perhaps a discussion of how the subtractive principle (that you can write “four” as “IV” instead of “IIII”) evolved over time. But then there isn’t much point to learning Roman Numerals at all. It’s got some value in showing how much mathematics depends on culture. Not just that stuff can be expressed in different ways, but that those different expressions make different things easier or harder to do. But I suspect that isn’t the objective of lessons about Roman Numerals.

Frazz got my attention again the 12th. This time it just uses arithmetic, and a real bear of an arithmetic problem, as signifier for “a big pile of hard work”. This particular problem would be, well, I have to call it tedious, rather than hard. Doing it is just a long string of adding together two numbers. But to do that over and over, by my count, at least 47 times for this one problem? Hardly any point to doing that much for one result.

Patrick Roberts’s Todd the Dinosaur for the 7th calls out fractions, and arithmetic generally, as the stuff that ruins a child’s dreams. (Well, a dinosaur child’s dreams.) Still, it’s nice to see someone reminding mathematicians that a lot of their field is mostly used by accountants. Actuaries we know about; mathematics departments like to point out that majors can get jobs as actuaries. I don’t know of anyone I went to school with who chose to become one or expressed a desire to be an actuary. But I admit not asking either.

Patrick Roberts’s Todd the Dinosaur for the 7th of November, 2016. I don’t remember being talked to by classmates’ parents about what they were, but that might just be that it’s been a long time since I was in elementary school and everybody had the normal sorts of jobs that kids don’t understand. I guess we talked about what our parents did but that should make a weaker impression.

Mike Thompson’s Grand Avenue started off a week of students-resisting-the-test-question jokes on the 7th. Most of them are hoary old word problem jokes. But, hey, I signed up to talk about it when a comic strip touches a mathematics topic and word problems do count.

Zach Weinersmith’s Saturday Morning Breakfast Cereal reprinted the 7th is a higher level of mathematical joke. It’s from the genre of nonsense calculation. This one starts off with what’s almost a cliche, at least for mathematics and physics majors. The equation it starts with, $e^{i Pi} = -1$, is true. And famous. It should be. It links exponentiation, imaginary numbers, π, and negative numbers. Nobody would have seen it coming. And from there is the sort of typical gibberish reasoning, like writing “Pi” instead of π so that it can be thought of as “P times i”, to draw the silly conclusion that P = 0. That much work is legitimate.

From there it sidelines into “P = NP”, which is another equation famous to mathematicians and computer scientists. It’s a shorthand expression of a problem about how long it takes to find solutions. That is, how many steps it takes. How much time it would take a computer to solve a problem. You can see why it’s important to have some study of how long it takes to do a problem. It would be poor form to tie up your computer on a problem that won’t be finished before the computer dies of old age. Or just take too long to be practical.

Most problems have some sense of size. You can look for a solution in a small problem or in a big one. You expect searching for the solution in a big problem to take longer. The question is how much longer? Some methods of solving problems take a length of time that grows only slowly as the size of the problem grows. Some take a length of time that grows crazy fast as the size of the problem grows. And there are different kinds of time growth. One kind is called Polynomial, because everything is polynomials. But there’s a polynomial in the problem’s size that describes how long it takes to solve. We call this kind of problem P. Another is called Non-Deterministic Polynomial, for problems that … can’t. We assume. We don’t know. But we know some problems that look like they should be NP (“NP Complete”, to be exact).
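
The gap between slow growth and crazy-fast growth is easier to feel with numbers. Here is a sketch that assumes two made-up methods, one taking a number of steps equal to the cube of the problem size and one taking two to the power of the problem size, running on an imaginary computer that does a billion steps a second. None of these figures comes from a real algorithm.

```python
def polynomial_steps(n):
    """Step count for a hypothetical method that scales like n cubed."""
    return n ** 3

def exponential_steps(n):
    """Step count for a hypothetical method that tries all 2**n possibilities."""
    return 2 ** n

STEPS_PER_SECOND = 10 ** 9  # assumed speed of an imaginary computer

for n in (10, 20, 40, 80):
    poly_seconds = polynomial_steps(n) / STEPS_PER_SECOND
    expo_seconds = exponential_steps(n) / STEPS_PER_SECOND
    print(f"n={n:3d}  n^3: {poly_seconds:.2e} s   2^n: {expo_seconds:.2e} s")
```

At a size of 80 the cubic method is still done in well under a second, while the exponential one would need tens of millions of years.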

It’s an open question whether P and NP are the same thing. It’s possible that everything we think might be NP actually can be solved by a P-class algorithm we just haven’t thought of yet. It would be a revolution in our understanding of how to find solutions if it were. Most people who study algorithms think P is not NP. But that’s mostly (as I understand it) because it seems like if P were NP then we’d have some leads on proving that by now. You see how this falls short of being rigorous. But it is part of expertise to get a feel for what seems to make sense in light of everything else we know. We may be surprised. But it would be inhuman not to have any expectations of a problem like this.

Mark Anderson’s Andertoons for the 8th gives us the Andertoons content for the week. It’s a fair question why a right triangle might have three sides, three angles, three vertices, and just the one hypotenuse. The word’s origin is Greek, meaning “stretching under” or “stretching between”. It’s unobjectionable that we might say this is the stretch from one leg of the right triangle to another. But that leaves unanswered why there’s just the one hypotenuse, since the other two legs also stretch from the end of one leg to another. Dr Sarah on The Math Forum suggests we need to think of circles. Draw a circle and a diameter line on it. Now pick any point on the circle other than where the diameter cuts it. Draw a line from one end of the diameter to your point. And from your point to the other end of the diameter. You have a right triangle! And the hypotenuse is the leg stretching under the other two. Yes, I’m assuming you picked a point above the diameter. You did, though, didn’t you? Humans do that sort of thing.
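
The circle construction is Thales's theorem, and it's easy to check numerically. This is a small sketch assuming a circle centered at the origin with its diameter along the horizontal axis. The function name and the way the point is chosen, by an angle `theta`, are just conveniences for this example.

```python
import math

def angle_at_point(theta, radius=1.0):
    """Angle in degrees at a point on the circle, seen from the diameter's ends."""
    a = (-radius, 0.0)                                         # one end of the diameter
    b = (radius, 0.0)                                          # the other end
    p = (radius * math.cos(theta), radius * math.sin(theta))   # the chosen point
    pa = (a[0] - p[0], a[1] - p[1])                            # side from point to a
    pb = (b[0] - p[0], b[1] - p[1])                            # side from point to b
    dot = pa[0] * pb[0] + pa[1] * pb[1]
    cos_angle = dot / (math.hypot(*pa) * math.hypot(*pb))
    return math.degrees(math.acos(cos_angle))

for theta in (0.3, 1.0, 2.5):
    print(theta, angle_at_point(theta))
```

Any `theta` other than 0 or π, the two places the diameter cuts the circle, should come out as a right angle to within rounding error.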

I don’t know if Dr Sarah’s explanation is right. It sounds plausible and sensible. But those are weak pins to hang an etymology on. But I have no reason to think she’s mistaken. And the explanation might help people accept there is the one hypotenuse and there’s something interesting about it.

The first (and as I write this only) commenter, Kristiaan, has a good if cheap joke there.

• #### davekingsbury 10:38 pm on Monday, 14 November, 2016

I reckon it was Bob Newhart’s sketch about it that made the monkey idea so popular. Best bit, something like, hey one of them has something over here er to be or not to be that is the … gezoinebplatf!

• #### Joseph Nebus 3:35 am on Sunday, 20 November, 2016

I like to think that helped. I fear that that particular routine’s been forgotten, though. I was surprised back in the 90s when I was getting his albums and ran across that bit, as I’d never heard it before. But it might’ve been important in feeding the idea to other funny people. There’s probably a good essay to be written tracing the monkeys at typewriters through pop culture.

## The End 2016 Mathematics A To Z: Distribution (statistics)

As I’ve done before I’m using one of my essays to set up for another essay. It makes a later essay easier. What I want to talk about is worth some paragraphs on its own.

## Distribution (statistics)

The 19th Century saw the discovery of some unsettling truths about … well, everything, really. If there is an intellectual theme of the 19th Century it’s that everything has an unsettling side. In the 20th Century craziness broke loose. The 19th Century, though, saw great reasons to doubt that we knew what we knew.

But one of the unsettling truths grew out of mathematical physics. We start out studying physics the way Galileo or Newton might have, with falling balls. Ones that don’t suffer from air resistance. Then we move up to more complicated problems, like balls on a spring. Or two balls bouncing off each other. Maybe one ball, called a “planet”, orbiting another, called a “sun”. Maybe a ball on a lever swinging back and forth. We try a couple simple problems with three balls and find out that’s just too hard. We have to track so much information about the balls, about their positions and momentums, that we can’t solve any problems anymore. Oh, we can do the simplest ones, but we’re helpless against the interesting ones.

And then we discovered something. By “we” I mean people like James Clerk Maxwell and Josiah Willard Gibbs. And that is that we can know important stuff about how millions and billions and even vaster numbers of things move around. Maxwell could work out how the enormously many chunks of rock and ice that make up Saturn’s rings move. Gibbs could work out how the trillions of trillions of trillions of trillions of particles of gas in a room move. We can’t work out how four particles move. How is it we can work out how a godzillion particles move?

We do it by letting go. We stop looking for that precision and exactitude and knowledge down to infinitely many decimal points. Even though we think that’s what mathematicians and physicists should have. What we do instead is consider the things we would like to know. Where something is. What its momentum is. What side of a coin is showing after a toss. What card was taken off the top of the deck. What tile was drawn out of the Scrabble bag.

There are possible results for each of these things we would like to know. Perhaps some of them are quite likely. Perhaps some of them are unlikely. We track how likely each of these outcomes is. This is called the distribution of the values. This can be simple. The distribution for a fairly tossed coin is “heads, 1/2; tails, 1/2”. The distribution for a fairly tossed six-sided die is “1/6 chance of 1; 1/6 chance of 2; 1/6 chance of 3” and so on. It can be more complicated. The distribution for a fairly tossed pair of six-sided dice starts out “1/36 chance of 2; 2/36 chance of 3; 3/36 chance of 4” and so on. If we’re measuring something that doesn’t come in nice discrete chunks we have to talk about ranges: the chance that a 30-year-old male weighs between 180 and 185 pounds, or between 185 and 190 pounds. The chance that a particle in the rings of Saturn is moving between 20 and 21 kilometers per second, or between 21 and 22 kilometers per second, and so on.
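
The pair-of-dice distribution is small enough to build by listing every case. Here is a minimal sketch, assuming two fair and independent six-sided dice. The function name is mine.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def two_dice_distribution():
    """Exact distribution of the total shown by two fair six-sided dice."""
    totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))
    return {total: Fraction(count, 36) for total, count in sorted(totals.items())}

distribution = two_dice_distribution()
for total, chance in distribution.items():
    print(total, chance)
```

The fractions print in lowest terms, so the 2/36 chance of a 3 appears as 1/18.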

We may be unable to describe how a system evolves exactly. But often we’re able to describe how the distribution of its possible values evolves. And the laws by which probability works conspire to work for us here. We can get quite precise predictions for how a whole bunch of things behave even without ever knowing what any thing is doing.
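
A toy version of that conspiracy: no single die roll can be predicted, but the average of a great many settles down near 3.5. The sketch below assumes Python's built-in pseudorandom generator is a fair enough stand-in for a die, and uses a fixed seed only so the run can be repeated.

```python
import random

def average_roll(count, seed=0):
    """Average of `count` simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return sum(rng.randint(1, 6) for _ in range(count)) / count

for count in (10, 1_000, 100_000):
    print(count, average_roll(count))
```

The more rolls go into the average, the less it tends to wander from 3.5, even though each individual roll is as unknowable as ever.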

That’s unsettling to start with. It’s made worse by one of the 19th Century’s late discoveries, that of chaos. That a system can be perfectly deterministic. That you might know what every part of it is doing as precisely as you care to measure. And you’re still unable to predict its long-term behavior. That’s unshakeable too, although statistical techniques will give you an idea of how likely different behaviors are. You can learn the distribution of what is likely, what is unlikely, and how often the outright impossible will happen.

Distributions follow rules. Of course they do. They’re basically the rules you’d imagine from looking at and thinking about something with a range of values. Something like a chart of how many students got what grades in a class, or how tall the people in a group are, or so on. Each possible outcome turns up some fraction of the time. That fraction’s never less than zero nor greater than 1. Add up all the fractions representing all the times every possible outcome happens and the sum is exactly 1. Something happens, even if we never know just what. But we know how often each outcome will.
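
Those rules are easy to check by machine. Here is a rough sketch, with a made-up tally of grades for a class of thirty; the numbers and the names `to_distribution` and `follows_the_rules` are hypothetical, picked only to illustrate the point.

```python
from fractions import Fraction


def to_distribution(tallies):
    """Turn raw counts of each outcome into the fraction of the time each occurs."""
    total = sum(tallies.values())
    return {outcome: Fraction(count, total) for outcome, count in tallies.items()}


def follows_the_rules(distribution):
    """Check that every fraction is between 0 and 1 and that they sum to exactly 1."""
    in_range = all(0 <= share <= 1 for share in distribution.values())
    return in_range and sum(distribution.values()) == 1


# Hypothetical tally of how many students got each grade.
grade_tallies = {"A": 6, "B": 11, "C": 8, "D": 3, "F": 2}
grade_distribution = to_distribution(grade_tallies)

print(follows_the_rules(grade_distribution))
```

I used exact fractions rather than decimals so that the sum comes out to exactly 1, with no rounding to argue about.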

There is something amazing to consider here. We can know and track everything there is to know about a physical problem. But we will be unable to do anything with it, except for the most basic and simple problems. We can choose to relax, to accept that the world is unknown and unknowable in detail. And this makes imaginable all sorts of problems that should be beyond our power. Once we’ve given up on this precision we get precise, exact information about what could happen. We can choose to see it as a moral about the benefits and costs and risks of how tightly we control a situation. It’s a surprising lesson to learn from one’s training in mathematics.

## Reading the Comics, October 29, 2016: Rerun Comics Edition

There were a couple of rerun comics in this week’s roundup, so I’ll go with that theme. And I’ll put in one more appeal for subjects for my End of 2016 Mathematics A To Z. Have a mathematics term you’d like to see me go on about? Just ask! Much of the alphabet is still available.

John Kovaleski’s Bo Nanas rerun for the 24th is about probability. There’s something wondrous and strange that happens when we talk about the probability of things like birthdays. They are, if they’re in the past, determined and fixed things. The current day is also a known, determined, fixed thing. But we do mean something when we say there’s a 1-in-365 (or 366, or 365.25 if you like) chance of today being your birthday. It seems to me this is probability based on ignorance. If you don’t know when my birthday is then your best guess is to suppose there’s a one-in-365 (or so) chance that it’s today. But I know when my birthday is; to me, with this information, the chance today is my birthday is either 0 or 1. But what are the chances that today is a day when the chance it’s my birthday is 1? At this point I realize I need much more training in the philosophy of mathematics, and the philosophy of probability. If someone is aware of a good introductory book about it, or a web site or blog that goes into these problems in a way a lay reader will understand, I’d love to hear of it.

I’ve featured this installment of Poor Richard’s Almanac before. I’ll surely feature it again. I like Richard Thompson’s sense of humor. The first panel mentions non-Euclidean geometry, using the connotation that it does have. Non-Euclidean geometries are treated as these magic things — more, these sinister magic things — that defy all reason. They can’t defy reason, of course. And at least some of them are even sensible if we imagine we’re drawing things on the surface of the Earth, or at least the surface of a balloon. (There are non-Euclidean geometries that don’t look like surfaces of spheres.) They don’t work exactly like the geometry of stuff we draw on paper, or the way we fit things in rooms. But they’re not magic, not most of them.

Stephen Bentley’s Herb and Jamaal for the 25th I believe is a rerun. I admit I’m not certain, but it feels like one. (Bentley runs a lot of unannounced reruns.) Anyway I’m refreshed to see a teacher giving a student permission to count on fingers if that’s what she needs to work out the problem. Sometimes we have to fall back on the non-elegant ways to get comfortable with a method.

Dave Whamond’s Reality Check for the 25th name-drops Einstein and one of the three equations that have any pop-culture currency.

Guy Gilchrist’s Today’s Dogg for the 27th is your basic mathematical-symbols joke. We need a certain number of these.

Berkeley Breathed’s Bloom County for the 28th is another rerun, from 1981. And it’s been featured here before too. As mentioned then, Milo is using calculus and logarithms correctly in his rather needless insult of Freida. 10,000 is a constant number, and as mentioned a few weeks back its derivative must be zero. Ten to the power of zero is 1. The log of 10, if we’re using logarithms base ten, is also 1. There are many kinds of logarithms but back in 1981, the default if someone said “log” would be the logarithm base ten. Today the default is more muddled; a normal person would mean the base-ten logarithm by “log”. A mathematician might mean the natural logarithm, base ‘e’, by “log”. But why would a normal person mention logarithms at all anymore?
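
For anyone who wants the arithmetic spelled out, here are the three facts named above, written as equations. I haven’t reproduced the strip’s exact wording, so take the variable $x$ as my own assumption about what the derivative is taken with respect to.

$\frac{d}{dx}\,10{,}000 = 0, \qquad 10^{0} = 1, \qquad \log_{10} 10 = 1$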

Jef Mallett’s Frazz for the 28th is mostly a bit of wordplay on evens and odds. It’s marginal, but I do want to point out some comics that aren’t reruns in this batch.
