## My 2019 Mathematics A To Z: Martingales

Today’s A To Z term was nominated again by @aajohannas. The other compelling nomination was from Vayuputrii, for the Mittag-Leffler function. I was tempted. But I realized I could not think of a clear way to describe why the function is interesting, or even where it comes from, without a heap of technical terms. There’s no avoiding technical terms in writing about mathematics, but there’s only so many I want to put in at once. It also makes me realize I don’t understand the Mittag-Leffler function, but it is, after all, something I haven’t worked much with.

The Mittag-Leffler function looks like it’s one of those things named for several contributors, like Runge-Kutta Integration or the Cauchy-Kovalevskaya Theorem or something. Not so here; this was one person, Gösta Mittag-Leffler. His name’s all over the theory of functions. And he was one of the people helping Sofia Kovalevskaya, whom you know from every list of pioneering women in mathematics, secure her professorship.

# Martingales.

A martingale is how mathematicians prove you can’t get rich gambling.

Well, that exaggerates. Some people will be lucky, of course. But there’s no strategy that works. The only strategy that works is to rig the game. You can do this openly, by setting rules that give you a slight edge. You usually have to be the house to do this. Or you can do it covertly, with card-counting (in blackjack), weighted dice, or other tricks. But a fair game? Meaning one not biased towards or against any player? There’s no strategy to guarantee winning that.

We can make this more technical. Martingales arise from the world of stochastic processes. A stochastic process is an indexed set of random variables. A random variable is some variable with a value that depends on the result of some phenomenon. A tossed coin. Rolled dice. The number of people crossing a particular walkway over a day. Engine temperature. The value of a stock being traded. Whatever. We can’t forecast what the next value will be. But we know the distribution: which values are more likely, which are unlikely, and which are impossible.

The field grew out of studying real-world phenomena. Things we could sample and do statistics on. So it’s hard to think of an index that isn’t time, or some proxy for time like “rolls of the dice”. Stochastic processes turn up all over the place. A lot of what we want to know is impossible, or at least impractical, to exactly forecast. Think of the work needed to forecast how many people will cross this particular walk four days from now. But it’s practical to describe what are more and less likely outcomes. What the average number of walk-crossers will be. What the most likely number will be. Whether to expect tomorrow to be a busier or a slower day.

And this is what the martingale is for. Start with a sequence of your random variables. How many people have crossed that street each day since you started studying. What is the expectation value, the best guess, for the next result? Your best guess for how many will cross tomorrow? Keeping in mind your knowledge of all these past values. That’s an important piece. It’s not a martingale if the history of results isn’t a factor.

Every probability question has to deal with knowledge. Sometimes it’s easy. The probability of a coin coming up tails next toss? That’s one-half. The probability of a coin coming up tails next toss, given that it came up tails last time? That’s still one-half. The probability of a coin coming up tails next toss, given that it came up tails the last 40 tosses? That’s … starting to make you wonder if this is a fair coin. I’d bet tails, but I’d also ask to examine both sides, for a start.

So a martingale is a stochastic process where we can make forecasts about the future. Particularly, the expectation value. The expectation value is the sum of the products of every possible value and how probable they are. In a martingale, the expected value for all time to come is just the current value. So if whatever it was you’re measuring was, say, 40 this time? That’s your expectation for the whole future. Specific values might be above 40, or below 40, but on average, 40 is it.

Put it that way and you’d think, well, how often does that ever happen? Maybe some freak process will give you that, but most stuff?

Well, here’s one. The random walk. Set a value. At each step, it can increase or decrease by some fixed value. It’s as likely to increase as to decrease. This is a martingale. And it turns out a lot of stuff is random walks. Or can be processed into random walks. Even if the original walk is unbalanced — say it’s more likely to increase than decrease. Then we can do a transformation, and find a new random variable based on the original. Then that one is as likely to increase as decrease. That one is a martingale.
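
Here’s a sketch of that balanced random walk in Python (the function names are my own). Averaging many walks shows the martingale property: the mean final value stays at the starting value.

```python
import random

def random_walk(start, steps, rng):
    """Symmetric random walk: each step is +1 or -1 with equal chance."""
    value = start
    for _ in range(steps):
        value += rng.choice((-1, 1))
    return value

# Averaging many walks shows the martingale property: the expected
# future value is just the starting value.
rng = random.Random(42)
trials = 20000
mean_final = sum(random_walk(40, 50, rng) for _ in range(trials)) / trials
print(round(mean_final))  # should be close to 40
```

Individual walks wander far above and below 40, as the essay says; only the average sits still.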

It’s not just random walks. Poisson processes are things where the chance of something happening is tiny, but it has lots of chances to happen. So this measures things like how many car accidents happen on this stretch of road each week. Or where a couple plants will grow together into a forest, as opposed to lone trees. How often a store will have too many customers for the cashiers on hand. These processes by themselves aren’t often martingales. But we can use them to make a new stochastic process, and that one is a martingale.
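
A sketch of that construction in Python (names mine): a simulated Poisson process drifts upward, so it is not a martingale by itself, but subtracting its expected growth (the standard "compensated" process) brings the average back to zero.

```python
import random

def poisson_count(lam, t, rng):
    """Number of events in [0, t] for a Poisson process of rate lam,
    simulated by summing exponential waiting times between events."""
    count, time = 0, 0.0
    while True:
        time += rng.expovariate(lam)
        if time > t:
            return count
        count += 1

# The raw count drifts upward (mean near lam * t), so it is no martingale.
# Subtracting the expected growth recenters the average near zero.
rng = random.Random(1)
lam, t, trials = 3.0, 10.0, 20000
mean_n = sum(poisson_count(lam, t, rng) for _ in range(trials)) / trials
print(round(mean_n - lam * t, 1))  # should be close to 0.0
```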

Where this all comes to gambling is in stopping times. A stopping time is a random variable based on the stochastic process you started with. Its value is the index, the time, at which the process first does some particular thing, say, reaches some particular value. The defining condition is that you can tell whether to stop using only the history up to that index, with no peeking ahead. The language evokes a gambler’s decision: when do you stop? There are two obvious stopping times for any game. One is to stop when you’ve won enough money. The other is to stop when you’ve lost your whole stake.

So there is something interesting about a martingale that has bounds. It will almost certainly hit at least one of those bounds, in a finite time. (“Almost certainly” has a technical meaning. It’s the same thing I mean when I say if you flip a fair coin infinitely many times then “almost certainly” it’ll come up tails at least once. Like, it’s not impossible that it doesn’t. It just won’t happen.) And for the gambler? The boundary of “runs out of money” is a lot closer than “makes the house run out of money”.

Oh, if you just want a little payoff, that’s fine. If you’re happy to walk away from the table with a one percent profit? You can probably do that. You’re closer to that boundary than to the runs-out-of-money one. A ten percent profit? Maybe so. Making an unlimited amount of money, like you’d want to live on your gambling winnings? No, that just doesn’t happen.
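
A simulation makes the asymmetry concrete. In this Python sketch (names mine), a fair game starting with 100 units stops at either ruin or a one percent profit; the classical gambler’s-ruin result for a fair game is that the chance of reaching the goal is the starting stake divided by the goal.

```python
import random

def gamble_until_bound(stake, goal, rng):
    """Fair game: the stake moves up or down one unit per round,
    until it hits 0 (ruin) or the goal (walk away happy)."""
    while 0 < stake < goal:
        stake += rng.choice((-1, 1))
    return stake == goal

# Start with 100 units, stop at a one percent profit (101) or at ruin (0).
# For a fair game the chance of reaching the goal works out to stake/goal.
rng = random.Random(7)
trials = 2000
wins = sum(gamble_until_bound(100, 101, rng) for _ in range(trials))
print(round(wins / trials, 2))  # near 100/101, about 0.99
```

Move the goal up toward "unlimited money" and that success probability slides toward zero, which is the essay's point.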

This gets controversial when we turn from gambling to the stock market. Or a lot of financial mathematics. Look at the value of a stock over time. I write “stock” for my convenience. It can be anything with a price that’s constantly open for renegotiation. Stocks, bonds, exchange funds, used cars, fish at the market, anything. The price over time looks like it’s random, at least hour-by-hour. So how can you reliably make money if the fluctuations of the price of a stock are random?

Well, if I knew, I’d have smaller student loans outstanding. But martingales seem like they should offer some guidance. Much of modern finance builds on not dealing with the varying stock price directly. Instead, buy the right to buy the stock at a set price. Or buy the right to sell the stock at a set price. This lets you pay to secure a certain profit, or a worst-possible loss, in case the price reaches some level. And now you see the martingale. Is it likely that the stock will reach a certain price within this set time? How likely? This can, in principle, guide you to a fair price for this right-to-buy.

The mathematical reasoning behind that is fine, so far as I understand it. Trouble arises because pricing correctly means having a good understanding of how likely it is prices will reach different levels. Fortunately, there are few things humans are better at than estimating probabilities. Especially the probabilities of complicated situations, with abstract and remote dangers.

So martingales are an interesting corner of mathematics. They apply to purely abstract problems like random walks. Or to good mathematical physics problems like Brownian motion and the diffusion of particles. And they’re lurking behind the scenes of the finance news. Exciting stuff.

Thanks for reading. This and all the other Fall 2019 A To Z posts should be at this link. Yes, I too am amazed to be halfway done; it feels like I’m barely one-fifth of the way done. For Thursday I hope to publish ‘N’. And I am taking nominations for subjects for the letters O through T, at this link.

## The Summer 2017 Mathematics A To Z: Quasirandom numbers

Gaurish, host of For the love of Mathematics, gives me the excuse to talk about amusement parks. You may want to brace yourself. Yes, this essay includes a picture. It would have included a video if I had enough WordPress privileges for that.

# Quasirandom numbers.

Think of a merry-go-round. Or carousel, if you prefer. I will venture a guess. You might like merry-go-rounds. They’re beautiful. They can evoke happy thoughts of childhood, when they were a big ride it was safe to go on. But they don’t often make one think of thrills. They’re generally sedate things. They don’t need to be. There’s no great secret to making a carousel a thrill ride. They knew it a century ago, when all the great American carousels were carved. It’s simple. Make the thing spin fast enough, at the five or six rotations per minute the ride was made for. There are places that do this yet. There’s the Cedar Downs ride at Cedar Point, Sandusky, Ohio. There’s the antique carousel at Crossroads Village, a historical village/park just outside Flint, Michigan. There’s the Derby Racer at Playland in Rye, New York. There’s the carousel in the Merry-Go-Round Museum in Sandusky, Ohio. Any of them are great rides. Two of them have a special edge. I’ll come back to them.

Randomness is a valuable resource. We know it’s key to many things. We have major fields of mathematics built on it. We can understand the behavior of variables without ever knowing what value they have. All we need is to know the chance they might be in some particular range. This makes possible all kinds of problems too complicated to do otherwise. We know it’s critical. Quantum mechanics would not work without randomness. Without quantum mechanics, matter doesn’t work. And that’s true randomness, the kind where something is unpredictable. It’s not the kind of randomness we talk about when we ask, say, what’s the chance someone was born on a Tuesday. That’s mere hidden information: if we knew the month and date and year of a person’s birth we would know whether they were born Tuesday or not. We need more.

So the trouble is actually getting a random number. Well, a sequence of randomly drawn numbers. We rarely need this if we’re doing analysis. We can understand how some process changes the shape of a distribution without ever using the distribution. We can take derivatives of a function without ever evaluating the original function, after all.

But we do need randomly drawn numbers. We do too much numerical work with them. For example, it’s impossible to exactly integrate most functions. Numerical methods can take a ferociously long time to evaluate. A family of methods called Monte Carlo rely on randomly-drawn values to estimate the integral. The results are strikingly good for the work required. But they must have random numbers. The name “Monte Carlo” is not some cryptic code. It is an expression of how randomly drawn numbers make the tool work.
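
A minimal Monte Carlo integration sketch in Python (names mine): average the function at randomly drawn points and scale by the width of the interval.

```python
import random

def monte_carlo_integrate(f, a, b, n, rng):
    """Estimate the integral of f over [a, b] by averaging f at
    n randomly drawn points and scaling by the interval length."""
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

# Estimate the integral of x^2 on [0, 1]; the exact answer is 1/3.
rng = random.Random(0)
estimate = monte_carlo_integrate(lambda x: x * x, 0.0, 1.0, 100000, rng)
print(round(estimate, 2))  # close to 0.33
```

One dimension hardly needs this, but the same few lines work unchanged in dimensions where grid-based methods take that ferociously long time.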

It’s hard to get random numbers. Consider: we can’t write an algorithm to do it. If we could write one, then we’d be able to predict what the sequence of numbers would be. We have some recourse. We could set up instruments to rely on the randomness that seems to be in the world. Thermal fluctuations, for example, created by processes outside any computer’s control, can give us a pleasant dose of randomness. If we need higher-quality random numbers than that we can go to exotic equipment. Geiger counters watching the decay of a not-alarmingly-radioactive sample. Cosmic ray detectors watching the sky.

Or we can write something that produces numbers that look random enough. They won’t really be random, and if we wait long enough we’ll notice the sequence repeats itself. But if we only need, say, ten numbers, who cares if the sequence will repeat after ten million numbers? (We’ll surely need more than ten numbers. But we can postpone the repetition until we’ve drawn far more than ten million numbers.)

Two of the carousels I’ve mentioned have an astounding property. The horses in a file move. I mean, relative to each other. Some horse will start the race in front of its neighbors; some will start behind. The four horses in each file move forward and back thanks to a mechanism of, I am assured, staggering complexity. There are only three carousels in the world that have it. There’s Cedar Downs at Cedar Point in Sandusky, Ohio; the Racing Downs at Playland in Rye, New York; and the Derby Racer at Blackpool Pleasure Beach in Blackpool, England. The mechanism in Blackpool’s hasn’t operated in years. The one at Playland’s had not run in years, but was restored for the 2017 season. My love and I made a trip specifically to ride that. (You may have heard of a fire at the carousel in Playland this summer. This was part of the building for their other, non-racing, antique carousel. My last information was that the carousel itself was all right.)

These racing derbies have the horses in a file move forward and back in a “random” way. It’s not truly random. If you knew exactly which gears were underneath each horse, and where in their rotations they were, you could say which horse was about to gain on its partners and which was about to fall back. But all that is concealed from the rider. The horse patterns will eventually, someday, repeat. If the gear cycles aren’t interrupted by maintenance or malfunctions. But nobody’s going to ride any horse long enough to notice. We have in these rides a randomness as good as what your computer makes, at least for the purpose it serves.

What does it mean to look random? Some things seem obvious. All the possible numbers ought to come up, sooner or later. Any particular possible number shouldn’t repeat too often. Any particular possible number shouldn’t go too long without repeating. There shouldn’t be clumps of numbers; if, say, ‘4’ turns up, we shouldn’t see ‘5’ turn up right away all the time.

We can make the idea of “looking” random quite literal. Suppose we’re selecting numbers from 0 through 9. We can draw the random numbers we’ve picked. Use the numbers as coordinates. Say we pick four digits: 1, 3, 9, and 0. Then draw the point that’s at x-coordinate 13, y-coordinate 90. Then the next four digits. Let’s say they’re 4, 2, 3, and 8. Then draw the point that’s at x-coordinate 42, y-coordinate 38. And repeat. What will this look like?

If it clumps up, we probably don’t have good random numbers. If we see lines that points collect along, or avoid, there’s a good chance our numbers aren’t very random. If there’s whole blocks of space that they occupy, and others they avoid, we may have a defective source of random numbers. We should expect the points to cover a space pretty uniformly. (There are more rigorous, logically sound, methods. The eye can be fooled easily enough. But it’s the same principle. We have some test that notices clumps and gaps.) But …
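
A toy version of that picture test in Python (names mine), counting grid cells instead of drawing points: a decent source covers nearly all hundred cells, while a rigidly patterned "source" visits only a few.

```python
import random

def pair_coverage(digits):
    """Treat consecutive digit pairs as (x, y) points on a 10-by-10
    grid and count how many of the 100 cells get at least one point."""
    cells = {(digits[i], digits[i + 1]) for i in range(0, len(digits) - 1, 2)}
    return len(cells)

rng = random.Random(3)
good = [rng.randrange(10) for _ in range(2000)]   # 1000 random points
bad = [i % 10 for i in range(2000)]               # a rigidly patterned "source"
print(pair_coverage(good), pair_coverage(bad))    # near 100 versus just 5
```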

The thing is, there’s always going to be some clumps. There’ll always be some gaps. Part of randomness is that it forms patterns, or at least things that look like patterns to us. We can describe how big a clump (or gap; it’s the same thing, really) is for any particular quantity of randomly drawn numbers. If we see clumps bigger than that we can throw out the numbers as suspect. But … still …

Toss a coin fairly twenty times, and there’s no reason it can’t turn up tails sixteen times. This doesn’t happen often, but it will happen sometimes. Just luck. This surplus of tails should evaporate as we take more tosses. That is, we most likely won’t see 160 tails out of 200 tosses. We certainly will not see 1,600 tails out of 2,000 tosses. We know this as the Law of Large Numbers. Wait long enough and weird fluctuations will average out.
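
A quick Python sketch (names mine) of the Law of Large Numbers at work: the fraction of tails drifts toward one-half as the number of tosses grows.

```python
import random

def tail_fraction(tosses, rng):
    """Fraction of tails in a given number of fair coin tosses."""
    return sum(rng.random() < 0.5 for _ in range(tosses)) / tosses

# The weird fluctuations average out as the number of tosses grows.
rng = random.Random(5)
for n in (20, 200, 2000, 20000):
    print(n, round(tail_fraction(n, rng), 3))
```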

What if we don’t have time, though? For coin-tossing that’s silly; of course we have time. But for Monte Carlo integration? It could take too long to be confident we haven’t got too-large gaps or too-tight clusters.

This is why we take quasi-random numbers. We begin with what randomness we’re able to manage. But we massage it. Imagine our coin-tossing example. Suppose after ten fair tosses we noticed there had been eight tails turn up. Then we would start tossing less fairly, trying to make heads more common. We would be happier if there were 12 rather than 16 tails after twenty tosses.

Draw the results. We get now a pattern that looks still like randomness. But it’s a finer sorting; it looks like static tidied up some. The quasi-random numbers are not properly random. Knowing that, say, the last several numbers were odd means the next one is more likely to be even, the Gambler’s Fallacy put to work. But in aggregate, we trust, we’ll be able to enjoy the speed and power of randomly-drawn numbers. It shows its strengths when we don’t know just how finely we must sample a range of numbers to get good, reliable results.
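
The essay massages coin tosses by hand; in practice, quasirandom (low-discrepancy) sequences are usually generated deterministically. Here is the classic van der Corput sequence in Python (names mine), which fills the unit interval evenly from the very first terms.

```python
def van_der_corput(n, base=2):
    """The n-th term of the van der Corput sequence: write n in the
    given base and mirror its digits across the radix point."""
    result, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        result += digit / denom
    return result

# The first few terms fill the unit interval evenly instead of clumping:
print([van_der_corput(n) for n in range(1, 8)])
# [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
```

Each new term lands in the largest gap left by the earlier ones, which is exactly the tidied-up static the paragraph describes.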

To carousels. I don’t know whether the derby racers have quasirandom outcomes. I would find believable someone telling me that all the possible orderings of the four horses in any file are equally likely. To know would demand detailed knowledge of how the gearing works, though. Also probably simulations of how the system would work if it ran long enough. It might be easier to watch the ride for a couple of days and keep track of the outcomes. If someone wants to sponsor me doing a month-long research expedition to Cedar Point, drop me a note. Or just pay for my season pass. You folks would do that for me, wouldn’t you? Thanks.

## Reading the Comics, October 19, 2016: An Extra Day Edition

I didn’t make noise about it, but last Sunday’s mathematics comic strip roundup was short one day. I was away from home and normal computer stuff Saturday. So I posted without that day’s strips under review. There was just the one, anyway.

Also I want to remind folks I’m doing another Mathematics A To Z, and taking requests for words to explain. There are many appealing letters still unclaimed, including ‘A’, ‘T’, and ‘O’. Please put requests in over on that page, because it’s easier for me to keep track of what’s been claimed that way.

Matt Janz’s Out of the Gene Pool rerun for the 15th missed last week’s cut. It does mention the Law of Cosines, which is what the Pythagorean Theorem looks like if you don’t have a right triangle. You still have to have a triangle. Bobby-Sue recites the formula correctly, if you know the notation. The formula’s $c^2 = a^2 + b^2 - 2 a b \cos\left(C\right)$. Here ‘a’ and ‘b’ and ‘c’ are the lengths of the sides of the triangle. ‘C’, the capital letter, is the size of the angle opposite the side with length ‘c’. That’s a common notation. ‘A’ would be the size of the angle opposite the side with length ‘a’. ‘B’ is the size of the angle opposite the side with length ‘b’. The Law of Cosines is a generalization of the Pythagorean Theorem. It’s a result that tells us something like the original theorem but for cases the original theorem can’t cover. And if it happens to be a right triangle the Law of Cosines gives us back the original Pythagorean Theorem. In a right triangle C is the size of a right angle, and the cosine of that is 0.
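
A few lines of Python (names mine) check the reduction: with C a right angle the cosine term vanishes and the formula gives back the Pythagorean result.

```python
import math

def law_of_cosines(a, b, angle_c):
    """Side length opposite angle C, from c^2 = a^2 + b^2 - 2ab cos(C)."""
    return math.sqrt(a * a + b * b - 2 * a * b * math.cos(angle_c))

# With C a right angle the cosine term vanishes, recovering Pythagoras:
print(law_of_cosines(3, 4, math.pi / 2))  # 5.0, the 3-4-5 right triangle
```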

That said Bobby-Sue is being fussy about the drawings. No geometrical drawing is ever perfectly right. The universe isn’t precise enough to let us draw a right triangle. Come to it we can’t even draw a triangle, not really. We’re meant to use these drawings to help us imagine the true, Platonic ideal, figure. We don’t always get there. Mock proofs, the kind of geometric puzzle showing something we know to be nonsense, rely on that. Give chalkboard art a break.

Samson’s Dark Side of the Horse for the 17th is the return of Horace-counting-sheep jokes. So we get a π joke. I’m amused, although I couldn’t sleep trying to remember digits of π out quite that far. I do better working out Collatz sequences.

Hilary Price’s Rhymes With Orange for the 19th at least shows the attempt to relieve mathematics anxiety. I’m sympathetic. It does seem like there should be ways to relieve this (or any other) anxiety, but finding which ones work, and which ones work best, is partly a mathematical problem. As often happens with Price’s comics I’m particularly tickled by the gag in the title panel.

Norm Feuti’s Gil rerun for the 19th builds on the idea calculators are inherently cheating on arithmetic homework. I’m sympathetic to both sides here. If Gil just wants to know that his answers are right there’s not much reason not to use a calculator. But if Gil wants to know that he followed the right process then the calculator’s useless. By the right process I mean, well, the work to be done. Did he start out trying to calculate the right thing? Did he pick an appropriate process? Did he carry out all the steps in that process correctly? If he made mistakes on any of those he probably didn’t get to the right answer, but it’s not impossible that he would. Sometimes multiple errors conspire and cancel one another out. That may not hurt you with any one answer, but it does mean you aren’t doing the problem right and a future problem might not be so lucky.

Zach Weinersmith’s Saturday Morning Breakfast Cereal rerun for the 19th has God crashing a mathematics course to proclaim there’s a largest number. We can suppose there is such a thing. That’s how arithmetic modulo a number is done, for one. It can produce weird results in which stuff we just naturally rely on doesn’t work anymore. For example, in ordinary arithmetic we know that if one number times another equals zero, then either the first number or the second, or both, were zero. We use this in solving polynomials all the time. But in arithmetic modulo 8 (say), 4 times 2 is equal to 0.
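
That zero-divisor example takes one line of Python to verify, using the `%` operator for arithmetic modulo 8:

```python
# In ordinary arithmetic a product is zero only when a factor is zero.
# Modulo 8 that rule breaks down: the product wraps around to 0.
print((4 * 2) % 8)   # 0, though neither factor is 0 mod 8
print((3 * 5) % 8)   # 7: multiply as usual, then wrap
```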

And if we recklessly talk about “infinity” as a number then we get outright crazy results, some of them teased in Weinersmith’s comic. “Infinity plus one”, for example, is “infinity”. So is “infinity minus one”. If we do it right, “infinity minus infinity” is “infinity”, or maybe zero, or really any number you want. We can avoid these logical disasters — so far, anyway — by being careful. We have to understand that “infinity” is not a number, though we can use numbers growing infinitely large.

Induction, meanwhile, is a great, powerful, yet baffling form of proof. When it solves a problem it solves it beautifully. And easily, too, usually by doing something like testing two special cases. Maybe three. At least a couple special cases of whatever you want to know. But picking the cases, and setting them up so that the proof is valid, is not easy. There are logical pitfalls and it is so hard to learn how to avoid them.

Jon Rosenberg’s Scenes from a Multiverse for the 19th plays on a wonderful paradox of randomness. Randomness is … well, unpredictable. If I tried to sell you a sequence of random numbers and they were ‘1, 2, 3, 4, 5, 6, 7’ you’d be suspicious at least. And yet, perfect randomness will sometimes produce patterns. If there were no little patches of order we’d have reason to suspect the randomness was faked. There is no reason that a message like “this monkey evolved naturally” couldn’t be encoded into a genome by chance. It may just be so unlikely we don’t buy it. The longer the patch of order the less likely it is. And yet, incredibly unlikely things do happen. The study of impossibly unlikely events is a good way to quickly break your brain, in case you need one.

## Reading the Comics, October 14, 2015: Shapes and Statistics Edition

It’s been another strong week for mathematics in the comic strips. The 15th particularly was a busy enough day I’m going to move its strips off to the next Reading the Comics group. What we have already lets me talk about shapes, and statistics, and what randomness can do for you.

Carol Lay’s Lay Lines for the 11th of October turns the infinite-monkeys thought-experiment into a contest. It’s an intriguing idea. To have the monkey save correct pages foils the pure randomness that makes the experiment so mind-boggling. However, saving partial successes like correct pages is, essentially, how randomness can be harnessed to do work for us. This is normally in fields known, generally, as Monte Carlo methods, named in honor of the famed casinos.

Suppose you have a problem in which it’s hard to find the best answer, but it’s easy to compare whether one answer is better than another. For example, suppose you’re trying to find the shortest path through a very complicated web of interactions. It’s easy to say how long a path is, and easy to say which of two paths is shorter. It’s hard to say you’ve found the shortest. So what you can do is pick a path at random, and take its length. Then make an arbitrary, random change in it. The changed path is either shorter or longer. If the random change makes the path shorter, great! If the random change makes the path longer, then (usually) forget it. Repeat this process and you’ll get, by hoarding incremental improvements and throwing away garbage, your shortest possible path. Or at least close to it.

Properly, you have to sometimes go along with changes that lengthen the path. It might turn out there’s a really short path you can get to if you start out in an unpromising direction. For a monkey-typing problem such as in the comic, there’s no need for that. You can save correct pages and discard the junk.
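
Here is a bare-bones version of that keep-the-improvements strategy in Python (all names mine): shuffle a path through random points, then try random swaps and hoard only the ones that shorten it. A proper method would sometimes accept longer paths, as just noted; this sketch skips that.

```python
import math
import random

def path_length(points, order):
    """Total length of the path visiting points in the given order."""
    return sum(math.dist(points[order[i]], points[order[i + 1]])
               for i in range(len(order) - 1))

def improve_path(points, steps, rng):
    """Start from a random order, then try random swaps and keep
    only those that shorten the path."""
    order = list(range(len(points)))
    rng.shuffle(order)
    start_len = path_length(points, order)
    best = start_len
    for _ in range(steps):
        i, j = rng.sample(range(len(points)), 2)
        order[i], order[j] = order[j], order[i]
        trial = path_length(points, order)
        if trial < best:
            best = trial                              # keep the improvement
        else:
            order[i], order[j] = order[j], order[i]   # forget it
    return start_len, best

rng = random.Random(9)
points = [(rng.random(), rng.random()) for _ in range(15)]
start_len, best = improve_path(points, 5000, rng)
print(best <= start_len)  # True: the kept path never gets longer
```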

Geoff Grogan’s Jetpack Junior for the 12th of October, and after, continues the explorations of a tesseract. The strip uses the familiar idea that a tesseract opens up to a vast, nearly infinite space. I’m torn about whether that’s a fair representation. A four-dimensional hypercube is still a finite (hyper)volume, after all. A four-dimensional cube ten feet on each side contains 10,000 hypercubic feet, not infinitely great a (hyper)volume. On the other hand … well, think of how many two-dimensional squares you could fit in a three-dimensional box. A two-dimensional object has no volume — zero measure, in three-dimensional space — so you could fit infinitely many of them into the box. This may be reasonable but it still runs against my intuition, and my sense of what makes for a fair story premise.

Ernie Bushmiller’s Nancy for the 13th of October, originally printed in 1955, describes a couple geometric objects. I have to give Nancy credit for a description of a sphere that’s convincing, even if it isn’t exactly correct. Even if the bubble-gum bubble Nancy were blowing didn’t have a distortion to her mouth, it would still sag under gravity. But it’s easy, at least if you already have an intuitive understanding of spheres, to go from the bubble-gum bubble to the ideal sphere. (Homework question: why does Sluggo’s description of an octagon need to specify “a figure with eight sides and eight angles”? Why wouldn’t specifying a figure with eight sides, or one with eight angles, be enough?)

Jon Rosenberg’s Scenes From A Multiverse for the 13th of October depicts a playground with kids who’re well-versed in the problems of statistical inference. A “statistically significant sample size” nearly explains itself. It is difficult to draw reliable conclusions from a small sample, because a small sample can be weird. Generally, the difference between the statistics of a sample and the statistics of the broader population it’s drawn from will be smaller the larger the sample is. There are several courses hidden in that “generally” there.

“Selection bias” is one of the courses hidden in that “generally”. A good sample should represent the population fairly. Whatever’s being measured should appear in the sample about as often as it appears in the population. It’s hard to say that’s so, though, before you know what the population is like. A biased selection over-represents some part of the population, or under-represents it, in some way.

“Confirmation bias” is another of the courses. That amounts to putting more trust in evidence that supports what we want to believe, and in discounting evidence against it. People tend to do this, without meaning to fool themselves or anyone else. It’s easiest to do with ambiguous evidence: is the car really running smoother after putting in more expensive spark plugs? Is the dog actually walking more steadily after taking this new arthritis medicine? Has the TV program gotten better since the old show-runner was kicked out? If these can be quantified in some way, and a complete record made, it’s typically easier to resist confirmation bias. But not everything can be quantified, and even so, differences can be subtle, and demand more research than we can afford.

On the 15th, Scenes From A Multiverse did another strip with some mathematical content. It’s about the question of whether it’s possible to determine whether the universe is a computer simulation. But the same ideas apply to questions like whether there could be a multiverse, some other universe than ours. The questions seem superficially to be unanswerable. There are some enthusiastic attempts, based on what things we might conclude. I suspect that the universe is just too small a sample size to draw any good conclusions from, though.

Dan Thompson’s Brevity for the 14th of October is another anthropomorphized-numerals joke.

## Reading the Comics, May 9, 2015: Trapezoid Edition

And now I get caught up again, if briefly, to the mathematically-themed comic strips I can find. I’ve dubbed this one the trapezoid edition because one happens to mention the post that will outlive me.

Todd Clark’s Lola (May 4) is a straightforward joke. Monty’s given his chance of passing mathematics and doesn’t understand the prospect is grim.

Joe Martin’s Willy and Ethel (May 4) shows an astounding feat of mind-reading, or of luck. How amazing it is to draw a number at random from a range depends on many things. It’s less impressive to pick the right number if there are only three possible answers than it is to pick the right number out of ten million possibilities. When we ask someone to pick a number we usually mean a range of the counting numbers. My experience suggests it’s “one to ten” unless some other range is specified. The other thing affecting how amazing the feat is, though, is the distribution. There might be ten million possible responses, but if only a few of them are likely then the feat is much less impressive.

The distribution of a random number is the interesting thing about it. The number has some value, yes, and we may not know what it is, but we know how likely it is to be any of the possible values. And good mathematics can be done knowing the distribution of a value of something. The whole field of statistical mechanics is an example of that. James Clerk Maxwell, famous for the equations which describe electromagnetism, used such random variables to explain how the rings of Saturn could exist. It isn’t easy to start solving problems with distributions instead of particular values — I’m not sure I’ve seen a good introduction, and I’d be glad to pass one on if someone can suggest it — but the power it offers is amazing.

I had been talking about how much information there is in the outcome of basketball games, or tournaments, or the like. I wanted to fill in at least one technical term, to match some of the others I’d given.

In this information-theory context, an experiment is just anything that could have different outcomes. A team can win or can lose or can tie in a game; that makes the game an experiment. The outcomes are the team wins, or loses, or ties. A team can get a particular score in the game; that makes that game a different experiment. The possible outcomes are the team scores zero points, or one point, or two points, or so on up to whatever the greatest possible score is.

If you know the probability of each of the different outcomes, and since this is a mathematics thing we suppose that you do, then we have what I was calling the information content of the outcome of the experiment. That’s a number, measured in bits, and given by the formula

$\sum_{j} - p_j \cdot \log\left(p_j\right)$

The sigma summation symbol means to evaluate the expression to the right of it for every value of some index j. The $p_j$ means the probability of outcome number j. And the logarithm may be that of any base, although if we use base two then we have an information content measured in bits. Those are the same bits as are in the bytes that make up the megabytes and gigabytes in your computer. You can see this number as an estimate of how many well-chosen yes-or-no questions you’d have to ask to pick the actual result out of all the possible ones.
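The formula translates directly into a few lines of Python. This is a sketch of my own, not anything from elsewhere in the post; the function name is invented for illustration:

```python
import math

def shannon_entropy(probs, base=2):
    """Information content, in bits when base=2, of a distribution.

    Implements the sum over j of -p_j * log(p_j); terms with
    p_j = 0 contribute nothing, by the usual convention.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# A fair coin is one bit: one well-chosen yes-or-no question settles it.
print(shannon_entropy([0.5, 0.5]))       # 1.0
# A heavily lopsided coin carries much less information.
print(shannon_entropy([0.99, 0.01]))     # about 0.08 bits
```

Four equally likely outcomes come to two bits, matching the two yes-or-no questions you’d need to pin one down.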

I’d called this the information content of the experiment’s outcome. That’s an idiosyncratic term, chosen because I wanted to hide what it’s normally called. The normal name for this is the “entropy”.

To be more precise, it’s known as the “Shannon entropy”, after Claude Shannon, pioneer of the modern theory of information. However, the equation defining it looks the same as one that defines the entropy of statistical mechanics, that thing everyone knows is always increasing and somehow connected with stuff breaking down. Well, almost the same. The statistical mechanics one multiplies the sum by a constant number called the Boltzmann constant, after Ludwig Boltzmann, who did so much to put statistical mechanics in its present and very useful form. That constant changes the units but not the shape of the formula, so we aren’t thrown by it. The statistical mechanics entropy describes energy that is in a system but that can’t be used. It’s almost background noise, present but nothing of interest.

Is this Shannon entropy the same entropy as in statistical mechanics? This gets into some abstract grounds. If two things are described by the same formula, are they the same kind of thing? Maybe they are, although it’s hard to see what kind of thing might be shared by “how interesting the score of a basketball game is” and “how much unavailable energy there is in an engine”.

The legend has it that when Shannon was working out his information theory he needed a name for this quantity. John von Neumann, the mathematician and pioneer of computer science, suggested, “You should call it entropy. In the first place, a mathematical development very much like yours already exists in Boltzmann’s statistical mechanics, and in the second place, no one understands entropy very well, so in any discussion you will be in a position of advantage.” There are variations of the quote, but they have the same structure and punch line. The anecdote appears to trace back to an April 1961 seminar at MIT given by one Myron Tribus, who claimed to have heard the story from Shannon. I am not sure whether it is literally true, but it does express a feeling about how people understand entropy that is true.

Well, these entropies have the same form. And they’re given the same name, give or take a modifier of “Shannon” or “statistical” or some other qualifier. They’re even often given the same symbol; normally a capital S or maybe an H is used as the quantity of entropy. (H tends to be more common for the Shannon entropy, but your equation would be understood either way.)

I’m not comfortable saying they’re the same thing, though. After all, we use the same formula to calculate a batting average and to work out the average time of a commute. But we don’t think those are the same thing, at least not more generally than “they’re both averages”. These entropies measure different kinds of things. They have different units that just can’t be sensibly converted from one to another. And the statistical mechanics entropy has many definitions that not just don’t have parallels for information, but wouldn’t even make sense for information. I would call these entropies siblings, with strikingly similar profiles, but not more than that.

But let me point out something about the Shannon entropy. It is low when an outcome is predictable. When the outcome is unpredictable, knowing it is presumably interesting, because there was no guessing what it might be; that is where the entropy is maximized. But an absolutely random outcome, every possibility equally likely, has that same maximal entropy, and that’s boring: there’s no reason for the outcome to be one option instead of another. Somehow, as looked at by the measure of entropy, the most interesting of outcomes and the most meaningless of outcomes blur together. There is something wondrous and strange in that.

## Calculating Pi Terribly

I’m not really a fan of Pi Day. I’m not fond of the 3/14 format for writing dates to start with — it feels intolerably ambiguous to me for the first third of the month — and it requires reading the / as a . to make sense, when that just is not how the slash works. If we use the / in any of its normal forms then Pi Day should be the 22nd of July, but that’s incompatible with the normal American date-writing conventions and leaves a day that’s nominally a promotion of the idea that “mathematics is cool” in the middle of summer vacation. This particular objection evaporates if you use . as the separator between month and day, but I don’t like that either, since it uses something indistinguishable from a decimal point as something which is not any kind of decimal point.

Also it encourages people to post a lot of pictures of pies, and make jokes about pies, and that’s really not a good pun. It plays on the coincidence of sounds without having any of the kind of ambiguity or contrast between or insight into concepts that normally make for the strongest puns, and it hasn’t even got the spontaneity of being something that just came up in conversation. We could use better jokes is my point.

But I don’t want to be relentlessly down about what’s essentially a bit of whimsy. (Although, also, dropping the ’20’ from 2015 so as to make this the Pi Day Of The Century? Tom Servo has a little song about that sort of thing.) So, here’s a neat and spectacularly inefficient way to generate the value of pi, that doesn’t superficially rely on anything to do with circles or diameters, and that’s probability-based. The wonderful randomness of the universe can give us a very specific and definite bit of information.
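I won’t spoil which scheme the post goes on to describe, but the best-known spectacularly inefficient probability-based estimate throws random points at a square and counts how many land inside the inscribed quarter-circle. A minimal sketch, with a fixed seed so it’s repeatable:

```python
import random

def estimate_pi(samples=1_000_000, seed=2015):
    """Monte Carlo estimate of pi.

    A point (x, y) drawn uniformly from the unit square lands inside
    the quarter circle x**2 + y**2 <= 1 with probability pi/4, so
    four times the observed fraction approximates pi.
    """
    rng = random.Random(seed)
    inside = sum(
        1
        for _ in range(samples)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4 * inside / samples

print(estimate_pi())  # close to 3.14
```

The error shrinks only like one over the square root of the sample count, which is what makes the method so terrible, and so charming.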

## At The Pinball Tables

A neat coincidence happened as our local pinball league got plans under way for tonight. There are thirteen pinball machines in the local venue, and normally four of them get picked for the night’s competition. The league president’s gone to a random number generator to pick the machines, since this way he doesn’t have to take off his hat and draw pinball table names from it. This week, though, he reported that the random number generator had picked the same four tables as it had last session.

There’s a decent little probability quiz to be built around that fact: how many ways there are to get four tables out of the thirteen available, obviously, and from that what the chance is of repeating the selection of tables from the last session. And there are subtler ones, like, what’s the chance of the same tables being drawn two weeks in a row over the course of the season (which is eight meetings long, and one postseason tournament), or what’s the chance of any week’s selection of tables being repeated over the course of a season, or of a year (which has two seasons). And I leave some space below for people who want to work out these problems or figure out similar related ones.
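The first couple of those questions work out quickly. Here’s a sketch of the easy ones, treating each week’s draw as independent and using the complement trick for the season-long question:

```python
from math import comb

ways = comb(13, 4)    # distinct ways to pick 4 tables from 13
print(ways)           # 715

# Chance the next session repeats this session's exact four tables:
p_repeat = 1 / ways

# Chance at least one of the seven week-to-week transitions in an
# eight-meeting season repeats the previous week's exact draw:
p_season = 1 - (1 - p_repeat) ** 7
print(p_repeat, p_season)
```

The subtler question, whether any two weeks across a whole season share a draw, is a birthday-problem variant and I leave it for the space below.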

It’s also a reminder that just because something is randomly drawn doesn’t mean that coincidences and patterns won’t appear. It would be a touch suspicious, in fact, if the random number generator never picked the same table (or several tables) in successive weeks. But it’s still a rare enough event that it’s interesting to see it happen.

## Reading The Comics, November 14, 2014: Rectangular States Edition

I have no idea why Comic Strip Master Command decided this week should see everybody do some mathematics-themed comic strips, but, so they did, and here’s my collection of the, I estimate, six hundred comic strips that touched on something recently. Good luck reading it all.

Samson’s Dark Side of the Horse (November 10) is another entry on the theme of not answering the word problem.

Scott Adams’s Dilbert Classics (November 10) started a sequence in which Dilbert gets told the big boss was a geometry major, so, what can he say about rectangles? Further rumors indicate he’s more a geography fan, shifting Dilbert’s topic to the “many” rectangular states of the United States. Of course, there are only two literally rectangular states, but — and Mark Stein’s How The States Got Their Shapes contains a lot of good explanations of this — many of the states are approximately rectangular. After all, when many of the state boundaries were laid out, the federal government had only vague if any idea what the landscapes looked like in detail, and there weren’t many existing indigenous boundaries the white governments cared about. So setting a proposed territory’s bounds to be within particular lines of latitude and longitude, with some modification for rivers or shorelines or mountain ranges known to exist, is easy, and can be done with rather little of the ambiguity or contradictory nonsense that plagued the eastern states (where, say, a colony’s boundary might be defined as where a river intersects a line of latitude that in fact it never touches). And while perfect rectangularity may be achieved only by Colorado and Wyoming, quite a few states — the Dakotas, Washington, Oregon, Mississippi, Alabama, Iowa — are rectangular enough.

Mikael Wulff and Anders Morgenthaler’s WuMo (November 10) shows that their interest in pi isn’t just a casual thing. They think about what those neglected and non-famous numbers get up to.

Jim Toomey’s Sherman’s Lagoon starts a “struggling with mathematics homework” story on the 11th, with Sherman himself stumped by a problem that “looks more like a short story” than a math problem. By the 14th Megan points out that it’s a problem that really doesn’t make sense when applied to sharks. Such is the natural hazard in writing a perfectly good word problem without considering the audience.

Mike Peters’s Mother Goose and Grimm (November 12) takes one of its (frequent) breaks from the title characters for a panel-strip-style gag about Roman numerals.

Darrin Bell’s Candorville (November 12) starts talking about Zeno’s paradox — not the first time this month that a comic strip’s gotten to the apparent problem of covering any distance when distance is infinitely divisible. On November 13th it’s extended to covering stretches of time, which has exactly the same problem. Now it’s worth reminding people, because a stunning number of them don’t seem to understand this, that Zeno was not suggesting that there’s no such thing as motion (or that he couldn’t imagine an infinite convergent sequence; it’s easy to think of a geometric construction that would satisfy any ancient geometer); he was pointing out that there are things that don’t make perfect sense about it. Either distance (and time) are infinitely divisible into indistinguishable units, or they are not; and either way has implications that seem contrary to the way motion works. Perhaps they can be rationalized; perhaps they can’t; but when you can find a question that’s easy to pose and hard to answer, you’re probably looking at something really worth thinking hard about.

Bill Amend’s FoxTrot Classics (November 12, a rerun) puns on the various meanings of “irrational”. A fun little fact you might want to try proving sometime, though I wouldn’t fault you if you only tried it out for a couple specific numbers and decided the general case too much to do: any whole number — like 2, 3, 4, or so on — has a square root that’s either another whole number or else irrational. There’s not a case where, say, the square root is exactly 45.144 or something like that, though it might be close.
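Sorting whole numbers into the two camps is a quick computation; the function name here is my own invention. What the computer can’t show, but a proof can, is that every number outside the perfect squares has an irrational root: a root that were exactly a terminating decimal like 45.144 would square to a fraction rather than a whole number.

```python
from math import isqrt

def has_whole_square_root(n):
    """True exactly when the whole number n is a perfect square."""
    r = isqrt(n)       # integer part of the square root
    return r * r == n

# The square roots of 1 through 20 that come out whole;
# every other root in the range is irrational.
print([n for n in range(1, 21) if has_whole_square_root(n)])  # [1, 4, 9, 16]
```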

Sandra Bell-Lundy’s Between Friends (November 13) shows one of those cases where mental arithmetic really is useful, as Susan tries to work out — actually, staring at it, I’m not precisely sure what she is trying to work out. Her and her coffee partner’s ages in Grade Ten, probably, or else just when Grade Ten was. That’s most likely her real problem: if you don’t know what you’re looking for it’s very difficult to find it. Don’t start calculating before you know what you’re trying to work out.

If I wanted to work out what year was 35 years ago I’d probably just use a hack: 35 years before 2014 is one year before “35 years before 2015”, which is a much easier problem to do. 35 years before 2015 is also 20 years before 2000, which is 1980, so subtract one and you get 1979. (Alternatively, I might remember it was 35 years ago that the Buggles’ “Video Killed The Radio Star” first appeared, which I admit is not a method that would work for everyone, or for all years.) If I wanted to work out my (and my partner’s) age in Grade Ten … well, I’d use a slightly different hack: I remember very well that I was ten years old in Grade Five (seriously, the fact that twice my grade was my age overwhelmed my thinking on my tenth birthday, which is probably why I had to stay in mathematics), so, add five to that and I’d be 15 in Grade Ten.

Bill Whitehead’s Free Range (November 13) brings up one of the most-quoted equations in the world in order to show off how kids will insult each other, which is fair enough.

Rick Detorie’s One Big Happy (November 13), this one a rerun from a couple years ago because that’s how his strip works on Gocomics, goes to one of its regular bits of the kid Ruthie teaching anyone she can get in range, and while there’s a bit more to arithmetic than just adding two numbers to get a bigger number, she is showing off an understanding of a useful sanity check: if you add together two (positive) numbers, you have to get a result that’s bigger than either of the ones you started with. As for the 14th, and counting higher, well, there’s not much she could do about that.

Steve McGarry’s Badlands (November 14) talks about the kind of problem people wish to have: how to win a lottery where nobody else picks the same numbers, so that the prize goes undivided? The answer, of course, is to have a set of numbers that nobody else picked, but is there any way to guarantee that? And this gets into the curious psychology of random numbers: there is absolutely no reason that 1-2-3-4-5-6, or for that matter 7-8-9-10-11-12, would not come up just as often as, say, 11-37-39-51-52-55, but the latter set looks more random. But we see some strings of numbers as obviously a pattern, while others we don’t see, and we tend to confuse “we don’t know the pattern” with “there is no pattern”. I have heard the lore that actually a disproportionate number of people pick such obvious patterns like 1-2-3-4-5-6, or numbers that form neat pictures on a lottery card, no doubt cackling at how much more clever they are than the average person, and guaranteeing that if such a string ever does come out there’ll be a large number of very surprised lottery winners. All silliness, really; the thing to do, obviously, is buy two tickets with the exact same set of numbers, so that if you do win, you get twice the share of anyone else, unless they’ve figured out the same trick.