What I Learned Doing My 2018 Mathematics A To Z


I have a tradition at the end of an A To Z sequence of looking back and considering what I learned doing it. Sometimes this is mathematics I’ve learned. At the risk of spoiling the magic, I don’t know as much as I present myself as knowing. I’ll often take an essay topic and study up before writing, and hope that I look competent enough that nobody seriously questions me. Yes, I thought I was a pretty good student journalist back in the day, and harbored fantasies of doing that for a career. This before I went on to fantasize about doing mathematics. Still; I also learn things about writing in doing a big writing project like this. And now I’ve had some breathing space to sit and think. I can try finding out what I thought.

Cartoon of a thinking coati (a raccoon-like animal from Latin America); beside him, Scrabble tiles spell out 'MATHEMATICS A TO Z' on a starry background with arithmetic symbols as constellations. Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge, which you can read six months ahead of public publication by subscribing to his Patreon. He’s on Twitter as @Newshoundscomic.

The major difference between this A To Z and those of past years was scheduling. Through 2017 I’d done A To Z’s posted three days a week. This is a thrilling schedule. It makes it easy to have full weeks, even full months, with some original posting every day. I can tell myself that the number of posts doesn’t matter. It’s the quality that does. It doesn’t work. I tend toward compulsive behavior and a-post-a-day is so gratifying.

But I also knew that the last quarter of 2018 would be busy and I had to cut something down. Some of that is Reading the Comics posts. Including pictures of each comic I discuss adds considerably to the production time. This is not least because I feel I can’t reasonably claim Fair Use of the comics without writing something of more substance. Moving files around and writing alt-text for the images also takes time. But the time can be worth it. Doing that has sometimes made me think longer, or even better, about the comics.

So switching to two-a-week seemed the thing to do. It would spread the A To Z over three months, but that’s not bad. I figured to prove out whether that schedule worked. If I could find one that let me do possibly several A To Z sequences in a year that’d be wonderful too. I’ve only done that once, so far, in 2016, but I think the exercise is always good if exhausting for me.

This completely failed to save time. I had fewer things to write, but somehow, I only wrote more. All my writing’s getting longer as I go on, yes. Last year my average post exceeded a thousand words, though, and that counts Reading the Comics and pointers to things I’ve been reading and all that. There’s a similar steady expansion going on in my humor blog. Possibly having more time between essays encouraged me to write longer for each. Work famously expands to fit the time available, and having as many as three full days to write, rather than one, might be dangerous. For the first time in an A To Z I never got ahead of myself. I would, at best, be researching and making notes for the next essay while waiting for the current one to post. There’s value in dangerous writing. But I don’t like that as a habit.

Another scheduling change was in how I took topic suggestions. In the past I’d thrown the whole alphabet open at once. This time I broke the alphabet into a couple of pieces, and asked for about one-quarter at a time. This, overall, worked. For one, it gave me more chances to talk about the A To Z. Talking about something is one of the non-annoying ways to advertise a thing. And I think it helped a greater variety of people suggest topics. I did have more collisions this time around, letters for which several people suggested different ideas. That’s a happy situation to have. Thinking of what to write is the hard part; going on about a topic someone else named? That’s easy. So I’ll certainly keep that.

I did write some about maybe doing supplementary pieces, based on topics I didn’t use for the main line. Might yet do that, perhaps under rules where I do one a week, or limit myself to 700 words, or something like that. It might be worth doing a couple just to have a buffer against weeks when there are no comic strips worth discussing. Or to head off gaps next time around, although that would spoil some letters for people.

Also I completely ran out of ‘X’ topics, and went with the 90s alternative of “extreme”. There are plenty of “extreme” things in mathematics I could write about. But that feels a bit chintzy to do too often.

This time around I changed focus on many of my essays. It wasn’t a conscious thing, not to start with. But I got to writing more about the meaning and significance and cultural import of topics, rather than definitions or descriptions of the use of things. This is a natural direction to go for a topic like Fermat’s Last Theorem, or the Infinite Monkey Theorem, or mathematics jokes. I liked the way those pieces turned out, though, and tried doing more of it. This likely helped the essays grow so long. Context demands space, after all. And more thinking. Thinking’s the hard part of writing, but it’s also fun, because when you’re thinking about a subject you aren’t typing any specific words.

But it’s probably a worthwhile shift. For a pop mathematics blog to describe what makes something a “smooth” function is all right. But it’s not a story unless it says why we should care. That’s more about context than about definitions, which anyone could get by typing ‘mathworld smooth’ into DuckDuckGo. For all the trouble this causes me, it’s the way to go.

Every good lesson carries its opposite along, though. One of the requests this time around was about Lord Kelvin. There’s no end of things you can write about him: he did important work in basically every field of science and mathematics as the 19th century knew it. It’s easy to start writing about his work and never stop. I did the opposite, taking one tiny and often-overlooked piece and focusing on that. I’m not sure it alone would convince anyone of Kelvin’s exhaustive greatness. But I don’t imagine anyone interested in reading a single essay on Kelvin would never read a second one. It seems to me a couple narrow-focus essays help in that context. Seeing more of one detail gives scale to the big picture.

I’ve done just the one A To Z the last two years. There’s surely an optimal rate for doing these. The sequences are usually good for my readership. My experience, tracking monthly readership figures, suggests that just posting more often is good for my readership. They’re also the thing I write that most directly solicits reader responses. They’re also exhausting. The last several letters are always a challenge. The weeks after a sequence is completed I collapse into a little recuperative bubble. So I want to do these as much as I can without burning out on the idea. Also without overloading Thomas Dye, who’s been so good as to make the snappy banners for these pieces. He has his own projects, including the web comic Projection Edge, to worry about. More than once a year is probably sustainable. I may also want to stack this with hosting the Playful Mathematics Education Blog Carnival again, if I’m able to this year.

Deep down, though, I think the best moment of my Fall 2018 A To Z might have been in the first essay. I wrote about asymptotes and realized I could put in ordinary words why they were a thing worth having. If I could have three insights like that a year I’d be a great mathematics writer.

I put a roster of things written up in this A To Z at this page. The Summer 2015 A To Z essays should be here. The essays from the Leap Day 2016 A To Z essays are at this link. The essays from the End 2016 A To Z essays are here. Those from the Summer 2017 A To Z sequence are at this link. And I should keep using the A-To-Z tag, so all of these, and any future A To Z essays, should appear at this link. Thank you for reading along.

What I Wrote About in My 2018 Mathematics A To Z


I have reached the end! Thirteen weeks at two essays per week to describe a neat sampling of mathematics. I hope to write a few words about what I learned by doing all this. In the meanwhile, though, I want to gather together the list of all the essays I did put into this project.

My 2018 Mathematics A To Z: Zugzwang


My final glossary term for this year’s A To Z sequence was suggested by aajohannas, who’d also suggested “randomness” and “tiling”. I don’t know of any blogs or other projects they’re behind, but if I do hear, I’ll pass them on.

Cartoon of a thinking coati (a raccoon-like animal from Latin America); beside him, Scrabble tiles spell out 'MATHEMATICS A TO Z' on a starry background with arithmetic symbols as constellations. Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge, which you can read six months ahead of public publication by subscribing to his Patreon. He’s on Twitter as @Newshoundscomic.

Zugzwang.

Some areas of mathematics struggle against the question, “So what is this useful for?” As though usefulness were a particular merit — or demerit — for a field of human study. Most mathematics fields discover some use, though, even if it takes centuries. Others are born useful. Probability, for example. Statistics. Know what the fields are and you know why they’re valuable.

Game theory is another of these. The subject, as often happens, we can trace back centuries. Usually as the study of some particular game. Occasionally in the study of some political science problem. But game theory developed a particular identity in the early 20th century. Some of this from set theory experts. Some from probability experts. Some from John von Neumann, because it was the 20th century and all that. Calling it “game theory” explains why anyone might like to study it. Who doesn’t like playing games? Who, studying a game, doesn’t want to play it better?

But why it might be interesting is different from why it might be important. Think of what a game is. It is a string of choices made by one or more parties. The point of the choices is to achieve some goal. Put that way you realize: this is everything. All life is making choices, all in the pursuit of some goal, even if that goal is just “not end up any worse off”. I don’t know that the earliest researchers in game theory as a field realized what a powerful subject they had touched on. But by the 1950s they were doing serious work in strategic planning, and by 1964 were even giving us Stanley Kubrick movies.

This is taking me away from my glossary term. The field of games is enormous. If we narrow the field some we can discuss specific kinds of games. And say more involved things about these games. So first we’ll limit things by thinking only of sequential games. These are ones where there are a set number of players, and they take turns making choices. I’m not sure whether the field expects the order of play to be the same every time. My understanding is that much of the focus is on two-player games. What’s important is that at any one step there’s only one party making a choice.

The other thing narrowing the field is to think of information. There are many things that can affect the state of the game. Some of them might be obvious, like where the pieces are on the game board. Or how much money a player has. We’re used to that. But there can be hidden information. A player might conceal some game money so as to make other players underestimate her resources. Many card games have one or more cards concealed from the other players. There can be information unknown to any party. No one can make a useful prediction what the next throw of the game dice will be. Or what the next event card will be.

But there are games where there’s none of this ambiguity. These are called games with “perfect information”. In them all the players know the past moves every player has made. Or at least should know them. Players are allowed to forget what they ought to know.

There’s a separate but similar-sounding idea called “complete information”. In a game with complete information, players know everything that affects the gameplay. At least, probably, apart from what their opponents intend to do. This might sound like an impossibly high standard, at first. All games with shuffled decks of cards and with dice to roll are out. There’s no concealing or lying about the state of affairs.

Set complete-information aside; we don’t need it here. Think only of perfect-information games. What are they? Some ancient games, certainly. Tic-tac-toe, for example. Some more modern versions, like Connect Four and its variations. Some that are actually deep, like checkers and chess and go. Some that are, arguably, more puzzles than games, as in sudoku. Some that hardly seem like games, like several people agreeing how to cut a cake fairly. Some that seem like tests to prove people are fundamentally stupid, like when you auction off a dollar. (The rules are set so players can easily end up paying more than a dollar.) But that’s enough for me, at least. You can see there are games of clear, tangible interest here.

The last restriction: think only of two-player games. Or at least two parties. Any of these two-party sequential games with perfect information are a part of “combinatorial game theory”. It doesn’t usually allow for incomplete-information games. But at least the MathWorld glossary doesn’t demand they be ruled out. So I will defer to this authority. I’m not sure how the name “combinatorial” got attached to this kind of game. My guess is that it seems like you should be able to list all the possible combinations of legal moves. That number may be enormous, as chess and go players are always going on about. But you could imagine a vast book which lists every possible game. If your friend ever challenged you to a game of chess the two of you could simply agree, oh, you’ll play game number 2,038,940,949,172 and then look up to see who won. Quite the time-saver.

Most games don’t have such a book, though. Players have to act on what they understand of the current state, and what they think the other player will do. This is where we get strategies from. Not just what we plan to do, but what we imagine the other party plans to do. When working out a strategy we often expect the other party to play perfectly. That is, to make no mistakes, to not do anything that worsens their position. Or that reduces their chance of winning.

… And yes, arguably, the word “chance” doesn’t belong there. These are games where the rules are known, every past move is known, every future move is in principle computable. And if we suppose everyone is making the best possible move then we can imagine forecasting the whole future of the game. One player has a “chance” of winning in the same way Christmas day of the year 2038 has a “chance” of being on a Tuesday. That is, the probability is just an expression of our ignorance, that we don’t happen to be able to look it up.

But what choice do we have? I’ve never seen a reference that lists all the possible games of tic-tac-toe. And that’s about the simplest combinatorial-game-theory game anyone might actually play. What’s possible is to look at the current state of the game. And evaluate which player seems to be closer to her goal. And then look at all the possible moves.
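Nobody prints that reference, but for tic-tac-toe a computer can at least count what such a book would contain. Here’s a rough sketch in Python, under the convention that a game ends the moment someone wins or the board fills up; the function names are my own, not anything standard:

```python
# Count every complete game of tic-tac-toe by brute force.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

def winner(board):
    for a, b, c in LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def count_games(board, player):
    total = 0
    for cell in range(9):
        if board[cell] is not None:
            continue
        board[cell] = player
        if winner(board) or all(square is not None for square in board):
            total += 1    # the game just ended; that's one complete game
        else:
            total += count_games(board, 'O' if player == 'X' else 'X')
        board[cell] = None                  # undo the move and try the next
    return total

print(count_games([None] * 9, 'X'))   # 255,168 complete games, by this count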

There are three things a move can do. It can put the party closer to the goal. It can put the party farther from the goal. Or it can do neither. On her turn the other party might do something that moves you farther from your goal, moves you closer to your goal, or doesn’t affect your status at all. It seems like this makes strategy obvious. On every step take the available move that takes one closest to the goal. This is known as a “greedy” strategy. As the name suggests it isn’t automatically bad. If you expect the game to be a short one, greed might be the best approach. The catch is that moves that seem less good — even ones that seem to hurt you initially — might set up other, even better moves. So strategy requires some thinking beyond the current step. Properly, it requires thinking through to the end of the game. Or at least until the end of the game seems obvious.
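For a small enough game, that thinking-through-to-the-end can be done exhaustively. The standard scheme is called minimax: assume the other party will always pick whatever is worst for you, and choose your move accordingly. A minimal sketch, with `moves`, `apply_move`, and `score` as hypothetical hooks you would supply for your particular game:

```python
def minimax(state, moves, apply_move, score, maximizing=True):
    """Return (value, best move), assuming both players play perfectly.

    `moves` should return an empty list once the game is over; `score`
    rates a finished game from the first player's point of view.
    """
    legal = moves(state)
    if not legal:                       # game over: judge the final position
        return score(state), None
    best_value, best_move = None, None
    for move in legal:
        value, _ = minimax(apply_move(state, move), moves, apply_move, score,
                           not maximizing)
        if (best_value is None
                or (maximizing and value > best_value)
                or (not maximizing and value < best_value)):
            best_value, best_move = value, move
    return best_value, best_move
```

Unlike the greedy strategy, this rates a move by the best outcome it can force at the end of the game, not by how it looks right now.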

We should like a strategy that leaves us no choice but to win. Next-best would be one that leaves the game undecided, since something might happen like the other player needing to catch a bus and so resigning. This is how I got my solitary win in the two months I spent in the college chess club. Worst would be the games that leave us no choice but to lose.

It can be that there are no good moves. That is, that every move available makes it a little less likely that we win. Sometimes a game offers the chance to pass, preserving the state of the game but giving the other party the turn. Then maybe the other party will do something that creates a better opportunity for us. But if we are allowed to pass, there’s a good chance the game lets the other party pass, too, and we end up in the same fix. And it may be the rules of the game don’t allow passing anyway. One must move.

The phenomenon of having to make a move when it’s impossible to make a good move has prominence in chess. I don’t have the chess knowledge to say how common the situation is. But it seems to be a situation people who study chess problems love. I suppose it appeals to a love of lost causes and the hope that you can be brilliant enough to see what everyone else has overlooked. German chess literati gave it a name about 160 years ago, “zugzwang”, “compulsion to move”. Somehow I never encountered the term when I was briefly a college chess player. Perhaps because I was never in zugzwang and was just too incompetent a player to find my good moves. I first encountered the term in Michael Chabon’s The Yiddish Policeman’s Union. The protagonist picked up on the term as he investigated the murder of a chess player and then felt himself in one.

Combinatorial game theorists have picked up the word, and sharpened its meaning. If I understand correctly chess players allow the term to be used for any case where a player hurts her position by moving at all. Game theorists make it more dire. This may reflect their knowledge that an optimal strategy might require taking some dismal steps along the way. The game theorist formally grants the term only to the situation where the compulsion to move changes what should be a win into a loss. This seems terrible, but then, we’ve all done this in play. We all feel terrible about it.

I’d like here to give examples. But in searching the web I can find only either courses in game theory, which are a bit too much for even me to summarize, or chess problems, which I’m not up to understanding. It seems hard to set out an example: I need to not just set out the game, but show that what had been a win is now, by any available move, turned into a loss. Chess is looser. It even allows, I discover, a double zugzwang, where both players are at a disadvantage if they have to move.

It’s a quite relatable problem. You see why game theory has this reputation as mathematics that touches all life.


And with that … I am done! All of the Fall 2018 Mathematics A To Z posts should be at this link. Next week I’ll post my big list of all the letters, though. And, as has become tradition, a post about what I learned by doing this project. And sometime before then I should have at least one more Reading the Comics post. Thanks kindly for reading and we’ll see when in 2019 I feel up to doing another of these.

My 2018 Mathematics A To Z: Yamada Polynomial


I had another free choice. I thought I’d go back to one of the topics I knew and loved in grad school even though I didn’t have the time to properly study it then. It turned out I had forgotten some important points and spent a night crash-relearning knot theory. This isn’t a bad thing necessarily.

Cartoon of a thinking coati (a raccoon-like animal from Latin America); beside him, Scrabble tiles spell out 'MATHEMATICS A TO Z' on a starry background with arithmetic symbols as constellations. Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge, which you can read six months ahead of public publication by subscribing to his Patreon. He’s on Twitter as @Newshoundscomic.

Yamada Polynomial.

This is a thing which comes from graphs. Not the graphs you ever drew in algebra class. Graphs as in graph theory. These are figures made of spots called vertices. Pairs of vertices are connected by edges. There are many interesting things to study about these.

One path to take in understanding graphs is polynomials. Of course I would bring things back to polynomials. But there’s good reasons. These reasons come to graph theory by way of knot theory. That’s an interesting development since we usually learn graph theory before knot theory. But knot theory has the idea of representing these complicated shapes as polynomials.

There are a bunch of different polynomials for any given graph. The oldest kind, the Alexander Polynomial, J W Alexander developed in the 1920s. And that was about it until the 1980s, when suddenly everybody was coming up with good new polynomials. The definitions are different. They give polynomials that look different. Some are able to distinguish between a knot and the knot that’s its reflection across a mirror. Some, like the Alexander, aren’t. But they’re alike in some important ways. One is that they might not actually be, you know, polynomials. I mean, they’ll be the sum of numbers — whole numbers, even — times a variable raised to a power. The variable might be t, might be x. Might be something else, but it doesn’t matter. It’s a pure dummy variable. But the variable might be raised to a negative power, which isn’t really a polynomial. It might even be raised to, oh, one-half or three-halves, or minus nine-halves, or something like that. We can try saying this is “a polynomial in t-to-the-halves”. Mostly it’s because we don’t have a better name for it.

And going from a particular knot to a polynomial follows a pretty common procedure. At least it can, when you’re learning knot theory and feel a bit overwhelmed trying to prove stuff about “knot invariants” and “homologies” and all. Having a specific example can be such a comfort. You can work this out by an iterative process. Take a specific drawing of your knot. There’s places where the strands of the knot cross over one another. For each of those crossings you ponder some alternate cases where the strands cross over in a different way. And then you add together some coefficient times the polynomial of this new, different knot. The coefficient you get by the rules of whatever polynomial you’re making. The new, different knots are, usually, no more complicated than what you started with. They’re often simpler knots. This is what saves you from an eternity of work. You’re breaking the knot down into more but simpler knots. Just the fact of doing that can be satisfying enough. Eventually you get to something really simple, like a circle, and declare that’s some basic polynomial. Then there’s a lot of adding up coefficients and powers and all that. Tedious but not hard.
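The exact rules depend on which polynomial you’re computing. For a taste of what they look like, here is one famous scheme, the Kauffman bracket, which underlies the Jones polynomial. If L_\times is a diagram with a particular crossing, and L_0 and L_\infty are the diagrams you get by splitting that crossing open the two possible ways, then

\langle L_\times \rangle = A \langle L_0 \rangle + A^{-1} \langle L_\infty \rangle \qquad \langle \bigcirc \rangle = 1 \qquad \langle \bigcirc \cup L \rangle = \left(-A^2 - A^{-2}\right) \langle L \rangle

Apply the first rule over and over and every crossing eventually disappears; the other two rules mop up the plain circles left behind.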

Knots are made from a continuous loop of … we’ll just call it thread. It can fold over itself many times. It has to, really, or it hasn’t got a chance of being more interesting than a circle. A graph is different. That there are vertices seems to change things. Less than you’d think, though. The thread of a knot can cross over and under itself. Edges of a graph can cross over and under other edges. This isn’t too different. We can also imagine replacing a spot where two edges cross over and under the other with an intersection and new vertex.

So we get to the Yamada polynomial by treating a graph an awful lot like we might treat a knot. Take the graph and split it up at each overlap. At each overlap we have something that looks, at least locally, kind of like an X. An upper left, upper right, lower left, and lower right intersection. The lower left connects to the upper right, and the upper left connects to the lower right. But these two edges don’t actually touch; one passes over the other. (By convention, the lower left going to the upper right is on top.)

There’s three alternate graphs. One has the upper left connected to the lower left, and the upper right connected to the lower right. This looks like replacing the X with a )( loop. The second alternate has the upper left connected to the upper right, and the lower left connected to the lower right. This looks like … well, that )( but rotated ninety degrees. I can’t do that without actually including a picture. The third alternate puts a vertex in the X. So now the upper left, upper right, lower left, and lower right all connect to the new vertex in the center.

Probably you’d agree that replacing the original X with a )( pattern, or its rotation, probably doesn’t make the graph any more complicated. And it might make the graph simpler. But adding that new vertex looks like trouble. It looks like it’s getting more complicated. We might get stuck in an infinite regression of more-complicated polynomials.

What saves us is the coefficient we’re multiplying the polynomials for these new graphs by. It’s called the “chromatic coefficient” and it reflects how many different colors you need to color in this graph. An edge needs to connect two different colors. And — what happens if an edge connects a vertex to itself? That is, the edge loops around back to where it started? That’s got a chromatic number of zero and the moment we get a single one of these loops anywhere in our graph we can stop calculating. We’re done with that branch of the calculations. This is what saves us.

There’s a catch. It’s a catch that knot polynomials have, too. This scheme writes a polynomial not just for a particular graph but a particular way of rendering this graph. There’s always other ways to draw it. If nothing else you can always twirl an edge over itself, into a loop like you get when Christmas tree lights start tangling themselves up. But you can move the vertices to different places. You can have an edge go outside the rest of the figure instead of inside, that sort of thing. Starting from a different rendition of the shape gets you to a different polynomial.

Superficially different, anyway. What you get from two different renditions of the same graph are polynomials that differ by a factor of your dummy variable raised to a whole number. Also maybe a plus-or-minus sign. You can see a difference between, say, t^{-1} - 2 + 3t (to make up an example) and t - 2t^2 + 3t^3 . But you can see that second polynomial is just t^2\left(t^{-1} - 2 + 3t\right) . It’s some confounding factor times something that is distinctive to the graph.

And that distinctive part, the thing that doesn’t change if you draw the graph differently? That’s the Yamada polynomial, at last. It’s a way to represent this collection of vertices and edges using only coefficients and exponents.

I would like to give an impressive roster of uses for these polynomials here. I’m afraid I have to let you down. There is the obvious use: if you suspect two graphs are really the same, despite how different they look, here’s a test. Calculate their Yamada polynomials and if they’re different, you know the graphs were different. It can be hard to tell. Get anything with more than, say, eight vertices and 24 edges in it and you’re not going to figure that out by sight.

I encountered the Yamada polynomial specifically as part of a textbook chapter about chemistry. It’s easy to imagine there should be great links between knots and graphs and the way that atoms bundle together into molecules. The shape of their structures describes what they will do. But I am not enough of a chemist to say how this description helps chemists understand molecules. It’s possible that it doesn’t: Yamada’s paper introducing the polynomial was published in 1989. My knot theory textbook might have brought it up because it looked exciting. There are trends and fashions in mathematical thought too. I don’t know what several more decades of work have done to the polynomial’s reputation. I’m glad to hear from people who know better.


There’s one more term in the Fall 2018 Mathematics A To Z to come. Will I get the article about it written before Friday? We’ll know on Saturday! At least I don’t have more Reading the Comics posts to write before Sunday.

My 2018 Mathematics A To Z: Extreme Value Theorem


The letter ‘X’ is a problem. For all that the letter ‘x’ is important to mathematics there aren’t many mathematical terms starting with it. Mr Wu, mathematics tutor and author of the MathTuition88 blog, had a suggestion. Why not 90s it up a little and write about an Extreme theorem? I’m game.

The Extreme Value Theorem, which I chose to write about, is a fundamental bit of analysis. There is also a similarly-named but completely unrelated Extreme Value Theory. This exists in the world of statistics. That’s about outliers, and about how likely it is you’ll find an even more extreme outlier if you continue sampling. This is valuable in risk assessment: put another way, it’s the question of what neighborhoods you expect to flood based on how the river’s overflowed the last hundred years. Or be in a wildfire, or be hit by a major earthquake, or whatever. The more I think about it the more I realize that’s worth discussing too. Maybe in the new year, if I decide to do some A To Z extras.

Cartoon of a thinking coati (a raccoon-like animal from Latin America); beside him, Scrabble tiles spell out 'MATHEMATICS A TO Z' on a starry background with arithmetic symbols as constellations. Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge, which you can read six months ahead of public publication by subscribing to his Patreon. He’s on Twitter as @Newshoundscomic.

Extreme Value Theorem.

There are some mathematical theorems which defy intuition. You can encounter one and conclude that can’t be so. This can inspire one to study mathematics, to understand how it could be. Famously, the philosopher Thomas Hobbes encountered the Pythagorean Theorem and disbelieved it. He then fell into a controversial love with the subject. Some you can encounter, and study, and understand, and never come to believe. This would be the Banach-Tarski Paradox. It’s the realization that one can split a ball into as few as five pieces, and reassemble the pieces, and have two complete balls. They can even be wildly larger or smaller than the one you started with. It’s dazzling.

And then there are theorems that seem the opposite. Ones that seem so obvious, and so obviously true, that they hardly seem like mathematics. If they’re not axioms, they might as well be. The extreme value theorem is one of these.

It’s a theorem about functions. Here, functions that have a domain and a range that are both real numbers. Even more specifically, about continuous functions. “Continuous” is a tricky idea to make precise, but we don’t have to do it. A century of mathematicians worked out meanings that correspond pretty well to what you’d imagine it should mean. It means you can draw a graph representing the function without lifting the pen. (Do not attempt to use this definition at your thesis defense. I’m skipping a century’s worth of hard thinking about the subject.)

And it’s a theorem about “extreme” values. “Extreme” is a convenient word. It means “maximum or minimum”. We’re often interested in the greatest or least value of a function. Having a scheme to find the maximum is as good as having one to find a minimum. So there’s little point talking about them as separate things. But that forces us to use a bunch of syllables. Or to adopt a convention that “by maximum we always mean maximum or minimum”. We could say we mean that, but I’ll bet a good number of mathematicians, and 95% of mathematics students, would forget the “or minimum” within ten minutes. “Extreme”, then. It’s short and punchy and doesn’t commit us to a maximum or a minimum. It’s simply the most outstanding value we can find.

The Extreme Value Theorem doesn’t help us find them. It only proves to us there is an extreme to find. Particularly, it says that if a continuous function has a domain that’s a closed interval, then it has to have a maximum and a minimum. And it has to attain the maximum and the minimum at least once each. That is, something in the domain matches to the maximum. And something in the domain matches to the minimum. Could be multiple times, yes.
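Written out in symbols, the theorem is pleasantly short. For f continuous on the closed interval [a, b], there are points c and d in [a, b] with

f(c) \le f(x) \le f(d) \quad \text{for all } x \in [a, b]

The minimum gets attained at c, the maximum at d. (There might be several such points; the theorem promises at least one of each.)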

This might not seem like much of a theorem. Existence proofs rarely do. It’s a bias, I suppose. We like to think we’re out looking for solutions. So we suppose there’s a solution to find. Checking that there is an answer before we start looking? That seems excessive. Before heading to the airport we might check the flight wasn’t delayed. But we almost never check that there is still a Newark to fly to. I’m not sure, in working out problems, that we check it explicitly. We decide early on that we’re working with continuous functions and so we can try out the usual approaches. That we use the theorem becomes invisible.

And that’s sort of the history of this theorem. The Extreme Value Theorem, for example, is part of how we now prove Rolle’s Theorem. Rolle’s theorem is about functions continuous and differentiable on the interval from a to b. And functions that have the same value for a and for b. The conclusion is the function has got a local maximum or minimum in-between these. It’s the theorem depicted in a widely-shared xkcd comic. Rolle’s Theorem is named for Michel Rolle, who proved the theorem (for polynomials) in 1691. The Indian mathematician Bhaskara II, in the 12th century, stated the theorem too. (I’m so ignorant of the Indian mathematical tradition that I don’t know whether Bhaskara II stated it for polynomials, or for functions in general, or how it was proved.)

The Extreme Value Theorem was proven around 1860. (There was an earlier proof, by Bernard Bolzano, whose name you’ll find all over talk about limits and functions and continuity and all. But that was unpublished until 1930. The proofs known about at the time were done by Karl Weierstrass. His is the other name you’ll find all over talk about limits and functions and continuity and all. Go on, now, guess who it was who proved the Extreme Value Theorem. And guess what theorem, bearing the name of two important 19th-century mathematicians, is at the core of proving that. You need at most two chances!) That is, mathematicians were comfortable using the theorem before it had a clear identity.

Once you know that it’s there, though, the Extreme Value Theorem’s a great one. It’s useful. Rolle’s Theorem I just went through. There’s also the quite similar Mean Value Theorem. This one is about functions continuous and differentiable on an interval. It tells us there’s at least one point where the derivative is equal to the mean slope of the function on that interval. This is another theorem that’s a quick proof once you have the Extreme Value Theorem. Or we can get more esoteric. There’s a technique known as Lagrange Multipliers. It’s a way to find where on a constrained surface a function is at its maximum or minimum. It’s a clever technique, one that I needed time to accept as a thing that could possibly work. And why should it work? Go ahead, guess what the centerpiece of at least one method of proving it is.
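To write out the Mean Value Theorem from above: for f continuous on [a, b] and differentiable in between, there is at least one point c, strictly between a and b, where

f'(c) = \frac{f(b) - f(a)}{b - a}

Rolle’s Theorem is the special case where f(a) = f(b), making the right-hand side zero.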

Step back from calculus and into real analysis. That’s the study of why calculus works, and how real numbers work. The Extreme Value Theorem turns up again and again. Like, one technique for defining the integral itself is to approximate a function with a “stepwise” function. This is one that looks like a pixellated, rectangular approximation of the function. The definition depends on having a stepwise rectangular approximation that’s as close as you can get to a function while always staying less than it. And another stepwise rectangular approximation that’s as close as you can get while always staying greater than it.
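In symbols, if we chop the interval with a partition a = x_0 < x_1 < \cdots < x_n = b , those two stepwise approximations add up to

L(f) = \sum_{i=1}^{n} m_i \left(x_i - x_{i-1}\right) \qquad U(f) = \sum_{i=1}^{n} M_i \left(x_i - x_{i-1}\right)

where m_i and M_i are the least and greatest values of f on the i-th piece. That those least and greatest values exist at all, for a continuous function, is the Extreme Value Theorem quietly doing its job.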

And then other results. Often in real analysis we want to know about whether sets are closed and bounded. The Extreme Value Theorem has a neat corollary. Start with a continuous function with domain that’s a closed and bounded interval. Then, this theorem demonstrates, the range is also a closed and bounded interval. I know this sounds like a technical point. But it is the sort of technical point that makes life easier.

The Extreme Value Theorem even takes on meaning when we don’t look at real numbers. We can rewrite it in topological spaces. These are sets of points for which we have an idea of a “neighborhood” of points. We don’t demand that we know what distance is exactly, though. What had been a closed and bounded interval becomes a mathematical construct called a “compact set”. The idea of a continuous function changes into one about the pre-image of an open set being another open set. And there is still something recognizably the Extreme Value Theorem. It tells us about things called the supremum and infimum, which are slightly different from the maximum and minimum. Just enough to confuse the student taking real analysis the first time through.

Topological spaces are an abstracted concept. Real numbers are topological spaces, yes. But many other things also are. Neighborhoods and compact sets and open sets are also abstracted concepts. And so this theorem has its same quiet utility in these many spaces. It’s just there quietly supporting more challenging work.


And now I get to really relax: I already have a Reading the Comics post ready for tomorrow, and Sunday’s is partly written. Now I just have to find a mathematical term starting with ‘Y’ that’s interesting enough to write about.

My 2018 Mathematics A To Z: Witch of Agnesi


Nobody had a suggested topic starting with ‘W’ for me! So I’ll take that as a free choice, and get lightly autobiographical.

Cartoon of a thinking coati (a raccoon-like animal from Latin America); beside him, Scrabble tiles spell out 'MATHEMATICS A TO Z' on a starry background with arithmetic symbols as constellations. Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge, which you can read six months ahead of public publication by subscribing to his Patreon. He’s on Twitter as @Newshoundscomic.

Witch of Agnesi.

I know I encountered the Witch of Agnesi while in middle school. Eighth grade, if I’m not mistaken. It was a footnote in a textbook. I don’t remember much of the textbook. What I mostly remember of the course was how much I did not fit with the teacher. The only relief from boredom that year was the month we had a substitute and the occasional interesting footnote.

It was in a chapter about graphing equations. That is, finding curves whose points have coordinates that satisfy some equation. In a bit of relief from lines and parabolas the footnote offered this:

y = \frac{8a^3}{x^2 + 4a^2}

In a weird tantalizing moment the footnote didn’t offer a picture. Or say what an ‘a’ was doing in there. In retrospect I recognize ‘a’ as a parameter, and that different values of it give different but related shapes. No hint what the ‘8’ or the ‘4’ were doing there. Nor why ‘a’ gets raised to the third power in the numerator or the second in the denominator. I did my best with the tools I had at the time. Picked a nice easy boring ‘a’. Picked out values of ‘x’ and found the corresponding ‘y’ which made the equation true, and tried connecting the dots. The result didn’t look anything like a witch. Nor a witch’s hat.

It was one of a handful of biographical notes in the book. These were a little attempt to add some historical context to mathematics. It wasn’t much. But it was an attempt to show that mathematics came from people. Including, here, from Maria Gaëtana Agnesi. She was, I’m certain, the only woman mentioned in the textbook I’ve otherwise completely forgotten.

We have few names of ancient mathematicians. Those we have are often compilers like Euclid whose fame obliterated the people whose work they explained. Or they’re like Pythagoras, credited with discoveries by people who obliterated their own identities. In later times we have the mathematics done by, mostly, people whose social positions gave them time to write mathematics results. So we see centuries where every mathematician is doing it as their side hustle to being a priest or lawyer or physician or combination of these. Women don’t get the chance to stand out here.

Today of course we can name many women who did, and do, mathematics. We can name Emmy Noether, Ada Lovelace, and Marie-Sophie Germain. Challenged to do a bit more, we can offer Florence Nightingale and Sofia Kovalevskaya. Well, and also Grace Hopper and Margaret Hamilton if we decide computer scientists count. Katherine Johnson looks likely to make that cut. But in any case none of these people are known for work understandable in a pre-algebra textbook. This must be why Agnesi earned a place in this book. She’s among the earliest women we can specifically credit with doing noteworthy mathematics. (Also physics, but that’s off point for me.) Her curve might be a little advanced for that textbook’s intended audience. But it’s not far off, and pondering questions like “why 8a^3 ? Why not a^3 ?” is more pleasant, to a certain personality, than pondering what a directrix might be and why we might use one.

The equation might be a lousy way to visualize the curve described. The curve is one of that group of interesting shapes you get by constructions. That is, following some novel process. Constructions are fun. They’re almost a craft project.

For this we start with a circle. And two parallel tangent lines. Without loss of generality, suppose they’re horizontal, so there’s one line at the top and one at the bottom of the circle.

Take one of the two tangent points. Again without loss of generality, let’s say the bottom one. Draw a line from that point over to the other line. Anywhere on the other line. There’s a point where the line you drew intersects the circle. There’s another point where it intersects the other parallel line. We’ll find a new point by combining pieces of these two points. The point is on the same horizontal as wherever your line intersects the circle. It’s on the same vertical as wherever your line intersects the other parallel line. This point is on the Witch of Agnesi curve.

Now draw another line. Again, starting from the lower tangent point and going up to the other parallel line. Again it intersects the circle somewhere. This gives another point on the Witch of Agnesi curve. Draw another line. Another intersection with the circle, another intersection with the opposite parallel line. Another point on the Witch of Agnesi curve. And so on. Keep doing this. When you’ve drawn all the lines that reach from the tangent point to the other line, you’ll have generated the full Witch of Agnesi curve. This takes more work than writing out y = \frac{8a^3}{x^2 + 4a^2} , yes. But it’s more fun. It makes for neat animations. And I think it prepares us to expect the shape of the curve.
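The construction is easy to turn into a short program, too. Here’s a sketch in Python; it assumes the circle has radius a and sits on the origin, so the two tangent lines are y = 0 and y = 2a, and the function name is my own invention:

```python
import math

def witch_point(t, a=1.0):
    # Draw the line from the origin (the lower tangent point) to the point
    # (t, 2a) on the upper tangent line; parametrize it as (s*t, s*2a).
    # Substituting into the circle x^2 + (y - a)^2 = a^2 gives the nonzero
    # solution for s, hence the height where the line crosses the circle:
    s = 4 * a**2 / (t**2 + 4 * a**2)
    y_from_circle = 2 * a * s
    x_from_line = t
    return (x_from_line, y_from_circle)     # mix the two intersections

for t in (-4.0, -2.0, 0.0, 2.0, 4.0):
    x, y = witch_point(t)
    assert math.isclose(y, 8 / (x**2 + 4))  # matches y = 8a^3/(x^2+4a^2), a=1
    print(x, y)
```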

It’s a neat curve. Between it and the lower parallel line is an area four times that of the circle that generated it. The shape is one we would get from looking at the derivative of the arctangent. So there’s some reasons someone working in calculus might find it interesting. And people did. Pierre de Fermat studied it, and found this area. Isaac Newton and Luigi Guido Grandi studied the shape, using this circle-and-parallel-lines construction. Maria Agnesi’s name attached to it after she published a calculus textbook which examined this curve. She showed, according to people who present themselves as having read her book, the curve and how to find it. And she showed its equation and found the vertex and asymptote line and the inflection points. The inflection points, here, are where the curve changes from being cupped upward to cupping downward, or vice-versa.
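That four-times-the-circle area is a pleasant integral to verify, if you remember the arctangent’s derivative:

\int_{-\infty}^{\infty} \frac{8a^3}{x^2 + 4a^2}\,dx = 8a^3 \cdot \frac{1}{2a} \left[\arctan\left(\frac{x}{2a}\right)\right]_{-\infty}^{\infty} = 4\pi a^2

and the circle that generated the curve has area \pi a^2 .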

It’s a neat function. It’s got some uses. It’s a natural smooth-hill shape, for example. So this makes a good generic landscape feature if you’re modeling the flow over a surface. I read that solitary waves can have this curve’s shape, too.

And the curve turns up as a probability distribution. Take a fixed point. Pick lines at random that pass through this point. See where those lines reach a separate, straight line. Some regions are more likely to be intersected than are others. Chart how often any particular point is the new intersection. That chart will (given some assumptions I ask you to pretend you agree with) be a Witch of Agnesi curve. This might not surprise you. It seems inevitable from the circle-and-intersecting-line construction process. And that’s nice enough. As a distribution it looks like the usual Gaussian bell curve.

It’s different, though. And it’s different in strange ways. Like, for a probability distribution we can find an expected value. That’s … well, what it sounds like. But this is the strange probability distribution for which the law of large numbers does not work. Imagine an experiment that produces real numbers, with the frequency of each number given by this distribution. Run the experiment zillions of times. What’s the mean value of all the zillions of generated numbers? And it … doesn’t … have one. I mean, we know it ought to, it should be the center of that hill. But the calculations for that don’t work right. Taking a bigger sample doesn’t make the sample mean settle down, the way it would for every other distribution. It’s a weird idea.
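You can watch the failure happen. The Witch, normalized to have area 1, is the Cauchy distribution, and it’s easy to sample: take the tangent of a uniformly random angle. A sketch of the experiment in Python:

```python
import math
import random

random.seed(42)

def cauchy_sample():
    # A standard Cauchy variate: the tangent of a uniform random angle.
    return math.tan(math.pi * (random.random() - 0.5))

total = 0.0
for n in range(1, 1_000_001):
    total += cauchy_sample()
    if n in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
        print(n, total / n)   # the running mean lurches; it never settles
```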

Imagine carving a block of wood in the shape of this curve, with a horizontal lower bound and the Witch of Agnesi curve as the upper bound. Where would it balance? … The normal mathematical tools don’t say, even though the shape has an obvious line of symmetry. And a finite area. You don’t get this kind of weirdness with parabolas.

(Yes, you’ll get a balancing point if you actually carve a real one. This is because you work with finitely-long blocks of wood. Imagine you had a block of wood infinite in length. Then you would see some strange behavior.)

It teaches us more strange things, though. Consider interpolations, that is, taking a couple data points and fitting a curve to them. We usually start out looking for polynomials when we interpolate data points. This is because everything is polynomials. Toss in more data points. We need a higher-order polynomial, but we can usually fit all the given points. But sometimes polynomials won’t work. A problem called Runge’s Phenomenon can happen, where the more data points you have the worse your polynomial interpolation is. The Witch of Agnesi curve is one of those. Carl Runge used points on this curve, and trying to fit polynomials to those points, to discover the problem. More data and higher-order polynomials make for worse interpolations. You get curves that look less and less like the original Witch. Runge is himself famous to mathematicians, known for “Runge-Kutta”. That’s a family of techniques to solve differential equations numerically. I don’t know whether Runge came to the weirdness of the Witch of Agnesi curve from considering how errors build in numerical integration. I can imagine it, though. The topics feel related to me.
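If you want to see Runge’s phenomenon yourself, here’s a sketch using numpy. The function 1/(1 + x^2) is exactly the Witch with a = 1/2; fit polynomials of rising degree to evenly spaced points on it and watch the worst-case error grow:

```python
import numpy as np

def witch(x):
    return 1 / (1 + x**2)   # the Witch of Agnesi with a = 1/2

dense = np.linspace(-5, 5, 1001)   # a fine grid for measuring the error
for degree in (4, 8, 12, 16):
    nodes = np.linspace(-5, 5, degree + 1)   # evenly spaced data points
    coefficients = np.polyfit(nodes, witch(nodes), degree)
    worst = np.max(np.abs(np.polyval(coefficients, dense) - witch(dense)))
    print(degree, worst)   # more data, higher degree, worse interpolation
```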

I understand how none of this could fit that textbook’s slender footnote. I’m not sure any of the really good parts of the Witch of Agnesi could even fit thematically in that textbook. At least beyond the fact of its interesting name, which any good blog about the curve will explain. That there was no picture, and that the equation was beyond what the textbook had been describing, made it a challenge. Maybe not seeing what the shape was teased the mathematician out of this bored student.


And next is ‘X’. Will I take Mr Wu’s suggestion and use that to describe something “extreme”? Or will I take another topic or suggestion? We’ll see on Friday, barring unpleasant surprises. Thanks for reading.

My 2018 Mathematics A To Z: Volume


Ray Kassinger, of the popular web comic Housepets!, had a silly suggestion when I went looking for topics. In one episode of Mystery Science Theater 3000, Crow T Robot gets the idea that you could describe the size of a space by the number of turkeys which fill it. (It’s based on like two minor mentions of “turkeys” in the show they were watching.)

I liked that episode. I’ve got happy memories of the time when I first saw it. I thought the sketch in which Crow T Robot got so volume-obsessed was goofy and dumb in the fun-nerd way.

I accept Mr Kassinger’s challenge, only I’m going to take it seriously.

Cartoon of a thinking coati (a raccoon-like animal from Latin America); beside him, Scrabble tiles spell out 'MATHEMATICS A TO Z' on a starry background with arithmetic symbols as constellations. Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge, which you can read six months ahead of public publication by subscribing to his Patreon. He’s on Twitter as @Newshoundscomic.

Volume.

How big is a thing?

There is a legend about Thomas Edison. He was unimpressed with a new hire. So he hazed the college-trained engineer who deeply knew calculus. He demanded the engineer tell him the volume within a light bulb. The engineer went to work, making measurements of the shape of the bulb’s outside. And then started the calculations. This involves a calculus technique called “volumes of rotation”. This can tell the volume within a rotationally symmetric shape. It’s tedious, especially if the outer edge isn’t some special nice shape. Edison, fed up, took the bulb, filled it with water, poured that out into a graduated cylinder and said that was the answer.

I’m skeptical of legends. I’m skeptical of stories about the foolish intellectual upstaged by the practical man-of-action. And I’m skeptical of Edison because, jeez, I’ve read biographies of the man. Even the fawning ones make him out to be yeesh.

But the legend’s Edison had a point. If the volume of a shape is not how much stuff fits inside the shape, what is it? And maybe some object has too complicated a shape to find its volume. Can we think of a way to produce something with the same volume, but that is easier? Sometimes we can. When we do this with straightedge and compass, the way the Ancient Greeks found so classy, we call this “quadrature”. It’s called quadrature from its application in two dimensions. It finds, for a shape, a square with the same area. For a three-dimensional object, we find a cube with the same volume. Cubes are easy to understand.

Straightedge and compass can’t do everything. Indeed, there’s so much they can’t do. Some of it is stuff you’d think it should be able to, like, find a cube with the same volume as a sphere. Integration gives us a mathematical tool for describing how much stuff is inside a shape. It’s even got a beautiful shorthand expression. Suppose that D is the shape. Then its volume V is:

V = \int\int\int_D dV

Here “dV” is the “volume form”, a description of how the coordinates we describe a space in relate to the volume. The \int\int\int is jargon, meaning, “integrate over the whole volume”. The subscript “D” modifies that phrase by adding “of D” to it. Writing “D” is shorthand for “these are all the points inside this shape, in whatever coordinate system you use”. If we didn’t do that we’d have to say, on each \int sign, what points are inside the shape, coordinate by coordinate. At this level the equation doesn’t offer much help. It says the volume is the sum of infinitely many, infinitely tiny pieces of volume. True, but that doesn’t give much guidance about whether it’s more or less than two cups of water. We need to get more specific formulas, usually. We need to pick coordinates, for example, and say what coordinates are inside the shape. A lot of the resulting formulas can’t be integrated exactly. Like, an ellipsoid? Maybe you can integrate that. Don’t try without getting hazard pay.

We can approximate this integral. Pick a tiny shape whose volume is easy to know. Fill your shape with duplicates of it. Count the duplicates. Multiply that count by the volume of this tiny shape. Done. This is numerical integration, sometimes called “numerical quadrature”. If we’re being generous, we can say the legendary Edison did this, using water molecules as the tiny shape. And working so that he didn’t need to know the exact count or the volume of individual molecules. Good computational technique.
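If we’re yet more generous, we can let randomly thrown points stand in for the tiny shapes. A minimal sketch, estimating the volume of a ball of radius 1 from inside a cube two units on a side:

```python
import random

random.seed(42)

inside, trials = 0, 1_000_000
for _ in range(trials):
    x, y, z = (random.uniform(-1, 1) for _ in range(3))
    if x*x + y*y + z*z <= 1:           # did this point land inside the ball?
        inside += 1

cube_volume = 2 ** 3                   # the cube spans -1 to 1 on each axis
print(cube_volume * inside / trials)   # near 4*pi/3, about 4.18879
```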

It’s hard not to feel we’re begging the question, though. We want the volume of something. So we need the volume of something else. Where does that volume come from?

Well, where does an inch come from? Or a centimeter? Whatever unit you use? You pick something to use as reference. Any old thing will do. Which is why you get fascinating stories about choosing what to use. And bitter arguments about which of several alternatives to use. And we express the length of something as some multiple of this reference length.

Volume works the same way. Pick a reference volume, something that can be one unit-of-volume. Other volumes are some multiple of that unit-of-volume. Possibly a fraction of that unit-of-volume.

Usually we use a reference volume that’s based on the reference length. Typically, we imagine a cube that’s one unit of length on each side. The volume of this cube with sides of length 1 unit-of-length is then 1 unit-of-volume. This seems all nice and orderly and it’s surely not because mathematicians have been paid off by six-sided-dice manufacturers.

Does it have to be?

That we need some reference volume seems inevitable. We can’t very well say the area of something is ten times nothing-in-particular. Does that reference volume have to be a cube? Or even a rectangle or something else? It seems obvious that we need some reference shape that tiles, that can fill up space by itself … right?

What if we don’t?

I’m going to drop out of three dimensions a moment. Not because it changes the fundamentals, but because it makes something easier. Specifically, it makes it easier if you decide you want to get some construction paper, cut out shapes, and try this on your own. What this will tell us about area is just as true for volume. Area, for a two-dimensional space, and volume, for a three-dimensional one, describe the same thing. If you’ll let me continue, then, I will.

So draw a figure on a clean sheet of paper. What’s its area? Now imagine you have a whole bunch of shapes with reference areas. A bunch that have an area of 1. That’s by definition. That’s our reference area. A bunch of smaller shapes with an area of one-half. By definition, too. A bunch of smaller shapes still with an area of one-third. Or one-fourth. Whatever. Shapes with areas you know because they’re marked on them.

Here’s one way to find the area. Drop your reference shapes, the ones with area 1, on your figure. How many do you need to completely cover the figure? It’s all right to cover more than the figure. It’s all right to have some of the reference shapes overlap. All you need is to cover the figure completely. … Well, you know how many pieces you needed for that. You can count them up. You can add up the areas of all these pieces needed to cover the figure. So the figure’s area can’t be any bigger than that sum.

Can’t be exact, though, right? Because you might get a different number if you covered the figure differently. If you used smaller pieces. If you arranged them better. This is true. But imagine all the possible reference shapes you had, and all the possible ways to arrange them. There’s some smallest area of those reference shapes that would cover your figure. Is there a more sensible idea for what the area of this figure would be?
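That smallest-covering idea is concrete enough to compute, at least approximately. A sketch, covering a disc of radius 1 with square tiles of shrinking side; the summed tile area overshoots \pi and closes in on it as the tiles shrink:

```python
def covers_disc(x0, y0, side):
    # Nearest point of the tile [x0, x0+side] x [y0, y0+side] to the origin;
    # the tile touches the disc exactly when that point is within radius 1.
    cx = max(x0, min(0.0, x0 + side))
    cy = max(y0, min(0.0, y0 + side))
    return cx*cx + cy*cy <= 1.0

def covering_area(side):
    n = int(1.0 / side) + 1            # enough tiles to span [-1, 1] each way
    count = sum(covers_disc(i * side, j * side, side)
                for i in range(-n, n + 1)
                for j in range(-n, n + 1))
    return count * side * side

for side in (0.5, 0.1, 0.02, 0.005):
    print(side, covering_area(side))   # shrinks toward pi = 3.14159...
```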

And put this into three dimensions. If we start from some reference shapes of volume 1 and maybe 1/2 and 1/3 and whatever other useful fractions there are? Doesn’t this covering make sense as a way to describe the volume? Cubes or rectangles are easy to imagine. Tetrahedrons too. But why not any old thing? Why not, as the Mystery Science Theater 3000 episode had it, turkeys?

This is a nice, flexible, convenient way to define area. So now let’s see where it goes all bizarre. We know this thanks to Giuseppe Peano. He’s among the late-19th/early-20th century mathematicians who shaped modern mathematics. They did this by showing how much of our mathematics broke intuition. Peano was (here) exploring what we now call fractals. And noted a family of shapes that curl back on themselves, over and over. They’re beautiful.

And they fill area. Fill volume, if done in three dimensions. It seems impossible. If we use this covering scheme, and try to find the volume of a straight line, we get zero. Well, we find that any positive number is too big, and from that conclude that it has to be zero. Since a straight line has length, but not volume, this seems fine. But a Peano curve won’t go along with this. A Peano curve winds back on itself so much that there is some minimum volume to cover it.
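
You can watch one of these curves grow, if you like. Here is a sketch of Hilbert’s variant of Peano’s idea, which is the easiest of the family to code. At each level the path visits every cell of a 2^k-by-2^k grid, always stepping to an adjacent cell:

```python
def hilbert_curve(level):
    # The cells of a 2**level square grid, in the order the curve
    # visits them; successive cells are always adjacent.
    if level == 0:
        return [(0, 0)]
    prev = hilbert_curve(level - 1)
    n = 2 ** (level - 1)
    return ([(y, x) for x, y in prev] +                    # bottom-left, flipped
            [(x, y + n) for x, y in prev] +                # top-left
            [(x + n, y + n) for x, y in prev] +            # top-right
            [(2*n - 1 - y, n - 1 - x) for x, y in prev])   # bottom-right, flipped

print(hilbert_curve(2))   # all 16 cells of a 4-by-4 grid, one unbroken path
```

Raise the level forever and the path comes arbitrarily close to every point of the square. In the limit, it is the square.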

This unsettles. But this idea of volume (or area) by covering works so well. To throw it away seems to hobble us. So it seems worth the trade. We allow ourselves to imagine a line so long and so curled up that it has a volume. Amazing.


And now I get to relax and unwind and enjoy a long weekend before coming to the letter ‘W’. That’ll be about some topic I figure I can whip out a nice tight 500 words about, and instead, produce some 1541-word monstrosity while I wonder why I’ve had no free time at all since August. Tuesday, give or take, it’ll be available at this link, as are the rest of these glossary posts. Thanks for reading.

My 2018 Mathematics A To Z: Unit Fractions


My subject for today is another from Iva Sallay, longtime friend of the blog and creator of the Find the Factors recreational mathematics game. I think you’ll likely find something enjoyable at her site, whether it’s the puzzle or the neat bits of trivia as she works through all the counting numbers.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Unit Fractions.

We don’t notice how often unit fractions are around us. Likely there are some in your pocket. Or there have been recently. Think of what you do when paying for a thing, when it’s not a whole number of dollars. (Pounds, euros, whatever the unit of currency is.) Suppose you have exact change. What do you give for the 38 cents?

Likely it’s something like a 25-cent piece and a 10-cent piece and three one-cent pieces. This is an American and Canadian solution. I know that 20-cent pieces are more common than 25-cent ones worldwide. It doesn’t make much difference; if you want it to be three 10-cent, one five-cent, and three one-cent pieces that’s as good. And granted, outside the United States it’s growing common to drop pennies altogether and round prices off to a five- or ten-cent value. Again, it doesn’t make much difference.

But look at the coins. The 25 cent piece is one-quarter of a dollar. It’s even called that, and stamped that on one side. I sometimes hear a dime called “a tenth of a dollar”, although mostly by carnival barkers in one-reel cartoons of the 1930s. A nickel is one-twentieth of a dollar. A penny is one-hundredth. A 20-cent piece is one-fifth of a dollar. And there are half-dollars out there, although they don’t really circulate in the United States anymore.

(Pre-decimalized currencies offered even more unit fractions. Using old British coins, for familiarity-to-me and great names, there were farthings, 1/960th of a pound; halfpennies, 1/480th; pennies, 1/240th; threepence, 1/80th of a pound; groats, 1/60th; sixpence, 1/40th; florins, 1/10th; half-crowns, 1/8th; crowns, 1/4th. And what seem to the modern wallet like impossibly tiny fractions like the half-, third-, and quarter-farthings used where 1/3840th of a pound might be a needed sum of money.)

Unit fractions get named and defined somewhere in elementary school arithmetic. They go on, becoming forgotten sometime after that. They might make a brief reappearance in calculus. There are some rational functions that get easier to integrate if you think of them as sums of fractions, with constant numerators and polynomial denominators. These aren’t unit fractions. A unit fraction has a 1, the unit, in the numerator. But we see unit fractions along the way to integrating \frac{1}{x^2 - x} , as an example. And we see them in the promise that there are still more amazing integrals to learn how to do.
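
For the record, the decomposition there is \frac{1}{x^2 - x} = \frac{1}{x - 1} - \frac{1}{x} , which you can check by putting the right-hand side over a common denominator. The integral then comes out to \ln|x - 1| - \ln|x| plus a constant.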

They get more attention if you take a history of computation class. Or read the subject on your own. Unit fractions stand out in history. We learn the Ancient Egyptians worked with fractions as sums of unit fractions. That is, had they dollars, they would not look at the \frac{38}{100} we do. They would look at \frac{1}{4} plus \frac{1}{10} plus \frac{1}{100} plus \frac{1}{100} plus \frac{1}{100} . When we count change we are using, without noticing it, a very old computing scheme.

This isn’t quite true. The Ancient Egyptians seemed to shun repeating a unit like that. To use \frac{1}{100} once is fine; three times is suspicious. They would prefer something like \frac{1}{3} plus \frac{1}{24} plus \frac{1}{200} . Or maybe some other combination. I just wrote out the first one I found.

But there are many ways we can make 38 cents using ordinary coins of the realm. There are infinitely many ways to make up any fraction using unit fractions. There’s surely a most “efficient”. Most efficient might be the one that uses the fewest terms. Most efficient might be the one that uses the smallest denominators. Choose what you like; no one knows a scheme that always turns up the most efficient representation, either way. We can always find some representation, though. It may not be “good”, but it will exist, which may be good enough. Leonardo of Pisa, or as he got named in the 19th century, Fibonacci, proved that was true.
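
Fibonacci’s method is the greedy one: always take the largest unit fraction that fits, then repeat on what’s left. A quick sketch in Python, using the standard fractions module:

```python
from fractions import Fraction
from math import ceil

def greedy_egyptian(frac):
    # Fibonacci's greedy method: repeatedly subtract the largest unit
    # fraction no bigger than what remains. It always terminates.
    terms = []
    while frac > 0:
        unit = Fraction(1, ceil(1 / frac))
        terms.append(unit)
        frac -= unit
    return terms

print(greedy_egyptian(Fraction(38, 100)))
# [Fraction(1, 3), Fraction(1, 22), Fraction(1, 825)]
```

Note it finds a representation different from the \frac{1}{3} plus \frac{1}{24} plus \frac{1}{200} above, and not an obviously better one. Greedy denominators can grow alarmingly fast.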

We may ask why the Egyptians used unit fractions. They seem inefficient compared to the way we work with fractions. Or, better, decimals. I’m not sure the question can have a coherent answer. Why do we have a fashion for converting fractions to a “proper” form? Why do we use the number of decimal points we do for a given calculation? Sometimes a particular mode of expression is the fashion. It comes to seem natural because everyone uses it. We do it too.

And there is practicality to them. Even efficiency. If you need π, for example, you can write it as 3 plus \frac{1}{8} plus \frac{1}{61} and your answer is off by under one part in a thousand. Combine this with the Egyptian method of multiplication, where you would think of (say) “11 times π” as “1 times π plus 2 times π plus 8 times π”. And with tables they had worked up which tell you what \frac{2}{8} and \frac{2}{61} would be in a normal representation. You can get rather good calculations without having to do more than addition and looking up doublings. Represent π as 3 plus \frac{1}{8} plus \frac{1}{61} plus \frac{1}{5020} and you’re correct to within one part in 130 million. That isn’t bad for having to remember four whole numbers.
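
Here is a toy Python version of that doubling scheme, offered as illustration rather than Egyptology; the function name and framing are mine:

```python
def egyptian_multiply(multiplier, value):
    # Multiply by a whole number using nothing but doubling and adding,
    # in the Egyptian style: 11 = 1 + 2 + 8, so add those doublings.
    total = 0.0
    doubled = value
    while multiplier:
        if multiplier & 1:       # this power of two is part of the multiplier
            total += doubled
        doubled += doubled       # the doubling-table step
        multiplier >>= 1
    return total

pi_ish = 3 + 1/8 + 1/61                # the unit-fraction pi from above
print(egyptian_multiply(11, pi_ish))   # 34.5553...; 11 times pi is 34.5575...
```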

(The Ancient Egyptians, like many of us, were not absolutely consistent in only using unit fractions. They had symbols to represent \frac{2}{3} and \frac{3}{4} , probably due to these numbers coming up all the time. Human systems vary to make the commonest stuff we do easier.)

Enough practicality or efficiency, if this is that. Is there beauty? Is there wonder? Certainly. Much of it is in number theory. Number theory splits between astounding results and results that would be astounding if we had any idea how to prove them. Many of the astounding results are about unit fractions. Take, for example, the harmonic series 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots . Truncate that series whenever you decide you’ve had enough. Different numbers of terms add up to different numbers — infinitely many of them. The numbers grow ever-higher. There’s no number so big that it won’t, eventually, be surpassed by some long-enough truncated harmonic series. And yet, past the number 1, it’ll never touch a whole number again. Infinitely many partial sums. Partial sums differing from one another by one-googolplexth and smaller. And yet of the infinitely many whole numbers this series manages to miss them all, after its starting point. Worse, any sum of consecutive terms, even one that doesn’t start from 1, will never hit a whole number. I can understand a person who thinks mathematics is boring, but how can anyone not find it astonishing?
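
You can chase the never-a-whole-number claim with exact arithmetic. A sketch; the assertion checks it for as far as you care to let the loop run:

```python
from fractions import Fraction

total = Fraction(0)
for n in range(1, 31):
    total += Fraction(1, n)
    if n > 1:
        # Past n = 1, the partial sum never lands on a whole number.
        assert total.denominator != 1
print(float(total))   # about 3.995 after thirty terms, growing without bound
```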

There are more strange, beautiful things. Consider heptagonal numbers, which Iva Sallay knows well. These are numbers like 1 and 7 and 18 and 34 and 55 and 1288. Take a heptagonal number of, oh, beads or dots or whatever, and you can lay them out to form a regular seven-sided figure. Add together the reciprocals of the heptagonal numbers. What do you get? It’s a weird number. It’s irrational, which you maybe would have guessed as more likely than not. But it’s also transcendental. Most real numbers are transcendental. But it’s often hard to prove any specific number is.
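
The n-th heptagonal number is \frac{n(5n - 3)}{2} , which makes the sum easy to estimate; the printed value is my quick numerical estimate, good to a few digits:

```python
def heptagonal(n):
    # The n-th heptagonal number: 1, 7, 18, 34, 55, ...
    return n * (5 * n - 3) // 2

print(sum(1 / heptagonal(n) for n in range(1, 200_001)))  # about 1.3228
```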

Unit fractions creep back into actual use. For example, in modular arithmetic, they offer a way to turn division back into multiplication. Division, in modular arithmetic, tends to be hard. Indeed, if you need an algorithm to make random-enough numbers, you often will do something with division in modular arithmetic. Suppose you want to divide by a number x, modulo y, and x and y are relatively prime, though. Then unit fractions tell us how to turn this into a greatest-common-divisor problem.
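
A sketch of that trick via the extended Euclidean algorithm; in recent Python, pow(x, -1, y) does the same job in one call:

```python
def mod_inverse(x, y):
    # Find the "unit fraction" 1/x modulo y with the extended Euclidean
    # algorithm. It exists exactly when gcd(x, y) == 1.
    r0, r1 = y, x % y
    s0, s1 = 0, 1
    while r1:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    if r0 != 1:
        raise ValueError("x and y are not relatively prime")
    return s0 % y

inv = mod_inverse(7, 31)    # 9, since 7 * 9 = 63 = 2*31 + 1
print((21 * inv) % 31)      # dividing 21 by 7, modulo 31, gives 3
```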

They teach us about our computers, too. Much of serious numerical mathematics involves matrix multiplication. Matrices are, for this purpose, tables of numbers. The Hilbert Matrix has elements that are entirely unit fractions. The Hilbert Matrix is really a family of square matrices. Pick any of the family you like. It can have two rows and two columns, or three rows and three columns, or ten rows and ten columns, or a million rows and a million columns. Your choice. The first row is made of the numbers 1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, and so on. The second row is made of the numbers \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, and so on. The third row is made of the numbers \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6}, and so on. You see how this is going.

Matrices can have inverses. It’s not guaranteed; matrices are like that. But the Hilbert Matrix does. It’s another matrix, of the same size. All the terms in it are integers. Multiply the Hilbert Matrix by its inverse and you get the Identity Matrix. This is a matrix, the same number of rows and columns as you started with. But nearly every element in the identity matrix is zero. The only exceptions are on the diagonal — first row, first column; second row, second column; third row, third column; and so on. There, the identity matrix has a 1. The identity matrix works, for matrix multiplication, much like the real number 1 works for normal multiplication.

Matrix multiplication is tedious. It’s not hard, but it involves a lot of multiplying and adding and it just takes forever. So set a computer to do this! And you get … uh …

For a small Hilbert Matrix and its inverse, you get an identity matrix. That’s good. For a large Hilbert Matrix and its inverse? You get garbage. “Large” maybe isn’t very large. A 12 by 12 matrix gives you trouble. A 14 by 14 matrix gives you a mess. Well, on my computer it does. Cute little laptop I got when my former computer suddenly died. On a better computer? One designed for computation? … You could do a little better. Less good than you might imagine.
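
If you would like to watch the mess happen yourself, here is a small numpy sketch. The exact sizes where things fall apart will vary with your machine and linear-algebra library:

```python
import numpy as np

def hilbert_matrix(n):
    # H[i][j] = 1 / (i + j + 1), counting rows and columns from zero.
    i, j = np.indices((n, n))
    return 1.0 / (i + j + 1)

for n in (4, 8, 12, 14):
    h = hilbert_matrix(n)
    residual = np.abs(h @ np.linalg.inv(h) - np.eye(n)).max()
    print(n, residual)   # watch the error grow by orders of magnitude
```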

The trouble is that computers don’t really do mathematics. They do an approximation of it, numerical computing. Most use a scheme called floating point arithmetic. It mostly works well. There’s a bit of error in every calculation. For most calculations, though, the error stays small. At least relatively small. The Hilbert Matrix, built of unit fractions, doesn’t respect this. It and its inverse have a “numerical instability”. Some kinds of calculations make errors explode. They’ll overwhelm the meaningful calculation. It’s a bit of a mess.

Numerical instability is something anyone doing mathematics on the computer must learn. Must grow comfortable with. Must understand. The matrix multiplications, and inverses, that the Hilbert Matrix involves highlight those. A great and urgent example of a subtle danger of computerized mathematics waits for us in these unit fractions. And we’ve known and felt comfortable with them for thousands of years.


There’ll be some mathematical term with a name starting ‘V’ that, barring surprises, should be posted Friday. What’ll it be? I have an idea at least. It’ll be available at this link, as are the rest of these glossary posts.

My 2018 Mathematics A To Z: Tiling


For today’s A To Z topic I again picked one nominated by aajohannas. This after I realized I was falling into a never-ending research spiral on Mr Wu of Mathtuition’s suggested “torus”. I do have an older essay describing the torus, as a set. But that does leave out a lot of why a torus is interesting. Well, we’ll carry on.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Tiling.

Here is a surprising thought for the next time you consider remodeling the kitchen. It’s common to tile the floor. Perhaps some of the walls behind the counter. What patterns could you use? And there are infinitely many possibilities. You might leap ahead of me and say, yes, but they’re all boring. A tile that’s eight inches square is different from one that’s twelve inches square and different from one that’s 12.01 inches square. Fine. Let’s allow that all square tiles are “really” the same pattern. The only difference between a square two feet on a side and a square half an inch on a side is how much grout you have to deal with. There are still infinitely many possibilities.

You might still suspect me of being boring. Sure, there’s a rectangular tile that’s, say, six inches by eight inches. And one that’s six inches by nine inches. Six inches by ten inches. Six inches by one millimeter. Yes, I’m technically right. But I’m not interested in that. Let’s allow that all rectangular tiles are “really” the same pattern. So we have “squares” and “rectangles”. There are still infinitely many tile possibilities.

Let me shorten the discussion here. Draw a quadrilateral. One that doesn’t intersect itself. That is, there’s four corners, four lines, and there’s no X crossings. If you have that, then you have a tiling. Get enough of these tiles and arrange them correctly and you can cover the plane. Or the kitchen floor, if you have a level floor. It might not be obvious how to do it. You might have to rotate alternating tiles, or set them in what seem like weird offsets. But you can do it. You’ll need someone to make the tiles for you, if you pick some weird pattern. I hope I live long enough to see it become part of the dubious kitchen package on junk home-renovation shows.

Let me broaden the discussion here. What do I mean by a tiling if I’m allowing any four-sided figure to be a tile? We start with a surface. Usually the plane, a flat surface stretching out infinitely far in two dimensions. The kitchen floor, or any other mere mortal surface, approximates this. But the floor stops at some point. That’s all right. The ideas we develop for the plane work all right for the kitchen. There’s some weird effects for the tiles that get too near the edges of the room. We don’t need to worry about them here. The tiles are some collection of open sets. No two tiles overlap. The tiles, plus their boundaries, cover the whole plane. That is, every point on the plane is either inside exactly one of the open sets, or it’s on the boundary between one (or more) sets.

There isn’t a requirement that all these sets have the same shape. We usually do, and will limit our tiles to one or two shapes endlessly repeated. It seems to appeal to our aesthetics and our installation budget. Using a single pattern allows us to cover the plane with triangles. Any triangle will do. Similarly any quadrilateral will do. For convex pentagonal tiles — here things get weird. There are fourteen known families of pentagons that tile the plane. Each member of the family looks about the same, but there’s some room for variation in the sides. Plus there’s one more special case that can tile the plane, but only that one shape, with no variation allowed. We don’t know if there’s a sixteenth pattern. But then until 2015 we didn’t know there was a 15th, and that was the first pattern found in thirty years. Might be an opening for someone with a good eye for doodling.

There are also exciting opportunities in convex hexagons. Anyone who plays strategy games knows a regular hexagon will tile the plane. (Regular hexagonal tilings fit a certain kind of strategy game well. Particularly they imply an equal distance between the centers of any adjacent tiles. Square and triangular tiles don’t guarantee that. This can imply better balance for territory-based games.) Irregular hexagons will, too. There are three known families of irregular hexagons that tile the plane. You can treat the regular hexagon as a special case of any of these three families. No one knows if there’s a fourth family. Ready your notepad at the next overlong, agenda-less meeting.

There aren’t tilings for identical convex heptagons, figures with seven sides. Nor eight, nor nine, nor any higher number of sides. You can cover the plane if you allow non-convex figures. See any Tetris game where you keep getting the ‘s’ or ‘t’ shapes. And you can cover it if you use several shapes.

There’s some guidance if you want to create your own periodic tilings. I see it called the Conway Criterion. I don’t know the field well enough to say whether that is a common term. It could be something one mathematics popularizer thought of and that other popularizers imitated. (I don’t find “Conway Criterion” on the Mathworld glossary, but that isn’t definitive.) Suppose your polygon satisfies a couple of rules about the shapes of the edges. The rules are given in that link earlier this paragraph. If your shape does, then it’ll be able to tile the plane. If you don’t satisfy the rules, don’t despair! It might yet. The Conway Criterion tells you when some shape will tile the plane. It won’t tell you that something won’t.

(The name “Conway” may nag at you as familiar from somewhere. This criterion is named for John H Conway, who’s famous for a bunch of work in knot theory, group theory, and coding theory. And in popular mathematics for the “Game of Life”. This is a set of rules on a grid of numbers. The rules say how to calculate a new grid, based on this first one. Iterating them, creating grid after grid, can make patterns that seem far too complicated to be implicit in the simple rules. Conway also developed an algorithm to calculate the day of the week, in the Gregorian calendar. It is difficult to explain to the non-calendar fan how great this sort of thing is.)
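
The Game of Life rules fit in a few lines, which is part of their charm. A sketch; the wrap-around grid is my simplification:

```python
import numpy as np

def life_step(grid):
    # One generation of Conway's Game of Life on a wrap-around grid of
    # 0s and 1s: a cell is alive next step if it has exactly three live
    # neighbors, or if it is alive now and has exactly two.
    neighbors = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
    return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(int)
```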

This has all been about periodic tilings. That is, these patterns might be complicated. But if need be, we could get them printed on a nice square tile and cover the floor with that. Almost as beautiful and much easier to install. Are there tilings that aren’t periodic? Aperiodic tilings?

Well, sure. Easily. Take a bunch of triangular tiles with a right angle and two 45-degree angles. Put any two together and you have a square. So you’re “really” tiling squares that happen to be made up of a pair of triangles. For each pair, toss a coin to decide whether you put the diagonal as a forward or backward slash. Done. That’s not a periodic tiling. Not unless you had a weird run of luck on your coin tosses.
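
That tiling is a couple of lines to sample. Every run of this toy prints a fresh, almost-certainly-aperiodic arrangement of diagonals:

```python
import random

print('\n'.join(''.join(random.choice('/\\') for _ in range(60))
                for _ in range(20)))
```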

All right, but is that just a technicality? We could have easily installed this periodically and we just added some chaos to make it “not work”. Can we use a finite number of different kinds of tiles, and have it be aperiodic however much we try to make it periodic? And through about 1966 mathematicians would have mostly guessed that no, you couldn’t. If you had a set of tiles that would cover the plane aperiodically, there was also some way to do it periodically.

And then in 1966 came a surprising result. No, not Penrose tiles. I know you want me there. I’ll get there. Not there yet though. In 1966 Robert Berger — who also attended Rensselaer Polytechnic Institute, thank you — discovered such a tiling. It’s aperiodic, and it can’t be made periodic. Why do we know Penrose Tiles rather than Berger Tiles? Couple reasons, including that Berger’s tiling had to use 20,426 distinct tile shapes. In 1971 Raphael M Robinson simplified matters a bit and got that down to six shapes. Roger Penrose in 1974 squeezed the set down to two, although he added some rules about what edges may and may not touch one another. (You can turn these rules into pure shape constraints by putting notches into the edges.) That really caught the public imagination. It’s got simplicity and accessibility to combine with beauty. Aperiodic tiles seem to relate to “quasicrystals”, which are what the name suggests and do happen in some materials. And they’ve got beauty. Aperiodic tiling embraces our need to have not too much order in our order.

I’ve discussed, in all this, tiling the plane. It’s an easy surface to think about and a popular one. But we can form tiling questions about other shapes. Cylinders, spheres, and toruses seem like they should have good tiling questions available. And we can imagine “tiling” stuff in more dimensions too. If we can fill a volume with cubes, or rectangles, it’s natural to wonder what other shapes we can fill it with. My impression is that fewer definite answers are known about the tiling of three- and four- and higher-dimensional space. Possibly because it’s harder to sketch out ideas and test them. Possibly because the spaces are that much stranger. I would be glad to hear more.


I’m hoping now to have a nice relaxing weekend. I won’t. I need to think of what to say for the letter ‘U’. On Tuesday I hope that it will join the rest of my A to Z essays at this link.

My 2018 Mathematics A To Z: Sorites Paradox


Today’s topic is the lone (so far) request by bunnydoe, so I’m under pressure to make it decent. If she or anyone else would like to nominate subjects for the letters U through Z, please drop me a note at this post. I keep fooling myself into thinking I’ll get one done in under 1200 words.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Sorites Paradox.

This is a story which makes a capitalist look kind of good. I say nothing about its truth, or even, at this remove, where I got it. The story as I heard it was about Ray Kroc, who made McDonald’s into a thing people of every land can complain about. The story has him demonstrate skepticism about the use of business consultants. A consultant might find, for example, that each sesame-seed hamburger bun has (say) 43 seeds. And that if they just cut it down to 41 seeds then each franchise would save (say) $50,000 annually. And no customer would notice the difference. Fine; trim the seeds a little. The next round of consultants would point out, cutting from 41 seeds to 38 would save a further $65,000 per store per year. And again no customer would notice the difference. Cut to 36 seeds? No customer would notice. This process would end when each bun had three sesame seeds, and the customers notice.

I mention this not for my love of sesame-seed buns. It’s just a less-common version of the Sorites Paradox. It’s a very old logical problem. We draw it, and its name, from the Ancient Greek philosophers. In the oldest form, it’s about a heap of sand, and which grain of sand’s removal destroys the heap. This form we attribute to Eubulides of Miletus. Eubulides is credited with a fair number of logical paradoxes. One of them we all know, the Liar Paradox, “What I am saying now is a lie”. Another, the Horns Paradox, I hadn’t encountered before researching this essay. But it bids fair to bring me some delight every day of the rest of my life. “What you have not lost, you have. But you have not lost horns. Therefore you have horns.” Eubulides has a bunch of other paradoxes. Some read, to my uninformed eye, like restatements of other paradoxes. Some look ready to be recast as arguments about Lois Lane’s relationship with Superman. Miletus we know because for a good stretch there every interesting philosopher was hanging around Miletus.

Part of the paradox’s intractability must be that it’s so nearly induction. Induction is a fantastic tool for mathematical problems. We couldn’t do without it. But consider the argument. If a bun is unsatisfying, one more seed won’t make it satisfying. A bun with one seed is unsatisfying. Therefore all buns have an unsatisfying number of sesame seeds on them. It suggests there must be some point at which “adding one more seed won’t help” stops being true. Fine; where is that point, and why isn’t it one fewer or one more seed?

A certain kind of nerd has a snappy answer for the Sorites Paradox. Test a broad population on a variety of sesame-seed buns. There’ll be some so sparse that nearly everyone will say they’re unsatisfying. There’ll be some so abundant most everyone agrees they’re great. So there’s the buns most everyone says are fine. There’s the buns most everyone says are not. The dividing line is at any point between the sparsest that satisfy most people and the most abundant that don’t. The nerds then declare the problem solved and go off. Let them go. We were lucky to get as much of their time as we did. They’re quite busy solving what “really” happened for Rashomon. The approach of “set a line somewhere” is fine if all we want is guidance on where to draw a line. It doesn’t help say why we can anoint some border over any other. At least when we use a river as border between states we can agree going into the water disrupts what we were doing with the land. And even then we have to ask what happens during droughts and floods, and if the river is an estuary, how tides affect matters.

We might see an answer by thinking more seriously about these sesame-seed buns. We force a problem by declaring that every bun is either satisfying or it is not. We can imagine buns with enough seeds that we don’t feel cheated by them, but that we also don’t feel satisfied by. This reflects one of the common assumptions of logic. Mathematicians know it as the Law of the Excluded Middle. A thing is true or it is not true. There is no middle case. This is fine for logic. But for everyday words?

It doesn’t work when considering sesame-seed buns. I can imagine a bun that is not satisfying, but also is not unsatisfying. Surely we can make some logical provision for the concept of “meh”. Now we need not draw some arbitrary line between “satisfying” and “unsatisfying”. We must draw two lines, one of them between “unsatisfying” and “meh”. There is a potential here for regression. Also for the thought of a bun that’s “satisfying-meh-satisfying by unsatisfying”. I shall step away from this concept.

But there are more subtle ways to not exclude the middle. For example, we might decide a statement’s truth exists on a spectrum. We can match how true a statement is to a number. Suppose an obvious falsehood is zero; an unimpeachable truth is one, and normal mortal statements somewhere in the middle. “This bun with a single sesame seed is satisfying” might have a truth of 0.01. This perhaps reflects the tastes of people who say they want sesame seeds but don’t actually care. “This bun with fifteen sesame seeds is satisfying” might have a truth of 0.25, say. “This bun with forty sesame seeds is satisfying” might have a truth of 0.97. (It’s true for everyone except those who remember the flush times of the 43-seed bun.) This seems to capture the idea that nothing is always wholly anything. But we can still step into absurdity. Suppose “this bun with 23 sesame seeds is satisfying” has a truth of 0.50. Then “this bun with 23 sesame seeds is not satisfying” should also have a truth of 0.50. What do we make of the statement “this bun with 23 sesame seeds is simultaneously satisfying and not satisfying”? Do we make something different to “this bun with 23 sesame seeds is simultaneously satisfying and satisfying”?
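
This truth-on-a-spectrum idea is the gist of fuzzy logic, where the classic operators are Zadeh’s minimum and maximum. A sketch; the membership numbers are invented for the bun example:

```python
def satisfying(seeds):
    # An invented membership function: how true "this bun is satisfying"
    # is, rising from 0 toward 1 as the seed count grows.
    return max(0.0, min(1.0, seeds / 40.0))

def f_not(p):    return 1.0 - p
def f_and(p, q): return min(p, q)   # Zadeh's classic fuzzy conjunction
def f_or(p, q):  return max(p, q)

p = satisfying(23)                  # 0.575
print(f_and(p, f_not(p)))           # "satisfying and not satisfying": 0.425
print(f_and(p, p))                  # "satisfying and satisfying": 0.575
```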

I see you getting tired in the back there. This may seem like word games. And we all know that human words are imprecise concepts. What has this to do with logic, or mathematics, or anything but the philosophy of language? And the first answer is that we understand logic and mathematics through language. When learning mathematics we get presented with definitions that seem absolute and indisputable. We start to see the human influence in mathematics when we ask why 1 is not a prime number. Later we see things like arguments about whether a ring has a multiplicative identity. And then there are more esoteric debates about the bounds of mathematical concepts.

Perhaps we can think of a concept we can’t describe in words. If we don’t express it to other people, the concept dies with us. We need words. No, putting it in symbols does not help. Mathematical symbols may look like slightly alien scrawl. But they are shorthand for words, and can be read as sentences, and there is this fuzziness in all of them.

And we find mathematical properties that share this problem. Consider: what is the color of the chemical element flerovium? Before you say I just made that up, flerovium was first synthesized in 1998, and officially named in 2012. We’d guess that it’s a silvery-white or maybe grey metallic thing. Humanity has only ever observed about ninety atoms of the stuff. It’s, for atoms this big, amazingly stable. We know an isotope of it that has a half-life of two and a half seconds. But it’s hard to believe we’ll ever have enough of the stuff to look at it and say what color it is.

That’s … all right, though? Maybe? Because we know the quantum mechanics that seem to describe how atoms form. And how they should pack together. And how light should be absorbed, and how light should be emitted, and how light should be scattered by it. At least in principle. The exact answers might be beyond us. But we can imagine having a solution, at least in principle. We can imagine the computer that after great diligent work gives us a picture of what a ten-ton lump of flerovium would look like.

So where does its color come from? Or any of the other properties that these atoms have as a group? No one atom has a color. No one atom has a density, either, or a viscosity. No one atom has a temperature, or a surface tension, or a boiling point. In combination, though, they have.

These are known to statistical mechanics, and through that thermodynamics, as intensive properties. If we have a partition function, which describes all the ways a system can be organized, we can extract information about these properties. They turn up as derivatives with respect to the right parameters of the system.
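
A standard example, which I’ll write out since it’s short: with Z the partition function and \beta = \frac{1}{k_B T} , the average energy is \langle E \rangle = -\frac{\partial \ln Z}{\partial \beta} . The pressure comes from a volume derivative instead, P = \frac{1}{\beta} \frac{\partial \ln Z}{\partial V} . Different derivative, different property, same partition function.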

But the same problem exists. Take a homogeneous gas. It has some temperature. Divide it into two equal portions. Both sides have the same temperature. Divide each half into two equal portions again. All four pieces have the same temperature. Divide again, and again, and a few more times. You eventually get containers with so little gas in them they don’t have a temperature. Where did it go? When did it disappear?

The counterpart to an intensive property is an extensive one. This is stuff like the mass or the volume or the energy of a thing. Cut the gas’s container in two, and each has half the volume. Cut it in half again, and each of the four containers has one-quarter the volume. Keep this up and you stay in uncontroversial territory, because I am not discussing Zeno’s Paradoxes here.

And like Zeno’s Paradoxes, the Sorites Paradox can seem at first trivial. We can distinguish a heap from a non-heap; who cares where the dividing line is? Or whether the division is a gradual change? It seems easy. To show why it is easy is hard. Each potential answer is interesting, and plausible, and when you think hard enough of it, not quite satisfying. Good material to think about.


I hope to find some material to think about for the letter ‘T’ and have it published Friday. It’ll be available at this link, as are the rest of these glossary posts.

My 2018 Mathematics A To Z: Randomness


Today’s topic is an always rich one. It was suggested by aajohannas, who so far as I know hasn’t got an active blog or other project. If I’m mistaken please let me know. I’m glad to mention the creative works of people hanging around my blog.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Randomness.

An old Sydney Harris cartoon I probably won’t be able to find a copy of before this publishes. A couple people gather around an old fanfold-paper printer. On the printout is the sequence “1 … 2 … 3 … 4 … 5 … ” The caption: ‘Bizarre sequence of computer-generated random numbers’.

Randomness feels familiar. It feels knowable. It means surprise, unpredictability. The upending of patterns. The obliteration of structure. I imagine there are sociologists who’d say it’s what defines Modernity. It’s hard to avoid noticing that the first great scientific theories that embrace unpredictability — evolution and thermodynamics — came to public awareness at the same time impressionism came to arts, and the subconscious mind came to psychology. It’s grown since then. Quantum mechanics is built on unpredictable specifics. Chaos theory tells us even if we could predict statistics it would do us no good. Randomness feels familiar, even necessary. Even desirable. A certain type of nerd thinks eagerly of the Singularity, the point past which no social interactions are predictable anymore. We live in randomness.

And yet … it is hard to find randomness. At least to be sure we have found it. We might choose between options we find ambivalent by tossing a coin. This seems random. But anyone who was six years old and trying to cheat a sibling knows ways around that. Drop the coin without spinning it, from a half-inch above the table, and you know the outcome, all the way through to the sibling’s punching you. When we’re older and can be made to be better sports we’re fairer about it. We toss the coin and give it a spin. There’s no way we could predict the outcome. Unless we knew just how strong a toss we gave it, and how fast it spun, and how the mass of the coin was distributed. … Really, if we knew enough, our tossed coin would be as predictable as the coin we dropped as a six-year-old. At least unless we tossed in some chaotic way, where each throw would be deterministic, but we couldn’t usefully make a prediction.

At a craps table, Commander Data looks with robo-concern at the dice in his hand. Riker, Worf, and some characters from the casino hotel watch, puzzled.
Dice are also predictable, if you are able to precisely measure how the weight inside them is distributed, and can be precise enough about how you’ll throw them, and know enough about the surface they’ll roll on. Screen capture from TrekCore’s archive of Star Trek: The Next Generation images.

Our instinctive idea of what randomness must be is flawed. That shouldn’t surprise. Our instinctive idea of anything is flawed. But randomness gives us trouble. It’s obvious, for example, that randomly selected things should have no pattern. But then how is that reasonable? If we draw letters from the alphabet at random, we should expect sometimes to get some cute pattern like ‘aaaaa’ or ‘qwertyuiop’ or the works of Shakespeare. Perhaps we mean we shouldn’t get patterns any more often than we would expect. All right; how often is that?

We can make tests. Some of them are obvious. Take something that generates possibly-random results. Look up how probable each of those outcomes is. Then run off a bunch of outcomes. Do we get about as many of each result as we should expect? Probability tells us we should get as close as we like to the expected frequency if we let the random process run long enough. If this doesn’t happen, great! We can conclude we don’t really have something random.
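
That test is a few lines of Python. A sketch, with dice:

```python
import random
from collections import Counter

rolls = Counter(random.randint(1, 6) for _ in range(60_000))
for face in range(1, 7):
    print(face, rolls[face], "(expected about 10000)")
```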

We can do more tests. Some of them are brilliantly clever. Suppose there’s a way to order the results. Since mathematicians usually want numbers, putting them in order is easy to do. If they’re not, there’s usually a way to match results to numbers. You’ll see me slide here into talking about random numbers as though that were the same as random results. But if I can distinguish different outcomes, then I can label them. If I can label them, I can use numbers as labels. If the order of the numbers doesn’t matter — should “red” be a 1 or a 2? Should “green” be a 3 or an 8? — then, fine; any order is good.

There are 120 ways to order five distinct things. So generate lots of sets of, say, five numbers. What order are they in? There’s 120 possibilities. Do each of the possibilities turn up as often as expected? If they don’t, great! We can conclude we don’t really have something random.
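
Here is a sketch of that ordering test; order_pattern is my name for the ranking step:

```python
import random
from collections import Counter
from math import factorial

def order_pattern(values):
    # Which ordering the values are in: e.g. [9, 1, 5] -> (1, 2, 0),
    # meaning the smallest is at index 1, the next at index 2, and so on.
    return tuple(sorted(range(len(values)), key=values.__getitem__))

trials = 120_000
counts = Counter(order_pattern([random.random() for _ in range(5)])
                 for _ in range(trials))
expected = trials / factorial(5)   # 1000 of each ordering, if all is fair
print(len(counts), "orderings seen of", factorial(5))
print("worst deviation:", max(abs(c - expected) for c in counts.values()))
```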

I can go on. There are many tests which will let us say something isn’t a truly random sequence. They’ll allow for something like Sydney Harris’s peculiar sequence of random numbers. Mostly by supposing that if we let it run long enough the sequence would stop. But these all rule out random number generators. Do we have any that rule them in? That say yes, this generates randomness?

I don’t know of any. I suspect there can’t be any, on the grounds that a test of a thousand or a thousand million or a thousand million quadrillion numbers can’t assure us the generator won’t break down next time we use it. If we knew the algorithm by which the random numbers were generated — oh, but there we’re foiled before we can start. An algorithm is the instructions of how to do a thing. How can an instruction tell us how to do a thing that can’t be predicted?

Algorithms seem, briefly, to offer a way to tell whether we do have a good random sequence, though. We can describe patterns. A strong pattern is easy to describe, the way a familiar story is easy to reference. A weak pattern, a random one, is hard to describe. It’s like a dream, in which you can just list events. So we can call random something which can’t be described any more efficiently than just giving a list of all the results. But how do we know that can’t be done? 7, 7, 2, 4, 5, 3, 8, 5, 0, 9 looks like a pretty good set of digits, whole numbers from 0 through 9. I’ll bet not more than one in ten of you guesses correctly what the next digit in the sequence is. Unless you’ve noticed that these are the digits in the square root of π, so that the next couple digits have to be 0, 5, 5, and 1.

We know, on theoretical grounds, that we have randomness all around us. Quantum mechanics depends on it. If we need truly random numbers we can set a sensor. It will turn the arrival of cosmic rays, or the decay of radioactive atoms, or the sighing of a material flexing in the heat into numbers. We trust we gather these and process them in a way that doesn’t spoil their unpredictability. To what end?

That is, why do we care about randomness? Especially why should mathematicians care? The image of mathematics is that it is a series of logical deductions. That is, things known to be true because they follow from premises known to be true. Where can randomness fit?

One answer, one close to my heart, is called Monte Carlo methods. These are techniques that find approximate answers to questions. They do well when exact answers are too hard for us to find. They use random numbers to approximate answers and, often, to make approximate answers better. This demands computations. The field didn’t really exist before computers, although there are some neat forebears. I mean the Buffon needle problem, which lets you calculate the digits of π about as slowly as you could hope to do.
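
Buffon’s needle makes a pleasant afternoon’s simulation. A sketch, with the small joke that I need π to pick a uniformly random angle in the first place:

```python
import math
import random

def buffon_pi(trials, needle=1.0, spacing=1.0):
    # Drop a needle on a floor of parallel lines `spacing` apart. With
    # needle <= spacing it crosses a line with probability
    # 2*needle / (pi*spacing); count crossings and solve for pi.
    hits = 0
    for _ in range(trials):
        center = random.uniform(0.0, spacing / 2)   # distance to nearest line
        angle = random.uniform(0.0, math.pi / 2)    # needs pi, amusingly
        if center <= (needle / 2) * math.sin(angle):
            hits += 1
    return 2 * needle * trials / (spacing * hits)

print(buffon_pi(1_000_000))   # drifts toward 3.14159, slowly
```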

Another, linked to Monte Carlo methods, is stochastic geometry. “Stochastic” is the word mathematicians attach to things when they feel they’ve said “random” too often, or in an undignified manner. Stochastic geometry is what we can know about shapes when there’s randomness in how the shapes are formed. This sounds like it’d be too weak a subject to study. That it’s built on relatively weak assumptions means it describes things in many fields, though. It can be seen in understanding how forests grow. How to find structures inside images. How to place cell phone towers. Why materials should act like they do instead of some other way. Why galaxies cluster.

There’s also a stochastic calculus, a bit of calculus with randomness added. This is useful for understanding systems with some persistent unpredictable behavior. It comes, if I understand the histories of this right, from studying the ways molecules will move around in weird zig-zagging twists. They do this even when there is no overall flow, just a fluid at a fixed temperature. It too has surprising applications. Without the assumption that some prices of things are regularly jostled by arbitrary and unpredictable forces, and the treatment of that by stochastic calculus methods, we wouldn’t have nearly the ability to hedge investments against weird chaotic events. This would be a bad thing, I am told by people with more sophisticated investments than I have. I personally own like ten shares of the Tootsie Roll corporation and am working my way to a $2.00 rebate check from Boyer.
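
The flavor of it fits in a few lines. Here is a sketch of the standard toy model, geometric Brownian motion, and not of any particular financial product:

```python
import math
import random

def geometric_brownian(s0, mu, sigma, dt, steps):
    # Simulate dS = mu*S dt + sigma*S dW in small steps: each step the
    # value drifts a little and gets jostled by a Gaussian "dW".
    s, path = s0, [s0]
    for _ in range(steps):
        dw = random.gauss(0.0, math.sqrt(dt))
        s += mu * s * dt + sigma * s * dw
        path.append(s)
    return path

print(geometric_brownian(100.0, mu=0.05, sigma=0.2, dt=1/252, steps=252)[-1])
```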

Playland's Derby Racer in motion, at night, featuring a ride operator leaning maybe twenty degrees inward.
Rye Playland’s is the fastest carousel I’m aware of running. Riders are warned ahead of time to sit so they’re leaning to the left, and the ride will not get up to full speed until the ride operator checks everyone during the ride. To get some idea of its speed, notice the ride operator on the left and how far he leans. He’s not being dramatic; that’s the natural stance. Also the tilt in the carousel’s floor is not camera trickery; it does lean like that.

Given that we need randomness, but don’t know how to get it — or at least don’t know how to be sure we have it — what is there to do? We accept our failings and make do with “quasirandom numbers”. We find some process that generates numbers which look about like random numbers should. These have failings. Most important is that, with enough information, we could predict them. They’re random like “the date Easter will fall on” is random. The date Easter falls on is not at all random; it’s defined by a specific and humanly knowable formula. But if the only information you have is that this year, Easter fell on the 1st of April (Gregorian computus), you don’t have much guidance to whether next year it’ll be on the 7th, 14th, or 21st of April. Most notably, quasirandom number generators will tend to repeat after enough numbers are drawn. If we know we won’t need enough numbers to see a repetition, though? Another stereotype of the mathematician is that of a person who demands exactness. It is often more true to say she is looking for an answer good enough. We are usually all right with a merely good enough quasirandomness.
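
The classic such generator is the linear congruential scheme; “pseudorandom” is the more common label for it. A sketch with a deliberately tiny modulus, so the repetition is impossible to miss:

```python
def lcg(seed, a, c, m):
    # x -> (a*x + c) mod m, over and over. Deterministic, and it must
    # repeat after at most m draws.
    while True:
        seed = (a * seed + c) % m
        yield seed

tiny = lcg(1, a=5, c=3, m=16)
print([next(tiny) for _ in range(20)])   # all 16 values, then the cycle repeats
```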

Boyer candies — Mallo Cups, most famously, although I more like the peanut butter Smoothies — come with a cardboard card backing. Each card has two play money “coins”, of values from 5 cents to 50 cents. These can be gathered up for a rebate check or for various prizes. Whether your coin is 5 cents, 10, 25, or 50 cents … well, there’s no way to tell, before you open the package. It’s, so far as you can tell, randomness.


My next A To Z post should be available at this link. It’s coming Tuesday and should be the letter ‘S’.

I’m Looking For The Last Topics For My Fall 2018 Mathematics A-To-Z


And now it’s my last request for my Fall 2018 mathematics A-To-Z. There’s only a half-dozen letters left, but not to fear: they include letters with no end of potential topics, like, ‘X’.

If you have any mathematical topics with a name that starts U through Z that you’d like to see me write about, please say so. I’m happy to write what I fully mean to be a tight 500 words about the subject and then find I’ve put up my second 1800-word essay of the week. I usually go by a first-come, first-served basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me. This might be the only way to get a good ‘X’ letter.

Also when you do make a request, please feel free to mention your blog, Twitter feed, YouTube channel, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.


So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences. I probably don’t want to revisit them. But I will think over, if I get a request, whether I might have new opinions.

Excerpted From The Summer 2015 A To Z


Excerpted From The Leap Day 2016 A To Z


Excerpted From The End 2016 A To Z


Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

Available Letters for the Fall 2018 A To Z:

  • U
  • V
  • W
  • X
  • Y
  • Z

All of my Fall 2018 Mathematics A-To-Z should appear at this link. And it’ll have some extra stuff like these topic-request pages and such.

My 2018 Mathematics A To Z: Quadratic Equation


I have another topic today suggested by Dina Yagodich. I’ve mentioned before her YouTube channel. It’s got a variety of educational videos you might enjoy. Give it a try.

I’m planning this week to open up the end of the alphabet — and the year — to topic suggestions. So there’s no need to panic about that.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Quadratic Equation.

The Quadratic Equation is the tool humanity used to discover mathematics. Yes, I exaggerate a bit. But it touches a stunning array of important things. It is most noteworthy to me because of the time I impressed my several-levels-removed boss at the summer job I had while an undergraduate. He had been stumped by a data-optimization problem for weeks. I noticed it was just a quadratic equation, which is easy to solve. He was, it must be said, overly impressed. I would go on to grad school where I was once stymied for a week because I couldn’t find the derivative of e^t correctly. It is, correctly, e^t . So I have sympathy for my remote supervisor.

We normally write the Quadratic Equation in one of two forms:

ax^2 + bx + c = 0

a_0 + a_1 x + a_2 x^2 = 0

The first form is great when you are first learning about polynomials, and parabolas. And you’re content to stop at something raised to the second power. The second form is great when you are learning advanced stuff about polynomials. Then you start wanting to know things true about polynomials that go up to arbitrarily high powers. And we always want to know about polynomials. The subscripts under a_j mean we can’t run out of letters to use as coefficients. Setting the subscripts and powers to keep increasing lets us write this out neatly.

We don’t have to use x. We never do. But we mostly use x. Maybe t, if we’re writing an equation that describes something changing with time. Maybe z, if we want to emphasize how complex-valued numbers might enter into things. The name of the independent variable doesn’t matter. But stick to the obvious choices. If you’re going to make the variable ‘f’ you better have a good reason.

The equation is very old. We have ancient Babylonian clay tablets which describe it. Well, not the quadratic equation as we write it. The oldest problems put it as finding numbers that simultaneously solve two equations, one of them a sum and one of them a product. Changing one equation into two is a venerable mathematical process. It often makes problems simpler. We do this all the time in Ordinary Differential Equations. I doubt there is a direct connection between Ordinary Differential Equations and this alternate form of the Quadratic Equation. But it is a reminder that the ways we express mathematical problems are our conventions. We can rewrite problems to make our lives easier, to make answers clearer. We should look for chances to do that.
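
To see that the two settings are the same problem: asking for numbers x and y with x + y = s and xy = p is asking for the solutions of t^2 - st + p = 0 , since that polynomial factors as (t - x)(t - y) = t^2 - (x + y)t + xy . The Babylonian sum-and-product form and our quadratic form carry the same information.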

It weaves into everything. Some things seem obvious. Suppose the coefficients — a, b, and c; or a_0, a_1, a_2 if you’d rather — are all real-valued numbers. Then the quadratic equation has to have two solutions. There can be two real-valued solutions. There can be one real-valued solution, counted twice for reasons that make sense but are too much a digression for me to justify here. There can be two complex-valued solutions. We can infer the usefulness of imaginary and complex-valued numbers by finding solutions to the quadratic equation.

(The quadratic equation is a great introduction to complex-valued numbers. It’s not how mathematicians came to them. Complex-valued numbers looked like obvious nonsense. They corresponded to there being no real-valued answers. A formula that gives obvious nonsense when there’s no answer is great. It’s formulas that give subtle nonsense when there’s no answer that are dangerous. But similar-in-design formulas for cubic and quartic polynomials could use complex-valued numbers in intermediate steps. Plunging ahead as though these complex-valued numbers were proper would get to the real-valued answers. This made the argument that complex-valued numbers should be taken seriously.)

We learn useful things right away from trying to solve it. We teach students to “complete the square” as a first approach to solving it. Completing the square is not that useful by itself: a few pages later in the textbook we get to the quadratic formula and that has every quadratic equation solved. Just plug numbers into the formula. But completing the square teaches something more useful than just how to solve an equation. It’s a method in which we solve a problem by saying, you know, this would be easy to solve if only it were different. And then thinking how to change it into a different-looking problem with the same solutions. This is brilliant work. A mathematician is imagined to have all sorts of brilliant ideas on how to solve problems. Closer to the truth is that she’s learned all sorts of brilliant ways to make a problem more like one she already knows how to solve. (This is the nugget of truth which makes one genre of mathematical jokes. These jokes have the punch line, “the mathematician declares, `this is a problem already solved’ and goes back to sleep.”)
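
For the record, the few steps: divide ax^2 + bx + c = 0 through by a, move the constant to the other side, and add \frac{b^2}{4a^2} to both sides to make a perfect square. You get \left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2} , which unpacks to the familiar x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} .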

Stare at the solutions of the quadratic equation. You will find patterns. Suppose the coefficients are all real numbers. Then there are some numbers that can be solutions: 0, 1, square root of 15, -3.5, these can all turn up. There are some numbers that can’t be. π. e. The tangent of 2. It’s not just a division between rational and irrational numbers. There are different kinds of irrational numbers. This — alongside looking at other polynomials — leads us to transcendental numbers.

Keep staring at the two solutions of the quadratic equation. You’ll notice the sum of the solutions is -\frac{b}{a} . You’ll notice the product of the two solutions is \frac{c}{a} . You’ll glance back at those ancient Babylonian tablets. This seems interesting, but little more than that. It’s a lead, though. Similar formulas exist for the sum of the solutions for a cubic, for a quartic, for other polynomials. Also for the sum of products of pairs of these solutions. Or the sum of products of triplets of these solutions. Or the product of all these solutions. These are known as Vieta’s Formulas, after the 16th-century mathematician François Viète. (This by way of his Latinized, academic-persona name, Franciscus Vieta.) This gives us a way to rewrite the original polynomial as a set of polynomials in several variables. What’s interesting is the set of polynomials have symmetries. They all look like, oh, “xy + yz + zx”. No one variable gets used in a way distinguishable from the others.

This leads us to group theory. The coefficients start out in a ring. The quotients from these Vieta’s Formulas give us an “extension” of the ring. An extension is roughly what the common use of the word suggests. It takes the ring and builds from it a bigger thing that satisfies some nice interesting rules. And it leads us to surprises. The ancient Greeks had several challenges to be done with only straightedge and compass. One was to make a cube double the volume of a given cube. It’s impossible to do, with these tools. (Even ignoring the question of what we would draw on.) Another was to trisect any arbitrary angle; it turns out there are angles for which it’s just impossible. The group theory derived, in part, from this tells us why. One more impossibility: drawing a square that has exactly the same area as a given circle.

But there are possible things still. Step back from the quadratic equation, that ax^2 + bx + c = 0 bit. Make a function, instead, something that matches numbers (real, complex, what have you) to numbers (the same). Its rule: any x in the domain matches to the number f(x) = ax^2 + bx + c in the range. We can make a picture that represents this. Set Cartesian coordinates — the x and y coordinates that people think of as the default — on a surface. Then highlight all the points with coordinates (x, y) which make true the equation y = f(x) . This traces out a particular shape, the parabola.

Draw a line that crosses this parabola twice. There’s now one fully-enclosed piece of the surface. How much area is enclosed there? It’s possible to find a triangle with area three-quarters that of the enclosed part. It’s easy to use straightedge and compass to draw a square the same area as a given triangle. Showing the enclosed area is four-thirds the triangle’s area? That can … kind of … be done by straightedge and compass. It takes infinitely many steps to do this. But if you’re willing to allow a process to go on forever? And you show that the process would reach some fixed, knowable answer? This could be done by the ancient Greeks; indeed, it was. Archimedes used this as an example of the method of exhaustion. It’s one of the ideas that reaches toward integral calculus.
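
The sum Archimedes needed is a geometric series. Each round of new triangles adds a quarter of the area the previous round did, so with T the area of that first triangle the total enclosed area is T \left(1 + \frac{1}{4} + \frac{1}{4^2} + \frac{1}{4^3} + \cdots\right) = T \cdot \frac{1}{1 - \frac{1}{4}} = \frac{4}{3} T .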

This has been a lot of exact, “analytic” results. There are neat numerical results too. Vieta’s formulas, for example, give us good ways to find approximate solutions of the quadratic equation. They work well if one solution is much bigger than the other. Numerical methods for finding solutions tend to work better if you can start from a decent estimate of the answer. And you can learn of numerical stability, and the need for it, studying these.
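
To sketch how that works: suppose the solution r is enormous compared to the solution s. Then r + s is nearly just r, so the sum formula gives r \approx -\frac{b}{a} . And then the product formula fills in the other one: s = \frac{c}{ar} \approx -\frac{c}{b} . Neither estimate needs a square root.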

Numerical calculations have a problem. We have a set number of decimal places with which to work. What happens if we need a calculation that takes more decimal places than we’re given to do perfectly? Here’s a toy version: two-thirds is the number 0.6666. Or 0.6667. Already we’re in trouble. What is three times two-thirds? We’re going to get either 1.9998 or 2.0001 and either way something’s wrong. The wrongness looks small. But any formula you want to use has some numbers that will turn these small errors into big ones. So numerical stability is, in fairness, not something unique to the quadratic equation. It is something you learn if you study the numerics of the equation deeply enough.
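
Here is a minimal sketch of that trouble in Python. The numbers are my own example, nothing from any particular source. The textbook formula subtracts two nearly equal numbers when b^2 is much larger than 4ac; computing the big root first and then using Vieta’s product formula dodges the subtraction.

    import math

    def roots_naive(a, b, c):
        # Textbook formula. When b*b is much bigger than 4*a*c, the
        # "-b + sqrt(...)" numerator subtracts two nearly equal numbers.
        d = math.sqrt(b*b - 4*a*c)
        return (-b + d) / (2*a), (-b - d) / (2*a)

    def roots_stable(a, b, c):
        # Find the large-magnitude root first, then recover the other from
        # Vieta's product formula: (root one) times (root two) = c / a.
        d = math.sqrt(b*b - 4*a*c)
        q = -0.5 * (b + math.copysign(d, b))
        return q / a, c / q

    print(roots_naive(1.0, 1e8, 1.0))   # the small root comes out badly wrong
    print(roots_stable(1.0, 1e8, 1.0))  # the small root is close to -1e-8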

I’m also delighted to learn, through Wikipedia, that there’s a prosthaphaeretic method for solving the quadratic equation. Prosthaphaeretic methods use trigonometric functions and identities to rewrite problems. You might call it madness to rely on arctangents and half-angle formulas and such instead of, oh, doing a division or taking a square root. This is because you have calculators. But if you don’t? If you have to do all that work by hand? That’s terrible. But if someone has already prepared a table listing the sines and cosines and tangents of a great variety of angles? They did a great many calculations already. You just need to pick out the one that tells you what you hope to know. I’ll spare you the steps of solving the quadratic equation using trig tables. Wikipedia describes it fine enough.

So you see how much mathematics this connects to. It’s a bit of question-begging to call it that important. As I said, we’ve known the quadratic equation for a long time. We’ve thought about it for a long while. It would be surprising if we didn’t find many and deep links to other things. Even if it didn’t have links, we would try to understand new mathematical tools in terms of how they affect familiar old problems like this. But these are some of the things which we’ve found, and which run through much of what we understand mathematics to be.


The letter ‘R’ for this Fall 2018 Mathematics A-To-Z post should be published Friday. It’ll be available at this link, as are the rest of these glossary posts.

My 2018 Mathematics A To Z: Pigeonhole Principle


Today’s topic is another that Dina Yagodich offered me. She keeps up a YouTube channel, with a variety of educational videos you might enjoy. And I apologize to Roy Kassinger, but I might come back around to “parasitic numbers” if I feel like doing some supplements or side terms.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Pigeonhole Principle.

Mathematics has a reputation for obscurity. It’s full of jargon. All these weird, technical terms. Properties that mathematicians take to be obvious but that normal people find baffling. “The only word I understood was `the’,” is the feedback mathematicians get when they show their family their thesis. I’m happy to share one that is not. This is one of those principles that anyone can understand. It’s so accessible that people might ask how it’s even mathematics.

The Pigeonhole Principle is usually put something like this. If you have more pigeons than there are different holes to nest them in, then at least one pigeonhole has to hold more than one pigeon. This is if the speaker wishes to keep pigeons in the discussion and is assuming there’s a positive number of both pigeons and holes. Tying a mathematical principle to something specific seems odd. We don’t talk about addition as apples put together or divided among friends. Not after elementary school anyway. Not once we’ve trained our natural number sense to work with abstractions.

One pigeon, as photographed by me in Pittsburgh in summer 2017. Not depicted: pigeonholes.
A pigeon, on a Pittsburgh sidewalk in summer 2017, giving me a good looking-over.

If we want to make it abstract we can. Put it as “if you have more objects to put in boxes than you have boxes, then at least one box must hold more than one object”. In this form it is known as the Dirichlet Box Principle. Dirichlet here is Johann Peter Gustav Lejeune Dirichlet. He’s one of the seemingly infinite number of great 19th-Century French-German mathematicians. His family name was “Lejeune Dirichlet”, so his surname is an example of his own box principle. Everyone speaks of him as just Dirichlet, though. And they speak of him a lot, for stuff in mathematical physics, in thermodynamics, in Fourier transforms, in number theory (he proved two specific cases of Fermat’s Last Theorem), and in probability.

Still, at least in my experience, it’s “pigeonhole principle”. I don’t know why pigeons. It would be as good a metaphor to speak of horses put in stalls, or letters put in mailboxes, or pairs of socks put in hotel drawers. Perhaps it’s a reflection of the long history of breeding pigeons. That they’re familiar, likable animals, despite the invective aimed at them. That a bird in a cubby-hole seems like a cozy, pleasant image.

The pigeonhole principle is one of those neat little utility theorems. I think of it as something handy for existence proofs. These are proofs where you show there must be a thing. They don’t typically tell you what the thing is, or even help you to find it. They promise there is something to find.

Some of its uses seem too boring to bother proving. Pick five cards from a standard deck of cards; at least two will be the same suit. There are at least two non-bald people in Philadelphia who have the same number of hairs on their heads. Some of these uses seem interesting enough to prove, but worth nothing more than a shrug and a huh. Any 27-word sequence in the Constitution of the United States includes at least two words that begin with the same letter. Also at least two words that end with the same letter. If you pick any five integers from 1 to 8 (inclusive), then at least two of them will sum to nine.
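
That last claim is small enough to check by exhaustion. A quick sketch in Python:

    from itertools import combinations

    # Every 5-element subset of {1, ..., 8} should contain a pair summing
    # to 9. The four "boxes" are the pairs {1,8}, {2,7}, {3,6}, {4,5}.
    for picks in combinations(range(1, 9), 5):
        assert any(x + y == 9 for x, y in combinations(picks, 2)), picks
    print("all 56 five-element subsets check out")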

Some uses start feeling unsettling. Draw five dots on the surface of an orange. It’s always possible to cut the orange in half in such a way that four points are on the same half. (This supposes that a point on the cut counts as being on both halves.)

Pick a set of 100 different whole numbers. It is always possible to select fifteen of these numbers, so that the difference between any pair of these select fifteen is some whole multiple of 7.
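
The boxes here are the seven possible remainders on division by 7. A hundred numbers sorted into seven boxes means some box holds at least fifteen of them, since seven boxes of fourteen would only hold 98. And any two numbers with the same remainder differ by a multiple of 7. A sketch in Python, with an arbitrary random sample standing in for your set:

    import random
    from collections import Counter

    nums = random.sample(range(1, 10**6), 100)
    # Box each number by its remainder mod 7; some box holds at least 15.
    residue, count = Counter(n % 7 for n in nums).most_common(1)[0]
    assert count >= 15
    fifteen = [n for n in nums if n % 7 == residue][:15]
    assert all((a - b) % 7 == 0 for a in fifteen for b in fifteen)
    print("remainder", residue, "box holds", count, "of the numbers")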

Select six people. There is always a trio of people who all know one another, or who are all strangers to one another. (This supposes that “knowing one another” is symmetric. Real world relationships are messier than this. I have met Roger Penrose. There is not the slightest chance he is aware of this. Do we count as knowing one another or not?)

Some seem to transcend what we could possibly know. Drop ten points anywhere along a circle of diameter 5. Then we can conclude there are at least two points a distance of less than 2 from one another.

Drop ten points into an equilateral triangle whose sides are all length 1. Then there must be at least two points that are no more than distance \frac{1}{3} apart.

Start with any lossless data compression algorithm. Your friend with the opinions about something called “Ubuntu Linux” can give you one. There must be at least one data set it cannot compress. Your friend is angry about this fact.
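
The counting behind that one: there are more bit strings of length n than there are bit strings shorter than n, so no lossless compressor can shrink every input. A toy check:

    n = 8
    # 2**n strings of length n, but only 2**n - 1 strings of length 0 to n-1.
    shorter = sum(2**k for k in range(n))
    assert shorter == 2**n - 1 < 2**n
    # A compressor that shrank every n-bit input would have to send two
    # different inputs to the same output. Then it couldn't be undone.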

Take a line of length L. Drop onto it some number of points, n + 1. There is some shortest length between consecutive points. What is the largest possible shortest-length-between-points? It is the distance \frac{L}{n} . The n gaps between consecutive points add up to no more than L, so the smallest gap can’t be more than \frac{L}{n} ; spacing the points evenly achieves it.

As I say, this won’t help you find the examples. You need to check the points in your triangle to see which ones are close to one another. You need to try out possible sets of your 100 numbers to find the ones that are all multiples of seven apart. But you have the assurance that the search will end in success, which is no small thing. And many of the conclusions you can draw are delights: results unexpected and surprising and wonderful. It’s great mathematics.


A note on sources. I drew pigeonhole-principle uses from several places. John Allen Paulos’s Innumeracy. Paulos is another I know without their knowing me. But also 16 Fun Applications of the Pigeonhole Principle, by Presh Talwalkar. Brilliant.org’s Pigeonhole Principle, by Lawrence Chiou, Parth Lohomi, Andrew Ellinor, et al. The Art of Problem Solving’s Pigeonhole Principle, author not made clear on the page. If you’re stumped by how to prove one or more of these claims, and don’t feel like talking them out here, try these pages.


My next Fall 2018 Mathematics A-To-Z post should be Tuesday. It’ll be available at this link, as are the rest of these glossary posts.

My 2018 Mathematics A To Z: Oriented Graph


I am surprised to have had no suggestions for an ‘O’ letter. I’m glad to take a free choice, certainly. It let me get at one of those fields I didn’t specialize in, but could easily have. And let me mention that while I’m still taking suggestions for the letters P through T, each other letter has gotten at least one nomination. I can be swayed by a neat term, though, so if you’ve thought of something hard to resist, try me. And later this month I’ll open up the letters U through Z. Might want to start thinking right away about what X, Y, and Z could be.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Oriented Graph.

This is another term from graph theory, one of the great mathematical subjects for doodlers. A graph, here, is made of two sets of things. One is a bunch of fixed points, called ‘vertices’. The other is a bunch of curves, called ‘edges’. Every edge starts at one vertex and ends at one vertex. We don’t require that every vertex have an edge growing from it.

Already you can see why this is a fun subject. It models some stuff really well. Like, anything where you have a bunch of sources of stuff, that come together and spread out again? Chances are there’s a graph that describes this. There’s a compelling all-purpose interpretation. Have vertices represent the spots where something accumulates, or rests, or changes, or whatever. Have edges represent the paths along which something can move. This covers so much.

The next step is a “directed graph”. This comes from making the edges different. If we don’t say otherwise we suppose that stuff can move along an edge in either direction. But suppose otherwise. Suppose there are some edges that can be used in only one direction. This makes a “directed edge”. It’s easy to see networks of stuff, like city streets, as graphs. Once you ponder that, one-way streets follow close behind. If every edge in a graph is directed, then you have a directed graph. Moving from a regular old undirected graph to a directed graph changes everything you’d learned about graph theory. Mostly it makes things harder. But you get some good things in trade. We become able to model sources, for example. This is where whatever might move comes from. Also sinks, which is where whatever might move disappears from our consideration.

You might fear that by switching to a directed graph there’s no way to have a two-way connection between a pair of vertices. Or that if there is you have to go through some third vertex. I understand your fear, and wish to reassure you. We can get a two-way connection even in a directed graph: just have the same two vertices be connected by two edges. One goes one way, one goes the other. I hope you feel some comfort.

What if we don’t have that, though? What if the directed graph doesn’t have any pair of vertices joined by opposite-directed edges? That, then, is an oriented graph. We get the orientation from looking at pairs of vertices. Each pair either has no edge connecting them, or has a single directed edge between them.

There are a lot of potential oriented graphs. If you have three vertices, for example, there are seven oriented graphs to make of that. You’re allowed to have a vertex not connected to any others. You’re also allowed to have the vertices grouped into a couple of subsets, each vertex connecting only to others in its own subset. This is part of why four vertices can give you 42 different oriented graphs. Five vertices can give you 582 different oriented graphs. You can also insist on a connected oriented graph.

A connected graph is what you guess. It’s a graph where there are no vertices off on their own, unconnected to anything. There are no subsets of vertices connected only to each other. This doesn’t mean you can always get from any one vertex to any other vertex. The directions might not allow you to do that. But if you’re willing to break the laws, and ignore the directions of these edges, you could then get from any vertex to any other vertex. Limiting yourself to connected graphs reduces the number of oriented graphs you can get. But not by as much as you might guess, at least not to start. There’s only one connected oriented graph for two vertices, instead of two. Three vertices have five connected oriented graphs, rather than seven. Four vertices have 34, rather than 42. Five vertices, 535 rather than 582. The total number of lost graphs grows, of course. The percentage of lost graphs dwindles, though.
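
If you would like to see those counts come out of a computer, here is a brute-force sketch in Python. The function names are mine. It assigns each pair of vertices one of three states — no edge, or an arc in either direction — and counts the distinct results up to relabeling of the vertices:

    from itertools import combinations, permutations, product

    def canonical(arcs, n):
        # The lexicographically smallest arc list over all relabelings.
        return min(tuple(sorted((p[i], p[j]) for i, j in arcs))
                   for p in permutations(range(n)))

    def is_connected(arcs, n):
        # Connectivity of the underlying undirected graph, by depth-first search.
        adj = {v: set() for v in range(n)}
        for i, j in arcs:
            adj[i].add(j)
            adj[j].add(i)
        seen, stack = {0}, [0]
        while stack:
            for w in adj[stack.pop()] - seen:
                seen.add(w)
                stack.append(w)
        return len(seen) == n

    def count_oriented(n):
        pairs = list(combinations(range(n), 2))
        total, connected = set(), set()
        # Each pair of vertices: 0 = no edge, 1 = arc i->j, 2 = arc j->i.
        for states in product((0, 1, 2), repeat=len(pairs)):
            arcs = [(i, j) if s == 1 else (j, i)
                    for (i, j), s in zip(pairs, states) if s]
            form = canonical(arcs, n)
            total.add(form)
            if is_connected(arcs, n):
                connected.add(form)
        return len(total), len(connected)

    for n in (2, 3, 4):
        print(n, count_oriented(n))   # expect (2, 1), (7, 5), (42, 34)

Five vertices works the same way, just more slowly; the totals 582 and 535 come out of the identical loop.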

There’s something more. What if no pair of vertices goes unconnected? That is, what if every pair of vertices has an edge? If every pair of vertices in a graph has a direct connection we call that a “complete” graph. This is true whether the graph is directed or not. If you do have a complete oriented graph — every pair of vertices has a direct connection, and only the one direction — then that’s a “tournament”. If that seems like a whimsical name, consider one interpretation of it. Imagine a sports tournament in which every team plays every other team once. And in which there are no ties. Each vertex represents one team. Each edge is the match played by the two teams. The direction is, let’s say, from the losing team to the winning team. (It’s as good if the direction is from the winning team to the losing team.) Then you have a complete, oriented, directed graph. And it represents your tournament.

And that delights me. A mathematician like me might talk a good game about building models. How one can represent things with mathematical constructs. Here, it’s done. You can make little dots, for vertices, and curved lines with arrows, for edges. And draw a picture that shows how a round-robin tournament works. It can be that direct.


My next Fall 2018 Mathematics A-To-Z post should be Friday. It’ll be available at this link, as are the rest of these glossary posts. And I’ve got requests for the next letter. I just have to live up to at least one of them.

My 2018 Mathematics A To Z: Nearest Neighbor Model


I had a free choice of topics for today! Nobody had a suggestion for the letter ‘N’, so, I’ll take one of my own. If you did put in a suggestion, I apologize; I somehow missed the comment in which you did. I’ll try to do better in future.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Nearest Neighbor Model.

Why are restaurants noisy?

It’s one of those things I wondered while at a noisy restaurant. I have heard it is because restaurateurs believe patrons buy more, and more expensive stuff, in a noisy place. I don’t know that I have heard this correctly, nor that what I heard was correct. I’ll leave it to people who work that end of restaurants to say. But I wondered idly whether mathematics could answer why.

It’s easy to form a rough model. Suppose I want my brilliant words to be heard by the delightful people at my table. Then I have to be louder, to them, than the background noise is. Fine. I don’t like talking loudly. My normal voice is soft enough that even I have a hard time making it out. And I’ll drop the ends of sentences when I feel like I’ve said all the interesting parts of them. But I can overcome my instinct if I must.

The trouble comes from other people thinking of themselves the way I think of myself. They want to be heard over how loud I have been. And there’s no convincing them they’re wrong. If there’s a bunch of tables near one another, we’re going to have trouble. We’ll each be talking loud enough to drown one another out, until the whole place is a racket. If we’re close enough together, that is. If the tables around mine are empty, chances are my normal voice is enough for the cause. If they’re not, we might have trouble.

So this inspires a model. The restaurant is a space. The tables are set positions, points inside it. Each table is making some volume of noise. Each table is trying to be louder than the background noise. At least until the people at the table reach the limits of their screaming. Or decide they can’t talk, they’ll just eat and go somewhere pleasant.

Making calculations on this demands some more work. Some is obvious: how do you represent “quiet” and “loud”? Some is harder: how far do voices carry? Grant that a loud table is still loud if you’re near it. How far away before it doesn’t sound loud? How far away before you can’t hear it anyway? Imagine a dining room that’s 100 miles long. There’s no possible party at one end that could ever be heard at the other. Never mind that a 100-mile-long restaurant would be absurd. It shows that the limits of people’s voices are a thing we have to consider.

There are many ways to model this distance effect. A realistic one would fall off with distance, sure. But it would also allow for echoes and absorption by the walls, and by other patrons, and maybe by restaurant decor. This would take forever to get answers from, but if done right it would get very good answers. A simpler model would give answers less fitted to your actual restaurant. But the answers may be close enough, and let you understand the system. And may be simple enough that you can get answers quickly. Maybe even by hand.

And so I come to the “nearest neighbor model”. The common English meaning of the words suggests what it’s about. We get it from models, like my restaurant noise problem. It’s made of a bunch of points that have some value. For my problem, tables and their noise level. And that value affects stuff in some region around these points.

In the “nearest neighbor model”, each point directly affects only its nearest neighbors. Saying which is the nearest neighbor is easy if the points are arranged in some regular grid. If they’re evenly spaced points on a line, say. Or a square grid. Or a triangular grid. If the points are in some other pattern, you need to think about what the nearest neighbors are. This is why people working in neighbor-nearness problems get paid the big money.

Suppose I use a nearest neighbor model for my restaurant problem. In this, I pretend the only background noise at my table is that of the people the next table over, in each direction. Two tables over? Nope. I don’t hear them at my table. I do get an indirect effect. Two tables over affects the table that’s between mine and theirs. But vice-versa, too. The table that’s 100 miles away can’t affect me directly, but it can affect a table in-between it and me. And that in-between table can affect the next one closer to me, and so on. The effect is attenuated, yes. Shouldn’t it be, if we’re looking at something farther away?

This sort of model is easy to work with numerically. I’m inclined toward problems that work numerically. Analytically … well, it can be easy. It can be hard. There’s a one-dimensional version of this problem, a bunch of evenly-spaced sites on an infinitely long line. If each site is limited to one of exactly two values, the problem becomes easy enough that freshman physics majors can solve it exactly. They don’t, not the first time out. This is because it requires recognizing a trigonometry trick that they don’t realize would be relevant. But once they know the trick, they agree it’s easy, when they go back two years later and look at it again. It just takes familiarity.

This comes up in thermodynamics, because it makes a nice model for how ferromagnetism can work. More realistic problems, like, two-dimensional grids? … That’s harder to solve exactly. Can be done, though not by undergraduates. Three-dimensional can’t, last time I looked. Weirdly, four-dimensional can. You expect problems to only get harder with more dimensions of space, and then you get a surprise like that.
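
The one-dimensional, two-values-per-site problem described above is the one usually called the Ising model, a name the essay doesn’t use. Its exact solution runs through a two-by-two “transfer matrix”. Here is a minimal sketch in Python, with NumPy, and with my own arbitrary choices of coupling and temperature, comparing the shortcut against brute-force enumeration on a small ring of sites:

    import itertools
    import math
    import numpy as np

    J, beta, N = 1.0, 0.5, 10   # coupling, inverse temperature, sites on a ring

    def energy(spins):
        # Nearest-neighbor interaction only, with periodic boundary (a ring).
        return -J * sum(s * t for s, t in zip(spins, spins[1:] + spins[:1]))

    # Brute force: sum exp(-beta * E) over all 2**N configurations of +1/-1 spins.
    Z_brute = sum(math.exp(-beta * energy(list(c)))
                  for c in itertools.product((-1, 1), repeat=N))

    # Transfer matrix: Z equals the trace of T**N, the sum of eigenvalues**N.
    T = np.array([[math.exp(beta * J), math.exp(-beta * J)],
                  [math.exp(-beta * J), math.exp(beta * J)]])
    Z_transfer = sum(np.linalg.eigvalsh(T) ** N)

    print(Z_brute, Z_transfer)   # the two agree to rounding error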

The nearest-neighbor model is a first choice. It’s hardly the only one. If I told you there were a next-nearest-neighbor model, what would you suppose it was? Yeah, you’d be right. As long as you supposed it was “things are affected by the nearest and the next-nearest neighbors”. Mathematicians have heard of loopholes too, you know.

As for my restaurant model? … I never actually modelled it. I did think about the model. I concluded my model wasn’t different enough from ferromagnetism models to need me to study it more. I might be mistaken. There may be interesting weird effects caused by the facts of restaurants. That restaurants are pretty small things. That they can have echo-y walls and ceilings. That they can have sound-absorbing things like partial walls or plants. Perhaps I gave up too easily when I thought I knew the answer. Some of my idle thoughts end up too idle.


I should have my next Fall 2018 Mathematics A-To-Z post on Tuesday. It’ll be available at this link, as are the rest of these glossary posts.

My 2018 Mathematics A To Z: Manifold


Two commenters suggested the topic for today’s A to Z post. I suspect I’d have been interested in it if only one had. (Although Dina Yagodich’s suggestion of the Menger Sponge is hard to resist.) But a double domination? The topic got suggested by Mr Wu, author of MathTuition88, and by John Golden, author of Math Hombre. My thanks to all for interesting things to think about.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Manifold.

So you know how in the first car you ever owned the alternator was always going bad? If you’re lucky, you reach a point where you start owning cars good enough that the alternator is not the thing always going bad. Once you’re there, congratulations. Now the thing that’s always going bad in your car will be the manifold. That one’s for my dad.

Manifolds are a way to do normal geometry on weird shapes. What’s normal geometry? It’s … you know, the way shapes work on your table, or in a room. The Euclidean geometry that we’re so used to that it’s hard to imagine it not working. Why worry about weird shapes? They’re interesting, for one. And they don’t have to be that weird to count as weird. A sphere, like the surface of the Earth, can be weird. And these weird shapes can be useful. Mathematical physics, for example, can represent the evolution of some complicated thing as a path drawn on a weird shape. Bringing what we know about geometry from years of study, and moving around rooms, to a problem that abstract makes our lives easier.

We use language that sounds like that of map-makers when discussing manifolds. We have maps. We gather together charts. The collection of charts describing a surface can be an atlas. All these words have common meanings. Mercifully, these common meanings don’t lead us too far from the mathematical meanings. We can even use the problem of mapping the surface of the Earth to understand manifolds.

If you love maps, the geography kind, you learn quickly that there’s no making a perfect two-dimensional map of the Earth’s surface. Some of these imperfections are obvious. You can distort shapes trying to make a flat map of the globe. You can distort sizes. But you can’t represent every point on the globe with a point on the paper. Not without doing something that really breaks continuity. Like, say, turning the North Pole into the whole line at the top of the map. Like in the Equirectangular projection. Or skipping some of the points, like in the Mercator projection. Or adding some cuts into a surface that doesn’t have them, like in the Goode homolosine projection. You may recognize this as the one used in classrooms back when the world had first begun.

But what if we don’t need the whole globe done in a single map? Turns out we can do that easy. We can make charts that cover a part of the surface. No one chart has to cover the whole of the Earth’s surface. It only has to cover some part of it. Each chart matches its part of the surface to a piece that looks like a common ordinary Euclidean space, where ordinary geometry holds. It’s the collection of charts that covers the whole surface. This collection of charts is an atlas. You have a manifold if it’s possible to make a coherent atlas. For this every point on the manifold has to be on at least one chart. It’s okay if a point is on several charts. It’s okay if some point is on all the charts. Like, suppose your original surface is a circle. You can represent this with an atlas of two charts. Each chart maps the circle, except for one point, onto a line segment. The two charts don’t both skip the same point. All but two points on this circle are on both charts of this atlas. That’s cool. What’s not okay is if some point can’t be coherently put onto some chart.
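
To make that circle example concrete — the notation is mine, not the essay’s — write the points of the circle as (\cos\theta, \sin\theta) . One chart assigns such a point the angle \theta drawn from the interval (0, 2\pi) , and so covers everything except the point (1, 0) . The other draws \theta from (-\pi, \pi) , and covers everything except (-1, 0) . Every point lands on at least one chart, and all but those two points land on both.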

This sad fate can happen. Suppose instead of a circle you want to chart a figure-eight loop. That won’t work. The point where the figure crosses itself doesn’t look, locally, like a Euclidean space. It looks like an ‘x’. There’s no getting around that. There’s no atlas that can cover the whole of that surface. So that surface isn’t a manifold.

But many things are manifolds nevertheless. Toruses, the doughnut shapes, are. Möbius strips and Klein bottles are. Ellipsoids and hyperbolic surfaces are, or at least can be. Mathematical physics finds surfaces that describe all the ways the planets could move and still conserve the energy and momentum and angular momentum of the solar system. That cheesecloth surface stretched through 54 dimensions is a manifold. There are many possible atlases, with many more charts. But each of those means we can, at least locally, for particular problems, understand them the same way we understand cutouts of triangles and pentagons and circles on construction paper.

So to get back to cars: no one has ever said “my car runs okay, but I regret how I replaced the brake covers the moment I suspected they were wearing out”. Every car problem is easier when it’s done as soon as your budget and schedule allow.


This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. What will I choose for ‘N’, later this week? I really should have decided that by now.

My 2018 Mathematics A To Z: Limit


I got an irresistible topic for today’s essay. It’s courtesy Peter Mander, author of Carnot Cycle, “the classical blog about thermodynamics”. It’s bimonthly and it’s one worth waiting for. Some of the essays are historical; some are statistical-mechanics; many are mixtures of them. You could make a fair argument that thermodynamics is the most important field of physics. It’s certainly one that hasn’t gotten the popularization treatment it deserves, for its importance. Mander is doing something to correct that.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Limit.

It is hard to think of limits without thinking of motion. The language even professional mathematicians use suggests it. We speak of the limit of a function “as x goes to a”, or “as x goes to infinity”. Maybe “as x goes to zero”. But a function is a fixed thing, a relationship between stuff in a domain and stuff in a range. It can’t change any more than January, AD 1988 can change. And ‘x’ here is a dummy variable, part of the scaffolding to let us find what we want to know. I suppose ‘x’ can change, but if we ever see it, something’s gone very wrong. But we want to use it to learn something about a function for a point like ‘a’ or ‘infinity’ or ‘zero’.

The language of motion helps us learn, to a point. We can do little experiments: if f(x) = \frac{\sin(x)}{x} , then, what should we expect it to be for x near zero? It’s irresistible to try out the calculator. Let x be 0.1. 0.01. 0.001. 0.0001. The numbers say this f(x) gets closer and closer to 1. That’s good, right? We know we can’t just put in an x of zero, because that would mean dividing by zero. But we can imagine creeping up on the zero we really wanted. We might spot some obvious prospects for mischief: what if x is negative? We should try -0.1, -0.01, -0.001 and so on. And maybe we won’t get exactly the right answer. But if all we care about is the first (say) three digits and we try out a bunch of x’s and the corresponding f(x)’s agree to those three digits, that’s good enough, right?
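
The experiment is easy to run. A couple of lines of Python, if the calculator is out of reach:

    import math

    for k in range(1, 6):
        x = 10.0 ** -k
        print(x, math.sin(x) / x)
    # 0.1 gives 0.99833..., 0.01 gives 0.99998..., creeping up on 1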

This is good for giving an idea of what to expect a limit to look like. It should be, well, what it really really really looks like a function should be. It takes some thinking to see where it might go wrong. It might go to different numbers based on which side you approach from. But that seems like something you can rationalize. Indeed, we do; we can speak of functions having different limits based on what direction you approach from. Sometimes that’s the best one can say about them.

But it can get worse. It’s possible to make functions that do crazy weird things. Some of these look like you’re just trying to be difficult. Like, set f(x) equal to 1 if x is rational and 0 if x is irrational. If you don’t expect that to be weird you’re not paying attention. Can’t blame someone for deciding that falls outside the realm of stuff you should be able to find limits for. And who would make, say, an f(x) that was 1 if x was 0.1 raised to some power, but 2 if x was 0.2 raised to some power, and 3 otherwise? Besides someone trying to prove a point?

Fine. But you can make a function that looks innocent and yet acts weird if the domain is two-dimensional. Or more. It makes sense to say that the functions I wrote in the above paragraph should be ruled out of consideration. But the limit of f(x, y) = \frac{x^3 y}{x^6 + y^2} at the origin? You get different results approaching in different directions. And the function doesn’t give obvious signs of imminent danger here.
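
You can watch that happen numerically. Along the straight line y = x the function settles toward 0; along the curve y = x^3 it sits at one-half the whole way in:

    def f(x, y):
        return x**3 * y / (x**6 + y**2)

    for k in range(1, 6):
        x = 10.0 ** -k
        print(x, f(x, x), f(x, x**3))   # middle column goes to 0; last stays 0.5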

We need a better idea. And we even have one. This took centuries of mathematical wrangling and arguments about what should and shouldn’t be allowed. This should inspire sympathy with Intro Calc students who don’t understand all this by the end of week three. But here’s what we have.

I need a supplementary idea first. That is the neighborhood. A point has a neighborhood if there’s some open set that contains it. We represent this by drawing a little blob around the point we care about. If we’re looking at the neighborhood of a real number, then this is a little interval, that’s all. When we actually get around to calculating, we make these neighborhoods little circles. Maybe balls. But when we’re doing proofs about how limits work, or how we use them to prove things, we make blobs. This “neighborhood” idea looks simple, but we need it, so here we go.

So start with a function, named ‘f’. It has a domain, which I’ll call ‘D’. And a range, which I want to call ‘R’, but I don’t think I need the shorthand. Now pick some point ‘a’. This is the point at which we want to evaluate the limit. This seems like it ought to be called the “limit point” and it’s not. I’m sorry. Mathematicians use “limit point” to talk about something else. And, unfortunately, it makes so much sense in that context that we aren’t going to change away from that.

‘a’ might be in the domain ‘D’. It might not. It might be on the border of ‘D’. All that’s important is that there be a neighborhood inside ‘D’ that contains ‘a’.

I don’t know what f(a) is. There might not even be an f(a), if a is on the boundary of the domain ‘D’. But I do know that everything inside the neighborhood of ‘a’, apart from ‘a’, is in the domain. So we can look at the values of f(x) for all the x’s in this neighborhood. This will create a set, in the range, that’s known as the image of the neighborhood. It might be a continuous chunk in the range. It might be a couple of chunks. It might be a single point. It might be some crazy-quilt set. Depends on ‘f’. And the neighborhood. No matter.

Now I need you to imagine the reverse. Pick a point in the range. And then draw a neighborhood around it. Then pick out what we call the pre-image of it. That’s all the points in the domain that get matched to values inside that neighborhood. Don’t worry about trying to do it; that’s for the homework practice. Would you agree with me that you can imagine it?

I hope so because I’m about to describe the part where Intro Calc students think hard about whether they need this class after all.

OK. Ready?

All right. Then I want something in the range. I’m going to call it ‘L’. And it’s special. It’s the limit of ‘f’ at ‘a’ if this following bit is true:

Think of every neighborhood you could pick of ‘L’. Can be big, can be small. Just has to be a neighborhood of ‘L’. Now think of the pre-image of that neighborhood. Is there always a neighborhood of ‘a’ inside that pre-image? It’s okay if it’s a tiny neighborhood. Just has to be an open neighborhood. It doesn’t have to contain ‘a’. You can allow a pinpoint hole there.

If you can always do this, however tiny the neighborhood of ‘L’ is, then the limit of ‘f’ at ‘a’ is ‘L’. If you can’t always do this — if there’s even a single exception — then there is no limit of ‘f’ at ‘a’.
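
For a real-valued function of a real variable, where the neighborhoods are just open intervals, this condenses into the classic epsilon-delta form: the limit of ‘f’ at ‘a’ is ‘L’ exactly when, for every \epsilon > 0 , there is a \delta > 0 so that 0 < |x - a| < \delta implies |f(x) - L| < \epsilon . The neighborhood of ‘L’ is the interval of radius \epsilon ; the neighborhood of ‘a’ is the interval of radius \delta , minus the pinpoint hole at ‘a’ itself.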

I know. I felt like that the first couple times through the subject too. The definition feels backward. Worse, it feels like it begs the question. We suppose there’s an ‘L’ and then test these properties about it and then if it works we say we’re done? I know. It’s a pain when you start calculating this with specific formulas and all that, too. But supposing there is an answer and then learning properties about it, including whether it can exist? That’s a slick trick. We can use it.

Thing is, the pain is worth it. We can calculate with it and not have to out-think tricky functions. It works for domains with as many dimensions as you need. It works for limits that aren’t inside the domain. It works with domains and ranges that aren’t real numbers. It works for functions with weird and complicated domains. We can adapt it if we want to consider limits that are constrained in some way. It won’t be fooled by tricks like I put up above, the f(x) with different rules for the rational and irrational numbers.

So mathematicians shrug, and do enough problems that they get the hang of it, and use this definition. It’s worth it, once you get there.


This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. And I’m still taking nominations for discussion topics, if you’d like to see mathematics terms explained. I know I would.

I’m Looking For The Next Set Of Topics For My Fall 2018 Mathematics A-To-Z


We’re at the end of another month. So it’s a good chance to set out requests for the next several weeks’ worth of my mathematics A-To-Z. As I say, I’ve been doing this piecemeal so that I can keep track of requests better. I think it’s been working out, too.

If you have any mathematical topics with a name that starts N through T, let me know! I usually go by a first-come, first-serve basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me.

Also when you do make a request, please feel free to mention your blog, Twitter feed, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.


So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences. I probably don’t want to revisit them. But I will think over, if I get a request, whether I might have new opinions.

Excerpted From The Summer 2015 A To Z


Excerpted From The Leap Day 2016 A To Z


Excerpted From The Summer 2016 A To Z


Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

Available Letters for the Fall 2018 A To Z:

  • N
  • O
  • P
  • Q
  • R
  • S
  • T

All of my Fall 2018 Mathematics A-To-Z should appear at this link. And it’ll have some extra stuff like these topic-request pages and such.

My 2018 Mathematics A To Z: Kelvin (the scientist)


Today’s request is another from John Golden, @mathhombre on Twitter and similarly on Blogspot. It’s specifically for Kelvin — “scientist or temperature unit”, the sort of open-ended goal I delight in. I decided on the scientist. But that’s a lot even for what I honestly thought would be a quick little essay. So I’m going to take out a tiny slice of a long and amazingly fruitful career. There’s so much more than this.

Before I get into what I did pick, let me repeat an important warning about historical essays. Every history is incomplete, yes. But any claim about something being done for the first time is simplified to the point of being wrong. Any claim about an individual discovering or inventing something is simplified to the point of being wrong. Everything is more complicated and, especially, more ambiguous than this. If you do not love the challenge of working out a coherent narrative when the most discrete and specific facts are also the ones that are trivia, do not get into history. It will only break your heart and mislead your readers. With that disclaimer, let me try a tiny slice of the life of William Thomson, the Baron Kelvin.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Kelvin (the scientist).

The great thing about a magnetic compass is that it’s easy. Set the thing on an axis and let it float freely. It aligns itself to the magnetic poles. It’s easy to see why this looks like magic.

The trouble is that it’s not quite right. It’s near enough for many purposes. But the direction a magnetic compass points out as north is not the true geographic north. Fortunately, we’ve got a fair idea just how far off north that is. It depends on where you are. If you have a rough idea where you already are, you can make a correction. We can print up charts saying how much of a correction to make.

The trouble is that it’s still not quite right. The location of the magnetic north and south poles wanders. Fortunately we’ve got a fair idea of how quickly it’s moving, and in what direction. So if you have a rough idea how out of date your chart is, and what direction the poles were moving in, you can make a correction. We can communicate how the variance between true north and magnetic north changes over time.

The trouble is that it’s still not quite right. The size of the variation depends on the season of the year. But all right; we should have a rough idea what season it is. We can correct for that. The size of the variation also depends on what time of day it is. Compasses point farther east at around 8 am (sun time) than they do the rest of the day, and farther west around 1 pm. At least they did when Alan Gurney’s Compass: A Story of Exploration and Innovation was published. I would be unsurprised if that’s changed since the book came out a dozen years ago. Still. These are all, we might say, global concerns. They’re based on where you are and when you look at the compass. But they don’t depend on you, the specific observer.

The trouble is that it’s still not quite right yet. Almost as soon as compasses were used for navigation, on ships, mariners noticed the compass could vary. And not just because compasses were often badly designed and badly made. The ships themselves got in the way. The problem started with guns, the iron of which led compasses astray. When it was just the ship’s guns the problem could be coped with. Set the compass binnacle far from any source of iron, and the error should be small enough.

The trouble is when the time comes to make ships with iron. There are great benefits you get from cladding ships in iron, or making them of iron altogether. Losing the benefits of navigation, though … that’s a bit much.

There’s an obvious answer. Suppose you know the construction of the ship throws off compass bearings. Then measure what the compass reads, at some point when you know what it should read. Use that to correct your measurements when you aren’t sure. From the early 1800s mariners could use a method called “swinging the ship”, setting the ship at known angles and comparing what the compass read with what it should have read. It’s a bit of a chore. And you should arrange things you need to do so that it’s harder to make a careless mistake at them.

In the 1850s John Gray of Liverpool patented a binnacle — the little pillar that holds the compass — which used the other obvious but brilliant approach. If the iron which builds the ship sends the compass awry, why not put iron near the compass to put the compass back where it should be? This set up a contraption of a binnacle surrounded by adjustable, correcting magnets.

Enter finally William Thomson, who would become Baron Kelvin in 1892. In 1871 the magazine Good Words asked him to write an article about the marine compass. In 1874 he published his first essay on the subject. The second part appeared five years after that. I am not certain that this is directly related to the tiny slice of story I tell. I just mention it to reassure every academic who’s falling behind on their paper-writing, which is all of them.

But come the 1880s Thomson patented an improved binnacle. Thomson had the sort of talents normally associated only with the heroes of some lovable yet dopey space-opera of the 1930s. He was a talented scientist, competent in thermodynamics and electricity and magnetism and fluid flow. He was a skilled mathematician, as you’d need to be to keep up with all that and along the way prove the Stokes theorem. (This is one of those incredibly useful theorems that gives information about the interior of a volume using only integrals over the surface.) He was a magnificent engineer, with a particular skill at developing instruments that would brilliantly measure delicate matters. He’s famous for saving the trans-Atlantic telegraph cable project. He recognized that what was needed was not more voltage to drive signal through three thousand miles of dubiously made copper wire, but rather ways to pick up the feeble signals that could come across, and amplify them into usability. And also described the forces at work on a ship that is laying a long line of submarine cable. And he was a manufacturer, able to turn these designs into mass-produced products. This through collaborating with James White, of Glasgow, for over half a century. And a businessman, able to convince people and organizations to use the things. He’s an implausible protagonist; and yet, there he is.

Thomson’s revision for the binnacle made it simpler. A pair of spheres, flanking the compass, and adjustable. The Royal Museums Greenwich web site offers a picture of this sort of system. It’s not so shiny as others in the collection. But this angle shows how adjustable the system would be. It’s a design that shows brilliance behind it. What work you might have to do to use it is obvious. At least it’s obvious once you’re told the spheres are adjustable. To reduce a massive, lingering, challenging problem to something easy is one of the great accomplishments of any practical mathematician.

This was not all Thomson did in maritime work. He’d developed an analog computer which would calculate the tides. Wikipedia tells me that Thomson claimed a similar mechanism could solve arbitrary differential equations. I’d accept that claim, if he made it. Thomson also developed better tools for sounding depths. And developed compasses proper, not just the correcting tools for binnacles. A maritime compass is a great practical challenge. It has to be able to move freely, so that it can give a correct direction even as the ship changes direction. But it can’t move too freely, or it becomes useless in rolling seas. It has to offer great precision, or it loses its use in directing long journeys. It has to be quick to read, or it won’t be consulted. Thomson designed a compass that was, my readings indicate, a great fit for all these constraints. By the time of his death in 1907 Kelvin and White (the company had various names) had made something like ten thousand compasses and binnacles.

And this from a person attached to all sorts of statistical mechanics stuff and who’s important for designing electrical circuits and the like.


This and other Fall 2018 Mathematics A-To-Z posts can be read at this link.

My 2018 Mathematics A To Z: Jokes


For today’s entry, Iva Sallay, of Find The Factors, gave me an irresistible topic. I did not resist.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Jokes.

What’s purple and commutes?
An Abelian grape.

Whatever else you say about mathematics we are human. We tell jokes. I will tell some here. You may not understand the words in them. That’s all right. From the Abelian grape there, you gather this is some manner of wordplay. A pun, particularly. It’s built on a technical term. “Abelian groups” come from (not high school) Algebra. In an Abelian group, the group multiplication commutes. That is, if ‘a’ and ‘b’ are any things in the group, then their product “ab” is the same as “ba”. That is, the group works like ordinary addition on numbers does. We say “Abelian” in honor of Niels Henrik Abel, who taught us some fascinating stuff about polynomials. Puns are a common kind of humor. So common, they’re almost base. Even a good pun earns less laughter than groans.

But mathematicians make many puns. A typical page of mathematics jokes has a whole section of puns. “What’s yellow and equivalent to the Axiom of Choice? Zorn’s Lemon.” “What’s nonorientable and lives in the sea? Möbius Dick.” “One day Jesus said to his disciples, `The Kingdom of Heaven is like 3x^2 + 8x - 9′. Thomas looked very confused and asked Peter, `What does the teacher mean?’ Peter replied, `Don’t worry. It’s just another one of his parabolas’.” And there are many jokes built on how it is impossible to tell the difference between the sounds of “π” and “pie”.

It shouldn’t surprise that mathematicians make so many puns. Mathematics trains people to know definitions. To think about precisely what we mean. Puns ignore definitions. They build nonsense out of the ways that sounds interact. Mathematicians practice how to make things interact, even if they don’t know or care what the underlying things are. If you’ve gotten used to proving things about aba^{-1}b^{-1} , without knowing what ‘a’ or ‘b’ are, it’s difficult to avoid turning “poles on the half-plane” (which matters in some mathematical physics) into a story about Polish people on an aircraft.

Popeye's lousy tutor: 'Today I am going to test you at mental multiplication. Quick, how much is 6 1/2 times 656? Quick!' Popeye: '4,264.' 'Right!' 'Blow me down! Anybody what can guess like that don't need no edjacation!'
Elzie Segar’s Thimble Theater from the 14th of September, 1929. Rerun on ComicsKingdom on the 26th of February, 2016. That’s Bernice, the magical Whiffle Hen, as the strange birdlike creature in the last panel there.

If there’s a flaw to this kind of humor it’s that these jokes may sound juvenile. One of the first things that strikes kids as funny is that a thing might have several meanings. Or might sound like another thing. “Why do mathematicians like parks? Because of all the natural logs!”

Jokes can be built tightly around definitions. “What do you get if you cross a mosquito with a mountain climber? Nothing; you can’t cross a vector with a scalar.” “There are 10 kinds of people in the world, those who understand binary mathematics and those who don’t.” “Life is complex; it has real and imaginary parts.”

Paige: 'I keep forgetting ... what's the cosine of 60 degrees?' Jason: 'Well, let's see. If I recall correctly ... 1 - (pi/3)^2/2! + (pi/3)^4/4! - (pi/3)^6/6! + (pi/3)^8/8! - (pi/3)^10/10! + (pi/3)^12/12! - (and this goes on a while, up to (pi/3)^32/32! - ... )' Paige: 'In case you've forgotten, I'm not paying you by the hour.' Jason: '1/2'.
Bill Amend’s FoxTrot Classics for the 23rd of May, 2018. It originally ran the 29th of May, 1996.

There are more sophisticated jokes. Many of them are self-deprecating. “A mathematician is a device for turning coffee into theorems.” “An introvert mathematician looks at her shoes while talking to you. An extrovert mathematician looks at your shoes.” “A mathematics professor is someone who talks in someone else’s sleep”. “Two people are adrift in a hot air balloon. Finally they see someone and shout down, `Where are we?’ The person looks up, and studies them, watching the balloon drift away. Finally, when they are barely in shouting range, the person on the ground shouts back, `You are in a balloon!’ The first passenger curses their luck at running across a mathematician. `How do you know that was a mathematician?’ `Because her answer took a long time, was perfectly correct, and absolutely useless!”’ These have the form of being about mathematicians. But they’re not really. It would be the same joke to say “a poet is a device for turning coffee into couplets”, to make the sleep-talker anyone who teaches, or to have the hot-air balloonists discover a lawyer or a consultant.

Some of these jokes get more specific, with mathematics harder to extract from the story. The tale of the nervous flyer who, before going to the conference, sends a postcard that she has a proof of the Riemann hypothesis. She arrives and admits she has no such thing, of course. But she sends that word ahead of every conference. She knows if she died in a plane crash after that, she’d be famous forever, and God would never give her that. (I wonder if Ian Randal Strock’s little joke of a story about Pierre de Fermat was an adaptation of this joke.) You could recast the joke for physicists uniting gravity and quantum mechanics. But I can’t imagine a way to make this joke about an ISO 9000 consultant.

'If it's a hunnert miles to th' city an' a train is travelin' thurty miles an hour is due t'arrive at 5:00 pm --- what time does th' train leave Hootin' Holler, Jughaid?' 'I dunno, Miz Prunelly, but you better go now jest t'be on th' safe side!!'
John Rose’s Barney Google and Snuffy Smith for the 12th of February, 2016.

A dairy farmer knew he could be milking his cows better. He could surely get more milk, and faster, if only the operations of his farm were arranged better. So he hired a mathematician to find the optimal way to configure everything. The mathematician toured every part of the pastures, the milking barn, the cows, everything relevant. And then the mathematician set to work devising a plan for the most efficient possible cow-milking operation. The mathematician declared, “First, assume a spherical cow.”

This joke is very mathematical. I know of no important results actually based on spherical cows. But the attitude that tries to make spheres of cows comes from observing mathematicians. To describe any real-world process is to make a model of that thing. A model is a simplification of the real thing. You suppose that things behave more predictably than the real thing. You trust the error made by this supposition is small enough for your needs. A cow is complicated, all those pointy ends and weird contours. A sphere is easy. And, besides, cows are funny. “Spherical cow” is a funny string of sounds, at least in English.

The spherical cow approaches parody of the work mathematicians do. Many mathematical jokes are burlesques of deductive logic. Or not even burlesques. Charles Dodgson, known to humans as Lewis Carroll, wrote this in Symbolic Logic:

“No one, who means to go by the train and cannot get a conveyance, and has not enough time to walk to the station, can do without running;
This party of tourists mean to go by the train and cannot get a conveyance, but they have plenty of time to walk to the station.
∴ This party of tourists need not run.”

[ Here is another opportunity, gentle Reader, for playing a trick on your innocent friend. Put the proposed Syllogism before him, and ask him what he thinks of the Conclusion.

He will reply “Why, it’s perfectly correct, of course! And if your precious Logic-book tells you it isn’t, don’t believe it! You don’t mean to tell me those tourists need to run? If I were one of them, and knew the Premises to be true, I should be quite clear that I needn’t run — and I should walk!

And you will reply “But suppose there was a mad bull behind you?”

And then your innocent friend will say “Hum! Ha! I must think that over a bit!” ]

The punch line is defused by the text being so educational. And by being written in the 19th century, when it was bad form to excise any word from any writing. But you can recognize the joke, and why it should be a joke.

Not every mathematical-reasoning joke features some manner of cattle. Some are legitimate:

Claim. There are no uninteresting whole numbers.
Proof. Suppose there is a smallest uninteresting whole number. Call it N. That N is uninteresting is an interesting fact. Therefore N is not an uninteresting whole number.

Three mathematicians step up to the bar. The bartender asks, “You all want a beer?” The first mathematician says, “I don’t know.” The second mathematician says, “I don’t know.” The third says, “Yes”.

Some mock reasoning uses nonsense methods to get a true conclusion. It’s the fun of watching Mister Magoo walk unharmed through a construction site to find the department store exchange counter:

5095 / 1019 = 505 / 101 = 55 / 11 = 5

This one includes the thrill of division by zero.
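
The mock arithmetic does land on a true answer, which you can check for yourself. Here’s a little Python sketch of mine, nothing from the original joke, confirming that every fraction in the chain really equals 5; only the digit-cancelling that links them is nonsense.

    from fractions import Fraction

    # each step of the joke 'cancels' a digit; the values happen to agree anyway
    steps = [(5095, 1019), (505, 101), (55, 11)]
    for num, den in steps:
        print(num, '/', den, '=', Fraction(num, den))   # each one prints 5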

The Venn Diagram of Grocery Shopping. Overlap 'have teenagers', 'haven't grocery shopped in two weeks', and 'grocery shopping on an empty stomach' and you get 'will need to go back in two days', 'bought entire bakery aisle', and 'bought two of everything'. Where they all overlap, 'need to take out second mortgage'.
Terri Libenson’s Pajama Diaries for the 16th of November, 2016. I was never one for buying too much of the bakery aisle, myself, but then I also haven’t got teenagers. And I did go through so much of my life figuring there was no reason I shouldn’t eat another bagel again.

Venn Diagrams are not by themselves jokes (most of the time). But they are a great structure for jokes. And easy to draw, which is great for us who want to be funny but don’t feel sure about our drafting abilities.

And then there are personality jokes. Mathematics encourages people to think obsessively. Obsessive people are often funny people. Alexander Grothendieck was one of the candidates for “greatest 20th century mathematician”. His reputation is that he worked so well on abstract problems that he was incompetent at practical ones. The story goes that he was demonstrating something about prime numbers and his audience begged him to speak about a specific number, that they could follow an example. And that he grumbled a bit and, finally, said, “57”. It’s not a prime number. But if you speak of “Grothendieck’s prime”, many will recognize what you mean, and grin.

There are more outstanding, preposterous personalities. Paul Erdös was prolific, and a restless traveller. The stories go that he would show up at some poor mathematician’s door and stay with them several months. And then co-author a paper with the elevator operator. (Erdös is also credited as the originator of the “coffee into theorems” quip above.) John von Neumann was supposedly presented with this problem:

Two trains are on the same track, 60 miles apart, heading toward each other, each travelling 30 miles per hour. A fly, travelling 60 miles per hour, sets out from one engine, flying toward the other. When it reaches the other engine it turns around immediately and flies back to the first. This is repeated until the two trains crash. How far does the fly travel before the crash?

The first, hard way to do this is to realize that how far the fly travels is the sum of a series. The fly starts at, let’s say, the left engine and flies to the right. Add to that the distance from the right to the left train now. Then left to the right again. Right to left. This is a bunch of calculations. Most people give up on that and realize the problem is easier. The trains will crash in one hour. The fly travels 60 miles per hour for an hour. It’ll fly 60 miles total. John von Neumann, say witnesses, had the answer instantly. Had he recognized the trick? “I summed the series.”
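
You can watch the hard way converge on the easy answer, too. This is a Python sketch of my own, using the numbers from the puzzle: the fly’s legs form a geometric series, each leg one-third the length of the one before, and the sum closes in on 60 miles.

    gap = 60.0      # miles between the trains
    total = 0.0     # miles the fly has flown so far
    for _ in range(40):              # forty legs is plenty for convergence
        t = gap / (60.0 + 30.0)      # hours until the fly meets the oncoming train
        total += 60.0 * t            # the fly covers 60*t miles on this leg
        gap -= 60.0 * t              # the trains close 30*t miles each
    print(total)                     # prints, effectively, 60.0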

Henry is frustrated with his arithmetic, until he goes to the pool hall and counts off numbers on those score chips.
Don Trachte’s Henry for the 6th of September, 2015.

The personalities can be known more remotely, from a handful of facts about who they were or what they did. “Cantor did it diagonally.” Georg Cantor is famous for great thinking about infinitely large sets. His “diagonal proof” shows the set of real numbers must be larger than the set of rational numbers. “Fermat tried to do it in the margin but couldn’t fit it in.” “Galois did it on the night before.” (Évariste Galois wrote out important pieces of group theory the night before a duel. It went badly for him. French politics of the 1830s.) Every field has its celebrities. Mathematicians learn just enough about theirs to know a couple of jokes.

Anthropomorphic 3/5: 'Honey, what's wrong?' Anthropomorphic 1/4: 'Our son is leaving the faith! He said he's converting to decimals!'
Scott Hilburn’s The Argyle Sweater for the 9th of May, 2018. I like the shout-out to Archimedes in the background art, too. Archimedes, though, didn’t use fractions in the way we’d recognize them. He’d write out a number as a combination of ratios of some reference number. So he might estimate the length of something as being to the length of something else as 19 is to 7, or something like that. This seems like a longwinded and cumbersome way to write out numbers, or much of anything, and makes one appreciate his indefatigability as much as his insight.

The jokes can attach to a generic mathematician personality. “How can you possibly visualize something that happens in a 12-dimensional space?” “Easy, first visualize it in an N-dimensional space, and then let N go to 12.” Three statisticians go hunting. They spot a deer. One shoots, missing it on the left. The second shoots, missing it on the right. The third leaps up, shouting, “We’ve hit it!” An engineer and a mathematician are sleeping in a hotel room when the fire alarm goes off. The engineer ties the bedsheets into a rope and shimmies out of the room. The mathematician looks at this, unties the bedsheets, sets them back on the bed, declares, “this is a problem already solved” and goes back to sleep. (Engineers and mathematicians pair up a lot in mathematics jokes. I assume in engineering jokes too, but that the engineers make wrong assumptions about who the joke is on. If there’s a third person in the party, she’s a physicist.)

Do I have a favorite mathematics joke? I suppose I must. There are jokes I like better than others, and there are — I assume — finitely many different mathematics jokes. So I must have a favorite. What is it? I don’t know. It must vary with the day and my mood and the last thing I thought about. I know a bit of doggerel keeps popping into my head, unbidden. Let me close by giving it to you.

Integral z-squared dz
From 1 to the cube root of 3
   Times the cosine
   Of three π over nine
Equals log of the cube root of e.

This may not strike you as very funny. I’m not sure it strikes me as very funny. But it keeps showing up, all the time. That has to add up.
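
For the record, the doggerel’s arithmetic checks out, reading the log as the natural logarithm. The integral is \left[\frac{z^3}{3}\right]_1^{\sqrt[3]{3}} = \frac{3 - 1}{3} = \frac{2}{3} . The cosine is \cos\left(\frac{3\pi}{9}\right) = \cos\left(\frac{\pi}{3}\right) = \frac{1}{2} . Their product is \frac{1}{3} , which is indeed the log of the cube root of e.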


This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. Also, now and then, I talk about comic strips here. You might like that too.

My 2018 Mathematics A To Z: Infinite Monkey Theorem


Dina Yagodich gave me the topic for today. She keeps up a YouTube channel with a variety of interesting videos. And she did me a favor. I’ve been meaning for a long while to write a major post about this theorem. Its subject turns up so often. I’d wanted to have a good essay about it. I hope this might be one.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Infinite Monkey Theorem.

Some mathematics escapes mathematicians and joins culture. This is one such. The monkeys are part of why. They’re funny and intelligent and sad and stupid and deft and clumsy, and they can sit at a keyboard and look almost in place. They’re so like humans, except that we empathize with them. To imagine lots of monkeys, and to put them to some silly task, is compelling.

Monkey Typewriter Theory: An immortal monkey pounding on a typewriter will eventually reproduce the text of 'Hamlet'. Baby Keyboard Theory: Left alone, a baby pounding on a computer keyboard will eventually order 32 cases of bathroom caulk from an online retailer.
Paul Trapp’s Thatababy for the 13th of February, 2014.

The metaphor traces back to a 1913 article by the mathematical physicist Émile Borel which I have not read. Searching the web I find much more comment about it than I find links to a translation of the text. And only one copy of the original, in French. And that page wants €10 for it. So I can tell you what everybody says was in Borel’s original text, but can’t verify it. The paper’s title is “Statistical Mechanics and Irreversibility”. From this I surmise that Borel discussed one of the great paradoxes of statistical mechanics. If we open a bottle of one gas in an airtight room, it disperses through the room. Why doesn’t every molecule of gas just happen, by chance, to end up back where it started? It does seem that if we waited long enough, it should. It’s unlikely it would happen on any one day, but give it enough days …

But let me turn to many web sites that are surely not all copying Wikipedia on this. Borel asked us to imagine a million monkeys typing ten hours a day. He posited it was possible but extremely unlikely that they would exactly replicate all the books of the richest libraries of the world. But that would be more likely than the atmosphere in a room un-mixing like that. Fair enough, but we’re not listening anymore. We’re thinking of monkeys. Borel’s is a fantastic image. It would see some adaptation in the years since. Physicist Arthur Eddington, in 1928, made it an army of monkeys, with their goal being to write all the books in the British Museum. By 1960 Bob Newhart had an infinite number of monkeys and typewriters, and a goal of all the great books. Stating the premise got a laugh then; I doubt the setup alone would get one today. I’m curious whether Newhart brought the idea to the mass audience. (Google NGrams for “monkeys at typewriters” suggest that phrase was unwritten, in books, before about 1965.) We may owe Bob Newhart thanks for a lot of monkeys-at-typewriters jokes.

Kid: 'Mom, Dad, I want to go bungee jumping this summer!' Dad: 'A thousand monkeys working a thousand typewriters would have a better chance of randomly typing the complete works of William Shakespeare over the summer than you have of bungee jumping.' (Awkward pause.) Kid: 'What's a typewriter?' Dad: 'A thousand monkeys randomly TEXTING!'
Bill Hinds’s Cleats rerun for the 1st of July, 2018.

Newhart has a monkey hit on a line from Hamlet. I don’t know if it was Newhart that set the monkeys after Shakespeare particularly, rather than some other great work of writing. Shakespeare does seem to be the most common goal now. Sometimes the number of monkeys diminishes, to a thousand or even to one. Some people move the monkeys off of typewriters and onto computers. Some take the cowardly measure of putting the monkeys at “keyboards”. The word is ambiguous enough to allow for typewriters, computers, and maybe a Mergenthaler Linotype. The monkeys now work 24 hours a day. This will be a comment someday about how bad we allowed pre-revolutionary capitalism to get.

The cultural legacy of monkeys-at-keyboards might well itself be infinite. It turns up in comic strips every few weeks at least. Television shows, usually writing for a comic beat, mention it. Computer nerds doing humor can’t resist the idea. Here’s a video of a 1979 Apple ][ program titled THE INFINITE NO. OF MONKEYS, which used this idea to show programming tricks. And it’s a great philosophical test case. If a random process puts together a play we find interesting, has it created art? No deliberate process creates a sunset, but we can find in it beauty and meaning. Why not words? There’s likely a book to write about the infinite monkeys in pop culture. Though the quotations of original materials would start to blend together.

But the big question. Have the monkeys got a chance? In a break from every probability question ever, the answer is: it depends on what the question precisely is. Occasional real-world experiments-cum-art-projects suggest that actual monkeys are worse typists than you’d think. They do more of bashing the keys with a stone before urinating on it, a reminder of how slight is the difference between humans and our fellow primates. So we turn to abstract monkeys who behave more predictably, and run experiments that need no ethical oversight.

Toby: 'So this English writer is like a genius, right? And he's the greatest playwright ever. And I want to be just like him! Cause what he does, see, is he gets infinite monkeys on typewriters and just lets 'em go nuts, so eventually they write ALL of Shakespeare's plays!' Brother: 'Cool! And what kind of monkey is an 'infinite'?' Toby: 'Beats me, but I hope I don't have to buy many of them.' Dad: 'Toby, are you *sure* you completely pay attention when your teachers are talking?' Toby: 'What? Yes! Why?'
Greg Cravens’ The Buckets for the 30th of March, 2014.

So we must think what we mean by Shakespeare’s Plays. Arguably the play is a specific performance of actors in a set venue doing things. This is a bit much to expect of even a skilled abstract monkey. So let us switch to the book of a play. This has a more clear representation. It’s a string of characters. Mostly letters, some punctuation. Good chance there’s numerals in there. It’s probably a lot of characters. So the text to match is some specific, long string of characters in a particular order.

And what do we mean by a monkey at the keyboard? Well, we mean some process that picks characters randomly from the allowed set. When I see something is picked “randomly” I want to know what the distribution rule is. Like, are Q’s exactly as probable as E’s? As &’s? As %’s? How likely it is a particular string will get typed is easiest to answer if we suppose a “uniform” distribution. This means that every character is equally likely. We can quibble about capital and lowercase letters. My sense is most people frame the problem supposing case-insensitivity. That the monkey is doing fine to type “whaT beArD weRe i BEsT tO pLAy It iN?”. Or we could set the monkey at an old typesetter’s station, with separate keys for capital and lowercase letters. Some will even forgive the monkeys punctuating terribly. Make your choices. It affects the numbers, but not the point.

Literary Calendar. Several jokes, including: Saturday 7pm: an infinite number of chimpanzees discuss their multi-volume 'Treasury of Western Literature with no Typos' at the Museum of Natural History. Nit picking to follow.
Richard Thompson’s Richard’s Poor Almanac rerun for the 7th of November, 2016.

I’ll suppose there are 91 characters to pick from, as a Linotype keyboard had. So the monkey has capitals and lowercase and common punctuation to get right. Let your monkey pick one character. What is the chance it hit the first character of one of Shakespeare’s plays? Well, the chance is 1 in 91 that you’ve hit the first character of one specific play. There’s several dozen plays your monkey might be typing, though. I bet some of them even start with the same character, so giving an exact answer is tedious. If all we want is monkey-typed Shakespeare plays, we’re being fussy if we want The Tempest typed up first and Cymbeline last. If we want a more tractable problem, it’s easier to insist on a set order.

So suppose we do have a set order. Then there’s a one-in-91 chance the first character matches the first character of the desired text. A one-in-91 chance the second character typed matches the second character of the desired text. A one-in-91 chance the third character typed matches the third character of the desired text. And so on, for the whole length of the play’s text. Getting one character right doesn’t make it more or less likely the next one is right. So the chance of getting a whole play correct is \frac{1}{91} raised to the power of however many characters are in the first script. Call it 800,000 for argument’s sake. More characters, if you put two spaces between sentences. The prospects of getting this all correct are … dismal.
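
How dismal? Here’s a quick Python sketch, mine and nobody’s authoritative calculation, using the 91 characters and the 800,000-character script length assumed above. Logarithms are the only way to even write the number down.

    import math

    characters = 800_000                          # assumed length of the script
    log10_chance = -characters * math.log10(91)   # log base 10 of (1/91)^800,000
    print(round(log10_chance))                    # about -1,567,000
    # that is, a chance of roughly 10^(-1,567,000): a decimal point followed
    # by better than a million and a half zeroes before the first 1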

I mean, there’s some cause for hope. Spelling was much less fixed in Shakespeare’s time. There are acceptable variations for many of his words. It’d be silly to rule out a possible script that (say) wrote “look’d” or “look’t”, rather than “looked”. Still, that’s a slender thread.

Proverb Busters: testing the validity of old sayings. Doctor: 'A hundred monkeys at a hundred typewriters. Over time, will one of them eventually write a Shakespeare play?' Winky: 'Nope. Just the script for Grown-Ups 3'. Doctor: 'Another proverb busted.'
Tim Rickard’s Brewster Rockit for the 1st of April, 2014.

But there is more reason to hope. Chances are the first monkey will botch the first character. But what if they get the first character of the text right on the second character struck? Or on the third character struck? It’s all right if there’s some garbage before the text comes up. Many writers have trouble starting and build from a first paragraph meant to be thrown away. After every wrong letter is a new chance to type the perfect thing, reassurance for us all.

Since the monkey does type, hypothetically, forever … well. Each keystroke has a probability of only \left(\frac{1}{91}\right)^{800,000} (or whatever) of starting the lucky sequence. But the monkey will have 91^{800,000} chances to start. More chances than that.

And we don’t have only one monkey. We have a thousand monkeys. At least. A million monkeys. Maybe infinitely many monkeys. Each one, we trust, is working independently, owing to the monkeys’ strong sense of academic integrity. There are 91^{800,000} monkeys working on the project. And more than that. Each one takes their chance.

Melvin: 'Hold on now --- replacement? Who could you find to do all the tasks only Melvin can perform?' Rita: 'A macaque, in fact. Listen, if an infinite number of monkeys can write all the great works, I'm confident that one will more than cover for you.'
John Zakour and Scott Roberts’s Working Daze for the 29th of May, 2018.

There are dizzying possibilities here. There’s the chance some monkey will get it all exactly right first time out. More. Think of a row of monkeys. What’s the chance the first thing the first monkey in the row types is the first character of the play? What’s the chance the first thing the second monkey in the row types is the second character of the play? The chance the first thing the third monkey in the row types is the third character in the play? What’s the chance a long enough row of monkeys happen to hit the right buttons so the whole play appears in one massive simultaneous stroke of the keys? Not any worse than the chance your one monkey will type this all out. Monkeys at keyboards are ergodic. It’s as good to have a few monkeys working a long while as to have many monkeys working a short while. The Mythical Man-Month is, for this project, mistaken.
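
If you’d like to see the waiting times in miniature, here’s a Python simulation of my own devising. It sets one abstract monkey after a tiny two-character target, drawn uniformly from a 91-character keyboard. A target that can’t overlap itself should take about 91^2 , or 8,281, keystrokes on average.

    import random

    ALPHABET = [chr(c) for c in range(33, 124)]    # 91 printable characters

    def keystrokes_until(target):
        window, count = '', 0
        while window != target:
            window = (window + random.choice(ALPHABET))[-len(target):]
            count += 1
        return count

    trials = [keystrokes_until('ab') for _ in range(200)]
    print(sum(trials) / len(trials))               # hovers around 8,281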

That solves it then, doesn’t it? A monkey, or a team of monkeys, has a nonzero probability of typing out all Shakespeare’s plays. Or the works of Dickens. Or of Jorge Luis Borges. Whatever you like. Given infinitely many chances at it, they will, someday, succeed.

Except.

A thousand monkeys at a thousand typewriters ... will eventually write 'Hamlet'. A thousand cats at a thousand typewriters ... will tell you go to write your own danged 'Hamlet'.
Doug Savage’s Savage Chickens for the 14th of August, 2018.

What is the chance that the monkeys screw up? They get the works of Shakespeare just right, but for a flaw. The monkeys’ Midsummer Night’s Dream insists on having the fearsome lion played by “Smaug the joiner” instead. This would send the play-within-the-play in novel directions. The result, though interesting, would not be Shakespeare. There’s a nonzero chance they’ll write the play that way. And so, given infinitely many chances, they will.

What’s the chance that they always will? That they just miss every single chance to write “Snug”. It comes out “Smaug” every time?

Eddie: 'You know the old saying about putting an infinite number of monkeys at an infinite number of typewriters, and eventually they'll accidentally write Shakespeare's plays?' Toby: 'I guess.' Eddie: 'My English teacher says that nothing about our class should worry those monkeys ONE BIT!'
Greg Cravens’s The Buckets for the 6th of October, 2018.

We can say. Call the probability that they make this Snug-to-Smaug typo any given time p . That’s a number from 0 to 1. 0 corresponds to never making this mistake; 1 to certainly making it. The chance they get it right on a try is 1 - p . The chance they make this mistake twice in two tries is p^2 , smaller than p . The chance that they get it right at least once in two tries is 1 - p^2 , closer to 1 than 1 - p is. The chance that, given three tries, they make the mistake every time is p^3 , smaller still. The chance that they get it right at least once in those three tries is 1 - p^3 , closer again to 1.

You see where this is going. Every extra try makes the chance they got it wrong every time smaller. Every extra try makes the chance they get it right at least once bigger. And now we can let some analysis come into play.

So give me a positive number. I don’t know your number, so I’ll call it ε. It’s how unlikely you want something to be before you say it won’t happen. Whatever your ε was, I can give you a number M . If the monkeys have taken more than M tries, the chance they get it wrong every single time is smaller than your ε. The chance they get it right at least once is bigger than 1 – ε. Let the monkeys have infinitely many tries. The chance the monkey gets it wrong every single time is smaller than any positive number. So the chance the monkey gets it wrong every single time is zero. It … can’t happen, right? The chance they get it right at least once is closer to 1 than to any other number. So it must be 1. So it must be certain. Right?
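
Producing the M is no trouble, either. A small Python sketch, with numbers I made up for illustration: since the logarithm of p is negative, p^M drops below ε once M passes log ε divided by log p.

    import math

    def tries_needed(eps, p):
        # smallest whole M with p**M < eps, for 0 < p < 1
        return math.floor(math.log(eps) / math.log(p)) + 1

    p = 0.9999      # the monkeys nearly always write 'Smaug'
    eps = 1e-12     # how unlikely 'never once Snug' has to be for you
    M = tries_needed(eps, p)
    print(M, p**M < eps)     # about 276,000 tries, and True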

Poncho, the dog, looking over his owner's laptop: 'They say if you let an infinite number of cats walk on an infinite number of keyboards, they'll eventually type all the great works of Shakespeare.' The cat walks across the laptop, connecting to their owner's bank site and entering the correct password. Poncho: 'I'll take it.'
Paul Gilligan’s Pooch Cafe for the 17th of September, 2018.

But let me give you this. Detach a monkey from typewriter duty. This one has a coin to toss. It tosses fairly, with the coin having a 50% chance of coming up tails and 50% chance of coming up heads each time. The monkey tosses the coin infinitely many times. What is the chance the coin comes up tails every single one of these infinitely many times? The chance is zero, obviously. At least you can show the chance is smaller than any positive number. So, zero.

Yet … what power enforces that? What forces the monkey to eventually have a coin come up heads? It’s … nothing. Each toss is a fair toss. Each toss is independent of its predecessors. But there is no force that causes the monkey, after a hundred million billion trillion tosses of “tails”, to then toss “heads”. It’s the gambler’s fallacy to think there is one. The hundred million billion trillionth-plus-one toss is as likely to come up tails as the first toss is. It’s impossible that the monkey should toss tails infinitely many times. But there’s no reason it can’t happen. It’s also impossible that the monkeys still on the typewriters should get Shakespeare wrong every single time. But there’s no reason that can’t happen.

It’s unsettling. Well, probability is unsettling. If you don’t find it disturbing you haven’t thought long enough about it. Infinities are unsettling too.

Researcher overseeing a room of monkeys: 'Shakespeare would be OK, but I'd prefer they come up with a good research grant proposal.'
John Deering’s Strange Brew for the 20th of February, 2014.

Formally, mathematicians interpret this — if not explain it — by saying the set of things that can happen is a “probability space”. The likelihood of something happening is what fraction of the probability space matches something happening. (I’m skipping a lot of background to say something that simple. Do not use this at your thesis defense without that background.) This sort of “impossible” event has “measure zero”. So its probability of happening is zero. Measure turns up in analysis, in understanding how calculus works. It complicates a bunch of otherwise-obvious ideas about continuity and stuff. It turns out to apply to probability questions too. Imagine the space of all the things that could possibly happen as being the real number line. Pick one number from that number line. What is the chance you have picked exactly the number -24.11390550338228506633488? I’ll go ahead and say you didn’t. It’s not that you couldn’t. It’s not impossible. It’s just that the chance that this happened, out of the infinity of possible outcomes, is zero.

The infinite monkeys give us this strange set of affairs. Some things have a probability of zero of happening, which does not rule out that they can. Some things have a probability of one of happening, which does not mean they must. I do not know what conclusion Borel ultimately drew about the reversibility problem. I expect his opinion was that we have a clear answer, and unsettlingly great room for that answer to be incomplete.


This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. The next essay should come Friday and will, I hope, be shorter.

My 2018 Mathematics A To Z: Hyperbolic Half-Plane


Today’s term was one of several nominations I got for ‘H’. This one comes from John Golden, @mathhobre on Twitter and author of the Math Hombre blog on Blogspot. He brings in a lot of thought about mathematics education and teaching tools that you might find interesting or useful or, better, both.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Hyperbolic Half-Plane.

The half-plane part is easy to explain. By the “plane” mathematicians mean, well, the plane. What you’d get if a sheet of paper extended forever. Also if it had zero width. To cut it in half … well, first we have to think hard what we mean by cutting an infinitely large thing in half. Then we realize we’re overthinking this. Cut it by picking a line on the plane, and then throwing away everything on one side or the other of that line. Maybe throw away everything on the line too. It’s logically as good to pick any line. But there are a couple lines mathematicians use all the time. This is because they’re easy to describe, or easy to work with. At least once you fix an origin and, with it, x- and y-axes. The “right half-plane”, for example, is everything in the positive-x-axis direction. Every point with coordinates you’d describe with positive x-coordinate values. Maybe the non-negative ones, if you want the edge included. The “upper half plane” is everything in the positive-y-axis direction. All the points whose coordinates have a positive y-coordinate value. Non-negative, if you want the edge included. You can make guesses about what the “left half-plane” or the “lower half-plane” are. You are correct.

The “hyperbolic” part takes some thought. What is there to even exaggerate? Wrong sense of the word “hyperbolic”. The word here is the same one used in “hyperbolic geometry”. That takes explanation.

The Western mathematics tradition, as we trace it back to Ancient Greece and Ancient Egypt and Ancient Babylon and all, gave us “Euclidean” geometry. It’s a pretty good geometry. It describes how stuff on flat surfaces works. In the Euclidean formulation we set out a couple of axioms that aren’t too controversial. Like, lines can be extended indefinitely and all right angles are congruent. And one axiom that is controversial. But which turns out to be equivalent to the idea that, through a point not on a given line, there’s only one line parallel to that line.

And it turns out that you don’t have to assume that. You can make a coherent “spherical” geometry, one that describes shapes on the surface of a … you know. You have to change your idea of what a line is; it becomes a “geodesic” or, on the globe, a “great circle”. And it turns out that there are no geodesics that go through a point and are parallel to some other geodesic. (I know you want to think about globes. I do too. You maybe want to say the lines of latitude are parallel one another. They’re even called parallels, sometimes. So they are. But they’re not geodesics. They’re “little circles”. I am not throwing in ad hoc reasons I’m right and you’re not.)

There is another, though. This is “hyperbolic” geometry. This is the way shapes work on surfaces that mathematicians call saddle-shaped. I don’t know what the horse enthusiasts out there call these shapes. My guess is they chuckle and point out how that would be the most painful saddle ever. Doesn’t matter. We have surfaces. They act weird. You can draw, through a point, infinitely many lines parallel to a given other line.

That’s some neat stuff. That’s weird and interesting. They’re even called “hyperparallel lines” if that didn’t sound great enough. You can see why some people would find this worth studying. The catch is that it’s hard to order a pad of saddle-shaped paper to try stuff out on. It’s even harder to get a hyperbolic blackboard. So what we’d like is some way to represent these strange geometries using something easier to work with.

The hyperbolic half-plane is one of those approaches. This uses the upper half-plane. It works by a move as brilliant and as preposterous as that time Q told Data and LaForge how to stop that falling moon. “Simple. Change the gravitational constant of the universe.”

What we change here is the “metric”. The metric is a function. It tells us something about how points in a space relate to each other. It gives us distance. In Euclidean geometry, plane geometry, we use the Euclidean metric. You can find the distance between point A and point B by looking at their coordinates, (x_A, y_A) and (x_B, y_B) . This distance is \sqrt{\left(x_B - x_A\right)^2 + \left(y_B - y_A\right)^2} . Don’t worry about the formulas. The lines on a sheet of graph paper are a reflection of this metric. Each line is (normally) a fixed distance from its parallel neighbors. (Yes, there are polar-coordinate graph papers. And there are graph papers with logarithmic or semilogarithmic spacing. I mean graph paper like you can find at the office supply store without asking for help.)

But the metric is something we choose. There are some rules it has to follow to be logically coherent, yes. But those rules give us plenty of room to play. By picking the correct metric, we can make this flat plane obey the same geometric rules as the hyperbolic surface. This metric looks more complicated than the Euclidean metric does, but only because it has more terms and takes longer to write out. What’s important about it is that the distance your thumb put on top of the paper covers up is bigger if your thumb is near the bottom of the upper-half plane than if your thumb is near the top of the paper.
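
For the record, the metric in question is usually written ds^2 = \frac{dx^2 + dy^2}{y^2} , and distances in it even have a closed form. Here’s a Python sketch of mine demonstrating the thumb effect: the same Euclidean-sized step covers far more hyperbolic distance near the bottom edge.

    import math

    def hyperbolic_distance(xa, ya, xb, yb):
        # distance in the upper half-plane with metric ds^2 = (dx^2 + dy^2)/y^2
        if ya <= 0 or yb <= 0:
            raise ValueError('points must have positive y')
        return math.acosh(1 + ((xb - xa)**2 + (yb - ya)**2) / (2 * ya * yb))

    print(hyperbolic_distance(0, 0.1, 1, 0.1))   # about 4.6, near the bottom
    print(hyperbolic_distance(0, 10, 1, 10))     # about 0.1, higher up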

So. There are now two things that are “lines” in this. One of them is vertical lines. The graph paper we would make for this has a nice file of parallel lines like ordinary paper does. The other thing, though … well, that’s half-circles. They’re half-circles with a center on the edge of the half-plane. So our graph paper would also have a bunch of circles, of different sizes, coming from regularly-spaced sources on the bottom of the paper. A line segment is a piece of either these vertical lines or these half-circles. You can make any polygon you like with these, if you pick out enough line segments. They’re there.
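
Finding which half-circle joins two given points is ordinary algebra: the center sits on the boundary, equally far from both points. A sketch of mine, with a made-up helper name:

    def geodesic_through(xa, ya, xb, yb):
        # the geodesic through two points of the upper half-plane
        if xa == xb:
            return ('vertical line at x =', xa)
        # solve (xa - c)^2 + ya^2 = (xb - c)^2 + yb^2 for the center (c, 0)
        c = (xb**2 + yb**2 - xa**2 - ya**2) / (2 * (xb - xa))
        radius = ((xa - c)**2 + ya**2) ** 0.5
        return ('half-circle centered at', c, 'radius', radius)

    print(geodesic_through(0, 1, 2, 1))   # centered at 1.0, radius 1.414...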

There are many ways to represent hyperbolic surfaces. This is one of them. It’s got some nice properties. One of them is that it’s “conformal”. Angles that you draw using this metric are the same size as those on the corresponding hyperbolic surface. You don’t appreciate how sweet that is until you’re working in non-Euclidean geometries. Circles that are entirely within the hyperbolic half-plane match to circles on a hyperbolic surface. Once you’ve got your intuition for this hyperbolic half-plane, you can step into hyperbolic half-volumes. And that lets you talk about the geometry of hyperbolic spaces that reach into four or more dimensions of human-imaginable spaces. Isometries — picking up a shape and moving it in ways that don’t change distance — match up with the Möbius Transformations. These are a well-understood family of mappings of the plane that comes from a different corner of geometry. Also from that fellow with the strip, August Ferdinand Möbius. It’s always exciting to find relationships like that in mathematical structures.

Pictures often help. I don’t know why I don’t include them. But here is a web site with pages, and pictures, that describe much of the hyperbolic half-plane. It includes code to use with The Geometer’s Sketchpad software, which I have never used and know nothing about. That’s all right. There’s at least one page there showing a wondrous picture. I hope you enjoy.


This and other essays in the Fall 2018 A-To-Z should be at this link. And I’ll start paneling for more letters soon.

My 2018 Mathematics A To Z: Group Action


I got several great suggestions for topics for ‘g’. The one that most caught my imagination was mathtuition88’s, the group action. Mathtuition88 is run by Mr Wu, a mathematics tutor in Singapore. His mathematics blog recounts his own explorations of interesting topics.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Group Action.

This starts from groups. A group, here, means a pair of things. The first thing is a set of elements. The second is some operation. It takes a pair of things in the set and matches it to something in the set. For example, try the integers as the set, with addition as the operation. There are many kinds of groups you can make. There can be finite groups, ones with as few as one element or as many as you like. (The one-element groups are so boring. We usually need at least two to have much to say about them.) There can be infinite groups, like the integers. There can be discrete groups, where there’s always some minimum distance between elements. There can be continuous groups, like the real numbers, where there’s no smallest distance between distinct elements.

Groups came about from looking at how numbers work. So the first examples anyone gets are based on numbers. The integers, especially, and then the integers modulo something. For example, there’s Z_2 , which has two numbers, 0 and 1. Addition works by the rule that 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, and 1 + 1 = 0. There’s similar rules for Z_3 , which has three numbers, 0, 1, and 2.
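
In code such a group is barely anything at all, which is part of the charm. A sketch, if you want to see Z_3 ’s whole addition table:

    elements = [0, 1, 2]     # the set for Z_3

    def op(a, b):            # the operation: addition modulo 3
        return (a + b) % 3

    for a in elements:
        print([op(a, b) for b in elements])
    # prints [0, 1, 2], then [1, 2, 0], then [2, 0, 1]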

But after a few comfortable minutes on this, group theory moves on to more abstract things. Things with names like the “permutation group”. This starts with some set of things and we don’t even care what the things are. They can be numbers. They can be letters. They can be places. They can be anything. We don’t care. The group is all of the ways to swap elements around. All the relabellings we can do without losing or gaining an item. Or another, the “symmetry group”. This is, for some given thing — plates, blocks, and wallpaper patterns are great examples — all the ways you can rotate or move or reflect the thing without changing the way it looks.

And now we’re creeping up on what a “group action” is. Let me just talk about permutations here. These are where you swap around items. Like, start out with a list of items “1 2 3 4”. And pick out a permutation, say, swap the second with the fourth item. We write that, in shorthand, as (2 4). Maybe another permutation too. Say, swap the first item with the third. Write that out as (1 3). We can multiply these permutations together. Doing these permutations, in this order, has a particular effect: it swaps the second and fourth items, and swaps the first and third items. This is another permutation on these four items.

These permutations, these “swap this item with that” rules, are a group. The set for the group is instructions like “swap this with that”, or “swap this with that, and that with this other thing, and this other thing with the first thing”. Or even “leave this thing alone”. The operation between two things in the set is, do one and then the other. For example, (2 3) and then (3 4) has the effect of moving the second thing to the fourth spot, the (original) fourth thing to the third spot, and the original third thing to the second spot. That is, it’s the permutation (2 4 3): each spot listed sends its item along to the next spot named. A sketch below lets the computer check this, if you like. If you ever need something to doodle during a slow meeting, try working out all the ways you can shuffle around, say, six things. And what happens as you do all the possible combinations of these things. Hey, you’re only permuting six items. How many ways could that be?
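
Here’s the promised sketch, my own illustration. It writes a permutation as a table of where each spot’s item ends up, and composes the two cycles from the example:

    def from_cycle(cycle, n=4):
        # e.g. (2, 3) builds the permutation swapping spots 2 and 3
        perm = {i: i for i in range(1, n + 1)}
        for i, spot in enumerate(cycle):
            perm[spot] = cycle[(i + 1) % len(cycle)]
        return perm

    def then(p, q):
        # do permutation p first, then permutation q
        return {i: q[p[i]] for i in p}

    print(then(from_cycle((2, 3)), from_cycle((3, 4))))
    # prints {1: 1, 2: 4, 3: 2, 4: 3}, the cycle (2 4 3)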

So here’s what sounds like a fussy point. The group here is made up of the ways you can permute these items. The items aren’t part of the group. They just gave us something to talk about. This is where I got so confused, as an undergraduate, working out groups and group actions.

When we move back to talking about the original items, then we get a group action. You get a group action by putting together a group with some set of things. Let me call the group ‘G’ and the set ‘X’. If I need something particular in the group I’ll call that ‘g’. If I need something particular from the set ‘X’ I’ll call that ‘x’. This is fairly standard mathematics notation. You see how subtly clever this notation is. The group action comes from taking things in G and applying them to things in X, to get things in X. Usually other things, but not always. In the lingo, we say the group action maps the pair of things G and X to the set X.

There are rules these actions have to follow. They’re what you would expect, if you’ve done any fiddling with groups. Don’t worry about them. What’s interesting is what we get from group actions.

First is group orbits. Take some ‘g’ out of the group G. Take some ‘x’ out of the set ‘X’. And build this new set. First, x. Then, whatever g does to x, which we write as ‘gx’. But ‘gx’ is still something in ‘X’, so … what does g do to that? So toss in ‘ggx’. Which is still something in ‘X’, so, toss in ‘gggx’. And ‘ggggx’. And keep going, until you stop getting new things. If ‘X’ is finite, this sequence has to be finite. It might be the whole set of X. It might be some subset of X. But if ‘X’ is finite, it’ll get back, eventually, to where you started, which is why we call this the “group orbit”. We use the same term even if X isn’t finite and we can’t guarantee that all these iterations of g on x eventually get back to the original x. The orbit is a subset of X: the collection of every place the group can carry your original x.

There can be other special groups. Like, are there elements ‘g’ that map ‘x’ to ‘x’? Sure. There has to be at least one, since the group G has an identity element. There might be others. So, for any given ‘x’, what are all the elements of G that don’t change it? The set of all the values of g for which gx is x is the “isotropy group” G_x . Or the “stabilizer subgroup”. This is a subgroup of G, based on x.
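
A concrete case may help, and it previews the counting theorem coming up below. This sketch of mine uses the eight symmetries of a square acting on its corners:

    # corners labelled 0-3 going around; four rotations and four reflections
    group = ([lambda i, k=k: (i + k) % 4 for k in range(4)]
             + [lambda i, k=k: (k - i) % 4 for k in range(4)])

    x = 0
    orbit = {g(x) for g in group}                   # everywhere x can be sent
    stabilizer = [g for g in group if g(x) == x]    # the symmetries fixing x
    print(len(group), len(orbit), len(stabilizer))  # 8, 4, 2; and 8 = 4 * 2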

Yes, but the point?

Well, the biggest thing we get from group actions is the chance to put group theory principles to work on specific things. A group might describe the ways you can rotate or reflect a square plate without leaving an obvious change in the plate. The group action lets you make this about the plate. Much of modern physics is about learning how the geometry of a thing affects its behavior. This can be the obvious sorts of geometry, like, whether it’s rotationally symmetric. But it can be subtler things, like, whether the forces in the system are different at different times. Group actions let us put what we know from geometry and topology to work in specifics.

A particular favorite of mine is that they let us express the wallpaper groups. These are the ways we can use rotations and reflections and translations (linear displacements) to create different patterns. There are fewer different patterns than you might have guessed. (Different, here, overlooks such petty things as whether the repeated pattern is a diamond, a flower, or a hexagon. Or whether the pattern repeats every two inches versus every three inches.)

And they stay useful for abstract mathematical problems. All this talk about orbits and stabilizers lets us find something called the Orbit-Stabilizer Theorem. This connects the size of the group G to the size of orbits of x and of the stabilizer subgroups. This has the exciting advantage of letting us turn many proofs into counting arguments. A counting argument is just what you think: showing there’s as many of one thing as there are another. Here’s a nice page about the Orbit-Stabilizer Theorem, and how to use it. This includes some nice, easy-to-understand problems like “how many different necklaces could you make with three red, two green, and one blue bead?” Or if that seems too mundane a problem, an equivalent one from organic chemistry: how many isomers of naphthol could there be? You see where these group actions give us useful information about specific problems.
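
The necklace problem can even be brute-forced, for the skeptical. This Python sketch of mine calls two arrangements the same necklace when some rotation, or optionally a flip, matches them up. I make it 10 necklaces under rotations alone, and 6 when flips are allowed.

    from itertools import permutations

    beads = 'RRRGGB'    # three red, two green, one blue

    def canonical(s, flips=False):
        views = [s[i:] + s[:i] for i in range(len(s))]
        if flips:
            r = s[::-1]
            views += [r[i:] + r[:i] for i in range(len(r))]
        return min(views)    # one agreed-upon representative per necklace

    arrangements = {''.join(p) for p in permutations(beads)}       # 60 strings
    print(len({canonical(a) for a in arrangements}))               # 10
    print(len({canonical(a, flips=True) for a in arrangements}))   # 6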


If you should like a more detailed introduction, although one that supposes you’re more conversant with group theory than I suppose here, this is a good sequence: Group Actions I, which actually defines the things. Group actions II: the orbit-stabilizer theorem, which is about just what it says. Group actions III — what’s the point of them?, which has the sort of snappy title I like, but which gives points that make sense when you’re comfortable talking about quotient groups and isomorphisms and the like. And what I think is the last in the sequence, Group actions IV: intrinsic actions, which is about using group actions to prove stuff. And includes a mention of one of my favorite topics, the points the essay-writer just didn’t get the first time through. (And more; there’s a point where the essay goes wrong, and needs correction. I am not the Joseph who found the problem.)

My 2018 Mathematics A To Z: Fermat’s Last Theorem


Today’s topic is another request, this one from a Dina. I’m not sure if this is Dina Yagodich, who’d also suggested using the letter ‘e’ for the number ‘e’. Trusting that it is, Dina Yagodich has a YouTube channel of mathematics videos. They cover topics like how to convert degrees and radians to one another, what the chance of a false positive (or false negative) on a medical test is, ways to solve differential equations, and how to use computer tools like MathXL, TI-83/84 calculators, or Matlab. If I’m mistaken, original-commenter Dina, please let me know and let me know if you have any creative projects that should be mentioned here.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Fermat’s Last Theorem.

It comes to us from number theory. Like many great problems in number theory, it’s easy to understand. If you’ve heard of the Pythagorean Theorem you know, at least, there are triplets of whole numbers so that the first number squared plus the second number squared equals the third number squared. It’s easy to wonder about generalizing. Are there quartets of numbers, so the squares of the first three add up to the square of the fourth? Quintuplets? Sextuplets? … Oh, yes. That’s easy. What about triplets of whole numbers, including negative numbers? Yeah, and that turns out to be boring. Triplets of rational numbers? Turns out to be the same as triplets of whole numbers. Triplets of real-valued numbers? Turns out to be very boring. Triplets of complex-valued numbers? Also none too interesting.

Ah, but, what about a triplet of numbers, only raised to some other power? All three numbers raised to the first power is easy; we call that addition. To the third power, though? … The fourth? Any other whole number power? That’s hard. It’s hard finding, for any given power, a trio of numbers that work, although some come close. I’m informed there was an episode of The Simpsons which included, as a joke, the equation 1782^{12} + 1841^{12} = 1922^{12} . If it were true, this would be enough to show Fermat’s Last Theorem was false. … Which happens. Sometimes, mathematicians believe they have found something which turns out to be wrong. Often this comes from noticing a pattern, and finding a proof for a specific case, and supposing the pattern holds up. This equation isn’t true, but it is correct for the first nine digits. The episode “The Wizard of Evergreen Terrace” puts forth 3987^{12} + 4365^{12} = 4472^{12} , which apparently matches ten digits. This includes the final digit, also known as “the only one anybody could check”. (The last digit of 3987^{12} is 1. Last digit of 4365^{12} is 5. Last digit of 4472^{12} is 6, and there you go.) Really makes you think there’s something weird going on with 12th powers.
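
Exact integer arithmetic makes checking the near-misses painless nowadays, if you’d like to see just how near they come. A sketch of mine:

    # the two Simpsons near-misses, checked exactly
    for a, b, c in [(1782, 1841, 1922), (3987, 4365, 4472)]:
        lhs, rhs = a**12 + b**12, c**12
        print(lhs == rhs, str(lhs)[:12], str(rhs)[:12])
    # both print False: the leading digits agree, the later ones give it away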

For a Fermat-like example, Leonhard Euler conjectured a thing about “Sums of Like Powers”. That for a whole number ‘n’, you need at least n whole numbers-raised-to-an-nth-power to equal something else raised to an n-th power. That is, you need at least three whole numbers raised to the third power to equal some other whole number raised to the third power. At least four whole numbers raised to the fourth power to equal something raised to the fourth power. At least five whole numbers raised to the fifth power to equal some number raised to the fifth power. Euler was wrong, in this case. L J Lander and T R Parkin published, in 1966, the one-paragraph paper Counterexample to Euler’s Conjecture on Sums of Like Powers. 27^5 + 84^5 + 110^5 + 133^5 = 144^5 and there we go. Thanks, CDC 6600 computer!
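
That one-paragraph paper is a one-line verification now, no CDC 6600 required:

    print(27**5 + 84**5 + 110**5 + 133**5 == 144**5)   # True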

But Fermat’s hypothesis. Let me put it in symbols. It’s easier than giving everything long, descriptive names. Suppose that the power ‘n’ is a whole number greater than 2. Then there are no three counting numbers ‘a’, ‘b’, and ‘c’ which make true the equation a^n + b^n = c^n . It looks doable. It looks like once you’ve mastered high school algebra you could do it. Heck, it looks like if you know the proof about how the square root of two is irrational you could approach it. Pierre de Fermat himself said he had a wonderful little proof of it.

He was wrong. No shame in that. He was right about a lot of mathematics, including a lot of stuff that leads into the basics of calculus. And he was right in his feeling that this a^n + b^n = c^n stuff was impossible. He was wrong that he had a proof. At least not one that worked for every possible whole number ‘n’ larger than 2.

For specific values of ‘n’, though? Oh yes, that’s doable. Fermat did it himself for an ‘n’ of 4. Euler, a century later, filled in ‘n’ of 3. Peter Dirichlet, a great name in number theory and analysis, and Adrien-Marie Legendre, who worked on everything, proved the case of ‘n’ of 5. Dirichlet, in 1832, proved the case for ‘n’ of 14. And there were more partial solutions. You could show that if Fermat’s Last Theorem were ever false, it would have to be false for some prime-number value of ‘n’. That’s great work, answering as it does infinitely many possible cases. It just leaves … infinitely many to go.

And that’s how things went for centuries. I don’t know that every mathematician made some attempt on Fermat’s Last Theorem. But it seems hard to imagine a person could love mathematics enough to spend their lives doing it and not at least take an attempt at it. Nobody ever found it, though. In a 1989 episode of Star Trek: The Next Generation, Captain Picard muses on how eight centuries after Fermat nobody’s proven his theorem. This struck me at the time as too pessimistic. Granted humans were stumped for 400 years. But for 800 years? And stumping everyone in a whole Federation of a thousand worlds? And more than a thousand mathematical traditions? And, for some of these species, tens of thousands of years of recorded history? … Still, there wasn’t much sign of the problem being solved. In 1992 Analog Science Fiction Magazine published a funny short-short story by Ian Randal Strock, “Fermat’s Legacy”. In it, Fermat — jealous of figures like René Descartes and Blaise Pascal who upstaged his mathematical accomplishments — jots down the note. He figures an unsupported claim like that will earn true lasting fame.

So that takes us to 1993, when the world heard about elliptic curves for the first time. Elliptic curves are neat things. They’re curves described by cubic polynomials. They have some nice mathematical properties. People first noticed them in studying how long arcs of ellipses are. (This is why they’re called elliptic curves, even though most of them have nothing to do with any ellipse you’d ever tolerate in your presence.) They look ready to use for encryption. And in 1985, Gerhard Frey noticed something. Suppose you did have, for some ‘n’ bigger than 2, a solution a^n + b^n = c^n . Then you could use that a, b, and n to make a new elliptic curve. That curve is the one that satisfies y^2 = x\cdot\left(x - a^n\right)\cdot\left(x + b^n\right) . And then that elliptic curve would not be “modular”.

I would like to tell you what it means for an elliptic curve to be modular. But getting to that point would take at least four subsidiary essays. MathWorld has a description of what it means to be modular, and even links to explaining terms like “meromorphic”. It gets into exotic stuff.

Frey didn’t show whether elliptic curves of this type had to be modular or not. This is normal enough, for mathematicians. You want to find things which are true and interesting. This includes conjectures like this, that if elliptic curves are all modular then Fermat’s Last Theorem has to be true. Frey was working on consequences of the Taniyama-Shimura Conjecture, itself three decades old at that point. Yutaka Taniyama and Goro Shimura had found there seemed to be a link between elliptic curves and these “modular forms”, which tie in to a kind of group. That is, a group-theory thing.

So in fall of 1993 I was taking an advanced, though still undergraduate, course in (not-high-school) algebra at Rutgers. It’s where we learn group theory, after Intro to Algebra introduced us to group theory. Some exciting news came out. This fellow named Andrew Wiles at Princeton had shown an impressive bunch of things. Most important, that the Taniyama-Shimura Conjecture was true for semistable elliptic curves. This includes the kind of elliptic curve Frey made out of solutions to Fermat’s Last Theorem. So the curves based on solutions to Fermat’s Last Theorem would have to be modular. But Frey had shown any curves based on solutions to Fermat’s Last Theorem couldn’t be modular. The conclusion: there can’t be any solutions to Fermat’s Last Theorem. Our professor did his best to explain the proof to us. Abstract Algebra was the undergraduate course closest to the stuff Wiles was working on. It wasn’t very close. When you’re still trying to work out what it means for something to be an ideal it’s hard to even follow the setup of the problem. The proof itself was inaccessible.

Which is all right. Wiles’s original proof had some flaws. At least this mathematics major shrugged when that news came down and wondered, well, maybe it’ll be fixed someday. Maybe not. I remembered how exciting cold fusion was for about six weeks, too. But this someday didn’t take long. Wiles, with Richard Taylor, revised the proof and published about a year later. So far as I’m aware, nobody has any serious qualms about the proof.

So does knowing Fermat’s Last Theorem get us anything interesting? … And here is a sad anticlimax. It’s neat to know that a^n + b^n = c^n can’t be true unless ‘n’ is 1 or 2, at least for positive whole numbers. But I’m not aware of any neat results that follow from that, or that would follow if it were untrue. There are results that follow from the Taniyama-Shimura Conjecture that are interesting, according to people who know them and don’t seem to be fibbing to me. But Fermat’s Last Theorem turns out to be a cute little aside.

Which is not to say studying it was foolish. This easy-to-understand, hard-to-solve problem certainly attracted talented minds to think about mathematics. Mathematicians found interesting stuff in trying to solve it. Some of it might be slight. I learned, fiddling with Pythagorean triplets — ‘a’, ‘b’, and ‘c’ with a^2 + b^2 = c^2 — that I was not the infinitely brilliant mathematician I hoped, at age fifteen, I might be. Also that if ‘a’, ‘b’, and ‘c’ are relatively prime, you can’t have ‘a’ and ‘b’ both odd and ‘c’ even. You had to have ‘c’ and either ‘a’ or ‘b’ odd, with the other number even. Other mathematicians of more nearly infinite ability found stuff of greater import. Ernst Eduard Kummer in the 19th century developed ideals. These are an important piece of ring theory. He was busy proving special cases of Fermat’s Last Theorem.

Kind viewers have tried to retcon Picard’s statement about Fermat’s Last Theorem. They say Picard was really searching for the proof Fermat had, or believed he had. Something using the mathematical techniques available to the early 17th century. Or that follow closely enough from that. The Taniyama-Shimura Conjecture definitely isn’t it. I don’t buy the retcon, but I’m willing to play along for the sake of not causing trouble. I suspect there’s not a proof of the general case that uses anything Fermat could have recognized, or thought he had. That’s all right. The search for a thing can be useful even if the thing doesn’t exist.

My 2018 Mathematics A To Z: e


I’m back to requests! Today’s comes from commenter Dina Yagodich. I don’t know whether Yagodich has a web site, YouTube channel, or other mathematics-discussion site, but am happy to pass along word if I hear of one.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

e.

Let me start by explaining integral calculus in two paragraphs. One of the things done in it is finding a ‘definite integral’. This is itself a function. The definite integral has as its domain the combination of a function, plus some boundaries, and its range is numbers. Real numbers, if nobody tells you otherwise. Complex-valued numbers, if someone says it’s complex-valued numbers. Yes, it could have some other range. But if someone wants you to do that they’re obliged to set warning flares around the problem and precede and follow it with flag-bearers. And you get at least double pay for the hazardous work. The function that gets definite-integrated has its own domain and range. The boundaries of the definite integral have to be within the domain of the integrated function.

For real-valued functions this definite integral has a great physical interpretation. A real-valued function means the domain and range are both real numbers. You see a lot of these. Call the function ‘f’, please. Call its independent variable ‘x’ and its dependent variable ‘y’. Using Euclidean coordinates, or as normal people call it “graph paper”, draw the points that make true the equation “y = f(x)”. Then draw in the x-axis, that is, the points where “y = 0”. The boundaries of the definite integral are going to be two values of ‘x’, a lower and an upper bound. Call that lower bound ‘a’ and the upper bound ‘b’. And heck, call that a “left boundary” and a “right boundary”, because … I mean, look at them. Draw the vertical line at “x = a” and the vertical line at “x = b”. If ‘f(x)’ is always a positive number, then there’s a shape bounded below by “y = 0”, on the left by “x = a”, on the right by “x = b”, and above by “y = f(x)”. And the definite integral is the area of that enclosed space. If ‘f(x)’ is sometimes zero, then there’s several segments, but their combined area is the definite integral. If ‘f(x)’ is sometimes below zero, then there’s several segments. The definite integral is the sum of the areas of parts above “y = 0” minus the area of the parts below “y = 0”.

(Why say “left boundary” instead of “lower boundary”? Taste, pretty much. But I look at the words “lower boundary” and think about the lower edge, that is, the line where “y = 0” here. And “upper boundary” makes sense as a way to describe the curve where “y = f(x)” as well as “x = b”. I’m confusing enough without making the simple stuff ambiguous.)

Don’t try to pass your thesis defense on this alone. But it’s what you need to understand ‘e’. Start out with the function ‘f’, which has domain of the positive real numbers and range of the positive real numbers. For every ‘x’ in the domain, ‘f(x)’ is the reciprocal, one divided by x. This is a shape you probably know well. It’s a hyperbola. Its asymptotes are the x-axis and the y-axis. It’s a nice gentle curve. Its plot passes through such famous points as (1, 1), (2, 1/2), (1/3, 3), and pairs like that. (10, 1/10) and (1/100, 100) too. ‘f(x)’ is always positive on this domain. Use as left boundary the line “x = 1”. And then — let’s think about different right boundaries.

If the right boundary is close to the left boundary, then this area is tiny. If it’s at, like, “x = 1.1” then the area can’t be more than 0.1. (It’s less than that. If you don’t see why that’s so, fit a rectangle of height 1 and width 0.1 around this curve and these boundaries. See?) But if the right boundary is farther out, this area is more. It’s getting bigger if the right boundary is “x = 2” or “x = 3”. It can get bigger yet. Give me any positive number you like. I can find a right boundary so the area inside this is bigger than your number.

Is there a right boundary where the area is exactly 1? … Well, it’s hard to see how there couldn’t be. If a quantity (“area between x = 1 and x = b”) changes from less than one to greater than one, it’s got to pass through 1, right? … Yes, it does, provided some technical points are true, and in this case they are. So that’s nice.

And there is. It’s a number (settle down, I see you quivering with excitement back there, waiting for me to unveil this) a slight bit more than 2.718. It’s a neat number. Carry it out a couple more digits and it turns out to be 2.718281828. So it looks like a great candidate to memorize. It’s not. It’s an irrational number. The digits go off without repeating or falling into obvious patterns after that. It’s a transcendental number, meaning it isn’t the root of any polynomial with rational coefficients. Nobody knows whether it’s a normal number, because remember, a normal number is just any real number that you never heard of. To be a normal number, every finite string of digits has to appear in the decimal expansion, just as often as every other string of digits of the same length. We can show by clever counting arguments that almost every number is normal. Trick is it’s hard to show that any particular number is.
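
If you’d like to watch that number turn up rather than take my word for it, here’s a little Python sketch. It’s my own toy code, nothing from the essay’s sources: it estimates the area under 1/x with the trapezoid rule and bisects for the right boundary that encloses an area of exactly 1. The function names are invented for the occasion.

```python
def area_under_reciprocal(b, steps=10_000):
    """Trapezoid-rule estimate of the area under 1/x from x = 1 to x = b."""
    width = (b - 1) / steps
    heights = [1 / (1 + i * width) for i in range(steps + 1)]
    # Endpoints count half in the trapezoid rule; interior points count full.
    return width * (sum(heights) - (heights[0] + heights[-1]) / 2)

def find_boundary(target=1.0, lo=2.0, hi=3.0, tolerance=1e-9):
    """Bisect for the right boundary whose enclosed area equals target."""
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if area_under_reciprocal(mid) < target:
            lo = mid   # area still too small; push the boundary right
        else:
            hi = mid   # area too big; pull the boundary left
    return (lo + hi) / 2

print(find_boundary())   # 2.71828…, to within the integration error
```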

So let me do another definite integral. Set the left boundary to this “x = 2.718281828(etc)”. Set the right boundary a little more than that. The enclosed area is less than 1. Set the right boundary way off to the right. The enclosed area is more than 1. What right boundary makes the enclosed area ‘1’ again? … Well, that will be at about “x = 7.389”. That is, at the square of 2.718281828(etc).

Repeat this. Set the left boundary at “x = (2.718281828etc)^2”. Where does the right boundary have to be so the enclosed area is 1? … Did you guess “x = (2.718281828etc)^3”? Yeah, of course. You know my rhetorical tricks. What do you want to guess the area is between, oh, “x = (2.718281828etc)^3” and “x = (2.718281828etc)^5”? (Notice I put a ‘5’ in the exponent there.)

Now, relationships like this will happen with other functions, and with other left- and right-boundaries. But if you want it to work with a function whose rule is as simple as “f(x) = 1 / x”, and areas of 1, then you’re going to end up noticing this 2.718281828(etc). It stands out. It’s worthy of a name.

Which is why this 2.718281828(etc) is a number you’ve heard of. It’s named ‘e’. Leonhard Euler, whom you will remember as having written or proved the fundamental theorem for every area of mathematics ever, gave it that name. He used it first when writing for his own work. Then (in November 1731) in a letter to Christian Goldbach. Finally (in 1736) in his textbook Mechanica. Everyone went along with him because Euler knew how to write about stuff, and how to pick symbols that worked for stuff.

Once you know ‘e’ is there, you start to see it everywhere. In Western mathematics it seems to have been first noticed by Jacob (I) Bernoulli, who noticed it in toy compound interest problems. (Given this, I’d imagine it has to have been noticed by the people who did finance. But I am ignorant of the history of financial calculations. Writers of the kind of pop-mathematics history I read don’t notice them either.) Bernoulli and Pierre Raymond de Montmort noticed the reciprocal of ‘e’ turning up in what we’ve come to call the ‘hat check problem’. A large number of guests all check one hat each. The person checking hats has no idea who anybody is. What is the chance that nobody gets their correct hat back? … That chance is the reciprocal of ‘e’. The number’s about 0.368. In a connected but not identical problem, suppose something has one chance in some number ‘N’ of happening each attempt. And it’s given ‘N’ attempts for it to happen. What’s the chance that it doesn’t happen? The bigger ‘N’ gets, the closer the chance it doesn’t happen gets to the reciprocal of ‘e’.
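
Both of those facts are easy to poke at numerically. Here’s a rough sketch, my own and not anything from the historical record, trying the one-chance-in-N problem directly and simulating the hat check:

```python
import math
import random

# Chance a 1-in-N event misses all N attempts: (1 - 1/N)^N approaches 1/e.
for n in (10, 100, 10_000, 1_000_000):
    print(n, (1 - 1 / n) ** n)
print("1/e =", 1 / math.e)   # about 0.3679

# Hat check: shuffle N hats and count the trials where nobody gets
# their own hat back (what combinatorics calls a derangement).
def nobody_matches(n, trials=100_000):
    misses = 0
    for _ in range(trials):
        hats = list(range(n))
        random.shuffle(hats)
        if all(hats[i] != i for i in range(n)):
            misses += 1
    return misses / trials

print(nobody_matches(10))    # hovers near 1/e, about 0.368
```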

It comes up in peculiar ways. In high school or freshman calculus you see it defined as what you get if you take \left(1 + \frac{1}{x}\right)^x for ever-larger real numbers ‘x’. (This is the toy-compound-interest problem Bernoulli found.) But you can find the number other ways. You can calculate it — if you have the stamina — by working out the value of

1 + 1 + \frac12\left( 1 + \frac13\left( 1 + \frac14\left( 1 + \frac15\left( 1 + \cdots \right)\right)\right)\right)

There’s a simpler way to write that. There always is. Take all the nonnegative whole numbers — 0, 1, 2, 3, 4, and so on. Take their factorials. That’s 1, 1, 2, 6, 24, and so on. Take the reciprocals of all those. That’s … 1, 1, one-half, one-sixth, one-twenty-fourth, and so on. Add them all together. That’s ‘e’.
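
Here’s that sum as a few lines of Python, a minimal sketch with names of my own choosing. Twenty terms is already more than double-precision arithmetic can distinguish from the true value:

```python
import math

total, term = 0.0, 1.0       # term starts as 1/0!
for k in range(20):
    total += term
    term /= k + 1            # turns 1/k! into 1/(k+1)!

print(total)                 # 2.718281828459045
print(math.e)                # the library's value, for comparison
```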

This ‘e’ turns up all the time. Any system whose rate of growth depends on its current value has an ‘e’ lurking in its description. That’s true if it declines, too, as long as the decline depends on its current value. It gets stranger. Cross ‘e’ with complex-valued numbers and you get, not just growth or decay, but oscillations. And many problems that are hard to solve to start with become doable, even simple, if you rewrite them as growths and decays and oscillations. Through ‘e’ problems too hard to do become problems of polynomials, or even simpler things.
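
That oscillation claim is easy to check, at least in a toy way. A sketch using nothing but Python’s standard cmath module: ‘e’ raised to an imaginary number never grows or shrinks, it just goes around in a circle.

```python
import cmath

# e^(it) has magnitude 1 for every real t; its real part traces out cos(t).
for t in (0.0, 1.0, 2.0, 3.0):
    z = cmath.exp(1j * t)
    print(t, abs(z), z.real)   # abs(z) is always 1.0; z.real equals cos(t)
```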

Simple problems become that too. That property about the area underneath “f(x) = 1/x” between “x = 1” and “x = b” makes ‘e’ such a natural base for logarithms that we call it the base for natural logarithms. Logarithms let us replace multiplication with addition, and division with subtraction, easier work. They change exponentiation problems to multiplication, again easier. It’s a strange touch, a wondrous one.

There are some numbers interesting enough to attract books about them. π, obviously. 0. The imaginary unit, \imath , has a couple. I only know one pop-mathematics treatment of ‘e’, Eli Maor’s e: The Story Of A Number. I believe there’s room for more.


Oh, one little remarkable thing that’s of no use whatsoever. MathWorld’s page about approximations to ‘e’ mentions this. Work out, if you can coax your calculator into letting you do this, the number:

\left(1 + 9^{-(4^{(42)})}\right)^{\left(3^{(2^{85})}\right)}

You know, the way anyone’s calculator will let you raise 2 to the 85th power. And then raise 3 to whatever number that is. Anyway. The digits of this will agree with the digits of ‘e’ for the first 18,457,734,525,360,901,453,873,570 decimal digits. One Richard Sabey found that, by what means I do not know, in 2004. The page linked there includes a bunch of other, no less amazing, approximations to numbers like ‘e’ and π and the Euler-Mascheroni Constant.

My 2018 Mathematics A To Z: Distribution (probability)


Today’s term ended up being a free choice. Nobody found anything appealing in the D’s to ask about. That’s all right.

I’m still looking for topics for the letters G through M, excluding L, if you’d like in on those letters.

And for my own sake, please check out the Playful Mathematics Education Blog Carnival, #121, if you haven’t already.


Distribution (probability).

I have to specify. There’s a bunch of mathematics concepts called ‘distribution’. Some of them are linked. Some of them are just called that because we don’t have a better word. Like, what else would you call spreading a multiplication across a sum? I want to describe a distribution that comes to us in probability and in statistics. From there it runs through modern physics, as well as truly difficult sciences like sociology and economics.

We get to distributions through random variables. These are variables that might be any one of multiple possible values. There might be as few as two options. There might be a finite number of possibilities. There might be infinitely many. They might be numbers. At the risk of sounding unimaginative, they often are. We’re always interested in measuring things. And we’re used to measuring them in numbers.

What makes random variables hard to deal with is that, if we’re playing by the rules, we never know what it is. Once we get through (high school) algebra we’re comfortable working with an ‘x’ whose value we don’t know. But that’s because we trust that, if we really cared, we would find out what it is. Or we would know that it’s a ‘dummy variable’, whose value is unimportant but gets us to something that is. A random variable is different. Its value matters, but we can’t know what it is.

Instead we get a distribution. This is a function which gives us information about what the outcomes are, and how likely they are. There are different ways to organize this data. If whoever’s talking about it doesn’t say just what they’re doing, bet on it being a “probability distribution function”. This follows slightly different rules based on whether the range of values is discrete or continuous, but the idea is roughly the same. Every possible outcome has a probability at least zero but not more than one. The total probability over every possible outcome is exactly one. There’s rules about the probability of two distinct outcomes happening. Stuff like that.
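
To make those rules concrete, here’s a minimal sketch in Python, my own example with a fair die standing in for the random variable:

```python
# A discrete probability distribution: each outcome gets a probability.
die = {face: 1 / 6 for face in (1, 2, 3, 4, 5, 6)}

assert all(0 <= p <= 1 for p in die.values())   # each probability in [0, 1]
assert abs(sum(die.values()) - 1) < 1e-12       # total probability is 1

# One of those rules about distinct outcomes: their probabilities add.
print(die[1] + die[6])   # 1/3, the chance of rolling a 1 or a 6
```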

Distributions are interesting enough when they’re about fixed things. In learning probability this is stuff like hands of cards or totals of die rolls or numbers of snowstorms in the season. Fun enough. These get to be more personal when we take a census, or otherwise sample things that people do. There’s something wondrous in knowing that while, say, you might not know how long a commute your neighbor has, you know there’s an 80 percent chance it’s between 15 and 25 minutes (or whatever). It’s also good for urban planners to know.

It gets exciting when we look at how distributions can change. It’s hard not to think of that as “changing over time”. (You could make a fair argument that “change” is “time”.) But it doesn’t have to. We can take a function with a domain that contains all the possible values in the distribution, and a range that’s something else. The image of the distribution is some new distribution. (Trusting that the function doesn’t do something naughty.) These functions — these mappings — might reflect nothing more than relabelling, going from (say) a distribution of “false and true” values to one of “-5 and 5” values instead. They might reflect regathering data; say, going from the distribution of a die’s outcomes of “1, 2, 3, 4, 5, or 6” to something simpler, like, “less than two, exactly two, or more than two”. Or they might reflect how something does change in time. They’re all mappings; they’re all ways to change what a distribution represents.
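
Here’s a toy version of that regathering, in Python. The helper name push_forward is my own invention; the idea is just adding up the probability of everything the mapping sends to the same place.

```python
from collections import defaultdict

def push_forward(dist, mapping):
    """Build the distribution of mapping(x), for x drawn from dist."""
    out = defaultdict(float)
    for value, prob in dist.items():
        out[mapping(value)] += prob
    return dict(out)

die = {face: 1 / 6 for face in range(1, 7)}
grouped = push_forward(
    die,
    lambda x: "less than two" if x < 2
              else "exactly two" if x == 2
              else "more than two",
)
print(grouped)   # probabilities 1/6, 1/6, and 4/6; they still total 1
```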

These mappings turn up in statistical mechanics. Processes will change the distribution of positions and momentums and electric charges and whatever else the things moving around do. It’s hard to learn. At least my first instinct was to try to warm up to it by doing a couple test cases. Pick specific values for the random variables and see how they change. This can help build confidence that one’s calculating correctly. Maybe give some idea of what sorts of behaviors to expect.

But it’s calculating the wrong thing. You need to look at the distribution as a specific thing, and how that changes. It’s a change of view. It’s like the change in view from thinking of a position as an x- and y- and maybe z-coordinate to thinking of position as a vector. (Which, I realize now, gave me slightly similar difficulties in thinking of what to do for any particular calculation.)

Distributions can change in time, just the way that — in simpler physics — positions might change. Distributions might stabilize, forming an equilibrium. This can mean that everything’s found a place to stop and rest. That will never happen for any interesting problem. What you might get is an equilibrium like the rings of Saturn. Everything’s moving, everything’s changing, but the overall shape stays the same. (Roughly.)

There are many specifically named distributions. They represent patterns that turn up all the time. The binomial distribution, for example, which represents what to expect if you have a lot of examples of something that can be one of two values each. The Poisson distribution, for representing how likely something that could happen any time (or any place) will happen in a particular span of time (or space). The normal distribution, also called the Gaussian distribution, which describes everything that isn’t trying to be difficult. There are like 400 billion dozen more named ones, each really good at describing particular kinds of problems. But they’re all distributions.

I’m Looking For Some More Topics For My 2018 Mathematics A-To-Z


As I’d said about a month ago, I’m hoping to gather topics for this year’s A-To-Z in a more piecemeal manner. Mostly this is so I don’t lose track of requests. I’m hoping not to go more than about three weeks between when a topic gets brought up and when I actually commit words to page.

But please, if you have any mathematical topics with a name that starts G through M, let me know! I generally take topics on a first-come, first-serve basis for each letter. But I reserve the right to use a not-first-pick choice if I realize the topic’s enchanted me. Also to use a synonym or an alternate phrasing if both topics for a particular letter interest me. Also when you do make a request, please feel free to mention your blog, Twitter feed, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.


So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences, for reference. I probably don’t want to revisit them, but if someone’s interested, I’ll at least think over whether I have new opinions about them. Thank you.

[Excerpted word lists from the Summer 2015, Leap Day 2016, and Summer 2017 A To Z sequences appeared here.]

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

Available Letters for the Fall 2018 A To Z:

  • G
  • H
  • I
  • J
  • K
  • L
  • M

And all the Fall 2018 Mathematics A-To-Z should appear at this link, along with some extra stuff like these topic-request pages and such.

My 2018 Mathematics A To Z: Commutative


Today’s A to Z term comes from Reynardo, @Reynardo_red on Twitter, and is a challenge. And the other A To Z posts for this year should be at this link.


Commutative.

Some terms are hard to discuss. This is among them. Mathematicians find commutative things early on. Addition of whole numbers. Addition of real numbers. Multiplication of whole numbers. Multiplication of real numbers. Multiplication of complex-valued numbers. It’s easy to think of this commuting as just having liberty to swap the order of things. And it’s easy to think of commuting as “two things you can do in either order”. It inspires physical examples like rotating a dial, clockwise or counterclockwise, however much you like. Or outside the things that seem obviously mathematical. Add milk and then cereal to the bowl, or cereal and then milk. As long as you don’t overfill the bowl, there’s not an important difference. Per Wikipedia, if you’re putting one sock on each foot, it doesn’t matter which foot gets a sock first.

When something is this accessible, and this universal, it gets hard to talk about. It threatens to be invisible. It was hard to say much interesting about the still air in a closed room, at least before there was a chemistry that could tell it wasn’t a homogenous invisible something, and before there was a statistical mechanics that could tell it was doing something even when it was doing nothing.

But commutativity is different. It’s easy to think of mathematics that doesn’t commute. Subtraction doesn’t, for all that it’s as familiar as addition. And despite that we try, in high school algebra, to fuse it into addition. Division doesn’t either, for all that we try to think of it as multiplication. Rotating things in three dimensions doesn’t commute. Nor does multiplying quaternions, which are a kind of number still. (I’m double-dipping here. You can use quaternions to represent three-dimensional rotations, and vice-versa. So they aren’t quite different examples, even though you can use quaternions to do things unrelated to rotations.) Clothing is a mass of things that can and can’t be put on first.
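
If you’d rather not test the rotation claim with objects on your desk, here’s a small check of my own in Python: take a point, give it a quarter-turn about the x-axis and about the z-axis, in both orders, and compare.

```python
import math

def rotate_x(p, angle):
    """Rotate the point p = (x, y, z) about the x-axis."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (x, c * y - s * z, s * y + c * z)

def rotate_z(p, angle):
    """Rotate the point p = (x, y, z) about the z-axis."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

point = (1.0, 0.0, 0.0)
quarter = math.pi / 2
print(rotate_z(rotate_x(point, quarter), quarter))  # about (0, 1, 0)
print(rotate_x(rotate_z(point, quarter), quarter))  # about (0, 0, 1): different!
```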

We talk about commuting as if it’s something in (or not in) the operations we do. Adding. Rotating. Walking in some direction. But it’s not entirely in that. Consider walking directions. From an intersection in the city, walk north to the first intersection you encounter. And walk east to the first intersection you encounter. Does it matter whether you walk north first and then east, or east first and then north? In some cases, no; famously, in Midtown Manhattan there’s no difference. At least if we pretend Broadway doesn’t exist.

Also if we don’t start from near the edge of the island, or near Central Park. An operation, even something familiar like addition, is a function. Its domain is a set of ordered pairs. Each thing in the pair is from the set of whatever might be added together. (Or multiplied, or whatever the name of the operation is.) The operation commutes if the order of the pair doesn’t matter. It’s easy to find sets and operations that won’t commute. I suppose it’s for the same reason it’s easier to find rectangular rather than square things. We’re so used to working with operations like multiplication that we forget that multiplication needs things to multiply.

Whether a thing commutes turns up often in group theory. This shouldn’t surprise. Group theory studies how arithmetic works. A “group”, which is a set of things with an operation like multiplication on it, might or might not commute. A “ring”, which has a set of things and two operations, has some commutativity built into it. One ring operation is something like addition. That commutes, or else you don’t have a ring. The other operation is something like multiplication. That might or might not commute. It depends what you need for your problem. A ring with commuting multiplication, plus some other stuff, can reach the heights of being a “field”. Fields are neat. They look a lot like the real numbers, but they can be all weird, too.

But even in a group that doesn’t have a commuting multiplication, we can tease out commutativity. There is a thing named the “commutator”, a particular way of multiplying elements together: for elements ‘a’ and ‘b’, it’s the product a b a^{-1} b^{-1}. The collection of these products generates the “commutator subgroup”. You can use it to split the original group in the way that odds and evens split the whole numbers. That splitting is based on the same multiplication as the original group. But its domain is now classes based on elements of the original group. What that splitting creates — the quotient by the commutator subgroup — is commutative. We can find a thing, based on what we are interested in, which offers commutativity right nearby.

It reaches further. In analysis, it can be useful to think of functions as “mappings”. We describe this as though a function took a domain and transformed it into a range. We can compose these functions together: take the range from one function and use it as the domain for another. Sometimes these chains of functions will commute. We can get from the original set to the final set by several paths. This can produce fascinating and beautiful proofs that look as if you just drew a lattice-work. The MathWorld page on “Commutative Diagram” has some examples of this, and I recommend just looking at the pictures. Appreciate their aesthetic, particularly the ones immediately after the sentence about “Commutative diagrams are usually composed by commutative triangles and commutative squares”.

Whether these mappings commute can have meaning. This takes us, maybe inevitably, to quantum mechanics. Mathematically, this represents systems as either a wave function or a matrix, whichever is more convenient. We can use this to find the distribution of positions or momentums or energies or anything else we would like to know. Distributions are as much as we can hope for from quantum mechanics. We can say what (eg) the position of something is most likely to be but not what it is. That’s all right.

The mathematics of finding these distributions is just applying an operator, taking a mapping, on this wave function or this matrix. Some pairs of these operators commute, like the ones that let us find momentum and find kinetic energy. Some do not, like those to find position and momentum.

We can describe how much two operators do or don’t commute. This is through a thing called the “commutator”. Its form looks almost playfully simple. Call the operators ‘f’ and ‘g’. And that by ‘fg’ we mean, “do g, then do f”. (This seems awkward. But if you think of ‘fg’ as ‘f(g(x))’, where ‘x’ is just something in the domain of g, then this seems less awkward.) The commutator of ‘f’ and ‘g’ is then whatever ‘fg – gf’ is. If it’s always zero, then ‘f’ and ‘g’ commute. If it’s ever not zero, then they don’t.
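
Here’s that commutator worked out, with two small matrices standing in for the operators. This is my own toy example, not any particular pair of physical observables:

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(f, g):
    """fg - gf, entry by entry."""
    fg, gf = mat_mul(f, g), mat_mul(g, f)
    return [[fg[i][j] - gf[i][j] for j in range(2)] for i in range(2)]

f = [[0, 1],
     [1, 0]]
g = [[1, 0],
     [0, -1]]
print(commutator(f, g))   # [[0, -2], [2, 0]], not zero: f and g don't commute

h = [[2, 0],
     [0, 2]]   # a multiple of the identity commutes with everything
print(commutator(f, h))   # [[0, 0], [0, 0]]
```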

This is easy to understand physically. Imagine starting from a point on the surface of the earth. Travel south one mile and then west one mile. You are at a different spot than you would be, had you instead travelled west one mile and then south one mile. How different? That’s the commutator. It’s obviously zero, for just multiplying some regular old numbers together. It’s sometimes zero, for these paths on the Earth’s surface. It’s never zero, for finding-the-position and finding-the-momentum. The amount by which that’s never zero we can see as the famous Uncertainty Principle, the limits of what kinds of information we can know about the world.

Still, it is a hard subject to describe. Things which commute are so familiar that it takes work to imagine them not commuting. (How could three times four equal anything but four times three?) Things which do not commute either obviously shouldn’t (add hot water to the instant oatmeal, and eat it), or are unfamiliar enough people need to stop and think about them. (Rotating something in one direction and then another, in three dimensions, generally doesn’t commute. But I wouldn’t fault you for testing this out with a couple objects on hand before being sure about it.) But it can be noticed, once you know to explore.

My 2018 Mathematics A To Z: Box-And-Whisker Plot


Today’s A To Z term is another from Iva Sallay, Find The Factors blog creator and, as with asymptote, friend of the blog. Thank you for it.


Box-And-Whisker Plot.

People can’t remember many things at once. This has effects. Some of them are obvious. Like, how a phone number, back in the days you might have to memorize them, wouldn’t be more than about seven or eight digits. Some are subtle, such as that we have descriptive statistics. We have descriptive statistics because we want to understand collections of a lot of data. But we can’t understand all the data. We have to simplify it. From this we get many numbers, based on data, that try to represent it. Means. Medians. Variance. Quartiles. All these.

And it’s not enough. We try to understand data further by visualization. Usually this is literal, making pictures that represent data. Now and then somebody visualizes data by something slick, like turning it into an audio recording. (Somewhere here I have an early-60s album turning 18 months of solar radio measurements into something music-like.) But that’s rare, and usually more of an artistic statement. Mostly it’s pictures. Sighted people learn much of the world from the experience of seeing it and moving around it. Visualization turns arithmetic into geometry. We can support our sense of number with our sense of space.

Many of the ways we visualize data came from the same person. William Playfair set out the rules for line charts and area charts and bar charts and pie charts and circle graphs. Florence Nightingale used many of them in her reports on medical care in the Crimean War. And this made them public and familiar enough that we still use them.

Box-and-whisker plots are not among them. I’m startled too. Playfair had a great talent for these sorts of visualizations. That he missed this is a reminder to us all. There are great, simple ideas still available for us to discover.

At least for the brilliant among us to discover. Box-and-whisker plots were introduced in 1969. I’m surprised it’s that recent. John Tukey developed them. Computer scientists remember Tukey’s name; he coined the term ‘bit’, as in the element of computer memory. They also remember he was an early user, if not the coiner, of the term ‘software’. Mathematicians know Tukey’s name too. He and James Cooley developed the Fast Fourier Transform. The Fast Fourier Transform appears on every list of the Most Important Algorithms of the 20th Century. Sometimes the Most Important Algorithms of All Time. The Fourier Transform is this great thing. It’s a way of finding patterns in messy, complicated data. It’s hard to calculate, though. Cooley and Tukey, though, found that the calculations you have to do can be made simpler, and much quicker. (In certain conditions. Mostly depending on how the data’s gathered. Fortunately, computers encourage gathering data in ways that make the Fast Fourier Transform possible. And then go and calculate it nice and fast.)

Box-and-whisker plots are a way to visualize sets of data. Too many data points to look at all at once, not without getting confused. They extract a couple bits of information about the distribution. Distributions say what ranges a data point, picked at random, is likely to be in, and is unlikely to be in. Distributions can be good things to look at. They let you know what typical experiences of a thing are likely to be. And they’re stable. A handful of weird fluke events don’t change them much. If you have a lot of fluke events, that changes the distribution. But if you have a lot of fluke events, they’re not flukes. They’re just events.

Box-and-whisker plots start from the median. This is the second of the three things commonly called “average”. It’s the data point that half the remaining data is less than, and half the remaining data is greater than. It’s a nice number to know. Start your box-and-whisker plot with a short line, horizontal or vertical as fits your worksheet, and labelled with that median.

Around this line we’ll draw a box. It’ll be as wide as the line you made for the median. But how tall should it be?

That is, normally, based on the first and third quartiles. These are data points like the median. The first quartile has one-quarter the data points less than it, and three-quarters the data points more than it. The third quartile has three-quarters the data points less than it, and one-quarter the data points more than it. (And now you might ask if we can’t call the median the “second quartile”. We sure can. And will if we want to think about how the quartiles relate to each other.) Between the first and the third quartile are half of all the data points. The first and the third quartiles are the boundaries of your box. They’re where the edges of the rectangle are.

That’s the box. What are the whiskers?

Well, they’re vertical lines. Or horizontal lines. Whatever’s perpendicular to how you started. They start at the quartile lines. Should they go to the maximum or minimum data points?

Maybe. Maximum and minimum data are neat, yes. But they’re also suspect. They’re extremes. They’re not quite reliable. If you went back to the same source of data, and collected it again, you’d get about the same median, and the same first and third quartile. You’d get different minimums and maximums, though. Often crazily different. Still, if you want to understand the data you did get, it’s hard to ignore that this is the data you have. So one choice for representing these is to just use the maximum and minimum points. Draw the whiskers out to the maximum and minimum, and then add a little cross bar or a circle at the end. This makes clear you meant the line to end there, rather than that your ink ran out. (Making a figure safe against misprinting is one of the understated essentials of good visualization.)

But again, the very highest and lowest data may be flukes. So we could look at other, more stable endpoints for the whiskers. The point of this is to show the range of what we believe most data points are. There are different ways to do this. There’s not one that’s always right. It’s important, when showing a box-and-whisker plot, to explain how far out the whiskers go.

Tukey’s original idea, for example, was to extend the whiskers based on the interquartile range. This is the difference between the third quartile and the first quartile. Like, just subtraction. Find a number that’s one-and-a-half times the interquartile range above the third quartile. The upper whisker goes to the data point that’s closest to that boundary without going over. This might well be the maximum already. The other number is the one that’s the first quartile minus one-and-a-half times the interquartile range. The lower whisker goes to the data point that’s closest to that boundary without falling underneath it. And this might be the minimum. It depends how the data’s distributed. The upper whisker and the lower whisker aren’t guaranteed to be the same lengths. If there are data outside these whisker ranges, mark them with dots or x’s or something else easy to spot. There’ll typically be only a few of these.

But you can use other rules too. Again as long as you are clear about what they represent. The whiskers might go out, for example, to particular percentiles. Or might reach out a certain number of standard deviations from the mean.

The point of doing this box-and-whisker plot is to show where half the data are. That’s inside the box. And where the rest of the non-fluke data is. That’s the whiskers. And the flukes, those are the odd little dots left outside the whiskers. And it doesn’t take any deep calculations. You need to sort the data in ascending order. You need to count how many data points there are, to find the median and the first and third quartiles. (You might have to do addition and division. If you have, for example, twelve distinct data points, then the median is the arithmetic mean of the sixth and seventh values. The first quartile is the arithmetic mean of the third and fourth values. The third quartile is the arithmetic mean of the ninth and tenth values.) You (might) need to subtract, to find the interquartile range. And multiply that by one and a half, and add or subtract that from the quartiles.
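
As a sketch of all that arithmetic, here’s the whole computation in Python. The data is my own made-up example, and be warned that quartile conventions differ slightly between textbooks and software libraries:

```python
import statistics

data = sorted([2, 3, 5, 5, 6, 7, 8, 9, 10, 12, 13, 40])   # 40 is our fluke

q1, median, q3 = statistics.quantiles(data, n=4)   # the three quartiles
iqr = q3 - q1                                      # interquartile range

upper_fence = q3 + 1.5 * iqr
lower_fence = q1 - 1.5 * iqr

# Whiskers reach to the most extreme data points inside the fences.
upper_whisker = max(x for x in data if x <= upper_fence)
lower_whisker = min(x for x in data if x >= lower_fence)
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, median, q3)                  # the box: 5.0, 7.5, 11.5
print(lower_whisker, upper_whisker)    # the whiskers: 2 and 13
print(outliers)                        # dots past the whiskers: [40]
```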

This shows you what are likely and what are improbable values. They give you a cruder picture than, say, the standard deviation and the coefficient of variation do. But they need no hard calculations. None of what you need for box-and-whisker plots is computationally intensive. Heck, none of what you need is hard. You knew everything you needed to find these numbers by fourth grade. And yet they tell you about the distribution. You can compare whether two sets of data are similar by eye. Telling whether sets of data are similar becomes telling whether two shapes look about the same. It’s brilliant to represent so much from such simple work.

My 2018 Mathematics A To Z: Asymptote


Welcome, all, to the start of my 2018 Mathematics A To Z. Twice each week for the rest of the year I hope to have a short essay explaining a term from mathematics. These are fun and exciting for me to do, since I mostly take requests for the words, and I always think I’m going to be farther ahead of deadline than I actually am.

Today’s word comes from longtime friend of my blog Iva Sallay, whose Find the Factors page offers a nice daily recreational logic puzzle. Also trivia about each whole number, in turn.


Asymptote.

You know how everything feels messy and complicated right now? But you also feel that, at least in the distant past, things were simpler and easier to understand? And how you hope that, sometime in the future, all our current woes will have faded and things will be simple again? Hold that thought.

There is no one thing that every mathematician does, apart from insist to friends that they can’t do arithmetic well. But there are things many mathematicians do. One of those is to work with functions. A function is this abstract concept. It’s a triplet of things. One is a domain, a set of things that we draw the independent variables from. One is a range, a set of things that we draw the dependent variables from. And the last thing is a rule, something that matches each thing in the domain to one thing in the range.

The domain and range can be the same thing. They’re often things like “the real numbers”. They don’t have to be. The rule can be almost anything. It can be simple. It can be complicated. Usually, if it’s interesting, there’s at least something complicated about it.

The asymptote, then, is an expression of our hope that we have to work with something that’s truly simple, but has some temporary complicated stuff messing it up just now. Outside some local embarrassment, our function is close enough to this simpler asymptote. The past and the future are these simpler things. It’s only the present, the local area, that’s messy and confusing.

We can make this precise. Start off with some function we both agree is interesting. Reach deep into the imagination to call it ‘f’. Suppose that there is an asymptote. That’s also a function, with the same domain and range as ‘f’. Let me call it ‘g’, because that’s a letter very near ‘f’.

You give me some tolerance for error. This number mathematicians usually call ‘ε’. We usually think of it as a small thing. But all we need is that it’s larger than zero. Anyway, you give me that ε. Then I can give you, for that ε, some bounded region in the domain. Everywhere outside that region, the difference between ‘f’ and ‘g’ is smaller than ε. That is, our complicated original function ‘f’ and the asymptote ‘g’ are indistinguishable enough. At least everywhere except this little patch of the domain. There’s different regions for different ε values, unless something weird is going on. The smaller the ε, the bigger the region of exceptions. But if the domain is something like the real numbers, well, big deal. Our function and our asymptote are indistinguishable roughly everywhere.
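
To make that ε game concrete, take a function simple enough to see through: f(x) = x + 1/x, whose asymptote is g(x) = x. Their difference is exactly 1/x, so for any ε you hand me, the region of exceptions is just where |x| is less than 1/ε. A quick sketch of my own:

```python
def f(x):
    return x + 1 / x   # the complicated function

def g(x):
    return x           # its asymptote

epsilon = 0.01
boundary = 1 / epsilon   # outside |x| > 100, f and g differ by less than 0.01

for x in (2 * boundary, -2 * boundary, 10 * boundary):
    print(x, abs(f(x) - g(x)) < epsilon)   # True every time
```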

If there is an asymptote. We’re not guaranteed there is one. But if there is, we know some nice things. We know what our function looks like, at least outside this local range of extra complication. If the domain represents something like time or space, and it often does, then the asymptote represents the big picture. What things look like in deep time. What things look like globally. When studying a function we can divide it into the easy part of the asymptote and the local part that’s “function minus the asymptote”.

Usually we meet asymptotes in high school algebra. They’re a pair of crossed lines that hang around hyperbolas. They help you sketch out the hyperbola. Find equations for the asymptotes. Draw these crossed lines. Figure whether the hyperbola should go above-and-below or left-and-right of the crossed lines. Draw arcs accordingly. Then match them up to the crossed lines. Asymptotes don’t seem to do much else there. A parabola, the other exotic shape you meet about the same time, doesn’t have any asymptote that’s any simpler than itself. A circle or an ellipse, which you met before but now have equations to deal with, doesn’t have an asymptote at all. They aren’t big enough to have any. So at first introduction asymptotes seem like a lot of mechanism for a slight problem. We don’t need accurate hand-drawn graphs of hyperbolas that much.

In more complicated mathematics they get useful again. In dynamical systems we look at descriptions of how something behaves in time. Often its behavior will have an asymptote. Not always, but it’s nice to see when it does. When we study operations, how long it takes to do a task, we see asymptotes all over the place. How long it takes to perform a task depends on how big a problem it is we’re trying to solve. The relationship between how big the thing is and how long it takes to do is some function. The asymptote appears when thinking about solving huge examples of the problem. What rule most dominates how hard the biggest problems are? That’s the asymptote, in this case.

Not everything has an asymptote. Some functions are always as complicated as they started. Oscillations, for example, if they don’t dampen out. A sine wave isn’t complicated. Not if you’re the kind of person who’ll write things like “a sine wave isn’t complicated”. But if the size of the oscillations doesn’t decrease, then there can’t be an asymptote. Functions might be chaotic, with values that vary along some truly complicated system, and so never have an asymptote.

But often we can find a simpler function that looks enough like the function we care about. Everywhere except some local little embarrassment. We can enjoy the promise that things were understandable at one point, and maybe will be again.

I’m Still Looking For Fun Mathematics And Words


I’m hoping to get my 2018 Mathematics A To Z started the last week of September, which among other things will let me end it in 2018 if I haven’t been counting wrong. We’ll see. If you’ve got requests for the first several letters in the alphabet, there’s still open slots. I’ll be opening up the next quarter of the alphabet soon, too.

And also set for the last week of September — boy, I’m glad I am not going to have any doubts or regrets about how I’m scheduling my time for two weeks hence — is the Playful Mathematics Education Carnival. This project, overseen by Denise Gaskins, tries to bring a bundle of fun stuff about mathematics to different blogs. Iva Sallay’s turn, the end of August, is up here. Have you spotted something mathematical that’s made you smile? Please let me know. I’d love to share it with the world.

I’m Looking For Topics For My Fall 2018 Mathematics A-To-Z


So I have given up on waiting for a moment when my schedule looks easier. I’m going to plunge in and make it all hard again. Thus I announce, to start in about a month, my Fall 2018 Mathematics A To Z.

This is something I’ve done once or twice the last few years. The idea is easy: I take one mathematical term for each letter of the alphabet and explain it. The last several rounds I’ve gotten the words from you, kind readers who would love watching me trying to explain something in a field of mathematics I only just learned anything about. It’s great fun. If you do any kind of explanatory blog I recommend the format.

I do mean to do things a little different this time. First, and most visibly, I’m only going to post two essays a week. In past years I’ve done three, and that’s a great pace. It’s left me sometimes with that special month where I have a fresh posting every single day of the month. It’s also a crushing schedule, at least for me. Especially since I’ve been writing longer and longer, both here and on my humor blog. Two’s my limit and I reserve the right to skip a week when I need to skip a week.

Second. I’m going to open for requests only a few letters at a time. In the past I’ve ended up lost when, for example, my submit-your-requests post ends up being seven weeks back and hard to find under all my notifications. This should help me better match up my requests, my writing pace, and my deadlines. It will not.

Also, in the past I’ve always done first-come, first-serve. I’m still inclined toward that. But I’m going to declare that if I come in and check my declarations some morning and find several requests for the same letter, I may decide to go with the word that most captures my imagination. Probably I won’t have the nerve. But I’d like to think I have. I might do some supplementals after the string is done, too. We’ll see what I feel up to. Doing a whole run is exhilarating but exhausting.


So. Now I’d like to declare myself open for the letters ‘A’ through ‘F’. In past A to Z’s I’ve already given these words, so probably won’t want to revisit them. (Though there are some that I think, mm, I could do better now.)

[Excerpted word lists from the Summer 2015, Leap Day 2016, and Summer 2017 A To Z sequences appeared here.]


And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

Available Letters for the Fall 2018 A To Z:

  • A (taken)
  • B (taken)
  • C (taken)
  • D
  • E (taken)
  • F (taken)

Oh, I need to commission some header art from Thomas K Dye, creator of the web comic Newshounds, for this. Also for another project that’ll help my September get a little more overloaded.
