## What I Learned Doing My 2018 Mathematics A To Z

I have a tradition at the end of an A To Z sequence of looking back and considering what I learned doing it. Sometimes this is mathematics I’ve learned. At the risk of spoiling the magic, I don’t know as much as I present myself as knowing. I’ll often take an essay topic and study up before writing, and hope that I look competent enough that nobody seriously questions me. Yes, I thought I was a pretty good student journalist back in the day, and harbored fantasies of doing that for a career. This before I went on to fantasize about doing mathematics. Still; I also learn things about writing in doing a big writing project like this. And now I’ve had some breathing space to sit and think. I can try finding out what I thought.

The major difference between this A To Z and those of past years was scheduling. Through 2017 I’d done A To Z’s posted three days a week. This is a thrilling schedule. It makes it easy to have full weeks, even full months, with some original posting every day. I can tell myself that the number of posts doesn’t matter. It’s the quality that does. It doesn’t work. I tend toward compulsive behavior and a-post-a-day is so gratifying.

But I also knew that the last quarter of 2018 would be busy and I had to cut something down. Some of that is Reading the Comics posts. Including pictures of each comic I discuss adds considerably to the production time. This is not least because I feel I can’t reasonably claim Fair Use of the comics without writing something of more substance. Moving files around and writing alt-text for the images also takes time. But the time can be worth it. Doing that has sometimes made me think longer, or even better, about the comics.

So switching to two-a-week seemed the thing to do. It would spread the A To Z over three months, but that’s not bad. I figured to prove out whether that schedule worked. If I could find one that let me do possibly several A To Z sequences in a year that’d be wonderful too. I’ve only done that once, so far, in 2016, but I think the exercise is always good if exhausting for me.

This completely failed to save time. I had fewer things to write, but somehow, I only wrote more. All my writing’s getting longer as I go on, yes. Last year my average post exceeded a thousand words, though, and that counts Reading the Comics and pointers to things I’ve been reading and all that. There’s a similar steady expansion going on in my humor blog. Possibly having more time between essays encouraged me to write longer for each. Work famously expands to fit the time available, and having as many as three full days to write, rather than one, might be dangerous. For the first time in an A To Z I never got ahead of myself. I would, at best, be researching and making notes for the next essay while waiting for the current one to post. There’s value in dangerous writing. But I don’t like that as a habit.

Another scheduling change was in how I took topic suggestions. In the past I’d thrown the whole alphabet open at once. This time I broke the alphabet into a couple of pieces, and asked for about one-quarter at a time. This, overall, worked. For one, it gave me more chances to talk about the A To Z. Talking about something is one of the non-annoying ways to advertise a thing. And I think it helped a greater variety of people suggest topics. I did have more collisions this time around, letters for which several people suggested different ideas. That’s a happy situation to have. Thinking of what to write is the hard part; going on about a topic someone else named? That’s easy. So I’ll certainly keep that.

I did write some about maybe doing supplementary pieces, based on topics I didn’t use for the main line. Might yet do that, perhaps under rules where I do one a week, or limit myself to 700 words, or something like that. It might be worth doing a couple just to have a buffer against weeks when there are no comic strips worth discussing. Or to head off gaps next time around, although that would spoil some letters for people.

Also I completely ran out of ‘X’ topics, and went with the 90s alternative of “extreme”. There are plenty of “extreme” things in mathematics I could write about. But that feels a bit chintzy to do too often.

This time around I changed focus on many of my essays. It wasn’t a conscious thing, not to start with. But I got to writing more about the meaning and significance and cultural import of topics, rather than definitions or descriptions of the use of things. This is a natural direction to go for a topic like Fermat’s Last Theorem, or the Infinite Monkey Theorem, or mathematics jokes. I liked the way those pieces turned out, though, and tried doing more of it. This likely helped the essays grow so long. Context demands space, after all. And more thinking. Thinking’s the hard part of writing, but it’s also fun, because when you’re thinking about a subject you aren’t typing any specific words.

But it’s probably a worthwhile shift. For a pop mathematics blog to describe what makes something a “smooth” function is all right. But it’s not a story unless it says why we should care. That’s more about context than about definitions, which anyone could get by typing ‘mathworld smooth’ into DuckDuckGo. For all the trouble this causes me, it’s the way to go.

Every good lesson carries its opposite along, though. One of the requests this time around was about Lord Kelvin. There’s no end of things you can write about him: he did important work in basically every field of science and mathematics as the 19th century knew it. It’s easy to start writing about his work and never stop. I did the opposite, taking one tiny and often-overlooked piece and focusing on that. I’m not sure it alone would convince anyone of Kelvin’s exhaustive greatness. But I don’t imagine anyone interested in reading a single essay on Kelvin would stop at just the one. It seems to me a couple narrow-focus essays help in that context. Seeing more of one detail gives scale to the big picture.

I’ve done just the one A To Z the last two years. There’s surely an optimal rate for doing these. The sequences are usually good for my readership. My experience, tracking monthly readership figures, suggests that just posting more often is good for my readership. They’re also the thing I write that most directly solicits reader responses. They’re also exhausting. The last several letters are always a challenge. The weeks after a sequence is completed I collapse into a little recuperative bubble. So I want to do these as much as I can without burning out on the idea. Also without overloading Thomas Dye, who’s been so good as to make the snappy banners for these pieces. He has his own projects, including the web comic Projection Edge, to worry about. More than once a year is probably sustainable. I may also want to stack this with hosting the Playful Mathematics Education Blog Carnival again, if I’m able to this year.

Deep down, though, I think the best moment of my Fall 2018 A To Z might have been in the first essay. I wrote about asymptotes and realized I could put in ordinary words why they were a thing worth having. If I could have three insights like that a year I’d be a great mathematics writer.

I put a roster of things written up in this A To Z at this page. The Summer 2015 A To Z essays should be here. The Leap Day 2016 A To Z essays are at this link. The End 2016 A To Z essays are here. Those from the Summer 2017 A To Z sequence are at this link. And I should keep using the A-To-Z tag, so all of these, and any future A To Z essays, should appear at this link. Thank you for reading along.

## What I Wrote About in My 2018 Mathematics A To Z

I have reached the end! Thirteen weeks at two essays per week to describe a neat sampling of mathematics. I hope to write a few words about what I learned by doing all this. In the meanwhile, though, I want to gather together the list of all the essays I did put into this project.

## My 2018 Mathematics A To Z: Zugzwang

My final glossary term for this year’s A To Z sequence was suggested by aajohannas, who’d also suggested “randomness” and “tiling”. I don’t know of any blogs or other projects they’re behind, but if I do hear, I’ll pass them on.

# Zugzwang.

Some areas of mathematics struggle against the question, “So what is this useful for?” As though usefulness were a particular merit — or demerit — for a field of human study. Most mathematics fields discover some use, though, even if it takes centuries. Others are born useful. Probability, for example. Statistics. Know what the fields are and you know why they’re valuable.

Game theory is another of these. The subject, as often happens, we can trace back centuries. Usually as the study of some particular game. Occasionally in the study of some political science problem. But game theory developed a particular identity in the early 20th century. Some of this from set theory experts. Some from probability experts. Some from John von Neumann, because it was the 20th century and all that. Calling it “game theory” explains why anyone might like to study it. Who doesn’t like playing games? Who, studying a game, doesn’t want to play it better?

But why it might be interesting is different from why it might be important. Think of what a game is. It is a string of choices made by one or more parties. The point of the choices is to achieve some goal. Put that way you realize: this is everything. All life is making choices, all in the pursuit of some goal, even if that goal is just “not end up any worse off”. I don’t know that the earliest researchers in game theory as a field realized what a powerful subject they had touched on. But by the 1950s they were doing serious work in strategic planning, and by 1964 were even giving us Stanley Kubrick movies.

This is taking me away from my glossary term. The field of games is enormous. If we narrow the field some we can discuss specific kinds of games. And say more involved things about these games. So first we’ll limit things by thinking only of sequential games. These are ones where there are a set number of players, and they take turns making choices. I’m not sure whether the field expects the order of play to be the same every time. My understanding is that much of the focus is on two-player games. What’s important is that at any one step there’s only one party making a choice.

The other thing narrowing the field is to think of information. There are many things that can affect the state of the game. Some of them might be obvious, like where the pieces are on the game board. Or how much money a player has. We’re used to that. But there can be hidden information. A player might conceal some game money so as to make other players underestimate her resources. Many card games have one or more cards concealed from the other players. There can be information unknown to any party. No one can make a useful prediction of what the next throw of the game dice will be. Or what the next event card will be.

But there are games where there’s none of this ambiguity. These are called games with “perfect information”. In them all the players know the past moves every player has made. Or at least should know them. Players are allowed to forget what they ought to know.

There’s a separate but similar-sounding idea called “complete information”. In a game with complete information, players know everything that affects the gameplay. At least, probably, apart from what their opponents intend to do. This might sound like an impossibly high standard, at first. All games with shuffled decks of cards and with dice to roll are out. There’s no concealing or lying about the state of affairs.

Set complete-information aside; we don’t need it here. Think only of perfect-information games. What are they? Some ancient games, certainly. Tic-tac-toe, for example. Some more modern versions, like Connect Four and its variations. Some that are actually deep, like checkers and chess and go. Some that are, arguably, more puzzles than games, as in sudoku. Some that hardly seem like games, like several people agreeing how to cut a cake fairly. Some that seem like tests to prove people are fundamentally stupid, like when you auction off a dollar. (The rules are set so players can easily end up paying more than a dollar.) But that’s enough for me, at least. You can see there are games of clear, tangible interest here.

The last restriction: think only of two-player games. Or at least two parties. Any of these two-party sequential games with perfect information are a part of “combinatorial game theory”. It doesn’t usually allow for incomplete-information games. But at least the MathWorld glossary doesn’t demand they be ruled out. So I will defer to this authority. I’m not sure how the name “combinatorial” got attached to this kind of game. My guess is that it seems like you should be able to list all the possible combinations of legal moves. That number may be enormous, as chess and go players are always going on about. But you could imagine a vast book which lists every possible game. If your friend ever challenged you to a game of chess the two of you could simply agree, oh, you’ll play game number 2,038,940,949,172 and then look up to see who won. Quite the time-saver.

Most games don’t have such a book, though. Players have to act on what they understand of the current state, and what they think the other player will do. This is where we get strategies from. Not just what we plan to do, but what we imagine the other party plans to do. When working out a strategy we often expect the other party to play perfectly. That is, to make no mistakes, to not do anything that worsens their position. Or that reduces their chance of winning.

… And yes, arguably, the word “chance” doesn’t belong there. These are games where the rules are known, every past move is known, every future move is in principle computable. And if we suppose everyone is making the best possible move then we can imagine forecasting the whole future of the game. One player has a “chance” of winning in the same way Christmas day of the year 2038 has a “chance” of being on a Tuesday. That is, the probability is just an expression of our ignorance, that we don’t happen to be able to look it up.

But what choice do we have? I’ve never seen a reference that lists all the possible games of tic-tac-toe. And that’s about the simplest combinatorial-game-theory game anyone might actually play. What’s possible is to look at the current state of the game. And evaluate which player seems to be closer to her goal. And then look at all the possible moves.

There are three things a move can do. It can put the party closer to the goal. It can put the party farther from the goal. Or it can do neither. On her turn the other party might do something that moves you farther from your goal, moves you closer to your goal, or doesn’t affect your status at all. It seems like this makes strategy obvious. On every step take the available move that takes one closest to the goal. This is known as a “greedy” strategy. As the name suggests it isn’t automatically bad. If you expect the game to be a short one, greed might be the best approach. The catch is that moves that seem less good — even ones that seem to hurt you initially — might set up other, even better moves. So strategy requires some thinking beyond the current step. Properly, it requires thinking through to the end of the game. Or at least until the end of the game seems obvious.
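
If you’d like to see “thinking through to the end of the game” in miniature, here’s a sketch in Python. The game is a made-up toy, not anything from the literature: players alternately remove one, two, or three counters from a pile, and whoever cannot move loses. The code labels every pile size by whether the player forced to move can win, exhaustively checking each line of play to its end.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # a toy game: remove 1, 2, or 3 counters; no passing

@lru_cache(maxsize=None)
def mover_wins(pile):
    """True if the player compelled to move can win with perfect play."""
    # A position is winning exactly when some move lands the opponent
    # in a losing position. With no legal move at all, the mover loses.
    return any(pile >= m and not mover_wins(pile - m) for m in MOVES)

print([n for n in range(13) if not mover_wins(n)])  # [0, 4, 8, 12]
```

The pile sizes it prints are the ones where every available move only helps the opponent. That predicament is exactly where this essay is heading.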

We should like a strategy that leaves us no choice but to win. Next-best would be one that leaves the game undecided, since something might happen like the other player needing to catch a bus and so resigning. This is how I got my solitary win in the two months I spent in the college chess club. Worst would be the games that leave us no choice but to lose.

It can be that there are no good moves. That is, that every move available makes it a little less likely that we win. Sometimes a game offers the chance to pass, preserving the state of the game but giving the other party the turn. Then maybe the other party will do something that creates a better opportunity for us. But if we are allowed to pass, there’s a good chance the game lets the other party pass, too, and we end up in the same fix. And it may be the rules of the game don’t allow passing anyway. One must move.

The phenomenon of having to make a move when it’s impossible to make a good move has prominence in chess. I don’t have the chess knowledge to say how common the situation is. But it seems to be a situation people who study chess problems love. I suppose it appeals to a love of lost causes and the hope that you can be brilliant enough to see what everyone else has overlooked. German chess literature gave it a name 160 years ago, “zugzwang”, “compulsion to move”. Somehow I never encountered the term when I was briefly a college chess player. Perhaps because I was never in zugzwang and was just too incompetent a player to find my good moves. I first encountered the term in Michael Chabon’s The Yiddish Policeman’s Union. The protagonist picked up on the term as he investigated the murder of a chess player and then felt himself in one.

Combinatorial game theorists have picked up the word, and sharpened its meaning. If I understand correctly chess players allow the term to be used for any case where a player hurts her position by moving at all. Game theorists make it more dire. This may reflect their knowledge that an optimal strategy might require taking some dismal steps along the way. The game theorist formally grants the term only to the situation where the compulsion to move changes what should be a win into a loss. This seems terrible, but then, we’ve all done this in play. We all feel terrible about it.

I’d like here to give examples. But in searching the web I can find only either courses in game theory, which are a bit too much for even me to summarize, or chess problems, which I’m not up to understanding. It seems hard to set out an example: I need to not just set out the game, but show that what had been a win is now, by any available move, turned into a loss. Chess is looser. It even allows, I discover, a double zugzwang, where both players are at a disadvantage if they have to move.

It’s a quite relatable problem. You see why game theory has this reputation as mathematics that touches all life.

And with that … I am done! All of the Fall 2018 Mathematics A To Z posts should be at this link. Next week I’ll post my big list of all the letters, though. And, as has become tradition, a post about what I learned by doing this project. And sometime before then I should have at least one more Reading the Comics post. Thanks kindly for reading and we’ll see when in 2019 I feel up to doing another of these.

## My 2018 Mathematics A To Z: Yamada Polynomial

I had another free choice. I thought I’d go back to one of the topics I knew and loved in grad school even though I didn’t have the time to properly study it then. It turned out I had forgotten some important points and spent a night crash-relearning knot theory. This isn’t a bad thing necessarily.

# Yamada Polynomial.

This is a thing which comes from graphs. Not the graphs you ever drew in algebra class. Graphs as in graph theory. These are figures made of spots called vertices. Pairs of vertices are connected by edges. There are many interesting things to study about these.

One path to take in understanding graphs is polynomials. Of course I would bring things back to polynomials. But there’s good reasons. These reasons come to graph theory by way of knot theory. That’s an interesting development since we usually learn graph theory before knot theory. But knot theory has the idea of representing these complicated shapes as polynomials.

There are a bunch of different polynomials for any given graph. The oldest kind, the Alexander Polynomial, J W Alexander developed in the 1920s. And that was about it until the 1980s, when suddenly everybody was coming up with good new polynomials. The definitions are different. They give polynomials that look different. Some are able to distinguish between a knot and the knot that’s its reflection across a mirror. Some, like the Alexander, aren’t. But they share some important traits. One is that they might not actually be, you know, polynomials. I mean, they’ll be the sum of numbers — whole numbers, even — times a variable raised to a power. The variable might be t, might be x. Might be something else, but it doesn’t matter. It’s a pure dummy variable. But the variable might be raised to a negative power, which isn’t really a polynomial. It might even be raised to, oh, one-half or three-halves, or minus nine-halves, or something like that. We can try saying this is “a polynomial in t-to-the-halves”. Mostly it’s because we don’t have a better name for it.

And going from a particular knot to a polynomial follows a pretty common procedure. At least it can, when you’re learning knot theory and feel a bit overwhelmed trying to prove stuff about “knot invariants” and “homologies” and all. Having a specific example can be such a comfort. You can work this out by an iterative process. Take a specific drawing of your knot. There’s places where the strands of the knot cross over one another. For each of those crossings you ponder some alternate cases where the strands cross over in a different way. And then you add together some coefficient times the polynomial of this new, different knot. The coefficient you get by the rules of whatever polynomial you’re making. The new, different knots are, usually, no more complicated than what you started with. They’re often simpler knots. This is what saves you from an eternity of work. You’re breaking the knot down into more but simpler knots. Just the fact of doing that can be satisfying enough. Eventually you get to something really simple, like a circle, and declare that’s some basic polynomial. Then there’s a fair bit of adding up coefficients and powers and all that. Tedious but not hard.

Knots are made from a continuous loop of … we’ll just call it thread. It can fold over itself many times. It has to, really, or it hasn’t got a chance of being more interesting than a circle. A graph is different. That there are vertices seems to change things. Less than you’d think, though. The thread of a knot can cross over and under itself. Edges of a graph can cross over and under other edges. This isn’t too different. We can also imagine replacing a spot where two edges cross, one over the other, with an intersection and a new vertex.

So we get to the Yamada polynomial by treating a graph an awful lot like we might treat a knot. Take the graph and split it up at each overlap. At each overlap we have something that looks, at least locally, kind of like an X. An upper left, upper right, lower left, and lower right intersection. The lower left connects to the upper right, and the upper left connects to the lower right. But these two edges don’t actually touch; one passes over the other. (By convention, the lower left going to the upper right is on top.)

There’s three alternate graphs. One has the upper left connected to the lower left, and the upper right connected to the lower right. This looks like replacing the X with a )( loop. The second alternate has the upper left connected to the upper right, and the lower left connected to the lower right. This looks like … well, that )( but rotated ninety degrees. I can’t do that without actually including a picture. The third alternate puts a vertex in the X. So now the upper left, upper right, lower left, and lower right all connect to the new vertex in the center.

Probably you’d agree that replacing the original X with a )( pattern, or its rotation, doesn’t make the graph any more complicated. And it might make the graph simpler. But adding that new vertex looks like trouble. It looks like it’s getting more complicated. We might get stuck in an infinite regression of more-complicated polynomials.

What saves us is the coefficient we’re multiplying the polynomials for these new graphs by. It’s called the “chromatic coefficient” and it reflects how many different colors you need to color in this graph. An edge needs to connect two different colors. And — what happens if an edge connects a vertex to itself? That is, the edge loops around back to where it started? That’s got a chromatic number of zero and the moment we get a single one of these loops anywhere in our graph we can stop calculating. We’re done with that branch of the calculations. This is what saves us.
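
I can’t responsibly sketch the full Yamada computation here; it needs real knot-diagram bookkeeping. But this break-into-simpler-graphs pattern, with a self-loop killing a branch outright, also drives the classic deletion-contraction recursion for a graph’s chromatic polynomial. A minimal sketch of that analogous recursion, assuming you have sympy around:

```python
from sympy import symbols, expand

k = symbols('k')

def chromatic(vertices, edges):
    """Chromatic polynomial of a multigraph by deletion-contraction.

    vertices: a set of labels; edges: a list of (u, v) pairs.
    """
    # A self-loop can never join two different colors, so this whole
    # branch of the recursion contributes zero. This is what saves us.
    if any(u == v for u, v in edges):
        return 0
    if not edges:
        # No edges left: every vertex may take any of the k colors.
        return k ** len(vertices)
    (u, v), rest = edges[0], edges[1:]
    deleted = chromatic(vertices, rest)  # drop the edge ...
    # ... or contract it: merge v into u, rerouting v's other edges.
    merged = [(u if a == v else a, u if b == v else b) for a, b in rest]
    contracted = chromatic(vertices - {v}, merged)
    return expand(deleted - contracted)

# A triangle should allow k*(k-1)*(k-2) colorings:
print(chromatic({1, 2, 3}, [(1, 2), (2, 3), (1, 3)]))  # k**3 - 3*k**2 + 2*k
```

The triangle’s answer factors as $k(k-1)(k-2)$, which is what you’d count by hand.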

There’s a catch. It’s a catch that knot polynomials have, too. This scheme writes a polynomial not just for a particular graph but for a particular way of rendering this graph. There’s always other ways to draw it. If nothing else you can always twirl an edge over itself, into a loop like you get when Christmas tree lights start tangling themselves up. But you can move the vertices to different places. You can have an edge go outside the rest of the figure instead of inside, that sort of thing. Starting from a different rendition of the shape gets you to a different polynomial.

Superficially different, anyway. What you get from two different renditions of the same graph are polynomials different by your dummy variable raised to a whole number. Also maybe a plus-or-minus sign. You can see a difference between, say, $t^{-1} - 2 + 3t$ (to make up an example) and $t - 2t^2 + 3t^3$. But you can see that second polynomial is just $t^2\left(t^{-1} - 2 + 3t\right)$. It’s some confounding factor times something that is distinctive to the graph.
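
That claim is easy to check mechanically. A quick sympy verification of the made-up example above:

```python
from sympy import symbols, simplify

t = symbols('t')
p1 = t**-1 - 2 + 3*t      # the polynomial from one rendition
p2 = t - 2*t**2 + 3*t**3  # the polynomial from another rendition
print(simplify(p2 / p1))  # t**2: they differ only by a power of t
```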

And that distinctive part, the thing that doesn’t change if you draw the graph differently? That’s the Yamada polynomial, at last. It’s a way to represent this collection of vertices and edges using only coefficients and exponents.

I would like to give an impressive roster of uses for these polynomials here. I’m afraid I have to let you down. There is the obvious use: if you suspect two graphs are really the same, despite how different they look, here’s a test. Calculate their Yamada polynomials and if they’re different, you know the graphs were different. It can be hard to tell. Get anything with more than, say, eight vertices and 24 edges in it and you’re not going to figure that out by sight.

I encountered the Yamada polynomial specifically as part of a textbook chapter about chemistry. It’s easy to imagine there should be great links between knots and graphs and the way that atoms bundle together into molecules. The shape of their structures describes what they will do. But I am not enough of a chemist to say how this description helps chemists understand molecules. It’s possible that it doesn’t: Yamada’s paper introducing the polynomial was published in 1989. My knot theory textbook might have brought it up because it looked exciting. There are trends and fashions in mathematical thought too. I don’t know what several more decades of work have done to the polynomial’s reputation. I’m glad to hear from people who know better.

There’s one more term in the Fall 2018 Mathematics A To Z to come. Will I get the article about it written before Friday? We’ll know on Saturday! At least I don’t have more Reading the Comics posts to write before Sunday.

## My 2018 Mathematics A To Z: Extreme Value Theorem

The letter ‘X’ is a problem. For all that the letter ‘x’ is important to mathematics there aren’t many mathematical terms starting with it. Mr Wu, mathematics tutor and author of the MathTuition88 blog, had a suggestion. Why not 90s it up a little and write about an Extreme theorem? I’m game.

The Extreme Value Theorem, which I chose to write about, is a fundamental bit of analysis. There is also a similarly-named but completely unrelated Extreme Value Theory. This exists in the world of statistics. That’s about outliers, and about how likely it is you’ll find an even more extreme outlier if you continue sampling. This is valuable in risk assessment: put another way, it’s the question of what neighborhoods you expect to flood based on how the river’s overflowed the last hundred years. Or be in a wildfire, or be hit by a major earthquake, or whatever. The more I think about it the more I realize that’s worth discussing too. Maybe in the new year, if I decide to do some A To Z extras.

# Extreme Value Theorem.

There are some mathematical theorems which defy intuition. You can encounter one and conclude that can’t be so. This can inspire one to study mathematics, to understand how it could be. Famously, the philosopher Thomas Hobbes encountered the Pythagorean Theorem and disbelieved it. He then fell into a controversial love with the subject. Some you can encounter, and study, and understand, and never come to believe. This would be the Banach-Tarski Paradox. It’s the realization that one can split a ball into as few as five pieces, and reassemble the pieces, and have two complete balls. They can even be wildly larger or smaller than the one you started with. It’s dazzling.

And then there are theorems that seem the opposite. Ones that seem so obvious, and so obviously true, that they hardly seem like mathematics. If they’re not axioms, they might as well be. The extreme value theorem is one of these.

It’s a theorem about functions. Here, functions that have a domain and a range that are both real numbers. Even more specifically, about continuous functions. “Continuous” is a tricky idea to make precise, but we don’t have to do it. A century of mathematicians worked out meanings that correspond pretty well to what you’d imagine it should mean. It means you can draw a graph representing the function without lifting the pen. (Do not attempt to use this definition at your thesis defense. I’m skipping a century’s worth of hard thinking about the subject.)

And it’s a theorem about “extreme” values. “Extreme” is a convenient word. It means “maximum or minimum”. We’re often interested in the greatest or least value of a function. Having a scheme to find the maximum is as good as having one to find a minimum. So there’s little point talking about them as separate things. But that forces us to use a bunch of syllables. Or to adopt a convention that “by maximum we always mean maximum or minimum”. We could say we mean that, but I’ll bet a good number of mathematicians, and 95% of mathematics students, would forget the “or minimum” within ten minutes. “Extreme”, then. It’s short and punchy and doesn’t commit us to a maximum or a minimum. It’s simply the most outstanding value we can find.

The Extreme Value Theorem doesn’t help us find them. It only proves to us there is an extreme to find. Particularly, it says that if a continuous function has a domain that’s a closed interval, then it has to have a maximum and a minimum. And it has to attain the maximum and the minimum at least once each. That is, something in the domain matches to the maximum. And something in the domain matches to the minimum. Could be multiple times, yes.
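
The theorem is about existence, not location. Still, a dense sample shows the promised extremes being attained. A small sketch, with a continuous function I made up for the occasion, on the closed interval from 0 to 3:

```python
import numpy as np

# A continuous function on the closed interval [0, 3].
f = lambda x: x**3 - 4 * x**2 + x + 6

# The Extreme Value Theorem promises a maximum and a minimum exist and
# are attained. It does not find them; a fine grid approximates them.
xs = np.linspace(0.0, 3.0, 100_001)
ys = f(xs)
print("max ~", ys.max(), "attained near x =", xs[ys.argmax()])
print("min ~", ys.min(), "attained near x =", xs[ys.argmin()])
```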

This might not seem like much of a theorem. Existence proofs rarely do. It’s a bias, I suppose. We like to think we’re out looking for solutions. So we suppose there’s a solution to find. Checking that there is an answer before we start looking? That seems excessive. Before heading to the airport we might check the flight wasn’t delayed. But we almost never check that there is still a Newark to fly to. I’m not sure, in working out problems, that we check it explicitly. We decide early on that we’re working with continuous functions and so we can try out the usual approaches. That we use the theorem becomes invisible.

And that’s sort of the history of this theorem. The Extreme Value Theorem, for example, is part of how we now prove Rolle’s Theorem. Rolle’s theorem is about functions continuous and differentiable on the interval from a to b. And functions that have the same value for a and for b. The conclusion is the function has got a local maximum or minimum in-between these. It’s the theorem depicted in that xkcd comic you maybe didn’t check out a few paragraphs ago. Rolle’s Theorem is named for Michel Rolle, who proved the theorem (for polynomials) in 1691. The Indian mathematician Bhaskara II, in the 12th century, is credited with stating the theorem too. The Extreme Value Theorem was proven around 1860. (There was an earlier proof, by Bernard Bolzano, whose name you’ll find all over talk about limits and functions and continuity and all. But that was unpublished until 1930. The proofs known about at the time were done by Karl Weierstrass. His is the other name you’ll find all over talk about limits and functions and continuity and all. Go on, now, guess who it was that proved the Extreme Value Theorem. And guess what theorem, bearing the name of two important 19th-century mathematicians, is at the core of proving that. You need at most two chances!) That is, mathematicians were comfortable using the theorem before it had a clear identity.

Once you know that it’s there, though, the Extreme Value Theorem’s a great one. It’s useful. Rolle’s Theorem I just went through. There’s also the quite similar Mean Value Theorem. This one is about functions continuous and differentiable on an interval. It tells us there’s at least one point where the derivative is equal to the mean slope of the function on that interval. This is another theorem that’s a quick proof once you have the Extreme Value Theorem. Or we can get more esoteric. There’s a technique known as Lagrange Multipliers. It’s a way to find where on a constrained surface a function is at its maximum or minimum. It’s a clever technique, one that I needed time to accept as a thing that could possibly work. And why should it work? Go ahead, guess what the centerpiece of at least one method of proving it is.
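
Of the theorems just mentioned, the Mean Value Theorem is the easiest to watch in action. A sketch, again with a made-up function, hunting numerically for the point the theorem promises:

```python
import numpy as np

# For f(x) = x**3 on [0, 2] the mean slope is (f(2) - f(0))/2 = 4.
# The Mean Value Theorem says somewhere the derivative 3x^2 equals it.
f = lambda x: x**3
a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)
xs = np.linspace(a, b, 1_000_001)
c = xs[np.abs(3 * xs**2 - slope).argmin()]
print("c ~", c, " 3c^2 =", 3 * c**2, " mean slope =", slope)
```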

Step back from calculus and into real analysis. That’s the study of why calculus works, and how real numbers work. The Extreme Value Theorem turns up again and again. Like, one technique for defining the integral itself is to approximate a function with a “stepwise” function. This is one that looks like a pixellated, rectangular approximation of the function. The definition depends on having a stepwise rectangular approximation that’s as close as you can get to a function while always staying less than it. And another stepwise rectangular approximation that’s as close as you can get while always staying greater than it.
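
Here’s what that pairing of stepwise approximations looks like for a function simple enough to check. Since $x^2$ increases on the interval, each strip’s least value sits at its left edge and its greatest value at its right edge:

```python
import numpy as np

# Lower and upper stepwise approximations to the integral of x**2
# over [0, 1]. The true value is 1/3.
f = lambda x: x**2
a, b, n = 0.0, 1.0, 1000
edges = np.linspace(a, b, n + 1)
width = (b - a) / n

lower = (f(edges[:-1]) * width).sum()  # least value on each strip
upper = (f(edges[1:]) * width).sum()   # greatest value on each strip
print(lower, "<= 1/3 <=", upper)
```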

And then other results. Often in real analysis we want to know about whether sets are closed and bounded. The Extreme Value Theorem has a neat corollary. Start with a continuous function with domain that’s a closed and bounded interval. Then, this theorem demonstrates, the range is also a closed and bounded interval. I know this sounds like a technical point. But it is the sort of technical point that makes life easier.

The Extreme Value Theorem even takes on meaning when we don’t look at real numbers. We can rewrite it in topological spaces. These are sets of points for which we have an idea of a “neighborhood” of points. We don’t demand that we know what distance is exactly, though. What had been a closed and bounded interval becomes a mathematical construct called a “compact set”. The idea of a continuous function changes into one about the pre-image of an open set being another open set. And there is still something recognizably the Extreme Value Theorem. It tells us about things called the supremum and infimum, which are slightly different from the maximum and minimum. Just enough to confuse the student taking real analysis the first time through.

Topological spaces are an abstracted concept. Real numbers are topological spaces, yes. But many other things also are. Neighborhoods and compact sets and open sets are also abstracted concepts. And so this theorem has its same quiet utility in these many spaces. It’s just there quietly supporting more challenging work.

And now I get to really relax: I already have a Reading the Comics post ready for tomorrow, and Sunday’s is partly written. Now I just have to find a mathematical term starting with ‘Y’ that’s interesting enough to write about.

## My 2018 Mathematics A To Z: Witch of Agnesi

Nobody had a suggested topic starting with ‘W’ for me! So I’ll take that as a free choice, and get lightly autobiographical.

# Witch of Agnesi.

I know I encountered the Witch of Agnesi while in middle school. Eighth grade, if I’m not mistaken. It was a footnote in a textbook. I don’t remember much of the textbook. What I mostly remember of the course was how much I did not fit with the teacher. The only relief from boredom that year was the month we had a substitute and the occasional interesting footnote.

It was in a chapter about graphing equations. That is, finding curves whose points have coordinates that satisfy some equation. In a bit of relief from lines and parabolas the footnote offered this:

$y = \frac{8a^3}{x^2 + 4a^2}$

In a weird tantalizing moment the footnote didn’t offer a picture. Or say what an ‘a’ was doing in there. In retrospect I recognize ‘a’ as a parameter, and that different values of it give different but related shapes. No hint what the ‘8’ or the ‘4’ were doing there. Nor why ‘a’ gets raised to the third power in the numerator or the second in the denominator. I did my best with the tools I had at the time. Picked a nice easy boring ‘a’. Picked out values of ‘x’ and found the corresponding ‘y’ which made the equation true, and tried connecting the dots. The result didn’t look anything like a witch. Nor a witch’s hat.

It was one of a handful of biographical notes in the book. These were a little attempt to add some historical context to mathematics. It wasn’t much. But it was an attempt to show that mathematics came from people. Including, here, from Maria Gaëtana Agnesi. She was, I’m certain, the only woman mentioned in the textbook I’ve otherwise completely forgotten.

We have few names of ancient mathematicians. Those we have are often compilers like Euclid whose fame obliterated the people whose work they explained. Or they’re like Pythagoras, credited with discoveries by people who obliterated their own identities. In later times we have the mathematics done by, mostly, people whose social positions gave them time to write mathematics results. So we see centuries where every mathematician is doing it as their side hustle to being a priest or lawyer or physician or combination of these. Women don’t get the chance to stand out here.

Today of course we can name many women who did, and do, mathematics. We can name Emmy Noether, Ada Lovelace, and Marie-Sophie Germain. Challenged to do a bit more, we can offer Florence Nightingale and Sofia Kovalevskaya. Well, and also Grace Hopper and Margaret Hamilton if we decide computer scientists count. Katherine Johnson looks likely to make that cut. But in any case none of these people are known for work understandable in a pre-algebra textbook. This must be why Agnesi earned a place in this book. She’s among the earliest women we can specifically credit with doing noteworthy mathematics. (Also physics, but that’s off point for me.) Her curve might be a little advanced for that textbook’s intended audience. But it’s not far off, and pondering questions like “why $8a^3$? Why not $a^3$?” is more pleasant, to a certain personality, than pondering what a directrix might be and why we might use one.

The equation might be a lousy way to visualize the curve described. The curve is one of that group of interesting shapes you get by constructions. That is, following some novel process. Constructions are fun. They’re almost a craft project.

For this we start with a circle. And two parallel tangent lines. Without loss of generality, suppose they’re horizontal, so, there’s one line at the top and one at the bottom of the circle.

Take one of the two tangent points. Again without loss of generality, let’s say the bottom one. Draw a line from that point over to the other line. Anywhere on the other line. There’s a point where the line you drew intersects the circle. There’s another point where it intersects the other parallel line. We’ll find a new point by combining coordinates from these two points. The point is on the same horizontal as wherever your line intersects the circle. It’s on the same vertical as wherever your line intersects the other parallel line. This point is on the Witch of Agnesi curve.

Now draw another line. Again, starting from the lower tangent point and going up to the other parallel line. Again it intersects the circle somewhere. This gives another point on the Witch of Agnesi curve. Draw another line. Another intersection with the circle, another intersection with the opposite parallel line. Another point on the Witch of Agnesi curve. And so on. Keep doing this. When you’ve drawn all the lines that reach from the tangent point to the other line, you’ll have generated the full Witch of Agnesi curve. This takes more work than writing out $y = \frac{8a^3}{x^2 + 4a^2}$, yes. But it’s more fun. It makes for neat animations. And I think it prepares us to expect the shape of the curve.
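
If you’d rather not get the construction paper out, the construction is easy to simulate. A sketch that puts the bottom tangent point at the origin, takes a circle of radius a above it, draws lines at assorted angles, and checks the constructed points against that footnote equation:

```python
import numpy as np

a = 1.0  # the parameter: the generating circle's radius

# Lines from the bottom tangent point (the origin) up to the top
# tangent line y = 2a, parametrized by the angle they make.
theta = np.linspace(0.2, np.pi - 0.2, 9)

y = 2 * a * np.sin(theta)**2  # height where each line meets the circle
x = 2 * a / np.tan(theta)     # where each line meets the line y = 2a

# Each constructed point should satisfy y = 8a^3 / (x^2 + 4a^2):
print(np.allclose(y, 8 * a**3 / (x**2 + 4 * a**2)))  # True
```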

It’s a neat curve. Between it and the lower parallel line is an area four times that of the circle that generated it. The shape is one we would get from looking at the derivative of the arctangent. So there are some reasons someone working in calculus might find it interesting. And people did. Pierre de Fermat studied it, and found this area. Isaac Newton and Luigi Guido Grandi studied the shape, using this circle-and-parallel-lines construction. Maria Agnesi’s name attached to it after she published a calculus textbook which examined this curve. She showed, according to people who present themselves as having read her book, the curve and how to find it. And she showed its equation and found the vertex and asymptote line and the inflection points. The inflection points, here, are where the curve changes from cupping upward to cupping downward, or vice-versa.

It’s a neat function. It’s got some uses. It’s a natural smooth-hill shape, for example. So this makes a good generic landscape feature if you’re modeling the flow over a surface. I read that solitary waves can have this curve’s shape, too.

And the curve turns up as a probability distribution. Take a fixed point. Pick lines at random that pass through this point. See where those lines reach a separate, straight line. Some regions are more likely to be intersected than are others. Chart how often any particular point is the intersection. That chart will (given some assumptions I ask you to pretend you agree with) be a Witch of Agnesi curve. This might not surprise you. It seems inevitable from the circle-and-intersecting-line construction process. And that’s nice enough. As a distribution it looks like the usual Gaussian bell curve.

It’s different, though. And it’s different in strange ways. Like, for a probability distribution we can find an expected value. That’s … well, what it sounds like. But this is the strange probability distribution for which the law of large numbers does not work. Imagine an experiment that produces real numbers, with the frequency of each number given by this distribution. Run the experiment zillions of times. What’s the mean value of all the zillions of generated numbers? And it … doesn’t … have one. I mean, we know it ought to, it should be the center of that hill. But the calculations for that don’t work right. Taking a bigger sample makes the sample mean jump around more, not less, the way every other distribution should work. It’s a weird idea.
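
The failure is easy to watch in simulation. (This distribution goes by the name Cauchy distribution in probability texts, and numpy can sample it directly.) Running means of Gaussian draws settle down toward zero; running means of these draws never settle at all:

```python
import numpy as np

rng = np.random.default_rng(2018)
n = 1_000_000

for name, draws in (("gaussian", rng.standard_normal(n)),
                    ("cauchy", rng.standard_cauchy(n))):
    running = np.cumsum(draws) / np.arange(1, n + 1)
    # Sample means after a thousand, a hundred thousand, a million draws:
    print(name, running[[999, 99_999, 999_999]])
```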

Imagine carving a block of wood in the shape of this curve, with a horizontal lower bound and the Witch of Agnesi curve as the upper bound. Where would it balance? … The normal mathematical tools don’t say, even though the shape has an obvious line of symmetry. And a finite area. You don’t get this kind of weirdness with parabolas.

(Yes, you’ll get a balancing point if you actually carve a real one. This is because you work with finitely-long blocks of wood. Imagine you had a block of wood infinite in length. Then you would see some strange behavior.)

It teaches us more strange things, though. Consider interpolations, that is, taking a couple data points and fitting a curve to them. We usually start out looking for polynomials when we interpolate data points. This is because everything is polynomials. Toss in more data points. We need a higher-order polynomial, but we can usually fit all the given points. But sometimes polynomials won’t work. A problem called Runge’s Phenomenon can happen, where the more data points you have the worse your polynomial interpolation is. The Witch of Agnesi curve is one of those. Carl Runge discovered the problem by taking points on this curve and trying to fit polynomials to them. More data and higher-order polynomials make for worse interpolations. You get curves that look less and less like the original Witch. Runge is himself famous to mathematicians, known for “Runge-Kutta”. That’s a family of techniques to solve differential equations numerically. I don’t know whether Runge came to the weirdness of the Witch of Agnesi curve from considering how errors build in numerical integration. I can imagine it, though. The topics feel related to me.
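
You can watch Runge’s Phenomenon happen in a few lines, using the classic test function $\frac{1}{1 + 25x^2}$, a scaled Witch. Equally spaced data points, and ever-higher-degree polynomial fits, make the worst-case error grow rather than shrink:

```python
import numpy as np

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)  # a scaled Witch of Agnesi
dense = np.linspace(-1.0, 1.0, 2001)

for degree in (4, 8, 12, 16):
    nodes = np.linspace(-1.0, 1.0, degree + 1)  # equally spaced samples
    poly = np.polynomial.Polynomial.fit(nodes, runge(nodes), degree)
    worst = np.abs(poly(dense) - runge(dense)).max()
    print(degree, worst)  # the error grows as the degree does
```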

I understand how none of this could fit that textbook’s slender footnote. I’m not sure any of the really good parts of the Witch of Agnesi could even fit thematically in that textbook. At least beyond the fact of its interesting name, which any good blog about the curve will explain. That there was no picture, and that the equation was beyond what the textbook had been describing, made it a challenge. Maybe not seeing what the shape was teased the mathematician out of this bored student.

And next is ‘X’. Will I take Mr Wu’s suggestion and use that to describe something “extreme”? Or will I take another topic or suggestion? We’ll see on Friday, barring unpleasant surprises. Thanks for reading.

## My 2018 Mathematics A To Z: Volume

Ray Kassinger, of the popular web comic Housepets!, had a silly suggestion when I went looking for topics. In one episode of Mystery Science Theater 3000, Crow T Robot gets the idea that you could describe the size of a space by the number of turkeys which fill it. (It’s based on like two minor mentions of “turkeys” in the show they were watching.)

I liked that episode. I’ve got happy memories of the time when I first saw it. I thought the sketch in which Crow T Robot got so volume-obsessed was goofy and dumb in the fun-nerd way.

I accept Mr Kassinger’s challenge, only I’m going to take it seriously.

# Volume.

How big is a thing?

There is a legend about Thomas Edison. He was unimpressed with a new hire. So he hazed the college-trained engineer who deeply knew calculus. He demanded the engineer tell him the volume within a light bulb. The engineer went to work, making measurements of the shape of the bulb’s outside. And then started the calculations. This involves a calculus technique called “volumes of rotation”. This can tell the volume within a rotationally symmetric shape. It’s tedious, especially if the outer edge isn’t some special nice shape. Edison, fed up, took the bulb, filled it with water, poured that out into a graduated cylinder and said that was the answer.
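
For the curious, “volumes of rotation” amounts to slicing the solid into thin disks and adding their volumes up. A sketch, using a profile whose answer we can check (a sphere rather than a light bulb, I admit):

```python
import numpy as np

# Disk method: rotating r(x) = sqrt(1 - x^2) about the x-axis sweeps
# out the unit ball. Each thin slice is a disk of volume pi*r(x)^2*dx.
x = np.linspace(-1.0, 1.0, 100_001)
r = np.sqrt(1.0 - x**2)
dx = x[1] - x[0]
print((np.pi * r**2 * dx).sum(), "vs", 4 * np.pi / 3)  # both near 4.18879
```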

I’m skeptical of legends. I’m skeptical of stories about the foolish intellectual upstaged by the practical man-of-action. And I’m skeptical of Edison because, jeez, I’ve read biographies of the man. Even the fawning ones make him out to be yeesh.

But the legend’s Edison had a point. If the volume of a shape is not how much stuff fits inside the shape, what is it? And maybe some object has too complicated a shape to find its volume. Can we think of a way to produce something with the same volume, but that is easier? Sometimes we can. When we do this with straightedge and compass, the way the Ancient Greeks found so classy, we call this “quadrature”. It’s called quadrature from its application in two dimensions. It finds, for a shape, a square with the same area. For a three-dimensional object, we find a cube with the same volume. Cubes are easy to understand.

Straightedge and compass can’t do everything. Indeed, there’s so much they can’t do. Some of it is stuff you’d think it should be able to, like, find a cube with the same volume as a sphere. Integration gives us a mathematical tool for describing how much stuff is inside a shape. It’s even got a beautiful shorthand expression. Suppose that D is the shape. Then its volume V is:

$V = \int\int\int_D dV$

Here “dV” is the “volume form”, a description of how the coordinates we describe a space in relate to the volume. The $\int\int\int$ is jargon, meaning, “integrate over the whole volume”. The subscript “D” modifies that phrase by adding “of D” to it. Writing “D” is shorthand for “these are all the points inside this shape, in whatever coordinate system you use”. If we didn’t do that we’d have to say, on each $\int$ sign, what points are inside the shape, coordinate by coordinate. At this level the equation doesn’t offer much help. It says the volume is the sum of infinitely many, infinitely tiny pieces of volume. True, but that doesn’t give much guidance about whether it’s more or less than two cups of water. We need to get more specific formulas, usually. We need to pick coordinates, for example, and say what coordinates are inside the shape. A lot of the resulting formulas can’t be integrated exactly. Like, an ellipsoid? Maybe you can integrate that. Don’t try without getting hazard pay.

We can approximate this integral. Pick a tiny shape whose volume is easy to know. Fill your shape with duplicates of it. Count the duplicates. Multiply that count by the volume of this tiny shape. Done. This is numerical integration, sometimes called “numerical quadrature”. If we’re being generous, we can say the legendary Edison did this, using water molecules as the tiny shape. And working so that he didn’t need to know the exact count or the volume of individual molecules. Good computational technique.
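
Here’s a sketch of that counting scheme, with tiny cubes standing in for the water molecules or the turkeys. Fill space with a grid of little cubes, count the ones whose centers land inside a unit ball, and multiply by the little cube’s volume:

```python
import numpy as np

# Numerical quadrature by counting: the true volume is 4*pi/3 ~ 4.18879.
h = 0.02                            # side length of each tiny cube
grid = np.arange(-1 + h / 2, 1, h)  # cube centers along one axis
x, y, z = np.meshgrid(grid, grid, grid, indexing="ij")
inside = (x**2 + y**2 + z**2 <= 1.0).sum()
print(inside * h**3, "vs", 4 * np.pi / 3)
```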

It’s hard not to feel we’re begging the question, though. We want the volume of something. So we need the volume of something else. Where does that volume come from?

Well, where does an inch come from? Or a centimeter? Whatever unit you use? You pick something to use as reference. Any old thing will do. Which is why you get fascinating stories about choosing what to use. And bitter arguments about which of several alternatives to use. And we express the length of something as some multiple of this reference length.

Volume works the same way. Pick a reference volume, something that can be one unit-of-volume. Other volumes are some multiple of that unit-of-volume. Possibly a fraction of that unit-of-volume.

Usually we use a reference volume that’s based on the reference length. Typically, we imagine a cube that’s one unit of length on each side. The volume of this cube with sides of length 1 unit-of-length is then 1 unit-of-volume. This seems all nice and orderly, and it’s surely not because mathematicians have been paid off by six-sided-dice manufacturers.

Does it have to be?

That we need some reference volume seems inevitable. We can’t very well say the volume of something is ten times nothing-in-particular. Does that reference volume have to be a cube? Or even a rectangle or something else? It seems obvious that we need some reference shape that tiles, that can fill up space by itself … right?

What if we don’t?

I’m going to drop out of three dimensions a moment. Not because it changes the fundamentals, but because it makes something easier. Specifically, it makes it easier if you decide you want to get some construction paper, cut out shapes, and try this on your own. What this will tell us about area is just as true for volume. Area, for a two-dimensional space, and volume, for a three-dimensional one, describe the same thing. If you’ll let me continue, then, I will.

So draw a figure on a clean sheet of paper. What’s its area? Now imagine you have a whole bunch of shapes with reference areas. A bunch that have an area of 1. That’s by definition. That’s our reference area. A bunch of smaller shapes with an area of one-half. By definition, too. A bunch of smaller shapes still with an area of one-third. Or one-fourth. Whatever. Shapes with areas you know because they’re marked on them.

Here’s one way to find the area. Drop your reference shapes, the ones with area 1, on your figure. How many do you need to completely cover the figure? It’s all right to cover more than the figure. It’s all right to have some of the reference shapes overlap. All you need is to cover the figure completely. … Well, you know how many pieces you needed for that. You can count them up. You can add up the areas of all these pieces needed to cover the figure. So the figure’s area can’t be any bigger than that sum.

Can’t be exact, though, right? Because you might get a different number if you covered the figure differently. If you used smaller pieces. If you arranged them better. This is true. But imagine all the possible reference shapes you had, and all the possible ways to arrange them. There’s some smallest area of those reference shapes that would cover your figure. Is there a more sensible idea for what the area of this figure would be?
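
This covering idea simulates nicely too. A sketch: tile the plane with little squares of side s, keep every square that touches the unit circle’s disk at all, and add up their areas. Shrinking s squeezes the total down toward the disk’s true area:

```python
import numpy as np

def covering_area(s):
    """Total area of the grid squares of side s that touch the unit disk."""
    centers = np.arange(-1.5, 1.5 + s, s)
    x, y = np.meshgrid(centers, centers, indexing="ij")
    # A square touches the disk if its nearest point is within distance 1.
    nearest_sq = (np.maximum(np.abs(x) - s / 2, 0)**2
                  + np.maximum(np.abs(y) - s / 2, 0)**2)
    return (nearest_sq <= 1.0).sum() * s**2

for s in (0.5, 0.1, 0.02):
    print(s, covering_area(s))  # decreases toward pi ~ 3.14159
```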

And put this into three dimensions. If we start from some reference shapes of volume 1 and maybe 1/2 and 1/3 and whatever other useful fractions there are? Doesn’t this covering make sense as a way to describe the volume? Cubes or rectangles are easy to imagine. Tetrahedrons too. But why not any old thing? Why not, as the Mystery Science Theater 3000 episode had it, turkeys?

This is a nice, flexible, convenient way to define area. So now let’s see where it goes all bizarre. We know this thanks to Giuseppe Peano. He’s among the late-19th/early-20th century mathematicians who shaped modern mathematics. They did this by showing how much of our mathematics broke intuition. Peano was (here) exploring what we now call fractals. And noted a family of shapes that curl back on themselves, over and over. They’re beautiful.

And they fill area. Fill volume, if done in three dimensions. It seems impossible. If we use this covering scheme, and try to find the volume of a straight line, we get zero. Well, we find that any positive number is too big, and from that conclude that it has to be zero. Since a straight line has length, but not volume, this seems fine. But a Peano curve won’t go along with this. A Peano curve winds back on itself so much that there is some minimum volume to cover it.

This unsettles. But this idea of volume (or area) by covering works so well. To throw it away seems to hobble us. So it seems worth the trade. We allow ourselves to imagine a line so long and so curled up that it has a volume. Amazing.

And now I get to relax and unwind and enjoy a long weekend before coming to the letter ‘W’. That’ll be about some topic I figure I can whip out a nice tight 500 words about, and instead, produce some 1541-word monstrosity while I wonder why I’ve had no free time at all since August. Tuesday, give or take, it’ll be available at this link, as are the rest of these glossary posts. Thanks for reading.

## My 2018 Mathematics A To Z: Unit Fractions

My subject for today is another from Iva Sallay, longtime friend of the blog and creator of the Find the Factors recreational mathematics game. I think you’ll likely find something enjoyable at her site, whether it’s the puzzle or the neat bits of trivia as she works through all the counting numbers.

# Unit Fractions.

We don’t notice how often unit fractions are around us. Likely there’s some in your pocket. Or there have been recently. Think of what you do when paying for a thing, when it’s not a whole number of dollars. (Pounds, euros, whatever the unit of currency is.) Suppose you have exact change. What do you give for the 38 cents?

Likely it’s something like a 25-cent piece and a 10-cent piece and three one-cent pieces. This is an American and Canadian solution. I know that 20-cent pieces are more common than 25-cent ones worldwide. It doesn’t make much difference; if you want it to be three 10-cent, one five-cent, and three one-cent pieces that’s as good. And granted, outside the United States it’s growing common to drop pennies altogether and round prices off to a five- or ten-cent value. Again, it doesn’t make much difference.

But look at the coins. The 25 cent piece is one-quarter of a dollar. It’s even called that, and stamped that on one side. I sometimes hear a dime called “a tenth of a dollar”, although mostly by carnival barkers in one-reel cartoons of the 1930s. A nickel is one-twentieth of a dollar. A penny is one-hundredth. A 20-cent piece is one-fifth of a dollar. And there are half-dollars out there too, although they don’t really circulate in the United States anymore.

(Pre-decimalized currencies offered even more unit fractions. Using old British coins, for familiarity-to-me and great names, there were farthings, 1/960th of a pound; halfpennies, 1/480th; pennies, 1/240th; threepence, 1/80th of a pound; groats, 1/60th; sixpence, 1/40th; florins, 1/10th; half-crowns, 1/8th; crowns, 1/4th. And what seem to the modern wallet like impossibly tiny fractions like the half-, third-, and quarter-farthings used where 1/3840th of a pound might be a needed sum of money.)

Unit fractions get named and defined somewhere in elementary school arithmetic. They go on, becoming forgotten sometime after that. They might make a brief reappearance in calculus. There are some rational functions that get easier to integrate if you think of them as sums of fractions, with constant numerators and polynomial denominators. These aren’t unit fractions. A unit fraction has a 1, the unit, in the numerator. But we see numerators of 1 along the way to integrating $\frac{1}{x^2 - x}$, as an example. And see it in the promise that there are still more amazing integrals to learn how to do.
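
To sketch that calculus cameo: the partial-fraction decomposition splits the example integrand into fractions with 1 in each numerator, and the integral follows at once.

$\frac{1}{x^2 - x} = \frac{1}{x - 1} - \frac{1}{x}, \qquad \int \frac{dx}{x^2 - x} = \ln\left|\frac{x - 1}{x}\right| + C$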

They get more attention if you take a history of computation class. Or read the subject on your own. Unit fractions stand out in history. We learn the Ancient Egyptians worked with fractions as sums of unit fractions. That is, had they dollars, they would not look at the $\frac{38}{100}$ we do. They would look at $\frac{1}{4}$ plus $\frac{1}{10}$ plus $\frac{1}{100}$ plus $\frac{1}{100}$ plus $\frac{1}{100}$. When we count change we are using, without noticing it, a very old computing scheme.

This isn’t quite true. The Ancient Egyptians seemed to shun repeating a unit like that. To use $\frac{1}{100}$ once is fine; three times is suspicious. They would prefer something like $\frac{1}{3}$ plus $\frac{1}{24}$ plus $\frac{1}{200}$. Or maybe some other combination. I just wrote out the first one I found.

But there are many ways we can make 38 cents using ordinary coins of the realm. There are infinitely many ways to make up any fraction using unit fractions. There’s surely a most “efficient”. Most efficient might be the one which uses the fewest terms. Most efficient might be the one that uses the smallest denominators. Choose what you like; no one knows a scheme that always turns up the most efficient, either way. We can always find some representation, though. It may not be “good”, but it will exist, which may be good enough. Leonardo of Pisa, or as he got named in the 19th century, Fibonacci, proved that was true.
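
Fibonacci’s proof rests on the greedy method: always grab the biggest unit fraction that still fits. Here’s a minimal sketch of it in Python, my own illustration rather than anything from the historical texts:

```python
from fractions import Fraction

def egyptian_greedy(frac):
    """Expand 0 < frac < 1 into distinct unit fractions by always
    taking the largest unit fraction not exceeding what's left.
    Fibonacci showed the leftover's numerator shrinks each step,
    so this always terminates."""
    terms = []
    while frac > 0:
        d = -(frac.denominator // -frac.numerator)  # ceiling division
        terms.append(Fraction(1, d))
        frac -= Fraction(1, d)
    return terms

print(egyptian_greedy(Fraction(38, 100)))
# [Fraction(1, 3), Fraction(1, 22), Fraction(1, 825)]
```

The greedy answer exists, but it’s often not the “efficient” one; run it on $\frac{5}{121}$ and it famously spits out gigantic denominators.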

We may ask why the Egyptians used unit fractions. They seem inefficient compared to the way we work with fractions. Or, better, decimals. I’m not sure the question can have a coherent answer. Why do we have a fashion for converting fractions to a “proper” form? Why do we use the number of decimal places we do for a given calculation? Sometimes a particular mode of expression is the fashion. It comes to seem natural because everyone uses it. We do it too.

And there is practicality to them. Even efficiency. If you need π, for example, you can write it as 3 plus $\frac{1}{8}$ plus $\frac{1}{61}$ and your answer is off by under one part in a thousand. Combine this with the Egyptian method of multiplication, where you would think of (say) “11 times π” as “1 times π plus 2 times π plus 8 times π”. And with tables they had worked up which tell you what $\frac{2}{8}$ and $\frac{2}{61}$ would be in a normal representation. You can get rather good calculations without having to do more than addition and looking up doublings. Represent π as 3 plus $\frac{1}{8}$ plus $\frac{1}{61}$ plus $\frac{1}{5020}$ and you’re correct to within one part in 130 million. That isn’t bad for having to remember four whole numbers.
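
A sketch of that multiplication scheme, in Python for concreteness (the 2/n table-lookup step is elided; this is my illustration, not a reconstruction of any papyrus):

```python
def egyptian_multiply(n, x):
    """Multiply n * x using only doubling and addition, the way
    '11 times pi' becomes '1 times pi plus 2 times pi plus 8 times pi'."""
    total, row = 0, x
    while n:
        if n & 1:        # this doubling appears in n's binary expansion
            total += row
        n >>= 1
        row += row       # doubling is the only multiplication allowed
    return total

pi_ish = 3 + 1/8 + 1/61  # the unit-fraction stand-in for pi
print(egyptian_multiply(11, pi_ish))  # about 34.5553
```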

(The Ancient Egyptians, like many of us, were not absolutely consistent in only using unit fractions. They had symbols to represent $\frac{2}{3}$ and $\frac{3}{4}$, probably due to these numbers coming up all the time. Human systems vary to make the commonest stuff we do easier.)

Enough practicality or efficiency, if this is that. Is there beauty? Is there wonder? Certainly. Much of it is in number theory. Number theory splits between astounding results and results that would be astounding if we had any idea how to prove them. Many of the astounding results are about unit fractions. Take, for example, the harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots$. Truncate that series whenever you decide you’ve had enough. Different numbers of terms add up to different partial sums, infinitely many of them. The sums grow ever-higher. There’s no number so big that it won’t, eventually, be surpassed by some long-enough truncated harmonic series. And yet, past the number 1, the partial sums never touch a whole number again. Infinitely many partial sums. Consecutive partial sums differing from one another by a googolplexth and less. And yet of the infinitely many whole numbers this series manages to miss them all, after its starting point. Worse, any sum of consecutive terms, even a run that doesn’t start from 1, will never hit a whole number. I can understand a person who thinks mathematics is boring, but how can anyone not find it astonishing?
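
That claim deserves poking at, so here’s a small exact-arithmetic check of the first hundred partial sums. A spot check, of course, not the proof:

```python
from fractions import Fraction

partial = Fraction(0)
for n in range(1, 101):
    partial += Fraction(1, n)
    # After the first term, the sum in lowest terms never has
    # denominator 1; that is, it is never a whole number.
    assert n == 1 or partial.denominator > 1

print(float(partial))  # about 5.187 -- the growth really is that slow
```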

There are more strange, beautiful things. Consider heptagonal numbers, which Iva Sallay knows well. These are numbers like 1 and 7 and 18 and 34 and 55 and 1288. Take a heptagonal number of, oh, beads or dots or whatever, and you can lay them out to form a regular seven-sided figure. Add together the reciprocals of the heptagonal numbers. What do you get? It’s a weird number. It’s irrational, which you maybe would have guessed as more likely than not. But it’s also transcendental. Most real numbers are transcendental. But it’s often hard to prove any specific number is.

Unit fractions creep back into actual use. For example, in modular arithmetic, they offer a way to turn division back into multiplication. Division, in modular arithmetic, tends to be hard. Indeed, if you need an algorithm to make random-enough numbers, you often will do something with division in modular arithmetic. Suppose you want to divide by a number x, modulo y, and x and y are relatively prime, though. Then unit fractions tell us how to turn this into a greatest-common-divisor problem.
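
A minimal sketch of that conversion, assuming nothing beyond the extended Euclidean algorithm: compute the “unit fraction” $\frac{1}{x}$ modulo y, and dividing by x becomes multiplying by it.

```python
def mod_inverse(x, y):
    """Find 1/x modulo y with the extended Euclidean algorithm,
    assuming gcd(x, y) == 1."""
    old_r, r = x, y
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    return old_s % y   # old_s * x == 1 (mod y)

inv = mod_inverse(3, 7)
print(inv, (3 * inv) % 7)  # 5 1, so dividing by 3 is multiplying by 5
```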

They teach us about our computers, too. Much of serious numerical mathematics involves matrix multiplication. Matrices are, for this purpose, tables of numbers. The Hilbert Matrix has elements that are entirely unit fractions. The Hilbert Matrix is really a family of square matrices. Pick any of the family you like. It can have two rows and two columns, or three rows and three columns, or ten rows and ten columns, or a million rows and a million columns. Your choice. The first row is made of the numbers $1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4},$ and so on. The second row is made of the numbers $\frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5},$ and so on. The third row is made of the numbers $\frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6},$ and so on. You see how this is going.

Matrices can have inverses. It’s not guaranteed; matrices are like that. But the Hilbert Matrix does. It’s another matrix, of the same size. All the terms in it are integers. Multiply the Hilbert Matrix by its inverse and you get the Identity Matrix. This is a matrix, the same number of rows and columns as you started with. But nearly every element in the identity matrix is zero. The only exceptions are on the diagonal — first row, first column; second row, second column; third row, third column; and so on. There, the identity matrix has a 1. The identity matrix works, for matrix multiplication, much like the real number 1 works for normal multiplication.

Matrix multiplication is tedious. It’s not hard, but it involves a lot of multiplying and adding and it just takes forever. So set a computer to do this! And you get … uh …

For a small Hilbert Matrix and its inverse, you get an identity matrix. That’s good. For a large Hilbert Matrix and its inverse? You get garbage. Large isn’t maybe very large. A 12 by 12 matrix gives you trouble. A 14 by 14 matrix gives you a mess. Well, on my computer it does. Cute little laptop I got when my former computer suddenly died. On a better computer? One designed for computation? … You could do a little better. Less good than you might imagine.
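
Here’s a hedged way to watch it happen, with NumPy doing the inverting. The sizes where things fall apart will vary with your machine and library:

```python
import numpy as np

for n in (4, 8, 12, 14):
    # Hilbert Matrix entries: 1/(i + j + 1), rows and columns
    # counted from zero.
    H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])
    # How far does H times its computed inverse drift from identity?
    drift = np.abs(H @ np.linalg.inv(H) - np.eye(n)).max()
    print(n, drift)
# Near zero for small n; garbage by the time n reaches 14.
```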

The trouble is that computers don’t really do mathematics. They do an approximation of it, numerical computing. Most use a scheme called floating point arithmetic. It mostly works well. There’s a bit of error in every calculation. For most calculations, though, the error stays small. At least relatively small. The Hilbert Matrix, built of unit fractions, doesn’t respect this. It and its inverse have a “numerical instability”. Some kinds of calculations make errors explode. They’ll overwhelm the meaningful calculation. It’s a bit of a mess.

Numerical instability is something anyone doing mathematics on the computer must learn. Must grow comfortable with. Must understand. The matrix multiplications, and inverses, that the Hilbert Matrix involves highlight that. A great and urgent example of a subtle danger of computerized mathematics waits for us in these unit fractions. And we’ve known and felt comfortable with them for thousands of years.

There’ll be some mathematical term with a name starting ‘V’ that, barring surprises, should be posted Friday. What’ll it be? I have an idea at least. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Tiling

For today’s A to Z topic I again picked one nominated by aajohannas. This after I realized I was falling into a never-ending research spiral on Mr Wu of Mathtuition’s suggested “torus”. I do have an older essay describing the torus, as a set. But that does leave out a lot of why a torus is interesting. Well, we’ll carry on.

# Tiling.

Here is a surprising thought for the next time you consider remodeling the kitchen. It’s common to tile the floor. Perhaps some of the walls behind the counter. What patterns could you use? And there are infinitely many possibilities. You might leap ahead of me and say, yes, but they’re all boring. A tile that’s eight inches square is different from one that’s twelve inches square and different from one that’s 12.01 inches square. Fine. Let’s allow that all square tiles are “really” the same pattern. The only difference between a square two feet on a side and a square half an inch on a side is how much grout you have to deal with. There are still infinitely many possibilities.

You might still suspect me of being boring. Sure, there’s a rectangular tile that’s, say, six inches by eight inches. And one that’s six inches by nine inches. Six inches by ten inches. Six inches by one millimeter. Yes, I’m technically right. But I’m not interested in that. Let’s allow that all rectangular tiles are “really” the same pattern. So we have “squares” and “rectangles”. There are still infinitely many tile possibilities.

Let me shorten the discussion here. Draw a quadrilateral. One that doesn’t intersect itself. That is, there’s four corners, four lines, and there’s no X crossings. If you have that, then you have a tiling. Get enough of these tiles and arrange them correctly and you can cover the plane. Or the kitchen floor, if you have a level floor. It might not be obvious how to do it. You might have to rotate alternating tiles, or set them in what seem like weird offsets. But you can do it. You’ll need someone to make the tiles for you, if you pick some weird pattern. I hope I live long enough to see it become part of the dubious kitchen package on junk home-renovation shows.

Let me broaden the discussion here. What do I mean by a tiling if I’m allowing any four-sided figure to be a tile? We start with a surface. Usually the plane, a flat surface stretching out infinitely far in two dimensions. The kitchen floor, or any other mere mortal surface, approximates this. But the floor stops at some point. That’s all right. The ideas we develop for the plane work all right for the kitchen. There’s some weird effects for the tiles that get too near the edges of the room. We don’t need to worry about them here. The tiles are some collection of open sets. No two tiles overlap. The tiles, plus their boundaries, cover the whole plane. That is, every point on the plane is either inside exactly one of the open sets, or it’s on the boundary between one (or more) sets.

There isn’t a requirement that all these sets have the same shape. We usually do, and will limit our tiles to one or two shapes endlessly repeated. It seems to appeal to our aesthetics and our installation budget. Using a single pattern allows us to cover the plane with triangles. Any triangle will do. Similarly any quadrilateral will do. For convex pentagonal tiles — here things get weird. There are fourteen known families of pentagons that tile the plane. Each member of the family looks about the same, but there’s some room for variation in the sides. Plus there’s one more special case that can tile the plane, but only that one shape, with no variation allowed. We don’t know if there’s a sixteenth pattern. But then until 2015 we didn’t know there was a 15th, and that was the first pattern found in thirty years. Might be an opening for someone with a good eye for doodling.

There are also exciting opportunities in convex hexagons. Anyone who plays strategy games knows a regular hexagon will tile the plane. (Regular hexagonal tilings fit a certain kind of strategy game well. Particularly they imply an equal distance between the centers of any adjacent tiles. Square and triangular tiles don’t guarantee that. This can imply better balance for territory-based games.) Irregular hexagons will, too. There are three known families of irregular hexagons that tile the plane. You can treat the regular hexagon as a special case of any of these three families. No one knows if there’s a fourth family. Ready your notepad at the next overlong, agenda-less meeting.

There aren’t tilings of identical convex heptagons, figures with seven sides. Nor of eight sides, nor nine, nor any higher number. You can tile the plane with such figures if you allow non-convex ones. See any Tetris game where you keep getting the ‘s’ or ‘t’ shapes. And you can tile the plane with convex heptagons if you mix in other shapes.

There’s some guidance if you want to create your own periodic tilings. I see it called the Conway Criterion. I don’t know the field well enough to say whether that is a common term. It could be something one mathematics popularizer thought of and that other popularizers imitated. (I don’t find “Conway Criterion” on the Mathworld glossary, but that isn’t definitive.) Suppose your polygon satisfies a couple of rules about the shapes of the edges. The rules are given in that link earlier this paragraph. If your shape does, then it’ll be able to tile the plane. If you don’t satisfy the rules, don’t despair! It might yet. The Conway Criterion tells you when some shape will tile the plane. It won’t tell you that something won’t.

(The name “Conway” may nag at you as familiar from somewhere. This criterion is named for John H Conway, who’s famous for a bunch of work in knot theory, group theory, and coding theory. And in popular mathematics for the “Game of Life”. This is a set of rules on a grid of numbers. The rules say how to calculate a new grid, based on this first one. Iterating them, creating grid after grid, can make patterns that seem far too complicated to be implicit in the simple rules. Conway also developed an algorithm to calculate the day of the week, in the Gregorian calendar. It is difficult to explain to the non-calendar fan how great this sort of thing is.)

This has all gotten to periodic tilings. That is, these patterns might be complicated. But if need be, we could get them printed on a nice square tile and cover the floor with that. Almost as beautiful and much easier to install. Are there tilings that aren’t periodic? Aperiodic tilings?

Well, sure. Easily. Take a bunch of tiles with a right angle, and two 45-degree angles. Put any two together and you have a square. So you’re “really” tiling squares that happen to be made up of a pair of triangles. Each pair, toss a coin to decide whether you put the diagonal as a forward or backward slash. Done. That’s not a periodic tiling. Not unless you had a weird run of luck on your coin tosses.
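
A toy rendering of that construction, my own throwaway sketch: each square cell gets a forward or backward diagonal by coin toss, and almost surely no period ever emerges.

```python
import random

random.seed(1966)  # any seed will do; the point is the lack of pattern
for _ in range(6):
    print("".join(random.choice("/\\") for _ in range(32)))
```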

All right, but is that just a technicality? We could have easily installed this periodically and we just added some chaos to make it “not work”. Can we use a finite number of different kinds of tiles, and have it be aperiodic however much we try to make it periodic? And through about 1966 mathematicians would have mostly guessed that no, you couldn’t. If you had a set of tiles that would cover the plane aperiodically, there was also some way to do it periodically.

And then in 1966 came a surprising result. No, not Penrose tiles. I know you want me there. I’ll get there. Not there yet though. In 1966 Robert Berger — who also attended Rensselaer Polytechnic Institute, thank you — discovered such a tiling. It’s aperiodic, and it can’t be made periodic. Why do we know Penrose Tiles rather than Berger Tiles? A couple of reasons, including that Berger had to use 20,426 distinct tile shapes. In 1971 Raphael M Robinson simplified matters a bit and got that down to six shapes. Roger Penrose in 1974 squeezed the set down to two, although by adding some rules about which edges may and may not touch one another. (You can turn this into a pure shapes thing by putting notches into the edges.) That really caught the public imagination. It’s got simplicity and accessibility to combine with beauty. Aperiodic tiles seem to relate to “quasicrystals”, which are what the name suggests and do happen in some materials. Aperiodic tiling embraces our need to have not too much order in our order.

I’ve discussed, in all this, tiling the plane. It’s an easy surface to think about and a popular one. But we can form tiling questions about other shapes. Cylinders, spheres, and toruses seem like they should have good tiling questions available. And we can imagine “tiling” stuff in more dimensions too. If we can fill a volume with cubes, or rectangles, it’s natural to wonder what other shapes we can fill it with. My impression is that fewer definite answers are known about the tiling of three- and four- and higher-dimensional space. Possibly because it’s harder to sketch out ideas and test them. Possibly because the spaces are that much stranger. I would be glad to hear more.

I’m hoping now to have a nice relaxing weekend. I won’t. I need to think of what to say for the letter ‘U’. On Tuesday I hope that it will join the rest of my A to Z essays at this link.

## My 2018 Mathematics A To Z: Sorites Paradox

Today’s topic is the lone (so far) request by bunnydoe, so I’m under pressure to make it decent. If she or anyone else would like to nominate subjects for the letters U through Z, please drop me a note at this post. I keep fooling myself into thinking I’ll get one done in under 1200 words.

# Sorites Paradox.

This is a story which makes a capitalist look kind of good. I make no vouches for its truth, or even, at this remove, where I got it. The story as I heard it was about Ray Kroc, who made McDonald’s into a thing people of every land can complain about. The story has him demonstrate skepticism about the use of business consultants. A consultant might find, for example, that each sesame-seed hamburger bun has (say) 43 seeds. And that if they just cut it down to 41 seeds then each franchise would save (say) $50,000 annually. And no customer would notice the difference. Fine; trim the seeds a little. The next round of consultants would point out, cutting from 41 seeds to 38 would save a further $65,000 per store per year. And again no customer would notice the difference. Cut to 36 seeds? No customer would notice. This process would end when each bun had three sesame seeds, and the customers noticed.

Part of the paradox’s intractability must be that it’s so nearly induction. Induction is a fantastic tool for mathematical problems. We couldn’t do without it. But consider the argument. If a bun is unsatisfying, one more seed won’t make it satisfying. A bun with one seed is unsatisfying. Therefore all buns have an unsatisfying number of sesame seeds on them. It suggests there must be some point at which “adding one more seed won’t help” stops being true. Fine; where is that point, and why isn’t it one fewer or one more seed?

A certain kind of nerd has a snappy answer for the Sorites Paradox. Test a broad population on a variety of sesame-seed buns. There’ll be some so sparse that nearly everyone will say they’re unsatisfying. There’ll be some so abundant most everyone agrees they’re great. So there’s the buns most everyone says are fine. There’s the buns most everyone says are not. The dividing line is at any point between the sparsest that satisfy most people and the most abundant that don’t. The nerds then declare the problem solved and go off. Let them go. We were lucky to get as much of their time as we did. They’re quite busy solving what “really” happened for Rashomon. The approach of “set a line somewhere” is fine if all we want is guidance on where to draw a line. It doesn’t help us say why we can anoint some border over any other. At least when we use a river as a border between states we can agree going into the water disrupts what we were doing with the land. And even then we have to ask what happens during droughts and floods, and if the river is an estuary, how tides affect matters.

We might see an answer by thinking more seriously about these sesame-seed buns. We force a problem by declaring that every bun is either satisfying or it is not. We can imagine buns with enough seeds that we don’t feel cheated by them, but that we also don’t feel satisfied by. This reflects one of the common assumptions of logic. Mathematicians know it as the Law of the Excluded Middle. A thing is true or it is not true. There is no middle case. This is fine for logic. But for everyday words?

It doesn’t work when considering sesame-seed buns. I can imagine a bun that is not satisfying, but also is not unsatisfying. Surely we can make some logical provision for the concept of “meh”. Now we need not draw some arbitrary line between “satisfying” and “unsatisfying”. We must draw two lines, one of them between “unsatisfying” and “meh”. There is a potential here for regression. Also for the thought of a bun that’s “satisfying-meh-satisfying by unsatisfying”. I shall step away from this concept.

But there are more subtle ways to not exclude the middle. For example, we might decide a statement’s truth exists on a spectrum. We can match how true a statement is to a number. Suppose an obvious falsehood is zero; an unimpeachable truth is one, and normal mortal statements somewhere in the middle. “This bun with a single sesame seed is satisfying” might have a truth of 0.01. This perhaps reflects the tastes of people who say they want sesame seeds but don’t actually care. “This bun with fifteen sesame seeds is satisfying” might have a truth of 0.25, say. “This bun with forty sesame seeds is satisfying” might have a truth of 0.97. (It’s true for everyone except those who remember the flush times of the 43-seed bun.) This seems to capture the idea that nothing is always wholly anything. But we can still step into absurdity. Suppose “this bun with 23 sesame seeds is satisfying” has a truth of 0.50. Then “this bun with 23 sesame seeds is not satisfying” should also have a truth of 0.50. What do we make of the statement “this bun with 23 sesame seeds is simultaneously satisfying and not satisfying”? Do we make something different to “this bun with 23 sesame seeds is simultaneously satisfying and satisfying”?
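
To make the absurdity concrete, here’s a tiny sketch using the commonest fuzzy-logic operators, minimum for “and” and one-minus for “not”. That choice of operators is an assumption on my part; the essay picks no particular scheme.

```python
def t_and(p, q):
    return min(p, q)

def t_not(p):
    return 1.0 - p

satisfying = 0.50   # "this bun with 23 sesame seeds is satisfying"
print(t_and(satisfying, t_not(satisfying)))  # 0.5: satisfying and not
print(t_and(satisfying, satisfying))         # also 0.5: no difference
```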

I see you getting tired in the back there. This may seem like word games. And we all know that human words are imprecise concepts. What has this to do with logic, or mathematics, or anything but the philosophy of language? And the first answer is that we understand logic and mathematics through language. When learning mathematics we get presented with definitions that seem absolute and indisputable. We start to see the human influence in mathematics when we ask why 1 is not a prime number. Later we see things like arguments about whether a ring has a multiplicative identity. And then there are more esoteric debates about the bounds of mathematical concepts.

Perhaps we can think of a concept we can’t describe in words. If we don’t express it to other people, the concept dies with us. We need words. No, putting it in symbols does not help. Mathematical symbols may look like slightly alien scrawl. But they are shorthand for words, and can be read as sentences, and there is this fuzziness in all of them.

And we find mathematical properties that share this problem. Consider: what is the color of the chemical element flerovium? Before you say I just made that up, flerovium was first synthesized in 1998, and officially named in 2012. We’d guess that it’s a silvery-white or maybe grey metallic thing. Humanity has only ever observed about ninety atoms of the stuff. It’s, for atoms this big, amazingly stable. We know an isotope of it that has a half-life of two and a half seconds. But it’s hard to believe we’ll ever have enough of the stuff to look at it and say what color it is.

That’s … all right, though? Maybe? Because we know the quantum mechanics that seem to describe how atoms form. And how they should pack together. And how light should be absorbed, and how light should be emitted, and how light should be scattered by it. At least in principle. The exact answers might be beyond us. But we can imagine having a solution, at least in principle. We can imagine the computer that after great diligent work gives us a picture of what a ten-ton lump of flerovium would look like.

So where does its color come from? Or any of the other properties that these atoms have as a group? No one atom has a color. No one atom has a density, either, or a viscosity. No one atom has a temperature, or a surface tension, or a boiling point. In combination, though, they do.

These are known to statistical mechanics, and through that thermodynamics, as intensive properties. If we have a partition function, which describes all the ways a system can be organized, we can extract information about these properties. They turn up as derivatives with respect to the right parameters of the system.
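
As a sketch of what that extraction looks like (standard statistical-mechanics formulas, not anything particular to this essay): with partition function $Z$ and $\beta = \frac{1}{k_B T}$, the average energy and the pressure come out as

$\langle E \rangle = -\frac{\partial \ln Z}{\partial \beta}, \qquad P = \frac{1}{\beta} \frac{\partial \ln Z}{\partial V}$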

But the same problem exists. Take a homogeneous gas. It has some temperature. Divide it into two equal portions. Both sides have the same temperature. Divide each half into two equal portions again. All four pieces have the same temperature. Divide again, and again, and a few more times. You eventually get containers with so little gas in them they don’t have a temperature. Where did it go? When did it disappear?

The counterpart to an intensive property is an extensive one. This is stuff like the mass or the volume or the energy of a thing. Cut the gas’s container in two, and each has half the volume. Cut it in half again, and each of the four containers has one-quarter the volume. Keep this up and you stay in uncontroversial territory, because I am not discussing Zeno’s Paradoxes here.

And like Zeno’s Paradoxes, the Sorites Paradox can seem at first trivial. We can distinguish a heap from a non-heap; who cares where the dividing line is? Or whether the division is a gradual change? It seems easy. To show why it is easy is hard. Each potential answer is interesting, and plausible, and when you think hard enough of it, not quite satisfying. Good material to think about.

I hope to find some material to think about for the letter ‘T’ and have it published Friday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Randomness

Today’s topic is an always rich one. It was suggested by aajohannas, who so far as I know hasn’t got an active blog or other project. If I’m mistaken please let me know. I’m glad to mention the creative works of people hanging around my blog.

# Randomness.

There’s an old Sydney Harris cartoon I probably won’t be able to find a copy of before this publishes. A couple of people gather around an old fanfold-paper printer. On the printout is the sequence “1 … 2 … 3 … 4 … 5 …”. The caption: ‘Bizarre sequence of computer-generated random numbers’.

Randomness feels familiar. It feels knowable. It means surprise, unpredictability. The upending of patterns. The obliteration of structure. I imagine there are sociologists who’d say it’s what defines Modernity. It’s hard to avoid noticing that the first great scientific theories that embrace unpredictability — evolution and thermodynamics — came to public awareness at the same time impressionism came to arts, and the subconscious mind came to psychology. It’s grown since then. Quantum mechanics is built on unpredictable specifics. Chaos theory tells us even if we could predict statistics it would do us no good. Randomness feels familiar, even necessary. Even desirable. A certain type of nerd thinks eagerly of the Singularity, the point past which no social interactions are predictable anymore. We live in randomness.

And yet … it is hard to find randomness. At least to be sure we have found it. We might choose between options we find ambivalent by tossing a coin. This seems random. But anyone who was six years old and trying to cheat a sibling knows ways around that. Drop the coin without spinning it, from a half-inch above the table, and you know the outcome, all the way through to the sibling’s punching you. When we’re older and can be made to be better sports we’re fairer about it. We toss the coin and give it a spin. There’s no way we could predict the outcome. Unless we knew just how strong a toss we gave it, and how fast it spun, and how the mass of the coin was distributed. … Really, if we knew enough, our tossed coin would be as predictable as the coin we dropped as a six-year-old. At least unless we tossed in some chaotic way, where each throw would be deterministic, but we couldn’t usefully make a prediction.

Our instinctive idea of what randomness must be is flawed. That shouldn’t surprise. Our instinctive idea of anything is flawed. But randomness gives us trouble. It’s obvious, for example, that randomly selected things should have no pattern. But then how is that reasonable? If we draw letters from the alphabet at random, we should expect sometimes to get some cute pattern like ‘aaaaa’ or ‘qwertyuiop’ or the works of Shakespeare. Perhaps we mean we shouldn’t get patterns any more often than we would expect. All right; how often is that?

We can make tests. Some of them are obvious. Take something that generates possibly-random results. Look up how probable each of those outcomes is. Then run off a bunch of outcomes. Do we get about as many of each result as we should expect? Probability tells us we should get as close as we like to the expected frequency if we let the random process run long enough. If this doesn’t happen, great! We can conclude we don’t really have something random.

We can do more tests. Some of them are brilliantly clever. Suppose there’s a way to order the results. Since the results mathematicians want are usually numbers, putting them in order is easy. If they’re not numbers, there’s usually a way to match results to numbers. You’ll see me slide here into talking about random numbers as though that were the same as random results. But if I can distinguish different outcomes, then I can label them. If I can label them, I can use numbers as labels. If the order of the numbers doesn’t matter — should “red” be a 1 or a 2? Should “green” be a 3 or an 8? — then, fine; any order is good.

There are 120 ways to order five distinct things. So generate lots of sets of, say, five numbers. What order are they in? There are 120 possibilities. Does each of the possibilities turn up as often as expected? If they don’t, great! We can conclude we don’t really have something random.
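
Here’s a rough sketch of that ordering test in Python, run against the standard library’s generator. A demonstration, not a serious test battery:

```python
import random
from collections import Counter

random.seed(2018)
counts = Counter()
trials = 120_000
for _ in range(trials):
    draw = [random.random() for _ in range(5)]
    # Record which of the 120 orderings this draw lands in.
    counts[tuple(sorted(range(5), key=draw.__getitem__))] += 1

# Each ordering "should" appear about trials/120 = 1000 times.
print(len(counts), min(counts.values()), max(counts.values()))
```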

I can go on. There are many tests which will let us say something isn’t a truly random sequence. They’ll allow for something like Sydney Harris’s peculiar sequence of random numbers. Mostly by supposing that if we let it run long enough the sequence would stop. But these all rule out random number generators. Do we have any that rule them in? That say yes, this generates randomness?

I don’t know of any. I suspect there can’t be any, on the grounds that a test of a thousand or a thousand million or a thousand million quadrillion numbers can’t assure us the generator won’t break down next time we use it. If we knew the algorithm by which the random numbers were generated — oh, but there we’re foiled before we can start. An algorithm is the instructions of how to do a thing. How can an instruction tell us how to do a thing that can’t be predicted?

Algorithms seem, briefly, to offer a way to tell whether we do have a good random sequence, though. We can describe patterns. A strong pattern is easy to describe, the way a familiar story is easy to reference. A weak pattern, a random one, is hard to describe. It’s like a dream, in which you can just list events. So we can call random something which can’t be described any more efficiently than just giving a list of all the results. But how do we know that can’t be done? 7, 7, 2, 4, 5, 3, 8, 5, 0, 9 looks like a pretty good set of digits, whole numbers from 0 through 9. I’ll bet not more than one in ten of you guesses correctly what the next digit in the sequence is. Unless you’ve noticed that these are the digits in the square root of π, after its leading 1, so that the next couple digits have to be 0, 5, 5, and 1.

We know, on theoretical grounds, that we have randomness all around us. Quantum mechanics depends on it. If we need truly random numbers we can set a sensor. It will turn the arrival of cosmic rays, or the decay of radioactive atoms, or the sighing of a material flexing in the heat into numbers. We trust we gather these and process them in a way that doesn’t spoil their unpredictability. To what end?

That is, why do we care about randomness? Especially why should mathematicians care? The image of mathematics is that it is a series of logical deductions. That is, things known to be true because they follow from premises known to be true. Where can randomness fit?

One answer, one close to my heart, is called Monte Carlo methods. These are techniques that find approximate answers to questions. They do well when exact answers are too hard for us to find. They use random numbers to approximate answers and, often, to make approximate answers better. This demands computations. The field didn’t really exist before computers, although there are some neat forebears. I mean the Buffon needle problem, which lets you calculate the digits of π about as slowly as you could hope to do.
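
The classic first example, sketched here on the assumption you’ll forgive its crudeness: estimate π by throwing random points at a square and counting how many land inside the quarter-circle.

```python
import random

random.seed(404)
trials = 1_000_000
# A point (x, y) in the unit square lands inside the quarter-circle
# when x^2 + y^2 <= 1; that happens a fraction pi/4 of the time.
hits = sum(
    random.random() ** 2 + random.random() ** 2 <= 1.0
    for _ in range(trials)
)
print(4 * hits / trials)  # lands near 3.14, slowly
```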

Another, linked to Monte Carlo methods, is stochastic geometry. “Stochastic” is the word mathematicians attach to things when they feel they’ve said “random” too often, or in an undignified manner. Stochastic geometry is what we can know about shapes when there’s randomness about how the shapes are formed. This sounds like it’d be too weak a subject to study. That it’s built on relatively weak assumptions means it describes things in many fields, though. It can be seen in understanding how forests grow. How to find structures inside images. How to place cell phone towers. Why materials should act like they do instead of some other way. Why galaxies cluster.

There’s also a stochastic calculus, a bit of calculus with randomness added. This is useful for understanding systems with some persistent unpredictable jostling in them. It comes, if I understand the histories of this right, from studying the ways molecules will move around in weird zig-zagging twists. They do this even when there is no overall flow, just a fluid at a fixed temperature. It too has surprising applications. Without the assumption that the prices of some things are regularly jostled by arbitrary and unpredictable forces, and the treatment of that by stochastic calculus methods, we wouldn’t have nearly the ability to hedge investments against weird chaotic events. This would be a bad thing, I am told by people with more sophisticated investments than I have. I personally own like ten shares of the Tootsie Roll corporation and am working my way to a $2.00 rebate check from Boyer.

Given that we need randomness, but don’t know how to get it — or at least don’t know how to be sure we have it — what is there to do? We accept our failings and make do with “quasirandom numbers”. We find some process that generates numbers which look about like random numbers should. These have failings. Most important is that they can, in principle, be predicted. They’re random like “the date Easter will fall on” is random. The date Easter falls on is not at all random; it’s defined by a specific and humanly knowable formula. But if the only information you have is that this year, Easter fell on the 1st of April (Gregorian computus), you don’t have much guidance to whether next year it’ll be on the 7th, 14th, or 21st of April. Most notably, quasirandom number generators will tend to repeat after enough numbers are drawn. If we know we won’t need enough numbers to see a repetition, though? Another stereotype of the mathematician is that of a person who demands exactness. It is often more true to say she is looking for an answer good enough. We are usually all right with a merely good enough quasirandomness.
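
A toy example of such a process, with a deliberately tiny period so the repetition shows. The constants here are my own throwaway choices, not anyone’s recommended generator:

```python
def lcg(seed, a=21, c=1, m=64):
    """A linear congruential generator: each output is a fixed
    arithmetic scramble of the one before, so it only looks random."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(5)
stream = [next(gen) for _ in range(70)]
print(stream[:6])     # looks scrambled enough
print(stream[64:70])  # and here the whole cycle starts over
```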

Boyer candies — Mallo Cups, most famously, although I more like the peanut butter Smoothies — come with a cardboard card backing. Each card has two play money “coins”, of values from 5 cents to 50 cents. These can be gathered up for a rebate check or for various prizes. Whether your coin is 5 cents, 10, 25, or 50 cents … well, there’s no way to tell, before you open the package. It’s, so far as you can tell, randomness.

My next A To Z post should be available at this link. It’s coming Tuesday and should be the letter ‘S’.

## I’m Looking For The Last Topics For My Fall 2018 Mathematics A-To-Z

And now it’s my last request for my Fall 2018 mathematics A-To-Z. There’s only a half-dozen letters left, but not to fear: they include letters with no end of potential topics, like, ‘X’.

If you have any mathematical topics with a name that starts U through Z that you’d like to see me write about, please say so. I’m happy to write what I fully mean to be a tight 500 words about the subject and then find I’ve put up my second 1800-word essay of the week. I usually go on a first-come, first-served basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me. This might be the only way to get a good ‘X’ letter.

Also when you do make a request, please feel free to mention your blog, Twitter feed, YouTube channel, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.

So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences. I probably don’t want to revisit them. But I will think over, if I get a request, whether I might have new opinions.

#### Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

#### Available Letters for the Fall 2018 A To Z:

• U
• V
• W
• X
• Y
• Z

All of my Fall 2018 Mathematics A-To-Z should appear at this link. And it’ll have some extra stuff like these topic-request pages and such.

## My 2018 Mathematics A To Z: Quadratic Equation

I have another topic today suggested by Dina Yagodich. I’ve mentioned her YouTube channel before. It’s got a variety of educational videos you might enjoy. Give it a try.

I’m planning this week to open up the end of the alphabet — and the year — to topic suggestions. So there’s no need to panic about that.

# Quadratic Equation.

The Quadratic Equation is the tool humanity used to discover mathematics. Yes, I exaggerate a bit. But it touches a stunning array of important things. It is most noteworthy because of the time I impressed my several-levels-removed boss at the summer job I had while an undergraduate. He had been stumped by a data-optimization problem for weeks. I noticed it was just a quadratic equation, which is easy to solve. He was, it must be said, overly impressed. I would go on to grad school, where I was once stymied for a week because I couldn’t find the derivative of $e^t$ correctly. It is, correctly, $e^t$. So I have sympathy for my remote supervisor.

We normally write the Quadratic Equation in one of two forms:

$ax^2 + bx + c = 0$

$a_0 + a_1 x + a_2 x^2 = 0$

The first form is great when you are first learning about polynomials, and parabolas. And you’re content to stop at something raised to the second power. The second form is great when you are learning advanced stuff about polynomials. Then you start wanting to know things true about polynomials that go up to arbitrarily high powers. And we always want to know about polynomials. The subscripts under $a_j$ mean we can’t run out of letters to be coefficients. Setting the subscripts and powers to keep increasing lets us write this out neatly.

We don’t have to use x. We never do. But we mostly use x. Maybe t, if we’re writing an equation that describes something changing with time. Maybe z, if we want to emphasize how complex-valued numbers might enter into things. The name of the independent variable doesn’t matter. But stick to the obvious choices. If you’re going to make the variable ‘f’ you better have a good reason.

The equation is very old. We have ancient Babylonian clay tablets which describe it. Well, not the quadratic equation as we write it. The oldest problems put it as finding numbers that simultaneously solve two equations, one of them a sum and one of them a product. Changing one equation into two is a venerable mathematical process. It often makes problems simpler. We do this all the time in Ordinary Differential Equations. I doubt there is a direct connection between Ordinary Differential Equations and this alternate form of the Quadratic Equation. But it is a reminder that the ways we express mathematical problems are our conventions. We can rewrite problems to make our lives easier, to make answers clearer. We should look for chances to do that.
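
To sketch the old form: the tablets ask for two numbers given their sum and product, and that pair of equations is a quadratic in disguise, since

$x + y = s, \quad xy = p \qquad \Longrightarrow \qquad x \text{ and } y \text{ both solve } t^2 - st + p = 0$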

It weaves into everything. Some things seem obvious. Suppose the coefficients — a, b, and c; or $a_0, a_1, a_2$ if you’d rather — are all real-valued numbers. Then the quadratic equation has to have two solutions. There can be two real-valued solutions. There can be one real-valued solution, counted twice for reasons that make sense but are too much a digression for me to justify here. There can be two complex-valued solutions. We can infer the usefulness of imaginary and complex-valued numbers by finding solutions to the quadratic equation.

(The quadratic equation is a great introduction to complex-valued numbers. It’s not how mathematicians came to them. Complex-valued numbers looked like obvious nonsense. They corresponded to there being no real-valued answers. A formula that gives obvious nonsense when there’s no answer is great. It’s formulas that give subtle nonsense when there’s no answer that are dangerous. But similar-in-design formulas for cubic and quartic polynomials could use complex-valued numbers in intermediate steps. Plunging ahead as though these complex-valued numbers were proper would get to the real-valued answers. This made the argument that complex-valued numbers should be taken seriously.)

We learn useful things right away from trying to solve it. We teach students to “complete the square” as a first approach to solving it. Completing the square is not that useful by itself: a few pages later in the textbook we get to the quadratic formula and that has every quadratic equation solved. Just plug numbers into the formula. But completing the square teaches something more useful than just how to solve an equation. It’s a method in which we solve a problem by saying, you know, this would be easy to solve if only it were different. And then thinking how to change it into a different-looking problem with the same solutions. This is brilliant work. A mathematician is imagined to have all sorts of brilliant ideas on how to solve problems. Closer to the truth is that she’s learned all sorts of brilliant ways to make a problem more like one she already knows how to solve. (This is the nugget of truth which makes one genre of mathematical jokes work. These jokes have the punch line: the mathematician declares “this is a problem already solved” and goes back to sleep.)
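
Sketched out, the completed square turns the equation into something whose solutions can just be read off:

$ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2 + c - \frac{b^2}{4a} = 0 \qquad \Longrightarrow \qquad x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$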

Stare at the solutions of the quadratic equation. You will find patterns. Suppose the coefficients are all rational numbers. Then there are some numbers that can be solutions: 0, 1, the square root of 15, -3.5, these can all turn up. There are some numbers that can’t be. π. e. The tangent of 2. It’s not just a division between rational and irrational numbers. There are different kinds of irrational numbers. This — alongside looking at other polynomials — leads us to transcendental numbers.

Keep staring at the two solutions of the quadratic equation. You’ll notice the sum of the solutions is $-\frac{b}{a}$. You’ll notice the product of the two solutions is $\frac{c}{a}$. You’ll glance back at those ancient Babylonian tablets. This seems interesting, but little more than that. It’s a lead, though. Similar formulas exist for the sum of the solutions for a cubic, for a quartic, for other polynomials. Also for the sum of products of pairs of these solutions. Or the sum of products of triplets of these solutions. Or the product of all these solutions. These are known as Vieta’s Formulas, after the 16th-century mathematician François Viète. (This by way of his Latinized, academic-persona name, Franciscus Vieta.) This gives us a way to rewrite the original polynomial as a set of polynomials in several variables. What’s interesting is that the set of polynomials has symmetries. They all look like, oh, “xy + yz + zx”. No one variable gets used in a way distinguishable from the others.

This leads us to group theory. The coefficients start out in a ring. The quotients from these Vieta’s Formulas give us an “extension” of the ring. An extension is roughly what the common use of the word suggests. It takes the ring and builds from it a bigger thing that satisfies some nice interesting rules. And it leads us to surprises. The ancient Greeks had several challenges to be done with only straightedge and compass. One was to make a cube double the volume of a given cube. It’s impossible to do, with these tools. (Even ignoring the question of what we would draw on.) Another was to trisect any arbitrary angle; it turns out, there are angles for which it’s just impossible. The group theory derived, in part, from this tells us why. One more impossibility: drawing a square that has exactly the same area as a given circle.

But there are possible things still. Step back from the quadratic equation, that $ax^2 + bx + c = 0$ bit. Make a function, instead, something that matches numbers (real, complex, what have you) to numbers (the same). Its rule: any x in the domain matches to the number $f(x) = ax^2 + bx + c$ in the range. We can make a picture that represents this. Set Cartesian coordinates — the x and y coordinates that people think of as the default — on a surface. Then highlight all the points with coordinates (x, y) which make true the equation $y = f(x)$. This traces out a particular shape, the parabola.

Draw a line that crosses this parabola twice. There’s now one fully-enclosed piece of the surface. How much area is enclosed there? It’s possible to find a triangle with area three-quarters that of the enclosed part. It’s easy to use straightedge and compass to draw a square the same area as a given triangle. Showing the enclosed area is four-thirds the triangle’s area? That can … kind of … be done by straightedge and compass. It takes infinitely many steps to do this. But if you’re willing to allow a process to go on forever? And you show that the process would reach some fixed, knowable answer? This could be done by the ancient Greeks; indeed, it was. Archimedes used this as an example of the method of exhaustion. It’s one of the ideas that reaches toward integral calculus.

This has been a lot of exact, “analytic” results. There are neat numerical results too. Vieta’s formulas, for example, give us good ways to find approximate solutions of the quadratic equation. They work well if one solution is much bigger than the other. Numerical methods for finding solutions tend to work better if you can start from a decent estimate of the answer. And you can learn of numerical stability, and the need for it, studying these.

Numerical calculations have a problem. We have a set number of decimal places with which to work. What happens if we need a calculation that takes more decimal places than we’re given to do perfectly? Here’s a toy version: two-thirds is the number 0.6666. Or 0.6667. Already we’re in trouble. What is three times two-thirds? We’re going to get either 1.9998 or 2.0001 and either way something’s wrong. The wrongness looks small. But any formula you want to use has some numbers that will turn these small errors into big ones. So numerical stability is, in fairness, not something unique to the quadratic equation. It is something you learn if you study the numerics of the equation deeply enough.
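
Here’s a sketch of the stability trick the quadratic formula teaches, using Vieta’s product of the roots to dodge the subtraction that loses digits. This is standard numerical-analysis folklore rather than anything exotic:

```python
import math

def solve_quadratic(a, b, c):
    """Real roots of ax^2 + bx + c = 0, avoiding the catastrophic
    cancellation the naive formula suffers when b*b >> 4*a*c."""
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("complex roots; not handled in this sketch")
    # Add quantities of like sign; never subtract nearly-equal ones.
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    return q / a, c / q   # the second root comes from x1 * x2 = c/a

print(solve_quadratic(1.0, 1e8, 1.0))
# (-100000000.0, -1e-08); the naive formula returns 0.0 for the
# small root, which is wholly wrong.
```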

I’m also delighted to learn, through Wikipedia, that there’s a prosthaphaeretic method for solving the quadratic equation. Prosthaphaeretic methods use trigonometric functions and identities to rewrite problems. You might call it madness to rely on arctangents and half-angle formulas and such instead of, oh, doing a division or taking a square root. This is because you have calculators. But if you don’t? If you have to do all that work by hand? That’s terrible. But if someone has already prepared a table listing the sines and cosines and tangents of a great variety of angles? They did a great many calculations already. You just need to pick out the one that tells you what you hope to know. I’ll spare you the steps of solving the quadratic equation using trig tables. Wikipedia describes it fine enough.

So you see how much mathematics this connects to. It’s a bit of question-begging to call it that important. As I said, we’ve known the quadratic equation for a long time. We’ve thought about it for a long while. It would be surprising if we didn’t find many and deep links to other things. Even if it didn’t have links, we would try to understand new mathematical tools in terms of how they affect familiar old problems like this. But these are some of the things which we’ve found, and which run through much of what we understand mathematics to be.

The letter ‘R’ for this Fall 2018 Mathematics A-To-Z post should be published Friday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Pigeonhole Principle

Today’s topic is another that Dina Yagodich offered me. She keeps up a YouTube channel, with a variety of educational videos you might enjoy. And I apologize to Roy Kassinger, but I might come back around to “parasitic numbers” if I feel like doing some supplements or side terms.

# Pigeonhole Principle.

Mathematics has a reputation for obscurity. It’s full of jargon. All these weird, technical terms. Properties that mathematicians take to be obvious but that normal people find baffling. “The only word I understood was ‘the’,” is the feedback mathematicians get when they show their family their thesis. I’m happy to share one that is not. This is one of those principles that anyone can understand. It’s so accessible that people might ask how it’s even mathematics.

The Pigeonhole Principle is usually put something like this. If you have more pigeons than there are different holes to nest them in, then at least one pigeonhole has to hold more than one pigeon. This is if the speaker wishes to keep pigeons in the discussion and is assuming there’s a positive number of both pigeons and holes. Tying a mathematical principle to something specific seems odd. We don’t talk about addition as apples put together or divided among friends. Not after elementary school anyway. Not once we’ve trained our natural number sense to work with abstractions.

If we want to make it abstract we can. Put it as “if you have more objects to put in boxes than you have boxes, then at least one box must hold more than one object”. In this form it is known as the Dirichlet Box Principle. Dirichlet here is Johann Peter Gustav Lejeune Dirichlet. He’s one of the seemingly infinite number of great 19th-century French-German mathematicians. His family name was “Lejeune Dirichlet”, so his surname is an example of his own box principle. Everyone speaks of him as just Dirichlet, though. And they speak of him a lot, for stuff in mathematical physics, in thermodynamics, in Fourier transforms, in number theory (he proved two specific cases of Fermat’s Last Theorem), and in probability.

Still, at least in my experience, it’s “pigeonhole principle”. I don’t know why pigeons. It would be as good a metaphor to speak of horses put in stalls, or letters put in mailboxes, or pairs of socks put in hotel drawers. Perhaps it’s a reflection of the long history of breeding pigeons. That they’re familiar, likable animals, despite invective. That a bird in a cubby-hole seems like a cozy, pleasant image.

The pigeonhole principle is one of those neat little utility theorems. I think of it as something handy for existence proofs. These are proofs where you show there must be a thing. They don’t typically tell you what the thing is, or even help you to find it. They promise there is something to find.

Some of its uses seem too boring to bother proving. Pick five cards from a standard deck of cards; at least two will be the same suit. There are at least two non-bald people in Philadelphia who have the same number of hairs on their heads. Some of these uses seem interesting enough to prove, but worth nothing more than a shrug and a huh. Any 27-word sequence in the Constitution of the United States includes at least two words that begin with the same letter. Also at least two words that end with the same letter. If you pick any five integers from 1 to 8 (inclusive), then at least two of them will sum to nine.
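
That last one is easy to check by brute force, if you don’t feel like hunting for the four pairs that sum to nine. A spot check of the claim, reading “five integers” as five distinct ones:

```python
from itertools import combinations

# The holes are the pairs {1,8}, {2,7}, {3,6}, {4,5}; five distinct
# pigeons drawn from 1..8 must land two in one hole.
ok = all(
    any(x + y == 9 for x, y in combinations(pick, 2))
    for pick in combinations(range(1, 9), 5)
)
print(ok)  # True, across all 56 possible choices
```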

Some uses start feeling unsettling. Draw five dots on the surface of an orange. It’s always possible to cut the orange in half in such a way that four points are on the same half. (This supposes that a point on the cut counts as being on both halves.)

Pick a set of 100 different whole numbers. It is always possible to select fifteen of these numbers, so that the difference between any pair of these fifteen is some whole multiple of 7.

Select six people. There is always a trio who all know one another, or who are all strangers to one another. (This supposes that “knowing one another” is symmetric. Real-world relationships are messier than that. I have met Roger Penrose. There is not the slightest chance he is aware of this. Do we count as knowing one another or not?)

Some seem to transcend what we could possibly know. Drop ten points anywhere along a circle of diameter 5. Then we can conclude there are at least two points a distance of less than 2 from one another.

Drop ten points into an equilateral triangle whose sides are all length 1. Then there must be at least two points that are no more than distance $\frac{1}{3}$ apart.

Take a line of length L. Drop on it some number of points n + 1. There is some shortest length between consecutive points. What is the largest possible shortest-length-between-points? It is the distance $\frac{L}{n}$.

As I say, this won’t help you find the examples. You need to check the points in your triangle to see which ones are close to one another. You need to try out possible sets of your 100 numbers to find the ones that are all multiples of seven apart. But you have the assurance that the search will end in success, which is no small thing. And many of the conclusions you can draw are delights: results unexpected and surprising and wonderful. It’s great mathematics.

A note on sources. I drew pigeonhole-principle uses from several places. John Allen Paulos’s Innumeracy. Paulos is another I know without their knowing me. But also 16 Fun Applications of the Pigeonhole Principle, by Presh Talwalkar. Brilliant.org’s Pigeonhole Principle, by Lawrence Chiou, Parth Lohomi, Andrew Ellinor, et al. The Art of Problem Solving’s Pigeonhole Principle, author not made clear on the page. If you’re stumped by how to prove one or more of these claims, and don’t feel like talking them out here, try these pages.

My next Fall 2018 Mathematics A-To-Z post should be Tuesday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Oriented Graph

I am surprised to have had no suggestions for an ‘O’ letter. I’m glad to take a free choice, certainly. It let me get at one of those fields I didn’t specialize in, but could easily have. And let me mention that while I’m still taking suggestions for the letters P through T, each other letter has gotten at least one nomination. I can be swayed by a neat term, though, so if you’ve thought of something hard to resist, try me. And later this month I’ll open up the letters U through Z. Might want to start thinking right away about what X, Y, and Z could be.

# Oriented Graph.

This is another term from graph theory, one of the great mathematical subjects for doodlers. A graph, here, is made of two sets of things. One is a bunch of fixed points, called ‘vertices’. The other is a bunch of curves, called ‘edges’. Every edge starts at one vertex and ends at one vertex. We don’t require that every vertex have an edge grow from it.

Already you can see why this is a fun subject. It models some stuff really well. Like, anything where you have a bunch of sources of stuff, that come together and spread out again? Chances are there’s a graph that describes this. There’s a compelling all-purpose interpretation. Have vertices represent the spots where something accumulates, or rests, or changes, or whatever. Have edges represent the paths along which something can move. This covers so much.

The next step is a “directed graph”. This comes from making the edges different. If we don’t say otherwise we suppose that stuff can move along an edge in either direction. But suppose otherwise. Suppose there are some edges that can be used in only one direction. This makes a “directed edge”. It’s easy to see once you think of graphs as networks of stuff like city streets. Once you ponder that, one-way streets follow close behind. If every edge in a graph is directed, then you have a directed graph. Moving from a regular old undirected graph to a directed graph changes everything you’d learned about graph theory. Mostly it makes things harder. But you get some good things in trade. We become able to model sources, for example. This is where whatever might move comes from. Also sinks, which is where whatever might move disappears from our consideration.

You might fear that by switching to a directed graph there’s no way to have a two-way connection between a pair of vertices. Or that if there is you have to go through some third vertex. I understand your fear, and wish to reassure you. We can get a two-way connection even in a directed graph: just have the same two vertices be connected by two edges. One goes one way, one goes the other. I hope you feel some comfort.

What if we don’t have that, though? What if the directed graph doesn’t have any pair of vertices joined by opposite-directed edges? And that, then, is an oriented graph. We get the orientation from looking at pairs of vertices. Each pair either has no edge connecting them, or has a single directed edge between them.
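If a bit of code helps the definition settle, here’s a minimal sketch. Everything in it is made up for illustration: a directed graph as a set of (from, to) pairs, and a test of whether it’s oriented.

```python
# A directed graph as a set of (from, to) pairs. It's oriented when
# no pair of vertices is joined in both directions.
def is_oriented(edges):
    return all((b, a) not in edges for (a, b) in edges)

two_way = {("A", "B"), ("B", "A"), ("B", "C")}  # directed, but not oriented
one_way = {("A", "B"), ("B", "C"), ("C", "A")}  # oriented: a 3-cycle
print(is_oriented(two_way), is_oriented(one_way))  # False True
```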

There are a lot of potential oriented graphs. If you have three vertices, for example, there are seven oriented graphs to make of them. You’re allowed to have a vertex not connected to any others. You’re also allowed to have the vertices grouped into a couple of subsets, each vertex connecting only to others in its own subset. This is part of why four vertices can give you 42 different oriented graphs. Five vertices can give you 582 different oriented graphs. You can, though, insist on a connected oriented graph.

A connected graph is what you’d guess. It’s a graph where there are no vertices off on their own, unconnected to anything. There are no subsets of vertices connected only to each other. This doesn’t mean you can always get from any one vertex to any other vertex. The directions might not allow you to do that. But if you’re willing to break the laws, and ignore the directions of these edges, you could then get from any vertex to any other vertex. Limiting yourself to connected graphs reduces the number of oriented graphs you can get. But not by as much as you might guess, at least not to start. There’s only one connected oriented graph for two vertices, instead of two. Three vertices have five connected oriented graphs, rather than seven. Four vertices have 34, rather than 42. Five vertices, 535 rather than 582. The total number of lost graphs grows, of course. The percentage of lost graphs dwindles, though.
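You don’t have to take those counts on faith, at least for small cases. Here’s a sketch that enumerates every way to give each pair of three vertices no edge or one directed edge, collapses the ones that are mere relabelings of one another, and counts which are connected. It should report 7 and 5.

```python
# Enumerate oriented graphs on three vertices, up to relabeling the
# vertices; count them all, and count the connected ones.
from itertools import permutations, product

PAIRS = [(0, 1), (0, 2), (1, 2)]
FLIP = {0: 0, 1: 2, 2: 1}  # swapping a pair's endpoints reverses its edge

def canonical(states):
    """Smallest relabeling of (state per pair); isomorphic graphs match."""
    best = None
    for perm in permutations(range(3)):
        relabeled = []
        for (a, b), s in zip(PAIRS, states):
            a2, b2 = perm[a], perm[b]
            if a2 > b2:
                a2, b2, s = b2, a2, FLIP[s]
            relabeled.append(((a2, b2), s))
        key = tuple(s for _, s in sorted(relabeled))
        best = key if best is None or key < best else best
    return best

def connected(states):
    """Connected in the lawbreaking sense: ignore the edge directions."""
    adj = {v: set() for v in range(3)}
    for (a, b), s in zip(PAIRS, states):
        if s:
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = {0}, [0]
    while stack:
        for v in adj[stack.pop()]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == 3

# Each pair of vertices: 0 = no edge, 1 = low-to-high, 2 = high-to-low.
all_graphs, connected_graphs = set(), set()
for states in product((0, 1, 2), repeat=3):
    key = canonical(states)
    all_graphs.add(key)
    if connected(states):
        connected_graphs.add(key)

print(len(all_graphs), len(connected_graphs))  # 7 5
```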

There’s something more. What if there are no unconnected vertices? That is, what if every pair of vertices has an edge? If every pair of vertices in a graph has a direct connection we call that a “complete” graph. This is true whether the graph is directed or not. If you do have a complete oriented graph — every pair of vertices has a direct connection, and only the one direction — then that’s a “tournament”. If that seems like a whimsical name, consider one interpretation of it. Imagine a sports tournament in which every team plays every other team once. And there are no ties. Each vertex represents one team. Each edge is the match played by the two teams. The direction is, let’s say, from the losing team to the winning team. (It’s just as good if the direction is from the winning team to the losing team.) Then you have a complete oriented graph. And it represents your tournament.

And that delights me. A mathematician like me might talk a good game about building models. How one can represent things with mathematical constructs. Here, it’s done. You can make little dots, for vertices, and curved lines with arrows, for edges. And draw a picture that shows how a round-robin tournament works. It can be that direct.
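Or in code rather than construction paper. A hedged little sketch, with made-up teams, that checks the defining property: every pair of teams joined by exactly one directed edge.

```python
# A tournament: every pair of vertices joined by exactly one directed
# edge. Here the edge runs from the losing team to the winner.
from itertools import combinations

teams = ["Lions", "Owls", "Bears"]   # hypothetical round-robin
games = {("Owls", "Lions"),          # Lions beat Owls
         ("Bears", "Owls"),          # Owls beat Bears
         ("Bears", "Lions")}         # Lions beat Bears

def is_tournament(vertices, edges):
    # Exactly one direction between every pair, never both, never neither.
    return all(((a, b) in edges) != ((b, a) in edges)
               for a, b in combinations(vertices, 2))

print(is_tournament(teams, games))  # True
```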

My next Fall 2018 Mathematics A-To-Z post should be Friday. It’ll be available at this link, as are the rest of these glossary posts. And I’ve got requests for the next letter. I just have to live up to at least one of them.

## My 2018 Mathematics A To Z: Nearest Neighbor Model

I had a free choice of topics for today! Nobody had a suggestion for the letter ‘N’, so, I’ll take one of my own. If you did put in a suggestion, I apologize; I somehow missed the comment in which you did. I’ll try to do better in future.

# Nearest Neighbor Model.

Why are restaurants noisy?

It’s one of those things I wondered while at a noisy restaurant. I have heard it is because restaurateurs believe patrons buy more, and more expensive stuff, in a noisy place. I don’t know that I have heard this correctly, nor that what I heard was correct. I’ll leave it to people who work that end of restaurants to say. But I wondered idly whether mathematics could answer why.

It’s easy to form a rough model. Suppose I want my brilliant words to be heard by the delightful people at my table. Then I have to be louder, to them, than the background noise is. Fine. I don’t like talking loudly. My normal voice is soft enough even I have a hard time making it out. And I’ll drop the ends of sentences when I feel like I’ve said all the interesting parts of them. But I can overcome my instinct if I must.

The trouble comes from other people thinking of themselves the way I think of myself. They want to be heard over how loud I have been. And there’s no convincing them they’re wrong. If there’s bunches of tables near one another, we’re going to have trouble. We’ll each be talking loud enough to drown one another out, until the whole place is a racket. If we’re close enough together, that is. If the tables around mine are empty, chances are my normal voice is enough for the cause. If they’re not, we might have trouble.

So this inspires a model. The restaurant is a space. The tables are set positions, points inside it. Each table is making some volume of noise. Each table is trying to be louder than the background noise. At least until the people at the table reach the limits of their screaming. Or decide they can’t talk, they’ll just eat and go somewhere pleasant.

Making calculations on this demands some more work. Some is obvious: how do you represent “quiet” and “loud”? Some is harder: how far do voices carry? Grant that a loud table is still loud if you’re near it. How far away before it doesn’t sound loud? How far away before you can’t hear it anyway? Imagine a dining room that’s 100 miles long. There’s no possible party at one end that could ever be heard at the other. Never mind that a 100-mile-long restaurant would be absurd. It shows that the limits of people’s voices are a thing we have to consider.

There are many ways to model this distance effect. A realistic one would fall off with distance, sure. But it would also allow for echoes and absorption by the walls, and by other patrons, and maybe by restaurant decor. This would take forever to get answers from, but if done right it would get very good answers. A simpler model would give answers less fitted to your actual restaurant. But the answers may be close enough, and let you understand the system. And may be simple enough that you can get answers quickly. Maybe even by hand.

And so I come to the “nearest neighbor model”. The common English meanings of the words suggest what it’s about. We get it from models like my restaurant noise problem. It’s made of a bunch of points that have some value. For my problem, tables and their noise levels. And that value affects stuff in some region around these points.

In the “nearest neighbor model”, each point directly affects only its nearest neighbors. Saying which is the nearest neighbor is easy if the points are arranged in some regular grid. If they’re evenly spaced points on a line, say. Or a square grid. Or a triangular grid. If the points are in some other pattern, you need to think about what the nearest neighbors are. This is why people working in neighbor-nearness problems get paid the big money.

Suppose I use a nearest neighbor model for my restaurant problem. In this, I pretend the only background noise at my table is that of the people the next table over, in each direction. Two tables over? Nope. I don’t hear them at my table. I do get an indirect effect. Two tables over affects the table that’s between mine and theirs. But vice-versa, too. The table that’s 100 miles away can’t affect me directly, but it can affect a table in-between it and me. And that in-between table can affect the next one closer to me, and so on. The effect is attenuated, yes. Shouldn’t it be, if we’re looking at something farther away?
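Here’s a toy version of that, with every number in it invented purely for illustration. Each table tries to talk a margin louder than the average of its nearest neighbors, bounded below by a quiet conversational voice and above by the limits of screaming:

```python
# A toy nearest-neighbor restaurant: a row of tables, each trying to
# be a bit louder than its immediate neighbors' average, capped at a
# screaming limit. All the parameters here are arbitrary.
N, QUIET, LIMIT, MARGIN = 12, 1.0, 10.0, 1.2

volume = [QUIET] * N
volume[5] = 6.0  # one boisterous table to get things going

for _ in range(200):  # iterate until the noise settles down
    new = []
    for i in range(N):
        nbrs = [volume[j] for j in (i - 1, i + 1) if 0 <= j < N]
        background = sum(nbrs) / len(nbrs)
        new.append(min(LIMIT, max(QUIET, MARGIN * background)))
    volume = new

print([round(v, 1) for v in volume])  # the whole row hits the limit
```

Run it and the whole row creeps up to the screaming limit, which is about what the noisy-restaurant story predicts whenever everyone insists on a margin over everyone else.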

This sort of model is easy to work with numerically. I’m inclined toward problems that work numerically. Analytically … well, it can be easy. It can be hard. There’s a one-dimensional version of this problem, a bunch of evenly-spaced sites on an infinitely long line. If each site is limited to one of exactly two values, the problem becomes easy enough that freshman physics majors can solve it exactly. They don’t, not the first time out. This is because it requires recognizing a trigonometry trick that they don’t realize would be relevant. But once they know the trick, they agree it’s easy, when they go back two years later and look at it again. It just takes familiarity.

This comes up in thermodynamics, because it makes a nice model for how ferromagnetism can work. More realistic problems, like two-dimensional grids? … That’s harder to solve exactly. Can be done, though not by undergraduates. Three-dimensional can’t, last time I looked. Weirdly, things get easier again in four or more dimensions, in the sense that simple mean-field calculations turn out to describe the critical behavior. You expect problems to only get harder with more dimensions of space, and then you get a surprise like that.
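For the record, the one-dimensional, two-values-per-site system the undergraduates get is the one usually called the Ising chain. If you’d like to see its exact answer without the trigonometric cleverness, here’s a numeric check of the zero-field, free-end case against brute force. The coupling-over-temperature parameter K is arbitrary, and the chain is kept short on purpose:

```python
# The one-dimensional two-value (Ising) chain, done two ways: brute
# force over every state of a short open chain, and the exact formula
# Z = 2 * (2*cosh(K))^(N-1).
import math
from itertools import product

N, K = 8, 0.5

brute = sum(
    math.exp(K * sum(s[i] * s[i + 1] for i in range(N - 1)))
    for s in product((-1, 1), repeat=N)
)
exact = 2 * (2 * math.cosh(K)) ** (N - 1)
print(brute, exact)  # agree, up to floating-point rounding
```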

The nearest-neighbor model is a first choice. It’s hardly the only one. If I told you there were a next-nearest-neighbor model, what would you suppose it was? Yeah, you’d be right. As long as you supposed it was “things are affected by the nearest and the next-nearest neighbors”. Mathematicians have heard of loopholes too, you know.

As for my restaurant model? … I never actually modelled it. I did think about the model. I concluded my model wasn’t different enough from ferromagnetism models to need me to study it more. I might be mistaken. There may be interesting weird effects caused by the facts of restaurants. That restaurants are pretty small things. That they can have echo-y walls and ceilings. That they can have sound-absorbing things like partial walls or plants. Perhaps I gave up too easily when I thought I knew the answer. Some of my idle thoughts end up too idle.

I should have my next Fall 2018 Mathematics A-To-Z post on Tuesday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Manifold

Two commenters suggested the topic for today’s A to Z post. I suspect I’d have been interested in it if only one had. (Although Dina Yagoditch’s suggestion of the Menger Sponge is hard to resist.) But a double nomination? The topic got suggested by Mr Wu, author of MathTuition88, and by John Golden, author of Math Hombre. My thanks to all for interesting things to think about.

# Manifold.

So you know how in the first car you ever owned the alternator was always going bad? If you’re lucky, you reach a point where you start owning cars good enough that the alternator is not the thing always going bad. Once you’re there, congratulations. Now the thing that’s always going bad in your car will be the manifold. That one’s for my dad.

Manifolds are a way to do normal geometry on weird shapes. What’s normal geometry? It’s … you know, the way shapes work on your table, or in a room. The Euclidean geometry that we’re so used to that it’s hard to imagine it not working. Why worry about weird shapes? They’re interesting, for one. And they don’t have to be that weird to count as weird. A sphere, like the surface of the Earth, can be weird. And these weird shapes can be useful. Mathematical physics, for example, can represent the evolution of some complicated thing as a path drawn on a weird shape. Bringing what we know about geometry from years of study, and moving around rooms, to a problem that abstract makes our lives easier.

We use language that sounds like that of map-makers when discussing manifolds. We have maps. We gather together charts. The collection of charts describing a surface can be an atlas. All these words have common meanings. Mercifully, these common meanings don’t lead us too far from the mathematical meanings. We can even use the problem of mapping the surface of the Earth to understand manifolds.

If you love maps, the geography kind, you learn quickly that there’s no making a perfect two-dimensional map of the Earth’s surface. Some of these imperfections are obvious. You can distort shapes trying to make a flat map of the globe. You can distort sizes. But you can’t represent every point on the globe with a point on the paper. Not without doing something that really breaks continuity. Like, say, turning the North Pole into the whole line at the top of the map. Like in the Equirectangular projection. Or skipping some of the points, like in the Mercator projection. Or adding some cuts into a surface that doesn’t have them, like in the Goode homolosine projection. You may recognize this as the one used in classrooms back when the world had first begun.

But what if we don’t need the whole globe done in a single map? Turns out we can do that easily. We can make charts that cover a part of the surface. No one chart has to cover the whole of the Earth’s surface. It only has to cover some part of it. Each chart covers its part of the globe with a piece that looks like a common ordinary Euclidean space, where ordinary geometry holds. It’s the collection of charts that covers the whole surface. This collection of charts is an atlas. You have a manifold if it’s possible to make a coherent atlas. For this every point on the manifold has to be on at least one chart. It’s okay if a point is on several charts. It’s okay if some point is on all the charts. Like, suppose your original surface is a circle. You can represent this with an atlas of two charts. Each chart maps the circle, except for one point, onto a line segment. The two charts don’t both skip the same point. All but two points on the circle are on both charts of this atlas. That’s cool. What’s not okay is if some point can’t be coherently put onto some chart.
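Here’s that two-chart circle atlas as a sketch in code, if that helps it feel concrete. Each chart turns its patch of the circle into an open interval of angle values, and the two charts skip different points. (The sketch doesn’t police each chart’s one forbidden point; a real chart simply wouldn’t be defined there.)

```python
# Two charts covering the unit circle. Chart A misses only the point
# (1, 0); chart B misses only (-1, 0). Each sends its patch to an
# open interval of the real line.
import math

def chart_a(x, y):
    t = math.atan2(y, x)                     # angle in (-pi, pi]
    return t if t > 0 else t + 2 * math.pi   # shifted to (0, 2*pi)

def chart_b(x, y):
    return math.atan2(y, x)                  # angle in (-pi, pi)

point = (math.cos(2.0), math.sin(2.0))   # a point covered by both charts
print(chart_a(*point), chart_b(*point))  # both report angle 2.0 here
```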

That sad fate, a point that can’t be coherently charted, can happen. Suppose instead of a circle you want to chart a figure-eight loop. That won’t work. The point where the figure crosses itself doesn’t look, locally, like a Euclidean space. It looks like an ‘x’. There’s no getting around that. There’s no atlas that can cover the whole of that surface. So that surface isn’t a manifold.

But many things are manifolds nevertheless. Toruses, the doughnut shapes, are. Möbius strips and Klein bottles are. Ellipsoids and hyperbolic surfaces are, or at least can be. Mathematical physics finds surfaces that describe all the ways the planets could move and still conserve the energy and momentum and angular momentum of the solar system. That cheesecloth surface, stretched through 54 dimensions, is a manifold. There are many possible atlases, with many more charts. But each of those means we can, at least locally, for particular problems, understand them the same way we understand cutouts of triangles and pentagons and circles on construction paper.

So to get back to cars: no one has ever said “my car runs okay, but I regret how I replaced the brake covers the moment I suspected they were wearing out”. Every car problem is easier when it’s done as soon as your budget and schedule allow.

This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. What will I choose for ‘N’, later this week? I really should have decided that by now.

## My 2018 Mathematics A To Z: Limit

I got an irresistible topic for today’s essay. It’s courtesy Peter Mander, author of Carnot Cycle, “the classical blog about thermodynamics”. It’s bimonthly and it’s one worth waiting for. Some of the essays are historical; some are statistical-mechanics; many are mixtures of them. You could make a fair argument that thermodynamics is the most important field of physics. It’s certainly one that hasn’t gotten the popularization treatment it deserves, for its importance. Mander is doing something to correct that.

# Limit.

It is hard to think of limits without thinking of motion. The language even professional mathematicians use suggests it. We speak of the limit of a function “as x goes to a”, or “as x goes to infinity”. Maybe “as x goes to zero”. But a function is a fixed thing, a relationship between stuff in a domain and stuff in a range. It can’t change any more than January, AD 1988 can change. And ‘x’ here is a dummy variable, part of the scaffolding to let us find what we want to know. I suppose ‘x’ can change, but if we ever see it, something’s gone very wrong. But we want to use it to learn something about a function for a point like ‘a’ or ‘infinity’ or ‘zero’.

The language of motion helps us learn, to a point. We can do little experiments: if $f(x) = \frac{\sin(x)}{x}$, then, what should we expect it to be for x near zero? It’s irresistible to try out the calculator. Let x be 0.1. 0.01. 0.001. 0.0001. The numbers say this f(x) gets closer and closer to 1. That’s good, right? We know we can’t just put in an x of zero, because there’s some trouble that makes. But we can imagine creeping up on the zero we really wanted. We might spot some obvious prospects for mischief: what if x is negative? We should try -0.1, -0.01, -0.001 and so on. And maybe we won’t get exactly the right answer. But if all we care about is the first (say) three digits and we try out a bunch of x’s and the corresponding f(x)’s agree to those three digits, that’s good enough, right?
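The experiment, for anyone who’d rather automate the calculator-tapping (Python, purely as illustration):

```python
# Creeping up on zero, from both sides, for f(x) = sin(x)/x.
import math

for x in (0.1, 0.01, 0.001, 0.0001, -0.1, -0.01, -0.001):
    print(x, math.sin(x) / x)   # the values crowd in on 1
```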

This is good for giving an idea of what to expect a limit to look like. It should be, well, what it really really really looks like a function should be. It takes some thinking to see where it might go wrong. It might go to different numbers based on which side you approach from. But that seems like something you can rationalize. Indeed, we do; we can speak of functions having different limits based on what direction you approach from. Sometimes that’s the best one can say about them.

But it can get worse. It’s possible to make functions that do crazy weird things. Some of these look like you’re just trying to be difficult. Like, set f(x) equal to 1 if x is rational and 0 if x is irrational. If you don’t expect that to be weird you’re not paying attention. Can’t blame someone for deciding that falls outside the realm of stuff you should be able to find limits for. And who would make, say, an f(x) that was 1 if x was 0.1 raised to some power, but 2 if x was 0.2 raised to some power, and 3 otherwise? Besides someone trying to prove a point?

Fine. But you can make a function that looks innocent and yet acts weird if the domain is two-dimensional. Or more. It makes sense to say that the functions I wrote in the above paragraph should be ruled out of consideration. But the limit of $f(x, y) = \frac{x^3 y}{x^6 + y^2}$ at the origin? You get different results approaching in different directions. And the function doesn’t give obvious signs of imminent danger here.
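You can watch that misbehavior numerically. A tiny sketch, approaching the origin along the path y = x and along the path y = x³:

```python
# f(x, y) = x^3 * y / (x^6 + y^2) near the origin, along two paths.
def f(x, y):
    return x**3 * y / (x**6 + y**2)

for t in (0.1, 0.01, 0.001):
    print(f(t, t), f(t, t**3))  # along y = x; along y = x^3
```

The first column dwindles toward zero while the second sits at exactly one-half, so no single number can be the limit at the origin.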

We need a better idea. And we even have one. This took centuries of mathematical wrangling and arguments about what should and shouldn’t be allowed. This should inspire sympathy with Intro Calc students who don’t understand all this by the end of week three. But here’s what we have.

I need a supplementary idea first. That is the neighborhood. A point has a neighborhood if there’s some open set that contains it. We represent this by drawing a little blob around the point we care about. If we’re looking at the neighborhood of a real number, then this is a little interval, that’s all. When we actually get around to calculating, we make these neighborhoods little circles. Maybe balls. But when we’re doing proofs about how limits work, or how we use them to prove things, we make blobs. This “neighborhood” idea looks simple, but we need it, so here we go.

So start with a function, named ‘f’. It has a domain, which I’ll call ‘D’. And a range, which I want to call ‘R’, but I don’t think I need the shorthand. Now pick some point ‘a’. This is the point at which we want to evaluate the limit. This seems like it ought to be called the “limit point” and it’s not. I’m sorry. Mathematicians use “limit point” to talk about something else. And, unfortunately, it makes so much sense in that context that we aren’t going to change away from that.

‘a’ might be in the domain ‘D’. It might not. It might be on the border of ‘D’. All that’s important is that there be a neighborhood of ‘a’ whose points, apart possibly from ‘a’ itself, are all inside ‘D’.

I don’t know what f(a) is. There might not even be an f(a), if a is on the boundary of the domain ‘D’. But I do know that everything inside the neighborhood of ‘a’, apart from ‘a’, is in the domain. So we can look at the values of f(x) for all the x’s in this neighborhood. This will create a set, in the range, that’s known as the image of the neighborhood. It might be a continuous chunk in the range. It might be a couple of chunks. It might be a single point. It might be some crazy-quilt set. Depends on ‘f’. And the neighborhood. No matter.

Now I need you to imagine the reverse. Pick a point in the range. And then draw a neighborhood around it. Then pick out what we call the pre-image of it. That’s all the points in the domain that get matched to values inside that neighborhood. Don’t worry about trying to do it; that’s for the homework practice. Would you agree with me that you can imagine it?

I hope so because I’m about to describe the part where Intro Calc students think hard about whether they need this class after all.

All right. Then I want something in the range. I’m going to call it ‘L’. And it’s special. It’s the limit of ‘f’ at ‘a’ if this following bit is true:

Think of every neighborhood you could pick of ‘L’. Can be big, can be small. Just has to be a neighborhood of ‘L’. Now think of the pre-image of that neighborhood. Is there always a neighborhood of ‘a’ inside that pre-image? It’s okay if it’s a tiny neighborhood. Just has to be an open neighborhood. It doesn’t quite have to contain ‘a’ itself. You can allow a pinpoint hole there; this is what gets called a punctured neighborhood.

If you can always do this, however tiny the neighborhood of ‘L’ is, then the limit of ‘f’ at ‘a’ is ‘L’. If you can’t always do this — if there’s even a single exception — then there is no limit of ‘f’ at ‘a’.
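If you’d like the whole dance compressed into symbols, here is one standard way to write it, with $N$ and $M$ standing for open neighborhoods and $D$ for the domain as above: $\lim_{x \to a} f(x) = L$ exactly when, for every neighborhood $N$ of $L$, there is some neighborhood $M$ of $a$ with $f\left( (M \cap D) \setminus \{a\} \right) \subseteq N$.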

I know. I felt like that the first couple times through the subject too. The definition feels backward. Worse, it feels like it begs the question. We suppose there’s an ‘L’ and then test these properties about it and then if it works we say we’re done? I know. It’s a pain when you start calculating this with specific formulas and all that, too. But supposing there is an answer and then learning properties about it, including whether it can exist? That’s a slick trick. We can use it.

Thing is, the pain is worth it. We can calculate with it and not have to out-think tricky functions. It works for domains with as many dimensions as you need. It works for limits that aren’t inside the domain. It works with domains and ranges that aren’t real numbers. It works for functions with weird and complicated domains. We can adapt it if we want to consider limits that are constrained in some way. It won’t be fooled by tricks like I put up above, the f(x) with different rules for the rational and irrational numbers.

So mathematicians shrug, and do enough problems that they get the hang of it, and use this definition. It’s worth it, once you get there.

This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. And I’m still taking nominations for discussion topics, if you’d like to see mathematics terms explained. I know I would.

## I’m Looking For The Next Set Of Topics For My Fall 2018 Mathematics A-To-Z

We’re at the end of another month. So it’s a good chance to set out requests for the next several weeks’ worth of my mathematics A-To-Z. As I say, I’ve been doing this piecemeal so that I can keep track of requests better. I think it’s been working out, too.

If you have any mathematical topics with a name that starts N through T, let me know! I usually go by a first-come, first-served basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me.

Also when you do make a request, please feel free to mention your blog, Twitter feed, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.

So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences. I probably don’t want to revisit them. But I will think over, if I get a request, whether I might have new opinions.

#### Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

#### Available Letters for the Fall 2018 A To Z:

• N
• O
• P
• Q
• R
• S
• T

All of my Fall 2018 Mathematics A-To-Z should appear at this link. And it’ll have some extra stuff like these topic-request pages and such.

## My 2018 Mathematics A To Z: Kelvin (the scientist)

Today’s request is another from John Golden, @mathhombre on Twitter and similarly on Blogspot. It’s specifically for Kelvin — “scientist or temperature unit”, the sort of open-ended goal I delight in. I decided on the scientist. But that’s a lot even for what I honestly thought would be a quick little essay. So I’m going to take out a tiny slice of a long and amazingly fruitful career. There’s so much more than this.

Before I get into what I did pick, let me repeat an important warning about historical essays. Every history is incomplete, yes. But any claim about something being done for the first time is simplified to the point of being wrong. Any claim about an individual discovering or inventing something is simplified to the point of being wrong. Everything is more complicated and, especially, more ambiguous than this. If you do not love the challenge of working out a coherent narrative when the most discrete and specific facts are also the ones that are trivia, do not get into history. It will only break your heart and mislead your readers. With that disclaimer, let me try a tiny slice of the life of William Thomson, the Baron Kelvin.

# Kelvin (the scientist).

The great thing about a magnetic compass is that it’s easy. Set the thing on an axis and let it float freely. It aligns itself to the magnetic poles. It’s easy to see why this looks like magic.

The trouble is that it’s not quite right. It’s near enough for many purposes. But the direction a magnetic compass picks out as north is not true geographic north. Fortunately, we’ve got a fair idea just how far off north that is. It depends on where you are. If you have a rough idea where you already are, you can make a correction. We can print up charts saying how much of a correction to make.

The trouble is that it’s still not quite right. The location of the magnetic north and south poles wanders. Fortunately we’ve got a fair idea of how quickly it’s moving, and in what direction. So if you have a rough idea how out of date your chart is, and what direction the poles were moving in, you can make a correction. We can communicate how the variance between true north and magnetic north changes.

The trouble is that it’s still not quite right. The size of the variation depends on the season of the year. But all right; we should have a rough idea what season it is. We can correct for that. The size of the variation also depends on what time of day it is. Compasses point farther east at around 8 am (sun time) than they do the rest of the day, and farther west around 1 pm. At least they did when Alan Gurney’s Compass: A Story of Exploration and Innovation was published. I would be unsurprised if that’s changed since the book came out a dozen years ago. Still. These are all, we might say, global concerns. They’re based on where you are and when you look at the compass. But they don’t depend on you, the specific observer.

The trouble is that it’s still not quite right yet. Almost as soon as compasses were used for navigation, on ships, mariners noticed the compass could vary. And not just because compasses were often badly designed and badly made. The ships themselves got in the way. The problem started with guns, the iron of which led compasses astray. When it was just the ship’s guns the problem could be coped with. Set the compass binnacle far from any source of iron, and the error should be small enough.

The trouble is when the time comes to make ships with iron. There are great benefits you get from cladding ships in iron, or making them of iron altogether. Losing the benefits of navigation, though … that’s a bit much.

There’s an obvious answer. Suppose you know the construction of the ship throws off compass bearings. Then measure what the compass reads, at some point when you know what it should read. Use that to correct your measurements when you aren’t sure. From the early 1800s mariners could use a method called “swinging the ship”, setting the ship at known angles and comparing what the compass read. It’s a bit of a chore. And you should arrange things you need to do so that it’s harder to make a careless mistake at them.

In the 1850s John Gray of Liverpool patented a binnacle — the little pillar that holds the compass — which used the other obvious but brilliant approach. If the iron which builds the ship sends the compass awry, why not put iron near the compass to put the compass back where it should be? This set up a contraption of a binnacle surrounded by adjustable, correcting magnets.

Enter finally William Thomson, who would become Baron Kelvin in 1892. In 1871 the magazine Good Words asked him to write an article about the marine compass. In 1874 he published his first essay on the subject. The second part appeared five years after that. I am not certain that this is directly related to the tiny slice of story I tell. I just mention it to reassure every academic who’s falling behind on their paper-writing, which is all of them.

But come the 1880s Thomson patented an improved binnacle. Thomson had the sort of talents normally associated only with the heroes of some lovable yet dopey space-opera of the 1930s. He was a talented scientist, competent in thermodynamics and electricity and magnetism and fluid flow. He was a skilled mathematician, as you’d need to be to keep up with all that and along the way prove the Stokes theorem. (This is one of those incredibly useful theorems that gives information about the interior of a volume using only integrals over the surface.) He was a magnificent engineer, with a particular skill at developing instruments that would brilliantly measure delicate matters. He’s famous for saving the trans-Atlantic telegraph cable project. He recognized that what was needed was not more voltage to drive signal through three thousand miles of dubiously made copper wire, but rather ways to pick up the feeble signals that could come across, and amplify them into usability. And also described the forces at work on a ship that is laying a long line of submarine cable. And he was a manufacturer, able to turn these designs into mass-produced products. This through collaborating with James White, of Glasgow, for over half a century. And a businessman, able to convince people and organizations to use the things. He’s an implausible protagonist; and yet, there he is.

Thomson’s revision for the binnacle made it simpler. A pair of spheres, flanking the compass, and adjustable. The Royal Museums Greenwich web site offers a picture of this sort of system. It’s not so shiny as others in the collection. But this angle shows how adjustable the system would be. It’s a design that shows brilliance behind it. What work you might have to do to use it is obvious. At least it’s obvious once you’re told the spheres are adjustable. To reduce a massive, lingering, challenging problem to something easy is one of the great accomplishments of any practical mathematician.

This was not all Thomson did in maritime work. He’d developed an analog computer which would calculate the tides. Wikipedia tells me that Thomson claimed a similar mechanism could solve arbitrary differential equations. I’d accept that claim, if he made it. Thomson also developed better tools for sounding depths. And developed compasses proper, not just the correcting tools for binnacles. A maritime compass is a great practical challenge. It has to be able to move freely, so that it can give a correct direction even as the ship changes direction. But it can’t move too freely, or it becomes useless in rolling seas. It has to offer great precision, or it loses its use in directing long journeys. It has to be quick to read, or it won’t be consulted. Thomson designed a compass that was, my readings indicate, a great fit for all these constraints. By the time of his death in 1907 Kelvin and White (the company had various names) had made something like ten thousand compasses and binnacles.

And this from a person attached to all sorts of statistical mechanics stuff and who’s important for designing electrical circuits and the like.