## My 2018 Mathematics A To Z: Volume

Ray Kassinger, of the popular web comic Housepets!, had a silly suggestion when I went looking for topics. In one episode of Mystery Science Theater 3000, Crow T Robot gets the idea that you could describe the size of a space by the number of turkeys which fill it. (It’s based on like two minor mentions of “turkeys” in the show they were watching.)

I liked that episode. I’ve got happy memories of the time when I first saw it. I thought the sketch in which Crow T Robot got so volume-obsessed was goofy and dumb in the fun-nerd way.

I accept Mr Kassinger’s challenge only I’m going to take it seriously.

# Volume.

How big is a thing?

There is a legend about Thomas Edison. He was unimpressed with a new hire. So he hazed the college-trained engineer who deeply knew calculus. He demanded the engineer tell him the volume within a light bulb. The engineer went to work, making measurements of the shape of the bulb’s outside. And then started the calculations. This involves a calculus technique called “volumes of rotation”. This can tell the volume within a rotationally symmetric shape. It’s tedious, especially if the outer edge isn’t some special nice shape. Edison, fed up, took the bulb, filled it with water, poured that out into a graduated cylinder and said that was the answer.

I’m skeptical of legends. I’m skeptical of stories about the foolish intellectual upstaged by the practical man-of-action. And I’m skeptical of Edison because, jeez, I’ve read biographies of the man. Even the fawning ones make him out to be yeesh.

But the legend’s Edison had a point. If the volume of a shape is not how much stuff fits inside the shape, what is it? And maybe some object has too complicated a shape to find its volume. Can we think of a way to produce something with the same volume, but that is easier? Sometimes we can. When we do this with straightedge and compass, the way the Ancient Greeks found so classy, we call this “quadrature”. It’s called quadrature from its application in two dimensions. It finds, for a shape, a square with the same area. For a three-dimensional object, we find a cube with the same volume. Cubes are easy to understand.

Straightedge and compass can’t do everything. Indeed, there’s so much they can’t do. Some of it is stuff you’d think it should be able to, like, find a cube with the same volume as a sphere. Integration gives us a mathematical tool for describing how much stuff is inside a shape. It’s even got a beautiful shorthand expression. Suppose that D is the shape. Then its volume V is:

$V = \int\int\int_D dV$

Here “dV” is the “volume form”, a description of how the coordinates we describe a space in relate to the volume. The $\int\int\int$ is jargon, meaning, “integrate over the whole volume”. The subscript “D” modifies that phrase by adding “of D” to it. Writing “D” is shorthand for “these are all the points inside this shape, in whatever coordinate system you use”. If we didn’t do that we’d have to say, on each $\int$ sign, what points are inside the shape, coordinate by coordinate. At this level the equation doesn’t offer much help. It says the volume is the sum of infinitely many, infinitely tiny pieces of volume. True, but that doesn’t give much guidance about whether it’s more or less than two cups of water. We need to get more specific formulas, usually. We need to pick coordinates, for example, and say what coordinates are inside the shape. A lot of the resulting formulas can’t be integrated exactly. Like, an ellipsoid? Maybe you can integrate that. Don’t try without getting hazard pay.

We can approximate this integral. Pick a tiny shape whose volume is easy to know. Fill your shape with duplicates of it. Count the duplicates. Multiply that count by the volume of this tiny shape. Done. This is numerical integration, sometimes called “numerical quadrature”. If we’re being generous, we can say the legendary Edison did this, using water molecules as the tiny shape. And working so that he didn’t need to know the exact count or the volume of individual molecules. Good computational technique.

It’s hard not to feel we’re begging the question, though. We want the volume of something. So we need the volume of something else. Where does that volume come from?

Well, where does an inch come from? Or a centimeter? Whatever unit you use? You pick something to use as reference. Any old thing will do. Which is why you get fascinating stories about choosing what to use. And bitter arguments about which of several alternatives to use. And we express the length of something as some multiple of this reference length.

Volume works the same way. Pick a reference volume, something that can be one unit-of-volume. Other volumes are some multiple of that unit-of-volume. Possibly a fraction of that unit-of-volume.

Usually we use a reference volume that’s based on the reference length. Typically, we imagine a cube that’s one unit of length on each side. The volume of this cube with sides of length 1 unit-of-length is then 1 unit-of-volume. This seems all nice and orderly and it’s surely not because mathematicians have paid off by six-sided-dice manufacturers.

Does it have to be?

That we need some reference volume seems inevitable. We can’t very well say the area of something is ten times nothing-in-particular. Does that reference volume have to be a cube? Or even a rectangle or something else? It seems obvious that we need some reference shape that tiles, that can fill up space by itself … right?

What if we don’t?

I’m going to drop out of three dimensions a moment. Not because it changes the fundamentals, but because it makes something easier. Specifically, it makes it easier if you decide you want to get some construction paper, cut out shapes, and try this on your own. What this will tell us about area is just as true for volume. Area, for a two-dimensional sapce, and volume, for a three-dimensional, describe the same thing. If you’ll let me continue, then, I will.

So draw a figure on a clean sheet of paper. What’s its area? Now imagine you have a whole bunch of shapes with reference areas. A bunch that have an area of 1. That’s by definition. That’s our reference area. A bunch of smaller shapes with an area of one-half. By definition, too. A bunch of smaller shapes still with an area of one-third. Or one-fourth. Whatever. Shapes with areas you know because they’re marked on them.

Here’s one way to find the area. Drop your reference shapes, the ones with area 1, on your figure. How many do you need to completely cover the figure? It’s all right to cover more than the figure. It’s all right to have some of the reference shapes overlap. All you need is to cover the figure completely. … Well, you know how many pieces you needed for that. You can count them up. You can add up the areas of all these pieces needed to cover the figure. So the figure’s area can’t be any bigger than that sum.

Can’t be exact, though, right? Because you might get a different number if you covered the figure differently. If you used smaller pieces. If you arranged them better. This is true. But imagine all the possible reference shapes you had, and all the possible ways to arrange them. There’s some smallest area of those reference shapes that would cover your figure. Is there a more sensible idea for what the area of this figure would be?

And put this into three dimensions. If we start from some reference shapes of volume 1 and maybe 1/2 and 1/3 and whatever other useful fractions there are? Doesn’t this covering make sense as a way to describe the volume? Cubes or rectangles are easy to imagine. Tetrahedrons too. But why not any old thing? Why not, as the Mystery Science Theater 3000 episode had it, turkeys?

This is a nice, flexible, convenient way to define area. So now let’s see where it goes all bizarre. We know this thanks to Giuseppe Peano. He’s among the late-19th/early-20th century mathematicians who shaped modern mathematics. They did this by showing how much of our mathematics broke intuition. Peano was (here) exploring what we now call fractals. And noted a family of shapes that curl back on themselves, over and over. They’re beautiful.

And they fill area. Fill volume, if done in three dimensions. It seems impossible. If we use this covering scheme, and try to find the volume of a straight line, we get zero. Well, we find that any positive number is too big, and from that conclude that it has to be zero. Since a straight line has length, but not volume, this seems fine. But a Peano curve won’t go along with this. A Peano curve winds back on itself so much that there is some minimum volume to cover it.

This unsettles. But this idea of volume (or area) by covering works so well. To throw it away seems to hobble us. So it seems worth the trade. We allow ourselves to imagine a line so long and so curled up that it has a volume. Amazing.

And now I get to relax and unwind and enjoy a long weekend before coming to the letter ‘W’. That’ll be about some topic I figure I can whip out a nice tight 500 words about, and instead, produce some 1541-word monstrosity while I wonder why I’ve had no free time at all since August. Tuesday, give or take, it’ll be available at this link, as are the rest of these glossary posts. Thanks for reading.

## My 2018 Mathematics A To Z: Unit Fractions

My subject for today is another from Iva Sallay, longtime friend of the blog and creator of the Find the Factors recreational mathematics game. I think you’ll likely find something enjoyable at her site, whether it’s the puzzle or the neat bits of trivia as she works through all the counting numbers.

# Unit Fractions.

We don’t notice how unit fractions are around us. Likely there’s some in your pocket. Or there have been recently. Think of what you do when paying for a thing, when it’s not a whole number of dollars. (Pounds, euros, whatever the unit of currency is.) Suppose you have exact change. What do you give for the 38 cents?

Likely it’s something like a 25-cent piece and a 10-cent piece and three one-cent pieces. This is an American and Canadian solution. I know that 20-cent pieces are more common than 25-cent ones worldwide. It doesn’t make much difference; if you want it to be three 10-cent, one five-cent, and three one-cent pieces that’s as good. And granted, outside the United States it’s growing common to drop pennies altogether and round prices off to a five- or ten-cent value. Again, it doesn’t make much difference.

But look at the coins. The 25 cent piece is one-quarter of a dollar. It’s even called that, and stamped that on one side. I sometimes hear a dime called “a tenth of a dollar”, although mostly by carnival barkers in one-reel cartoons of the 1930s. A nickel is one-twentieth of a dollar. A penny is one-hundredth. A 20-cent piece is one-fifth of a dollar. And there are half-dollars out there, although not in the United States, not really anymore.

(Pre-decimalized currencies offered even more unit fractions. Using old British coins, for familiarity-to-me and great names, there were farthings, 1/960th of a pound; halfpennies, 1/480th; pennies, 1/240th; threepence, 1/80th of a pound; groats, 1/60th; sixpence, 1/40th; florins, 1/10th; half-crowns, 1/8th; crowns, 1/4th. And what seem to the modern wallet like impossibly tiny fractions like the half-, third-, and quarter-farthings used where 1/3840th of a pound might be a needed sum of money.)

Unit fractions get named and defined somewhere in elementary school arithmetic. They go on, becoming forgotten sometime after that. They might make a brief reappearance in calculus. There are some rational functions that get easier to integrate if you think of them as the sums of fractions, with constant numerators and polynomial denominators. These aren’t unit fractions. A unit fraction has a 1, the unit, in the numerator. But we see units along the way to integrating $\frac{1}{x^2 - x}$ as an example. And see it in the promise that there are still more amazing integrals to learn how to do.

They get more attention if you take a history of computation class. Or read the subject on your own. Unit fractions stand out in history. We learn the Ancient Egyptians worked with fractions as sums of unit fractions. That is, had they dollars, they would not look at the $\frac{38}{100}$ we do. They would look at $\frac{1}{4}$ plus $\frac{1}{10}$ plus $\frac{1}{100}$ plus $\frac{1}{100}$ plus $\frac{1}{100}$. When we count change we are using, without noticing it, a very old computing scheme.

This isn’t quite true. The Ancient Egyptians seemed to shun repeating a unit like that. To use $\frac{1}{100}$ once is fine; three times is suspicious. They would prefer something like $\frac{1}{3}$ plus $\frac{1}{24}$ plus $\frac{1}{200}$. Or maybe some other combination. I just wrote out the first one I found.

But there are many ways we can make 38 cents using ordinary coins of the realm. There are infinitely many ways to make up any fraction using unit fractions. There’s surely a most “efficient”. Most efficient might be the one which uses the fewest number of terms. Most efficient might be the one that uses the smallest denominators. Choose what you like; no one knows a scheme that always turns up the most efficient, either way. We can always find some representation, though. It may not be “good”, but it will exist, which may be good enough. Leonardo of Pisa, or as he got named in the 19th century, Fibonacci, proved that was true.

We may ask why the Egyptians used unit fractions. They seem inefficient compared to the way we work with fractions. Or, better, decimals. I’m not sure the question can have a coherent answer. Why do we have a fashion for converting fractions to a “proper” form? Why do we use the number of decimal points we do for a given calculation? Sometimes a particular mode of expression is the fashion. It comes to seem natural because everyone uses it. We do it too.

And there is practicality to them. Even efficiency. If you need π, for example, you can write it as 3 plus $\frac{1}{8}$ plus $\frac{1}{61}$ and your answer is off by under one part in a thousand. Combine this with the Egyptian method of multiplication, where you would think of (say) “11 times π” as “1 times π plus 2 times π plus 8 times π”. And with tables they had worked up which tell you what $\frac{2}{8}$ and $\frac{2}{61}$ would be in a normal representation. You can get rather good calculations without having to do more than addition and looking up doublings. Represent π as 3 plus $\frac{1}{8}$ plus $\frac{1}{61}$ plus $\frac{1}{5020}$ and you’re correct to within one part in 130 million. That isn’t bad for having to remember four whole numbers.

(The Ancient Egyptians, like many of us, were not absolutely consistent in only using unit fractions. They had symbols to represent $\frac{2}{3}$ and $\frac{3}{4}$, probably due to these numbers coming up all the time. Human systems vary to make the commonest stuff we do easier.)

Enough practicality or efficiency, if this is that. Is there beauty? Is there wonder? Certainly. Much of it is in number theory. Number theory splits between astounding results and results that would be astounding if we had any idea how to prove them. Many of the astounding results are about unit fractions. Take, for example, the harmonic series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots$. Truncate that series whenever you decide you’ve had enough. Different numbers of terms in this series will add up to different numbers. Eventually, infinitely many numbers. The numbers will grow ever-higher. There’s no number so big that it won’t, eventually, be surpassed by some long-enough truncated harmonic series. And yet, past the number 1, it’ll never touch a whole number again. Infinitely many partial sums. Partial sums differing from one another by one-googol-plex and smaller. And yet of the infinitely many whole numbers this series manages to miss them all, after its starting point. Worse, any set of consecutive terms, not even starting from 1, will never hit a whole number. I can understand a person who thinks mathematics is boring, but how can anyone not find it astonishing?

There are more strange, beautiful things. Consider heptagonal numbers, which Iva Sallay knows well. These are numbers like 1 and 7 and 18 and 34 and 55 and 1288. Take a heptagonal number of, oh, beads or dots or whatever, and you can lay them out to form a regular seven-sided figure. Add together the reciprocals of the heptagonal numbers. What do you get? It’s a weird number. It’s irrational, which you maybe would have guessed as more likely than not. But it’s also transcendental. Most real numbers are transcendental. But it’s often hard to prove any specific number is.

Unit fractions creep back into actual use. For example, in modular arithmetic, they offer a way to turn division back into multiplication. Division, in modular arithmetic, tends to be hard. Indeed, if you need an algorithm to make random-enough numbers, you often will do something with division in modular arithmetic. Suppose you want to divide by a number x, modulo y, and x and y are relatively prime, though. Then unit fractions tell us how to turn this into finding a greatest common denominator problem.

They teach us about our computers, too. Much of serious numerical mathematics involves matrix multiplication. Matrices are, for this purpose, tables of numbers. The Hilbert Matrix has elements that are entirely unit fractions. The Hilbert Matrix is really a family of square matrices. Pick any of the family you like. It can have two rows and two columns, or three rows and three columns, or ten rows and ten columns, or a million rows and a million columns. Your choice. The first row is made of the numbers $1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4},$ and so on. The second row is made of the numbers $\frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5},$ and so on. The third row is made of the numbers $\frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \frac{1}{6},$ and so on. You see how this is going.

Matrices can have inverses. It’s not guaranteed; matrices are like that. But the Hilbert Matrix does. It’s another matrix, of the same size. All the terms in it are integers. Multiply the Hilbert Matrix by its inverse and you get the Identity Matrix. This is a matrix, the same number of rows and columns as you started with. But nearly every element in the identity matrix is zero. The only exceptions are on the diagonal — first row, first column; second row, second column; third row, third column; and so on. There, the identity matrix has a 1. The identity matrix works, for matrix multiplication, much like the real number 1 works for normal multiplication.

Matrix multiplication is tedious. It’s not hard, but it involves a lot of multiplying and adding and it just takes forever. So set a computer to do this! And you get … uh ..

For a small Hilbert Matrix and its inverse, you get an identity matrix. That’s good. For a large Hilbert Matrix and its inverse? You get garbage. Large isn’t maybe very large. A 12 by 12 matrix gives you trouble. A 14 by 14 matrix gives you a mess. Well, on my computer it does. Cute little laptop I got when my former computer suddenly died. On a better computer? One designed for computation? … You could do a little better. Less good than you might imagine.

The trouble is that computers don’t really do mathematics. They do an approximation of it, numerical computing. Most use a scheme called floating point arithmetic. It mostly works well. There’s a bit of error in every calculation. For most calculations, though, the error stays small. At least relatively small. The Hilbert Matrix, built of unit fractions, doesn’t respect this. It and its inverse have a “numerical instability”. Some kinds of calculations make errors explode. They’ll overwhelm the meaningful calculation. It’s a bit of a mess.

Numerical instability is something anyone doing mathematics on the computer must learn. Must grow comfortable with. Must understand. The matrix multiplications, and inverses, that the Hilbert Matrix involves highlights those. A great and urgent example of a subtle danger of computerized mathematics waits for us in these unit fractions. And we’ve known and felt comfortable with them for thousands of years.

There’ll be some mathematical term with a name starting ‘V’ that, barring surprises, should be posted Friday. What’ll it be? I have an idea at least. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Sorites Paradox

Today’s topic is the lone (so far) request by bunnydoe, so I’m under pressure to make it decent. If she or anyone else would like to nominate subjects for the letters U through Z, please drop me a note at this post. I keep fooling myself into thinking I’ll get one done in under 1200 words.

This is a story which makes a capitalist look kind of good. I make no vouches for its truth, or even, at this remove, where I got it. The story as I heard it was about Ray Kroc, who made McDonald’s into a thing people of every land can complain about. The story has him demonstrate skepticism about the use of business consultants. A consultant might find, for example, that each sesame-seed hamburger bun has (say) 43 seeds. And that if they just cut it down to 41 seeds then each franchise would save (say) $50,000 annually. And no customer would notice the difference. Fine; trim the seeds a little. The next round of consultant would point out, cutting from 41 seeds to 38 would save a further$65,000 per store per year. And again no customer would notice the difference. Cut to 36 seeds? No customer would notice. This process would end when each bun had three sesame seeds, and the customers notice.

Part of the paradox’s intractability must be that it’s so nearly induction. Induction is a fantastic tool for mathematical problems. We couldn’t do without it. But consider the argument. If a bun is unsatisfying, one more seed won’t make it satisfying. A bun with one seed is unsatisfying. Therefore all buns have an unsatisfying number of sesame seeds on them. It suggests there must be some point at which “adding one more seed won’t help” stops being true. Fine; where is that point, and why isn’t it one fewer or one more seed?

A certain kind of nerd has a snappy answer for the Sorites Paradox. Test a broad population on a variety of sesame-seed buns. There’ll be some so sparse that nearly everyone will say they’re unsatisfying. There’ll be some so abundant most everyone agrees they’re great. So there’s the buns most everyone says are fine. There’s the buns most everyone says are not. The dividing line is at any point between the sparsest that satisfy most people and the most abundant that don’t. The nerds then declare the problem solved and go off. Let them go. We were lucky to get as much of their time as we did. They’re quite busy solving what “really” happened for Rashomon. The approach of “set a line somewhere” is fine if all want is guidance on where to draw a line. It doesn’t help say why we can anoint some border over any other. At least when we use a river as border between states we can agree going into the water disrupts what we were doing with the land. And even then we have to ask what happens during droughts and floods, and if the river is an estuary, how tides affect matters.

We might see an answer by thinking more seriously about these sesame-seed buns. We force a problem by declaring that every bun is either satisfying or it is not. We can imagine buns with enough seeds that we don’t feel cheated by them, but that we also don’t feel satisfied by. This reflects one of the common assumptions of logic. Mathematicians know it as the Law of the Excluded Middle. A thing is true or it is not true. There is no middle case. This is fine for logic. But for everyday words?

It doesn’t work when considering sesame-seed buns. I can imagine a bun that is not satisfying, but also is not unsatisfying. Surely we can make some logical provision for the concept of “meh”. Now we need not draw some arbitrary line between “satisfying” and “unsatisfying”. We must draw two lines, one of them between “unsatisfying” and “meh”. There is a potential here for regression. Also for the thought of a bun that’s “satisfying-meh-satisfying by unsatisfying”. I shall step away from this concept.

But there are more subtle ways to not exclude the middle. For example, we might decide a statement’s truth exists on a spectrum. We can match how true a statement is to a number. Suppose an obvious falsehood is zero; an unimpeachable truth is one, and normal mortal statements somewhere in the middle. “This bun with a single sesame seed is satisfying” might have a truth of 0.01. This perhaps reflects the tastes of people who say they want sesame seeds but don’t actually care. “This bun with fifteen sesame seeds is satisfying” might have a truth of 0.25, say. “This bun with forty sesame seeds is satisfying” might have a truth of 0.97. (It’s true for everyone except those who remember the flush times of the 43-seed bun.) This seems to capture the idea that nothing is always wholly anything. But we can still step into absurdity. Suppose “this bun with 23 sesame seeds is satisfying” has a truth of 0.50. Then “this bun with 23 sesame seeds is not satisfying” should also have a truth of 0.50. What do we make of the statement “this bun with 23 sesame seeds is simultaneously satisfying and not satisfying”? Do we make something different to “this bun with 23 sesame seeds is simultaneously satisfying and satisfying”?

I see you getting tired in the back there. This may seem like word games. And we all know that human words are imprecise concepts. What has this to do with logic, or mathematics, or anything but the philosophy of language? And the first answer is that we understand logic and mathematics through language. When learning mathematics we get presented with definitions that seem absolute and indisputable. We start to see the human influence in mathematics when we ask why 1 is not a prime number. Later we see things like arguments about whether a ring has a multiplicative identity. And then there are more esoteric debates about the bounds of mathematical concepts.

Perhaps we can think of a concept we can’t describe in words. If we don’t express it to other people, the concept dies with us. We need words. No, putting it in symbols does not help. Mathematical symbols may look like slightly alien scrawl. But they are shorthand for words, and can be read as sentences, and there is this fuzziness in all of them.

And we find mathematical properties that share this problem. Consider: what is the color of the chemical element flerovium? Before you say I just made that up, flerovium was first synthesized in 1998, and officially named in 2012. We’d guess that it’s a silvery-white or maybe grey metallic thing. Humanity has only ever observed about ninety atoms of the stuff. It’s, for atoms this big, amazingly stable. We know an isotope of it that has a half-life of two and a half seconds. But it’s hard to believe we’ll ever have enough of the stuff to look at it and say what color it is.

That’s … all right, though? Maybe? Because we know the quantum mechanics that seem to describe how atoms form. And how they should pack together. And how light should be absorbed, and how light should be emitted, and how light should be scattered by it. At least in principle. The exact answers might be beyond us. But we can imagine having a solution, at least in principle. We can imagine the computer that after great diligent work gives us a picture of what a ten-ton lump of flerovium would look like.

So where does its color come from? Or any of the other properties that these atoms have as a group? No one atom has a color. No one atom has a density, either, or a viscosity. No one atom has a temperature, or a surface tension, or a boiling point. In combination, though, they have.

These are known to statistical mechanics, and through that thermodynamics, as intensive properties. If we have a partition function, which describes all the ways a system can be organized, we can extract information about these properties. They turn up as derivatives with respect to the right parameters of the system.

But the same problem exists. Take a homogeneous gas. It has some temperature. Divide it into two equal portions. Both sides have the same temperature. Divide each half into two equal portions again. All four pieces have the same temperature. Divide again, and again, and a few more times. You eventually get containers with so little gas in them they don’t have a temperature. Where did it go? When did it disappear?

The counterpart to an intensive property is an extensive one. This is stuff like the mass or the volume or the energy of a thing. Cut the gas’s container in two, and each has half the volume. Cut it in half again, and each of the four containers has one-quarter the volume. Keep this up and you stay in uncontroversial territory, because I am not discussing Zeno’s Paradoxes here.

And like Zeno’s Paradoxes, the Sorites Paradox can seem at first trivial. We can distinguish a heap from a non-heap; who cares where the dividing line is? Or whether the division is a gradual change? It seems easy. To show why it is easy is hard. Each potential answer is interesting, and plausible, and when you think hard enough of it, not quite satisfying. Good material to think about.

I hope to find some material think about the letter ‘T’ and have it published Friday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Randomness

Today’s topic is an always rich one. It was suggested by aajohannas, who so far as I know has’t got an active blog or other project. If I’m mistaken please let me know. I’m glad to mention the creative works of people hanging around my blog.

# Randomness.

An old Sydney Harris cartoon I probably won’t be able to find a copy of before this publishes. A couple people gather around an old fanfold-paper printer. On the printout is the sequence “1 … 2 … 3 … 4 … 5 … ” The caption: ‘Bizarre sequence of computer-generated random numbers’.

Randomness feels familiar. It feels knowable. It means surprise, unpredictability. The upending of patterns. The obliteration of structure. I imagine there are sociologists who’d say it’s what defines Modernity. It’s hard to avoid noticing that the first great scientific theories that embrace unpredictability — evolution and thermodynamics — came to public awareness at the same time impressionism came to arts, and the subconscious mind came to psychology. It’s grown since then. Quantum mechanics is built on unpredictable specifics. Chaos theory tells us even if we could predict statistics it would do us no good. Randomness feels familiar, even necessary. Even desirable. A certain type of nerd thinks eagerly of the Singularity, the point past which no social interactions are predictable anymore. We live in randomness.

And yet … it is hard to find randomness. At least to be sure we have found it. We might choose between options we find ambivalent by tossing a coin. This seems random. But anyone who was six years old and trying to cheat a sibling knows ways around that. Drop the coin without spinning it, from a half-inch above the table, and you know the outcome, all the way through to the sibling’s punching you. When we’re older and can be made to be better sports we’re fairer about it. We toss the coin and give it a spin. There’s no way we could predict the outcome. Unless we knew just how strong a toss we gave it, and how fast it spun, and how the mass of the coin was distributed. … Really, if we knew enough, our tossed coin would be as predictably as the coin we dropped as a six-year-old. At least unless we tossed in some chaotic way, where each throw would be deterministic, but we couldn’t usefully make a prediction.

Our instinctive idea of what randomness must be is flawed. That shouldn’t surprise. Our instinctive idea of anything is flawed. But randomness gives us trouble. It’s obvious, for example, that randomly selected things should have no pattern. But then how is that reasonable? If we draw letters from the alphabet at random, we should expect sometimes to get some cute pattern like ‘aaaaa’ or ‘qwertyuiop’ or the works of Shakespeare. Perhaps we mean we shouldn’t get patterns any more often than we would expect. All right; how often is that?

We can make tests. Some of them are obvious. Take something that generates possibly-random results. Look up how probable each of those outcomes is. Then run off a bunch of outcomes. Do we get about as many of each result as we should expect? Probability tells us we should get as close as we like to the expected frequency if we let the random process run long enough. If this doesn’t happen, great! We can conclude we don’t really have something random.

We can do more tests. Some of them are brilliantly clever. Suppose there’s a way to order the results. Since mathematicians usually want numbers, putting them in order is easy to do. If they’re not, there’s usually a way to match results to numbers. You’ll see me slide here into talking about random numbers as though that were the same as random results. But if I can distinguish different outcomes, then I can label them. If I can label them, I can use numbers as labels. If the order of the numbers doesn’t matter — should “red” be a 1 or a 2? Should “green” be a 3 or an 8? — then, fine; any order is good.

There are 120 ways to order five distinct things. So generate lots of sets of, say, five numbers. What order are they in? There’s 120 possibilities. Do each of the possibilities turn up as often as expected? If they don’t, great! We can conclude we don’t really have something random.

I can go on. There are many tests which will let us say something isn’t a truly random sequence. They’ll allow for something like Sydney Harris’s peculiar sequence of random numbers. Mostly by supposing that if we let it run long enough the sequence would stop. But these all rule out random number generators. Do we have any that rule them in? That say yes, this generates randomness?

I don’t know of any. I suspect there can’t be any, on the grounds that a test of a thousand or a thousand million or a thousand million quadrillion numbers can’t assure us the generator won’t break down next time we use it. If we knew the algorithm by which the random numbers were generated — oh, but there we’re foiled before we can start. An algorithm is the instructions of how to do a thing. How can an instruction tell us how to do a thing that can’t be predicted?

Algorithms seem, briefly, to offer a way to tell whether we do have a good random sequence, though. We can describe patterns. A strong pattern is easy to describe, the way a familiar story is easy to reference. A weak pattern, a random one, is hard to describe. It’s like a dream, in which you can just list events. So we can call random something which can’t be described any more efficiently than just giving a list of all the results. But how do we know that can’t be done? 7, 7, 2, 4, 5, 3, 8, 5, 0, 9 looks like a pretty good set of digits, whole numbers from 0 through 9. I’ll bet not more than one in ten of you guesses correctly what the next digit in the sequence is. Unless you’ve noticed that these are the digits in the square root of π, so that the next couple digits have to be 0, 5, 5, and 1.

We know, on theoretical grounds, that we have randomness all around us. Quantum mechanics depends on it. If we need truly random numbers we can set a sensor. It will turn the arrival of cosmic rays, or the decay of radioactive atoms, or the sighing of a material flexing in the heat into numbers. We trust we gather these and process them in a way that doesn’t spoil their unpredictability. To what end?

That is, why do we care about randomness? Especially why should mathematicians care? The image of mathematics is that it is a series of logical deductions. That is, things known to be true because they follow from premises known to be true. Where can randomness fit?

One answer, one close to my heart, is called Monte Carlo methods. These are techniques that find approximate answers to questions. They do well when exact answers are too hard for us to find. They use random numbers to approximate answers and, often, to make approximate answers better. This demands computations. The field didn’t really exist before computers, although there are some neat forebears. I mean the Buffon needle problem, which lets you calculate the digits of π about as slowly as you could hope to do.

Another, linked to Monte Carlo methods, is stochastic geometry. “Stochastic” is the word mathematicians attach to things when they feel they’ve said “random” too often, or in an undignified manner. Stochastic geometery is what we can know about shapes when there’s randomness about how the shapes are formed. This sounds like it’d be too weak a subject to study. That it’s built on relatively weak assumptions means it describes things in many fields, though. It can be seen in understanding how forests grow. How to find structures inside images. How to place cell phone towers. Why materials should act like they do instead of some other way. Why galaxies cluster.

There’s also a stochastic calculus, a bit of calculus with randomness added. This is useful for understanding systems where some persistent unpredictable behavior is there. It comes, if I understand the histories of this right, from studying the ways molecules will move around in weird zig-zagging twists. They do this even when there is no overall flow, just a fluid at a fixed temperature. It too has surprising applications. Without the assumption that some prices of things are regularly jostled by arbitrary and unpredictable forces, and the treatment of that by stochastic calculus methods, we wouldn’t have nearly the ability to hedge investments against weird chaotic events. This would be a bad thing, I am told by people with more sophisticated investments than I have. I personally own like ten shares of the Tootsie Roll corporation and am working my way to a \$2.00 rebate check from Boyer.

Given that we need randomness, but don’t know how to get it — or at least don’t know how to be sure we have it — what is there to do? We accept our failings and make do with “quasirandom numbers”. We find some process that generates numbers which look about like random numbers should. These have failings. Most important is that if we could predict them. They’re random like “the date Easter will fall on” is random. The date Easter will fall is not at all random; it’s defined by a specific and humanly knowable formula. But if the only information you have is that this year, Easter fell on the 1st of April (Gregorian computus), you don’t have much guidance to whether this coming year it’ll be on the 7th, 14th, or 21st of April the next year. Most notably, quasirandom number generators will tend to repeat after enough numbers are drawn. If we know we won’t need enough numbers to see a repetition, though? Another stereotype of the mathematician is that of a person who demands exactness. It is often more true to say she is looking for an answer good enough. We are usually all right with a merely good enough quasirandomness.

Boyer candies — Mallo Cups, most famously, although I more like the peanut butter Smoothies — come with a cardboard card backing. Each card has two play money “coins”, of values from 5 cents to 50 cents. These can be gathered up for a rebate check or for various prizes. Whether your coin is 5 cents, 10, 25, or 50 cents … well, there’s no way to tell, before you open the package. It’s, so far as you can tell, randomness.

My next A To Z post should be available at this link. It’s coming Tuesday and should be the letter ‘S’.

## I’m Looking For The Last Topics For My Fall 2018 Mathematics A-To-Z

And now it’s my last request for my Fall 2018 mathematics A-To-Z. There’s only a half-dozen letters left, but nto to fear: they include letters with no end of potential topics, like, ‘X’.

If you have any mathematical topics with a name that starts U through Z that you’d like to see me write about, please say so. I’m happy to write what I fully mean to be a tight 500 words about the subject and then find I’ve put up my second 1800-word essay of the week. I usually go by a first-come, first-serve basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me. This might be the only way to get a good ‘X’ letter.

Also when you do make a request, please feel free to mention your blog, Twitter feed, YouTube channel, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.

So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences. I probably don’t want to revisit them. But I will think over, if I get a request, whether I might have new opinions.

#### Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

#### Available Letters for the Fall 2018 A To Z:

• U
• V
• W
• X
• Y
• Z

All of my Fall 2018 Mathematics A-To-Z should appear at this link. And it’ll have some extra stuff like these topic-request pages and such.

## My 2018 Mathematics A To Z: Quadratic Equation

I have another topic today suggested by Dina Yagodich. I’ve mentioned before her YouTube channel. It’s got a variety of educational videos you might enjoy. Give it a try.

I’m planning this week to open up the end of the alphabet — and the year — to topic suggestions. So there’s no need to panic about that.

The Quadratic Equation is the tool humanity used to discover mathematics. Yes, I exaggerate a bit. But it touches a stunning array of important things. It is most noteworthy because of the time I impressed by several-levels-removed boss at the summer job I had while an undergraduate. He had been stumped by a data-optimization problem for weeks. I noticed it was just a quadratic equation, that’s easy to solve. He was, must be said, overly impressed. I would go on to grad school where I was once stymied for a week because I couldn’t find the derivative of $e^t$ correctly. It is, correctly, $e^t$. So I have sympathy for my remote supervisor.

We normally write the Quadratic Equation in one of two forms:

$ax^2 + bx + c = 0$

$a_0 + a_1 x + a_2 x^2 = 0$

The first form is great when you are first learning about polynomials, and parabolas. And you’re content to something raised to the second power. The second form is great when you are learning advanced stuff about polynomials. Then you start wanting to know things true about polynomials that go up to arbitrarily high powers. And we always want to know about polynomials. The subscripts under $a_j$ mean we can’t run out of letters to be coefficients. Setting the subscripts and powers to keep increasing lets us write this out neatly.

We don’t have to use x. We never do. But we mostly use x. Maybe t, if we’re writing an equation that describes something changing with time. Maybe z, if we want to emphasize how complex-valued numbers might enter into things. The name of the independent variable doesn’t matter. But stick to the obvious choices. If you’re going to make the variable ‘f’ you better have a good reason.

The equation is very old. We have ancient Babylonian clay tablets which describe it. Well, not the quadratic equation as we write it. The oldest problems put it as finding numbers that simultaneously solve two equations, one of them a sum and one of them a product. Changing one equation into two is a venerable mathematical process. It often makes problems simpler. We do this all the time in Ordinary Differential Equations. I doubt there is a direct connection between Ordinary Differential Equations and this alternate form of the Quadratic Equation. But it is a reminder that the ways we express mathematical problems are our conventions. We can rewrite problems to make our lives easier, to make answers clearer. We should look for chances to do that.

It weaves into everything. Some things seem obvious. Suppose the coefficients — a, b, and c; or $a_0, a_1, a_2$ if you’d rather — are all real-valued numbers. Then the quadratic equation has to hav two solutions. There can be two real-valued solutions. There can be one real-valued solution, counted twice for reasons that make sense but are too much a digression for me to justify here. There can be two complex-valued solutions. We can infer the usefulness of imaginary and complex-valued numbers by finding solutions to the quadratic equation.

(The quadratic equation is a great introduction complex-valued numbers. It’s not how mathematicians came to them. Complex-valued numbers looked like obvious nonsense. They corresponded to there being no real-valued answers. A formula that gives obvious nonsense when there’s no answer is great. It’s formulas that give subtle nonsense when there’s no answer that are dangerous. But similar-in-design formulas for cubic and quartic polynomials could use complex-valued numbers in intermediate steps. Plunging ahead as though these complex-valued numbers were proper would get to the real-valued answers. This made the argument that complex-valued numbers should be taken seriously.)

We learn useful things right away from trying to solve it. We teach students to “complete the square” as a first approach to solving it. Completing the square is not that useful by itself: a few pages later in the textbook we get to the quadratic formula and that has every quadratic equation solved. Just plug numbers into the formula. But completing the square teaches something more useful than just how to solve an equation. It’s a method in which we solve a problem by saying, you know, this would be easy to solve if only it were different. And then thinking how to change it into a different-looking problem with the same solutions. This is brilliant work. A mathematician is imagined to have all sorts of brilliant ideas on how to solve problems. Closer to to the truth is that she’s learned all sorts of brilliant ways to make a problem more like one she already knows how to solve. (This is the nugget of truth which makes one genre of mathematical jokes. These jokes have the punch line, “the mathematician declares, this is a problem already solved’ and goes back to sleep.”)

Stare at the solutions of the quadratic equation. You will find patterns. Suppose the coefficients are all real numbers. Then there are some numbers that can be solutions: 0, 1, square root of 15, -3.5, these can all turn up. There are some numbers that can’t be. π. e. The tangent of 2. It’s not just a division between rational and irrational numbers. There are different kinds of irrational numbers. This — alongside looking at other polynomials — leads us to transcendental numbers.

Keep staring at the two solutions of the quadratic equation. You’ll notice the sum of the solutions is $-\frac{b}{a}$. You’ll notice the product of the two solutions is $\frac{c}{a}$. You’ll glance back at those ancient Babylonian tablets. This seems interesting, but little more than that. It’s a lead, though. Similar formulas exist for the sum of the solutions for a cubic, for a quartic, for other polynomials. Also for the sum of products of pairs of these solutions. Or the sum of products of triplets of these solutions. Or the product of all these solutions. These are known as Vieta’s Formulas, after the 16th-century mathematician François Viète. (This by way of his Latinized, academic’sona, name, Franciscus Vieta.) This gives us a way to rewrite the original polynomial as a set of polynomials in several variables. What’s interesting is the set of polynomials have symmetries. They all look like, oh, “xy + yz + zx”. No one variable gets used in a way distinguishable from the others.

This leads us to group theory. The coefficients start out in a ring. The quotients from these Vieta’s Formulas give us an “extension” of the ring. An extension is roughly what the common use of the word suggests. It takes the ring and builds from it a bigger thing that satisfies some nice interesting rules. And it leads us to surprises. The ancient Greeks had several challenges to be done with only straightedge and compass. One was to make a cube double the volume of a given cube. It’s impossible to do, with these tools. (Even ignoring the question of what we would draw on.) Another was to trisect any arbitrary angle; it turns out, there are angles it’s just impossible. The group theory derived, in part, from this tells us why. One more impossibility: drawing a square that has exactly the same area as a given circle.

But there are possible things still. Step back from the quadratic equation, that $ax^2 + bx + c = 0$ bit. Make a function, instead, something that matches numbers (real, complex, what have you) to numbers (the same). Its rule: any x in the domain matches to the number $f(x) = ax^2 + bx + c$ in the range. We can make a picture that represents this. Set Cartesian coordinates — the x and y coordinates that people think of as the default — on a surface. Then highlight all the points with coordinates (x, y) which make true the equation $y = f(x)$. This traces out a particular shape, the parabola.

Draw a line that crosses this parabola twice. There’s now one fully-enclosed piece of the surface. How much area is enclosed there? It’s possible to find a triangle with area three-quarters that of the enclosed part. It’s easy to use straightedge and compass to draw a square the same area as a given triangle. Showing the enclosed area is four-thirds the triangle’s area? That can … kind of … be done by straightedge and compass. It takes infinitely many steps to do this. But if you’re willing to allow a process to go on forever? And you show that the process would reach some fixed, knowable answer? This could be done by the ancient Greeks; indeed, it was. Aristotle used this as an example of the method of exhaustion. It’s one of the ideas that reaches toward integral calculus.

This has been a lot of exact, “analytic” results. There are neat numerical results too. Vieta’s formulas, for example, give us good ways to find approximate solutions of the quadratic equation. They work well if one solution is much bigger than the other. Numerical methods for finding solutions tend to work better if you can start from a decent estimate of the answer. And you can learn of numerical stability, and the need for it, studying these.

Numerical calculations have a problem. We have a set number of decimal places with which to work. What happens if we need a calculation that takes more decimal places than we’re given to do perfectly? Here’s a toy version: two-thirds is the number 0.6666. Or 0.6667. Already we’re in trouble. What is three times two-thirds? We’re going to get either 1.9998 or 2.0001 and either way something’s wrong. The wrongness looks small. But any formula you want to use has some numbers that will turn these small errors into big ones. So numerical stability is, in fairness, not something unique to the quadratic equation. It is something you learn if you study the numerics of the equation deeply enough.

I’m also delighted to learn, through Wikipedia, that there’s a prosthaphaeretic method for solving the quadratic equation. Prosthaphaeretic methods use trigonometric functions and identities to rewrite problems. You might call it madness to rely on arctangents and half-angle formulas and such instead of, oh, doing a division or taking a square root. This is because you have calculators. But if you don’t? If you have to do all that work by hand? That’s terrible. But if someone has already prepared a table listing the sines and cosines and tangents of a great variety of angles? They did a great many calculations already. You just need to pick out the one that tells you what you hope to know. I’ll spare you the steps of solving the quadratic equation using trig tables. Wikipedia describes it fine enough.

So you see how much mathematics this connects to. It’s a bit of question-begging to call it that important. As I said, we’ve known the quadratic equation for a long time. We’ve thought about it for a long while. It would be surprising if we didn’t find many and deep links to other things. Even if it didn’t have links, we would try to understand new mathematical tools in terms of how they affect familiar old problems like this. But these are some of the things which we’ve found, and which run through much of what we understand mathematics to be.

The letter ‘R’ for this Fall 2018 Mathematics A-To-Z post should be published Friday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Pigeonhole Principle

Today’s topic is another that Dina Yagodich offered me. She keeps up a YouTube channel, with a variety of educational videos you might enjoy. And I apologize to Roy Kassinger, but I might come back around to “parasitic numbers” if I feel like doing some supplements or side terms.

# Pigeonhole Principle.

Mathematics has a reputation for obscurity. It’s full of jargon. All these weird, technical terms. Properties that mathematicians take to be obvious but that normal people find baffling. “The only word I understood was the’,” is the feedback mathematicians get when they show their family their thesis. I’m happy to share one that is not. This is one of those principles that anyone can understand. It’s so accessible that people might ask how it’s even mathematics.

The Pigeonhole Principle is usually put something like this. If you have more pigeons than there are different holes to nest them in, then at least one pigeonhole has to hold more than one pigeon. This is if the speaker wishes to keep pigeons in the discussion and is assuming there’s a positive number of both pigeons and holes. Tying a mathematical principle to something specific seems odd. We don’t talk about addition as apples put together or divided among friends. Not after elementary school anyway. Not once we’ve trained our natural number sense to work with abstractions.

If we want to make it abstract we can. Put it as “if you have more objects to put in boxes then you have boxes, then at least one box must hold more than one object”. In this form it is known as the Dirichlet Box Principle. Dirichlet here is Johan Peter Gustav Lejeune Dirichlet. He’s one of the seemingly infinite number of great 19th-Century French-German mathematicians. His family name was “Lejeune Dirichlet”, so his surname is an example of his own box principle. Everyone speaks of him as just Dirichlet, though. And they speak of him a lot, for stuff in mathematical physics, in thermodynamics, in Fourier transforms, in number theory (he proved two specific cases of Fermat’s Last Theorem), and in probability.

Still, at least in my experience, it’s “pigeonhole principle”. I don’t know why pigeons. It would be as good a metaphor to speak of horses put in stalls, or letters put in mailboxes, or pairs of socks put in hotel drawers. Perhaps it’s a reflection of the long history of breeding pigeons. That they’re familiar, likable animals, despite invective. That a bird in a cubby-hole seems like a cozy, pleasant image.

The pigeonhole principle is one of those neat little utility theorems. I think of it as something handy for existence proofs. These are proofs where you show there must be a thing. They don’t typically tell you what the thing is, or even help you to find it. They promise there is something to find.

Some of its uses seem too boring to bother proving. Pick five cards from a standard deck of cards; at least two will be the same suit. There are at least two non-bald people in Philadelphia who have the same number of hairs on their heads. Some of these uses seem interesting enough to prove, but worth nothing more than a shrug and a huh. Any 27-word sequence in the Constitution of the United States includes at least two words that begin with the same letter. Also at least two words that end with the same letter. If you pick any five integers from 1 to 8 (inclusive), then at least two of them will sum to nine.

Some uses start feeling unsettling. Draw five dots on the surface of an orange. It’s always possible to cut the orange in half in such a way that four points are on the same half. (This supposes that a point on the cut counts as being on both halves.)

Pick a set of 100 different whole numbers. It is always possible to select fifteen of these numbers, so that the difference between any pair of these select fifteen is some whole multiple of 7.

Select six people. There is always a triad of three people who all know one another, or who are all strangers to one another. (This supposes that “knowing one another” is symmetric. Real world relationships are messier than this. I have met Roger Penrose. There is not the slightest chance he is aware of this. Do we count as knowing one another or not?)

Some seem to transcend what we could possibly know. Drop ten points anywhere along a circle of diameter 5. Then we can conclude there are at least two points a distance of less than 2 from one another.

Drop ten points into an equilateral triangle whose sides are all length 1. Then there must be at least two points that are no more than distance $\frac{1}{3}$ apart.

Take a line of length L. Drop on it some number of points n + 1. There is some shortest length between consecutive points. What is the largest possible shortest-length-between-points? It is the distance $\frac{L}{n}$.

As I say, this won’t help you find the examples. You need to check the points in your triangle to see which ones are close to one another. You need to try out possible sets of your 100 numbers to find the ones that are all multiples of seven apart. But you have the assurance that the search will end in success, which is no small thing. And many of the conclusions you can draw are delights: results unexpected and surprising and wonderful. It’s great mathematics.

A note on sources. I drew pigeonhole-principle uses from several places. John Allen Paulos’s Innumeracy. Paulos is another I know without their knowing me. But also 16 Fun Applications of the Pigeonhole Principle, by Presh Talwalkar. Brilliant.org’s Pigeonhole Principle, by Lawrence Chiou, Parth Lohomi, Andrew Ellinor, et al. The Art of Problem Solving’s Pigeonhole Principle, author not made clear on the page. If you’re stumped by how to prove one or more of these claims, and don’t feel like talking them out here, try these pages.

My next Fall 2018 Mathematics A-To-Z post should be Tuesday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Oriented Graph

I am surprised to have had no suggestions for an ‘O’ letter. I’m glad to take a free choice, certainly. It let me get at one of those fields I didn’t specialize in, but could easily have. And let me mention that while I’m still taking suggestions for the letters P through T, each other letter has gotten at least one nomination. I can be swayed by a neat term, though, so if you’ve thought of something hard to resist, try me. And later this month I’ll open up the letters U through Z. Might want to start thinking right away about what X, Y, and Z could be.

# Oriented Graph.

This is another term from graph theory, one of the great mathematical subjects for doodlers. A graph, here, is made of two sets of things. One is a bunch of fixed points, called ‘vertices’. The other is a bunch of curves, called ‘edges’. Every edge starts at one vertex and ends at one vertex. We don’t require that every vertex have an edge grow from it.

Already you can see why this is a fun subject. It models some stuff really well. Like, anything where you have a bunch of sources of stuff, that come together and spread out again? Chances are there’s a graph that describes this. There’s a compelling all-purpose interpretation. Have vertices represent the spots where something accumulates, or rests, or changes, or whatever. Have edges represent the paths along which something can move. This covers so much.

The next step is a “directed graph”. This comes from making the edges different. If we don’t say otherwise we suppose that stuff can move along an edge in either direction. But suppose otherwise. Suppose there are some edges that can be used in only one direction. This makes a “directed edge”. It’s easy to see in graph theory networks of stuff like city streets. Once you ponder that, one-way streets follow close behind. If every edge in a graph is directed, then you have a directed graph. Moving from a regular old undirected graph to a directed graph changes everything you’d learned about graph theory. Mostly it makes things harder. But you get some good things in trade. We become able to model sources, for example. This is where whatever might move comes from. Also sinks, which is where whatever might move disappears from our consideration.

You might fear that by switching to a directed graph there’s no way to have a two-way connection between a pair of vertices. Or that if there is you have to go through some third vertex. I understand your fear, and wish to reassure you. We can get a two-way connection even in a directed graph: just have the same two vertices be connected by two edges. One goes one way, one goes the other. I hope you feel some comfort.

What if we don’t have that, though? What if the directed graph doesn’t have any vertices with a pair of opposite-directed edges? And that, then, is an oriented graph. We get the orientation from looking at pairs of vertices. Each pair either has no edge connecting them, or has a single directed edge between them.

There’s a lot of potential oriented graphs. If you have three vertices, for example, there’s seven oriented graphs to make of that. You’re allowed to have a vertex not connected to any others. You’re also allowed to have the vertices grouped into a couple of subsets, and connect only to other vertices in their own subset. This is part of why four vertices can give you 42 different oriented graphs. Five vertices can give you 582 different oriented graphs. You can insist on a connected oriented graph.

A connected graph is what you guess. It’s a graph where there’s no vertices off on their own, unconnected to anything. There’s no subsets of vertices connected only to each other. This doesn’t mean you can always get from any one vertex to any other vertex. The directions might not allow you to that. But if you’re willing to break the laws, and ignore the directions of these edges, you could then get from any vertex to any other vertex. Limiting yourself to connected graphs reduces the number of oriented graphs you can get. But not by as much as you might guess, at least not to start. There’s only one connected oriented graph for two vertices, instead of two. Three vertices have five connected oriented graphs, rather than seven. Four vertices have 34, rather than 42. Five vertices, 535 rather than 582. The total number of lost graphs grows, of course. The percentage of lost graphs dwindles, though.

There’s something more. What if there are no unconnected vertices? That is, every pair of vertices has an edge? If every pair of vertices in a graph has a direct connection we call that a “complete” graph. This is true whether the graph is directed or not. If you do have a complete oriented graph — every pair of vertices has a direct connection, and only the one direction — then that’s a “tournament”. If that seems like a whimsical name, consider one interpretation of it. Imagine a sports tournament in which every team played every other team once. And that there’s no ties. Each vertex represents one team. Each edge is the match played by the two teams. The direction is, let’s say, from the losing team to the winning team. (It’s as good if the direction is from the winning team to the losing team.) Then you have a complete, oriented, directed graph. And it represents your tournament.

And that delights me. A mathematician like me might talk a good game about building models. How one can represent things with mathematical constructs. Here, it’s done. You can make little dots, for vertices, and curved lines with arrows, for edges. And draw a picture that shows how a round-robin tournament works. It can be that direct.

My next Fall 2018 Mathematics A-To-Z post should be Friday. It’ll be available at this link, as are the rest of these glossary posts. And I’ve got requests for the next letter. I just have to live up to at least one of them.

## My 2018 Mathematics A To Z: Nearest Neighbor Model

I had a free choice of topics for today! Nobody had a suggestion for the letter ‘N’, so, I’ll take one of my own. If you did put in a suggestion, I apologize; I somehow missed the comment in which you did. I’ll try to do better in future.

# Nearest Neighbor Model.

Why are restaurants noisy?

It’s one of those things I wondered while at a noisy restaurant. I have heard it is because restauranteurs believe patrons buy more, and more expensive stuff, in a noisy place. I don’t know that I have heard this correctly, nor that what I heard was correct. I’ll leave it to people who work that end of restaurants to say. But I wondered idly whether mathematics could answer why.

It’s easy to form a rough model. Suppose I want my brilliant words to be heard by the delightful people at my table. Then I have to be louder, to them, than the background noise is. Fine. I don’t like talking loudly. My normal voice is soft enough even I have a hard time making it out. And I’ll drop the ends of sentences when I feel like I’ve said all the interesting parts of them. But I can overcome my instinct if I must.

The trouble comes from other people thinking of themselves the way I think of myself. They want to be heard over how loud I have been. And there’s no convincing them they’re wrong. If there’s bunches of tables near one another, we’re going to have trouble. We’ll each by talking loud enough to drown one another out, until the whole place is a racket. If we’re close enough together, that is. If the tables around mine are empty, chances are my normal voice is enough for the cause. If they’re not, we might have trouble.

So this inspires a model. The restaurant is a space. The tables are set positions, points inside it. Each table is making some volume of noise. Each table is trying to be louder than the background noise. At least until the people at the table reach the limits of their screaming. Or decide they can’t talk, they’ll just eat and go somewhere pleasant.

Making calculations on this demands some more work. Some is obvious: how do you represent “quiet” and “loud”? Some is harder: how far do voices carry? Grant that a loud table is still loud if you’re near it. How far away before it doesn’t sound loud? How far away before you can’t hear it anyway? Imagine a dining room that’s 100 miles long. There’s no possible party at one end that could ever be heard at the other. Never mind that a 100-mile-long restaurant would be absurd. It shows that the limits of people’s voices are a thing we have to consider.

There are many ways to model this distance effect. A realistic one would fall off with distance, sure. But it would also allow for echoes and absorption by the walls, and by other patrons, and maybe by restaurant decor. This would take forever to get answers from, but if done right it would get very good answers. A simpler model would give answers less fitted to your actual restaurant. But the answers may be close enough, and let you understand the system. And may be simple enough that you can get answers quickly. Maybe even by hand.

And so I come to the “nearest neighbor model”. The common English meaning of the words suggest what it’s about. We get it from models, like my restaurant noise problem. It’s made of a bunch of points that have some value. For my problem, tables and their noise level. And that value affects stuff in some region around these points.

In the “nearest neighbor model”, each point directly affects only its nearest neighbors. Saying which is the nearest neighbor is easy if the points are arranged in some regular grid. If they’re evenly spaced points on a line, say. Or a square grid. Or a triangular grid. If the points are in some other pattern, you need to think about what the nearest neighbors are. This is why people working in neighbor-nearness problems get paid the big money.

Suppose I use a nearest neighbor model for my restaurant problem. In this, I pretend the only background noise at my table is that of the people the next table over, in each direction. Two tables over? Nope. I don’t hear them at my table. I do get an indirect effect. Two tables over affects the table that’s between mine and theirs. But vice-versa, too. The table that’s 100 miles away can’t affect me directly, but it can affect a table in-between it and me. And that in-between table can affect the next one closer to me, and so on. The effect is attenuated, yes. Shouldn’t it be, if we’re looking at something farther away?

This sort of model is easy to work with numerically. I’m inclined toward problems that work numerically. Analytically … well, it can be easy. It can be hard. There’s a one-dimensional version of this problem, a bunch of evenly-spaced sites on an infinitely long line. If each site is limited to one of exactly two values, the problem becomes easy enough that freshman physics majors can solve it exactly. They don’t, not the first time out. This is because it requires recognizing a trigonometry trick that they don’t realize would be relevant. But once they know the trick, they agree it’s easy, when they go back two years later and look at it again. It just takes familiarity.

This comes up in thermodynamics, because it makes a nice model for how ferromagnetism can work. More realistic problems, like, two-dimensional grids? … That’s harder to solve exactly. Can be done, though not by undergraduates. Three-dimensional can’t, last time I looked. Weirdly, four-dimensional can. You expect problems to only get harder with more dimensions of space, and then you get a surprise like that.

The nearest-neighbor-model is a first choice. It’s hardly the only one. If I told you there were a next-nearest-neighbor model, what would you suppose it was? Yeah, you’d be right. As long as you supposed it was “things are affected by the nearest and the next-nearest neighbors”. Mathematicians have heard of loopholes too, you know.

As for my restaurant model? … I never actually modelled it. I did think about the model. I concluded my model wasn’t different enough from ferromagnetism models to need me to study it more. I might be mistaken. There may be interesting weird effects caused by the facts of restaurants. That restaurants are pretty small things. That they can have echo-y walls and ceilings. That they can have sound-absorbing things like partial walls or plants. Perhaps I gave up too easily when I thought I knew the answer. Some of my idle thoughts end up too idle.

I should have my next Fall 2018 Mathematics A-To-Z post on Tuesday. It’ll be available at this link, as are the rest of these glossary posts.

## My 2018 Mathematics A To Z: Manifold

Two commenters suggested the topic for today’s A to Z post. I suspect I’d have been interested in it if only one had. (Although Dina Yagoditch’s suggestion of the Menger Sponge is hard to resist.) But a double domination? The topic got suggested by Mr Wu, author of MathTuition88, and by John Golden, author of Math Hombre. My thanks to all for interesting things to think about.

# Manifold.

So you know how in the first car you ever owned the alternator was always going bad? If you’re lucky, you reach a point where you start owning cars good enough that the alternator is not the thing always going bad. Once you’re there, congratulations. Now the thing that’s always going bad in your car will be the manifold. That one’s for my dad.

Manifolds are a way to do normal geometry on weird shapes. What’s normal geometry? It’s … you know, the way shapes work on your table, or in a room. The Euclidean geometry that we’re so used to that it’s hard to imagine it not working. Why worry about weird shapes? They’re interesting, for one. And they don’t have to be that weird to count as weird. A sphere, like the surface of the Earth, can be weird. And these weird shapes can be useful. Mathematical physics, for example, can represent the evolution of some complicated thing as a path drawn on a weird shape. Bringing what we know about geometry from years of study, and moving around rooms, to a problem that abstract makes our lives easier.

We use language that sounds like that of map-makers when discussing manifolds. We have maps. We gather together charts. The collection of charts describing a surface can be an atlas. All these words have common meanings. Mercifully, these common meanings don’t lead us too far from the mathematical meanings. We can even use the problem of mapping the surface of the Earth to understand manifolds.

If you love maps, the geography kind, you learn quickly that there’s no making a perfect two-dimensional map of the Earth’s surface. Some of these imperfections are obvious. You can distort shapes trying to make a flat map of the globe. You can distort sizes. But you can’t represent every point on the globe with a point on the paper. Not without doing something that really breaks continuity. Like, say, turning the North Pole into the whole line at the top of the map. Like in the Equirectangular projection. Or skipping some of the points, like in the Mercator projection. Or adding some cuts into a surface that doesn’t have them, like in the Goode homolosine projection. You may recognize this as the one used in classrooms back when the world had first begun.

But what if we don’t need the whole globedone in a single map? Turns out we can do that easy. We can make charts that cover a part of the surface. No one chart has to cover the whole of the Earth’s surface. It only has to cover some part of it. It covers the globe with a piece that looks like a common ordinary Euclidean space, where ordinary geometry holds. It’s the collection of charts that covers the whole surface. This collection of charts is an atlas. You have a manifold if it’s possible to make a coherent atlas. For this every point on the manifold has to be on at least one chart. It’s okay if a point is on several charts. It’s okay if some point is on all the charts. Like, suppose your original surface is a circle. You can represent this with an atlas of two charts. Each chart maps the circle, except for one point, onto a line segment. The two charts don’t both skip the same point. All but two points on this circle are on all the maps of this chart. That’s cool. What’s not okay is if some point can’t be coherently put onto some chart.

This sad fate can happen. Suppose instead of a circle you want to chart a figure-eight loop. That won’t work. The point where the figure crosses itself doesn’t look, locally, like a Euclidean space. It looks like an ‘x’. There’s no getting around that. There’s no atlas that can cover the whole of that surface. So that surface isn’t a manifold.

But many things are manifolds nevertheless. Toruses, the doughnut shapes, are. Möbius strips and Klein bottles are. Ellipsoids and hyperbolic surfaces are, or at least can be. Mathematical physics finds surfaces that describe all the ways the planets could move and still conserve the energy and momentum and angular momentum of the solar system. That cheesecloth surface stretched through 54 dimensions, is a manifold. There are many possible atlases, with many more charts. But each of those means we can, at least locally, for particular problems, understand them the same way we understand cutouts of triangles and pentagons and circles on construction paper.

So to get back to cars: no one has ever said “my car runs okay, but I regret how I replaced the brake covers the moment I suspected they were wearing out”. Every car problem is easier when it’s done as soon as your budget and schedule allow.

This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. What will I choose for ‘N’, later this week? I really should have decided that by now.

## My 2018 Mathematics A To Z: Limit

I got an irresistible topic for today’s essay. It’s courtesy Peter Mander, author of Carnot Cycle, “the classical blog about thermodynamics”. It’s bimonthly and it’s one worth waiting for. Some of the essays are historical; some are statistical-mechanics; many are mixtures of them. You could make a fair argument that thermodynamics is the most important field of physics. It’s certainly one that hasn’t gotten the popularization treatment it deserves, for its importance. Mander is doing something to correct that.

# Limit.

It is hard to think of limits without thinking of motion. The language even professional mathematicians use suggests it. We speak of the limit of a function “as x goes to a”, or “as x goes to infinity”. Maybe “as x goes to zero”. But a function is a fixed thing, a relationship between stuff in a domain and stuff in a range. It can’t change any more than January, AD 1988 can change. And ‘x’ here is a dummy variable, part of the scaffolding to let us find what we want to know. I suppose ‘x’ can change, but if we ever see it, something’s gone very wrong. But we want to use it to learn something about a function for a point like ‘a’ or ‘infinity’ or ‘zero’.

The language of motion helps us learn, to a point. We can do little experiments: if $f(x) = \frac{sin(x)}{x}$, then, what should we expect it to be for x near zero? It’s irresistible to try out the calculator. Let x be 0.1. 0.01. 0.001. 0.0001. The numbers say this f(x) gets closer and closer to 1. That’s good, right? We know we can’t just put in an x of zero, because there’s some trouble that makes. But we can imagine creeping up on the zero we really wanted. We might spot some obvious prospects for mischief: what if x is negative? We should try -0.1, -0.01, -0.001 and so on. And maybe we won’t get exactly the right answer. But if all we care about is the first (say) three digits and we try out a bunch of x’s and the corresponding f(x)’s agree to those three digits, that’s good enough, right?

This is good for giving an idea of what to expect a limit to look like. It should be, well, what it really really really looks like a function should be. It takes some thinking to see where it might go wrong. It might go to different numbers based on which side you approach from. But that seems like something you can rationalize. Indeed, we do; we can speak of functions having different limits based on what direction you approach from. Sometimes that’s the best one can say about them.

But it can get worse. It’s possible to make functions that do crazy weird things. Some of these look like you’re just trying to be difficult. Like, set f(x) equal to 1 if x is rational and 0 if x is irrational. If you don’t expect that to be weird you’re not paying attention. Can’t blame someone for deciding that falls outside the realm of stuff you should be able to find limits for. And who would make, say, an f(x) that was 1 if x was 0.1 raised to some power, but 2 if x was 0.2 raised to some power, and 3 otherwise? Besides someone trying to prove a point?

Fine. But you can make a function that looks innocent and yet acts weird if the domain is two-dimensional. Or more. It makes sense to say that the functions I wrote in the above paragraph should be ruled out of consideration. But the limit of $f(x, y) = \frac{x^3 y}{x^6 + y^2}$ at the origin? You get different results approaching in different directions. And the function doesn’t give obvious signs of imminent danger here.

We need a better idea. And we even have one. This took centuries of mathematical wrangling and arguments about what should and shouldn’t be allowed. This should inspire sympathy with Intro Calc students who don’t understand all this by the end of week three. But here’s what we have.

I need a supplementary idea first. That is the neighborhood. A point has a neighborhood if there’s some open set that contains it. We represent this by drawing a little blob around the point we care about. If we’re looking at the neighborhood of a real number, then this is a little interval, that’s all. When we actually get around to calculating, we make these neighborhoods little circles. Maybe balls. But when we’re doing proofs about how limits work, or how we use them to prove things, we make blobs. This “neighborhood” idea looks simple, but we need it, so here we go.

So start with a function, named ‘f’. It has a domain, which I’ll call ‘D’. And a range, which I want to call ‘R’, but I don’t think I need the shorthand. Now pick some point ‘a’. This is the point at which we want to evaluate the limit. This seems like it ought to be called the “limit point” and it’s not. I’m sorry. Mathematicians use “limit point” to talk about something else. And, unfortunately, it makes so much sense in that context that we aren’t going to change away from that.

‘a’ might be in the domain ‘D’. It might not. It might be on the border of ‘D’. All that’s important is that there be a neighborhood inside ‘D’ that contains ‘a’.

I don’t know what f(a) is. There might not even be an f(a), if a is on the boundary of the domain ‘D’. But I do know that everything inside the neighborhood of ‘a’, apart from ‘a’, is in the domain. So we can look at the values of f(x) for all the x’s in this neighborhood. This will create a set, in the range, that’s known as the image of the neighborhood. It might be a continuous chunk in the range. It might be a couple of chunks. It might be a single point. It might be some crazy-quilt set. Depends on ‘f’. And the neighborhood. No matter.

Now I need you to imagine the reverse. Pick a point in the range. And then draw a neighborhood around it. Then pick out what we call the pre-image of it. That’s all the points in the domain that get matched to values inside that neighborhood. Don’t worry about trying to do it; that’s for the homework practice. Would you agree with me that you can imagine it?

I hope so because I’m about to describe the part where Intro Calc students think hard about whether they need this class after all.

All right. Then I want something in the range. I’m going to call it ‘L’. And it’s special. It’s the limit of ‘f’ at ‘a’ if this following bit is true:

Think of every neighborhood you could pick of ‘L’. Can be big, can be small. Just has to be a neighborhood of ‘L’. Now think of the pre-image of that neighborhood. Is there always a neighborhood of ‘a’ inside that pre-image? It’s okay if it’s a tiny neighborhood. Just has to be an open neighborhood. It doesn’t have to contain ‘a’. You can allow a pinpoint hole there.

If you can always do this, however tiny the neighborhood of ‘L’ is, then the limit of ‘f’ at ‘a’ is ‘L’. If you can’t always do this — if there’s even a single exception — then there is no limit of ‘f’ at ‘a’.

I know. I felt like that the first couple times through the subject too. The definition feels backward. Worse, it feels like it begs the question. We suppose there’s an ‘L’ and then test these properties about it and then if it works we say we’re done? I know. It’s a pain when you start calculating this with specific formulas and all that, too. But supposing there is an answer and then learning properties about it, including whether it can exist? That’s a slick trick. We can use it.

Thing is, the pain is worth it. We can calculate with it and not have to out-think tricky functions. It works for domains with as many dimensions as you need. It works for limits that aren’t inside the domain. It works with domains and ranges that aren’t real numbers. It works for functions with weird and complicated domains. We can adapt it if we want to consider limits that are constrained in some way. It won’t be fooled by tricks like I put up above, the f(x) with different rules for the rational and irrational numbers.

So mathematicians shrug, and do enough problems that they get the hang of it, and use this definition. It’s worth it, once you get there.

This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. And I’m still taking nominations for discussion topics, if you’d like to see mathematics terms explained. I know I would.

## I’m Looking For The Next Set Of Topics For My Fall 2018 Mathematics A-To-Z

We’re at the end of another month. So it’s a good chance to set out requests for the next several week’s worth of my mathematics A-To-Z. As I say, I’ve been doing this piecemeal so that I can keep track of requests better. I think it’s been working out, too.

If you have any mathematical topics with a name that starts N through T, let me know! I usually go by a first-come, first-serve basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me.

Also when you do make a request, please feel free to mention your blog, Twitter feed, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.

So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences. I probably don’t want to revisit them. But I will think over, if I get a request, whether I might have new opinions.

#### Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

#### Available Letters for the Fall 2018 A To Z:

• N
• O
• P
• Q
• R
• S
• T

All of my Fall 2018 Mathematics A-To-Z should appear at this link. And it’ll have some extra stuff like these topic-request pages and such.

## My 2018 Mathematics A To Z: Kelvin (the scientist)

Today’s request is another from John Golden, @mathhombre on Twitter and similarly on Blogspot. It’s specifically for Kelvin — “scientist or temperature unit”, the sort of open-ended goal I delight in. I decided on the scientist. But that’s a lot even for what I honestly thought would be a quick little essay. So I’m going to take out a tiny slice of a long and amazingly fruitful career. There’s so much more than this.

Before I get into what I did pick, let me repeat an important warning about historical essays. Every history is incomplete, yes. But any claim about something being done for the first time is simplified to the point of being wrong. Any claim about an individual discovering or inventing something is simplified to the point of being wrong. Everything is more complicated and, especially, more ambiguous than this. If you do not love the challenge of working out a coherent narrative when the most discrete and specific facts are also the ones that are trivia, do not get into history. It will only break your heart and mislead your readers. With that disclaimer, let me try a tiny slice of the life of William Thomson, the Baron Kelvin.

# Kelvin (the scientist).

The great thing about a magnetic compass is that it’s easy. Set the thing on an axis and let it float freely. It aligns itself to the magnetic poles. It’s easy to see why this looks like magic.

The trouble is that it’s not quite right. It’s near enough for many purposes. But the direction a magnetic compass points out to be north is not the true geographic north. Fortunately, we’ve got a fair idea just how far off north that is. It depends on where you are. If you have a rough idea where you already are, you can make a correction. We can print up charts saying how much of a correction to make.

The trouble is that it’s still not quite right. The location of the magnetic north and south poles wanders. Fortunately we’ve got a fair idea of how quickly it’s moving, and in what direction. So if you have a rough idea how out of date your chart is, and what direction the poles were moving in, you can make a correction. We can communicate how much the variance between true north and magnetic north vary.

The trouble is that it’s still not quite right. The size of the variation depends on the season of the year. But all right; we should have a rough idea what season it is. We can correct for that. The size of the variation also depends on what time of day it is. Compasses point farther east at around 8 am (sun time) than they do the rest of the day, and farther west around 1 pm. At least they did when Alan Gurney’s Compass: A Story of Exploration and Innovation was published. I would be unsurprised if that’s changed since the book came out a dozen years ago. Still. These are all, we might say, global concerns. They’s based on where you are and when you look at the compass. But they don’t depend on you, the specific observer.

The trouble is that it’s still not quite right yet. Almost as soon as compasses were used for navigation, on ships, mariners noticed the compass could vary. And not just because compasses were often badly designed and badly made. The ships themselves got in the way. The problem started with guns, the iron of which led compasses astray. When it was just the ship’s guns the problem could be coped with. Set the compass binnacle far from any source of iron, and the error should be small enough.

The trouble is when the time comes to make ships with iron. There are great benefits you get from cladding ships in iron, or making them of iron altogether. Losing the benefits of navigation, though … that’s a bit much.

There’s an obvious answer. Suppose you know the construction of the ship throws off compass bearings. Then measure what the compass reads, at some point when you know what it should read. Use that to correct your measurements when you aren’t sure. From the early 1800s mariners could use a method called “swinging the ship”, setting the ship at known angles and comparing what the compass read. It’s a bit of a chore. And you should arrange things you need to do so that it’s harder to make a careless mistake at them.

In the 1850s John Gray of Liverpool patented a binnacle — the little pillar that holds the compass — which used the other obvious but brilliant approach. If the iron which builds the ship sends the compass awry, why not put iron near the compass to put the compass back where it should be? This set up a contraption of a binnacle surrounded by adjustable, correcting magnets.

Enter finally William Thomson, who would become Baron Kelvin in 1892. In 1871 the magazine Good Words asked him to write an article about the marine compass. In 1874 he published his first essay on the subject. The second part appeared five years after that. I am not certain that this is directly related to the tiny slice of story I tell. I just mention it to reassure every academic who’s falling behind on their paper-writing, which is all of them.

But come the 1880s Thomson patented an improved binnacle. Thomson had the sort of talents normally associated only with the heroes of some lovable yet dopey space-opera of the 1930s. He was a talented scientist, competent in thermodynamics and electricity and magnetism and fluid flow. He was a skilled mathematician, as you’d need to be to keep up with all that and along the way prove the Stokes theorem. (This is one of those incredibly useful theorems that gives information about the interior of a volume using only integrals over the surface.) He was a magnificent engineer, with a particular skill at developing instruments that would brilliantly measure delicate matters. He’s famous for saving the trans-Atlantic telegraph cable project. He recognized that what was needed was not more voltage to drive signal through three thousand miles of dubiously made copper wire, but rather ways to pick up the feeble signals that could come across, and amplify them into usability. And also described the forces at work on a ship that is laying a long line of submarine cable. And he was a manufacturer, able to turn these designs into mass-produced products. This through collaborating with James White, of Glasgow, for over half a century. And a businessman, able to convince people and organizations to use the things. He’s an implausible protagonist; and yet, there he is.

Thomson’s revision for the binnacle made it simpler. A pair of spheres, flanking the compass, and adjustable. The Royal Museums Greenwich web site offers a picture of this sort of system. It’s not so shiny as others in the collection. But this angle shows how adjustable the system would be. It’s a design that shows brilliance behind it. What work you might have to do to use it is obvious. At least it’s obvious once you’re told the spheres are adjustable. To reduce a massive, lingering, challenging problem to something easy is one of the great accomplishments of any practical mathematician.

This was not all Thomson did in maritime work. He’d developed an analog computer which would calculate the tides. Wikipedia tells me that Thomson claimed a similar mechanism could solve arbitrary differential equations. I’d accept that claim, if he made it. Thomson also developed better tools for sounding depths. And developed compasses proper, not just the correcting tools for binnacles. A maritime compass is a great practical challenge. It has to be able to move freely, so that it can give a correct direction even as the ship changes direction. But it can’t move too freely, or it becomes useless in rolling seas. It has to offer great precision, or it loses its use in directing long journeys. It has to be quick to read, or it won’t be consulted. Thomson designed a compass that was, my readings indicate, a great fit for all these constraints. By the time of his death in 1907 Kelvin and White (the company had various names) had made something like ten thousand compasses and binnacles.

And this from a person attached to all sorts of statistical mechanics stuff and who’s important for designing electrical circuits and the like.

## My 2018 Mathematics A To Z: Jokes

For today’s entry, Iva Sallay, of Find The Factors, gave me an irresistible topic. I did not resist.

# Jokes.

What’s purple and commutes?
An Abelian grape.

Whatever else you say about mathematics we are human. We tell jokes. I will tell some here. You may not understand the words in them. That’s all right. From the Abelian grape there, you gather this is some manner of wordplay. A pun, particularly. It’s built on a technical term. “Abelian groups” come from (not high school) Algebra. In an Abelian group, the group multiplication commutes. That is, if ‘a’ and ‘b’ are any things in the group, then their product “ab” is the same as “ba’. That is, the group works like ordinary addition on numbers does. We say “Abelian” in honor of Niels Henrik Abel, who taught us some fascinating stuff about polynomials. Puns are a common kind of humor. So common, they’re almost base. Even a good pun earns less laughter than groans.

But mathematicians make many puns. A typical page of mathematics jokes has a whole section of puns. “What’s yellow and equivalent to the Axiom of Choice? Zorn’s Lemon.” “What’s nonorientable and lives in the sea?” “Möbius Dick.” “One day Jesus said to his disciples, The Kingdom of Heaven is like 3x2 + 8x – 9′. Thomas looked very confused and asked peter, What does the teacher mean?’ Peter replied, Don’t worry. It’s just another one of his parabolas’.” And there are many jokes built on how it is impossible to tell the difference between the sounds of “π” and “pie”.

It shouldn’t surprise that mathematicians make so many puns. Mathematics trains people to know definitions. To think about precisely what we mean. Puns ignore definitions. They build nonsense out of the ways that sounds interact. Mathematicians practice how to make things interact, even if they don’t know or care what the underlying things are. If you’ve gotten used to proving things about $aba^{-1}b^{-1}$, without knowing what ‘a’ or ‘b’ are, it’s difficult to avoid turning “poles on the half-plane” (which matters in some mathematical physics) to a story about Polish people on an aircraft.

If there’s a flaw to this kind of humor it’s that these jokes may sound juvenile. One of the first things that strikes kids as funny is that a thing might have several meanings. Or might sound like another thing. “Why do mathematicians like parks? Because of all the natural logs!”

Jokes can be built tightly around definitions. “What do you get if you cross a mosquito with a mountain climber? Nothing; you can’t cross a vector with a scalar.” “There are 10 kinds of people in the world, those who understand binary mathematics and those who don’t.” “Life is complex; it has real and imaginary parts.”

There are more sophisticated jokes. Many of them are self-deprecating. “A mathematician is a device for turning coffee into theorems.” “An introvert mathematician looks at her shoes while talking to you. An extrovert mathematician looks at your shoes.” “A mathematics professor is someone who talks in someone else’s sleep”. “Two people are adrift in a hot air balloon. Finally they see someone and shout down, Where are we?’ The person looks up, and studies them, watching the balloon drift away. Finally, when they are barely in shouting range, the person on the ground shouts back, You are in a balloon!’ The first passenger curses their luck at running across a mathematician. How do you know that was a mathematician?’ Because her answer took a long time, was perfectly correct, and absolutely useless!”’ These have the form of being about mathematicians. But they’re not really. It would be the same joke to say “a poet is a device for turning coffee into couplets”, the sleep-talker anyone who teachers, or have the hot-air balloonists discover a lawyer or a consultant.

Some of these jokes get more specific, with mathematics harder to extract from the story. The tale of the nervous flyer who, before going to the conference, sends a postcard that she has a proof of the Riemann hypothesis. She arrives and admits she has no such thing, of course. But she sends that word ahead of every conference. She knows if she died in a plane crash after that, she’d be famous forever, and God would never give her that. (I wonder if Ian Randal Strock’s little joke of a story about Pierre de Fermat was an adaptation of this joke.) You could recast the joke for physicists uniting gravity and quantum mechanics. But I can’t imagine a way to make this joke about an ISO 9000 consultant.

A dairy farmer knew he could be milking his cows better. He could surely get more milk, and faster, if only the operations of his farm were arranged better. So he hired a mathematician to find the optimal way to configure everything. The mathematician toured every part of the pastures, the milking barn, the cows, everything relevant. And then the mathematician set to work devising a plan for the most efficient possible cow-milking operation. The mathematician declared, “First, assume a spherical cow.”

This joke is very mathematical. I know of no important results actually based on spherical cows. But the attitude that tries to make spheres of cows comes from observing mathematicians. To describe any real-world process is to make a model of that thing. A model is a simplification of the real thing. You suppose that things behave more predictably than the real thing. You trust the error made by this supposition is small enough for your needs. A cow is complicated, all those pointy ends and weird contours. A sphere is easy. And, besides, cows are funny. “Spherical cow” is a funny string of sounds, at least in English.

The spherical cows approach parodying the work mathematicians do. Many mathematical jokes are burlesques of deductive logic. Or not even burlesques. Charles Dodgson, known to humans as Lewis Carroll, wrote this in Symbolic Logic:

“No one, who means to go by the train and cannot get a conveyance, and has not enough time to walk to the station, can do without running;
This party of tourists mean to go by the train and cannot get a conveyance, but they have plenty of time to walk to the station.
∴ This party of tourists need not run.”

[ Here is another opportunity, gentle Reader, for playing a trick on your innocent friend. Put the proposed Syllogism before him, and ask him what he thinks of the Conclusion.

He will reply “Why, it’s perfectly correct, of course! And if your precious Logic-book tells you it isn’t, don’t believe it! You don’t mean to tell me those tourists need to run? If I were one of them, and knew the Premises to be true, I should be quite clear that I needn’t run — and I should walk!

And you will reply “But suppose there was a mad bull behind you?”

And then your innocent friend will say “Hum! Ha! I must think that over a bit!” ]

The punch line is diffused by the text being so educational. And by being written in the 19th century, when it was bad form to excise any word from any writing. But you can recognize the joke, and why it should be a joke.

Not every mathematical-reasoning joke features some manner of cattle. Some are legitimate:

Claim. There are no uninteresting whole numbers.
Proof. Suppose there is a smalled uninteresting whole number. Call it N. That N is uninteresting is an interesting fact. Therefore N is not an uninteresting whole number.

Three mathematicians step up to the bar. The bartender asks, “you all want a beer?” The first mathematician says, “I don’t know.” The second mathematician says, “I don’t know.” The third says, “Yes”.

Some mock reasoning uses nonsense methods to get a true conclusion. It’s the fun of watching Mister Magoo walk unharmed through a construction site to find the department store exchange counter:

5095 / 1019 = 5095 / 1019 = 505 / 101 = 55 / 11 = 5

This one includes the thrill of division by zero.

Venn Diagrams are not by themselves jokes (most of the time). But they are a great structure for jokes. And easy to draw, which is great for us who want to be funny but don’t feel sure about their drafting abilities.

And then there are personality jokes. Mathematics encourages people to think obsessively. Obsessive people are often funny people. Alexander Grothendieck was one of the candidates for “greatest 20th century mathematician”. His reputation is that he worked so well on abstract problems that he was incompetent at practical ones. The story goes that he was demonstrating something about prime numbers and his audience begged him to speak about a specific number, that they could follow an example. And that he grumbled a bit and, finally, said, “57”. It’s not a prime number. But if you speak of “Grothendieck’s prime”, many will recognize what you mean, and grin.

There are more outstanding, preposterous personalities. Paul Erdös was prolific, and a restless traveller. The stories go that he would show up at some poor mathematician’s door and stay with them several months. And then co-author a paper with the elevator operator. (Erdös is also credited as the originator of the “coffee into theorems” quip above.) John von Neumann was supposedly presented with this problem:

Two trains are on the same track, 60 miles apart, heading toward each other, each travelling 30 miles per hour. A fly travels 60 miles per hour, leaving one engine flying toward the other. When it reaches the other engine it turns around immediately and flies back to the other engine. This is repeated until the two trains crash. How far does the fly travel before the crash?

The first, hard way to do this is to realize how far the fly travels is a series. It starts at, let’s say, the left engine and flies to the right. Add to that the distance from the right to the left train now. Then left to the right again. Right to left. This is a bunch of calculations. Most people give up on that and realize the problem is easier. The trains will crash in one hour. The fly travels 60 miles per hour for an hour. It’ll fly 60 miles total. John von Neumann, say witnesses, had the answer instantly. He recognized the trick? “I summed the series.”

The personalities can be known more remotely, from a handful of facts about who they were or what they did. “Cantor did it diagonally.” Georg Cantor is famous for great thinking about infinitely large sets. His “diagonal proof” shows the set of real numbers must be larger than the set of rational numbers. “Fermat tried to do it in the margin but couldn’t fit it in.” “Galois did it on the night before.” (Évariste Galois wrote out important pieces of group theory the night before a duel. It went badly for him. French politics of the 1830s.) Every field has its celebrities. Mathematicians learn just enough about theirs to know a couple of jokes.

The jokes can attach to a generic mathematician personality. “How can you possibly visualize something that happens in a 12-dimensional space?” “Easy, first visualize it in an N-dimensional space, and then let N go to 12.” Three statisticians go hunting. They spot a deer. One shoots, missing it on the left. The second shoots, missing it on the right. The third leaps up, shouting, “We’ve hit it!” An engineer and a mathematician are sleeping in a hotel room when the fire alarm goes off. The engineer ties the bedsheets into a rope and shimmies out of the room. The mathematician looks at this, unties the bedsheets, sets them back on the bed, declares, “this is a problem already solved” and goes back to sleep. (Engineers and mathematicians pair up a lot in mathematics jokes. I assume in engineering jokes too, but that the engineers make wrong assumptions about who the joke is on. If there’s a third person in the party, she’s a physicist.)

Do I have a favorite mathematics joke? I suppose I must. There are jokes I like better than others, and there are — I assume — finitely many different mathematics jokes. So I must have a favorite. What is it? I don’t know. It must vary with the day and my mood and the last thing I thought about. I know a bit of doggerel keeps popping into my head, unbidden. Let me close by giving it to you.

Integral z-squared dz
From 1 to the cube root of 3
Times the cosine
Of three π over nine
Equals log of the cube root of e.

This may not strike you as very funny. I’m not sure it strikes me as very funny. But it keeps showing up, all the time. That has to add up.

This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. Also, now and then, I talk about comic strips here. You might like that too.

## My 2018 Mathematics A To Z: Infinite Monkey Theorem

Dina Yagodich gave me the topic for today. She keeps up a YouTube channel with a variety of interesting videos. And she did me a favor. I’ve been thinking a long while to write a major post about this theorem. Its subject turns up so often. I’d wanted to have a good essay about it. I hope this might be one.

# Infinite Monkey Theorem.

Some mathematics escapes mathematicians and joins culture. This is one such. The monkeys are part of why. They’re funny and intelligent and sad and stupid and deft and clumsy, and they can sit at a keyboard almost look in place. They’re so like humans, except that we empathize with them. To imagine lots of monkeys, and putting them to some silly task, is compelling.

The metaphor traces back to a 1913 article by the mathematical physicist Émile Borel which I have not read. Searching the web I find much more comment about it than I find links to a translation of the text. And only one copy of the original, in French. And that page wants €10 for it. So I can tell you what everybody says was in Borel’s original text, but can’t verify it. The paper’s title is “Statistical Mechanics and Irreversibility”. From this I surmise that Borel discussed one of the great paradoxes of statistical mechanics. If we open a bottle of one gas in an airtight room, it disperses through the room. Why doesn’t every molecule of gas just happen, by chance, to end up back where it started? It does seem that if we waited long enough, it should. It’s unlikely it would happen on any one day, but give it enough days …

But let me turn to many web sites that are surely not all copying Wikipedia on this. Borel asked us to imagine a million monkeys typing ten hours a day. He posited it was possible but extremely unlikely that they would exactly replicate all the books of the richest libraries of the world. But that would be more likely than the atmosphere in a room un-mixing like that. Fair enough, but we’re not listening anymore. We’re thinking of monkeys. Borel’s is a fantastic image. It would see some adaptation in the years. Physicist Arthur Eddington, in 1928, made it an army of monkeys, with their goal being the writing all the books in the British Museum. By 1960 Bob Newhart had an infinite number of monkeys and typewriters, and a goal of all the great books. Stating the premise gets a laugh I doubt the setup would today. I’m curious whether Newhart brought the idea to the mass audience. (Google NGrams for “monkeys at typewriters” suggest that phrase was unwritten, in books, before about 1965.) We may owe Bob Newhart thanks for a lot of monkeys-at-typewriters jokes.

Newhart has a monkey hit on a line from Hamlet. I don’t know if it was Newhart that set the monkeys after Shakespeare particularly, rather than some other great work of writing. Shakespeare does seem to be the most common goal now. Sometimes the number of monkeys diminishes, to a thousand or even to one. Some people move the monkeys off of typewriters and onto computers. Some take the cowardly measure of putting the monkeys at “keyboards”. The word is ambiguous enough to allow for typewriters, computers, and maybe a Megenthaler Linotype. The monkeys now work 24 hours a day. This will be a comment someday about how bad we allowed pre-revolutionary capitalism to get.

The cultural legacy of monkeys-at-keyboards might well itself be infinite. It turns up in comic strips every few weeks at least. Television shows, usually writing for a comic beat, mention it. Computer nerds doing humor can’t resist the idea. Here’s a video of a 1979 Apple ][ program titled THE INFINITE NO. OF MONKEYS, which used this idea to show programming tricks. And it’s a great philosophical test case. If a random process puts together a play we find interesting, has it created art? No deliberate process creates a sunset, but we can find in it beauty and meaning. Why not words? There’s likely a book to write about the infinite monkeys in pop culture. Though the quotations of original materials would start to blend together.

But the big question. Have the monkeys got a chance? In a break from every probability question ever, the answer is: it depends on what the question precisely is. Occasional real-world experiments-cum-art-projects suggest that actual monkeys are worse typists than you’d think. They do more of bashing the keys with a stone before urinating on it, a reminder of how slight is the difference between humans and our fellow primates. So we turn to abstract monkeys who behave more predictably, and run experiments that need no ethical oversight.

So we must think what we mean by Shakespeare’s Plays. Arguably the play is a specific performance of actors in a set venue doing things. This is a bit much to expect of even a skilled abstract monkey. So let us switch to the book of a play. This has a more clear representation. It’s a string of characters. Mostly letters, some punctuation. Good chance there’s numerals in there. It’s probably a lot of characters. So the text to match is some specific, long string of characters in a particular order.

And what do we mean by a monkey at the keyboard? Well, we mean some process that picks characters randomly from the allowed set. When I see something is picked “randomly” I want to know what the distribution rule is. Like, are Q’s exactly as probable as E’s? As &’s? As %’s? How likely it is a particular string will get typed is easiest to answer if we suppose a “uniform” distribution. This means that every character is equally likely. We can quibble about capital and lowercase letters. My sense is most people frame the problem supposing case-insensitivity. That the monkey is doing fine to type “whaT beArD weRe i BEsT tO pLAy It iN?”. Or we could set the monkey at an old typesetter’s station, with separate keys for capital and lowercase letters. Some will even forgive the monkeys punctuating terribly. Make your choices. It affects the numbers, but not the point.

I’ll suppose there are 91 characters to pick from, as a Linotype keyboard had. So the monkey has capitals and lowercase and common punctuation to get right. Let your monkey pick one character. What is the chance it hit the first character of one of Shakespeare’s plays? Well, the chance is 1 in 91 that you’ve hit the first character of one specific play. There’s several dozen plays your monkey might be typing, though. I bet some of them even start with the same character, so giving an exact answer is tedious. If all we want monkey-typed Shakespeare plays, we’re being fussy if we want The Tempest typed up first and Cymbeline last. If we want a more tractable problem, it’s easier to insist on a set order.

So suppose we do have a set order. Then there’s a one-in-91 chance the first character matches the first character of the desired text. A one-in-91 chance the second character typed matches the second character of the desired text. A one-in-91 chance the third character typed matches the third character of the desired text. And so on, for the whole length of the play’s text. Getting one character right doesn’t make it more or less likely the next one is right. So the chance of getting a whole play correct is $\frac{1}{91}$ raised to the power of however many characters are in the first script. Call it 800,000 for argument’s sake. More characters, if you put two spaces between sentences. The prospects of getting this all correct is … dismal.

I mean, there’s some cause for hope. Spelling was much less fixed in Shakespeare’s time. There are acceptable variations for many of his words. It’d be silly to rule out a possible script that (say) wrote “look’d” or “look’t”, rather than “looked”. Still, that’s a slender thread.

But there is more reason to hope. Chances are the first monkey will botch the first character. But what if they get the first character of the text right on the second character struck? Or on the third character struck? It’s all right if there’s some garbage before the text comes up. Many writers have trouble starting and build from a first paragraph meant to be thrown away. After every wrong letter is a new chance to type the perfect thing, reassurance for us all.

Since the monkey does type, hypothetically, forever … well, so each character has a probability of only $\left(\frac{1}{91}\right)^{800,000}$ (or whatever) of starting the lucky sequence. The monkey will have $91^{800,000}$ chances to start. More chances than that.

And we don’t have only one monkey. We have a thousand monkeys. At least. A million monkeys. Maybe infinitely many monkeys. Each one, we trust, is working independently, owing to the monkeys’ strong sense of academic integrity. There are $91^{800,000}$ monkeys working on the project. And more than that. Each one takes their chance.

There are dizzying possibilities here. There’s the chance some monkey will get it all exactly right first time out. More. Think of a row of monkeys. What’s the chance the first thing the first monkey in the row types is the first character of the play? What’s the chance the first thing the second monkey in the row types is the second character of the play? The chance the first thing the third monkey in the row types is the third character in the play? What’s the chance a long enough row of monkeys happen to hit the right buttons so the whole play appears in one massive simultaneous stroke of the keys? Not any worse than the chance your one monkey will type this all out. Monkeys at keyboards are ergodic. It’s as good to have a few monkeys working a long while as to have many monkeys working a short while. The Mythical Man-Month is, for this project, mistaken.

That solves it then, doesn’t it? A monkey, or a team of monkeys, has a nonzero probability of typing out all Shakespeare’s plays. Or the works of Dickens. Or of Jorge Luis Borges. Whatever you like. Given infinitely many chances at it, they will, someday, succeed.

Except.

What is the chance that the monkeys screw up? They get the works of Shakespeare just right, but for a flaw. The monkeys’ Midsummer Night’s Dream insists on having the fearsome lion played by “Smaug the joiner” instead. This would send the play-within-the-play in novel directions. The result, though interesting, would not be Shakespeare. There’s a nonzero chance they’ll write the play that way. And so, given infinitely many chances, they will.

What’s the chance that they always will? That they just miss every single chance to write “Snug”. It comes out “Smaug” every time?

We can say. Call the probability that they make this Snug-to-Smaug typo any given time $p$. That’s a number from 0 to 1. 0 corresponds to not making this mistake; 1 to certainly making it. The chance they get it right is $1 - p$. The chance they make this mistake twice is smaller than $p$. The chance that they get it right at least once in two tries is closer to 1 than $1 - p$ is. The chance that, given three tries, they make the mistake every time is even smaller still. The chance that they get it right at least once is even closer to 1.

You see where this is going. Every extra try makes the chance they got it wrong every time smaller. Every extra try makes the chance they get it right at least once bigger. And now we can let some analysis come into play.

So give me a positive number. I don’t know your number, so I’ll call it ε. It’s how unlikely you want something to be before you say it won’t happen. Whatever your ε was, I can give you a number $M$. If the monkeys have taken more than $M$ tries, the chance they get it wrong every single time is smaller than your ε. The chance they get it right at least once is bigger than 1 – ε. Let the monkeys have infinitely many tries. The chance the monkey gets it wrong every single time is smaller than any positive number. So the chance the monkey gets it wrong every single time is zero. It … can’t happen, right? The chance they get it right at least once is closer to 1 than to any other number. So it must be 1. So it must be certain. Right?

But let me give you this. Detach a monkey from typewriter duty. This one has a coin to toss. It tosses fairly, with the coin having a 50% chance of coming up tails and 50% chance of coming up heads each time. The monkey tosses the coin infinitely many times. What is the chance the coin comes up tails every single one of these infinitely many times? The chance is zero, obviously. At least you can show the chance is smaller than any positive number. So, zero.

Yet … what power enforces that? What forces the monkey to eventually have a coin come up heads? It’s … nothing. Each toss is a fair toss. Each toss is independent of its predecessors. But there is no force that causes the monkey, after a hundred million billion trillion tosses of “tails”, to then toss “heads”. It’s the gambler’s fallacy to think there is one. The hundred million billion trillionth-plus-one toss is as likely to come up tails as the first toss is. It’s impossible that the monkey should toss tails infinitely many times. But there’s no reason it can’t happen. It’s also impossible that the monkeys still on the typewriters should get Shakespeare wrong every single time. But there’s no reason that can’t happen.

It’s unsettling. Well, probability is unsettling. If you don’t find it disturbing you haven’t thought long enough about it. Infinities, too, are unsettling so.

Formally, mathematicians interpret this — if not explain it — by saying the set of things that can happen is a “probability space”. The likelihood of something happening is what fraction of the probability space matches something happening. (I’m skipping a lot of background to say something that simple. Do not use this at your thesis defense without that background.) This sort of “impossible” event has “measure zero”. So its probability of happening is zero. Measure turns up in analysis, in understanding how calculus works. It complicates a bunch of otherwise-obvious ideas about continuity and stuff. It turns out to apply to probability questions too. Imagine the space of all the things that could possibly happen as being the real number line. Pick one number from that number line. What is the chance you have picked exactly the number -24.11390550338228506633488? I’ll go ahead and say you didn’t. It’s not that you couldn’t. It’s not impossible. It’s just that the chance that this happened, out of the infinity of possible outcomes, is zero.

The infinite monkeys give us this strange set of affairs. Some things have a probability of zero of happening, which does not rule out that they can. Some things have a probability of one of happening, which does not mean they must. I do not know what conclusion Borel ultimately drew about the reversibility problem. I expect his opinion to be that we have a clear answer, and unsettlingly great room for that answer to be incomplete.

This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. The next essay should come Friday and will, I hope, be shorter.

## My 2018 Mathematics A To Z: Hyperbolic Half-Plane

Today’s term was one of several nominations I got for ‘H’. This one comes from John Golden, @mathhobre on Twitter and author of the Math Hombre blog on Blogspot. He brings in a lot of thought about mathematics education and teaching tools that you might find interesting or useful or, better, both.

# Hyperbolic Half-Plane.

The half-plane part is easy to explain. By the “plane” mathematicians mean, well, the plane. What you’d get if a sheet of paper extended forever. Also if it had zero width. To cut it in half … well, first we have to think hard what we mean by cutting an infinitely large thing in half. Then we realize we’re overthinking this. Cut it by picking a line on the plane, and then throwing away everything on one side or the other of that line. Maybe throw away everything on the line too. It’s logically as good to pick any line. But there are a couple lines mathematicians use all the time. This is because they’re easy to describe, or easy to work with. At least once you fix an origin and, with it, x- and y-axes. The “right half-plane”, for example, is everything in the positive-x-axis direction. Every point with coordinates you’d describe with positive x-coordinate values. Maybe the non-negative ones, if you want the edge included. The “upper half plane” is everything in the positive-y-axis direction. All the points whose coordinates have a positive y-coordinate value. Non-negative, if you want the edge included. You can make guesses about what the “left half-plane” or the “lower half-plane” are. You are correct.

The “hyperbolic” part takes some thought. What is there to even exaggerate? Wrong sense of the word “hyperbolic”. The word here is the same one used in “hyperbolic geometry”. That takes explanation.

The Western mathematics tradition, as we trace it back to Ancient Greece and Ancient Egypt and Ancient Babylon and all, gave us “Euclidean” geometry. It’s a pretty good geometry. It describes how stuff on flat surfaces works. In the Euclidean formation we set out a couple of axioms that aren’t too controversial. Like, lines can be extended indefinitely and that all right angles are congruent. And one axiom that is controversial. But which turns out to be equivalent to the idea that there’s only one line that goes through a point and is parallel to some other line.

And it turns out that you don’t have to assume that. You can make a coherent “spherical” geometry, one that describes shapes on the surface of a … you know. You have to change your idea of what a line is; it becomes a “geodesic” or, on the globe, a “great circle”. And it turns out that there’s no lines geodesics that go through a point and that are parallel to some other line geodesic. (I know you want to think about globes. I do too. You maybe want to say the lines of latitude are parallel one another. They’re even called parallels, sometimes. So they are. But they’re not geodesics. They’re “little circles”. I am not throwing in ad hoc reasons I’m right and you’re not.)

There is another, though. This is “hyperbolic” geometry. This is the way shapes work on surfaces that mathematicians call saddle-shaped. I don’t know what the horse enthusiasts out there call these shapes. My guess is they chuckle and point out how that would be the most painful saddle ever. Doesn’t matter. We have surfaces. They act weird. You can draw, through a point, infinitely many lines parallel to a given other line.

That’s some neat stuff. That’s weird and interesting. They’re even called “hyperparallel lines” if that didn’t sound great enough. You can see why some people would find this worth studying. The catch is that it’s hard to order a pad of saddle-shaped paper to try stuff out on. It’s even harder to get a hyperbolic blackboard. So what we’d like is some way to represent these strange geometries using something easier to work with.

The hyperbolic half-plane is one of those approaches. This uses the upper half-plane. It works by a move as brilliant and as preposterous as that time Q told Data and LaForge how to stop that falling moon. “Simple. Change the gravitational constant of the universe.”

What we change here is the “metric”. The metric is a function. It tells us something about how points in a space relate to each other. It gives us distance. In Euclidean geometry, plane geometry, we use the Euclidean metric. You can find the distance between point A and point B by looking at their coordinates, $(x_A, y_A)$ and $(x_B, y_B)$. This distance is $\sqrt{\left(x_B - x_A\right)^2 + \left(y_B - y_A\right)^2}$. Don’t worry about the formulas. The lines on a sheet of graph paper are a reflection of this metric. Each line is (normally) a fixed distance from its parallel neighbors. (Yes, there are polar-coordinate graph papers. And there are graph papers with logarithmic or semilogarithmic spacing. I mean graph paper like you can find at the office supply store without asking for help.)

But the metric is something we choose. There are some rules it has to follow to be logically coherent, yes. But those rules give us plenty of room to play. By picking the correct metric, we can make this flat plane obey the same geometric rules as the hyperbolic surface. This metric looks more complicated than the Euclidean metric does, but only because it has more terms and takes longer to write out. What’s important about it is that the distance your thumb put on top of the paper covers up is bigger if your thumb is near the bottom of the upper-half plane than if your thumb is near the top of the paper.

So. There are now two things that are “lines” in this. One of them is vertical lines. The graph paper we would make for this has a nice file of parallel lines like ordinary paper does. The other thing, though … well, that’s half-circles. They’re half-circles with a center on the edge of the half-plane. So our graph paper would also have a bunch of circles, of different sizes, coming from regularly-spaced sources on the bottom of the paper. A line segment is a piece of either these vertical lines or these half-circles. You can make any polygon you like with these, if you pick out enough line segments. They’re there.

There are many ways to represent hyperbolic surfaces. This is one of them. It’s got some nice properties. One of them is that it’s “conformal”. Angles that you draw using this metric are the same size as those on the corresponding hyperbolic surface. You don’t appreciate how sweet that is until you’re working in non-Euclidean geometries. Circles that are entirely within the hyperbolic half-plane match to circles on a hyperbolic surface. Once you’ve got your intuition for this hyperbolic half-plane, you can step into hyperbolic half-volumes. And that lets you talk about the geometry of hyperbolic spaces that reach into four or more dimensions of human-imaginable spaces. Isometries — picking up a shape and moving it in ways that don’t change distance — match up with the Möbius Transformations. These are a well-understood set of altering planes that comes from a different corner of geometry. Also from that fellow with the strip, August Ferdinand Möbius. It’s always exciting to find relationships like that in mathematical structures.

Pictures often help. I don’t know why I don’t include them. But here is a web site with pages, and pictures, that describe much of the hyperbolic half-plane. It includes code to use with the Geometer Sketchpad software, which I have never used and know nothing about. That’s all right. There’s at least one page there showing a wondrous picture. I hope you enjoy.

This and other essays in the Fall 2018 A-To-Z should be at this link. And I’ll start paneling for more letters soon.

## My 2018 Mathematics A To Z: Group Action

I got several great suggestions for topics for ‘g’. The one that most caught my imagination was mathtuition88’s, the group action. Mathtuition88 is run by Mr Wu, a mathematics tutor in Singapore. His mathematics blog recounts his own explorations of interesting topics.

# Group Action.

This starts from groups. A group, here, means a pair of things. The first thing is a set of elements. The second is some operation. It takes a pair of things in the set and matches it to something in the set. For example, try the integers as the set, with addition as the operation. There are many kinds of groups you can make. There can be finite groups, ones with as few as one element or as many as you like. (The one-element groups are so boring. We usually need at least two to have much to say about them.) There can be infinite groups, like the integers. There can be discrete groups, where there’s always some minimum distance between elements. There can be continuous groups, like the real numbers, where there’s no smallest distance between distinct elements.

Groups came about from looking at how numbers work. So the first examples anyone gets are based on numbers. The integers, especially, and then the integers modulo something. For example, there’s $Z_2$, which has two numbers, 0 and 1. Addition works by the rule that 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, and 1 + 1 = 0. There’s similar rules for $Z_3$, which has three numbers, 0, 1, and 2.

But after a few comfortable minutes on this, group theory moves on to more abstract things. Things with names like the “permutation group”. This starts with some set of things and we don’t even care what the things are. They can be numbers. They can be letters. They can be places. They can be anything. We don’t care. The group is all of the ways to swap elements around. All the relabellings we can do without losing or gaining an item. Or another, the “symmetry group”. This is, for some given thing — plates, blocks, and wallpaper patterns are great examples — all the ways you can rotate or move or reflect the thing without changing the way it looks.

And now we’re creeping up on what a “group action” is. Let me just talk about permutations here. These are where you swap around items. Like, start out with a list of items “1 2 3 4”. And pick out a permutation, say, swap the second with the fourth item. We write that, in shorthand, as (2 4). Maybe another permutation too. Say, swap the first item with the third. Write that out as (1 3). We can multiply these permutations together. Doing these permutations, in this order, has a particular effect: it swaps the second and fourth items, and swaps the first and third items. This is another permutation on these four items.

These permutations, these “swap this item with that” rules, are a group. The set for the group is instructions like “swap this with that”, or “swap this with that, and that with this other thing, and this other thing with the first thing”. Or even “leave this thing alone”. The operation between two things in the set is, do one and then the other. For example, (2 3) and then (3 4) has the effect of moving the second thing to the fourth spot, the (original) fourth thing to the third spot, and the original third thing to the second spot. That is, it’s the permutation (2 3 4). If you ever need something to doodle during a slow meeting, try working out all the ways you can shuffle around, say, six things. And what happens as you do all the possible combinations of these things. Hey, you’re only permuting six items. How many ways could that be?

So here’s what sounds like a fussy point. The group here is made up the ways you can permute these items. The items aren’t part of the group. They just gave us something to talk about. This is where I got so confused, as an undergraduate, working out groups and group actions.

When we move back to talking about the original items, then we get a group action. You get a group action by putting together a group with some set of things. Let me call the group ‘G’ and the set ‘X’. If I need something particular in the group I’ll call that ‘g’. If I need something particular from the set ‘X’ I’ll call that ‘x’. This is fairly standard mathematics notation. You see how subtly clever this notation is. The group action comes from taking things in G and applying them to things in X, to get things in X. Usually other things, but not always. In the lingo, we say the group action maps the pair of things G and X to the set X.

There are rules these actions have to follow. They’re what you would expect, if you’ve done any fiddling with groups. Don’t worry about them. What’s interesting is what we get from group actions.

First is group orbits. Take some ‘g’ out of the group G. Take some ‘x’ out of the set ‘X’. And build this new set. First, x. Then, whatever g does to x, which we write as ‘gx’. But ‘gx’ is still something in ‘X’, so … what does g do to that? So toss in ‘ggx’. Which is still something in ‘X’, so, toss in ‘gggx’. And ‘ggggx’. And keep going, until you stop getting new things. If ‘X’ is finite, this sequence has to be finite. It might be the whole set of X. It might be some subset of X. But if ‘X’ is finite, it’ll get back, eventually, to where you started, which is why we call this the “group orbit”. We use the same term even if X isn’t finite and we can’t guarantee that all these iterations of g on x eventually get back to the original x. This is a subgroup of X, based on the same group operation that G has.

There can be other special groups. Like, are there elements ‘g’ that map ‘x’ to ‘x’? Sure. The has to be at least one, since the group G has an identity element. There might be others. So, for any given ‘x’, what are all the elements in ‘g’ that don’t change it? The set of all the values of g for which gx is x is the “isotropy group” Gx. Or the “stabilizer subgroup”. This is a subgroup of G, based on x.

Yes, but the point?

Well, the biggest thing we get from group actions is the chance to put group theory principles to work on specific things. A group might describe the ways you can rotate or reflect a square plate without leaving an obvious change in the plate. The group action lets you make this about the plate. Much of modern physics is about learning how the geometry of a thing affects its behavior. This can be the obvious sorts of geometry, like, whether it’s rotationally symmetric. But it can be subtler things, like, whether the forces in the system are different at different times. Group actions let us put what we know from geometry and topology to work in specifics.

A particular favorite of mine is that they let us express the wallpaper groups. These are the ways we can use rotations and reflections and translations (linear displacements) to create different patterns. There are fewer different patterns than you might have guessed. (Different, here, overlooks such petty things as whether the repeated pattern is a diamond, a flower, or a hexagon. Or whether the pattern repeats every two inches versus every three inches.)

And they stay useful for abstract mathematical problems. All this talk about orbits and stabilizers lets us find something called the Orbit Stabilization Theorem. This connects the size of the group G to the size of orbits of x and of the stabilizer subgroups. This has the exciting advantage of letting us turn many proofs into counting arguments. A counting argument is just what you think: showing there’s as many of one thing as there are another. here’s a nice page about the Orbit Stabilization Theorem, and how to use it. This includes some nice, easy-to-understand problems like “how many different necklaces could you make with three red, two green, and one blue bead?” Or if that seems too mundane a problem, an equivalent one from organic chemistry: how many isomers of naphthol could there be? You see where these group actions give us useful information about specific problems.

If you should like a more detailed introduction, although one that supposes you’re more conversant with group theory than I do here, this is a good sequence: Group Actions I, which actually defines the things. Group actions II: the orbit-stabilizer theorem, which is about just what it says. Group actions III — what’s the point of them?, which has the sort of snappy title I like, but which gives points that make sense when you’re comfortable talking about quotient groups and isomorphisms and the like. And what I think is the last in the sequence, Group actions IV: intrinsic actions, which is about using group actions to prove stuff. And includes a mention of one of my favorite topics, the points the essay-writer just didn’t get the first time through. (And more; there’s a point where the essay goes wrong, and needs correction. I am not the Joseph who found the problem.)

## My 2018 Mathematics A To Z: Fermat’s Last Theorem

Today’s topic is another request, this one from a Dina. I’m not sure if this is Dina Yagodich, who’d also suggested using the letter ‘e’ for the number ‘e’. Trusting that it is, Dina Yagodich has a YouTube channel of mathematics videos. They cover topics like how to convert degrees and radians to one another, what the chance of a false positive (or false negative) on a medical test is, ways to solve differential equations, and how to use computer tools like MathXL, TI-83/84 calculators, or Matlab. If I’m mistaken, original-commenter Dina, please let me know and let me know if you have any creative projects that should be mentioned here.

# Fermat’s Last Theorem.

It comes to us from number theory. Like many great problems in number theory, it’s easy to understand. If you’ve heard of the Pythagorean Theorem you know, at least, there are triplets of whole numbers so that the first number squared plus the second number squared equals the third number squared. It’s easy to wonder about generalizing. Are there quartets of numbers, so the squares of the first three add up to the square of the fourth? Quintuplets? Sextuplets? … Oh, yes. That’s easy. What about triplets of whole numbers, including negative numbers? Yeah, and that turns out to be boring. Triplets of rational numbers? Turns out to be the same as triplets of whole numbers. Triplets of real-valued numbers? Turns out to be very boring. Triplets of complex-valued numbers? Also none too interesting.

Ah, but, what about a triplet of numbers, only raised to some other power? All three numbers raised to the first power is easy; we call that addition. To the third power, though? … The fourth? Any other whole number power? That’s hard. It’s hard finding, for any given power, a trio of numbers that work, although some come close. I’m informed there was an episode of The Simpsons which included, as a joke, the equation $1782^{12} + 1841^{12} = 1922^{12}$. If it were true, this would be enough to show Fermat’s Last Theorem was false. … Which happens. Sometimes, mathematicians believe they have found something which turns out to be wrong. Often this comes from noticing a pattern, and finding a proof for a specific case, and supposing the pattern holds up. This equation isn’t true, but it is correct for the first nine digits. An episode of The Wizard of Evergreen Terrace puts forth $3987^{12} + 4365^{12} = 4472^{12}$, which apparently matches ten digits. This includes the final digit, also known as “the only one anybody could check”. (The last digit of 398712 is 1. Last digit of 436512 is 5. Last digit of 447212 is 6, and there you go.) Really makes you think there’s something weird going on with 12th powers.

For a Fermat-like example, Leonhard Euler conjectured a thing about “Sums of Like Powers”. That for a whole number ‘n’, you need at least n whole numbers-raised-to-an-nth-power to equal something else raised to an n-th power. That is, you need at least three whole numbers raised to the third power to equal some other whole number raised to the third power. At least four whole numbers raised to the fourth power to equal something raised to the fourth power. At least five whole numbers raised to the fifth power to equal some number raised to the fifth power. Euler was wrong, in this case. L J Lander and T R Parkin published, in 1966, the one-paragraph paper Counterexample to Euler’s Conjecture on Sums of Like Powers. $27^5 + 84^5 + 110^5 + 133^5 = 144^5$ and there we go. Thanks, CDC 6600 computer!

But Fermat’s hypothesis. Let me put it in symbols. It’s easier than giving everything long, descriptive names. Suppose that the power ‘n’ is a whole number greater than 2. Then there are no three counting numbers ‘a’, ‘b’, and ‘c’ which make true the equation $a^n + b^n = c^n$. It looks doable. It looks like once you’ve mastered high school algebra you could do it. Heck, it looks like if you know the proof about how the square root of two is irrational you could approach it. Pierre de Fermat himself said he had a wonderful little proof of it.

He was wrong. No shame in that. He was right about a lot of mathematics, including a lot of stuff that leads into the basics of calculus. And he was right in his feeling that this $a^n + b^n = c^n$ stuff was impossible. He was wrong that he had a proof. At least not one that worked for every possible whole number ‘n’ larger than 2.

For specific values of ‘n’, though? Oh yes, that’s doable. Fermat did it himself for an ‘n’ of 4. Euler, a century later, filed in ‘n’ of 3. Peter Dirichlet, a great name in number theory and analysis, and Joseph-Louis Lagrange, who worked on everything, proved the case of ‘n’ of 5. Dirichlet, in 1832, proved the case for ‘n’ of 14. And there were more partial solutions. You could show that if Fermat’s Last Theorem were ever false, it would have to be false for some prime-number value of ‘n’. That’s great work, answering as it does infinitely many possible cases. It just leaves … infinitely many to go.

And that’s how things went for centuries. I don’t know that every mathematician made some attempt on Fermat’s Last Theorem. But it seems hard to imagine a person could love mathematics enough to spend their lives doing it and not at least take an attempt at it. Nobody ever found it, though. In a 1989 episode of Star Trek: The Next Generation, Captain Picard muses on how eight centuries after Fermat nobody’s proven his theorem. This struck me at the time as too pessimistic. Granted humans were stumped for 400 years. But for 800 years? And stumping everyone in a whole Federation of a thousand worlds? And more than a thousand mathematical traditions? And, for some of these species, tens of thousands of years of recorded history? … Still, there wasn’t much sign of the solving the problem. In 1992 Analog Science Fiction Magazine published a funny short-short story by Ian Randal Strock, “Fermat’s Legacy”. In it, Fermat — jealous of figures like René Descartes and Blaise Pascal who upstaged his mathematical accomplishments — jots down the note. He figures an unsupported claim like that will earn true lasting fame.

So that takes us to 1993, when the world heard about elliptic integrals for the first time. Elliptic curves are neat things. They’re polynomials. They have some nice mathematical properties. People first noticed them in studying how long arcs of ellipses are. (This is why they’re called elliptic curves, even though most of them have nothing to do with any ellipse you’d ever tolerate in your presence.) They look ready to use for encryption. And in 1985, Gerhard Frey noticed something. Suppose you did have, for some ‘n’ bigger than 2, a solution $a^n + b^n = c^n$. Then you could use that a, b, and n to make a new elliptic curve. That curve is the one that satisfies $y^2 = x\cdot\left(x - a^n\right)\cdot\left(x + b^n\right)$. And then that elliptic curve would not be “modular”.

I would like to tell you what it means for an elliptic curve to be modular. But getting to that point would take at least four subsidiary essays. MathWorld has a description of what it means to be modular, and even links to explaining terms like “meromorphic”. It’s getting exotic stuff.

Frey didn’t show whether elliptic curves of this time had to be modular or not. This is normal enough, for mathematicians. You want to find things which are true and interesting. This includes conjectures like this, that if elliptic curves are all modular then Fermat’s Last Theorem has to e true. Frey was working on consequences of the Taniyama-Shimura Conjecture, itself three decades old at that point. Yutaka Taniyama and Goro Shimura had found there seemed to be a link between elliptic curves and these “modular forms”, which are a kind of group. That is, a group-theory thing.

So in fall of 1993 I was taking an advanced, though still undergraduate, course in (not-high-school) algebra at Rutgers. It’s where we learn group theory, after Intro to Algebra introduced us to group theory. Some exciting news came out. This fellow named Andrew Wiles at Princeton had shown an impressive bunch of things. Most important, that the Taniyama-Shimura Conjecture was true for semistable elliptic curves. This includes the kind of elliptic curve Frey made out of solutions to Fermat’s Last Theorem. So the curves based on solutions to Fermat’s Last Theorem would have be modular. But Frey had shown any curves based on solutions to Fermat’s Last Theorem couldn’t be modular. The conclusion: there can’t be any solutions to Fermat’s Last Theorem. Our professor did his best to explain the proof to us. Abstract Algebra was the undergraduate course closest to the stuff Wiles was working on. It wasn’t very close. When you’re still trying to work out what it means for something to be an ideal it’s hard to even follow the setup of the problem. The proof itself was inaccessible.

Which is all right. Wiles’s original proof had some flaws. At least this mathematics major shrugged when that news came down and wondered, well, maybe it’ll be fixed someday. Maybe not. I remembered how exciting cold fusion was for about six weeks, too. But this someday didn’t take long. Wiles, with Richard Taylor, revised the proof and published about a year later. So far as I’m aware, nobody has any serious qualms about the proof.

So does knowing Fermat’s Last Theorem get us anything interesting? … And here is a sad anticlimax. It’s neat to know that $a^n + b^n = c^n$ can’t be true unless ‘n’ is 1 or 2, at least for positive whole numbers. But I’m not aware of any neat results that follow from that, or that would follow if it were untrue. There are results that follow from the Taniyama-Shimura Conjecture that are interesting, according to people who know them and don’t seem to be fibbing me. But Fermat’s Last Theorem turns out to be a cute little aside.

Which is not to say studying it was foolish. This easy-to-understand, hard-to-solve problem certainly attracted talented minds to think about mathematics. Mathematicians found interesting stuff in trying to solve it. Some of it might be slight. I learned that in a Pythagorean triplet — ‘a’, ‘b’, and ‘c’ with $a^2 + b^2 = c^2$ — that I was not the infinitely brilliant mathematician at age fifteen I hoped I might be. Also that if ‘a’, ‘b’, and ‘c’ are relatively prime, you can’t have ‘a’ and ‘b’ both odd and ‘c’ even. You had to have ‘c’ and either ‘a’ or ‘b’ odd, with the other number even. Other mathematicians of more nearly infinite ability found stuff of greater import. Ernst Eduard Kummer in the 19th century developed ideals. These are an important piece of group theory. He was busy proving special cases of Fermat’s Last Theorem.

Kind viewers have tried to retcon Picard’s statement about Fermat’s Last Theorem. They say Picard was really searching for the proof Fermat had, or believed he had. Something using the mathematical techniques available to the early 17th century. Or that follow closely enough from that. The Taniyama-Shimura Conjecture definitely isn’t it. I don’t buy the retcon, but I’m willing to play along for the sake of not causing trouble. I suspect there’s not a proof of the general case that uses anything Fermat could have recognized, or thought he had. That’s all right. The search for a thing can be useful even if the thing doesn’t exist.

## My 2018 Mathematics A To Z: e

I’m back to requests! Today’s comes from commenter Dina Yagodich. I don’t know whether Yagodich has a web site, YouTube channel, or other mathematics-discussion site, but am happy to pass along word if I hear of one.

# e.

Let me start by explaining integral calculus in two paragraphs. One of the things done in it is finding a definite integral’. This is itself a function. The definite integral has as its domain the combination of a function, plus some boundaries, and its range is numbers. Real numbers, if nobody tells you otherwise. Complex-valued numbers, if someone says it’s complex-valued numbers. Yes, it could have some other range. But if someone wants you to do that they’re obliged to set warning flares around the problem and precede and follow it with flag-bearers. And you get at least double pay for the hazardous work. The function that gets definite-integrated has its own domain and range. The boundaries of the definite integral have to be within the domain of the integrated function.

For real-valued functions this definite integral has a great physical interpretation. A real-valued function means the domain and range are both real numbers. You see a lot of these. Call the function ‘f’, please. Call its independent variable ‘x’ and its dependent variable ‘y’. Using Euclidean coordinates, or as normal people call it “graph paper”, draw the points that make true the equation “y = f(x)”. Then draw in the x-axis, that is, the points where “y = 0”. The boundaries of the definite integral are going to be two values of ‘x’, a lower and an upper bound. Call that lower bound ‘a’ and the upper bound ‘b’. And heck, call that a “left boundary” and a “right boundary”, because … I mean, look at them. Draw the vertical line at “x = a” and the vertical line at “x = b”. If ‘f(x)’ is always a positive number, then there’s a shape bounded below by “y = 0”, on the left by “x = a”, on the right by “x = b”, and above by “y = f(x)”. And the definite integral is the area of that enclosed space. If ‘f(x)’ is sometimes zero, then there’s several segments, but their combined area is the definite integral. If ‘f(x)’ is sometimes below zero, then there’s several segments. The definite integral is the sum of the areas of parts above “y = 0” minus the area of the parts below “y = 0”.

(Why say “left boundary” instead of “lower boundary”? Taste, pretty much. But I look at the words “lower boundary” and think about the lower edge, that is, the line where “y = 0” here. And “upper boundary” makes sense as a way to describe the curve where “y = f(x)” as well as “x = b”. I’m confusing enough without making the simple stuff ambiguous.)

Don’t try to pass your thesis defense on this alone. But it’s what you need to understand ‘e’. Start out with the function ‘f’, which has domain of the positive real numbers and range of the positive real numbers. For every ‘x’ in the domain, ‘f(x)’ is the reciprocal, one divided by x. This is a shape you probably know well. It’s a hyperbola. Its asymptotes are the x-axis and the y-axis. It’s a nice gentle curve. Its plot passes through such famous points as (1, 1), (2, 1/2), (1/3, 3), and pairs like that. (10, 1/10) and (1/100, 100) too. ‘f(x)’ is always positive on this domain. Use as left boundary the line “x = 1”. And then — let’s think about different right boundaries.

If the right boundary is close to the left boundary, then this area is tiny. If it’s at, like, “x = 1.1” then the area can’t be more than 0.1. (It’s less than that. If you don’t see why that’s so, fit a rectangle of height 1 and width 0.1 around this curve and these boundaries. See?) But if the right boundary is farther out, this area is more. It’s getting bigger if the right boundary is “x = 2” or “x = 3”. It can get bigger yet. Give me any positive number you like. I can find a right boundary so the area inside this is bigger than your number.

Is there a right boundary where the area is exactly 1? … Well, it’s hard to see how there couldn’t be. If a quantity (“area between x = 1 and x = b”) changes from less than one to greater than one, it’s got to pass through 1, right? … Yes, it does, provided some technical points are true, and in this case they are. So that’s nice.

And there is. It’s a number (settle down, I see you quivering with excitement back there, waiting for me to unveil this) a slight bit more than 2.718. It’s a neat number. Carry it out a couple more digits and it turns out to be 2.718281828. So it looks like a great candidate to memorize. It’s not. It’s an irrational number. The digits go off without repeating or falling into obvious patterns after that. It’s a transcendental number, which has to do with polynomials. Nobody knows whether it’s a normal number, because remember, a normal number is just any real number that you never heard of. To be a normal number, every finite string of digits has to appear in the decimal expansion, just as often as every other string of digits of the same length. We can show by clever counting arguments that roughly every number is normal. Trick is it’s hard to show that any particular number is.

So let me do another definite integral. Set the left boundary to this “x = 2.718281828(etc)”. Set the right boundary a little more than that. The enclosed area is less than 1. Set the right boundary way off to the right. The enclosed area is more than 1. What right boundary makes the enclosed area ‘1’ again? … Well, that will be at about “x = 7.389”. That is, at the square of 2.718281828(etc).

Repeat this. Set the left boundary at “x = (2.718281828etc)2”. Where does the right boundary have to be so the enclosed area is 1? … Did you guess “x = (2.718281828etc)3”? Yeah, of course. You know my rhetorical tricks. What do you want to guess the area is between, oh, “x = (2.718281828etc)3” and “x = (2.718281828etc)5”? (Notice I put a ‘5’ in the superscript there.)

Now, relationships like this will happen with other functions, and with other left- and right-boundaries. But if you want it to work with a function whose rule is as simple as “f(x) = 1 / x”, and areas of 1, then you’re going to end up noticing this 2.718281828(etc). It stands out. It’s worthy of a name.

Which is why this 2.718281828(etc) is a number you’ve heard of. It’s named ‘e’. Leonhard Euler, whom you will remember as having written or proved the fundamental theorem for every area of mathematics ever, gave it that name. He used it first when writing for his own work. Then (in November 1731) in a letter to Christian Goldbach. Finally (in 1763) in his textbook Mechanica. Everyone went along with him because Euler knew how to write about stuff, and how to pick symbols that worked for stuff.

Once you know ‘e’ is there, you start to see it everywhere. In Western mathematics it seems to have been first noticed by Jacob (I) Bernoulli, who noticed it in toy compound interest problems. (Given this, I’d imagine it has to have been noticed by the people who did finance. But I am ignorant of the history of financial calculations. Writers of the kind of pop-mathematics history I read don’t notice them either.) Bernoulli and Pierre Raymond de Montmort noticed the reciprocal of ‘e’ turning up in what we’ve come to call the ‘hat check problem’. A large number of guests all check one hat each. The person checking hats has no idea who anybody is. What is the chance that nobody gets their correct hat back? … That chance is the reciprocal of ‘e’. The number’s about 0.368. In a connected but not identical problem, suppose something has one chance in some number ‘N’ of happening each attempt. And it’s given ‘N’ attempts given for it to happen. What’s the chance that it doesn’t happen? The bigger ‘N’ gets, the closer the chance it doesn’t happen gets to the reciprocal of ‘e’.

It comes up in peculiar ways. In high school or freshman calculus you see it defined as what you get if you take $\left(1 + \frac{1}{x}\right)^x$ for ever-larger real numbers ‘x’. (This is the toy-compound-interest problem Bernoulli found.) But you can find the number other ways. Tou can calculate it — if you have the stamina — by working out the value of

$1 + 1 + \frac12\left( 1 + \frac13\left( 1 + \frac14\left( 1 + \frac15\left( 1 + \cdots \right)\right)\right)\right)$

There’s a simpler way to write that. There always is. Take all the nonnegative whole numbers — 0, 1, 2, 3, 4, and so on. Take their factorials. That’s 1, 1, 2, 6, 24, and so on. Take the reciprocals of all those. That’s … 1, 1, one-half, one-sixth, one-twenty-fourth, and so on. Add them all together. That’s ‘e’.

This ‘e’ turns up all the time. Any system whose rate of growth depends on its current value has an ‘e’ lurking in its description. That’s true if it declines, too, as long as the decline depends on its current value. It gets stranger. Cross ‘e’ with complex-valued numbers and you get, not just growth or decay, but oscillations. And many problems that are hard to solve to start with become doable, even simple, if you rewrite them as growths and decays and oscillations. Through ‘e’ problems too hard to do become problems of polynomials, or even simpler things.

Simple problems become that too. That property about the area underneath “f(x) = 1/x” between “x = 1” and “x = b” makes ‘e’ such a natural base for logarithms that we call it the base for natural logarithms. Logarithms let us replace multiplication with addition, and division with subtraction, easier work. They change exponentiation problems to multiplication, again easier. It’s a strange touch, a wondrous one.

There are some numbers interesting enough to attract books about them. π, obviously. 0. The base of imaginary numbers, $\imath$, has a couple. I only know one pop-mathematics treatment of ‘e’, Eli Maor’s e: The Story Of A Number. I believe there’s room for more.

Oh, one little remarkable thing that’s of no use whatsoever. Mathworld’s page about approximations to ‘e’ mentions this. Work out, if you can coax your calculator into letting you do this, the number:

$\left(1 + 9^{-(4^{(42)})}\right)^{\left(3^{(2^{85})}\right)}$

You know, the way anyone’s calculator will let you raise 2 to the 85th power. And then raise 3 to whatever number that is. Anyway. The digits of this will agree with the digits of ‘e’ for the first 18,457,734,525,360,901,453,873,570 decimal digits. One Richard Sabey found that, by what means I do not know, in 2004. The page linked there includes a bunch of other, no less amazing, approximations to numbers like ‘e’ and π and the Euler-Mascheroni Constant.

## My 2018 Mathematics A To Z: Distribution (probability)

Today’s term ended up being a free choice. Nobody found anything appealing in the D’s to ask about. That’s all right.

I’m still looking for topics for the letters G through M, excluding L, if you’d like in on those letters.

And for my own sake, please check out the Playful Mathematics Education Blog Carnival, #121, if you haven’t already.

# Distribution (probability).

I have to specify. There’s a bunch of mathematics concepts called `distribution’. Some of them are linked. Some of them are just called that because we don’t have a better word. Like, what else would you call multiplying the sum of something? I want to describe a distribution that comes to us in probability and in statistics. Through these it runs through modern physics, as well as truly difficult sciences like sociology and economics.

We get to distributions through random variables. These are variables that might be any one of multiple possible values. There might be as few as two options. There might be a finite number of possibilities. There might be infinitely many. They might be numbers. At the risk of sounding unimaginative, they often are. We’re always interested in measuring things. And we’re used to measuring them in numbers.

What makes random variables hard to deal with is that, if we’re playing by the rules, we never know what it is. Once we get through (high school) algebra we’re comfortable working with an ‘x’ whose value we don’t know. But that’s because we trust that, if we really cared, we would find out what it is. Or we would know that it’s a ‘dummy variable’, whose value is unimportant but gets us to something that is. A random variable is different. Its value matters, but we can’t know what it is.

Instead we get a distribution. This is a function which gives us information about what the outcomes are, and how likely they are. There are different ways to organize this data. If whoever’s talking about it doesn’t say just what they’re doing, bet on it being a “probability distribution function”. This follows slightly different rules based on whether the range of values is discrete or continuous, but the idea is roughly the same. Every possible outcome has a probability at least zero but not more than one. The total probability over every possible outcome is exactly one. There’s rules about the probability of two distinct outcomes happening. Stuff like that.

Distributions are interesting enough when they’re about fixed things. In learning probability this is stuff like hands of cards or totals of die rolls or numbers of snowstorms in the season. Fun enough. These get to be more personal when we take a census, or otherwise sample things that people do. There’s something wondrous in knowing that while, say, you might not know how long a commute your neighbor has, you know there’s an 80 percent change it’s between 15 and 25 minutes (or whatever). It’s also good for urban planners to know.

It gets exciting when we look at how distributions can change. It’s hard not to think of that as “changing over time”. (You could make a fair argument that “change” is “time”.) But it doesn’t have to. We can take a function with a domain that contains all the possible values in the distribution, and a range that’s something else. The image of the distribution is some new distribution. (Trusting that the function doesn’t do something naughty.) These functions — these mappings — might reflect nothing more than relabelling, going from (say) a distribution of “false and true” values to one of “-5 and 5” values instead. They might reflect regathering data; say, going from the distribution of a die’s outcomes of “1, 2, 3, 4, 5, or 6” to something simpler, like, “less than two, exactly two, or more than two”. Or they might reflect how something does change in time. They’re all mappings; they’re all ways to change what a distribution represents.

These mappings turn up in statistical mechanics. Processes will change the distribution of positions and momentums and electric charges and whatever else the things moving around do. It’s hard to learn. At least my first instinct was to try to warm up to it by doing a couple test cases. Pick specific values for the random variables and see how they change. This can help build confidence that one’s calculating correctly. Maybe give some idea of what sorts of behaviors to expect.

But it’s calculating the wrong thing. You need to look at the distribution as a specific thing, and how that changes. It’s a change of view. It’s like the change in view from thinking of a position as an x- and y- and maybe z-coordinate to thinking of position as a vector. (Which, I realize now, gave me slightly similar difficulties in thinking of what to do for any particular calculation.)

Distributions can change in time, just the way that — in simpler physics — positions might change. Distributions might stabilize, forming an equilibrium. This can mean that everything’s found a place to stop and rest. That will never happen for any interesting problem. What you might get is an equilibrium like the rings of Saturn. Everything’s moving, everything’s changing, but the overall shape stays the same. (Roughly.)

There are many specifically named distributions. They represent patterns that turn up all the time. The binomial distribution, for example, which represents what to expect if you have a lot of examples of something that can be one of two values each. The Poisson distribution, for representing how likely something that could happen any time (or any place) will happen in a particular span of time (or space). The normal distribution, also called the Gaussian distribution, which describes everything that isn’t trying to be difficult. There are like 400 billion dozen more named ones, each really good at describing particular kinds of problems. But they’re all distributions.