My 2018 Mathematics A To Z: Hyperbolic Half-Plane


Today’s term was one of several nominations I got for ‘H’. This one comes from John Golden, @mathhobre on Twitter and author of the Math Hombre blog on Blogspot. He brings in a lot of thought about mathematics education and teaching tools that you might find interesting or useful or, better, both.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Hyperbolic Half-Plane.

The half-plane part is easy to explain. By the “plane” mathematicians mean, well, the plane. What you’d get if a sheet of paper extended forever. Also if it had zero width. To cut it in half … well, first we have to think hard what we mean by cutting an infinitely large thing in half. Then we realize we’re overthinking this. Cut it by picking a line on the plane, and then throwing away everything on one side or the other of that line. Maybe throw away everything on the line too. It’s logically as good to pick any line. But there are a couple lines mathematicians use all the time. This is because they’re easy to describe, or easy to work with. At least once you fix an origin and, with it, x- and y-axes. The “right half-plane”, for example, is everything in the positive-x-axis direction. Every point with coordinates you’d describe with positive x-coordinate values. Maybe the non-negative ones, if you want the edge included. The “upper half plane” is everything in the positive-y-axis direction. All the points whose coordinates have a positive y-coordinate value. Non-negative, if you want the edge included. You can make guesses about what the “left half-plane” or the “lower half-plane” are. You are correct.

The “hyperbolic” part takes some thought. What is there to even exaggerate? Wrong sense of the word “hyperbolic”. The word here is the same one used in “hyperbolic geometry”. That takes explanation.

The Western mathematics tradition, as we trace it back to Ancient Greece and Ancient Egypt and Ancient Babylon and all, gave us “Euclidean” geometry. It’s a pretty good geometry. It describes how stuff on flat surfaces works. In the Euclidean formulation we set out a couple of axioms that aren’t too controversial. Like, that lines can be extended indefinitely and that all right angles are congruent. And one axiom that is controversial. But which turns out to be equivalent to the idea that there’s only one line that goes through a point and is parallel to some other line.

And it turns out that you don’t have to assume that. You can make a coherent “spherical” geometry, one that describes shapes on the surface of a … you know. You have to change your idea of what a line is; it becomes a “geodesic” or, on the globe, a “great circle”. And it turns out that there are no geodesics that go through a point and that are parallel to some other geodesic. (I know you want to think about globes. I do too. You maybe want to say the lines of latitude are parallel to one another. They’re even called parallels, sometimes. So they are. But they’re not geodesics. They’re “little circles”. I am not throwing in ad hoc reasons I’m right and you’re not.)

There is another, though. This is “hyperbolic” geometry. This is the way shapes work on surfaces that mathematicians call saddle-shaped. I don’t know what the horse enthusiasts out there call these shapes. My guess is they chuckle and point out how that would be the most painful saddle ever. Doesn’t matter. We have surfaces. They act weird. You can draw, through a point, infinitely many lines parallel to a given other line.

That’s some neat stuff. That’s weird and interesting. They’re even called “hyperparallel lines” if that didn’t sound great enough. You can see why some people would find this worth studying. The catch is that it’s hard to order a pad of saddle-shaped paper to try stuff out on. It’s even harder to get a hyperbolic blackboard. So what we’d like is some way to represent these strange geometries using something easier to work with.

The hyperbolic half-plane is one of those approaches. This uses the upper half-plane. It works by a move as brilliant and as preposterous as that time Q told Data and LaForge how to stop that falling moon. “Simple. Change the gravitational constant of the universe.”

What we change here is the “metric”. The metric is a function. It tells us something about how points in a space relate to each other. It gives us distance. In Euclidean geometry, plane geometry, we use the Euclidean metric. You can find the distance between point A and point B by looking at their coordinates, (x_A, y_A) and (x_B, y_B) . This distance is \sqrt{\left(x_B - x_A\right)^2 + \left(y_B - y_A\right)^2} . Don’t worry about the formulas. The lines on a sheet of graph paper are a reflection of this metric. Each line is (normally) a fixed distance from its parallel neighbors. (Yes, there are polar-coordinate graph papers. And there are graph papers with logarithmic or semilogarithmic spacing. I mean graph paper like you can find at the office supply store without asking for help.)

But the metric is something we choose. There are some rules it has to follow to be logically coherent, yes. But those rules give us plenty of room to play. By picking the correct metric, we can make this flat plane obey the same geometric rules as the hyperbolic surface. This metric looks more complicated than the Euclidean metric does, but only because it has more terms and takes longer to write out. What’s important about it is that the distance your thumb put on top of the paper covers up is bigger if your thumb is near the bottom of the upper-half plane than if your thumb is near the top of the paper.
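
If you want to poke at this numerically, here’s a little Python sketch. It uses what I understand to be the standard distance formula for this model, the one built on the inverse hyperbolic cosine; the function name and the sample points are my own choices, not anything canonical.

```python
import math

def hyperbolic_distance(p, q):
    """Distance between p = (x1, y1) and q = (x2, y2) in the upper half-plane."""
    (x1, y1), (x2, y2) = p, q
    # Both points need positive y-coordinates to be in the half-plane.
    return math.acosh(1 + ((x2 - x1)**2 + (y2 - y1)**2) / (2 * y1 * y2))

# The same Euclidean gap covers more hyperbolic distance near the bottom edge:
print(hyperbolic_distance((0, 10), (1, 10)))    # high up: about 0.1
print(hyperbolic_distance((0, 0.1), (1, 0.1)))  # near the edge: about 4.6
```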

So. There are now two things that are “lines” in this. One of them is vertical lines. The graph paper we would make for this has a nice file of parallel lines like ordinary paper does. The other thing, though … well, that’s half-circles. They’re half-circles with a center on the edge of the half-plane. So our graph paper would also have a bunch of circles, of different sizes, coming from regularly-spaced sources on the bottom of the paper. A line segment is a piece of either these vertical lines or these half-circles. You can make any polygon you like with these, if you pick out enough line segments. They’re there.
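
Here’s a sketch, in Python, of finding which of those two kinds of line passes through a given pair of points. The only geometry in it is the demand that the half-circle’s center sit on the x-axis, equally far from both points; the rest is bookkeeping I made up.

```python
import math

def geodesic_through(p, q):
    """Describe the half-plane geodesic through points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return f"the vertical line x = {x1}"
    # The center (c, 0) must be equidistant from both points:
    c = (x2**2 + y2**2 - x1**2 - y1**2) / (2 * (x2 - x1))
    r = math.hypot(x1 - c, y1)
    return f"the half-circle centered at ({c}, 0) with radius {r:.4f}"

print(geodesic_through((0, 1), (0, 5)))    # a vertical line
print(geodesic_through((-1, 1), (2, 2)))   # a half-circle centered at (1, 0)
```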

There are many ways to represent hyperbolic surfaces. This is one of them. It’s got some nice properties. One of them is that it’s “conformal”. Angles that you draw using this metric are the same size as those on the corresponding hyperbolic surface. You don’t appreciate how sweet that is until you’re working in non-Euclidean geometries. Circles that are entirely within the hyperbolic half-plane match to circles on a hyperbolic surface. Once you’ve got your intuition for this hyperbolic half-plane, you can step into hyperbolic half-volumes. And that lets you talk about the geometry of hyperbolic spaces that reach into four or more dimensions of human-imaginable spaces. Isometries — picking up a shape and moving it in ways that don’t change distance — match up with the Möbius Transformations. These are a well-understood set of transformations of the plane that comes from a different corner of geometry. Also from that fellow with the strip, August Ferdinand Möbius. It’s always exciting to find relationships like that in mathematical structures.
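
You can check the isometry claim numerically, at least for examples. A sketch: Möbius transformations with real coefficients a, b, c, d and with ad - bc positive map the upper half-plane to itself, and they shouldn’t change the distance between points. The particular numbers below mean nothing; any choice with ad - bc positive should do.

```python
import math

def hyp_dist(z, w):
    """Upper-half-plane distance between complex numbers z and w."""
    return math.acosh(1 + abs(z - w)**2 / (2 * z.imag * w.imag))

def mobius(z, a, b, c, d):
    return (a * z + b) / (c * z + d)

z, w = 0.5 + 2j, 3 + 0.25j
a, b, c, d = 2, 1, 1, 3            # ad - bc = 5, which is positive
print(hyp_dist(z, w))
print(hyp_dist(mobius(z, a, b, c, d), mobius(w, a, b, c, d)))  # the same, up to rounding
```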

Pictures often help. I don’t know why I don’t include them. But here is a web site with pages, and pictures, that describe much of the hyperbolic half-plane. It includes code to use with the Geometer’s Sketchpad software, which I have never used and know nothing about. That’s all right. There’s at least one page there showing a wondrous picture. I hope you enjoy.


This and other essays in the Fall 2018 A-To-Z should be at this link. And I’ll start asking for more letters soon.


My 2018 Mathematics A To Z: Group Action


I got several great suggestions for topics for ‘G’. The one that most caught my imagination was mathtuition88’s, the group action. Mathtuition88 is run by Mr Wu, a mathematics tutor in Singapore. His mathematics blog recounts his own explorations of interesting topics.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Group Action.

This starts from groups. A group, here, means a pair of things. The first thing is a set of elements. The second is some operation. It takes a pair of things in the set and matches it to something in the set. For example, try the integers as the set, with addition as the operation. There are many kinds of groups you can make. There can be finite groups, ones with as few as one element or as many as you like. (The one-element groups are so boring. We usually need at least two to have much to say about them.) There can be infinite groups, like the integers. There can be discrete groups, where there’s always some minimum distance between elements. There can be continuous groups, like the real numbers, where there’s no smallest distance between distinct elements.

Groups came about from looking at how numbers work. So the first examples anyone gets are based on numbers. The integers, especially, and then the integers modulo something. For example, there’s Z_2 , which has two numbers, 0 and 1. Addition works by the rule that 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, and 1 + 1 = 0. There’s similar rules for Z_3 , which has three numbers, 0, 1, and 2.
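
If it helps to see that “a set of elements plus an operation” really is all there is to it, here’s a minimal Python sketch. The names are mine; the content is nothing but the addition rule above.

```python
def make_z_n(n):
    """The set {0, 1, ..., n-1} together with addition modulo n."""
    elements = list(range(n))
    def add(a, b):
        return (a + b) % n
    return elements, add

elements, add = make_z_n(2)
for a in elements:
    for b in elements:
        print(f"{a} + {b} = {add(a, b)}")   # reproduces the Z_2 table above
```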

But after a few comfortable minutes on this, group theory moves on to more abstract things. Things with names like the “permutation group”. This starts with some set of things and we don’t even care what the things are. They can be numbers. They can be letters. They can be places. They can be anything. We don’t care. The group is all of the ways to swap elements around. All the relabellings we can do without losing or gaining an item. Or another, the “symmetry group”. This is, for some given thing — plates, blocks, and wallpaper patterns are great examples — all the ways you can rotate or move or reflect the thing without changing the way it looks.

And now we’re creeping up on what a “group action” is. Let me just talk about permutations here. These are where you swap around items. Like, start out with a list of items “1 2 3 4”. And pick out a permutation, say, swap the second with the fourth item. We write that, in shorthand, as (2 4). Maybe another permutation too. Say, swap the first item with the third. Write that out as (1 3). We can multiply these permutations together. Doing these permutations, in this order, has a particular effect: it swaps the second and fourth items, and swaps the first and third items. This is another permutation on these four items.

These permutations, these “swap this item with that” rules, are a group. The set for the group is instructions like “swap this with that”, or “swap this with that, and that with this other thing, and this other thing with the first thing”. Or even “leave this thing alone”. The operation between two things in the set is, do one and then the other. For example, (2 3) and then (3 4) has the effect of moving the second thing to the fourth spot, the (original) fourth thing to the third spot, and the original third thing to the second spot. That is, it’s the permutation (2 4 3). If you ever need something to doodle during a slow meeting, try working out all the ways you can shuffle around, say, six things. And what happens as you do all the possible combinations of these things. Hey, you’re only permuting six items. How many ways could that be?
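
You can check that composition with a couple lines of Python. This applies the swaps to a concrete list, so the cycle can be read right off; the helper function is my own.

```python
def swap(items, i, j):
    """Apply the transposition (i j), with 1-based positions as in the text."""
    items = list(items)
    items[i - 1], items[j - 1] = items[j - 1], items[i - 1]
    return items

start = ['a', 'b', 'c', 'd']
after_23 = swap(start, 2, 3)       # ['a', 'c', 'b', 'd']
after_34 = swap(after_23, 3, 4)    # ['a', 'c', 'd', 'b']
print(after_34)
# 'b' went from spot 2 to spot 4, 'd' from spot 4 to spot 3, and 'c' from
# spot 3 to spot 2: the cycle (2 4 3).
```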

So here’s what sounds like a fussy point. The group here is made up of the ways you can permute these items. The items aren’t part of the group. They just gave us something to talk about. This is where I got so confused, as an undergraduate, working out groups and group actions.

When we move back to talking about the original items, then we get a group action. You get a group action by putting together a group with some set of things. Let me call the group ‘G’ and the set ‘X’. If I need something particular in the group I’ll call that ‘g’. If I need something particular from the set ‘X’ I’ll call that ‘x’. This is fairly standard mathematics notation. You see how subtly clever this notation is. The group action comes from taking things in G and applying them to things in X, to get things in X. Usually other things, but not always. In the lingo, we say the group action maps the pair of things G and X to the set X.

There are rules these actions have to follow. They’re what you would expect, if you’ve done any fiddling with groups. Don’t worry about them. What’s interesting is what we get from group actions.

First is group orbits. Take some ‘g’ out of the group G. Take some ‘x’ out of the set ‘X’. And build this new set. First, x. Then, whatever g does to x, which we write as ‘gx’. But ‘gx’ is still something in ‘X’, so … what does g do to that? So toss in ‘ggx’. Which is still something in ‘X’, so, toss in ‘gggx’. And ‘ggggx’. And keep going, until you stop getting new things. If ‘X’ is finite, this sequence has to be finite. It might be the whole set of X. It might be some subset of X. But if ‘X’ is finite, it’ll get back, eventually, to where you started, which is why we call this the “group orbit”. We use the same term even if X isn’t finite and we can’t guarantee that all these iterations of g on x eventually get back to the original x. This is a subset of X, traced out by the group’s action. (The full orbit of x tosses in what every element of G does to x, not just the powers of one ‘g’.)

There can be other special subgroups. Like, are there elements ‘g’ that map ‘x’ to ‘x’? Sure. There has to be at least one, since the group G has an identity element. There might be others. So, for any given ‘x’, what are all the elements ‘g’ that don’t change it? The set of all the values of g for which gx is x is the “isotropy group” G_x . Or the “stabilizer subgroup”. This is a subgroup of G, based on x.
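
Here’s a sketch of orbits and stabilizers for a group small enough to write out whole: the eight symmetries of a square, acting on the corners 0 through 3. The encoding, where g[i] says where corner i goes, is a choice I made for the example.

```python
rotations = [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]
reflections = [(3, 2, 1, 0), (1, 0, 3, 2), (0, 3, 2, 1), (2, 1, 0, 3)]
group = rotations + reflections     # the eight symmetries of a square

def act(g, x):
    return g[x]                     # g[i] says where corner i goes

x = 0
orbit = {act(g, x) for g in group}
stabilizer = [g for g in group if act(g, x) == x]
print(orbit)        # {0, 1, 2, 3}: every corner is reachable from corner 0
print(stabilizer)   # the identity and one diagonal reflection
print(len(group) == len(orbit) * len(stabilizer))   # True: 8 = 4 * 2, a hint of a theorem below
```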

Yes, but the point?

Well, the biggest thing we get from group actions is the chance to put group theory principles to work on specific things. A group might describe the ways you can rotate or reflect a square plate without leaving an obvious change in the plate. The group action lets you make this about the plate. Much of modern physics is about learning how the geometry of a thing affects its behavior. This can be the obvious sorts of geometry, like, whether it’s rotationally symmetric. But it can be subtler things, like, whether the forces in the system are different at different times. Group actions let us put what we know from geometry and topology to work in specifics.

A particular favorite of mine is that they let us express the wallpaper groups. These are the ways we can use rotations and reflections and translations (linear displacements) to create different patterns. There are fewer different patterns than you might have guessed. (Different, here, overlooks such petty things as whether the repeated pattern is a diamond, a flower, or a hexagon. Or whether the pattern repeats every two inches versus every three inches.)

And they stay useful for abstract mathematical problems. All this talk about orbits and stabilizers lets us find something called the Orbit-Stabilizer Theorem. This connects the size of the group G to the size of orbits of x and of the stabilizer subgroups. This has the exciting advantage of letting us turn many proofs into counting arguments. A counting argument is just what you think: showing there’s as many of one thing as there are another. Here’s a nice page about the Orbit-Stabilizer Theorem, and how to use it. This includes some nice, easy-to-understand problems like “how many different necklaces could you make with three red, two green, and one blue bead?” Or if that seems too mundane a problem, an equivalent one from organic chemistry: how many isomers of naphthol could there be? You see where these group actions give us useful information about specific problems.
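
The necklace question is small enough to brute-force, which makes a nice check on the theorem-based counting. This sketch counts arrangements up to rotation and reflection; drop the reversed copies if your notion of a necklace can’t be flipped over.

```python
from itertools import permutations

arrangements = set(permutations('RRRGGB'))    # 60 distinct arrangements

def same_necklace(t):
    """Everything reachable from t by rotating or flipping the ring."""
    n = len(t)
    rotated = [t[i:] + t[:i] for i in range(n)]
    return rotated + [tuple(reversed(r)) for r in rotated]

seen, necklaces = set(), 0
for a in arrangements:
    if a not in seen:
        necklaces += 1
        seen.update(same_necklace(a))
print(necklaces)
```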


If you should like a more detailed introduction, although one that supposes you’re more conversant with group theory than I do here, this is a good sequence: Group Actions I, which actually defines the things. Group actions II: the orbit-stabilizer theorem, which is about just what it says. Group actions III — what’s the point of them?, which has the sort of snappy title I like, but which gives points that make sense when you’re comfortable talking about quotient groups and isomorphisms and the like. And what I think is the last in the sequence, Group actions IV: intrinsic actions, which is about using group actions to prove stuff. And includes a mention of one of my favorite topics, the points the essay-writer just didn’t get the first time through. (And more; there’s a point where the essay goes wrong, and needs correction. I am not the Joseph who found the problem.)

My 2018 Mathematics A To Z: Fermat’s Last Theorem


Today’s topic is another request, this one from a Dina. I’m not sure if this is Dina Yagodich, who’d also suggested using the letter ‘e’ for the number ‘e’. Trusting that it is, Dina Yagodich has a YouTube channel of mathematics videos. They cover topics like how to convert degrees and radians to one another, what the chance of a false positive (or false negative) on a medical test is, ways to solve differential equations, and how to use computer tools like MathXL, TI-83/84 calculators, or Matlab. If I’m mistaken, original-commenter Dina, please let me know and let me know if you have any creative projects that should be mentioned here.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Fermat’s Last Theorem.

It comes to us from number theory. Like many great problems in number theory, it’s easy to understand. If you’ve heard of the Pythagorean Theorem you know, at least, there are triplets of whole numbers so that the first number squared plus the second number squared equals the third number squared. It’s easy to wonder about generalizing. Are there quartets of numbers, so the squares of the first three add up to the square of the fourth? Quintuplets? Sextuplets? … Oh, yes. That’s easy. What about triplets of whole numbers, including negative numbers? Yeah, and that turns out to be boring. Triplets of rational numbers? Turns out to be the same as triplets of whole numbers. Triplets of real-valued numbers? Turns out to be very boring. Triplets of complex-valued numbers? Also none too interesting.

Ah, but, what about a triplet of numbers, only raised to some other power? All three numbers raised to the first power is easy; we call that addition. To the third power, though? … The fourth? Any other whole number power? That’s hard. It’s hard finding, for any given power, a trio of numbers that work, although some come close. I’m informed there was an episode of The Simpsons which included, as a joke, the equation 1782^{12} + 1841^{12} = 1922^{12} . If it were true, this would be enough to show Fermat’s Last Theorem was false. … Which happens. Sometimes, mathematicians believe they have found something which turns out to be wrong. Often this comes from noticing a pattern, and finding a proof for a specific case, and supposing the pattern holds up. This equation isn’t true, but it is correct for the first nine digits. A later episode, “The Wizard of Evergreen Terrace”, puts forth 3987^{12} + 4365^{12} = 4472^{12} , which apparently matches ten digits. This includes the final digit, also known as “the only one anybody could check”. (The last digit of 3987^{12} is 1. Last digit of 4365^{12} is 5. Last digit of 4472^{12} is 6, and there you go.) Really makes you think there’s something weird going on with 12th powers.
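
Python’s whole numbers have no size limit, so checking the joke equations exactly takes no cleverness at all:

```python
lhs1, rhs1 = 1782**12 + 1841**12, 1922**12
lhs2, rhs2 = 3987**12 + 4365**12, 4472**12
print(lhs1 == rhs1, lhs2 == rhs2)      # False False, as Fermat would insist
print(str(lhs1)[:12], str(rhs1)[:12])  # compare the leading digits yourself
print(lhs2 % 10, rhs2 % 10)            # the famous matching final digits
```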

For a Fermat-like example, Leonhard Euler conjectured a thing about “Sums of Like Powers”. That for a whole number ‘n’, you need at least n whole numbers-raised-to-an-nth-power to equal something else raised to an n-th power. That is, you need at least three whole numbers raised to the third power to equal some other whole number raised to the third power. At least four whole numbers raised to the fourth power to equal something raised to the fourth power. At least five whole numbers raised to the fifth power to equal some number raised to the fifth power. Euler was wrong, in this case. L J Lander and T R Parkin published, in 1966, the one-paragraph paper Counterexample to Euler’s Conjecture on Sums of Like Powers. 27^5 + 84^5 + 110^5 + 133^5 = 144^5 and there we go. Thanks, CDC 6600 computer!
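
And the Lander-Parkin counterexample takes one line to replicate, at rather less effort than it cost the CDC 6600 in 1966:

```python
print(27**5 + 84**5 + 110**5 + 133**5 == 144**5)   # True
```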

But Fermat’s hypothesis. Let me put it in symbols. It’s easier than giving everything long, descriptive names. Suppose that the power ‘n’ is a whole number greater than 2. Then there are no three counting numbers ‘a’, ‘b’, and ‘c’ which make true the equation a^n + b^n = c^n . It looks doable. It looks like once you’ve mastered high school algebra you could do it. Heck, it looks like if you know the proof about how the square root of two is irrational you could approach it. Pierre de Fermat himself said he had a wonderful little proof of it.

He was wrong. No shame in that. He was right about a lot of mathematics, including a lot of stuff that leads into the basics of calculus. And he was right in his feeling that this a^n + b^n = c^n stuff was impossible. He was wrong that he had a proof. At least not one that worked for every possible whole number ‘n’ larger than 2.

For specific values of ‘n’, though? Oh yes, that’s doable. Fermat did it himself for an ‘n’ of 4. Euler, a century later, filled in ‘n’ of 3. Peter Dirichlet, a great name in number theory and analysis, and Adrien-Marie Legendre, who worked on everything, proved the case of ‘n’ of 5. Dirichlet, in 1832, proved the case for ‘n’ of 14. And there were more partial solutions. You could show that if Fermat’s Last Theorem were ever false, it would have to be false for some prime-number value of ‘n’. That’s great work, answering as it does infinitely many possible cases. It just leaves … infinitely many to go.

And that’s how things went for centuries. I don’t know that every mathematician made some attempt on Fermat’s Last Theorem. But it seems hard to imagine a person could love mathematics enough to spend their lives doing it and not at least take an attempt at it. Nobody ever found it, though. In a 1989 episode of Star Trek: The Next Generation, Captain Picard muses on how eight centuries after Fermat nobody’s proven his theorem. This struck me at the time as too pessimistic. Granted humans were stumped for 400 years. But for 800 years? And stumping everyone in a whole Federation of a thousand worlds? And more than a thousand mathematical traditions? And, for some of these species, tens of thousands of years of recorded history? … Still, there wasn’t much sign of the problem being solved. In 1992 Analog Science Fiction Magazine published a funny short-short story by Ian Randal Strock, “Fermat’s Legacy”. In it, Fermat — jealous of figures like René Descartes and Blaise Pascal who upstaged his mathematical accomplishments — jots down the note. He figures an unsupported claim like that will earn true lasting fame.

So that takes us to 1993, when the world heard about elliptic curves for the first time. Elliptic curves are neat things. They’re curves described by simple polynomial equations. They have some nice mathematical properties. People first noticed them in studying how long arcs of ellipses are. (This is why they’re called elliptic curves, even though most of them have nothing to do with any ellipse you’d ever tolerate in your presence.) They look ready to use for encryption. And in 1985, Gerhard Frey noticed something. Suppose you did have, for some ‘n’ bigger than 2, a solution a^n + b^n = c^n . Then you could use that a, b, and n to make a new elliptic curve. That curve is the one that satisfies y^2 = x\cdot\left(x - a^n\right)\cdot\left(x + b^n\right) . And then that elliptic curve would not be “modular”.

I would like to tell you what it means for an elliptic curve to be modular. But getting to that point would take at least four subsidiary essays. MathWorld has a description of what it means to be modular, and even links to explaining terms like “meromorphic”. It gets into exotic stuff.

Frey didn’t show whether elliptic curves of this type had to be modular or not. This is normal enough, for mathematicians. You want to find things which are true and interesting. This includes conjectures like this, that if elliptic curves are all modular then Fermat’s Last Theorem has to be true. Frey was working on consequences of the Taniyama-Shimura Conjecture, itself three decades old at that point. Yutaka Taniyama and Goro Shimura had found there seemed to be a link between elliptic curves and these “modular forms”, functions with a rich collection of symmetries. That is, a group-theory thing.

So in fall of 1993 I was taking an advanced, though still undergraduate, course in (not-high-school) algebra at Rutgers. It’s where we learn group theory, after Intro to Algebra introduced us to group theory. Some exciting news came out. This fellow named Andrew Wiles at Princeton had shown an impressive bunch of things. Most important, that the Taniyama-Shimura Conjecture was true for semistable elliptic curves. This includes the kind of elliptic curve Frey made out of solutions to Fermat’s Last Theorem. So the curves based on solutions to Fermat’s Last Theorem would have to be modular. But Ken Ribet had, in 1986, shown that any curves based on solutions to Fermat’s Last Theorem couldn’t be modular. The conclusion: there can’t be any solutions to Fermat’s Last Theorem. Our professor did his best to explain the proof to us. Abstract Algebra was the undergraduate course closest to the stuff Wiles was working on. It wasn’t very close. When you’re still trying to work out what it means for something to be an ideal it’s hard to even follow the setup of the problem. The proof itself was inaccessible.

Which is all right. Wiles’s original proof had some flaws. At least this mathematics major shrugged when that news came down and wondered, well, maybe it’ll be fixed someday. Maybe not. I remembered how exciting cold fusion was for about six weeks, too. But this someday didn’t take long. Wiles, with Richard Taylor, revised the proof and published about a year later. So far as I’m aware, nobody has any serious qualms about the proof.

So does knowing Fermat’s Last Theorem get us anything interesting? … And here is a sad anticlimax. It’s neat to know that a^n + b^n = c^n can’t be true unless ‘n’ is 1 or 2, at least for positive whole numbers. But I’m not aware of any neat results that follow from that, or that would follow if it were untrue. There are results that follow from the Taniyama-Shimura Conjecture that are interesting, according to people who know them and don’t seem to be fibbing to me. But Fermat’s Last Theorem turns out to be a cute little aside.

Which is not to say studying it was foolish. This easy-to-understand, hard-to-solve problem certainly attracted talented minds to think about mathematics. Mathematicians found interesting stuff in trying to solve it. Some of it might be slight. I learned, in fiddling with Pythagorean triplets — ‘a’, ‘b’, and ‘c’ with a^2 + b^2 = c^2 — that I was not the infinitely brilliant mathematician at age fifteen I hoped I might be. Also that if ‘a’, ‘b’, and ‘c’ are relatively prime, you can’t have ‘a’ and ‘b’ both odd and ‘c’ even. You had to have ‘c’ and either ‘a’ or ‘b’ odd, with the other number even. Other mathematicians of more nearly infinite ability found stuff of greater import. Ernst Eduard Kummer in the 19th century developed ideals. These are an important piece of ring theory, and of abstract algebra generally. He was busy proving special cases of Fermat’s Last Theorem.

Kind viewers have tried to retcon Picard’s statement about Fermat’s Last Theorem. They say Picard was really searching for the proof Fermat had, or believed he had. Something using the mathematical techniques available to the early 17th century. Or that follow closely enough from that. The Taniyama-Shimura Conjecture definitely isn’t it. I don’t buy the retcon, but I’m willing to play along for the sake of not causing trouble. I suspect there’s not a proof of the general case that uses anything Fermat could have recognized, or thought he had. That’s all right. The search for a thing can be useful even if the thing doesn’t exist.

My 2018 Mathematics A To Z: e


I’m back to requests! Today’s comes from commenter Dina Yagodich. I don’t know whether Yagodich has a web site, YouTube channel, or other mathematics-discussion site, but am happy to pass along word if I hear of one.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

e.

Let me start by explaining integral calculus in two paragraphs. One of the things done in it is finding a ‘definite integral’. This is itself a function. The definite integral has as its domain the combination of a function, plus some boundaries, and its range is numbers. Real numbers, if nobody tells you otherwise. Complex-valued numbers, if someone says it’s complex-valued numbers. Yes, it could have some other range. But if someone wants you to do that they’re obliged to set warning flares around the problem and precede and follow it with flag-bearers. And you get at least double pay for the hazardous work. The function that gets definite-integrated has its own domain and range. The boundaries of the definite integral have to be within the domain of the integrated function.

For real-valued functions this definite integral has a great physical interpretation. A real-valued function means the domain and range are both real numbers. You see a lot of these. Call the function ‘f’, please. Call its independent variable ‘x’ and its dependent variable ‘y’. Using Euclidean coordinates, or as normal people call it “graph paper”, draw the points that make true the equation “y = f(x)”. Then draw in the x-axis, that is, the points where “y = 0”. The boundaries of the definite integral are going to be two values of ‘x’, a lower and an upper bound. Call that lower bound ‘a’ and the upper bound ‘b’. And heck, call that a “left boundary” and a “right boundary”, because … I mean, look at them. Draw the vertical line at “x = a” and the vertical line at “x = b”. If ‘f(x)’ is always a positive number, then there’s a shape bounded below by “y = 0”, on the left by “x = a”, on the right by “x = b”, and above by “y = f(x)”. And the definite integral is the area of that enclosed space. If ‘f(x)’ is sometimes zero, then there’s several segments, but their combined area is the definite integral. If ‘f(x)’ is sometimes below zero, then there’s several segments. The definite integral is the sum of the areas of parts above “y = 0” minus the area of the parts below “y = 0”.
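
If it helps to see the area idea as a computation, here’s a crude Python sketch. It chops the interval into thin strips and adds up strip areas, which is the classic way to approximate a definite integral; the function and boundaries are arbitrary examples of mine.

```python
def definite_integral(f, a, b, steps=100_000):
    """Approximate the area between y = f(x), y = 0, x = a, and x = b."""
    width = (b - a) / steps
    # Sum up thin rectangles, each sampled at the middle of its strip.
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

print(definite_integral(lambda x: x**2, 0, 1))   # about 0.3333; the exact area is 1/3
```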

(Why say “left boundary” instead of “lower boundary”? Taste, pretty much. But I look at the words “lower boundary” and think about the lower edge, that is, the line where “y = 0” here. And “upper boundary” makes sense as a way to describe the curve where “y = f(x)” as well as “x = b”. I’m confusing enough without making the simple stuff ambiguous.)

Don’t try to pass your thesis defense on this alone. But it’s what you need to understand ‘e’. Start out with the function ‘f’, which has domain of the positive real numbers and range of the positive real numbers. For every ‘x’ in the domain, ‘f(x)’ is the reciprocal, one divided by x. This is a shape you probably know well. It’s a hyperbola. Its asymptotes are the x-axis and the y-axis. It’s a nice gentle curve. Its plot passes through such famous points as (1, 1), (2, 1/2), (1/3, 3), and pairs like that. (10, 1/10) and (1/100, 100) too. ‘f(x)’ is always positive on this domain. Use as left boundary the line “x = 1”. And then — let’s think about different right boundaries.

If the right boundary is close to the left boundary, then this area is tiny. If it’s at, like, “x = 1.1” then the area can’t be more than 0.1. (It’s less than that. If you don’t see why that’s so, fit a rectangle of height 1 and width 0.1 around this curve and these boundaries. See?) But if the right boundary is farther out, this area is more. It’s getting bigger if the right boundary is “x = 2” or “x = 3”. It can get bigger yet. Give me any positive number you like. I can find a right boundary so the area inside this is bigger than your number.

Is there a right boundary where the area is exactly 1? … Well, it’s hard to see how there couldn’t be. If a quantity (“area between x = 1 and x = b”) changes from less than one to greater than one, it’s got to pass through 1, right? … Yes, it does, provided some technical points are true, and in this case they are. So that’s nice.

And there is. It’s a number (settle down, I see you quivering with excitement back there, waiting for me to unveil this) a slight bit more than 2.718. It’s a neat number. Carry it out a couple more digits and it turns out to be 2.718281828. So it looks like a great candidate to memorize. It’s not. It’s an irrational number. The digits go off without repeating or falling into obvious patterns after that. It’s a transcendental number, which is to say it isn’t the root of any polynomial with whole-number coefficients. Nobody knows whether it’s a normal number, because remember, a normal number is just any real number that you never heard of. To be a normal number, every finite string of digits has to appear in the decimal expansion, just as often as every other string of digits of the same length. We can show by clever counting arguments that almost every number is normal. Trick is it’s hard to show that any particular number is.

So let me do another definite integral. Set the left boundary to this “x = 2.718281828(etc)”. Set the right boundary a little more than that. The enclosed area is less than 1. Set the right boundary way off to the right. The enclosed area is more than 1. What right boundary makes the enclosed area ‘1’ again? … Well, that will be at about “x = 7.389”. That is, at the square of 2.718281828(etc).

Repeat this. Set the left boundary at “x = (2.718281828etc)^2”. Where does the right boundary have to be so the enclosed area is 1? … Did you guess “x = (2.718281828etc)^3”? Yeah, of course. You know my rhetorical tricks. What do you want to guess the area is between, oh, “x = (2.718281828etc)^3” and “x = (2.718281828etc)^5”? (Notice I put a ‘5’ in the superscript there.)
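
These claims are easy to check numerically, reusing the strip-summing idea from the sketch above. The areas come out to 1, 1, and, answering that last rhetorical question, 2.

```python
def area_under_reciprocal(a, b, steps=200_000):
    width = (b - a) / steps
    return sum(1 / (a + (i + 0.5) * width) for i in range(steps)) * width

e = 2.718281828459045
print(area_under_reciprocal(1, e))         # about 1
print(area_under_reciprocal(e, e**2))      # about 1
print(area_under_reciprocal(e**3, e**5))   # about 2
```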

Now, relationships like this will happen with other functions, and with other left- and right-boundaries. But if you want it to work with a function whose rule is as simple as “f(x) = 1 / x”, and areas of 1, then you’re going to end up noticing this 2.718281828(etc). It stands out. It’s worthy of a name.

Which is why this 2.718281828(etc) is a number you’ve heard of. It’s named ‘e’. Leonhard Euler, whom you will remember as having written or proved the fundamental theorem for every area of mathematics ever, gave it that name. He used it first when writing for his own work. Then (in November 1731) in a letter to Christian Goldbach. Finally (in 1736) in his textbook Mechanica. Everyone went along with him because Euler knew how to write about stuff, and how to pick symbols that worked for stuff.

Once you know ‘e’ is there, you start to see it everywhere. In Western mathematics it seems to have been first noticed by Jacob (I) Bernoulli, who noticed it in toy compound interest problems. (Given this, I’d imagine it has to have been noticed by the people who did finance. But I am ignorant of the history of financial calculations. Writers of the kind of pop-mathematics history I read don’t notice them either.) Bernoulli and Pierre Raymond de Montmort noticed the reciprocal of ‘e’ turning up in what we’ve come to call the ‘hat check problem’. A large number of guests all check one hat each. The person checking hats has no idea who anybody is. What is the chance that nobody gets their correct hat back? … That chance is the reciprocal of ‘e’. The number’s about 0.368. In a connected but not identical problem, suppose something has one chance in some number ‘N’ of happening each attempt. And it’s given ‘N’ attempts to happen. What’s the chance that it doesn’t happen? The bigger ‘N’ gets, the closer the chance it doesn’t happen gets to the reciprocal of ‘e’.
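
Both of those facts are friendly to a quick check. A simulation sketch; the guest count and trial count are arbitrary choices of mine:

```python
import random

def chance_nobody_gets_their_hat(guests, trials=100_000):
    misses = 0
    for _ in range(trials):
        hats = list(range(guests))
        random.shuffle(hats)
        if all(hat != owner for owner, hat in enumerate(hats)):
            misses += 1
    return misses / trials

print(chance_nobody_gets_their_hat(30))   # hovers near 1/e, about 0.3679
print((1 - 1 / 1000) ** 1000)             # about 0.3677, also closing in on 1/e
```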

It comes up in peculiar ways. In high school or freshman calculus you see it defined as what you get if you take \left(1 + \frac{1}{x}\right)^x for ever-larger real numbers ‘x’. (This is the toy-compound-interest problem Bernoulli found.) But you can find the number other ways. You can calculate it — if you have the stamina — by working out the value of

1 + 1 + \frac12\left( 1 + \frac13\left( 1 + \frac14\left( 1 + \frac15\left( 1 + \cdots \right)\right)\right)\right)

There’s a simpler way to write that. There always is. Take all the nonnegative whole numbers — 0, 1, 2, 3, 4, and so on. Take their factorials. That’s 1, 1, 2, 6, 24, and so on. Take the reciprocals of all those. That’s … 1, 1, one-half, one-sixth, one-twenty-fourth, and so on. Add them all together. That’s ‘e’.
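
That recipe is three lines of Python, and the terms shrink so fast that twenty of them already exhaust what a double-precision number can notice:

```python
import math

total = sum(1 / math.factorial(n) for n in range(20))
print(total)    # 2.718281828459045
print(math.e)   # the library's value of e, for comparison
```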

This ‘e’ turns up all the time. Any system whose rate of growth depends on its current value has an ‘e’ lurking in its description. That’s true if it declines, too, as long as the decline depends on its current value. It gets stranger. Cross ‘e’ with complex-valued numbers and you get, not just growth or decay, but oscillations. And many problems that are hard to solve to start with become doable, even simple, if you rewrite them as growths and decays and oscillations. Through ‘e’ problems too hard to do become problems of polynomials, or even simpler things.
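
That claim can be pinned down with one line of calculus, the standard fact underneath all of this. If a quantity y changes at a rate proportional to its current value, \frac{dy}{dt} = k y , then y(t) = y(0) e^{k t} . Growth when ‘k’ is positive, decay when it’s negative. Let ‘k’ be imaginary and Euler’s formula, e^{i \omega t} = \cos\left(\omega t\right) + i \sin\left(\omega t\right) , turns the same exponential into oscillations.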

Simple problems become that too. That property about the area underneath “f(x) = 1/x” between “x = 1” and “x = b” makes ‘e’ such a natural base for logarithms that we call it the base for natural logarithms. Logarithms let us replace multiplication with addition, and division with subtraction, easier work. They change exponentiation problems to multiplication, again easier. It’s a strange touch, a wondrous one.

There are some numbers interesting enough to attract books about them. π, obviously. 0. The base of imaginary numbers, \imath , has a couple. I only know one pop-mathematics treatment of ‘e’, Eli Maor’s e: The Story Of A Number. I believe there’s room for more.


Oh, one little remarkable thing that’s of no use whatsoever. MathWorld’s page about approximations to ‘e’ mentions this. Work out, if you can coax your calculator into letting you do this, the number:

\left(1 + 9^{-(4^{(42)})}\right)^{\left(3^{(2^{85})}\right)}

You know, the way anyone’s calculator will let you raise 2 to the 85th power. And then raise 3 to whatever number that is. Anyway. The digits of this will agree with the digits of ‘e’ for the first 18,457,734,525,360,901,453,873,570 decimal digits. One Richard Sabey found that, by what means I do not know, in 2004. The page linked there includes a bunch of other, no less amazing, approximations to numbers like ‘e’ and π and the Euler-Mascheroni Constant.

My 2018 Mathematics A To Z: Distribution (probability)


Today’s term ended up being a free choice. Nobody found anything appealing in the D’s to ask about. That’s all right.

I’m still looking for topics for the letters G through M, excluding L, if you’d like in on those letters.

And for my own sake, please check out the Playful Mathematics Education Blog Carnival, #121, if you haven’t already.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Distribution (probability).

I have to specify. There’s a bunch of mathematics concepts called ‘distribution’. Some of them are linked. Some of them are just called that because we don’t have a better word. Like, what else would you call the way multiplication spreads itself across a sum? I want to describe a distribution that comes to us in probability and in statistics. From there it runs through modern physics, as well as truly difficult sciences like sociology and economics.

We get to distributions through random variables. These are variables that might be any one of multiple possible values. There might be as few as two options. There might be a finite number of possibilities. There might be infinitely many. They might be numbers. At the risk of sounding unimaginative, they often are. We’re always interested in measuring things. And we’re used to measuring them in numbers.

What makes random variables hard to deal with is that, if we’re playing by the rules, we never know what it is. Once we get through (high school) algebra we’re comfortable working with an ‘x’ whose value we don’t know. But that’s because we trust that, if we really cared, we would find out what it is. Or we would know that it’s a ‘dummy variable’, whose value is unimportant but gets us to something that is. A random variable is different. Its value matters, but we can’t know what it is.

Instead we get a distribution. This is a function which gives us information about what the outcomes are, and how likely they are. There are different ways to organize this data. If whoever’s talking about it doesn’t say just what they’re doing, bet on it being a “probability distribution function”. This follows slightly different rules based on whether the range of values is discrete or continuous, but the idea is roughly the same. Every possible outcome has a probability at least zero but not more than one. The total probability over every possible outcome is exactly one. There’s rules about the probability of two distinct outcomes happening. Stuff like that.
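
Those rules are easy to encode, for the discrete case anyway. A sketch of mine, with a fair die standing in for whatever distribution you like:

```python
die = {face: 1 / 6 for face in range(1, 7)}

def is_probability_distribution(dist, tolerance=1e-9):
    nonnegative = all(p >= 0 for p in dist.values())
    bounded = all(p <= 1 for p in dist.values())
    sums_to_one = abs(sum(dist.values()) - 1) < tolerance
    return nonnegative and bounded and sums_to_one

print(is_probability_distribution(die))   # True
```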

Distributions are interesting enough when they’re about fixed things. In learning probability this is stuff like hands of cards or totals of die rolls or numbers of snowstorms in the season. Fun enough. These get to be more personal when we take a census, or otherwise sample things that people do. There’s something wondrous in knowing that while, say, you might not know how long a commute your neighbor has, you know there’s an 80 percent chance it’s between 15 and 25 minutes (or whatever). It’s also good for urban planners to know.

It gets exciting when we look at how distributions can change. It’s hard not to think of that as “changing over time”. (You could make a fair argument that “change” is “time”.) But it doesn’t have to. We can take a function with a domain that contains all the possible values in the distribution, and a range that’s something else. The image of the distribution is some new distribution. (Trusting that the function doesn’t do something naughty.) These functions — these mappings — might reflect nothing more than relabelling, going from (say) a distribution of “false and true” values to one of “-5 and 5” values instead. They might reflect regathering data; say, going from the distribution of a die’s outcomes of “1, 2, 3, 4, 5, or 6” to something simpler, like, “less than two, exactly two, or more than two”. Or they might reflect how something does change in time. They’re all mappings; they’re all ways to change what a distribution represents.
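
Here’s that die-regathering example worked out as a mapping of distributions. The rule is simple enough: every outcome that lands on the same new label pools its probability there. The function names are mine.

```python
from collections import defaultdict

die = {face: 1 / 6 for face in range(1, 7)}

def map_distribution(dist, mapping):
    """The image distribution: probabilities pool on each new label."""
    image = defaultdict(float)
    for outcome, probability in dist.items():
        image[mapping(outcome)] += probability
    return dict(image)

def regather(face):
    if face < 2:
        return 'less than two'
    return 'exactly two' if face == 2 else 'more than two'

print(map_distribution(die, regather))
# roughly {'less than two': 1/6, 'exactly two': 1/6, 'more than two': 2/3}
```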

These mappings turn up in statistical mechanics. Processes will change the distribution of positions and momentums and electric charges and whatever else the things moving around do. It’s hard to learn. At least my first instinct was to try to warm up to it by doing a couple test cases. Pick specific values for the random variables and see how they change. This can help build confidence that one’s calculating correctly. Maybe give some idea of what sorts of behaviors to expect.

But it’s calculating the wrong thing. You need to look at the distribution as a specific thing, and how that changes. It’s a change of view. It’s like the change in view from thinking of a position as an x- and y- and maybe z-coordinate to thinking of position as a vector. (Which, I realize now, gave me slightly similar difficulties in thinking of what to do for any particular calculation.)

Distributions can change in time, just the way that — in simpler physics — positions might change. Distributions might stabilize, forming an equilibrium. This can mean that everything’s found a place to stop and rest. That will never happen for any interesting problem. What you might get is an equilibrium like the rings of Saturn. Everything’s moving, everything’s changing, but the overall shape stays the same. (Roughly.)

There are many specifically named distributions. They represent patterns that turn up all the time. The binomial distribution, for example, which represents what to expect if you have a lot of examples of something that can be one of two values each. The Poisson distribution, for representing how likely something that could happen any time (or any place) will happen in a particular span of time (or space). The normal distribution, also called the Gaussian distribution, which describes everything that isn’t trying to be difficult. There are like 400 billion dozen more named ones, each really good at describing particular kinds of problems. But they’re all distributions.

I’m Looking For Some More Topics For My 2018 Mathematics A-To-Z


As I’d said about a month ago, I’m hoping to gather topics for this year’s A-To-Z in a more piecemeal manner. Mostly this is so I don’t lose track of requests. I’m hoping not to go more than about three weeks between when a topic gets brought up and when I actually commit words to page.

But please, if you have any mathematical topics with a name that starts G through M, let me know! I generally take topics on a first-come, first-serve basis for each letter. But I reserve the right to use a not-first-pick choice if I realize the topic’s enchanted me. Also to use a synonym or an alternate phrasing if both topics for a particular letter interest me. Also when you do make a request, please feel free to mention your blog, Twitter feed, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.


So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences, for reference. I probably don’t want to revisit them, but if someone’s interested, I’ll at least think over whether I have new opinions about them. Thank you.

Excerpted From The Summer 2015 A To Z


Excerpted From The Leap Day 2016 A To Z


Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

Available Letters for the Fall 2018 A To Z:

  • G
  • H
  • I
  • J
  • K
  • L
  • M

And all the Fall 2018 Mathematics A-To-Z should appear at this link, along with some extra stuff like these topic-request pages and such.

My 2018 Mathematics A To Z: Commutative


Today’s A to Z term comes from Reynardo, @Reynardo_red on Twitter, and is a challenge. And the other A To Z posts for this year should be at this link.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Commutative.

Some terms are hard to discuss. This is among them. Mathematicians find commutative things early on. Addition of whole numbers. Addition of real numbers. Multiplication of whole numbers. Multiplication of real numbers. Multiplication of complex-valued numbers. It’s easy to think of this commuting as just having liberty to swap the order of things. And it’s easy to think of commuting as “two things you can do in either order”. It inspires physical examples like rotating a dial, clockwise or counterclockwise, however much you like. Or outside the things that seem obviously mathematical. Add milk and then cereal to the bowl, or cereal and then milk. As long as you don’t overfill the bowl, there’s not an important difference. Per Wikipedia, if you’re putting one sock on each foot, it doesn’t matter which foot gets a sock first.

When something is this accessible, and this universal, it gets hard to talk about. It threatens to be invisible. It was hard to say much interesting about the still air in a closed room, at least before there was a chemistry that could tell it wasn’t a homogeneous invisible something, and before there was a statistical mechanics that could show it was doing something even when it was doing nothing.

But commutativity is different. It’s easy to think of mathematics that doesn’t commute. Subtraction doesn’t, for all that it’s as familiar as addition. And despite that we try, in high school algebra, to fuse it into addition. Division doesn’t either, for all that we try to think of it as multiplication. Rotating things in three dimensions doesn’t commute. Nor does multiplying quaternions, which are a kind of number still. (I’m double-dipping here. You can use quaternions to represent three-dimensional rotations, and vice-versa. So they aren’t quite different examples, even though you can use quaternions to do things unrelated to rotations.) Clothing is a mass of things that can and can’t be put on first.
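
If you’d rather check the rotation claim with arithmetic than with a couple of objects on hand, here’s a quick numeric sketch. Quarter-turns about two different axes, written as matrices; the order matters:

```python
import numpy as np

Rx = np.array([[1, 0, 0], [0, 0, -1], [0, 1, 0]])   # 90 degrees about the x-axis
Rz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]])   # 90 degrees about the z-axis
print(Rz @ Rx)   # rotate about x first, then z ...
print(Rx @ Rz)   # ... is not the same matrix as z first, then x
```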

We talk about commuting as if it’s something in (or not in) the operations we do. Adding. Rotating. Walking in some direction. But it’s not entirely in that. Consider walking directions. From an intersection in the city, walk north to the first intersection you encounter. And walk east to the first intersection you encounter. Does it matter whether you walk north first and then east, or east first and then north? In some cases, no; famously, in Midtown Manhattan there’s no difference. At least if we pretend Broadway doesn’t exist.

Also if we don’t start from near the edge of the island, or near Central Park. An operation, even something familiar like addition, is a function. Its domain is made of ordered pairs. Each thing in the pair is from the set of whatever might be added together. (Or multiplied, or whatever the name of the operation is.) The operation commutes if the order of the pair doesn’t matter. It’s easy to find sets and operations that won’t commute. I suppose it’s for the same reason it’s easier to find rectangular rather than square things. We’re so used to working with operations like multiplication that we forget that multiplication needs things to multiply.

Whether a thing commutes turns up often in group theory. This shouldn’t surprise. Group theory studies how arithmetic works. A “group”, which is a set of things with an operation like multiplication on it, might or might not commute. A “ring”, which has a set of things and two operations, has some commutativity built into it. One ring operation is something like addition. That commutes, or else you don’t have a ring. The other operation is something like multiplication. That might or might not commute. It depends what you need for your problem. A ring with commuting multiplication, plus some other stuff, can reach the heights of being a “field”. Fields are neat. They look a lot like the real numbers, but they can be all weird, too.

But even in a group that doesn’t have a commuting multiplication, we can tease out commutativity. There is a thing named the “commutator”, which is a particular way of multiplying elements together. The collection of these products generates the “commutator subgroup”. You can use that subgroup to split the original group, the way that odds and evens split the whole numbers. That splitting is based on the same multiplication as the original group. But its domain is now classes based on elements of the original group. And the thing this splitting creates, the quotient group, is commutative. We can find a thing, based on what we are interested in, which offers commutativity right nearby.

It reaches further. In analysis, it can be useful to think of functions as “mappings”. We describe this as though a function took a domain and transformed it into a range. We can compose these functions together: take the range from one function and use it as the domain for another. Sometimes these chains of functions will commute. We can get from the original set to the final set by several paths. This can produce fascinating and beautiful proofs that look as if you just drew a lattice-work. The MathWorld page on “Commutative Diagram” has some examples of this, and I recommend just looking at the pictures. Appreciate their aesthetic, particularly the ones immediately after the sentence about “Commutative diagrams are usually composed by commutative triangles and commutative squares”.

Whether these mappings commute can have meaning. This takes us, maybe inevitably, to quantum mechanics. Mathematically, this represents systems as either a wave function or a matrix, whichever is more convenient. We can use this to find the distribution of positions or momentums or energies or anything else we would like to know. Distributions are as much as we can hope for from quantum mechanics. We can say what (eg) the position of something is most likely to be but not what it is. That’s all right.

The mathematics of finding these distributions is just applying an operator, taking a mapping, on this wave function or this matrix. Some pairs of these operators commute, like the ones that let us find momentum and find kinetic energy. Some do not, like those to find position and momentum.

We can describe how much two operators do or don’t commute. This is through a thing called the “commutator”. Its form looks almost playfully simple. Call the operators ‘f’ and ‘g’. And that by ‘fg’ we mean, “do g, then do f”. (This seems awkward. But if you think of ‘fg’ as ‘f(g(x))’, where ‘x’ is just something in the domain of g, then this seems less awkward.) The commutator of ‘f’ and ‘g’ is then whatever ‘fg – gf’ is. If it’s always zero, then ‘f’ and ‘g’ commute. If it’s ever not zero, then they don’t.
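
For a concrete fg - gf, here are two matrices that famously refuse to commute, the first two Pauli matrices from quantum mechanics. Their commutator comes out as a nonzero matrix, and that nonzero-ness is all that non-commuting means:

```python
import numpy as np

f = np.array([[0, 1], [1, 0]], dtype=complex)      # sigma_x
g = np.array([[0, -1j], [1j, 0]], dtype=complex)   # sigma_y
print(f @ g - g @ f)   # 2i times sigma_z: nonzero, so f and g don't commute
```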

This is easy to understand physically. Imagine starting from a point on the surface of the earth. Travel south one mile and then west one mile. You are at a different spot than you would be, had you instead travelled west one mile and then south one mile. How different? That’s the commutator. It’s obviously zero, for just multiplying some regular old numbers together. It’s sometimes zero, for these paths on the Earth’s surface. It’s never zero, for finding-the-position and finding-the-momentum. The amount by which that’s never zero we can see as the famous Uncertainty Principle, the limits of what kinds of information we can know about the world.

Still, it is a hard subject to describe. Things which commute are so familiar that it takes work to imagine them not commuting. (How could three times four equal anything but four times three?) Things which do not commute either obviously shouldn’t (add hot water to the instant oatmeal, and eat it), or are unfamiliar enough people need to stop and think about them. (Rotating something in one direction and then another, in three dimensions, generally doesn’t commute. But I wouldn’t fault you for testing this out with a couple objects on hand before being sure about it.) But it can be noticed, once you know to explore.

My 2018 Mathematics A To Z: Box-And-Whisker Plot


Today’s A To Z term is another from Iva Sallay, Find The Factors blog creator and, as with asymptote, friend of the blog. Thank you for it.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Box-And-Whisker Plot.

People can’t remember many things at once. This has effects. Some of them are obvious. Like, how a phone number, back in the days you might have to memorize them, wouldn’t be more than about seven or eight digits. Some are subtle, such as that we have descriptive statistics. We have descriptive statistics because we want to understand collections of a lot of data. But we can’t understand all the data. We have to simplify it. From this we get many numbers, based on data, that try to represent it. Means. Medians. Variance. Quartiles. All these.

And it’s not enough. We try to understand data further by visualization. Usually this is literal, making pictures that represent data. Now and then somebody visualizes data by something slick, like turning it into an audio recording. (Somewhere here I have an early-60s album turning 18 months of solar radio measurements into something music-like.) But that’s rare, and usually more of an artistic statement. Mostly it’s pictures. Sighted people learn much of the world from the experience of seeing it and moving around it. Visualization turns arithmetic into geometry. We can support our sense of number with our sense of space.

Many of the ways we visualize data came from the same person. William Playfair set out the rules for line charts and area charts and bar charts and pie charts and circle graphs. Florence Nightingale used many of them in her reports on medical care in the Crimean War. And this made them public and familiar enough that we still use them.

Box-and-whisker plots are not among them. I’m startled too. Playfair had a great talent for these sorts of visualizations. That he missed this is a reminder to us all. There are great, simple ideas still available for us to discover.

At least for the brilliant among us to discover. Box-and-whisker plots were introduced in 1969. I’m surprised it’s that recent. John Tukey developed them. Computer scientists remember Tukey’s name; he coined the term ‘bit’, as in the element of computer memory. They also remember he was an early user, if not the coiner, of the term ‘software’. Mathematicians know Tukey’s name too. He and James Cooley developed the Fast Fourier Transform. The Fast Fourier Transform appears on every list of the Most Important Algorithms of the 20th Century. Sometimes the Most Important Algorithms of All Time. The Fourier Transform is this great thing. It’s a way of finding patterns in messy, complicated data. It’s hard to calculate, though. Cooley and Tukey found that the calculations you have to do can be made simpler, and much quicker. (In certain conditions. Mostly depending on how the data’s gathered. Fortunately, computers encourage gathering data in ways that make the Fast Fourier Transform possible. And then go and calculate it nice and fast.)

Box-and-whisker plots are a way to visualize sets of data. Too many data points to look at all at once, not without getting confused. They extract a couple bits of information about the distribution. Distributions say what ranges a data point, picked at random, is likely to be in, and is unlikely to be in. Distributions can be good things to look at. They let you know what typical experiences of a thing are likely to be. And they’re stable. A handful of weird fluke events don’t change them much. If you have a lot of fluke events, that changes the distribution. But if you have a lot of fluke events, they’re not flukes. They’re just events.

Box-and-whisker plots start from the median. This is the second of the three things commonly called “average”. It’s the data point that half the remaining data is less than, and half the remaining data is greater than. It’s a nice number to know. Start your box-and-whisker plot with a short line, horizontal or vertical as fits your worksheet, and labelled with that median.

Around this line we’ll draw a box. It’ll be as wide as the line you made for the median. But how tall should it be?

That is, normally, based on the first and third quartiles. These are the data points like the median. The first quartile has one-quarter the data points less than it, and three-quarters the data points more than it. The third quartile has three-quarters the data points less than it, and one-quarter the data points more than it. (And now you might ask if we can’t call the median the “second quartile”. We sure can. And will if we want to think about how the quartiles relate to each other.) Between the first and the third quartile are half of all the data points. The first and the third quartiles set the boundaries of your box. They’re where the edges of the rectangle are.

That’s the box. What are the whiskers?

Well, they’re vertical lines. Or horizontal lines. Whatever’s perpendicular to how you started. They start at the quartile lines. Should they go to the maximum or minimum data points?

Maybe. Maximum and minimum data are neat, yes. But they’re also suspect. They’re extremes. They’re not quite reliable. If you went back to the same source of data, and collected it again, you’d get about the same median, and the same first and third quartile. You’d get different minimums and maximums, though. Often crazily different. Still, if you want to understand the data you did get, it’s hard to ignore that this is the data you have. So one choice for representing these is to just use the maximum and minimum points. Draw the whiskers out to the maximum and minimum, and then add a little cross bar or a circle at the end. This makes clear you meant the line to end there, rather than that your ink ran out. (Making a figure safe against misprinting is one of the understated essentials of good visualization.)

But again, the very highest and lowest data may be flukes. So we could look at other, more stable endpoints for the whiskers. The point of this is to show the range of what we believe most data points are. There are different ways to do this. There’s not one that’s always right. It’s important, when showing a box-and-whisker plot, to explain how far out the whiskers go.

Tukey’s original idea, for example, was to extend the whiskers based on the interquartile range. This is the difference between the third quartile and the first quartile. Like, just subtraction. Find a number that’s one-and-a-half times the interquartile range above the third quartile. The upper whisker goes to the data point that’s closest to that boundary without going over. This might well be the maximum already. The other number is the one that’s the first quartile minus one-and-a-half times the interquartile range. The lower whisker goes to the data point that’s closest to that boundary without falling underneath it. And this might be the minimum. It depends how the data’s distributed. The upper whisker and the lower whisker aren’t guaranteed to be the same lengths. If there are data outside these whisker ranges, mark them with dots or x’s or something else easy to spot. There’ll typically be only a few of these.
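Here, as a sketch of my own and not anything Tukey set down, is what that recipe looks like in Python. The function name and the sample data are inventions; the quartile convention is the means-of-middle-values one described a couple paragraphs below.

    def five_number_summary(data):
        """Median, quartiles, and Tukey whisker ends for a box-and-whisker plot."""
        xs = sorted(data)

        def midpoint(half_list):
            n = len(half_list)
            mid = n // 2
            if n % 2:                    # odd count: the middle value
                return half_list[mid]
            return (half_list[mid - 1] + half_list[mid]) / 2

        median = midpoint(xs)
        half = len(xs) // 2
        q1 = midpoint(xs[:half])         # lower half of the data
        q3 = midpoint(xs[-half:])        # upper half of the data
        iqr = q3 - q1
        lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        lo_whisker = min(x for x in xs if x >= lo_fence)
        hi_whisker = max(x for x in xs if x <= hi_fence)
        outliers = [x for x in xs if x < lo_fence or x > hi_fence]
        return median, q1, q3, lo_whisker, hi_whisker, outliers

    # Twelve data points; the 40 is a fluke that lands past the upper whisker.
    print(five_number_summary([1, 2, 4, 4, 5, 6, 7, 8, 9, 10, 11, 40]))

Plotting libraries mostly bake this same rule in; matplotlib’s boxplot, if I’m reading its documentation right, defaults to the one-and-a-half-interquartile-range whiskers and lets you change them.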

But you can use other rules too. Again as long as you are clear about what they represent. The whiskers might go out, for example, to particular percentiles. Or might reach out a certain number of standard deviations from the mean.

The point of doing this box-and-whisker plot is to show where half the data are. That’s inside the box. And where the rest of the non-fluke data is. That’s the whiskers. And the flukes, those are the odd little dots left outside the whiskers. And it doesn’t take any deep calculations. You need to sort the data in ascending order. You need to count how many data points there are, to find the median and the first and third quartiles. (You might have to do addition and division. If you have, for example, twelve distinct data points, then the median is the arithmetic mean of the sixth and seventh values. The first quartile is the arithmetic mean of the third and fourth values. The third quartile is the arithmetic mean of the ninth and tenth values.) You (might) need to subtract, to find the interquartile range. And multiply that by one and a half, and add or subtract that from the quartiles.

This shows you what are likely and what are improbable values. They give you a cruder picture than, say, the standard deviation and the coefficient of variation do. But they need no hard calculations. None of what you need for box-and-whisker plots is computationally intensive. Heck, none of what you need is hard. You knew everything you needed to find these numbers by fourth grade. And yet they tell you about the distribution. You can compare whether two sets of data are similar by eye. Telling whether sets of data are similar becomes telling whether two shapes look about the same. It’s brilliant to represent so much from such simple work.

My 2018 Mathematics A To Z: Asymptote


Welcome, all, to the start of my 2018 Mathematics A To Z. Twice each week for the rest of the year I hope to have a short essay explaining a term from mathematics. These are fun and exciting for me to do, since I mostly take requests for the words, and I always think I’m going to be farther ahead of deadline than I actually am.

Today’s word comes from longtime friend of my blog Iva Sallay, whose Find the Factors page offers a nice daily recreational logic puzzle. Also trivia about each whole number, in turn.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Asymptote.

You know how everything feels messy and complicated right now? But you also feel that, at least in the distant past, things were simpler and easier to understand? And how you hope that, sometime in the future, all our current woes will have faded and things will be simple again? Hold that thought.

There is no one thing that every mathematician does, apart from insist to friends that they can’t do arithmetic well. But there are things many mathematicians do. One of those is to work with functions. A function is this abstract concept. It’s a triplet of things. One is a domain, a set of things that we draw the independent variables from. One is a range, a set of things that we draw the dependent variables from. And the last is a rule, something that matches each thing in the domain to one thing in the range.

The domain and range can be the same thing. They’re often things like “the real numbers”. They don’t have to be. The rule can be almost anything. It can be simple. It can be complicated. Usually, if it’s interesting, there’s at least something complicated about it.

The asymptote, then, is an expression of our hope that what we have to work with is truly simple, with some temporary complicated stuff messing it up just now. Outside some local embarrassment, our function is close enough to this simpler asymptote. The past and the future are these simpler things. It’s only the present, the local area, that’s messy and confusing.

We can make this precise. Start off with some function we both agree is interesting. Reach deep into the imagination to call it ‘f’. Suppose that there is an asymptote. That’s also a function, with the same domain and range as ‘f’. Let me call it ‘g’, because that’s a letter very near ‘f’.

You give me some tolerance for error. This number mathematicians usually call ‘ε’. We usually think of it as a small thing. But all we need is that it’s larger than zero. Anyway, you give me that ε. Then I can give you, for that ε, some bounded region in the domain. Everywhere outside that region, the difference between ‘f’ and ‘g’ is smaller than ε. That is, our complicated original function ‘f’ and the asymptote ‘g’ are indistinguishable enough. At least everywhere except this little patch of the domain. There’s different regions for different ε values, unless something weird is going on. The smaller the ε the bigger the region of exceptions. But if the domain is something like the real numbers, well, big deal. Our function and our asymptote are indistinguishable roughly everywhere.
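A toy example, my own and not one from any textbook, may help. Take ‘f’ to be f(x) = x + 1/x and the asymptote ‘g’ to be g(x) = x. The difference between them is 1/x. So whatever ε you hand me, the region of exceptions is where |x| is smaller than 1/ε. A few lines of Python check this:

    # f(x) = x + 1/x has the asymptote g(x) = x; they differ by exactly 1/x.
    def f(x):
        return x + 1.0 / x

    def g(x):
        return x

    for eps in (0.5, 0.1, 0.01):
        bound = 1.0 / eps            # outside [-bound, bound], |f - g| < eps
        x = bound + 1.0              # any point past the bound will do
        assert abs(f(x) - g(x)) < eps
        print(f"eps={eps}: f and g agree within {eps} once |x| > {bound}")

Shrink the ε and the bound grows, just as promised: a tighter tolerance means a bigger patch of local embarrassment.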

If there is an asymptote. We’re not guaranteed there is one. But if there is, we know some nice things. We know what our function looks like, at least outside this local range of extra complication. If the domain represents something like time or space, and it often does, then the asymptote represents the big picture. What things look like in deep time. What things look like globally. When studying a function we can divide it into the easy part of the asymptote and the local part that’s “function minus the asymptote”.

Usually we meet asymptotes in high school algebra. They’re a pair of crossed lines that hang around hyperbolas. They help you sketch out the hyperbola. Find equations for the asymptotes. Draw these crossed lines. Figure whether the hyperbola should go above-and-below or left-and-right of the crossed lines. Draw arcs accordingly. Then match them up to the crossed lines. Asymptotes don’t seem to do much else there. A parabola, the other exotic shape you meet about the same time, doesn’t have any asymptote that’s any simpler than itself. A circle or an ellipse, which you met before but now have equations to deal with, doesn’t have an asymptote at all. They aren’t big enough to have any. So at first introduction asymptotes seem like a lot of mechanism for a slight problem. We don’t need accurate hand-drawn graphs of hyperbolas that much.

In more complicated mathematics they get useful again. In dynamical systems we look at descriptions of how something behaves in time. Often its behavior will have an asymptote. Not always, but it’s nice to see when it does. When we study operations, how long it takes to do a task, we see asymptotes all over the place. How long it takes to perform a task depends on how big a problem it is we’re trying to solve. The relationship between how big the thing is and how long it takes to do is some function. The asymptote appears when thinking about solving huge examples of the problem. What rule most dominates how hard the biggest problems are? That’s the asymptote, in this case.

Not everything has an asymptote. Some functions are always as complicated as they started. Oscillations, for example, if they don’t dampen out. A sine wave isn’t complicated. Not if you’re the kind of person who’ll write things like “a sine wave isn’t complicated”. But if the size of the oscillations doesn’t decrease, then there can’t be an asymptote. Functions might be chaotic, with values that vary along some truly complicated system, and so never have an asymptote.

But often we can find a simpler function that looks enough like the function we care about. Everywhere except some local little embarrassment. We can enjoy the promise that things were understandable at one point, and maybe will be again.

I’m Still Looking For Fun Mathematics And Words


I’m hoping to get my 2018 Mathematics A To Z started the last week of September, which among other things will let me end it in 2018 if I haven’t been counting wrong. We’ll see. If you’ve got requests for the first several letters in the alphabet, there’s still open slots. I’ll be opening up the next quarter of the alphabet soon, too.

And also set for the last week of September — boy, I’m glad I am not going to have any doubts or regrets about how I’m scheduling my time for two weeks hence — is the Playful Mathematics Education Carnival. This project, overseen by Denise Gaskins, tries to bring a bundle of fun stuff about mathematics to different blogs. Iva Sallay’s turn, the end of August, is up here. Have you spotted something mathematical that’s made you smile? Please let me know. I’d love to share it with the world.

I’m Looking For Topics For My Fall 2018 Mathematics A-To-Z


So I have given up on waiting for a moment when my schedule looks easier. I’m going to plunge in and make it all hard again. Thus I announce, to start in about a month, my Fall 2018 Mathematics A To Z.

This is something I’ve done once or twice the last few years. The idea is easy: I take one mathematical term for each letter of the alphabet and explain it. The last several rounds I’ve gotten the words from you, kind readers who would love watching me trying to explain something in a field of mathematics I only just learned anything about. It’s great fun. If you do any kind of explanatory blog I recommend the format.

I do mean to do things a little different this time. First, and most visibly, I’m only going to post two essays a week. In past years I’ve done three, and that’s a great pace. It’s left me sometimes with that special month where I have a fresh posting every single day of the month. It’s also a crushing schedule, at least for me. Especially since I’ve been writing longer and longer, both here and on my humor blog. Two’s my limit and I reserve the right to skip a week when I need to skip a week.

Second. I’m going to open for requests only a few letters at a time. In the past I’ve ended up lost when, for example, my submit-your-requests post ends up being seven weeks back and hard to find under all my notifications. This should help me better match up my requests, my writing pace, and my deadlines. It will not.

Also, in the past I’ve always done first-come, first-serve. I’m still inclined toward that. But I’m going to declare that if I come in and check my declarations some morning and find several requests for the same letter, I may decide to go with the word that most captures my imagination. Probably I won’t have the nerve. But I’d like to think I have. I might do some supplementals after the string is done, too. We’ll see what I feel up to. Doing a whole run is exhilarating but exhausting.


So. Now I’d like to declare myself open for the letters ‘A’ through ‘F’. In past A to Z’s I’ve already given these words, so probably won’t want to revisit them. (Though there are some that I think, mm, I could do better now.)

Excerpted here: lists of the topics already covered in the Summer 2015, Leap Day 2016, and Summer 2017 A To Z sequences.

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

Available Letters for the Fall 2018 A To Z:

  • A
  • B
  • C
  • D
  • E
  • F

Oh, I need to commission some header art from Thomas K Dye, creator of the web comic Newshounds, for this. Also for another project that’ll help my September get a little more overloaded.

A Summer 2017 Mathematics A To Z Appendix: Are Colbert Numbers A Thing?


This is something I didn’t have space for in the proper A To Z sequence. And it’s more a question than exposition anyway. But what the heck: I like excuses to use my nice shiny art package.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

I was looking for mathematics topics I might write about if I didn’t get requests for particular letters. ‘C’ came up ‘cohomology’, but what if it hadn’t? I found an interesting-looking listing at MathWorld’s dictionary. The Colbert Numbers sounded interesting. They’re a collection of very long prime numbers. Each of them has at least a million decimal digits. They relate to a conjecture by Wacław Sierpiński, who’s gone months without a mention around here.

The conjecture is about whole numbers that are equal to k \cdot 2^n + 1 for some whole numbers ‘k’ and ‘n’. Are there choices of ‘k’ for which, no matter what positive whole number ‘n’ you pick, k \cdot 2^n + 1 is never a prime number? (‘k’ has to meet some extra conditions.) I’m not going to explain why this is interesting because I don’t know. It’s a number theory question. They’re all strange and interesting questions in their ways. If I were writing an essay about Colbert Numbers I’d have figured that out.

Thing is we believe we know what the smallest possible ‘k’ is. We think that the smallest possible Sierpiński number is 78,557. We don’t have this quite proved, though. There are some numbers smaller than 78,557 that might yet be Sierpiński numbers. There was a set of seventeen such candidates. If those candidates could be ruled out then we’d have proved 78,557 was it. That’s easy to imagine. Find for each of them a number ‘n’ so that the candidate times 2^n plus one is a prime number. But finding big prime numbers is hard. This turned into a distributed-computing search. This would evaluate these huge numbers and find whether they were prime numbers. (The project, “Seventeen Or Bust”, was destroyed by computer failure in 2016. Attempts to verify the work done, and continue it, are underway.) There are six possible candidates left.
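If you want a feel for how a candidate gets eliminated, here’s a sketch in Python. It leans on sympy’s primality test, tries only tiny exponents where the real search needed exponents in the millions, and the function name is my own:

    from sympy import isprime

    def first_prime_exponent(k, n_max=500):
        """Smallest n <= n_max making k * 2**n + 1 prime, or None if none found."""
        for n in range(1, n_max + 1):
            if isprime(k * 2**n + 1):
                return n
        return None

    print(first_prime_exponent(3))       # 1, since 3*2 + 1 = 7 is prime
    print(first_prime_exponent(5))       # 1, since 5*2 + 1 = 11 is prime
    print(first_prime_exponent(78557))   # None; provably, no exponent ever works

One prime is enough to cross a candidate off forever. The remaining candidates are the ones for which no exponent has worked yet.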

MathWorld says that the seventeen cases that had to be checked were named Colbert Numbers. This was in honor of Stephen T Colbert, the screamingly brilliant character host of The Colbert Report. (Ask me sometime about the Watership Down anecdote.) It’s a plausible enough claim. Part of Stephen T Colbert’s persona was demanding things be named for him. And he’d take appropriate delight in having minor but interesting things named for him. Treadmills on the space station. Minor-league hockey team mascots. A class of numbers for proving a 60-year-old mathematical conjecture is exactly the sort of thing that would get his name and attention.

But here’s my problem. Who named them Colbert Numbers? MathWorld doesn’t say. Wikipedia doesn’t say. The best I can find with search engines doesn’t say. When were they named Colbert Numbers? Again, no answers. Poking around fan sites for The Colbert Report — where you’d expect the naming of stuff in his honor to be mentioned — doesn’t turn up anything. Does anyone call them Colbert Numbers? I mean outside people who’ve skimmed MathWorld’s glossary for topics with interesting names?

I don’t mean to sound overly skeptical here. But, like, I know there’s a class of science fiction fans who like to explain how telerobotics engineers name their hardware “waldoes”. This is in honor of a character in a Robert Heinlein story I never read either. I’d accepted that without much interest until Google’s US Patent search became a thing. One afternoon I noticed that if telerobotics engineers do call their hardware “waldoes” they never use the term in patent applications. Is it possible that someone might have slipped a joke into Wikipedia or something and had it taken seriously? Certainly. What amounts to a Wikipedia prank briefly gave the coati — an obscure-to-the-United-States animal that I like — the nickname of “Brazilian aardvark”. There are surely other instances of Wikipedia-generated pranks becoming “real” things.

So I would like to know. Who named Colbert Numbers that, and when, and were they — as seems obvious, but you never know — named for Stephen T Colbert? Or is this an example of Wikiality, the sense that reality can be whatever enough people decide to believe is true, as described by … well, Stephen T Colbert?

The Summer 2017 Mathematics A To Z: What I Learned


I’ve in the past done essays about what I’ve taken away from an A to Z project. Please indulge me with this.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

The big thing I learned from the Summer 2017 A To Z, besides that it would have been a little better started two weeks earlier? (I couldn’t have started it two weeks earlier. July was a frightfully busy month. As it was I was writing much too close to deadline. Starting sooner would have been impossible.)

Category theory, mostly. Many of the topics requested had some category theory component. Next would be tensors and tensor-related subjects. This is exciting and dangerous. Neither’s a field I know well. Both are fields I want to know better. It’s a truism that to really learn an advanced subject you have to teach a course in it. That’s how I picked up what I know about signal processing and about numerical quantum mechanics. Still, it’s perilous, especially when I would realize the subject asked for wasn’t what I faintly remembered had been asked for, and that I’d already spent a week composing the wrong essay in my head.

Also, scheduling. The past A To Z sequences were relatively low-stress things for me. I could get as many as six essays ahead of what I needed to post. That’s so comfortable a place to be. This time around, I was working much closer to deadline, with some pieces needing major rewriting as few as fifteen hours before my posting hour. More needed minor editing the day of posting. There’s several causes for this. But the biggest is that I wrote much longer this time. Past A To Z sequences could have at least a couple essays that were a few paragraphs. This time around I don’t think any piece came in at under a thousand words, and the default was getting to be around 1,500 words. I don’t think I broke 2,000 words, but I came close.

That’s fine, because the essays came out great. This has been the A To Z sequence I’m proudest of, so far. They’re the ones that make me think my father’s ever-supportive assurance that I could put these into a book that people would give me actual money for can be right. Still, the combination of writing about stuff I had to research more first and writing longer pieces made the workload more than I’d figured on. When I get to doing this again — and I will, when the exhaustion’s faded enough from memory — I will need more lead time between asking for topics and starting to write. And will need to freeze topics farther in advance than I did this time. I still suspect my father’s too supportive to say I could get money for this. But it’s a less unrealistic thought than I had figured before.

Also learned: hire an artist! I got a better-banner-than-I-paid-for from Thomas K Dye for this series. His work added a snappy bit of visual appeal to my sentence heaps. I’d also gotten from him a banner for the Why Stuff Can Orbit sequence, which I mean to resume now that I have some more writing time. But the banners give a needed bit of unity to my writing, and the automatically-generated Twitter announcements of these posts, and that’s helped the look of the place. Something like nine-tenths of the people I know online are visual artists of one kind or another. (The rest are writers, my siblings, and my mother.) I should be making reasons to commission them. For example, if I want to describe something too complicated to do in words alone I should turn it over to them. Remember, I don’t do the few-pictures thing because I’m a good writer. It’s because I’m too lazy to make an illustration myself. A bit of money can be as good as effort.

Speaking of effort, between the A To Z essays and Reading the Comics posts, and a couple miscellaneous other pieces, I wrote five to six thousand words per week for two months. That’s probably not sustainable indefinitely, but a slightly lower pace? And for a specific big project? It’s good to know that’s something I can do, albeit possibly by putting this blog on hold.

Learned to my personal everlasting humiliation: I spelled “Klein Bottle” wrong. Fortunately, I only spelled it “Klien” in the title of the essay, so it sits there in my tweet publicizing the post and in the full-length URL to the post, forever. I’ll recover, I hope.

The Summer 2017 Mathematics A To Z: What I Talked About


This is just a list of all the topics I covered in the Summer 2017 A To Z.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

And if those aren’t enough essays for you, here’s a collection of all the topics from the three previous A To Z sequences that I’ve done. Thank you, and thanks for reading and for challenging me to write.

The Summer 2017 Mathematics A To Z: Zeta Function


Today Gaurish, of For the love of Mathematics, gives me the last subject for my Summer 2017 A To Z sequence. And also my greatest challenge: the Zeta function. The subject comes to all pop mathematics blogs. It comes to all mathematics blogs. It’s not difficult to say something about a particular zeta function. But to say something at all original? Let’s watch.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Zeta Function.

The spring semester of my sophomore year I had Intro to Complex Analysis. Monday Wednesday 7:30; a rare evening class, one of the few times I’d eat dinner and then go to a lecture hall. There I discovered something strange and wonderful. Complex Analysis is a far easier topic than Real Analysis. Both are courses about why calculus works. But why calculus for complex-valued numbers works is a much easier problem than why calculus for real-valued numbers works. It’s dazzling. Part of this is that Complex Analysis, yes, builds on Real Analysis. So Complex can take for granted some things that Real has to prove. I didn’t mind. Given the way I crashed through Intro to Real Analysis I was glad for a subject that was, relatively, a breeze.

As we worked through Complex Variables and Applications so many things, so very many things, got to be easy. The basic unit of complex analysis, at least as we young majors learned it, was in contour integrals. These are integrals whose value depends on the values of a function on a closed loop. The loop is in the complex plane. The complex plane is, well, your ordinary plane. But we say the x-coordinate and the y-coordinate are parts of the same complex-valued number. The x-coordinate is the real-valued part. The y-coordinate is the imaginary-valued part. And we call that combination ‘z’. In complex-valued functions ‘z’ serves the role that ‘x’ does in normal mathematics.

So a closed loop is exactly what you think. Take a rubber band and twist it up and drop it on the table. That’s a closed loop. Suppose you want to integrate a function, ‘f(z)’. If you can always take its derivative on this loop and on the interior of that loop, then its contour integral is … zero. No matter what the function is. As long as it’s “analytic”, as the terminology has it. Yeah, we were all stunned into silence too. (Granted, mathematics classes are usually quiet, since it’s hard to get a good discussion going. Plus many of us were in post-dinner digestive lulls.)

Integrating regular old functions of real-valued numbers is this tedious process. There’s sooooo many rules and possibilities and special cases to consider. There’s sooooo many tricks that get you the integrals of some functions. And then here, with complex-valued integrals for analytic functions, you know the answer before you even look at the function.

As you might imagine, since this is only page 113 of a 341-page book there’s more to it. Most functions that anyone cares about aren’t analytic. At least they’re not analytic everywhere inside regions that might be interesting. There’s usually some points where an interesting function ‘f(z)’ is undefined. We call these “singularities”. Yes, like starships are always running into. Only we rarely get propelled into other universes or other times or turned into ghosts or stuff like that.

So much of the rest of the course turns into ways to avoid singularities. Sometimes you can spackle them over. This is when the function happens not to be defined somewhere, but you can see what it ought to be. Sometimes you have to do something more. This turns into a search for “removable” singularities. And this does something so brilliant it looks illicit. You modify your closed loop, so that it comes up very close, as close as possible, to the singularity, but studiously avoids it. Follow this game of I’m-not-touching-you right and you can turn your integral into two parts. One is the part that’s equal to zero. The other is the part that’s a constant times whatever the function is at the singularity you’re removing. And that ought to be easy to find the value for. (Being able to find a function’s value doesn’t mean you can find its derivative.)
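You can watch both behaviors numerically. This Python sketch is mine, not the textbook’s: it integrates around the unit circle by brute force. An analytic function comes out as zero; a function with a singularity inside the loop does not.

    import numpy as np

    def contour_integral(f, n=100000):
        """Integrate f around the unit circle, parametrized as z = exp(i t)."""
        t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
        z = np.exp(1j * t)
        dz = 1j * z * (2.0 * np.pi / n)     # dz = i exp(i t) dt
        return np.sum(f(z) * dz)

    print(contour_integral(lambda z: z**2))     # about 0: analytic inside the loop
    print(contour_integral(lambda z: 1.0 / z))  # about 2*pi*i: the singularity counts

That 2πi is the constant that keeps showing up whenever a loop wraps around a singularity.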

Those tricks were hard to master. Not because they were hard. Because they were easy, in a context where we expected hard. But after that we got into how to move singularities. That is, how to do a change of variables that moved the singularities to where they’re more convenient for some reason. How could this be more convenient? Because of chapter five, series. In regular old calculus we learn how to approximate well-behaved functions with polynomials. In complex-variable calculus, we learn the same thing all over again. They’re polynomials of complex-valued variables, but it’s the same sort of thing. And not just polynomials, but things that look like polynomials except they’re powers of \frac{1}{z} instead. These open up new ways to approximate functions, and to remove singularities from functions.

And then we get into transformations. These are about turning a problem that’s hard into one that’s easy. Or at least different. They’re a change of variable, yes. But they also change what exactly the function is. This reshuffles the problem. Makes for a change in singularities. Could make ones that are easier to work with.

One of the useful, and so common, transforms is called the Laplace-Stieltjes Transform. (“Laplace” is said like you might guess. “Stieltjes” is said, or at least we were taught to say it, like “Stilton cheese” without the “ton”.) And it tends to create functions that look like a series, the sum of a bunch of terms. Infinitely many terms. Each of those terms looks like a number times another number raised to some constant times ‘z’. As the course came to its conclusion, we were all prepared to think about these infinite series. Where singularities might be. Which of them might be removable.

These functions, these results of the Laplace-Stieltjes Transform, we collectively call ‘zeta functions’. There are infinitely many of them. Some of them are relatively tame. Some of them are exotic. One of them is world-famous. Professor Walsh — I don’t mean to name-drop, but I discovered the syllabus for the course tucked in the back of my textbook and I’m delighted to rediscover it — talked about it.

That world-famous one is, of course, the Riemann Zeta function. Yes, that same Riemann who keeps turning up, over and over again. It looks simple enough. Almost tame. Take the counting numbers, 1, 2, 3, and so on. Take your ‘z’. Raise each of the counting numbers to that ‘z’. Take the reciprocals of all those numbers. Add them up. What do you get?

A mass of fascinating results, for one. Functions you wouldn’t expect are concealed in there. There’s strips where the real part is zero. There’s strips where the imaginary part is zero. There’s points where both the real and imaginary parts are zero. We know infinitely many of them. If ‘z’ is -2, for example, the sum is zero. Also if ‘z’ is -4. -6. -8. And so on. These are easy to show, and so are dubbed ‘trivial’ zeroes. To say some are ‘trivial’ is to say that there are others that are not trivial. Where are they?

Professor Walsh explained. We know of many of them. The nontrivial zeroes we know of all share something in common. They have a real part that’s equal to 1/2. There’s a zero that’s at about the number \frac{1}{2} - \imath 14.13 . Also at \frac{1}{2} + \imath 14.13 . There’s one at about \frac{1}{2} - \imath 21.02 . Also about \frac{1}{2} + \imath 21.02 . (There’s a symmetry, you maybe guessed.) Every nontrivial zero we’ve found has that same real part, 1/2. But we don’t know that they all do. Nobody does. It is the Riemann Hypothesis, the great unsolved problem of mathematics. Much more important than that Fermat’s Last Theorem, which back then was still merely a conjecture.
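If you’d like to poke at these numbers yourself, the mpmath Python library will oblige. A small sketch; note that mpmath’s zeta does the analytic continuation for you, since the simple sum only converges when the real part of ‘z’ is bigger than 1.

    from mpmath import zeta, zetazero

    print(zeta(-2))      # 0.0, a trivial zero
    print(zeta(-4))      # 0.0, another trivial zero
    print(zetazero(1))   # roughly 0.5 + 14.1347i, the first nontrivial zero
    print(zetazero(2))   # roughly 0.5 + 21.0220i, the second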

What a prospect! What a promise! What a way to set us up for the final exam in a couple of weeks.

I had an inspiration, a kind of scheme of showing that a nontrivial zero couldn’t be within a given circular contour. Make the size of this circle grow. Move its center farther away from the z-coordinate \frac{1}{2} + \imath 0 to match. Show there’s still no nontrivial zeroes inside. And therefore, logically, since I would have shown nontrivial zeroes couldn’t be anywhere but on this special line, and we know nontrivial zeroes exist … I leapt enthusiastically into this project. A little less enthusiastically the next day. Less so the day after. And on. After maybe a week I went a day without working on it. But came back, now and then, prodding at my brilliant would-be proof.

The Riemann Zeta function was not on the final exam, which I’ve discovered was also tucked into the back of my textbook. It asked more things like finding all the singular points and classifying what kinds of singularities they were for functions like e^{-\frac{1}{z}} instead. If the syllabus is accurate, we got as far as page 218. And I’m surprised to see the professor put his e-mail address on the syllabus. It was merely “bwalsh@math”, but understand, the Internet was a smaller place back then.

I finished the course with an A-, but without answering any of the great unsolved problems of mathematics.

The Summer 2017 Mathematics A To Z: Young Tableau


I never heard of today’s entry topic three months ago. Indeed, three weeks ago I was still making guesses about just what Gaurish, author of For the love of Mathematics, was asking about. It turns out to be maybe the grand union of everything that’s ever been in one of my A To Z sequences. I overstate, but barely.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Young Tableau.

What a Young Tableau is, specifically, is beautiful in its simplicity. It could almost be a recreational mathematics puzzle, except that it isn’t challenging enough.

Start with a couple of boxes laid in a row. As many or as few as you like.

Now set another row of boxes. You can have as many as the first row did, or fewer. You just can’t have more. Set the second row of boxes — well, your choice. Either below the first row, or else above. I’m going to assume you’re going below the first row, and will write my directions accordingly. If you do things the other way you’re following a common enough convention. I’m leaving it on you to figure out what the directions should be, though.

Now add in a third row of boxes, if you like. Again, as many or as few boxes as you like. There can’t be more than there are in the second row. Set it below the second row.

And a fourth row, if you want four rows. Again, no more boxes in it than the third row had. Keep this up until you’ve got tired of adding rows of boxes.

How many boxes do you have? I don’t know. But take the numbers 1, 2, 3, 4, 5, and so on, up to whatever the count of your boxes is. Can you fill in one number for each box? So that the numbers are always increasing as you go left to right in a single row? And as you go top to bottom in a single column? Yes, of course. Go in order: ‘1’ for the first box you laid down, then ‘2’, then ‘3’, and so on, increasing up to the last box in the last row.

Can you do it in another way? Any other order?

Except for the simplest of arrangements, like a single row of four boxes or three rows of one box atop another, the answer is yes. There can be many of them, turns out. Seven boxes, arranged three in the first row, two in the second, one in the third, and one in the fourth, have 35 possible arrangements. It doesn’t take a very big diagram to get an enormous number of possibilities. Could be fun drawing an arbitrary stack of boxes and working out how many arrangements there are, if you have some time in a dull meeting to pass.
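There’s a tidy way to count the arrangements without drawing a thing: the hook length formula. It isn’t in the essay proper, so take this Python sketch as a supplement. The count is the factorial of the number of boxes, divided by the product of every box’s “hook”, meaning the box itself plus the boxes to its right plus the boxes below it.

    from math import factorial

    def count_tableaux(shape):
        """Count the standard Young tableaux of a given shape of rows."""
        n = sum(shape)
        hooks = 1
        for i, row_len in enumerate(shape):
            for j in range(row_len):
                below = sum(1 for r in shape[i + 1:] if r > j)   # boxes below
                right = row_len - j - 1                          # boxes to the right
                hooks *= right + below + 1
        return factorial(n) // hooks

    print(count_tableaux((3, 2, 1, 1)))   # 35, the seven-box example above
    print(count_tableaux((2,)))           # 1: a single row of two boxes
    print(count_tableaux((1, 1)))         # 1: two rows of one box each
    print(count_tableaux((2, 1)))         # 2: the three-box case coming up shortly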

Let me step away from filling boxes. In one of its later, disappointing, seasons Futurama finally did a body-swap episode. The gimmick: two bodies could only swap the brains within them one time. So would it be possible to put Bender’s brain back in his original body, if he and Amy (or whoever) had already swapped once? The episode drew minor amusement in mathematics circles, and a lot of amazement in pop-culture circles. The writer, a mathematics major, found a proof that showed it was indeed always possible, even after many pairs of people had swapped bodies. The idea that a theorem was created for a TV show impressed many people who think theorems are rarer and harder to create than they necessarily are.

It was a legitimate theorem, and in a well-developed field of mathematics. It’s about permutation groups. These are the study of the ways you can swap pairs of things. I grant this doesn’t sound like much of a field. There is a surprising lot of interesting things to learn just from studying how stuff can be swapped, though. It’s even of real-world relevance. Most subatomic particles of a kind — electrons, top quarks, gluons, whatever — are identical to every other particle of the same kind. Physics wouldn’t work if they weren’t. What would happen if we swap the electron on the left for the electron on the right, and vice-versa? How would that change our physics?

A chunk of quantum mechanics studies what kinds of swaps of particles would produce an observable change, and what kind of swaps wouldn’t. When the swap doesn’t make a change we can describe this as a symmetric operation. When the swap does make a change, that’s an antisymmetric operation. And — the Young Tableau that’s a single row of two boxes? That matches up well with this symmetric operation. The Young Tableau that’s two rows of a single box each? That matches up with the antisymmetric operation.

How many ways could you set up three boxes, according to the rules of the game? A single row of three boxes, sure. One row of two boxes and a row of one box. Three rows of one box each. How many ways are there to assign the numbers 1, 2, and 3 to those boxes, and satisfy the rules? One way to do the single row of three boxes. Also one way to do the three rows of a single box. There’s two ways to do the one-row-of-two-boxes, one-row-of-one-box case.

What if we have three particles? How could they interact? Well, all three could be symmetric with each other. This matches the first case, the single row of three boxes. All three could be antisymmetric with each other. This matches the three rows of one box. Or you could have two particles that are symmetric with each other and antisymmetric with the third particle. Or two particles that are antisymmetric with each other but symmetric with the third particle. Two ways to do that. Two ways to fill in the one-row-of-two-boxes, one-row-of-one-box case.

This isn’t merely a neat, aesthetically interesting coincidence. I wouldn’t spend so much time on it if it were. There’s a matching here that’s built on something meaningful. The different ways to arrange numbers in a set of boxes like this pair up with a select, interesting set of matrices whose elements are complex-valued numbers. You might wonder who introduced complex-valued numbers, let alone matrices of them, into evidence. Well, who cares? We’ve got them. They do a lot of work for us. So much work they have a common name, the “symmetric group over the complex numbers”. As my leading example suggests, they’re all over the place in quantum mechanics. They’re good to have around in regular physics too, at least in the right neighborhoods.

These Young Tableaus turn up over and over in group theory. They match up with polynomials, because yeah, everything is polynomials. But they turn out to describe polynomial representations of some of the superstar groups out there. Groups with names like the General Linear Group (square matrices), or the Special Linear Group (square matrices with determinant equal to 1), or the Special Unitary Group (that thing where quantum mechanics says there have to be particles whose names are obscure Greek letters with superscripts of up to five + marks). If you’d care for more, here’s a chapter by Dr Frank Porter describing, in part, how you get from Young Tableaus to the obscure baryons.

Porter’s chapter also lets me tie this back to tensors. Tensors have varied ranks, the number of different indices you can have on the things. What happens when you swap pairs of indices in a tensor? How many ways can you swap them, and what does that do to what the tensor describes? Please tell me you already suspect this is going to match something in Young Tableaus. They match by way of the symmetries and permutations mentioned above. But they are there.

As I say, three months ago I had no idea these things existed. If I ever ran across them it was from seeing the name at MathWorld’s list of terms that start with ‘Y’. The article shows some nice examples (with each row atop the previous one) but doesn’t make clear how much stuff this subject runs through. I can’t fit everything into a reasonable essay. (For example: the number of ways to arrange, say, 20 boxes into rows meeting these rules is itself a partition problem. Partition problems are probability and statistical mechanics. Statistical mechanics is the flow of heat, and the movement of the stars in a galaxy, and the chemistry of life.) I am delighted by what does fit.

The Summer 2017 Mathematics A To Z: X


We come now almost to the end of the Summer 2017 A To Z. Possibly also the end of all these A To Z sequences. Gaurish, of For the love of Mathematics, proposed that I talk about the obvious logical choice. The last promising thing I hadn’t talked about. I have no idea what to do for future A To Z’s, if they’re even possible anymore. But that’s a problem for some later time.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

X.

Some good advice that I don’t always take. When starting a new problem, make a list of all the things that seem likely to be relevant. Problems that are worth doing are usually about things. They’ll be quantities like the radius or volume of some interesting surface. The amount of a quantity under consideration. The speed at which something is moving. The rate at which that speed is changing. The length something has to travel. The number of nodes something must go across. Whatever. This all sounds like stuff from story problems. But most interesting mathematics is from a story problem; we want to know what this property is like. Even if we stick to a purely mathematical problem, there’s usually a couple of things that we’re interested in and that we describe. If we’re attacking the four-color map theorem, we have the number of territories to color. We have, for each territory, the number of territories that touch it.

Next, select a name for each of these quantities. Write it down, in the list, next to the term. The volume of the tank is ‘V’. The radius of the tank is ‘r’. The height of the tank is ‘h’. The fluid is flowing in at a rate, call it ‘q’. The fluid is flowing out at a rate, oh, let’s say ‘s’. And so on. You might take a moment to go through and think out which of these variables are connected to which other ones, and how. Volume, for example, is surely something to do with the radius times something to do with the height. It’s nice to have that stuff written down. You may not know the thing you set out to solve, but you at least know you’ve got this under control.

I recommend this. It’s a good way to organize your thoughts. It establishes what things you expect you could know, or could want to know, about the problem. It gives you some hint how these things relate to each other. It sets you up to think about what kinds of relationships you figure to study when you solve the problem. It gives you a lifeline, when you’re lost in a sea of calculation. It’s reassurance that these symbols do mean something. Better, it shows what those things are.

I don’t always do it. I have my excuses. If I’m doing a problem that’s very like one I’ve already recently done, the things affecting it are probably the same. The names to give these variables are probably going to be about the same. Maybe I’ll make a quick sketch to show how the parts of the problem relate. If it seems like less work to recreate my thoughts than to write them down, I skip writing them down. Not always good practice. I tell myself I can always go back and do things the fully right way if I do get lost. So far that’s been true.

So, the names. Suppose I am interested in, say, the length of the longest rod that will fit around this hallway corridor. Then I am in a freshman calculus book, yes. Fine. Suppose I am interested in whether this pinball machine can be angled up the flight of stairs that has a turn in it. Then I will measure things like the width of the pinball machine. And the width of the stairs, and of the landing. I will measure this carefully. Pinball machines are heavy and there are many hilarious sad stories of people wedging them into hallways and stairwells four and a half stories up from the street. But: once I have identified, say, ‘width of pinball machine’ as a quantity of interest, why would I ever refer to it as anything but?

This is no dumb question. It is always dangerous to lose the link between the thing we calculate and the thing we are interested in. Without that link we are less able to notice mistakes in either our calculations or the thing we mean to calculate. Without that link we can’t do a sanity check, the reassurance that catches us when we conclude we just might fit something 96 feet long around the corner. Or when we estimate that we could fit something of six square feet around the corner. It is common advice in programming computers to always give variables meaningful names. Don’t write ‘T’ when ‘Total’ or, better, ‘Total_Value_Of_Purchase’ is available. Why do we disregard this in mathematics, and switch to ‘T’ instead?

First reason is, well, try writing this stuff out. Your hand (h) will fall off (foff) in about fifteen minutes, twenty seconds. (15′ 20”). If you’re writing a program, the programming environment you have will auto-complete the variable after one or two letters in. Or you can copy and paste the whole name. It’s still good practice to leave a comment about what the variable should represent, if the name leaves any reasonable ambiguity.

Another reason is that sure, we do specific problems for specific cases. But a mathematician is naturally drawn to thinking of general problems, in abstract cases. We see something in common between the problem “a length and a quarter of the length is fifteen feet; what is the length?” and the problem “a volume plus a quarter of the volume is fifteen gallons; what is the volume?”. That one is about lengths and the other about volumes doesn’t concern us. We see a saving in effort by separating the quantity of a thing from the kind of the thing. This restores danger. We must think, after we are done calculating, about whether the answer could make sense. But we can minimize that, we hope. At the least we can check once we’re done to see if our answer makes sense. Maybe even whether it’s right.

For centuries, as the things we now recognize as algebra developed, we would use words. We would talk about the “thing” or the “quantity” or “it”. Some impersonal name, or convenient pronoun. This would often get shortened because anything you write often you write shorter. “Re”, perhaps. In the late 16th century we start to see the “New Algebra”. Here mathematics starts looking like … you know … mathematics. We start to see stuff like “addition” represented with the + symbol instead of an abbreviation for “addition” or a p with a squiggle over it or some other shorthand. We get equals signs. You start to see decimals and exponents. And we start to see letters used in place of numbers whose value we don’t know.

There are a couple kinds of “numbers whose value we don’t know”. One is the number whose value we don’t know, but hope to learn. This is the classic variable we want to solve for. Another kind is the number whose value we don’t know because we don’t care. I mean, it has some value, and presumably it doesn’t change over the course of our problem. But it’s not like our work will be so different if, say, the tank is two feet high rather than four.

Is there a problem? If we pick our letters to fit a specific problem, no. Presumably all the things we want to describe have some clear name, and some letter that best represents the name. It’s annoying when we have to consider, say, the pinball machine width and the corridor width. But we can work something out.

But what about general problems?

Is m b \cos(e) + b^2 \log(y) = \sqrt{e} an easy problem to solve?

If we want to figure what ‘m’ is, yes. Similarly ‘y’. If we want to know what ‘b’ is, it’s tedious, but we can do that. If we want to know what ‘e’ is? Run and hide, that stuff is crazy. If you have to, do it numerically and accept an estimate. Don’t try figuring what that is.
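You can see the asymmetry by handing the equation to a computer algebra system. A sketch with sympy, where ‘e’ is only a variable name, not Euler’s number:

    from sympy import symbols, cos, log, sqrt, solve, Eq

    m, b, e, y = symbols('m b e y')
    equation = Eq(m * b * cos(e) + b**2 * log(y), sqrt(e))

    print(solve(equation, m))   # one tidy expression: m is easy
    print(solve(equation, y))   # an exponential of a tidy expression: y is easy
    print(solve(equation, b))   # the quadratic formula, with mess: tedious but fine
    # Solving for e mixes cos(e) with sqrt(e). There is no closed form;
    # numerical root-finding is the honest approach, just as suggested above.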

And so we’ve developed conventions. There are some letters that, except in weird circumstances, are coefficients. They’re numbers whose value we don’t know, but either don’t care about or could look up. And there are some that, by default, are variables. They’re the ones whose value we want to know.

These conventions started forming, as mentioned, in the late 16th century. François Viète here made a name that lasts, among mathematics historians at least. His texts described how to do algebra problems in the sort of procedural methods that we would recognize as algebra today. And he had a great idea for these letters. Use the whole alphabet, if needed. Use the consonants to represent the coefficients, the numbers we know but don’t care what they are. Use the vowels to represent the variables, whose values we want to learn. So he would look at that equation and see right away: it’s a terrible mess. (I exaggerate. He doesn’t seem to have known the = sign, and I don’t know offhand when ‘log’ and ‘cos’ became common. But suppose the rest of the equation were translated into his terminology.)

It’s not a bad approach. Besides the mnemonic value of consonant-coefficient, vowel-variable, it’s true that we usually have fewer variables than anything else. The more variables in a problem the harder it is. If someone expects you to solve an equation with ten variables in it, you’re excused for refusing. So five or maybe six or possibly seven choices for variables is plenty.

But it’s not what we settled on. René Descartes had a better idea. He had a lot of them, but here’s one. Use the letters at the end of the alphabet for the unknowns. Use the letters at the start of the alphabet for coefficients. And that is, roughly, what we’ve settled on. In my example nightmare equation, we’d suppose ‘y’ to probably be the variable we want to solve for.

And so, and finally, x. It is almost the variable. It says “mathematics” in only two strokes. Even π takes more writing. Descartes used it. We follow him. It’s way off at the end of the alphabet. It starts few words, very few things, almost nothing we would want to measure. (Xylem … mass? Flow? What thing is the xylem anyway?) Even mathematical dictionaries don’t have much to say about it. The letter transports almost no connotations, no messy specific problems to it. If it suggests anything, it suggests the horizontal coordinate in a Cartesian system. It almost is mathematics. It signifies nothing in itself, but long use has given it an identity as the thing we hope to learn by study.

And pirate treasure maps. I don’t know when ‘X’ became the symbol of where to look for buried treasure. My casual reading suggests “never”. Treasure maps don’t really exist. Maps in general don’t work that way. Or at least didn’t before cartoons. X marking the spot seems to be the work of Robert Louis Stevenson, renowned for creating a fanciful map and then putting together a book to justify publishing it. (I jest. But according to Simon Garfield’s On The Map: A Mind-Expanding Exploration of the Way The World Looks, his map did get lost on the way to the publisher, and he had to re-create it from studying the text of Treasure Island. This delights me to no end.) It makes me wonder if Stevenson was thinking of x’s service in mathematics. But the advantages of x as a symbol are hard to ignore. It highlights a point clearly. It’s fast to write. Its use might be coincidence.

But it is a letter that does a needed job really well.

The Summer 2017 Mathematics A To Z: Well-Ordering Principle


It’s the last full week of the Summer 2017 A To Z! Four more essays and I’ll have completed this project and curl up into a word coma. But I’m not there yet. Today’s request is another from Gaurish, who’s given me another delightful topic to write about. Gaurish hosts a fine blog, For the love of Mathematics, which I hope you’ve given a try.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Well-Ordering Principle.

An old mathematics joke. Or paradox, if you prefer. What is the smallest whole number with no interesting properties?

Not one. That’s for sure. We could talk about one forever. It’s the first number we ever know. It’s the multiplicative identity. It divides into everything. It exists outside the realm of prime or composite numbers. It’s — all right, we don’t need to talk about one forever. Two? The smallest prime number. The smallest even number. The only even prime. The only — yeah, let’s move on. Three; the smallest odd prime number. Triangular number. One of only two prime numbers that isn’t one more or one less than a multiple of six. Let’s move on. Four. A square number. The smallest whole number that isn’t 1 or a prime. Five. Prime number. First sum of two prime numbers. Part of the first prime pair. Six. Smallest perfect number. Smallest product of two different prime numbers. Let’s move on.

And so on. Somewhere around 22 or so, the imagination fails and we can’t think of anything not-boring about this number. So we’ve found the first number that hasn’t got any interesting properties! … Except that being the smallest boring number must be interesting. So we have to note that this is otherwise the smallest boring number except for that bit where it’s interesting. On to 23, which used to be the default funny number. 24. … Oh, carry on. Maybe around 31 things settle down again. Our first boring number! Except that, again, being the smallest boring number is interesting. We move on to 32, 33, 34. When we find one that couldn’t be interesting, we find that’s interesting. We’re left to conclude there is no such thing as a boring number.

This would be a nice thing to say for numbers that otherwise get no attention, if we pretend they can have hurt feelings. But we do have to admit, 1729 is actually only interesting because it’s a part of the legend of Srinivasa Ramanujan. Enjoy the silliness for a few paragraphs more.

(This is, if I’m not mistaken, a form of the heap paradox. Don’t remember that? Start with a heap of sand. Remove one grain; you’ve still got a heap of sand. Remove one grain again. Still a heap of sand. Remove another grain. Still a heap of sand. And yet if you did this enough you’d leave one or two grains, not a heap of sand. Where does that change?)

Another problem, something you might consider right after learning about fractions. What’s the smallest positive number? Not one-half, since one-third is smaller and still positive. Not one-third, since one-fourth is smaller and still positive. Not one-fourth, since one-fifth is smaller and still positive. Pick any number you like and there’s something smaller and still positive. This is a difference between the positive integers and the positive real numbers. (Or the positive rational numbers, if you prefer.) The positive integers do have a smallest member. That seems obvious, but it is not a given.

The difference is that the positive integers are well-ordered, while the positive real numbers aren’t. Well-ordering we build on ordering. Ordering is exactly what you imagine it to be. Suppose you can say, for any two things in a set, which one is less than another. A set is well-ordered if whenever you have a non-empty subset you can pick out the smallest element. Smallest means exactly what you think, too.

The positive integers are well-ordered. And more. The way they’re set up, they have a property called the “well-ordering principle”. This means any non-empty set of positive integers has a smallest number in it.
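
Stated compactly, and hedging that every author’s notation differs a little: if S \subseteq \mathbb{Z}^{+} and S \neq \emptyset , then there is some m \in S with m \le n for every n \in S .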

This is one of those principles that seems so obvious and so basic that it can’t teach anything interesting. That it serves a role in some proofs, sure, that’s easy to imagine. But something important?

Look back to the joke/paradox I started with. It proves that every positive integer has to be interesting. Every number, including the ones we use every day. Including the ones that no one has ever used in any mathematics or physics or economics paper, and never will. We can avoid that paradox by attacking the vagueness of “interesting” as a word. Are you interested to know the 137th number you can write as the sum of cubes in two different ways? Before you say ‘yes’, consider whether you could name it ten days after you’ve heard the number.

(Granted, yes, it would be nice to know the 137th such number. But would you ever remember it? Would you trust that it’ll be on some Wikipedia page that somehow is never threatened with deletion for not being noteworthy? Be honest.)

But suppose we have some property that isn’t so mushy. Suppose that we can describe it in some way that’s indexed by the positive integers. Furthermore, suppose that we show that in any set of the positive integers it must be true for the smallest number in that set. What do we know?

We know that it must be true for all the positive integers. After all, each positive integer n is the smallest member of the subset that holds n and nothing else. The positive integers have this well-ordering principle. So any non-empty subset of the positive integers has some smallest member. And if we can show that something or other is always true for the smallest number in a subset of the positive integers, there you go.

This technique we call, when it’s introduced, induction. It’s usually a baffling subject because it’s usually taught like this: suppose the thing you want to show is indexed to the positive integers. Show that it’s true when the index is ‘1’. Show that if the thing is true for an arbitrary index ‘n’, then you know it’s true for ‘n + 1’. It’s baffling because that second part is hard to visualize. Students make a lot of mistakes while learning it, usually on examples like finding the sum of the first ‘N’ whole numbers, or of their squares, or of their cubes. I don’t think induction is ever taught by way of the well-ordering principle. But the principle does get used in proofs, once you get to the part of analysis where you don’t have to interact with actual specific numbers much anymore.
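
For the record, here is the usual way the two connect, a standard textbook sketch rather than anything novel. Suppose P(1) is true, and suppose P(n) always forces P(n + 1) . Let S be the set of positive integers where P fails. If S were non-empty, the well-ordering principle would hand us its least element m . That m can’t be 1. So m - 1 is a positive integer where P holds, and the induction step forces P(m) to hold too. Contradiction; S must be empty.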

The well-ordering principle also gives us the method of infinite descent. You encountered this in learning proofs about, like, how the square root of two must be an irrational number. In this, you show that if something is true for some positive integer, then it must also be true for some other, smaller positive integer. And therefore some other, smaller positive integer again. And again, until you get into numbers small enough you can check by hand.
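
The square root of two makes a tidy example of the descent. Suppose \sqrt{2} = \frac{a}{b} for positive integers, with b the smallest denominator possible; well-ordering promises there is a smallest, if any exist. Since 1 < \sqrt{2} < 2 we know b < a < 2b . And a little algebra shows \sqrt{2} = \frac{2b - a}{a - b} , where a - b is a positive integer smaller than b . That contradicts b being smallest, so no such fraction exists.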

It keeps creeping in. The Fundamental Theorem of Arithmetic says that every positive whole number larger than one is a product of a unique string of prime numbers. (Well, the order of the primes doesn’t matter. 2 times 3 times 5 is the same number as 3 times 2 times 5, and so on.) The well-ordering principle guarantees you can factor numbers into a product of primes. Watch this slick argument.

Suppose you have a set of whole numbers, none of them a product of prime numbers. There must, by the well-ordering principle, be some smallest number in that set. Call that number ‘n’. We know that ‘n’ can’t be prime, because if it were, then that would be its prime factorization. So it must be the product of at least two other numbers, each larger than one. Let’s suppose it’s two numbers. Call them ‘a’ and ‘b’. So, ‘n’ is equal to ‘a’ times ‘b’.

Well, ‘a’ and ‘b’ have to be less than ‘n’. So they’re smaller than the smallest number that isn’t a product of primes. So, ‘a’ is the product of some set of primes. And ‘b’ is the product of some set of primes. And so, ‘n’ has to equal the primes that factor ‘a’ times the primes that factor ‘b’. … Which is the prime factorization of ‘n’. So, ‘n’ can’t be in the set of numbers that don’t have prime factorizations. And so there can’t be any numbers that don’t have prime factorizations. It’s for the same reason we worked out there aren’t any numbers with nothing interesting to say about them.
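
The argument practically writes itself as a recursive program. Here’s a minimal Python sketch of my own, not anything from the essay, and the function name is mine; the recursion bottoms out exactly because the factors keep shrinking, which is the well-ordering principle doing quiet work.

def factor(n):
    # Return a list of primes whose product is n, for whole n >= 2.
    # Mirrors the proof: hunt for a nontrivial divisor 'a'. If none
    # exists, n is prime and is its own factorization. Otherwise
    # recurse on the two smaller pieces, which must already factor.
    for a in range(2, int(n ** 0.5) + 1):
        if n % a == 0:
            return factor(a) + factor(n // a)
    return [n]

print(factor(360))   # [2, 2, 2, 3, 3, 5]

Every recursive call works on a strictly smaller whole number, so the descent has to stop.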

And isn’t it delightful to find so simple a principle can prove such specific things?

The Summer 2017 Mathematics A To Z: Volume Forms


I’ve been reading Elke Stangl’s Elkemental Force blog for years now. Sometimes I even feel social-media-caught-up enough to comment, or at least to like posts. This is relevant today as I discuss one of Stangl’s suggestions for my letter-V topic.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Volume Forms.

So sometime in pre-algebra, or early in (high school) algebra, you start drawing equations. It’s a simple trick. Lay down a coordinate system, some set of axes for ‘x’ and ‘y’ and maybe ‘z’ or whatever letters are important. Look to the equation, made up of x’s and y’s and maybe z’s and so on. Highlight all the points with coordinates whose values make the equation true. This is the logical basis for saying (e.g.) that the straight line “is” y = 2x + 1 .

A short while later, you learn about polar coordinates. Instead of using ‘x’ and ‘y’, you have ‘r’ and ‘θ’. ‘r’ is the distance from the center of the universe. ‘θ’ is the angle made with respect to some reference axis. It’s as legitimate a way of describing points in space. Some classrooms even have a part of the blackboard (whiteboard, whatever) with a polar-coordinates “grid” on it. This looks like the lines of a dartboard. And you learn that some shapes are easy to describe in polar coordinates. A circle, centered on the origin, is ‘r = 2’ or something like that. A line through the origin is ‘θ = 1’ or whatever. The line that we’d called y = 2x + 1 before? … That’s … some mess. And now r = 2\theta + 1 … that’s not even a line. That’s some kind of spiral. Two spirals, really. Kind of wild.
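
For the record, the mess: substitute x = r \cos(\theta) and y = r \sin(\theta) into y = 2x + 1 and solve for r, and the familiar line turns into r = \frac{1}{\sin(\theta) - 2 \cos(\theta)} . Correct, but nobody’s idea of friendly.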

And something to bother you a while. y = 2x + 1 is an equation that looks the same as r = 2\theta + 1 . You’ve changed the names of the variables, but not how they relate to each other. But one is a straight line and the other a spiral thing. How can that be?

The answer, ultimately, is that the letters in the equations aren’t these content-neutral labels. They carry meaning. ‘x’ and ‘y’ imply looking at space a particular way. ‘r’ and ‘θ’ imply looking at space a different way. A shape has different representations in different coordinate systems. Fair enough. That seems to settle the question.

But if you get to calculus the question comes back. You can integrate over a region of space that’s defined by Cartesian coordinates, x’s and y’s. Or you can integrate over a region that’s defined by polar coordinates, r’s and θ’s. The first time you try this, you find … well, that any region easy to describe in Cartesian coordinates is painful in polar coordinates. And vice-versa. Way too hard. But if you struggle through all that symbol manipulation, you get … different answers. Eventually the calculus teacher has mercy and explains. If you’re integrating in Cartesian coordinates you need to use “dx dy”. If you’re integrating in polar coordinates you need to use “r dr dθ”. If you’ve never taken calculus, never mind what this means. What is important is that “r dr dθ” looks like three things multiplied together, while “dx dy” is two.

We get this explained as a “change of variables”. If we want to go from one set of coordinates to a different one, we have to do something fiddly. The extra ‘r’ in “r dr dθ” is what we get going from Cartesian to polar coordinates. And we get formulas to describe what we should do if we need other kinds of coordinates. It’s some work that introduces us to the Jacobian, which looks like the most tedious possible calculation ever at that time. (In Intro to Differential Equations we learn we were wrong, and the Wronskian is the most tedious possible calculation ever. This is also wrong, but it might as well be true.) We typically move on after this and count ourselves lucky it got no worse than that.
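
For polar coordinates the fiddly something works out in a few lines. With x = r \cos(\theta) and y = r \sin(\theta) , the Jacobian determinant is

\det \begin{pmatrix} \cos(\theta) & -r \sin(\theta) \\ \sin(\theta) & r \cos(\theta) \end{pmatrix} = r \cos^2(\theta) + r \sin^2(\theta) = r .

That determinant is the extra ‘r’: the dx \, dy becomes r \, dr \, d\theta .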

None of this is wrong, even from the perspective of more advanced mathematics. It’s not even misleading, which is a refreshing change. But we can look a little deeper, and get something good from doing so.

The deeper perspective looks at “differential forms”. These are about how to encode information about how your coordinate system represents space. They’re tensors. I don’t blame you for wondering if they would be. A differential form uses interactions between some of the directions in a space. A volume form is a differential form that uses all the directions in a space. And satisfies some other rules too. I’m skipping those because some of the symbols involved I don’t even know how to look up, much less make WordPress present.

What’s important is the volume form carries information compactly. As symbols it tells us that this represents a chunk of space that’s constant no matter what the coordinates look like. This makes it possible to do analysis on how functions work. It also tells us what we would need to do to calculate specific kinds of problems. This makes it possible to describe, for example, how something moving in space would change.
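
In the plane, for example, the volume form (an area form, really, there being only two directions to use) can be written \omega = dx \wedge dy . Rewrite that in polar coordinates and it becomes \omega = r \, dr \wedge d\theta . Same form, same chunks of space, different coordinates; the Jacobian business from before falls out automatically.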

The volume form, and the tools to do anything useful with it, demand a lot of supporting work. You can dodge having to explicitly work with tensors. But you’ll need a lot of tensor-related materials, like wedge products and exterior derivatives and stuff like that. If you’ve never taken freshman calculus don’t worry: the people who have taken freshman calculus never heard of those things either. So what makes this worthwhile?

Yes, person who called out “polynomials”. Good instinct. Polynomials are usually a reason for any mathematics thing. This is one of maybe four exceptions. I have to appeal to my other standard answer: “group theory”. These volume forms match up naturally with groups. There’s not only information about how coordinates describe a space to consider. There’s ways to set up coordinates that tell us things.

That isn’t all. These volume forms can give us new invariants. Invariants are what mathematicians say instead of “conservation laws”. They’re properties whose value for a given problem is constant. This can make it easier to work out how one variable depends on another, or to work out specific values of variables.

For example, classical physics problems like how a bunch of planets orbit a sun often have a “symplectic manifold” that matches the problem. This is a description of how the positions and momentums of all the things in the problem relate. The symplectic manifold has a volume form. That volume is going to be constant as time progresses. That is, there’s this way of representing the positions and speeds of all the planets that does not change, no matter what. It’s much like the conservation of energy or the conservation of angular momentum. And this has practical value. It’s the subject that brought my and Elke Stangl’s blogs into contact, years ago. It also has broader applicability.

There’s no way to provide an exact answer for the movement of, like, the sun and nine-ish planets and a couple major moons and all that. So there’s no known way to answer the question of whether the Earth’s orbit is stable. All the planets are always tugging one another, changing their orbits a little. Could this converge in a weird way suddenly, on geologic timescales? Might the planet go flying off out of the solar system? It doesn’t seem like the solar system could be all that unstable, or it would have come apart already. But we can’t rule out that some freaky alignment of Jupiter, Saturn, and Halley’s Comet might tweak the Earth’s orbit just far enough for catastrophe to unfold. Granted, there’s nothing we could do if the Earth did go flying out of the solar system. But it would be nice to know whether we face that, we tell ourselves.

But we can answer this numerically. We can set a computer to simulate the movement of the solar system. But there will always be numerical errors. For example, we can’t use the exact value of π in a numerical computation. 3.141592 (and more digits) might be good enough for projecting stuff out a day, a week, a thousand years. But if we’re looking at millions of years? The difference can add up. We can imagine compensating for not having the value of π exactly right. But what about compensating for something we don’t know precisely, like, where Jupiter will be in 16 million years and two months?

Symplectic forms can help us. The volume form on this space has to be conserved. So we can rewrite our simulation so that these forms are conserved, by design. This does not mean we avoid making errors. But it means we avoid making certain kinds of errors. We’re more likely to make what we call “phase” errors. We predict Jupiter’s location in 16 million years and two months. Our simulation puts it thirty degrees farther along in its orbit than it actually would be. This is a less serious mistake to make than putting Jupiter, say, eight-tenths as far from the Sun as it would really be.
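
To taste the difference, here’s a toy Python sketch of my own devising, nothing to do with any real solar-system code and with names I made up. It compares plain Euler integration with “semi-implicit” (symplectic) Euler on a frictionless spring, where the true energy should sit at 0.5 forever:

def plain_euler(q, p, dt, steps):
    # Explicit Euler for H = (p^2 + q^2)/2. It stretches phase-space
    # area a little every step, so the energy drifts ever upward.
    for _ in range(steps):
        q, p = q + dt * p, p - dt * q
    return q, p

def symplectic_euler(q, p, dt, steps):
    # Update p first, then q using the *new* p. This map preserves
    # phase-space area exactly, so the energy error stays bounded.
    for _ in range(steps):
        p = p - dt * q
        q = q + dt * p
    return q, p

def energy(q, p):
    return 0.5 * (q * q + p * p)

for method in (plain_euler, symplectic_euler):
    q, p = method(1.0, 0.0, 0.01, 100_000)
    print(method.__name__, energy(q, p))

The plain version’s energy climbs by a factor in the thousands. The symplectic version’s stays near 0.5, though the oscillator’s phase slowly slips. That’s the trade described above: phase errors instead of amplitude errors.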

Volume forms seem, at first, a lot of mechanism for a small problem. And, unfortunately for students, they are. They’re more trouble than they’re worth for changing Cartesian to polar coordinates, or similar problems. You know, ones that the student already has some feel for. They pay off on more abstract problems. Tracking the movement of a dozen interacting things, say, or describing a space that’s very strangely shaped. Those make the effort to learn about forms worthwhile.

The Summer 2017 Mathematics A To Z: Ulam’s Spiral


Gaurish, of For the love of Mathematics, asked me about one of those modestly famous (among mathematicians) mathematical figures. Yeah, I don’t have a picture of it. Too much effort. It’s easier to write instead.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Ulam’s Spiral.

Boredom is unfairly maligned in our society. I’ve said this before, but that was years ago, and I have some different readers today. We treat boredom as a terrible thing, something to eliminate. We treat it as a state in which nothing seems interesting. It’s not. Boredom is a state in which anything, however trivial, engages the mind. We would not count the tiles on the floor, or time the rocking of a chandelier, or wonder what fraction of solitaire games can be won if we were never bored. A bored mind is a mind ready to discover things. We should welcome the state.

Several times in the 20th century Stanislaw Ulam was bored. I mention solitaire games because, according to Ulam, he spent some time in 1946 bored, convalescent, and playing a lot of solitaire. He got to wondering: what’s the probability a particular solitaire game is winnable? (He was specifically playing Canfield solitaire. The game’s also called Demon, Chameleon, or Storehouse, if Wikipedia is right.) What’s the chance the cards can’t be played right, no matter how skilled the player is? It’s a problem impossible to solve exactly. Ulam, though, was one of the mathematicians designing and programming the computers of the day.

He, with John von Neumann, worked out how to get a computer to simulate many, many rounds of cards. They would get an answer that I have never seen given in any history of the field. The field is Monte Carlo simulations. It’s built on using random numbers to conduct experiments that approximate an answer. (They’re also what my specialty is in. I mention this for those who’ve wondered what, if any, mathematics field I do consider myself competent in. This is not it.) The chance of a winnable deal is about 71 to 72 percent, although actual humans can’t hope to do more than about 35 percent. My evening’s experience with this Canfield Solitaire game suggests the chance of winning is about zero.
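
Canfield itself takes more code than fits here, but the method shrinks to a few lines. A hedged sketch of my own, then, not Ulam and von Neumann’s program, and the function name is mine: estimate the chance a shuffled deck leaves no card in its starting position, a question with a known answer (it tends to 1/e, about 0.368) we can check against.

import random

def no_card_home(trials=100_000, size=52):
    # Monte Carlo in miniature: shuffle, count how often no card
    # lands back in its starting spot, report the fraction.
    deck = list(range(size))
    hits = 0
    for _ in range(trials):
        random.shuffle(deck)
        if all(card != spot for spot, card in enumerate(deck)):
            hits += 1
    return hits / trials

print(no_card_home())   # about 0.368

Run enough deals and the estimate settles toward the true value. That settling is the whole trick of the field.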

In 1963, Ulam told Martin Gardner, he was bored again during a paper’s presentation. Ulam doodled, and doodled something interesting enough to have a computer doodle more than mere pen and paper could. It was interesting enough to feature in Gardner’s Mathematical Games column for March 1964. It started with what the name suggested, a spiral.

Write down ‘1’ in the center. Write a ‘2’ next to it. This is usually done to the right of the ‘1’. If you want the ‘2’ to be on the left, or above, or below, fine, it’s your spiral. Write a ‘3’ above the ‘2’. (Or below if you want, or left or right if you’re doing your spiral that way. You’re tracing out a right angle from the “path” of numbers before that.) A ‘4’ to the left of that, a ‘5’ under that, a ‘6’ under that, a ‘7’ to the right of that, and so on. A spiral, for as long as your paper or your patience lasts. Now draw a circle around the ‘2’. Or a box. Whatever. Highlight it. Also do this for the ‘3’, and the ‘5’, and the ‘7’, and every other prime number on your spiral. And look at what’s highlighted.

It looks like …

It’s …

Well, it’s something.

It’s hard to say what exactly. There’s a lot of diagonal lines to it. Not uninterrupted lines. Every diagonal line has some spottiness to it. There are blank regions too. There are some long stretches of numbers not highlighted, many of them horizontal or vertical lines with no prime numbers in them. Those stop too. The eye can’t help seeing clumps, especially. Imperfect diagonal stitching across the fabric of the counting numbers.

Maybe seeing this is some fluke. Start with another number in the center. 2, if you like. 41, if you feel ambitious. Repeat the process. The details vary. But the pattern looks much the same. Regions of dense-packed broken diagonals, all over the plane.
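
If you’d rather let a computer do the doodling, here’s a small Python sketch of my own that draws the spiral in text, a ‘#’ for each prime; the function names are mine. The construction is the one described above: runs of length 1, 1, 2, 2, 3, 3, and so on, with a right-angle turn after each run.

def is_prime(n):
    # Trial division; slow, but fine at doodle scale.
    if n < 2:
        return False
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return False
    return True

def ulam_spiral(size):
    # Fill a size-by-size grid (size odd, so '1' sits in the center)
    # by spiraling outward, marking primes '#' and the rest '.'.
    grid = [['.'] * size for _ in range(size)]
    x = y = size // 2
    dx, dy = 1, 0
    n, run = 1, 1
    while True:
        for _ in range(2):              # two runs per run length
            for _ in range(run):
                if not (0 <= x < size and 0 <= y < size):
                    return grid         # stepped off the page: done
                if is_prime(n):
                    grid[y][x] = '#'
                n += 1
                x, y = x + dx, y + dy
            dx, dy = -dy, dx            # turn ninety degrees
        run += 1

for row in ulam_spiral(41):
    print(''.join(row))

The broken diagonals show up even at this size.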

It begs us to believe there’s some knowable pattern here. That we could get an artist to draw a figure, with each spot in the figure corresponding to a prime number. This would be great. We know many things about prime numbers, but we don’t really have any system to generate a lot of prime numbers. Not much better than “here’s a thing, try dividing it”. Back in the 80s and 90s we had the big Fractal Boom. Everybody got computers that could draw what passed for pictures. And we could write programs that drew them. The Ulam Spiral was a minor but exciting prospect there. Was it a fractal? I don’t know. I’m not sure if anyone knows. (The spiral like you’d draw on paper wouldn’t be. The spiral that went out to infinitely large numbers might conceivably be.) It seemed plausible enough for computing magazines to be interested in. Maybe we could describe the pattern by something as simple as the Koch curve (that wriggly triangular snowflake shape). Or as easy to program as the Mandelbrot set.

We haven’t found one. As keeps happening with prime numbers, the answers evade us. We can understand why diagonals should appear. Write a polynomial of the form 4n^2 + b n + c . Evaluate it for n of 1, 2, 3, 4, and so on. Highlight those numbers. This will tend to highlight numbers that, in this spiral, are diagonal or horizontal or vertical lines. A lot of polynomials like this give a string of some prime numbers. But the polynomials all peter out. The lines all have interruptions.
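
The classic example here, famous though not part of the spiral story: Euler’s polynomial n^2 + n + 41 comes out prime for every n from 0 through 39. Then it quits: 40^2 + 40 + 41 = 1681 = 41^2 . And every non-constant polynomial with whole-number coefficients quits the same way eventually.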

There are other patterns. One, predating Ulam’s boring paper by thirty years, was made by Laurence Klauber. Klauber was a herpetologist of some renown, if Wikipedia isn’t misleading me. It claims his Rattlesnakes: Their Habits, Life Histories, and Influence on Mankind is still an authoritative text. I don’t know and will defer to people versed in the field. It also credits him with several patents in electrical power transmission.

Anyway, Klauber’s Triangle sets a ‘1’ at the top of the triangle. The numbers ‘2 3 4’ under that, with the ‘3’ directly beneath the ‘1’. The numbers ‘5 6 7 8 9’ beneath that, the ‘7’ directly beneath the ‘3’. ‘10 11 12 13 14 15 16’ beneath that, the ‘13’ underneath the ‘7’. And so on. Again highlight the prime numbers. You get again these patterns of dots and lines. Many vertical lines. Some lines in isometric view. It looks like strands of Morse Code.

In 1994 Robert Sacks created another variant. This one places the counting numbers on an Archimedean spiral. Space the numbers correctly and highlight the primes. The primes will trace out broken curves. Some are radial. Some spiral in (or out, if you rather). Some open up islands. The pattern looks like a Saul Bass logo for a “Nifty Fifty”-era telecommunications firm or maybe an airline.

You can do more. Draw a hexagonal spiral. Triangular ones. Other patterns of laying down numbers. You get patterns. The eye can’t help seeing order there. We can’t quite pin down what it is. Prime numbers keep evading our full understanding. Perhaps it would help to doodle a little during a tiresome conference call.


Stanislaw Ulam did enough fascinating numerical mathematics that I could probably do a sequence just on his work. I do want to mention one thing. It’s part of information theory. You know the game Twenty Questions. Play that, but allow for some lying. The game is still playable. Ulam did not invent this game; Alfréd Rényi did. (I do not know anything else about Rényi.) But Ulam ran across Rényi’s game, and pointed out how interesting it was, and mathematicians paid attention to him.