What I Wrote About In My Little 2021 Mathematics A to Z

It’s good to have an index of the topics I wrote about for each of my A-to-Z sequences. It’s good for me, at least. It makes my future work much easier. And it might help people find past essays. I hope to have my essay about what I learned from a project that was supposed to be nearly one-third shorter, and ended up sprawling past its designated year, next week.

All of the Little 2021 Mathematics A-to-Z essays should be at this link. And gathered at this link should be all of the A-to-Z essays from all past years. Thank you for your reading.

My Little 2021 Mathematics A-to-Z: Zorn’s Lemma

The joke to which I alluded last week was a quick pun. The setup is, “What is yellow and equivalent to the Axiom of Choice?” It’s the topic for this week, and the conclusion of the Little 2021 Mathematics A-to-Z. I again thank Mr Wu, of Singapore Maths Tuition, for a delightful topic.

Zorn’s Lemma

Max Zorn did not name it Zorn’s Lemma. You expected that. He thought of it just as a Maximal Principle when introducing it in a 1934 presentation and 1935 paper. The word “lemma” connotes that some theorem is a small thing. It usually means it’s used to prove some larger and more interesting theorem. Zorn’s Lemma is one of those small things. With the right background, a rigorous proof is a couple not-too-dense paragraphs. Without the right background? It’s one of those proofs you read the statement of and nod, agreeing, that sounds reasonable.

The lemma is about partially ordered sets. A set’s partially ordered if it has a relationship between pairs of items in it. You will sometimes see a partially ordered set called a “poset”, a term of mathematical art which makes me smile too. If we don’t know anything about the ordering relationship we’ll use the ≤ symbol, just as if these were ordinary numbers. To be partially ordered, whenever x ≤ y and y ≤ x, we know that x and y must be equal. And the converse: if x = y then x ≤ y and y ≤ x. What makes this partial is that we’re not guaranteed that every x and y relate in some way. It’s a totally ordered set if we’re guaranteed that at least one of x ≤ y and y ≤ x is always true. And then there is such a thing as a well-ordered set. This is a totally ordered set for which every subset (unless it’s empty) has a minimal element.
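For reference, the conditions usually asked of an ordering can be written out compactly. This is the standard textbook list, which also includes transitivity, the property that lets us string comparisons together. The letters $x$, $y$, and $z$ stand for any elements of the set.

```latex
\begin{align*}
&\text{Reflexive:} && x \le x \\
&\text{Antisymmetric:} && x \le y \text{ and } y \le x \implies x = y \\
&\text{Transitive:} && x \le y \text{ and } y \le z \implies x \le z \\
&\text{Total (only for a totally ordered set):} && x \le y \text{ or } y \le x
\end{align*}
```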

If we have a couple elements, each of which we can put in some order, then we can create a chain. If x ≤ y and y ≤ z, then we can write x ≤ y ≤ z and we have at least three things all relating to one another. This seems like stuff too basic to notice, if we think too literally about the relationship being “is less than or equal to”. If the relationship is, say, “divides wholly into”, then we get some interesting different chains. Like, 2 divides into 4, which divides into 8, which divides into 24. And 3 divides into 6 which divides into 24. But 2 doesn’t divide into 3, nor 3 into 2. 4 doesn’t divide into 6, nor 6 into either 8 or 4.

So what Zorn’s Lemma says is, if all the chains in a partially ordered set each have an upper bound, then, the partially ordered set has a maximal element. “Maximal element” here means an element that doesn’t have a bigger comparable element. (That is, m is maximal if there’s no other element b for which m ≤ b. It’s possible that m and b can’t be compared, though, the way 6 doesn’t divide 8 and 8 doesn’t divide 6.) This is a little different from a “maximum”. It’s possible for there to be several maximal elements. But if you parse this as “if you can always find a maximum in a string of elements, there’s some maximum element”? And remember there could be many maximums? Then you’re getting the point.
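To make the difference between “maximal” and “maximum” concrete, here is a small sketch in Python. It orders a finite set by “divides wholly into”. The particular set {2, 3, 4, 6, 8} and the function names are illustrative choices of mine, not anything from Zorn. A finite set doesn’t need Zorn’s Lemma, of course; this only shows what the words mean.

```python
def divides(a, b):
    """The ordering relation: a <= b means a divides wholly into b."""
    return b % a == 0


def maximal_elements(elements, leq):
    """Return the elements m with no other element b satisfying m <= b."""
    return [m for m in elements
            if not any(b != m and leq(m, b) for b in elements)]


def maximum_element(elements, leq):
    """Return the one element that every element is <=, or None if there is none."""
    for m in elements:
        if all(leq(x, m) for x in elements):
            return m
    return None


numbers = [2, 3, 4, 6, 8]
print(maximal_elements(numbers, divides))  # [6, 8]
print(maximum_element(numbers, divides))   # None

with_24 = numbers + [24]
print(maximal_elements(with_24, divides))  # [24]
print(maximum_element(with_24, divides))   # 24
```

Without 24 in the set, both 6 and 8 are maximal and nothing is a maximum. Put 24 in and it becomes the single maximal element, and the maximum too.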

You may also ask how this could be interesting. Zorn’s Lemma is an existence proof. Most existence proofs assure us a thing we thought existed does, but don’t tell us how to find it. This is all right. We tend to rely on an existence proof when we want to talk about some mathematical item but don’t care about fussy things like what it is. It is much the way we might talk about “an odd perfect number N”. We can describe interesting things that follow from having such a number even before we know what value N has.

A classic example, the one you find in any discussion of using Zorn’s Lemma, is about the basis for a vector space. This is like deciding how to give directions to a point in space. But vector spaces include some quite abstract things. One vector space is “the set of all functions you can integrate”. Another is “matrices whose elements are all four-dimensional rotations”. There might be literally infinitely many “directions” to go. How do we know we can find a set of directions that work as well as, for guiding us around a city, the north-south-east-west compass rose does? Zorn’s Lemma is the answer: it guarantees that every vector space has a basis. There are other things done all the time, too. A nontrivial ring-with-identity, for example, has to have a maximal ideal. (An ideal is a subset of the ring that’s still a ring.) This is handy to know if you’re working with rings a lot.

The joke in my prologue was built on the claim Zorn’s Lemma is equivalent to the Axiom of Choice. The Axiom of Choice is a piece of set theory that surprised everyone by being independent of the Zermelo-Fraenkel axioms. The Axiom says that, if you have a collection of disjoint nonempty sets, then there must exist at least one set with exactly one element from each of those sets. That is, you can pick one thing out of each of a set of bins. It’s easy to see what this has in common with Zorn’s Lemma: it seems too obvious to imagine proving. That’s the sort of thing that makes a good axiom. Thing about a lemma, though, is we do prove it. That’s how we know it’s a lemma. How can a lemma be equivalent to an axiom?

I’ll argue by analogy. In Euclidean geometry one of the axioms is this annoying statement about on which side of a line two other lines that intersect it will meet. If you have this axiom, you can prove some nice results, like, the interior angles of a triangle add up to two right angles. If you decide you’d rather make your axiom that bit about the interior angles adding up? You can go from that to prove the thing about two lines crossing a third line.

So it is here. If you suppose the Axiom of Choice is true, you can get Zorn’s Lemma: you can pick an element in your set, find a chain for which that’s the minimum, and find your maximal element from that. If you make Zorn’s Lemma your axiom? You can use x ≤ y to mean “x is a less desirable element to pick out of this set than is y”. And then you can choose a maximal element out of your set. (It’s a bit more work than that, but it’s that kind of work.)

There’s another theorem, or principle, that’s (with reservations) equivalent to both Zorn’s Lemma and the Axiom of Choice. It’s another piece that seems so obvious it should defy proof. This is the well-ordering theorem, which says that every set can be well-ordered. That is, so that every non-empty subset has some minimum element. Finally, a mathematical excuse for why we have alphabetical order, even if there’s no clear reason that “j” should come after “i”.

(I said “with reservations” above. This is because whether these are equivalent depends on what, precisely, kind of deductive logic you’re using. If you are not using ordinary first-order logic, and are using a “second-order logic” instead, they differ.)

Ernst Zermelo introduced the Axiom of Choice to set theory so that he could prove this in a way that felt reasonable. I bet you can imagine how you’d go from “every non-empty set has a minimum element” right back to “you can always pick one element of every set”, though. And, maybe if you squint, can see how to get from “there’s always a minimum” to “there has to be a maximum”. I’m speaking casually here because proving it precisely is more work than we need to do.

I mentioned how Zorn did not name his lemma after himself. Mathematicians typically don’t name things for themselves. Nor did he even think of it as a lemma. His name seems to have adhered to the principle in the late 30s. Credit the nonexistent mathematician Bourbaki writing about “le théorème de Zorn”. By 1940 John Tukey, celebrated for the Fast Fourier Transform, wrote of “Zorn’s Lemma”. Tukey’s impression was that this is how people in Princeton spoke of it at the time. He seems to have been the first to put the words “Zorn’s Lemma” in print, though. Zorn isn’t the first to have stated this. Kazimierz Kuratowski, in 1922, described what is clearly Zorn’s Lemma in a different form. Zorn remembered being aware of Kuratowski’s publication but did not remember noticing the property. The Hausdorff Maximal Principle, of Felix Hausdorff, has much the same content. Zorn said he did not know about Hausdorff’s 1927 paper until decades later.

Zorn’s lemma, the Axiom of Choice, the well-ordering theorem, and Hausdorff’s Maximal Principle all date to the early 20th century. So do a handful of other ideas that turn out to be equivalent. This was an era when set theory saw an explosive development of new and powerful ideas. The point of describing this chain is to emphasize that great concepts often don’t have a unique presentation. Part of the development of mathematics is picking through several quite similar expressions of a concept. Which one do we enshrine as an axiom, or at least the canonical presentation of the idea?

We have to choose.

And with this I at last declare the hard work of the Little 2021 Mathematics A-to-Z at an end. I plan to follow up, as traditional, with a little essay about what I learned while doing this project. All of the Little 2021 Mathematics A-to-Z essays should be at this link. And then all of the A-to-Z essays from all eight projects should be at this link. Thank you so much for your support in these difficult times.

My Little 2021 Mathematics A-to-Z: Ordinary Differential Equations

Mr Wu, my Singapore Maths Tuition friend, has offered many fine ideas for A-to-Z topics. This week’s is another of them, and I’m grateful for it.

Ordinary Differential Equations

As a rule, if you can do something with a number, you can do the same thing with a function. Not always, of course, but the exceptions are fewer than you might imagine. I’ll start with one of those things you can do to both.

A powerful thing we learn in (high school) algebra is that we can use a number without knowing what it is. We give it a name like ‘x’ or ‘y’ and describe what we find interesting about it. If we want to know what it is, we (usually) find some equation or set of equations and find what value of x could make that true. If we study enough (college) mathematics we learn its equivalent in functions. We give something a name like f or g or Ψ and describe what we know about it. And then try to find functions which make that true.

There are a couple common types of equation for these not-yet-known functions. The kind you expect to learn as a mathematics major involves differential equations. These are ones where your equation (or equations) involve derivatives of the not-yet-known f. A derivative describes the rate at which something changes. If we imagine the original f is a position, the derivative is velocity. Derivatives can have derivatives also; this second derivative would be the acceleration. And then second derivatives can have derivatives also, and so on, into infinity. When an equation involves a function and its derivatives we have a differential equation.

(The second common type is the integral equation, using a function and its integrals. And a third involves both derivatives and integrals. That’s known as an integro-differential equation, and isn’t life complicated enough?)

Differential equations themselves naturally divide into two kinds, ordinary and partial. They serve different roles. Usually, with an ordinary differential equation, we can describe the change from knowing only the current situation. (This may include velocities and accelerations and stuff. We could ask what the velocity at an instant means. But never mind that here.) Usually a partial differential equation bases the change where you are on the neighborhood of your location. If you see holes you can pick in that, you’re right. The precise difference is about the independent variables. If the function f has more than one independent variable, it’s possible to take a partial derivative. This describes how f changes if one variable changes while the others stay fixed. If the function f has only the one independent variable, you can only take ordinary derivatives. So you get an ordinary differential equation.
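A pair of standard examples may make the distinction concrete. These are illustrations of mine rather than anything special to this essay. The first is the equation of simple harmonic motion, with the one independent variable $t$. The second is the one-dimensional heat equation, with independent variables $t$ and $x$. I assume $\omega$ and $k$ are fixed positive constants.

```latex
\begin{align*}
\text{Ordinary:} \quad & \frac{d^2 f}{dt^2} = -\omega^2 f(t) \\
\text{Partial:} \quad & \frac{\partial u}{\partial t} = k \, \frac{\partial^2 u}{\partial x^2}
\end{align*}
```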

But let’s speak casually here. If what you’re studying can be fully represented with a dashboard readout? Like, an ordered list of positions and velocities and stuff? You probably have an ordinary differential equation. If you need a picture with a three-dimensional surface or a color map to understand it? You probably have a partial differential equation.

One more metaphor. If you can imagine the thing you’re modeling as a marble rolling around on a hilly table? Odds are that’s an ordinary differential equation. And that representation covers a lot of interesting problems. Marbles on hills, obviously. But also rigid pendulums: we can treat the angle a pendulum makes and the rate at which that angle changes as dimensions of space. The pendulum’s swinging then matches exactly a marble rolling around the right hilly table. Planets in space, too. We need more dimensions (three space dimensions and three velocity dimensions) for each planet. So, like, the Sun-Earth-and-Moon would be rolling around a hilly table with 18 dimensions. That’s all right. We don’t have to draw it. The mathematics works about the same. Just longer.

[ To be precise we need three momentum dimensions for each orbiting body. If they’re not changing mass appreciably, and not moving too near the speed of light, velocity is just momentum times a constant number, so we can use whichever is easier to visualize. ]

We mostly work with ordinary differential equations of either the first or the second order. First order means we have first derivatives in the equation, but never have to deal with more than the original function and its first derivative. Second order means we have second derivatives in the equation, but never have to deal with more than the original function or its first or second derivatives. You’ll never guess what a “third order” differential equation is unless you have experience in reading words. There are some reasons we stick to these low orders like first and second, though. One is that we know of good techniques for solving most first- and second-order ordinary differential equations. For higher-order differential equations we often use techniques that find a related normal old polynomial. Its solution helps with the thing we want. Or we break a high-order differential equation into a set of low-order ones. So yes, again, we search for answers where the light is good. But the good light covers many things we like to look at.
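As a sketch of breaking a higher-order equation into low-order ones, take the rigid pendulum. Here $g$ is the acceleration of gravity and $\ell$ the pendulum’s length, both assumed constant, and $\omega$ is a name I’m introducing for the angular velocity. Naming it turns one second-order equation into two first-order ones:

```latex
\begin{align*}
\ddot{\theta} = -\frac{g}{\ell} \sin\theta
\quad \Longleftrightarrow \quad
\begin{cases}
\dot{\theta} = \omega \\[4pt]
\dot{\omega} = -\dfrac{g}{\ell} \sin\theta
\end{cases}
\end{align*}
```

The pair $(\theta, \omega)$ is exactly the sort of dashboard readout that marks an ordinary differential equation.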

There’s simple harmonic motion, for example. It covers pendulums and springs and perturbations around stable equilibriums and all. This turns out to cover so many problems that, as a physics major, you get a little sick of simple harmonic motion. There’s the Airy function, which started out to describe the rainbow. It turns out to describe particles trapped in a triangular quantum well. The van der Pol equation, about systems where a small oscillation gets energy fed into it while a large oscillation gets energy drained. All kinds of exponential growth and decay problems. Very many functions where pairs of particles interact.

This doesn’t cover everything we would like to do. That’s all right. Ordinary differential equations lend themselves to numerical solutions. It requires considerable study and thought to do these numerical solutions well. But this doesn’t make the subject unapproachable. Few of us could animate the “Pink Elephants on Parade” scene from Dumbo. But could you draw a flip book of two stick figures tossing a ball back and forth? If you’ve had a good rest, a hearty breakfast, and have not listened to the news yet today, so you’re in a good mood?

The flip book ball is a decent example here, too. The animation will look good if the ball moves about the “right” amount between pages. A little faster when it’s first thrown, a bit slower as it reaches the top of its arc, a little faster as it falls back to the catcher. The ordinary differential equation tells us how fast our marble is rolling on this hilly table, and in what direction. So we can calculate how far the marble needs to move, and in what direction, to make the next page in the flip book.

Almost. The rate at which the marble should move will change, in the interval between one flip-book page and the next. The difference, the error, may not be much. But there is a difference between the exact and the numerical solution. Well, there is a difference between a circle and a regular polygon. We have many ways of minimizing and estimating and controlling the error. Doing that is what makes numerical mathematics the high-paid professional industry it is. Our game of catch we can verify by flipping through the book. With the motion of four dozen planets and moons attracting one another, it is harder to be sure we calculate it right.
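The flip book can be sketched in a few lines of Python. This uses Euler’s method, the simplest numerical scheme and not one anybody would pick for serious work. The starting height of 1.5 metres, the upward throw of 5 metres per second, and the function names are all assumptions of mine for illustration.

```python
G = 9.8  # gravitational acceleration in m/s^2 (assumed value)


def euler_flip_book(height, velocity, dt, pages):
    """Advance a tossed ball page by page using Euler's method.

    Each page assumes the velocity stays fixed for the whole interval dt,
    which is the source of the error between pages.
    """
    frames = [(0.0, height)]
    for page in range(1, pages + 1):
        height = height + velocity * dt   # move at the current rate
        velocity = velocity - G * dt      # then update the rate itself
        frames.append((page * dt, height))
    return frames


def exact_height(height0, velocity0, t):
    """Exact solution of h'' = -G with the same starting values."""
    return height0 + velocity0 * t - 0.5 * G * t * t


for dt in (0.1, 0.01):
    pages = round(1.0 / dt)
    t, h = euler_flip_book(1.5, 5.0, dt, pages)[-1]
    error = abs(h - exact_height(1.5, 5.0, t))
    print(f"dt={dt}: height at t={t:.2f} is {h:.3f}, error {error:.3f}")
```

For this scheme and this problem the error after one second shrinks roughly in proportion to the time between pages. More pages, better animation, more drawing.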

I said at the top that most anything one can do with numbers one can do with functions also. I would like to close the essay with some great parallel. Like, the way that trying to solve cubic equations made people realize complex numbers were good things to have. I don’t have a good example like that for ordinary differential equations, where the study expanded our ideas of what functions could be. Part of that is that complex numbers are more accessible than the stranger functions. Part of that is that complex numbers have a story behind them. The story features titanic figures like Gerolamo Cardano, Niccolò Tartaglia and Ludovico Ferrari. We see some awesome and weird personalities in 19th century mathematics. But their fights are generally harder to watch from the sidelines and cheer on. And part is that it’s easier to find pop historical treatments of the kinds of numbers. The historiography of what a “function” is is a specialist occupation.

But I can think of a possible case. A tool that’s sometimes used in solving ordinary differential equations is the “Dirac delta function”. Yes, that Paul Dirac. It’s a weird function, written as $\delta(x)$. It’s equal to zero everywhere, except where $x$ is zero. When $x$ is zero? It’s … we don’t talk about what it is. Instead we talk about what it can do. The integral of that Dirac delta function times some other function can equal that other function at a single point. It strains credibility to call this a function the way we speak of, like, $\sin(x)$ or $\sqrt{x^2 + 4}$ being functions. Many will classify it as a distribution instead. But it is so useful, for a particular kind of problem, that it’s impossible to throw away.
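In symbols, that integral property is usually written as follows. Here $f$ is any function continuous at the point $a$; both names are just placeholders.

```latex
\int_{-\infty}^{\infty} \delta(x - a) \, f(x) \, dx = f(a)
```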

So perhaps the parallels between numbers and functions extend that far. Ordinary differential equations can make us notice kinds of functions we would not have seen otherwise.

And with this — I can see the much-postponed end of the Little 2021 Mathematics A-to-Z! You can read all my entries for 2021 at this link, and if you’d like can find all my A-to-Z essays here. How will I finish off the shortest yet most challenging sequence I’ve done yet? Will it be yellow and equivalent to the Axiom of Choice? Answers should come, in a week, if all starts going well.

My Little 2021 Mathematics A-to-Z: Tangent Space

And now, finally, I resume and hopefully finish what was meant to be a simpler and less stressful A-to-Z for last year. I’m feeling much better about my stress loads now and hope that I can soon enjoy the feeling of having a thing accomplished.

This topic is one of many suggestions that Elkement, one of my longest blog-friendships here, offered. It’s a creation that sent me back to my grad school textbooks, some of those slender paperback volumes with tiny, close-set type that turn out to be far more expensive than you imagine. Though not in this case: my most useful reference here was V I Arnold’s Ordinary Differential Equations, stamped inside as costing $18.75. The field is full of surprises. Another wonderful reference was this excellent set of notes prepared by Jodin Morey. They would have done much to help me through that class.

Tangent Space

Stand in midtown Manhattan, holding a map of midtown Manhattan. You have — not a tangent space, not yet. A tangent plane, representing the curved surface of the Earth with the flat surface of your map, though. But the tangent space is near: see how many blocks you must go, along the streets and the avenues, to get somewhere. Four blocks north, three west. Two blocks south, ten east. And so on. Those directions, of where you need to go, are the tangent space around you.

There is the first trick in tangent spaces. We get accustomed, early in learning calculus, to think of tangent lines and then of tangent planes. These are nice, flat approximations to some original curve. But while we’re introduced to the tangent space, and first learn examples of it, as tangent planes, we don’t stay there. There are several ways to define tangent spaces. One recasts tangent spaces in group theory terms, describing them as a ring based on functions that are equal to zero at the tangent point. (To be exact, it’s an ideal, based on a quotient group, based on two sets of such functions.)

That’s a description mathematicians are inclined to like, not only because it’s far harder to imagine than a map of the city is. But this ring definition describes the tangent space in terms of what we can do with it, rather than how to calculate finding it. That tends to appeal to mathematicians. And it offers surprising insights. Cleverer mathematicians than I am notice how this makes tangent spaces very close to Lagrange multipliers. Lagrange multipliers are a technique to find the maximum of a function subject to a constraint from another function. They seem to work by magic, and tangent spaces will echo that.

I’ll step back from the abstraction. There are relevant observations to make from this map of midtown. The directions “four blocks north, three west” do not represent any part of Manhattan. They describe a way you might move in Manhattan, yes. But you could move in that direction from many places in the city. And you could go four blocks north and three west if you were in any part of any city with a grid of streets. The tangent space is a vector space, with elements that are velocities at a tangent point.

The tangent space is less a map showing where things are and more one of how to get to other places, closer to a subway map than a literal one. Still, the topic is steeped in the language of maps. I’ll find it a useful metaphor too. We do not make a map unless we want to know how to find something. So the interesting question is what do we try to find in these tangent spaces?

There are several routes to tangent spaces. The one I’m most familiar with is through dynamical systems. These are typically physics-driven, sometimes biology-driven, problems. They describe things that change in time according to ordinary differential equations. Physics problems particularly are often about things moving in space. Space, in dynamical systems, becomes “phase space”, an abstract universe spanned by all of the possible values of the variables. The variables are, usually, the positions and momentums of the particles (for a physics problem). Sometimes time and energy appear as variables. In biology variables are often things that represent populations. The role the Earth served in my first paragraph is now played by a manifold. The manifold represents whatever constraints are relevant to the problem. That’s likely to be conservation laws or limits on how often arctic hares can breed or such.

The evolution in time of this system, though, is now the tracing out of a path in phase space. An understandable and much-used system is the rigid pendulum. A stick, free to swing around a point. There are two useful coordinates here. There’s the angle the stick makes, relative to the vertical axis, $\theta$. And there’s how fast that angle is changing, $\dot{\theta}$. You can draw these axes; I recommend $\theta$ as the horizontal and $\dot{\theta}$ as the vertical axis but, you know, you do you.

If you give the pendulum a little tap, it’ll swing back and forth. It rises and moves to the right, then falls while moving to the left, then rises and moves to the left, then falls and moves to the right. In phase space, this traces out an ellipse. It’s your choice whether it’s going clockwise or anticlockwise. If you give the pendulum a huge tap, it’ll keep spinning around and around. It’ll spin a little slower as it gets nearly upright, but it speeds back up again. So in phase space that’s a wobbly line, moving either to the right or the left, depending what direction you hit it.

You can even imagine giving the pendulum just the right tap, exactly hard enough that it rises to vertical and balances there, perfectly aligned so it doesn’t fall back down. This is a special path, the dividing line between those ellipses and that wavy line. Or setting it vertically there to start with and trusting no truck driving down the street will rattle it loose. That’s a very precise dot, where $\dot{\theta}$ is exactly zero. These paths, the trajectories, match whatever walking you did in the first paragraph to get to some spot in midtown Manhattan. And now let’s look again at the map, and the tangent space.
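One way to tell these paths apart is by the pendulum’s energy, which stays fixed along any one trajectory if we ignore friction. The sketch below is only an illustration. It assumes $\theta = 0$ means hanging straight down, a pendulum one metre long, and function names of my own invention.

```python
import math

G_OVER_L = 9.8  # g / length, assuming a pendulum one metre long


def energy(theta, theta_dot):
    """Energy per unit (mass * length^2); theta = 0 is hanging straight down."""
    return 0.5 * theta_dot ** 2 - G_OVER_L * math.cos(theta)


def classify(theta, theta_dot, tol=1e-9):
    """Say which kind of phase-space path passes through (theta, theta_dot)."""
    separatrix_energy = G_OVER_L  # energy of the pendulum balanced upright
    e = energy(theta, theta_dot)
    if abs(e - separatrix_energy) < tol:
        return "dividing line"
    return "swinging" if e < separatrix_energy else "spinning"


# The tap, delivered at the bottom of the swing, that just reaches vertical.
critical_tap = 2.0 * math.sqrt(G_OVER_L)

print(classify(0.0, 1.0))           # a little tap
print(classify(0.0, 10.0))          # a huge tap
print(classify(0.0, critical_tap))  # exactly the right tap
print(classify(math.pi, 0.0))       # set upright to start with
```

The tolerance is there because no real tap, and no floating-point number, lands exactly on the dividing line.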

Within the tangent space we see what changes would change the system’s behavior. How much of a tap we would need, say, to launch our swinging pendulum into never-ending spinning. Or how much of a tap to stop a spinning pendulum. Every point on a trajectory of a dynamical system has a tangent space. And, for many interesting systems, the tangent space will be separable into two pieces. One of them will be perturbations that don’t go far from the original trajectory. One of them will be perturbations that do wander far from the original.

These regions may have a complicated border, with enclaves and enclaves within enclaves, and so on. This can be where we get (deterministic) chaos from. But what we usually find interesting is whether the perturbation keeps the old behavior intact or destroys it altogether. That is, how we can change where we are going.

That said, in practice, mathematicians don’t use tangent spaces to send pendulums swinging. They tend to come up when one is past studying such petty things as specific problems. They’re more often used in studying the ways that dynamical systems can behave. Tangent spaces themselves often get wrapped up into structures with names like tangent bundles. You’ll see them proving the existence of some properties, describing limit points and limit cycles and invariants and quite a bit of set theory. These can take us surprising places. It’s possible to use a tangent-space approach to prove the fundamental theorem of algebra, that every non-constant polynomial has at least one complex root. This seems to me the long way around to get there. But it is amazing to learn that is a place one can go.

I am so happy to be finally finishing Little 2021 Mathematics A-to-Z. All of this project’s essays should be at this link. And all my glossary essays from every year should be at this link. Thank you for reading.

My Little 2021 Mathematics A-to-Z: Atlas

I owe Elkement thanks again for a topic. They’re author of the Theory and Practice of Trying to Combine Just Anything blog. And the subject lets me circle back around topology.

Atlas.

Mathematics is like every field in having jargon. Some jargon is unique to the field; there is no lay meaning of a “homeomorphism”. Some jargon is words plucked from the common language, such as “smooth”. The common meaning may guide you to what mathematicians want in it. A smooth function has a graph with no gaps, no discontinuities, no sharp corners; you can see smoothness in it. Sometimes the common meaning is an ambiguous help. A “series” is the sum of a sequence of numbers, that is, it is one number. Mathematicians study the series, but by looking at properties of the sequence.

So what sort of jargon is “atlas”? In common English, an atlas is a book of maps. Each map represents something different. Perhaps a different region of space. Perhaps a different scale, or a different projection altogether. The maps may show different features, or show them at different times. The maps must be about the same sort of thing. No slipping a map of Narnia in with the map of an amusement park, unless you warn of that in the title. The maps must not contradict one another. (So far as human-made things can be consistent, anyway.) And that’s the important stuff.

Atlas is the first kind of common-word jargon. Mathematicians use it to mean a collection of things. Those collected things aren’t mathematical maps. “Map” is the second type of jargon. The collected things are coordinate charts. “Coordinate chart” is a pairing of words not likely to appear in common English. But if you did encounter them? The meaning you might guess from their common use is not far off their mathematical use.

A coordinate chart is a matching of the points in an open set to normal coordinates. Euclidean coordinates, to be precise. But, you know, latitude and longitude, if it’s two dimensional. Add in the altitude if it’s three dimensions. Your x-y-z coordinates. It still counts if this is one dimension, or four dimensions, or sixteen dimensions. You’re less likely to draw a sketch of those. (In practice, you draw a sketch of a three-dimensional blob, and put N = 16 off in the corner, maybe in a box.)

These coordinate charts are on a manifold. That’s the second type of common-language jargon. Manifold, to pick the least bad of its manifold common definitions, is a “complicated object or subject”. The mathematical manifold is a surface. The things on that surface are connected by relationships that could be complicated. But the shape can be as simple as a plane or a sphere or a torus.

Every point on a coordinate chart needs some unique set of coordinates. And if a point appears on two coordinate charts, they have to be consistent. Consistent here is the matching between charts being a homeomorphism. A homeomorphism is a map, in the jargon sense. So it’s a function matching open sets on one chart to open sets in the other chart. There’s more to it (there always is). But the important thing is that, away from the edges of the chart, we don’t create any new gaps or punctures or missing sections.

Some manifolds are easy to spot. The surface of the Earth, for example. Many are easy to come up with charts for. Think of any map of the Earth. Each point on the surface of the Earth matches some point on the sheet of paper. The coordinate chart is … let’s say how far your point is from the upper left corner of the page. (Pretend that you can measure those points precisely enough to match them to, like, the town you’re in.) Could be how far you are from the center, or the lower right corner, or whatever. These are all as good, and even count as other coordinate charts.

It’s easy to imagine that as latitude and longitude. We see maps of the world arranged by latitude and longitude so often. And that’s fine; latitude and longitude makes a good chart. But we have a problem in giving coordinates to the north and south pole. The latitude is easy but the longitude? So we have two points that can’t be covered on the map. We can save our atlas by having a couple charts. For the Earth this can be a map of most of the world arranged by latitude and longitude, and then two insets showing a disc around the north and the south poles. Thus we have an atlas of three charts.

We can make this a little tighter, reducing this to two charts. Have one that’s your normal sort of wall map, centered on the equator. Have the other be a transverse Mercator map. Make its center the great circle going through the prime meridian and the 180-degree antimeridian. Then every point on the planet, including the poles, has a neat unambiguous coordinate in at least one chart. A good chunk of the world will be on both charts. We can throw in more charts if we like, but two is enough.
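To see two charts covering a whole sphere, here is a rough sketch in Python. It is not the pair of Mercator maps just described. It swaps in a different two-chart atlas, stereographic projection from each pole, because the formulas are short. The function names are my own, and I assume the point is given as (x, y, z) on a sphere of radius 1.

```python
def north_chart(x, y, z):
    """Stereographic coordinates projected from the north pole (0, 0, 1).

    Defined everywhere on the unit sphere except the north pole itself.
    """
    if z == 1:
        return None  # the one point this chart cannot cover
    return (x / (1 - z), y / (1 - z))


def south_chart(x, y, z):
    """Stereographic coordinates projected from the south pole (0, 0, -1).

    Defined everywhere on the unit sphere except the south pole itself.
    """
    if z == -1:
        return None  # the one point this chart cannot cover
    return (x / (1 + z), y / (1 + z))


# The north pole is missing from one chart but sits at the center of the other.
pole_in_north_chart = north_chart(0, 0, 1)
pole_in_south_chart = south_chart(0, 0, 1)

# A point on the equator shows up on both charts.
equator_in_both = (north_chart(1, 0, 0), south_chart(1, 0, 0))
```

Each chart misses exactly one point, and the other chart picks that point up. So two charts are enough for an atlas, the same count as above.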

The requirements to be an atlas aren’t hard to meet. So a lot of geometric structures end up being atlases. Theodore Frankel’s wonderful The Geometry of Physics introduces them on page 15. But that’s also the last appearance of “atlas”, at least in the index. The idea gets upstaged. The manifolds that the atlas charts end up being more interesting. Many problems about things in motion are easy to describe as paths traced out on manifolds. A large chunk of mathematical physics is then looking at this problem and figuring out what the space of possible behaviors looks like. What its topology is.

In a sense, the mathematical physicist might survey a problem, like a scout exploring new territory, more than solve it. This exploration brings us to directional derivatives. To tangent bundles. To other terms, jargon only partially informed by the common meanings.

And we draw to the final weeks of 2021, and of the Little 2021 Mathematics A-to-Z. All this year’s essays should be at this link. And all my glossary essays from every year should be at this link. Thank you for reading!

My Little 2021 Mathematics A-to-Z: Subtraction

Iva Sallay was once again a kind friend to my writing efforts here. Sallay, who runs the Find the Factors recreational mathematics puzzle site, saw a topic that gives a compelling theme to this year’s A-to-Z.

Subtraction.

Subtraction is the inverse of addition.

So thanks for reading along as the Little 2021 Mathematics A-to-Z enters its final stage. Next week I hope to be back with something for my third letter ‘A’ of the sequence.

All right, I can be a little more clear. By the inverse I mean subtraction is the name we give to adding the additive inverse of something. It’s what lets addition be a group action. That is, we write $a - b$ to mean we find whatever number, added to b, gives us 0. Then we add that to a. We do this pretty often, so it’s convenient to have a name for it. The word “subtraction” appears in English from about 1400. It grew from the Latin for “taking away”. By about 1425 the word has its mathematical meaning. I imagine this wasn’t too radical a linguistic evolution.

All right, so some other thoughts. What’s so interesting about subtraction that it’s worth a name? We don’t have a particular word for reversing, say, a permutation. But we don’t go very far in school without thinking about inverting an addition. Must come down to subtraction’s practical use in finding differences between things. Often in figuring out change. Debts at least. Nobody needs the inverse of a permutation unless they’re putting a deck of cards back in order.

Subtraction has other roles, though. Not so much in mathematics, but in teaching us how to learn about mathematics. For example, subtraction gives us a good reason to notice zero. Zero, the additive identity, is implicit to addition. But if you’re learning addition, and you think of it as “put these two piles of things together into one larger pile”? What good does an empty pile do you there? It’s easy to not notice there’s a concept there. But subtraction, taking stuff away from a pile? You can imagine taking everything away, and wanting a word for that. This isn’t the only way to notice zero is worth some attention. It’s a good way, though.

There’s more, though. Learning subtraction teaches us limits of what we can do, mathematically. We can add 3 to 7 or, if it’s more convenient, 7 to 3. But we learn from the start that while we can subtract 3 from 7, there’s no subtracting 7 from 3. This is true when we’re learning arithmetic and numbers are all positive. Some time later we ask, what happens if we go ahead and do this anyway? And figure out a number that makes sense as the answer to “what do you get subtracting 7 from 3”? This introduces us to the negative numbers. It’s a richer idea of what it is to have numbers. We can start to see addition and subtraction as expressions of the same operation.

Linus: 'Lucy, how much is six from four?' Lucy: 'Six from four?! You can't subtract six from four ... you can't subtract a bigger number from a smaller number.' Linus: 'YOU CAN IF YOU'RE STUPID!' — Charles Schulz’s **Peanuts** for the 27th of August, 1957. The amazing thing is you can if you’re smart, too. We can ask whether it’s good teaching to start instruction with something that’s not true, and then reveal what’s not true about it. My hunch is it is, because this provides the lesson that, even for something as “objective” as mathematics, the way we construct things is a convention. That we can change our tools as we want to do new things.

But we also notice they’re not quite the same. As mentioned, addition can be done in any order. If I need to do 7 + 4 + 3 + 6 I can decide I’d rather do 4 + 6 + 7 + 3 and make that 10 + 10 before getting to 20. This all simplifies my calculating. If I need to do 7 – 4 – 3 – 6 I get into a lot of trouble if I simplify my work by writing 4 – 6 – 7 – 3 instead. Even if I decide I’d rather take the 3 – 6 and turn that into a negative 3 first, I’ve made a mess of things.

The first property this teaches us to notice we call “commutativity”. Most mathematical operations don’t have that. But a lot of the ones we find useful do. The second property this points out is “associativity”, which more of the operations we find useful have. It’s not essential that someone learning how to calculate know this is a way to categorize mathematics operations. (I’ve read that before the New Math educational reforms of the 1960s, American elementary school mathematics textbooks never mentioned commutativity or associativity.) But I suspect it is essential that someone learning mathematics learn the things you can do come in families.

So let me mention division, the inverse of multiplication. (And that my chosen theme won’t let me get to in sequence.) Like subtraction, division refuses to be commutative or associative. Subtraction prompts us to treat the negative numbers as something useful. In parallel, division prompts us to accept fractions as numbers. (We accepted fractions as numbers long before we accepted negative numbers, mind. Anyone with a pie and three friends has an interest in “one-quarter” that they may not have with “negative four”.) When we start learning about numbers raised to powers, or exponentials, we have questions ready to ask. How do the operations behave? Do they encourage us to find other kinds of number?

And we also think of how to patch up subtraction’s problems. If we want subtraction to be a kind of addition, we have to get precise about what that little subtraction sign means. What we’ve settled on is that $a - b$ is shorthand for $a + (-b)$, where $-b$ is the additive inverse of $b$.

Once we do that all subtraction’s problems with commutativity and associativity go away. 7 – 4 – 3 – 6 becomes 7 + (-4) + (-3) + (-6), and that we can shuffle around however convenient. Say, to 7 + (-3) + (-4) + (-6), then to 7 + (-3) + (-10), then to 4 + (-10), and so -6. Thus do we domesticate a useful, wild operation like subtraction.
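If you want to check that on a machine, here is a small Python sketch. The function name is my own invention for illustration. It does the subtraction left to right, then shuffles the numbers to show the trouble, and then tries every ordering of the sum of additive inverses.

```python
from itertools import permutations


def subtract_in_order(numbers):
    """Subtract left to right: ((a - b) - c) - d."""
    result = numbers[0]
    for n in numbers[1:]:
        result -= n
    return result


# The original problem, 7 - 4 - 3 - 6, done left to right.
original = subtract_in_order([7, 4, 3, 6])

# Shuffling the numbers under plain subtraction changes the answer.
shuffled = subtract_in_order([4, 6, 7, 3])

# Rewritten as a sum of additive inverses, every ordering agrees.
terms = [7, -4, -3, -6]
all_sums = {sum(order) for order in permutations(terms)}

print(original, shuffled, all_sums)
```

The set of sums has a single element, which is the whole point: once everything is an addition, the order stops mattering.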

Any individual subtraction has one right answer. There are many ways to get there, though. I had learned, for example, to do a problem such as 738 minus 451 by subtracting one column of numbers at a time. Right to left, so, subtracting 8 minus 1, and then 3 minus 5, and after the borrowing then 6 minus 4. I remember several elementary school textbooks explaining borrowing as unwrapping rolls of dimes. It was a model well-suited to me.

We don’t need to, though. We can go from the left to the right, doing 7 minus 4 first and 8 minus 1 last. We can go through and figure out all the possible carries before doing any work. There’s a slick method called partial differences which skips all the carrying. But it demands writing out several more intermediate terms. This uses more paper, but if there isn’t a paper shortage, so what?
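Here is a sketch of partial differences in Python, as I understand the method. The function name is mine. I assume both numbers are nonnegative whole numbers and the first has at least as many digits as the second. Each column is subtracted on its own, negative results are allowed, and the pieces are added at the end.

```python
def partial_differences(minuend, subtrahend):
    """Subtract column by column, allowing negative partial results.

    Assumes nonnegative integers, with minuend having at least as many
    digits as subtrahend.
    """
    top = str(minuend)
    bottom = str(subtrahend).rjust(len(top), "0")
    partials = []
    for position, (a, b) in enumerate(zip(top, bottom)):
        place_value = 10 ** (len(top) - 1 - position)
        partials.append((int(a) - int(b)) * place_value)
    return partials, sum(partials)


pieces, answer = partial_differences(738, 451)
print(pieces, answer)
```

For 738 minus 451 the intermediate terms are 300, -20, and 7. No borrowing happens anywhere. The cost is the extra line of work to add those up.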

There are more ways to calculate. If we turn things over to a computer, we’re likely to do subtraction using a complements technique. When I say computer you likely think electronic computer, or did right up to the adjective there. But mechanical computers were a thing too. Blaise Pascal’s computing device of the 1650s used nines’ complements to subtract on the gears that did addition. Explaining the trick would take me farther afield than I want to go now. But, you know how, like, 6 plus 3 is 9? So you can turn a subtraction of 6 into an addition of 3. Or a subtraction of 3 into an addition of 6. Plus some bookkeeping.
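A little of that bookkeeping can be sketched in Python. This is only the arithmetic idea, not a model of Pascal’s gears. The function names are mine, and I assume the number taken away is smaller than the number it’s taken from, with both fitting in the given count of digits.

```python
def nines_complement(n, digits):
    """Replace each digit d of n (padded to `digits` places) with 9 - d."""
    return (10 ** digits - 1) - n


def subtract_by_complement(a, b, digits=3):
    """Compute a - b, assuming 0 <= b < a < 10**digits.

    The only arithmetic on a is an addition; the subtraction hides
    inside the digit-by-digit complement of b.
    """
    total = a + nines_complement(b, digits)
    # The bookkeeping: drop the carry out of the top digit and add it
    # back in at the bottom (an "end-around carry").
    carry, remainder = divmod(total, 10 ** digits)
    return remainder + carry


print(nines_complement(451, 3), subtract_by_complement(738, 451))
```

So 738 minus 451 turns into 738 plus 548, which is 1286. Drop the leading 1, add 1 to what remains, and there’s the answer.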

A digital computer is likely to use a binary complement, representing every number as a string of 0’s and 1’s. (Most current machines use two’s complement, which is the ones’ complement plus 1.) This has great speed advantages. The complement of 0 is 1 and vice-versa, and it’s very quick for a computer to swap between 0 and 1. Subtraction by complements is different and, to my eye, takes more steps. But they might be steps you do better.

One more thought subtraction gives us, though. In a previous paragraph I wrote out 7 – 4, and also wrote 7 + (-4). We use the symbol – for two things. Do those two uses of – mean the same thing? You may think I’m being fussy here. After all, the value of -4 is the same as the value of 0 – 4. And even a fussy mathematician says whichever of “minus four” and “negative four” better fits the meter of the sentence. But our friends in the philosophy department would agree this is a fair question. Are we collapsing two related ideas together by using the same symbol for them?

My inclination is to say that the – of -4 is different from the – in 0 – 4, though. The – in -4 is a unary operation: it means “give me the inverse of the number on the right”. The – in 0 – 4 is a binary operation: it means “subtract the number on the right from the number on the left”. So I would say these are different things sharing a symbol. Unfortunately our friends in the philosophy department can’t answer the question for us. The university laid them off four years ago, part of society’s realignment away from questions like “how can we recognize when a thing is true?” and towards “how can we teach proto-laborers to use Excel macros?”. We have to use subtraction to expand our thinking on our own.

My Little 2021 Mathematics A-to-Z: Convex

Jacob Siehler, a friend from Mathstodon, and Assistant Professor at Gustavus Adolphus College, offered several good topics for the letter ‘C’. I picked the one that seemed to connect to the greatest number of other topics I’ve covered recently.

Convex

It’s easy to say what convex is, if we’re talking about shapes in ordinary space. A convex shape is one where the line connecting any two points inside the shape always stays inside the shape. Circles are convex. Triangles and rectangles too. Star shapes are not. Is a torus? That depends. If it’s a doughnut shape sitting in some bigger space, then it’s not convex. If the doughnut shape is all the space there is to consider, then it is. There’s a parallel here to prime numbers. Whether 5 is a prime depends on whether you think 5 is an integer, a real number, or a complex number.
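For polygons there’s a quick test, sketched here in Python. The function name is my own. I assume the corners are listed in order around a polygon that doesn’t cross itself. Walk around the boundary and check that every turn goes the same way. A shape with a divot has to turn the other way somewhere.

```python
def is_convex_polygon(vertices):
    """Return True if the polygon, given as (x, y) corners in order, is convex.

    Assumes a simple polygon (its edges do not cross one another).
    """
    n = len(vertices)
    turn_directions = set()
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        x2, y2 = vertices[(i + 2) % n]
        # The sign of the cross product of consecutive edges says
        # whether we turn left or right at the middle corner.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross != 0:
            turn_directions.add(cross > 0)
    return len(turn_directions) <= 1


triangle = [(0, 0), (4, 0), (0, 3)]
rectangle = [(0, 0), (5, 0), (5, 2), (0, 2)]
star = [(0, 3), (1, 1), (3, 0), (1, -1), (0, -3), (-1, -1), (-3, 0), (-1, 1)]

print(is_convex_polygon(triangle), is_convex_polygon(rectangle), is_convex_polygon(star))
```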

Still, this seems easy to the point of boring. So how does Wolfram Mathworld match 337 items for ‘convex’? For a sense of scale, it has only 112 matches for ‘quadrilateral’. This is a word used almost as much as ‘quadratic’, with 370 items. Why?

Why is that it’s one of those terms that sneaks in everywhere. Some of it is obvious. There’s a concept called “star-convex”, where only one special point needs a straight-line connection, inside the shape, to every other point. Other pairs of points don’t have to connect by a straight line. That’s a familiar mathematical trick, coming up with a less-demanding version of a property. There’s the “convex hull”, which is the smallest convex set that contains a given set of points. We even come up with “convex functions”, functions of real numbers. A function’s convex if the space above the graph of the function is convex. This seems like stretching the idea of convexity rather a bit.
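The convex-function idea can be poked at numerically. This is a sketch in Python with a function name I made up. It samples a grid of points and checks that the middle of every chord sits on or above the graph. A test like this can show a function is not convex. It can’t prove that one is, since it only looks at finitely many points.

```python
import math


def looks_convex(f, lo, hi, samples=50):
    """Check f(midpoint) <= average of f at the two ends, over a grid.

    Returning False is conclusive (up to rounding); True only means
    no counterexample turned up among the sampled points.
    """
    xs = [lo + (hi - lo) * i / samples for i in range(samples + 1)]
    tolerance = 1e-12  # slack for floating-point rounding
    for x in xs:
        for y in xs:
            if f((x + y) / 2) > (f(x) + f(y)) / 2 + tolerance:
                return False
    return True


square_ok = looks_convex(lambda x: x * x, -3.0, 3.0)
exp_ok = looks_convex(math.exp, -2.0, 2.0)
sine_ok = looks_convex(math.sin, 0.0, math.pi)
print(square_ok, exp_ok, sine_ok)
```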

Still, we wouldn’t coin such a term if we couldn’t use it. Well, if someone couldn’t use it. The saving thing here is the idea of “space”. We get our idea of what space is from looking around rooms and walking around hills and stuff. But what makes something a space? When we look at what’s essential? What we need is traits like, there are things. We can measure how far apart things are. We have some idea of paths between things. That’s not asking a lot.

So many things become spaces. And so convexity sneaks in everywhere. A convex function has nice properties if you’re looking for minimums. Or maximums; that’s as easy to do. And we look for minimums a lot. A large, practical set of mathematics is the search for optimum values, the set of values that maximize, or minimize, something. You may protest that not everything we’re interested in is a convex function. This is true. But a lot of what we are interested in is, or is approximately.

This gets into some surprising corners. Economics, for example. The mathematics of economics is often interested in how much of a thing you can make. But you have to put things in to make it. You expect, at least once the system is set up, that if you halve the components you put in you get half the thing out. Or double the components in and get double the thing out. But you can run out of the components. Or related stuff, like, floor space to store partly-complete product. Or transport available to send this stuff to the customer. Or time to get things finished. For our needs these are all “things you can run out of”.

And so we have a problem of linear programming. We have something or other we want to optimize. Call it $y$. It depends on a whole range of variables, which we describe as a vector $\vec{x}$. And we have constraints. Each of these is an inequality; we can represent that as demanding some functions of these variables be at most some numbers. We can bundle those functions together as a matrix called $A$. We can bundle those maximum numbers together as a vector called $\vec{b}$. So the problem is finding the $\vec{x}$ that gives the best $y$ while keeping $A\vec{x} \le \vec{b}$. Also, we demand that none of these values be smaller than some minimum we might as well call 0. The range of all the possible values of these variables is a space. These constraints chop up that space, into a shape. Into a convex shape, of course, or this paragraph wouldn’t belong in this essay. If you need to be convinced of this, imagine taking a wedge of cheese and hacking away slices all the way through it. How do you cut a cave or a tunnel in it?

So take this convex shape, called a polytope. That’s what we call a polygon or polyhedron if we don’t want to commit to any particular number of dimensions of space. (If we’re being careful. My suspicion is ‘polyhedron’ is more often said.) This makes a shape. Some point in that shape has the best possible value of $y$. (Also the worst, if that’s your thing.) Where is it? There is an answer, and it gives a pretext to share a fun story. The answer is that it’s on the outside, on one of the faces of the polytope. And you can find it following along the edges of those polytopes. This we know as the simplex method, or Dantzig’s Simplex Method if we must be more particular, for George Dantzig. Its success relies on looking at convex functions in convex spaces and how much this simplifies finding things.
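To make “it’s on the outside” concrete, here is a sketch in Python for two variables. This is not the simplex method. It’s brute force: find every corner of the shape and try them all, where the simplex method walks from corner to corner along edges. The function name and the little problem at the bottom are made up for illustration.

```python
from itertools import combinations


def best_corner(objective, constraints):
    """Maximize c . x over the points with a . x <= b for every (a, b).

    Two variables only. Intersects every pair of constraint boundaries,
    keeps the intersections that satisfy all constraints (the corners),
    and returns (best value, corner), or None if there are no corners.
    """
    c1, c2 = objective
    best = None
    for ((a1, a2), b1), ((a3, a4), b2) in combinations(constraints, 2):
        det = a1 * a4 - a2 * a3
        if abs(det) < 1e-12:
            continue  # parallel boundaries never meet in a corner
        # Cramer's rule for the point where both boundaries hold exactly.
        x = (b1 * a4 - a2 * b2) / det
        y = (a1 * b2 - b1 * a3) / det
        feasible = all(p * x + q * y <= r + 1e-9 for (p, q), r in constraints)
        if feasible:
            value = c1 * x + c2 * y
            if best is None or value > best[0]:
                best = (value, (x, y))
    return best


# Hypothetical problem: maximize x1 + 2*x2 with x1 + x2 <= 4,
# x1 + 3*x2 <= 6, and neither variable below 0.
constraints = [
    ((1, 1), 4),
    ((1, 3), 6),
    ((-1, 0), 0),  # -x1 <= 0, that is, x1 >= 0
    ((0, -1), 0),  # -x2 <= 0, that is, x2 >= 0
]
value, corner = best_corner((1, 2), constraints)
print(value, corner)
```

Trying every pair of constraints gets expensive fast as the problem grows, which is why nobody does it this way for real work.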

Usually. The simplex method is one of polynomial-order complexity for normal, typical problems. That’s a measure of how much longer it takes to find an answer as you get more variables, more constraints, more work. Polynomial is okay, growing about the way it takes longer to multiply when you have more digits in the numbers. But there’s a worst case, in which the complexity grows exponentially. We shy away from exponential-complexity because … you know, exponentials grow fast, given a chance. What saves us is that that’s a worst case, not a typical case. The convexity lets us set up our problem and, rather often, solve it well enough.

Now the story, a mutation of which it’s likely you encountered. George Dantzig, as a student in Jerzy Neyman’s statistics class, arrived late one day to find a couple problems on the board. He took these to be homework, and struggled with the harder-than-usual set. But turned them in, apologizing for them being late. Neyman accepted the work, and eventually got around to looking at it. This wasn’t the homework. This was some unsolved problems in statistics. Six weeks later Neyman had prepared them for publication. A year later, Neyman explained to Dantzig that all he needed to earn his PhD was to put these two papers together in a nice binder.

This cute story somehow escaped into the wild. It became an inspirational tale for more than mathematics grad students. That part’s easy to see; it has most everything inspiration needs. It mutated further, into the movie Good Will Hunting. I do not know that the unsolved problems, work done in the late 1930s, related to Dantzig’s simplex method, proved after World War II. It may be that they are simply connected in their originator. But perhaps it is more than I realize now.

I hope to finish off the word ‘Mathematics’ with the letter S next week. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all the A-to-Z essays from past years, should be at this link. Thank you for reading.

I’m looking for the last topics for the Little 2021 Mathematics A-to-Z

I’m approaching the end of this year’s little Mathematics A-to-Z. The project’s been smaller, as I’d hoped, although I’m not sure I managed to make it any less hard on myself. Still, I’m glad to be doing it and glad to have the suggestions of you kind readers for topics. This quartet should wrap up the year, and the project.

So please let me know of any topics you’d like to see me try taking on. The topic should be anything mathematics-related, although I tend to take a broad view of mathematics-related. (I’m also open to biographical sketches.) To suggest something, please, say so in a comment. If you do, please also let me know about any projects you have — blogs, YouTube channels, real-world projects — that I should mention at the top of that essay.

I am happy to revisit a subject I think I have more to write about, so don’t be shy about suggesting those. Past essays for these letters include:

A.

Ansatz (2015)
Axiom (Leap Day 2016)
Algebra (End 2016)
Arithmetic (2017)
Asymptote (2018)
Abacus (2019)
Michael Atiyah (2020)
Addition (2021)
Analysis (2021)

T.

Tensor (2015)
Transcendental Number (Leap Day 2016)
Tree (End 2016)
Topology (2017)
Tiling (2018)
Taylor Series (2019)
Tiling (2020)
Torus (2021)
Triangle (2021)

O.

Z.

And, as ever, all my A-to-Z essays should be at this link. Thanks for reading and thanks for sharing your thoughts.

My Little 2021 Mathematics A-to-Z: Inverse

I owe Iva Sallay thanks for the suggestion of today’s topic. Sallay is a longtime friend of my blog here. And runs the Find the Factors recreational mathematics puzzle site. If you haven’t been following, or haven’t visited before, this is a fun week to step in again. The puzzles this week include (American) Thanksgiving-themed pictures.

Inverse.

When we visit the museum made of a visual artist’s studio we often admire the tools. The surviving pencils and crayons, pens, brushes and such. We don’t often notice the eraser, the correction tape, the unused white-out, or the pages cut into scraps to cover up errors. To do something is to want to undo it. This is as true for the mathematics of a circle as it is for the drawing of one.

If not to undo something, we do often want to know where something comes from. A classic paper asks can one hear the shape of a drum? You hear a sound. Can you say what made that sound? Fine, dismiss the drum shape as idle curiosity. The same question applies to any sensory data. If our hand feels cooler here, where is the insulation of the building damaged? If we have this electrocardiogram reading, what can we say about the action of the heart producing that? If we see the banks of a river, what can we know about how the river floods?

And this is the point, and purpose, of inverses. We can understand them as finding the causes of what we observe.

The first inverse we meet is usually the inverse function. It’s introduced as a way to undo what a function does. That’s an odd introduction, if you’re comfortable with what a function is. A function is a mathematical construct. It’s two sets — a domain and a range — and a rule that links elements in the domain to the range. To “undo” a function is like “undoing” a rectangle. But a function has a compelling “physical” interpretation. It’s routine to introduce functions as machines that take some numbers in and give numbers out. We think of them as ways to transform the domain into the range. In functional analysis we get to thinking of domains as the most perfect putty. We expect functions to stretch and rotate and compress and slide along as though they were drawing a Betty Boop cartoon.

So we’re trained to speak of a function as a verb, acting on pieces of the domain. An element or point, or a region, or the whole domain. We think the function “maps”, or “takes”, or “transforms” this into its image in the range. And if we can turn one thing into another, surely we can turn it back.

Some things it’s obvious we can turn back. Suppose our function adds 2 to whatever we give it. We can get the original back by subtracting 2. If the function subtracts 32 and divides by 1.8, we can reverse it by multiplying by 1.8 and adding 32. If the function takes the reciprocal, we can take the reciprocal again. We have a bit of a problem if we started out taking the reciprocal of 0, but who would want to do such a thing anyway? If the function squares a number, we can undo that by taking the square root. Unless we started from a negative number. Then we have trouble.
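A couple of those in Python, as a sketch. The function names are mine. The subtract-32-and-divide-by-1.8 rule is the usual Fahrenheit-to-Celsius conversion, so I’ve named it that way.

```python
def to_celsius(fahrenheit):
    """Subtract 32, then divide by 1.8."""
    return (fahrenheit - 32) / 1.8


def to_fahrenheit(celsius):
    """Undo to_celsius: multiply by 1.8, then add 32."""
    return celsius * 1.8 + 32


def square(x):
    """Square a number, which throws away its sign."""
    return x * x


# Going there and back again returns the start, up to rounding.
round_trip = to_fahrenheit(to_celsius(98.6))

# Squaring loses the sign, so the square root can't hand back -3.
lost_sign = square(-3) ** 0.5

print(round_trip, lost_sign)
```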

The trouble is not every function has an inverse. Which we could have realized by thinking how to undo “multiply by zero”. To be a well-defined function, the rule part has to match each element in the domain to exactly one element in the range. To have an inverse, the matching has to work the other way too: each element in the range has to come from exactly one element in the domain. This makes the function, in the impenetrable jargon of the mathematician, a “one-to-one function”. Or you can describe it with the more intuitive label of “bijective”.

But there’s no reason more than one thing in the domain can’t match to the same thing in the range. If I know the cosine of my angle is $\frac{\sqrt{3}}{2}$, my angle might be 30 degrees. Or -30 degrees. Or 390 degrees. Or 330 degrees. You may protest there’s no difference between a 30 degree and a 390 degree angle. I agree those angles point in the same direction. But a gear rotated 390 degrees has done something that a gear rotated 30 degrees hasn’t. If all I know is where the dot I’ve put on the gear is, how can I know how much it’s rotated?

So what we do is shift from the actual cosine into one branch of the cosine. By restricting the domain we can create a function that has the same rule as the one we want, but that’s also one-to-one and so has an inverse. What restriction to use? That depends on what you want. But mathematicians have some that come up so often they might as well be defaults. So the square root is the inverse of the square of nonnegative numbers. The inverse Cosine is the inverse of the cosine of angles from 0 to 180 degrees. The inverse Sine is the inverse of the sine of angles from -90 to 90 degrees. The capital letters are convention to say we’re doing this. If we want a different range, we write out that we’re looking for an inverse cosine from -180 to 0 degrees or whatever. (Yes, the mathematician will default to using radians, rather than degrees, for angles. That’s a different essay.) It’s an imperfect solution, but it often works well enough.
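Python’s standard math module uses that default branch, so it makes a handy demonstration. In this sketch the four angles from the gear example all share one cosine. The inverse Cosine, `math.acos`, hands back only the angle between 0 and 180 degrees. The variable names are mine.

```python
import math

# Four angles, in degrees, that a dot on a gear can't tell apart.
angles = [30, -30, 390, 330]

# They all have the same cosine (up to rounding).
cosines = [math.cos(math.radians(a)) for a in angles]

# The inverse Cosine returns the one answer in the default branch.
recovered = [math.degrees(math.acos(c)) for c in cosines]

print(cosines)
print(recovered)
```

Every entry in the last list comes out as 30 degrees, give or take rounding. The other three angles are gone for good unless we know something more about the gear.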

The trouble we had with cosines, and functions, continues through all inverses. There are almost always alternate causes. Many shapes of drums sound alike. Take two metal bars. Heat both with a blowtorch, one on the end and one in the center. Not to the point of melting, only to the point of being too hot to touch. Let them cool in insulated boxes for a couple weeks. There’ll be no measurement you can do on the remaining heat that tells you which one was heated on the end and which the center. That’s not because your thermometers are no good or the flow of heat is not deterministic or anything. It’s that both starting cases settle to the same end. So here there is no usable inverse.

This is not to call inverses futile. We can look for what we expect to find useful. We are inclined to find inverses of the cosine between 0 and 180 degrees, even though 4140 through 4320 degrees is as legitimate. We may not know what is wrong with a heart, but have some idea what a heart could do and still beat. And there’s a famous example in 19th-century astronomy. After the discovery of Uranus came the discovery it did not move right. For a while it moved across the sky too fast for its distance from the sun. Then it started moving too slow. The obvious supposition was that there was another, not-yet-seen, planet, affecting its orbit.

The trouble is finding it. Calculating the orbit from what data they had required solving equations with 13 unknown quantities. John Couch Adams and Urbain Le Verrier attempted this anyway, making suppositions about what they could not measure. They made great suppositions. Le Verrier made the better calculations, and persuaded an astronomer (Johann Gottfried Galle, assisted by Heinrich Louis d’Arrest) to go look. Took about an hour of looking. They also made lucky suppositions. Both, for example, supposed the trans-Uranian planet would obey “Bode’s Law”, a seeming pattern in the sizes of the planets’ orbital radiuses. The actual Neptune does not. It was near enough in the sky to where the calculated planet would be, though. The world is vaster than our imaginations.

That there are many ways to draw Betty Boop does not mean there’s nothing to learn about how this drawing was done. And so we keep having inverses as a vibrant field of mathematics.

Next week I hope to cover the letter ‘C’ and don’t think I’m not worried about what that ‘C’ will be. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all the A-to-Z essays from past years, should be at this link. Thank you for reading.

My Little 2021 Mathematics A-to-Z: Triangle

And I have another topic suggested by John Golden, author of Math Hombre. It’s one of the basic bits of mathematics, and so is hard to think about.

Triangle.

Edward Brisse assembled a list of 2,001 things to call a “center” of a triangle. I’d have run out around three. We don’t need most of them. I mention them because the list speaks of how interesting we find triangles. Nobody’s got two thousand thoughts about enneadecagons (19-sided figures).

As always with mathematics it’s hard to say whether triangles are all that interesting or whether we humans are obsessed. They’ve got great publicity. The Pythagorean Theorem may be the only bit of interesting mathematics an average person can be assumed to recognize. The kinds of triangles — acute, obtuse, right, equilateral, isosceles, scalene — are fit questions for trivia games. An ordinary mathematics education can end in trigonometry. This ends up being about circles, but we learn it through triangles. The art and science of determining where a thing is we call “triangulation”.

But triangles do seem to stand out. They’re the simplest polygon, only three vertices and three edges. So we can slice any other polygon into triangles. Any triangle can tile the plane. Even quadrilaterals may need reflections of themselves. One of the first geometry facts we learn is the interior angles of a triangle add up to two right angles. And one of the first geometry facts we learn, discovering there are non-Euclidean geometries, is that they don’t have to.

Triangles have to be convex, that is, they don’t have any divots. This property sounds boring. But it’s a good boring; it makes other work easier. It tells us that the lengths of any two sides of a triangle add together to something longer than the third side. And that’s a powerful idea.

There are many ways to define “distance”. Mathematicians have tried to find the most abstract version of the concept. This inequality is one of the few pieces that every definition of “distance” must respect. This idea of distance leaps out of shapes drawn on paper. Last week I mentioned a triangle inequality, in discussing functions $f$ and $g$. We can define operators that describe a distance between functions. And the distances between trios of functions behave like the distances between points on the triangle. Thus does geometry sneak in to abstract concepts like “piecewise continuous functions”.
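One such distance between functions is the largest gap between their graphs. That is my pick for illustration, not necessarily the one from last week’s essay. Here is a Python sketch that estimates it by sampling and then checks the three triangle inequalities for a trio of functions. The function name and the trio are my own choices.

```python
import math


def sup_distance(f, g, lo=0.0, hi=1.0, samples=1000):
    """Approximate the largest gap |f(x) - g(x)| on [lo, hi] by sampling."""
    xs = (lo + (hi - lo) * i / samples for i in range(samples + 1))
    return max(abs(f(x) - g(x)) for x in xs)


def square(x):
    return x * x


# Distances between each pair in the trio sine, cosine, and x squared.
d_sin_cos = sup_distance(math.sin, math.cos)
d_cos_sq = sup_distance(math.cos, square)
d_sin_sq = sup_distance(math.sin, square)

# Each "side" is no longer than the other two put together.
slack = 1e-12  # allowance for floating-point rounding
triangle_holds = (
    d_sin_sq <= d_sin_cos + d_cos_sq + slack
    and d_sin_cos <= d_sin_sq + d_cos_sq + slack
    and d_cos_sq <= d_sin_cos + d_sin_sq + slack
)
print(d_sin_cos, d_cos_sq, d_sin_sq, triangle_holds)
```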

And they serve in curious blends of the abstract and the concrete. For example, numerical solutions to partial differential equations. A partial differential equation is one where we want to know a function of two or more variables, and only have information about how the function changes as those variables change. These turn up all the time in any study of things in bulk. Heat flowing through space. Waves passing through fluids. Fluids running through channels. So any classical physics problem that isn’t, like, balls bouncing against each other or planets orbiting stars. We can solve these if they’re linear. Linear here is a term of art meaning “easy”. I kid; “linear” means more like “manageable”. All the good problems are nonlinear and we can exactly solve about two of them.

So, numerical solutions. We make approximations by putting down a mesh on the differential equation’s domain. And then, using several graduate-level courses’ worth of tricks, approximating the equation we want with one that we can solve here. That mesh, though? … It can be many things. One powerful technique is “finite elements”. An element is a small piece of space. Guess what the default shape for these elements is. There are times, and reasons, to use other shapes as elements. You learn those once you have the hang of triangles. (Dividing the space of your variables up into elements lets you look for an approximate solution using tools easier to manage than you’d have without. This is a bit like looking for one’s keys over where the light is better. But we can find something that’s as close as we need to our keys.)

If we need finite elements for, oh, three dimensions of space, or four, then triangles fail us. We can’t fill a volume with two-dimensional shapes like triangles. But the triangle has its analog. The tetrahedron, in some sense four triangles joined together, has all the virtues of the triangle for three dimensions. We can look for a similar shape in four and five and more dimensions. If we’re looking for the thing most like an equilateral triangle, we’re looking for a “simplex”.

These simplexes, or these elements, sprawl out across the domain we want to solve problems for. They look uncannily like the triangles surveyors draw across the chart of a territory, as they show us where things are.

Next week I hope to cover the letter ‘I’ as I near the end of ‘Mathematics’ and consider what to do about ‘A To Z’. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all the A-to-Z essays from past years, should be at this link. Thank you once more for reading.

My Little 2021 Mathematics A-to-Z: Analysis

I’m fortunate this week to have another topic suggested again by Mr Wu, blogger and Singaporean mathematics tutor. It’s a big field, so forgive me not explaining the entire subject.

Analysis.

Analysis is about proving why the rest of mathematics works. It’s a hard field. My experience, a typical one, included crashing against real analysis as an undergraduate and again as a graduate student. It turns out mathematics works by throwing a lot of $\epsilon$ symbols around.

Let me give an example. If you read pop mathematics blogs you know about the number represented by $0.999999\cdots$ . You’ve seen proofs, some of them even convincing, that this number equals 1. Not a tiny bit less than 1, but exactly 1. Here’s a real-analysis treatment. And — I may regret this — I recommend you don’t read it. Not closely, at least. Instead, look at its shape. Look at the words and symbols as graphic design elements, and trust that what I say is not nonsense. Resume reading after the horizontal rule.

It’s convenient to have a name for the number $0.999999\cdots$ . I’ll call that $r$ , for “repeating”. 1 we’ll call 1. I think you’ll grant that whatever r is, it can’t be more than 1. I hope you’ll accept that if the difference between 1 and r is zero, then r equals 1. So what is the difference between 1 and r?

Give me some number $\epsilon$ . It has to be a positive number. The implication in the letter $\epsilon$ is that it’s a small number. This isn’t actually required in general. We expect it. We feel surprise and offense if it’s ever not the case.

I can show that the difference between 1 and r is less than $\epsilon$ . I know there is some smallest counting number N so that $\epsilon > \frac{1}{10^{N}}$ . For example, say $\epsilon$ is 0.125. Then we can let N = 1, and $0.125 > \frac{1}{10^{1}}$ . Or suppose $\epsilon$ is 0.00625. But then if N = 3, $0.00625 > \frac{1}{10^{3}}$ . (If $\epsilon$ is bigger than 1, let N = 1.) Now we have to ask why I want this N.

Whatever the value of r is, I know that it is more than 0.9. And that it is more than 0.99. And that it is more than 0.999. In fact, it’s more than the number you get by truncating r after any whole number N of digits. Let me call $r_N$ the number you get by truncating r after N digits. So, $r_1 = 0.9$ and $r_2 = 0.99$ and $r_5 = 0.99999$ and so on.

Since $r > r_N$ , it has to be true that $1 - r < 1 - r_N$ . And since we know what $r_N$ is, we can say exactly what $1 - r_N$ is. It's $\frac{1}{10^{N}}$ . And we picked N so that $\frac{1}{10^{N}} < \epsilon$ . So $1 - r < 1 - r_N = \frac{1}{10^{N}} < \epsilon$ . But all we know of $\epsilon$ is that it's a positive number. It can be any positive number. So $1 - r$ has to be smaller than each and every positive number. The biggest number that’s smaller than every positive number is zero. So the difference between 1 and r must be zero and so they must be equal.
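Before moving on, the arithmetic in that argument can be checked with exact fractions. This is only a sketch, and it proves nothing; it just confirms the bookkeeping for a few sample values of $\epsilon$ . The function names `smallest_n` and `truncation` are made up for illustration, and Python's `Fraction` type stands in for exact decimals.

```python
from fractions import Fraction

def smallest_n(epsilon):
    """Smallest counting number N with 1/10**N < epsilon (epsilon must be positive)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    n = 1
    while Fraction(1, 10**n) >= epsilon:
        n += 1
    return n

def truncation(n):
    """r_N: the number 0.99...9 with exactly n nines, as an exact fraction."""
    return 1 - Fraction(1, 10**n)

# The sample values from the text (0.125 and 0.00625), plus one bigger than 1.
for epsilon in (Fraction(1, 8), Fraction(1, 160), Fraction(3)):
    n = smallest_n(epsilon)
    gap = 1 - truncation(n)  # exactly 1/10**n
    print(epsilon, n, gap, gap < epsilon)
```

Each line printed ends in `True`: whatever positive $\epsilon$ we hand over, some truncation $r_N$ is already within $\epsilon$ of 1, and r is closer still.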

That is a compelling argument. Granted, it compels much the way your older brother kneeling on your chest and pressing your head into the ground compels. But this argument gives the flavor of what much of analysis is like.

For one, it is fussy, leaning to technical. You see why the subject has the reputation of driving off all but the most intent mathematics majors. If you get comfortable with this sort of argument it’s hard to notice anymore.

For another, the argument shows that the difference between two things is less than every positive number. Therefore the difference is zero and so the things are equal. This is one of mathematics’ most important tricks. And another point, there’s a lot of talk about $\epsilon$ . And about finding differences that are, it usually turns out, smaller than some $\epsilon$ . (As an undergraduate I found something wasteful in how the differences were so often so much less than $\epsilon$ . We can’t exhaust the small numbers, though. It still feels uneconomic.)

Something this misses is another trick, though. That’s adding zero. I couldn’t think of a good way to use that here. What we often get is the need to show that, say, function $f$ and function $g$ are equal. That is, that they are less than $\epsilon$ apart. What we can often do is show that $f$ is close to some related function, which let me call $f_n$ .

I know what you’re suspecting: $f_n$ must be a polynomial. Good thought! Although in my experience, it’s actually more likely to be a piecewise constant function. That is, it’s some number, eg, “2”, for part of the domain, and then “2.5” in some other region, with no transition between them. Some other values, even values not starting with “2”, in other parts of the domain. Usually this is easier to prove stuff about than even polynomials are.

But get back to $g$ . It’s got the same deal as $f$ : some approximation, call it $g_n$ , that’s easier to prove stuff about. So we want to show that $g$ is close to this $g_n$ . And then show that $f_n$ is close to $g_n$ . So, watch this trick. Or, again, watch the shape of this trick. Read again after the horizontal rule.

$| f - g | = |f - f_n + f_n -g_n + g_n - g |$

Now we use the “triangle inequality”. If a, b, and c are the lengths of a triangle’s sides, the sum of any two of those numbers is larger than the third. And that tells us:

$|f - f_n + f_n -g_n + g_n - g | \le |f - f_n| + |f_n - g_n| + | g_n - g |$

And then if you can show that $| f - f_n |$ is less than $\frac{1}{3}\epsilon$ ? And that $| f_n - g_n |$ is also less than $\frac{1}{3}\epsilon$ ? And you see where this is going for $| g_n - g |$ ? Then you’ve shown that $| f - g | \le \epsilon$ . With luck, each of these little pieces is something you can prove.

Don’t worry about what all this means. It’s meant to give a flavor of what you do in an analysis course. It looks hard, but most of that is because it’s a different sort of work than you’d done before. If you hadn’t seen the adding-zero and triangle-inequality tricks? I don’t know how long you’d need to imagine them.

There are other tricks too. An old reliable one is showing that one thing is bounded by the other. That is, that $f \le g$ . You use this trick all the time because if you can also show that $g \le f$ , then those two have to be equal.

The good thing, and there is good, is that once you get the hang of these tricks analysis starts to come together. And even get easier. The first course you take as a mathematics major is real analysis, all about functions of real numbers. The next course in this track is complex analysis, about functions of complex-valued numbers. And it is easy. Compared to what comes before, yes. But also on its own. Every theorem in complex analysis is named after Augustin-Louis Cauchy. They all show that the integral of your function, calculated along a closed loop, is zero. I exaggerate by $\epsilon$ .

In grad school, if you make it, you get to functional analysis, which examines functions on functions and other abstractions like that. This, too, is easy, possibly because all the basic approaches you’ve seen several courses over. Or it feels easy after all that mucking around with the real numbers.

This is not the entirety of explaining how mathematics works. Since all these proofs depend on how numbers work, we need to show how numbers work. How logic works. But those are subjects we can leave for grad school, for someone who’s survived this gauntlet.

I hope to return in a week with a fresh A-to-Z essay. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all this year’s essays, and all A-to-Z essays from past years, should be at this link. Thank you once more for reading.

My Little 2021 Mathematics A-to-Z: Monte Carlo

This week’s topic is one of several suggested again by Mr Wu, blogger and Singaporean mathematics tutor. He’d suggested several topics, overlapping in their subject matter, and I was challenged to pick one.

Monte Carlo.

The reputation of mathematics has two aspects: difficulty and truth. Put “difficulty” to the side. “Truth” seems inarguable. We expect mathematics to produce sound, deductive arguments for everything. And that is an ideal. But we often want to know things we can’t do, or can’t do exactly. We can handle that often. If we can show that a number we want must be within some error range of a number we can calculate, we have a “numerical solution”. If we can show that a number we want must be within every error range of a number we can calculate, we have an “analytic solution”.

There are many things we’d like to calculate and can’t exactly. Many of them are integrals, which seem like they should be easy. We can represent any integral as finding the area, or volume, of a shape. The trick is that there’s only a few shapes with volumes we can find exact formulas for. You may remember the area of a triangle or a parallelogram. You have no idea what the area of a regular nonagon is. The trick we rely on is to approximate the shape we want with shapes we know formulas for. This usually gives us a numerical solution.

If you’re any bit devious you’ve had the impulse to think of a shape that can’t be broken up like that. There are such things, and a good swath of mathematics in the late 19th and early 20th centuries was arguments about how to handle them. I don’t mean to discuss them here. I’m more interested in the practical problems of breaking complicated shapes up into simpler ones and adding them all together.

One catch, an obvious one, is that if the shape is complicated you need a lot of simpler shapes added together to get a decent approximation. Less obvious is that you need way more shapes to do a three-dimensional volume well than you need for a two-dimensional area. That’s important because you need even way-er more to do a four-dimensional hypervolume. And more and more and more for a five-dimensional hypervolume. And so on.

That matters because many of the integrals we’d like to work out represent things like the energy of a large number of gas particles. Each of those particles carries six dimensions with it. Three dimensions describe its position and three dimensions describe its momentum. Worse, each particle has its own set of six dimensions. The position of particle 1 tells you nothing about the position of particle 2. So you end up needing ridiculously, impossibly many shapes to get even a rough approximation.
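To put rough numbers on that growth: if we chop every axis into the same number of pieces, the count of little boxes is that number raised to the power of the dimension. A sketch, in which the choice of ten pieces per axis is my own arbitrary assumption and `cell_count` is a made-up name:

```python
def cell_count(pieces_per_axis, dimensions):
    """Number of boxes in a regular grid with the same resolution on every axis."""
    return pieces_per_axis ** dimensions

# Areas, volumes, hypervolumes.
for dimensions in (2, 3, 4, 5):
    print(dimensions, "dimensions:", cell_count(10, dimensions), "boxes")

# Six dimensions (three of position, three of momentum) for every particle.
for particles in (1, 2, 10):
    dimensions = 6 * particles
    print(particles, "particles:", cell_count(10, dimensions), "boxes")
```

Ten pieces per axis is a crude mesh, and ten particles is nothing like a gas, and the last count is already a 1 followed by sixty zeroes.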

With no alternative, then, we try wisdom instead. We train ourselves to think of deductive reasoning as the only path to certainty. By the rules of deductive logic it is. But there are other unshakeable truths. One of them is randomness.

We can show — by deductive logic, so we trust the conclusion — that the purely random is predictable. Not in the way that lets us say how a ball will bounce off the floor. In the way that we can describe the shape of a great number of grains of sand dropped slowly on the floor.

The trick is one we might get if we were bad at darts. If we toss darts at a dartboard, badly, some will land on the board and some on the wall behind. How many hit the dartboard, compared to the total number we throw? If we’re as likely to hit any one spot of the wall as any other, then the fraction that hit the dartboard, times the area of the wall, should be about the area of the dartboard.

So we can do something equivalent to this dart-throwing to find the volumes of these complicated, hyper-dimensional shapes. It’s a kind of numerical integration. It isn’t particularly sensitive to how complicated the shape is, though. It takes more work to find the volume of a shape with more dimensions, yes. But it takes less more-work than the breaking-up-into-known-shapes method does. There are wide swaths of mathematics and mathematical physics where this is the best way to calculate the integral.
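Here is a minimal sketch of the dart-throwing in Python. The setup is my own choice for illustration: the “wall” is the box running from -1 to 1 along every axis, and the “dartboard” is the ball of radius 1 inside it, picked because its volume is something we can look up and compare against. The name `ball_volume` and the seed are made up, and this is not meant as how any particular physics code does it.

```python
import random

def ball_volume(dimensions, n_darts, rng):
    """Estimate the volume of the radius-1 ball by throwing darts at the box [-1, 1]^d."""
    wall_volume = 2.0 ** dimensions
    hits = 0
    for _ in range(n_darts):
        # One dart: a point chosen uniformly at random in the box.
        point = [rng.uniform(-1.0, 1.0) for _ in range(dimensions)]
        if sum(c * c for c in point) <= 1.0:
            hits += 1
    # Fraction that hit the dartboard, times the size of the wall.
    return wall_volume * hits / n_darts

rng = random.Random(2021)  # fixed seed so that a run can be repeated
for dimensions in (2, 3, 4):
    print(dimensions, ball_volume(dimensions, 100_000, rng))
```

The exact answers are $\pi$ , $\frac{4}{3}\pi$ , and $\frac{1}{2}\pi^2$ , about 3.14, 4.19, and 4.93. The same hundred thousand darts do a tolerable job in each dimension, which is the appeal.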

This bit that I’ve described is called “Monte Carlo integration”. The “integration” part of the name because that’s what we started out doing. To call it “Monte Carlo” implies either the method was first developed there or the person naming it was thinking of the famous casinos. The case is the latter. Monte Carlo methods as we know them come from Stanislaw Ulam, mathematical physicist working on atomic weapon design. While ill, he got to playing the game of Canfield solitaire, about which I know nothing except that Stanislaw Ulam was playing it in 1946 while ill. He wondered what the chance was that a given game was winnable. The most practical approach was sampling: set a computer to play a great many games and see what fraction of them were won. (The method comes from Ulam and John von Neumann. The name itself comes from their colleague Nicholas Metropolis.)

There are many Monte Carlo methods, with integration being only one very useful one. They hold in common that they’re built on randomness. We try calculations, often simple ones, many times over with many different possible values. And the regularity, the predictability, of randomness serves us. The results come together to an average that is close to the thing we do want to know.

I hope to return in a week with a fresh A-to-Z essay. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all A-to-Z essays from past years, should be at this link. And if you’d like to shape the next several essays, please let me know of some topics worth writing about! Thank you for reading.

I’m looking for some more topics for the Little 2021 Mathematics A-to-Z

I am happy to be near the midpoint of my Little 2021 Mathematics A-to-Z. It feels like forever since I planned to start this, but it has been a long and a hard year. I am in need of topics for the third quarter of letters, the end of the word ‘Mathematics’, and so I appeal to my kind readers for help.

What are mathematical topics which start with the letters I, C, or S, that you’d like to see me try explaining? Leave a comment, and let me know. I’ll pick the one I think I can be most interesting about. As you nominate things, please also include a mention of your own blog or YouTube channel or book. Whatever other projects you do that people might enjoy. The projects don’t need to be mathematical. The topics don’t need to be either, although I like being able to see mathematics from them.

Here are the topics I’ve covered in past years. I’m willing to consider redoing one of these, if I can find a fresh approach. So don’t be afraid to ask if you think I might do a better job about, oh, cohomology or something.

I.

Into (2015)
Isomorphism (Leap Day 2016)
Image (End 2016)
Integration (2017)
Infinite Monkey Theorem (2018)
Infimum (2019)
Imaginary Numbers (2020)

C.

S.

Step (2015)
Surjective Map (Leap Day 2016)
Smooth (End 2016)
Sárközy’s Theorem (2017)
Sorites Paradox (2018)
Sample Space (2019)
Statistics (2020)

(Please note: there’s nothing I can do with cohomology. I did my best and that’s how it came out.)

All the Little 2021 A-to-Z essays should be at this link. And if you like, all of my A-to-Z essays, for every year, should be at this link. Thanks for reading, and thanks for suggesting things.

My Little 2021 Mathematics A-to-Z: Embedding

Elkement, who’s one of my longest blog-friends here, put forth this suggestion for an ‘E’ topic. It’s a good one. They’re author of the Theory and Practice of Trying to Combine Just Anything blog. Their blog has recently been exploring complex-valued numbers and how to represent rotations.

Embedding.

Consider a book. It’s a collection. It’s easy to see the ordered setting of words, maybe pictures, possibly numbers or even equations. The important thing is the ideas those all represent.

Set the book in a library. How can this change the book?

Perhaps the comparison to other books shows us something the original book neglected. Perhaps something in the original book we now realize was a brilliantly-presented insight. The way we appreciate the book may change.

What can’t change is the content of the original book. The words stay the same, in the same order. If it’s a physical book, the number of pages stays the same, as does the size of the page. The ideas expressed remain the same.

So now you understand embedding. It’s a broad concept, something that can have meaning for any mathematical structure. A structure here is a bunch of items and some things you can do with them. A group, for example, is a good structure to use with this sort of thing. So, for example, the integers and regular addition. This original structure’s embedded in another when everything in the original structure is in the new, and everything you can do with the original structure you can do in the new and get the same results. So, for example, the group you get by taking the integers and regular addition? That’s embedded in the group you get by taking the rational numbers and regular addition. 4 + 8 is 12 whether or not you consider 6.5 a topic fit for discussion. It’s an embedding that expands the set of elements, and that modifies the things you can do to match.
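For the integers-into-rationals example, the claim “you get the same results” is something we can spot-check. This sketch uses Python’s `Fraction` type as a stand-in for the rational numbers; the function name `embed` and the sample range are my own choices, and checking a handful of integers is an illustration, not a proof.

```python
from fractions import Fraction
from itertools import product

def embed(n):
    """Send the integer n to the rational number n/1."""
    return Fraction(n, 1)

sample = range(-5, 6)
for a, b in product(sample, repeat=2):
    # Adding and then embedding matches embedding and then adding.
    assert embed(a + b) == embed(a) + embed(b)

# Different integers land on different rationals, so nothing gets lost.
assert len({embed(n) for n in sample}) == len(sample)

print(embed(4) + embed(8))         # 12, same as in the integers
print(embed(4) + Fraction(13, 2))  # 21/2, a sum the integers have no way to state
```

The last line is the payoff of the bigger structure: 6.5 is now a topic fit for discussion, and 4 + 8 has not noticed.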

The group you get from the integers and addition is embedded in other things. For example, it’s embedded in the ring you get from the integers and regular addition and regular multiplication. 4 + 8 remains 12 whether or not you can multiply 4 by 8. This embedding doesn’t add any new elements, just new things you can do with them.

Once you have the name, you see embedding everywhere. When we first learn arithmetic we — I, anyway — learn it as adding whole numbers together. Then we embed that into whole numbers with addition and multiplication. And then the (nonnegative) rational numbers with addition and multiplication. At some point (I forget when) the negative numbers came in. So did the whole set of real numbers. Eventually the real numbers got embedded into the complex numbers. And the complex numbers got embedded into the quaternions, although we found real and complex numbers enough for most of our work. I imagine something similar goes on these days.

There’s never only one embedding possible. Consider, for example, two-dimensional geometry, the shapes of figures on a sheet of paper. It’s easy to put that in three dimensions, by setting the paper on the floor, and expand it by drawing in chalk on the wall. Or you can set the paper on the wall, and extend its figures by drawing in chalk on the floor. Or set the paper at an angle to the floor. What you use depends on what’s most convenient. And that can be driven by laziness. It’s easy to match, say, the point in two dimensions at coordinates (3, 4) with the point in three dimensions at coordinates (3, 4, 0), even though (0, 3, 4) or (4, 0, 3) are as valid.

Why embed something in another thing? For the same reasons we do any transformation in mathematics. One is that we figure to embed the thing we’re working on into something easier to deal with. A famous example of this is the Nash embedding theorem. It describes when certain manifolds can be embedded into something that looks like normal space. And that’s useful because it can turn nonlinear partial differential equations — the most insufferable equations — into something solvable.

Another good reason, though, is the one implicit in that early arithmetic education. We started with whole-numbers-with-addition. And then we added the new operation of multiplication. And then new elements, like fractions and negative numbers. If we follow this trail we get to some abstract, tricky structures like octonions. But by small steps in which we have great experience guiding us into new territories.

My Little 2021 Mathematics A-to-Z: Hyperbola

John Golden, author of the Math Hombre blog, had several great ideas for the letter H in this little A-to-Z for the year. Here’s one of them.

Hyperbola.

The hyperbola is where advanced mathematics begins. It’s a family of shapes, some of the pieces you get by slicing a cone. You can make an approximate one shining a flashlight on a wall. Other conic sections are familiar, everyday things, though. Circles we see everywhere. Ellipses we see everywhere we look at a circle in perspective. Parabolas we learn, in approximation, watching something tossed, or squirting water into the air. The hyperbola should be as accessible. Hold your flashlight parallel to the wall and look at the outline of light it casts. But the difference between this and a parabola isn’t obvious. And it’s harder to see hyperbolas in nature. It’s the path a space probe swinging past a planet makes? Great guide for all of us who’ve launched space probes past Jupiter.

When we learn of hyperbolas, somewhere in high school algebra or in precalculus, they seem designed to break the rules we had inferred. We’ve learned functions like lines and quadratics (parabolas) and cubics. They’re nice, simple, connected shapes. The hyperbola comes in two pieces. We’ve learned that the graph of a function crosses any given vertical line at most once. Now, we can expect to see it twice. We learn to sketch functions by finding a few interesting points: roots, y-intercepts, things like that. Hyperbolas, we’re taught to draw this little central box and then two asymptotes. Also, we have asymptotes, a simpler curve that the actual curve almost equals.

We’re trained to see functions having the couple odd points where they’re not defined. Nobody expects $y = 1 \div x$ to mean anything when $x$ is zero. But we learn these as weird, isolated points. Now there’s this interval of x-values that don’t fit anything on the graph. Half the time, anyway, because we see two classes of hyperbolas. There’s ones that open like cups, pointing up and down. Those have definitions for every value of x. There’s ones that open like ears, pointing left and right. Those have a box in the center where no y satisfies the x’s. They seem like they’re taught just to be mean.

They’re not, of course. The only mathematical thing we teach just to be mean is integration by trigonometric substitution. The things which seem weird or new in hyperbolas are, largely, things we didn’t notice before. A vertical line put across a circle or ellipse crosses the curve twice, most points. There are two huge intervals, to the left and to the right of the circle, where no value of y makes the equation true. Circles are familiar, though. Ellipses don’t seem intimidating. We know we can’t turn $x^2 + y^2 = 4$ (a typical circle) into a function without some work. We have to write either $f(x) = \sqrt{4 - x^2}$ or $f(x) = -\sqrt{4 - x^2}$ , breaking the circle into two halves. The same happens for hyperbolas, though, with $x^2 - y^2 = 4$ (a typical hyperbola) turning into $f(x) = \sqrt{x^2 - 4}$ or $f(x) = -\sqrt{x^2 - 4}$ .

Even the definitions seem weird. The ellipse we can draw by taking a set distance and two focus points. If the distance from the first focus to a point plus the distance from the point to the second focus is that set distance, the point’s on the ellipse. We can use two thumbtacks and a piece of string to draw the ellipse. The hyperbola has a similar rule, but weirder. You have your two focus points, yes. And a set distance. But the locus of points of the hyperbola is everything where the distance from the point to one focus minus the distance from the point to the other focus is that set distance. Good luck doing that with thumbtacks and string.

Yet hyperbolas are ready for us. Consider playing with a decent calculator, hitting the reciprocal button for different numbers. 1 turns to 1, yes. 2 turns into 0.5. -0.125 turns into -8. It’s the simplest iterative game to do on the calculator. If you sketch this, though, all the points (x, y) where one coordinate is the reciprocal of the other? It’s two curves. They approach without ever touching the x- and y-axes. Get far enough from the origin and there’s no telling this curve from the axes. It’s a hyperbola, one that obeys that vertical-line rule again. It has only the one value of x that can’t be allowed. We write it as $y = \frac{1}{x}$ or even $xy = 1$ . But it’s the shape we see when we draw $x^2 - y^2 = 2$ , rotated. Or a rotation of one we see when we draw $y^2 - x^2 = 2$ . The equations of rotated shapes are annoying. We do enough of them for ellipses and parabolas and hyperbolas to meet the course requirement. But they point out how the hyperbola is a more normal construct than we fear.
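The rotation claim is easy to check numerically, if tedious by hand. This sketch takes points with $y = \frac{1}{x}$ , turns them an eighth of a turn about the origin, and evaluates $u^2 - v^2$ for the turned coordinates, which I am calling u and v. The names `rotate` and `turned_value` are made up for the illustration.

```python
import math

def rotate(x, y, angle):
    """Rotate the point (x, y) counterclockwise about the origin by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def turned_value(x, angle):
    """Take the point (x, 1/x), rotate it, and return u^2 - v^2 for the result."""
    u, v = rotate(x, 1.0 / x, angle)
    return u * u - v * v

for x in (0.25, 0.5, 1.0, 2.0, -3.0):
    clockwise = turned_value(x, -math.pi / 4)        # lands on u^2 - v^2 = 2
    counterclockwise = turned_value(x, math.pi / 4)  # lands on v^2 - u^2 = 2
    print(x, round(clockwise, 9), round(counterclockwise, 9))
```

Turning one way gives 2 every time, and turning the other way gives -2, which is the $y^2 - x^2 = 2$ case. The algebra behind it is short: after the clockwise turn, $u^2 - v^2$ works out to $2xy$ .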

And let me look at that construct again. An equation describing a hyperbola that opens horizontally or vertically looks like $ax^2 - by^2 = c$ for some constant numbers a, b, and c. (If a, b, and c are all positive, this is a hyperbola opening horizontally. If a and b are positive and c negative, this is a hyperbola opening vertically.) An equation describing an ellipse, similarly with its axes horizontal or vertical looks like $ax^2 + by^2 = c$ . (These are shapes centered on the origin. They can have other centers, which make the equations harder but not more enlightening.) The equations have very similar shapes. Mathematics trains us to suspect things with similar shapes have similar properties. That change from a plus to a minus seems too important to ignore, and yet …

I bet you assumed x and y are real numbers. This is convention, the safe bet. If someone wants complex-valued numbers they usually say so. If they don’t want to be explicit, they use z and w as variables instead of x and y. But what if y is an imaginary number? Suppose $y = \imath t$ , for some real number t, where $\imath^2 = -1$ . You haven’t missed a step; I’m summoning this from nowhere. (Let’s not think about how to draw a point with an imaginary coordinate.) Then $ax^2 - by^2 = c$ is $ax^2 - b(\imath t)^2 = c$ which is $ax^2 + bt^2 = c$ . And despite the weird letters, that’s an ellipse. By the same supposition we could go from $ax^2 + by^2 = c$ , which we’d taken to be an ellipse, and get $ax^2 - bt^2 = c$ , a hyperbola.

Fine stuff inspiring the question “so?” I made up a case and showed how that made two dissimilar things look alike. All right. But consider trigonometry, built on the cosine and sine functions. One good way to see the cosine and sine of an angle is as the x- and y-coordinates of a point on the unit circle, where $x^2 + y^2 = 1$ . (The angle $\theta$ is the one from the point $(\cos(\theta), \sin(\theta))$ to the origin to the point (1, 0).)

There exists, in parallel to the familiar trig functions, the “hyperbolic trigonometric functions”. These have imaginative names like the hyperbolic sine and hyperbolic cosine. (And onward. We can speak of the “inverse hyperbolic cosecant”, if we wish no one to speak to us again.) Usually these get introduced in calculus, to give the instructor a tiny break. Their derivatives, and integrals, look much like those of the normal trigonometric functions, but aren’t the exact same problems over and over. And these functions, too, have a compelling meaning. The hyperbolic cosine of an angle and hyperbolic sine of an angle have something to do with points on a unit hyperbola, $x^2 - y^2 = 1$ .
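Both halves of this can be poked at with a few lines of Python. The sketch checks that the point $(\cosh(t), \sinh(t))$ sits on the unit hyperbola, and that feeding an imaginary input to the ordinary cosine and sine hands back the hyperbolic ones. The function name `unit_hyperbola_point` is made up, and I call the input t a “parameter” rather than commit to what sort of angle it is.

```python
import cmath
import math

def unit_hyperbola_point(t):
    """The point (cosh t, sinh t), which should satisfy x^2 - y^2 = 1."""
    return (math.cosh(t), math.sinh(t))

for t in (0.0, 0.5, 1.0, 2.0):
    x, y = unit_hyperbola_point(t)
    cos_it = cmath.cos(1j * t)  # equals cosh(t), a real number
    sin_it = cmath.sin(1j * t)  # equals i times sinh(t), an imaginary number
    print(t, x * x - y * y, cos_it, sin_it)
```

So the circle’s rule $\cos^2 + \sin^2 = 1$ , given an imaginary input, turns into $\cosh^2 - \sinh^2 = 1$ . It is the same plus-to-minus swap as before.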

Thinking back on the flashlight. We get a circle by holding the light perpendicular to the wall. We get a hyperbola holding the light parallel. We get a circle by drawing $x^2 + y^2 = 1$ with x and y real numbers. We get a hyperbola by (somehow) drawing $x^2 + y^2 = 1$ with x real and y imaginary. We remember something about representing complex-valued numbers with a real axis and an orthogonal imaginary axis.

One almost feels the connection. I can’t promise that pondering this will make hyperbolas be as familiar as circles or at least ellipses. But often a problem that brings us to hyperbolas has an alternate phrasing that’s ellipses, and vice-versa. But the common traits of these conic slices can guide you into a new understanding of mathematics.

Thank you for reading. I hope to have another piece next week at this time. This and all of this year’s Little Mathematics A to Z essays should be at this link. And the A-to-Z essays for every year should be at this link.

My Little 2021 Mathematics A-to-Z: Torus

Mr Wu, a mathematics tutor in Singapore and author of the blog about that, offered this week’s topic. It’s about one of the iconic mathematics shapes.

Torus

When one designs a board game, one has to decide what the edge of the board means. Some games make getting to the edge the goal, such as Candy Land or backgammon. Some games set their play so the edge is unreachable, such as Clue or Monopoly. Some make the edge an impassable limit, such as Go or Scrabble or Checkers. And sometimes the edge becomes something different.

Consider a strategy game like Risk or Civilization or their video game descendants like Europa Universalis. One has to be able to go east, or west, without limit. But there’s no making a cylindrical board. Or making a board infinite in extent, side to side. Instead, the game demands we connect borders. Moving east one space from just-at-the-Eastern-edge means we put the piece at just-at-the-Western-edge. As a video game this is seamless. As a tabletop game we just learn to remember those units in Alberta are not so far from Kamchatka as they look. We have the awkward point that the board doesn’t let us go over the poles. It doesn’t hurt game play: no one wants to invade Russia from the north. We can represent a boundless space on our table.

Sometimes we need more. Consider the arcade game Asteroids. The player’s spaceship hopes to survive by blasting into dust asteroids cluttered around them. The game ‘board’ is the arcade screen, a manageable slice of space. Asteroids move in any direction, often drifting off-screen. If they were out of the game, this would make victory so easy as to be unsatisfying. So the game takes a tip from the strategy games, and connects the right edge of the screen to the left. If we ask why an asteroid last seen moving to the right now appears on the left, well, there are answers. One is to say we’re in a very average segment of a huge asteroid field. There’s about as many asteroids that happen to be approaching from off-screen as recede from us. Why our local work destroying asteroids eliminates the off-screen asteroids is a mystery for the ages. Perhaps the rest of the fleet is also asteroid-clearing at about our pace. What matters is we still have to do something with the asteroids.

Almost. We’ve still got asteroids leaking away through the top and bottom. But we can use the same trick the right and left edges do. And now we have some wonderful things. One is a balanced game. Another is the space in which ship and asteroids move. It is no rectangle now, but a torus.
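In code, joining the edges comes down to the remainder operation. A minimal sketch, assuming a screen 640 units wide and 480 tall; those numbers, and the function name `step`, are invented for the illustration and not taken from the actual arcade game.

```python
def step(position, velocity, width, height):
    """Move one tick on a screen whose opposite edges are joined together."""
    x, y = position
    dx, dy = velocity
    # The remainder wraps anything past an edge back around to the far side.
    return ((x + dx) % width, (y + dy) % height)

position = (630.0, 5.0)
for _ in range(3):
    position = step(position, (8.0, -4.0), 640, 480)
    print(position)
# (638.0, 1.0)
# (6.0, 477.0)
# (14.0, 473.0)
```

On the second tick the asteroid leaves by the right edge and the top edge at once, and turns up near the bottom-left corner. This leans on Python’s remainder being nonnegative when the divisor is positive; a language where `-3 % 480` comes out negative needs an extra line.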

This is a neat space to explore. It’s unbounded, for example, just as the surface of the Earth is. Or (it appears) the actual universe is. Set your course right and your spaceship can go quite a long way without getting back to exactly where it started from, again much like the surface of the Earth or the universe. We can impersonate an unbounded space using a manageably small set of coordinates, a decent-size game board.

That’s a nice trick to have. Many mathematics problems are about how great blocks of things behave. And it’s usually easiest to model these things if there aren’t boundaries. We can handle boundaries, sure, but they’re hard, most of the time. So we analyze great, infinitely-extending stretches of things.

Analysis does great things. But we need sometimes to do simulations, too. Computers are, as ever, great tempting setups to this. Look at a spreadsheet with hundreds of rows and columns of cells. Each can represent a point in space, interacting with whatever’s nearby by whatever our rule is. And this can do very well … except these cells have to represent a finite territory. A million rows can’t span more than one million times the greatest distance between rows. We have to handle that.

There are tricks. One is to model the cells as being at ever-expanding distances, trusting that there are regions too dull to need much attention. Another is to give the boundary some values that, we figure, look as generic as possible. That “past here it carries on like that”. The trick that makes rhetorical sense to mention here is creating a torus, matching left edge to right, top edge to bottom. Front edge to back if it’s a three-dimensional model.

Making a torus works if a particular spot is mostly affected by its local neighborhood. This describes a lot of problems we find interesting. Many of them are in statistical mechanics, where we do a lot of problems about particles in grids that can do one of two things, depending on the locale. But many mechanics problems work like this too. If we’re interested in how a satellite orbits the Earth, we can ignore that Saturn exists, except maybe as something it might photograph.

And just making a grid into a torus doesn’t solve every problem. This is obvious if you imagine making a torus that’s two rows and two columns linked together. There won’t be much interesting behavior there. Even a reasonably large grid offers problems. There might be structures larger than the torus is across or wide, for example, worth study, and those will be missed. That we have a grid means that a shape is easier to represent if it’s horizontal or vertical. In a real continuous space there’s no directions to be partial to.

There are topology differences too. A famous result shows that four colors are enough to color any map on the plane. On the torus we can need as many as seven. Putting colors on things may seem like a trivial worry. But map colorings represent information about how stuff can be connected. And here’s a huge difference in these connections.

This all is about one aspect of a torus. Likely you came in wondering when I would get to talking about doughnut shapes, and the line about topology may have readied you to hear about coffee cups. The torus, like most any mathematical concept familiar enough ordinary people know the word, connects to many ideas. Some of them have more than one hole. Some have surfaces that intersect themselves. Some extend into four or more dimensions. Some are even constructs that appear in phase space, describing ways that complicated physical systems can behave. These are all reflections of this shape idea that we can learn from thinking about game boards.

This and all of this year’s Little Mathematics A to Z essays should be at this link. And the A-to-Z essays for every year should be at this link.

My Little 2021 Mathematics A-to-Z: Addition

John Golden, who so far as I know doesn’t have an active blog, suggested this week’s topic. It pairs nicely with last week’s. I link to that in text, but if you would like to read all of this year’s Little Mathematics A to Z it should be at this link. And if you’d like to see all of my A-to-Z projects, please try this link. Thank you.

Addition

When I wrote about multiplication I came to the peculiar conclusion that it was the same as addition. This is true only in certain lights. When we study [abstract] algebra we look at things that look like arithmetic. The simplest useful thing that looks like arithmetic is a group. It has a set of elements, and a pairwise “group operation”. That group operation we call multiplication, if we don’t have a better name. We give it two elements and it gives us one. Under certain circumstances, this multiplication looks just like addition does.

But we have reason to think addition and multiplication aren’t the same. Where do we get addition?

We can make a meaningful addition by giving it something to interact with. By adding another operation. This turns the group into a ring. As it has two operations, it’s hard to resist calling one of them addition and the other multiplication. The new multiplication follows many of the rules the addition did. Adding two elements together gives you an element in the ring. So does multiplying. Addition is associative: $a + (b + c)$ is the same thing as $(a + b) + c$ . So is multiplication: $a \times (b \times c)$ is the same thing as $(a \times b) \times c$ .

And then the addition and the multiplication have to interact. If they didn’t, we’d just have a group with two operations. I don’t know anyone who’s found a good use for that. The way addition and multiplication interact we call distribution. This is represented by two rules, both of them depending on elements a, b, and c:

$a\times(b + c) = a\times b + a\times c$

$(a + b)\times c = a\times c + b\times c$
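To see the two rules at work, here is a brute-force check on one small ring. I'm assuming the integers modulo 6 as the example (my choice; nothing in the argument depends on it), with both operations taken modulo 6.

```python
from itertools import product

N = 6  # assumed modulus, for illustration only

def add(a, b):
    return (a + b) % N

def mul(a, b):
    return (a * b) % N

def distributes(elements):
    """Check both distributive rules for every triple of elements."""
    for a, b, c in product(elements, repeat=3):
        # Left rule: a x (b + c) = a x b + a x c
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
        # Right rule: (a + b) x c = a x c + b x c
        if mul(add(a, b), c) != add(mul(a, c), mul(b, c)):
            return False
    return True

print(distributes(range(N)))
```

A check like this only covers the one ring it was run on. It's a way to watch the rules hold, not a proof of anything general.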

This is where we get something we have to call addition. It’s in having the two interacting group operations.

A problem which would have worried me at age eight: do we know we’re calling the correct operation “addition”? Yes, yes, names are arbitrary. But are we matching the thing we think we’re doing when we calculate 2 + 2 to addition and the thing for 2 x 2 to multiplication? How do we tell these two apart?

For all that they start the same, and resemble one another, there are differences. Addition has an identity, something that works like zero. $a + 0$ is always $a$ , whatever $a$ is. Multiplication … the multiplication we use every day has an identity, that is, 1. Are we required to have a multiplicative identity, something so that $a \times 1$ is always $a$ ? That depends on what it said in the Introduction to Algebra textbook you learned on. If you want to be clear your rings do have a multiplicative identity you call it a “unit ring”. If you want to be clear you don’t care, I don’t know what to say. I’m told some people write that as “rng”, to hint that this identity is missing.

Addition always has an inverse. Whatever element $a$ you pick, there is some $-a$ so that $-a + a$ is the additive identity. Multiplication? Even if we have a unit ring, there’s not always a reciprocal. The integers are a unit ring. But there are only two integers that have an integer multiplicative inverse, something you can multiply them by to get 1. If the elements of your unit ring do have multiplicative inverses, this is called a division algebra. Rational numbers, for example, are a division algebra.
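A quick check of that claim about the integers, as a sketch. It searches a range of integers (the bounds are an arbitrary assumption of mine) for those whose reciprocal is also an integer, using exact fractions so no rounding sneaks in.

```python
from fractions import Fraction

def integers_with_integer_reciprocal(low, high):
    """List the integers n in [low, high] for which 1/n is again an integer."""
    found = []
    for n in range(low, high + 1):
        if n == 0:
            continue  # 1/0 is not defined, so skip it
        if Fraction(1, n).denominator == 1:
            found.append(n)
    return found

print(integers_with_integer_reciprocal(-1000, 1000))  # -> [-1, 1]
```

A finite search can't settle the question for every integer, of course. But it's easy to see why nothing bigger turns up: the reciprocal of anything larger than 1 in size falls strictly between -1 and 1.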

So for some rings, like the integers, there’s an obvious difference between addition and multiplication. But for the rational numbers? Can we tell the operations apart?

We can, through the additive identity, which please let me call 0. And the multiplicative identity, which please let me call 1. Is there a multiplicative inverse of 0? Suppose there is one; let me call it $c$ , because I need some name. Then of all the things in the world, we know this:

$0 \times c = 1$

I can replace anything I like with something equal to it. So, for example, I can replace 0 with the sum of an element and its additive inverse. Like, $(-a + a)$ for some element $a$ . So then:

$(-a + a) \times c = 1$

And distribute this away!

$-a\times c + a\times c = 1$

I don’t know what number $a\times c$ is, nor what its inverse $-a\times c$ is. But I know their sum is zero. And so

$0 = 1$

This looks like trouble. But, all right, why not have the additive and the multiplicative identities be the same number? Mathematicians like to play with all kinds of weird things; why not this weirdness?

The why not is that you work out pretty fast that every element has to be equal to every other element. If you’re not sure how, consider the starting line of that little proof, but with an element $b$ :

$0 \times c \times b = 1 \times b$
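In case it helps, here is one way to fill in the steps. It leans on the fact that zero times anything is zero, which I'm taking as given; it follows from distribution, since $0 \times x = (0 + 0) \times x = 0 \times x + 0 \times x$ . Group the left side so that the zero multiplies everything else:

$0 \times (c \times b) = 0$

The right side is just $b$ , since 1 is the multiplicative identity. So

$0 = b$

and that holds whichever element $b$ we started with.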

So there, finally, is a crack between addition and multiplication. Addition’s identity element, its zero, can’t have a multiplicative inverse. Multiplication’s identity element, its one, must have an additive inverse. We get addition from the thing we can’t un-multiply.

It may have struck you that if all we want is a ring with the lone element of 0 (or 1), then we can have addition and multiplication be indistinguishable again. And have the additive and multiplicative identities be the same thing. There’s nothing else for them to be. This is true, and we can. Unfortunately this ring doesn’t do much that’s interesting, except maybe prove some theorem we were working on isn’t always true. So we usually draw a box around it, acknowledge it once, and then exclude it from division algebras and fields and other things of interest. It’s much the same way we normally rule out 1 as a prime number. It’s an example that is too much bother to include given how unenlightening it is.

You can have groups and attach to them a multiplication and an addition and another binary operation. Those aren’t of such general interest that you study them much as an undergraduate.

And this is what we know of addition. It looks almost like a second multiplication. But it interacts just enough with multiplication to force the two to be distinguishable. From that we can create mathematics structures as interesting as arithmetic is.

My Little 2021 Mathematics A-to-Z: Multiplication

I wanted to start the Little 2021 Mathematics A-to-Z with more ceremony. These glossary projects are fun and work in about equal measure. But an already hard year got much harder about a month and a half back, and it hasn’t been getting much better. I’m even considering cutting down the reduced A-to-Z project I am doing. But I also feel I need to get some structured work under way. And sometimes only ambition will overcome a diminished world. So I begin, and with luck, will keep posting weekly essays about mathematical terms.

Today’s was a term suggested by Iva Sallay, longtime blog friend and creator of the Find The Factors recreational mathematics puzzle. Also a frequent host of the Playful Math Education Blog Carnival, a project quite worth reading and a great hosting challenge too. And as often makes for a delightful A-to-Z topic, it’s about something so commonplace one forgets it can hold surprises.

Multiplication

A friend pondering mathematics said they know you learn addition first, but that multiplication somehow felt more fundamental. I supported their insight. We learn two plus two first. It’s two times two where we start seeing strange things.

Suppose for the moment we’re interested only in the integers. Zero multiplied by anything is zero. There’s nothing like that in addition. Consider even numbers. An even number times anything gives you an even number again. There’s no duplicating that in addition. But this trait isn’t even unique to even numbers. Multiples of three, or four, or 237 assimilate the integers by multiplication the same way. You can find an integer to add to 2 to get 5; you can’t find an integer to multiply by 2 to get 5. Or consider prime numbers. There’s no integer you can make by only one, or only finitely many, different sums. New possibilities, and restrictions, happen in multiplication.

Whether this makes multiplication the foundation of mathematics, or at least arithmetic, is a judgement. It depends how basic your concepts must be, and what you decide is important. Mathematicians do have a field which studies “things that look like arithmetic”, though. We call this algebra. Or call it abstract algebra to clarify it’s not that stuff with the quadratic formula. And that starts with group theory. A group is made of two things. One is a collection of elements. The other is a thing to do with pairs of elements. Generically, we call that multiplication.

A possible multiplication has to follow a couple rules. It has to be a binary operation on your group’s set. That is, it matches two things in the set to something in the set. There has to be an identity, something that works like 1 does for multiplying numbers. It has to be associative. If you want to multiply three things together, you can start with whatever pair looks easier. Every element has to have an inverse, something you can multiply it by to get 1 as the product.
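For a finite set these rules can be checked by brute force. This is a sketch with names of my own invention. The two test cases, multiplication modulo 5 and modulo 6 on the nonzero remainders, are assumptions chosen for illustration.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of the four rules on a finite set."""
    elements = list(elements)
    # Binary operation on the set: the result never leaves the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Identity: some e with e*a == a*e == a for every a.
    identities = [e for e in elements
                  if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not identities:
        return False
    e = identities[0]
    # Associativity: how we group three elements does not matter.
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False
    # Inverses: every a has some b with a*b == b*a == e.
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

print(is_group([1, 2, 3, 4], lambda a, b: (a * b) % 5))     # -> True
print(is_group([1, 2, 3, 4, 5], lambda a, b: (a * b) % 6))  # -> False
```

The second case fails at the first rule: 2 times 3 is 0 modulo 6, and 0 isn't in the set.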

That’s all, and that’s not much. This description covers a lot of things. For example, there’s regular old multiplication, for the set of rational numbers (other than zero and I intend to talk about that later). For another, there’s rotations of a ball. Each axis you could turn the ball around on, and angle you could rotate it, is an element of the set of three-dimensional rotations. Multiplication we interpret as doing those rotations one after the other. There’s the multiplication of square matrices, ones that have the same number of rows and columns.

If you’re reading a pop mathematics blog, you know of $\imath$ , the “imaginary unit”. You know it because $\imath^2 = -1$ . A bit more multiplying of these and you find a nice tight cycle. This forms a group, with four discernible elements: $1, \imath, -1, \mbox{ and } -\imath$ and regular multiplication. It’s a nice example of a “cyclic group”. We can represent the whole thing as multiplying a single element together: $\imath^0, \imath, \imath^2, \imath^3$ . We can think of $\imath^4$ but that’s got the same value as $\imath^0$ . Or $\imath^5$ , which has the same value as $\imath^1$ . With a little ingenuity we can even think of what we might mean by, say, $\imath^{-1}$ and realize it has to be the same quantity as $\imath^3$ . Or $\imath^{-2}$ which has to equal $\imath^2$ . You see the cycle.
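If you'd like to watch the cycle turn, here is a small sketch. The function name is mine. It builds each power by repeated multiplication, using $-\imath$ (the inverse of $\imath$ ) for negative exponents, rather than trusting a built-in power routine.

```python
def i_power(k):
    """Compute i**k by repeated multiplication.

    For negative k, multiply by -i, the inverse of i, |k| times.
    """
    step = 1j if k >= 0 else -1j
    result = 1 + 0j
    for _ in range(abs(k)):
        result *= step
    return result

for k in range(-2, 6):
    print(k, i_power(k))

# Exponents that differ by four land on the same element.
print(i_power(4) == i_power(0), i_power(5) == i_power(1))
print(i_power(-1) == i_power(3), i_power(-2) == i_power(2))
```

Python may print some of these with a signed zero, like `(-0-1j)`. That's a quirk of floating-point arithmetic; the comparisons still come out equal.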

A cyclic group doesn’t have to have four elements. It needs to be generated by doing the multiplication over and over on one element, that’s all. It can have a single element, or two, or two hundred. Or infinitely many elements. Suppose we have a set built on the powers of an element that we’ll call $e$ . This is a common name for “an element and we don’t care what it is”. It has nothing to do with the number called e, or any number. At least it doesn’t have to.

Please let me use the shorthand of $e^2$ to mean $e$ times $e$ , and $e^3$ to mean $e^2$ times $e$ , and so on. Then we have a set that looks like, in part, $\cdots, e^{-3}, e^{-2}, e^{-1}, e^0, e^1, e^2, e^3, \cdots$ . They multiply together the way we might multiply x raised to powers. $e^2 \times e^3$ is $e^5$ , and $e^4 \times e^{-4}$ is $e^0$ , and $e^{-3} \times e^2$ is $e^{-1}$ and so on.

Those exponents suggest something familiar. In this infinite cyclic group $e^j \times e^k$ is $e^{j + k}$ , where j and k are integers. Do we even need to write the e? Why not just write the j and k in a normal-size typeface? Is there a difference between cyclic-group multiplication and regular old addition of integers?

Not an important one. There’s differences in how we write the symbols, and what we think they mean. There’s not a difference in the way they interact. Regular old addition, in this light, we can see as a multiplication.
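One way to see that nothing is lost: write the group so that all it ever stores is the exponent. This is a sketch, with a class name of my own choosing.

```python
class Power:
    """The element e**k of an infinite cyclic group; only k is stored."""

    def __init__(self, exponent):
        self.exponent = exponent

    def __mul__(self, other):
        # e**j times e**k is e**(j + k): multiplying adds the exponents.
        return Power(self.exponent + other.exponent)

    def __eq__(self, other):
        return isinstance(other, Power) and self.exponent == other.exponent

    def __repr__(self):
        return f"e^{self.exponent}"

print(Power(2) * Power(3))   # -> e^5
print(Power(4) * Power(-4))  # -> e^0
print(Power(-3) * Power(2))  # -> e^-1
```

The only arithmetic anywhere in the class is the integer addition inside `__mul__`. The "multiplication" is a label we put on it.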

Calling addition “multiplication” can be confusing. So we deal with that a few ways. One is to say that rather than multiplication what a group has is a group operation. This lets us avoid fooling people into thinking we mean to take this times that. It lacks a good shorthand word, the way we might say “a times b” or “a plus b”. But we can call it “the group operation”, and say “times” or “plus” as fits our sentence and our sentiment.

I’ve left unanswered that mention of multiplication on the rational-numbers-except-zero making a group. If you include zero in the set, though, you don’t have multiplication as a group operation. There’s no inverse to zero. There seems to be an oversight in multiplication not being a multiplication. I hope to address that in the next A-to-Z essay, on Addition.

This, and my other essays for the Little 2021 Mathematics A-to-Z, should be at this link. And all my A-to-Z essays from every year should be at this link. Thanks for reading.

I’m already looking for topics for the Little 2021 Mathematics A-to-Z

I hope to begin publishing this year’s Little Mathematics A-to-Z next week, with a rousing start in the letter “M”. I’m also hoping to work several weeks ahead of deadline for a change. To that end, I already need more letters! While I have a couple topics picked out for M-A-T-H, I’ll need topics for the next quartet. If you have a mathematics (or mathematics-adjacent) term starting with E, M, A, or T that I might write a roughly thousand-word essay about? Please, leave a comment and I’ll think about it.

If you do, please leave a mention of any project (mathematics or otherwise) you’d like people to know more about. And several folks were kind enough to make suggestions for M-A-T-H, several weeks ago. I’m still keeping those as possibilities for M, A, and T’s later appearances.

I’m open to re-examining a topic I’ve written about in the past, if I think I have something fresh to say about it. Past A-to-Z’s have been about these subjects:

E.

Error (2015)
Energy (Leap Day 2016)
Ergodic (End 2016)
Elliptic Curves (2017)
e (2018)
Encryption Schemes (2019)
Exponential (2020)

M.

Measure (2015)
Matrix (Leap Day 2016)
Monster Group (End 2016)
Morse Theory (2017)
Manifold (2018)
Martingales (2019)
Möbius Strip (2020)

A.

Ansatz (2015)
Axiom (Leap Day 2016)
Algebra (End 2016)
Arithmetic (2017)
Asymptote (2018)
Abacus (2019)
Michael Atiyah (2020)

T.

Tensor (2015)
Transcendental Number (Leap Day 2016)
Tree (End 2016)
Topology (2017)
Tiling (2018)
Taylor Series (2019)
Tiling (2020)

The Little 2021 Mathematics A-to-Z should appear here, when I do start publishing. This and all past A-to-Z essays should be at this link. Thank you for reading.

I’m looking for topics for the Little 2021 Mathematics A-to-Z

I’d like to say I’m ready to start this year’s Mathematics A-to-Z. I’m not sure I am. But if I wait until I’m sure, I’ve learned, I wait too long. As mentioned, this year I’m doing an abbreviated version of my glossary project. Rather than every letter in the alphabet, I intend to write one essay each for the letters in “Mathematics A-to-Z”. The dashes won’t be included.

While I have some thoughts in mind for topics, I’d love to know what my kind readers would like to see me discuss. I’m hoping to write about one essay, of around a thousand words, per week. One for each letter. The topic should be anything mathematics-related, although I tend to take a broad view of mathematics-related. (I’m also open to biographical sketches.) To suggest something, please, say so in a comment. If you do, please also let me know about any projects you have — blogs, YouTube channels, real-world projects — that I should mention at the top of that essay.

To keep things manageable, I’m looking for the first couple letters — MATH — first. But if you have thoughts for later in the alphabet please share them. I can keep track of that. I am happy to revisit a subject I think I have more to write about, too. Past essays for these letters that I’ve written include:

M.

Measure (2015)
Matrix (Leap Day 2016)
Monster Group (End 2016)
Morse Theory (2017)
Manifold (2018)
Martingales (2019)
Möbius Strip (2020)

A.

Ansatz (2015)
Axiom (Leap Day 2016)
Algebra (End 2016)
Arithmetic (2017)
Asymptote (2018)
Abacus (2019)
Michael Atiyah (2020)

T.

Tensor (2015)
Transcendental Number (Leap Day 2016)
Tree (End 2016)
Topology (2017)
Tiling (2018)
Taylor Series (2019)
Tiling (2020)

H.

Hypersphere (2015)
Homomorphism (Leap Day 2016)
Hat (End 2016)
Height Function (elliptic curves) (2017)
Hyperbolic Half-Plane (2018)
Hamiltonian (2019)
Hilbert’s Problems (2020)

The reason I wrote a second Tiling essay is because I forgot I’d already written one in 2018. I hope not to make that same mistake again. But I am open to repeating a topic, or a variation of a topic, on purpose.

Announcing my 2021 Mathematics A-to-Z

I enjoy the tradition of writing an A-to-Z, a string of essays about topics from across the alphabet and mostly chosen by readers and commenters. I’ve done at least one each year since 2015 and it’s a thrilling, exhausting performance. I didn’t want to miss this year, either.

But note the “exhausting” there. It’s been a heck of a year and while I’ve been more fortunate than many, I also know my limits. I don’t believe I have the energy to do the whole alphabet. I tell myself these essays don’t have to be big productions, and then they turn into 2,500 words a week for 26 weeks. It’s nice work but it’s also a (slender) pop mathematics book a year, on top of everything else I write in the corners around my actual work.

So how to do less, and without losing the Mathematics A-to-Z theme? And Iva Sallay, creator of Find the Factors and always a kind and generous reader, had the solution. This year I’ll plan on a subset of the alphabet, corresponding to a simple phrase. That phrase? I’m embarrassed to say how long it took me to think of, but it must be the right one.

I plan to do, in this order, the letters of “MATHEMATICS A-TO-Z”.

That is still a 15-week course of essays, but I did want something that would still be a worthwhile project. I intend to keep the essays shorter this year, aiming at a 1,000-word cap, so look forward to me breaking 4,000 words explaining “saddle points”. This also implies that I’ll be doubling and even tripling letters, for the first time in one of these sequences. There’s to be three A’s, three T’s, and two M’s. Also one each of C, E, H, I, O, S, and Z. I figure I have one Z essay left before I exhaust the letter. I may deal with that problem in 2022.

I plan to set my call for topics soon. I’d like to get the sequence started publishing in July, so I have to do that soon. But to give some idea of the range of things I’ve discussed before, here’s the roster of past, full-alphabet, A-to-Z topics:

I, too, am fascinated by the small changes in how I titled these posts and even chose whether to capitalize subject names in the roster. By “am fascinated by the small changes” I mean “am annoyed beyond reason by the inconsistencies”. I hope you too have an appropriate reaction to them.

What I Wrote About In My All 2020 Mathematics A to Z

I am happy, as ever, to complete an A-to-Z. Also to take some time to recover after the project. I had thought that spreading things out to 26 weeks would make them less stressful, and instead, I just wrote even longer pieces, in compensation. I’ll try to have other good observations in an essay next week.

For now, though, a piece that I will find useful for years to come: a roster of what essays I wrote this year. In future years, I may even check them before writing a third piece about tiling.

Color cartoon illustration of a coati in a beret and neckerchief, holding up a director's megaphone and looking over the Hollywood hills. The megaphone has the symbols + x (division obelus) and = on it. The Hollywood sign is, instead, the letters MATHEMATICS. In the background are spotlights, with several of them crossing so as to make the letters A and Z; one leg of the spotlights has 'TO' in it, so the art reads out, subtly, 'Mathematics A to Z'. — Art by Thomas K Dye, creator of the web comics **Projection Edge**, **Newshounds**, **Infinity Refugees**, and **Something Happens**. He’s on Twitter as @projectionedge. You can get to read **Projection Edge** six months early by subscribing to his Patreon.

Gathered at this link are all the 2020 A-to-Z essays. And gathered at this link are all A-to-Z essays, for this and every year. Including, I hope, the 2021 essays when I start those.

What I Wrote About In My 2019 Mathematics A To Z

And I have made it to the end! As is traditional, I mean to write a few words about what I learned in doing all of this. Also as is traditional, I need to collapse after the work of thirteen weeks of two essays per week describing a small glossary of terms mostly suggested by kind readers. So while I wait to do that, let me gather in one bundle a list of all the essays from this project. If this seems to you like a lazy use of old content to fill a publication hole let me assure you: this will make my life so much easier next time I do an A-to-Z. I’ve learned that, at least, over the years.

Cartoony banner illustration of a coati, a raccoon-like animal, flying a kite in the clear autumn sky. A skywriting plane has written 'MATHEMATIC A TO Z'; the kite, with the letter 'S' on it to make the word 'MATHEMATICS'. — Art by Thomas K Dye, creator of the web comics **Projection Edge**, **Newshounds**, **Infinity Refugees**, and **Something Happens**. He’s on Twitter as @projectionedge. You can get to read **Projection Edge** six months early by subscribing to his Patreon.

What I Wrote About in My 2018 Mathematics A To Z

I have reached the end! Thirteen weeks at two essays per week to describe a neat sampling of mathematics. I hope to write a few words about what I learned by doing all this. In the meanwhile, though, I want to gather together the list of all the essays I did put into this project.

I’m Looking For The Last Topics For My Fall 2018 Mathematics A-To-Z

And now it’s my last request for my Fall 2018 mathematics A-To-Z. There’s only a half-dozen letters left, but not to fear: they include letters with no end of potential topics, like, ‘X’.

If you have any mathematical topics with a name that starts U through Z that you’d like to see me write about, please say so. I’m happy to write what I fully mean to be a tight 500 words about the subject and then find I’ve put up my second 1800-word essay of the week. I usually go by a first-come, first-serve basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me. This might be the only way to get a good ‘X’ letter.

Also when you do make a request, please feel free to mention your blog, Twitter feed, YouTube channel, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.

So! I’m open for nominations. Here are the words I’ve used in past A to Z sequences. I probably don’t want to revisit them. But I will think over, if I get a request, whether I might have new opinions.

Excerpted From The Summer 2015 A To Z

Excerpted From The Leap Day 2016 A To Z

Excerpted From The End 2016 A To Z

Excerpted From The Summer 2017 A To Z

And there we go! … To avoid confusion I’ll mark off here when I have taken a letter.

Available Letters for the Fall 2018 A To Z:

All of my Fall 2018 Mathematics A-To-Z should appear at this link. And it’ll have some extra stuff like these topic-request pages and such.

My 2018 Mathematics A To Z: Quadratic Equation

I have another topic today suggested by Dina Yagodich. I’ve mentioned before her YouTube channel. It’s got a variety of educational videos you might enjoy. Give it a try.

I’m planning this week to open up the end of the alphabet — and the year — to topic suggestions. So there’s no need to panic about that.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background. — Art by Thomas K Dye, creator of the web comics **Newshounds**, **Something Happens**, and **Infinity Refugees**. His current project is **Projection Edge**. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Quadratic Equation.

The Quadratic Equation is the tool humanity used to discover mathematics. Yes, I exaggerate a bit. But it touches a stunning array of important things. It is most noteworthy because of the time I impressed my several-levels-removed boss at the summer job I had while an undergraduate. He had been stumped by a data-optimization problem for weeks. I noticed it was just a quadratic equation, that’s easy to solve. He was, must be said, overly impressed. I would go on to grad school where I was once stymied for a week because I couldn’t find the derivative of $e^t$ correctly. It is, correctly, $e^t$ . So I have sympathy for my remote supervisor.

We normally write the Quadratic Equation in one of two forms:

$ax^2 + bx + c = 0$

$a_0 + a_1 x + a_2 x^2 = 0$

The first form is great when you are first learning about polynomials, and parabolas. And you’re content to stop at something raised to the second power. The second form is great when you are learning advanced stuff about polynomials. Then you start wanting to know things true about polynomials that go up to arbitrarily high powers. And we always want to know about polynomials. The subscripts under $a_j$ mean we can’t run out of letters to be coefficients. Setting the subscripts and powers to keep increasing lets us write this out neatly.

We don’t have to use x. We never do. But we mostly use x. Maybe t, if we’re writing an equation that describes something changing with time. Maybe z, if we want to emphasize how complex-valued numbers might enter into things. The name of the independent variable doesn’t matter. But stick to the obvious choices. If you’re going to make the variable ‘f’ you better have a good reason.

The equation is very old. We have ancient Babylonian clay tablets which describe it. Well, not the quadratic equation as we write it. The oldest problems put it as finding numbers that simultaneously solve two equations, one of them a sum and one of them a product. Changing one equation into two is a venerable mathematical process. It often makes problems simpler. We do this all the time in Ordinary Differential Equations. I doubt there is a direct connection between Ordinary Differential Equations and this alternate form of the Quadratic Equation. But it is a reminder that the ways we express mathematical problems are our conventions. We can rewrite problems to make our lives easier, to make answers clearer. We should look for chances to do that.

It weaves into everything. Some things seem obvious. Suppose the coefficients — a, b, and c; or $a_0, a_1, a_2$ if you’d rather — are all real-valued numbers. Then the quadratic equation has to have two solutions. There can be two real-valued solutions. There can be one real-valued solution, counted twice for reasons that make sense but are too much a digression for me to justify here. There can be two complex-valued solutions. We can infer the usefulness of imaginary and complex-valued numbers by finding solutions to the quadratic equation.
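The three cases are easy to tell apart by machine. This sketch uses the familiar quadratic formula and the sign of the discriminant $b^2 - 4ac$ ; the function names and the three sample equations are my own, picked to hit one case each.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return the two solutions of a*x**2 + b*x + c = 0 for real a != 0."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    discriminant = b * b - 4 * a * c
    root = cmath.sqrt(discriminant)  # complex square root copes with negatives
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

def describe(a, b, c):
    """Say which of the three cases the real coefficients fall into."""
    discriminant = b * b - 4 * a * c
    if discriminant > 0:
        return "two real solutions"
    if discriminant == 0:
        return "one real solution, counted twice"
    return "two complex solutions"

print(describe(1, -5, 6), quadratic_roots(1, -5, 6))  # roots 3 and 2
print(describe(1, -2, 1), quadratic_roots(1, -2, 1))  # root 1, twice
print(describe(1, 0, 1), quadratic_roots(1, 0, 1))    # roots i and -i
```

With floating-point coefficients the test for a discriminant of exactly zero is fragile, so treat the middle case as a sketch rather than something to rely on.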

(The quadratic equation is a great introduction to complex-valued numbers. It’s not how mathematicians came to them. Complex-valued numbers looked like obvious nonsense. They corresponded to there being no real-valued answers. A formula that gives obvious nonsense when there’s no answer is great. It’s formulas that give subtle nonsense when there’s no answer that are dangerous. But similar-in-design formulas for cubic and quartic polynomials could use complex-valued numbers in intermediate steps. Plunging ahead as though these complex-valued numbers were proper would get to the real-valued answers. This made the argument that complex-valued numbers should be taken seriously.)

We learn useful things right away from trying to solve it. We teach students to “complete the square” as a first approach to solving it. Completing the square is not that useful by itself: a few pages later in the textbook we get to the quadratic formula and that has every quadratic equation solved. Just plug numbers into the formula. But completing the square teaches something more useful than just how to solve an equation. It’s a method in which we solve a problem by saying, you know, this would be easy to solve if only it were different. And then thinking how to change it into a different-looking problem with the same solutions. This is brilliant work. A mathematician is imagined to have all sorts of brilliant ideas on how to solve problems. Closer to the truth is that she’s learned all sorts of brilliant ways to make a problem more like one she already knows how to solve. (This is the nugget of truth which makes one genre of mathematical jokes. These jokes have the punch line, “the mathematician declares, ‘this is a problem already solved’ and goes back to sleep.”)
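For anyone who wants the steps spelled out, here is the standard derivation, assuming $a$ is not zero. Start from $ax^2 + bx + c = 0$ , divide through by $a$ , and move the constant over:

$x^2 + \frac{b}{a} x = -\frac{c}{a}$

The left side would be a perfect square if only it had a $\left(\frac{b}{2a}\right)^2$ term. So add that to both sides:

$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2 - 4ac}{4a^2}$

Take square roots and tidy up, and the quadratic formula falls out:

$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$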

Stare at the solutions of the quadratic equation. You will find patterns. Suppose the coefficients are all rational numbers. Then there are some numbers that can be solutions: 0, 1, square root of 15, -3.5, these can all turn up. There are some numbers that can’t be. π. e. The tangent of 2. It’s not just a division between rational and irrational numbers. There are different kinds of irrational numbers. This — alongside looking at other polynomials — leads us to transcendental numbers.

Keep staring at the two solutions of the quadratic equation. You’ll notice the sum of the solutions is $-\frac{b}{a}$ . You’ll notice the product of the two solutions is $\frac{c}{a}$ . You’ll glance back at those ancient Babylonian tablets. This seems interesting, but little more than that. It’s a lead, though. Similar formulas exist for the sum of the solutions for a cubic, for a quartic, for other polynomials. Also for the sum of products of pairs of these solutions. Or the sum of products of triplets of these solutions. Or the product of all these solutions. These are known as Vieta’s Formulas, after the 16th-century mathematician François Viète. (This by way of his Latinized, academic-persona name, Franciscus Vieta.) This gives us a way to rewrite the original polynomial as a set of polynomials in several variables. What’s interesting is the set of polynomials has symmetries. They all look like, oh, “xy + yz + zx”. No one variable gets used in a way distinguishable from the others.
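
If you’d like to see the sum and product rules turn up, here is a small Python sketch. The function name `quadratic_roots` and the sample coefficients are my own choices for illustration, nothing standard. It uses `cmath` so that complex-valued solutions work too.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return both solutions of a*x^2 + b*x + c = 0, allowing complex values."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    root_disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a)


# Arbitrary sample coefficients, chosen only for illustration.
a, b, c = 2.0, -3.0, -5.0
r1, r2 = quadratic_roots(a, b, c)

# Print the sum of the solutions next to -b/a,
# and the product of the solutions next to c/a.
print(r1 + r2, -b / a)
print(r1 * r2, c / a)
```

Try other coefficients, including ones with no real-valued solutions. The two pairs of numbers should agree, up to ordinary rounding.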

This leads us to group theory. The coefficients start out in a ring. The quotients from these Vieta’s Formulas give us an “extension” of the ring. An extension is roughly what the common use of the word suggests. It takes the ring and builds from it a bigger thing that satisfies some nice interesting rules. And it leads us to surprises. The ancient Greeks had several challenges to be done with only straightedge and compass. One was to make a cube double the volume of a given cube. It’s impossible to do, with these tools. (Even ignoring the question of what we would draw on.) Another was to trisect any arbitrary angle; it turns out, there are angles for which it’s just impossible. The group theory derived, in part, from this tells us why. One more impossibility: drawing a square that has exactly the same area as a given circle.

But there are possible things still. Step back from the quadratic equation, that $ax^2 + bx + c = 0$ bit. Make a function, instead, something that matches numbers (real, complex, what have you) to numbers (the same). Its rule: any x in the domain matches to the number $f(x) = ax^2 + bx + c$ in the range. We can make a picture that represents this. Set Cartesian coordinates — the x and y coordinates that people think of as the default — on a surface. Then highlight all the points with coordinates (x, y) which make true the equation $y = f(x)$ . This traces out a particular shape, the parabola.

Draw a line that crosses this parabola twice. There’s now one fully-enclosed piece of the surface. How much area is enclosed there? It’s possible to find a triangle with area three-quarters that of the enclosed part. It’s easy to use straightedge and compass to draw a square the same area as a given triangle. Showing the enclosed area is four-thirds the triangle’s area? That can … kind of … be done by straightedge and compass. It takes infinitely many steps to do this. But if you’re willing to allow a process to go on forever? And you show that the process would reach some fixed, knowable answer? This could be done by the ancient Greeks; indeed, it was. Archimedes used this as an example of the method of exhaustion. It’s one of the ideas that reaches toward integral calculus.
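
The infinitely-many-steps process can be imitated numerically. This is a rough Python sketch, assuming the parabola $y = x^2$ and a chord between x = -1 and x = 2, both picked only for illustration. It fits a triangle under the chord with its third corner on the parabola at the midpoint, then fits smaller triangles into the two gaps left over, and so on. The names `triangle_area` and `exhaustion_partial_sum` are my own.

```python
def f(x):
    """The sample parabola y = x^2 (an assumption for this sketch)."""
    return x * x


def triangle_area(p1, p2, p3):
    """Area of a triangle from its three corners (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2.0


def exhaustion_partial_sum(p, q, levels):
    """Add up inscribed triangles, level by level, between x = p and x = q."""
    total = 0.0
    intervals = [(p, q)]
    for _ in range(levels):
        next_intervals = []
        for lo, hi in intervals:
            mid = (lo + hi) / 2.0
            total += triangle_area((lo, f(lo)), (mid, f(mid)), (hi, f(hi)))
            # Each triangle leaves two thinner gaps to fill at the next level.
            next_intervals.extend([(lo, mid), (mid, hi)])
        intervals = next_intervals
    return total


first_triangle = exhaustion_partial_sum(-1.0, 2.0, 1)
many_levels = exhaustion_partial_sum(-1.0, 2.0, 12)
print(first_triangle, many_levels, many_levels / first_triangle)
```

Each level adds one-quarter the area of the level before. So the running total creeps up on four-thirds of the first triangle without ever passing it.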

This has been a lot of exact, “analytic” results. There are neat numerical results too. Vieta’s formulas, for example, give us good ways to find approximate solutions of the quadratic equation. They work well if one solution is much bigger than the other. Numerical methods for finding solutions tend to work better if you can start from a decent estimate of the answer. And you can learn of numerical stability, and the need for it, studying these.
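
One common way to use this, which I am assuming is the sort of thing meant here, is to compute the bigger solution with the quadratic formula and then get the smaller one from the product rule, $x_1 x_2 = \frac{c}{a}$ . That avoids subtracting two nearly equal numbers. This Python sketch handles only real-valued solutions. The names `naive_real_roots` and `stable_real_roots` are hypothetical, and the sample coefficients are chosen to make the trouble visible.

```python
import math


def naive_real_roots(a, b, c):
    """Quadratic formula as written; returns (plus-root, minus-root)."""
    s = math.sqrt(b * b - 4 * a * c)
    return (-b + s) / (2 * a), (-b - s) / (2 * a)


def stable_real_roots(a, b, c):
    """Get one root without cancellation, the other from the product c/a."""
    if a == 0:
        raise ValueError("not a quadratic: a must be nonzero")
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("no real-valued solutions")
    # Give the square root the same sign as b, so the two terms add in size.
    q = -0.5 * (b + math.copysign(math.sqrt(disc), b))
    if q == 0.0:
        return 0.0, 0.0  # only happens when b = 0 and c = 0
    # q / a is the larger-magnitude root; c / q follows from x1 * x2 = c / a.
    return q / a, c / q


# x^2 + 1e8 x + 1 = 0 has solutions very near -1e8 and -1e-8.
naive_small, naive_big = naive_real_roots(1.0, 1e8, 1.0)
stable_big, stable_small = stable_real_roots(1.0, 1e8, 1.0)
print("naive small root: ", naive_small)
print("stable small root:", stable_small)
```

The naive small root comes out noticeably wrong, while the one built from the product rule is good to about as many digits as the machine carries.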

Numerical calculations have a problem. We have a set number of decimal places with which to work. What happens if we need a calculation that takes more decimal places than we’re given to do perfectly? Here’s a toy version: two-thirds is the number 0.6666. Or 0.6667. Already we’re in trouble. What is three times two-thirds? We’re going to get either 1.9998 or 2.0001 and either way something’s wrong. The wrongness looks small. But any formula you want to use has some numbers that will turn these small errors into big ones. So numerical stability is, in fairness, not something unique to the quadratic equation. It is something you learn if you study the numerics of the equation deeply enough.
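
The toy version is easy to reproduce with Python’s `decimal` module. This is only a sketch of the four-decimal-places example above, not of how real floating point hardware stores numbers. The constant name `FOUR_PLACES` is my own.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

FOUR_PLACES = Decimal("0.0001")  # keep four digits past the decimal point

two_thirds = Decimal(2) / Decimal(3)

# Chop off the extra digits, or round them: the two choices in the text.
truncated = two_thirds.quantize(FOUR_PLACES, rounding=ROUND_DOWN)
rounded = two_thirds.quantize(FOUR_PLACES, rounding=ROUND_HALF_UP)

print(truncated, truncated * 3)  # 0.6666 1.9998
print(rounded, rounded * 3)      # 0.6667 2.0001
```

Neither choice gives back 2.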

I’m also delighted to learn, through Wikipedia, that there’s a prosthaphaeretic method for solving the quadratic equation. Prosthaphaeretic methods use trigonometric functions and identities to rewrite problems. You might call it madness to rely on arctangents and half-angle formulas and such instead of, oh, doing a division or taking a square root. This is because you have calculators. But if you don’t? If you have to do all that work by hand? That’s terrible. But if someone has already prepared a table listing the sines and cosines and tangents of a great variety of angles? They did a great many calculations already. You just need to pick out the one that tells you what you hope to know. I’ll spare you the steps of solving the quadratic equation using trig tables. Wikipedia describes it fine enough.

So you see how much mathematics this connects to. It’s a bit of question-begging to call it that important. As I said, we’ve known the quadratic equation for a long time. We’ve thought about it for a long while. It would be surprising if we didn’t find many and deep links to other things. Even if it didn’t have links, we would try to understand new mathematical tools in terms of how they affect familiar old problems like this. But these are some of the things which we’ve found, and which run through much of what we understand mathematics to be.

The letter ‘R’ for this Fall 2018 Mathematics A-To-Z post should be published Friday. It’ll be available at this link, as are the rest of these glossary posts.

I’m Looking For The Next Set Of Topics For My Fall 2018 Mathematics A-To-Z

We’re at the end of another month. So it’s a good chance to set out requests for the next several weeks’ worth of my mathematics A-To-Z. As I say, I’ve been doing this piecemeal so that I can keep track of requests better. I think it’s been working out, too.

If you have any mathematical topics with a name that starts N through T, let me know! I usually go by a first-come, first-served basis for each letter. But I will vary that if I realize one of the alternatives is more suggestive of a good essay topic. And I may use a synonym or an alternate phrasing if both topics for a particular letter interest me.

Also when you do make a request, please feel free to mention your blog, Twitter feed, Mathstodon account, or any other project of yours that readers might find interesting. I’m happy to throw in a mention as I get to the word of the day.