Reading the Comics, October 14, 2017: Physics Equations Edition


So that busy Saturday I promised for the mathematically-themed comic strips? Here it is, along with a Friday that reached the lowest non-zero levels of activity.

Stephan Pastis’s Pearls Before Swine for the 13th is one of those equations-of-everything jokes. Naturally it features a panel full of symbols that, to my eye, don’t parse. There are what look like syntax errors; the one that anyone could see is the { mark that isn’t balanced by a }. But when someone works rough they will, often, write stuff that doesn’t quite parse. Think of it as an artist’s rough sketch of a complicated scene: the lines and anatomy may be gibberish, but if the major lines of the composition are right then all is well.

Most attempts to write an equation for everything are really about writing a description of the fundamental forces of nature. We trust that it’s possible to go from a description of how gravity and electromagnetism and the nuclear forces go to, ultimately, a description of why chemistry should work and why ecologies should form and there should be societies. There are, as you might imagine, a number of assumed steps along the way. I would accept the idea that we’ll have a unification of the fundamental forces of physics this century. I’m not sure I would believe having all the steps between the fundamental forces and, say, how nerve cells develop worked out in that time.

Mark Anderson’s Andertoons makes its overdue appearance for the week on the 14th, with a chalkboard word-problem joke. Amusing enough. And estimating an answer, getting it wrong, and refining it is good mathematics. It’s not just numerical mathematics that will look for an approximate solution and then refine it. As a first approximation, 15 minus 7 isn’t far off 10. And for mental arithmetic approximating 15 minus 7 as 10 is quite justifiable. It could be made more precise if a more exact answer were needed.

Maria Scrivan’s Half Full for the 14th I’m going to call the anthropomorphic geometry joke for the week. If it’s not then it’s just wordplay and I’d have no business including it here.

Keith Tutt and Daniel Saunders’s Lard’s World Peace Tips for the 14th tosses in the formula describing how strong the force of gravity between two objects is. In Newtonian gravity, which is why it’s the Newton Police. It’s close enough for most purposes. I’m not sure how this supports the cause of world peace.
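For reference, the formula in question is presumably Newton’s law of universal gravitation. The symbols below are the conventional textbook ones and are an assumption on my part, not necessarily what the strip shows: F is the strength of the force, m_1 and m_2 are the two masses, r is the distance between them, and G is the gravitational constant.

```latex
F = G \frac{m_1 m_2}{r^2}
```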

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 14th names Riemann’s Quaternary Conjecture. I was taken in by the panel, trying to work out what the proposed conjecture could even mean. The reason it works is that Bernhard Riemann wrote like 150,000 major works in every field of mathematics, and about 149,000 of them are big, important foundational works. The most important Riemann conjecture would be the one about zeroes of the Riemann Zeta function. This is typically called the Riemann Hypothesis. But someone could probably write a book just listing the stuff named for Riemann, and that’s got to include a bunch of very specific conjectures.


The Summer 2017 Mathematics A To Z: Volume Forms


I’ve been reading Elke Stangl’s Elkemental Force blog for years now. Sometimes I even feel social-media-caught-up enough to comment, or at least to like posts. This is relevant today as I discuss one of Stangl’s suggestions for my letter-V topic.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Volume Forms.

So sometime in pre-algebra, or early in (high school) algebra, you start drawing equations. It’s a simple trick. Lay down a coordinate system, some set of axes for ‘x’ and ‘y’ and maybe ‘z’ or whatever letters are important. Look to the equation, made up of x’s and y’s and maybe z’s and so on. Highlight all the points with coordinates whose values make the equation true. This is the logical basis for saying (eg) that the straight line “is” y = 2x + 1 .

A short while later, you learn about polar coordinates. Instead of using ‘x’ and ‘y’, you have ‘r’ and ‘θ’. ‘r’ is the distance from the center of the universe. ‘θ’ is the angle made with respect to some reference axis. It’s as legitimate a way of describing points in space. Some classrooms even have a part of the blackboard (whiteboard, whatever) with a polar-coordinates “grid” on it. This looks like the lines of a dartboard. And you learn that some shapes are easy to describe in polar coordinates. A circle, centered on the origin, is ‘r = 2’ or something like that. A line through the origin is ‘θ = 1’ or whatever. The line that we’d called y = 2x + 1 before? … That’s … some mess. And now r = 2θ + 1 … that’s not even a line. That’s some kind of spiral. Two spirals, really. Kind of wild.
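Here is roughly what the mess for the straight line looks like. This is only a sketch, and it assumes the usual conversion between the two systems, x = r cos θ and y = r sin θ:

```latex
\begin{aligned}
y &= 2x + 1 \\
r \sin\theta &= 2 r \cos\theta + 1 \\
r &= \frac{1}{\sin\theta - 2\cos\theta}
\end{aligned}
```

The last line only makes sense for angles where sin θ - 2 cos θ is positive, which is part of why it feels like a mess.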

And something to bother you a while. y = 2x + 1 is an equation that looks the same as r = 2θ + 1 . You’ve changed the names of the variables, but not how they relate to each other. But one is a straight line and the other a spiral thing. How can that be?

The answer, ultimately, is that the letters in the equations aren’t these content-neutral labels. They carry meaning. ‘x’ and ‘y’ imply looking at space a particular way. ‘r’ and ‘θ’ imply looking at space a different way. A shape has different representations in different coordinate systems. Fair enough. That seems to settle the question.

But if you get to calculus the question comes back. You can integrate over a region of space that’s defined by Cartesian coordinates, x’s and y’s. Or you can integrate over a region that’s defined by polar coordinates, r’s and θ’s. The first time you try this, you find … well, that any region easy to describe in Cartesian coordinates is painful in polar coordinates. And vice-versa. Way too hard. But if you struggle through all that symbol manipulation, you get … different answers. Eventually the calculus teacher has mercy and explains. If you’re integrating in Cartesian coordinates you need to use “dx dy”. If you’re integrating in polar coordinates you need to use “r dr dθ”. If you’ve never taken calculus, never mind what this means. What is important is that “r dr dθ” looks like three things multiplied together, while “dx dy” is two.
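To see the different answers for yourself, here is a minimal Python sketch. It adds up the area of a disc in polar coordinates with a plain midpoint sum, once with the extra ‘r’ and once without. The function name and its parameters are made up for this illustration.

```python
import math


def disk_area_polar(radius, n=200, include_r=True):
    """Approximate the area of a disc by a midpoint sum over r and theta.

    With include_r=True each little cell is weighted by r dr dtheta.
    With include_r=False it is weighted by dr dtheta alone, which is the mistake.
    """
    dr = radius / n
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr  # midpoint of the i-th ring
        for _ in range(n):  # the integrand does not depend on theta
            weight = r if include_r else 1.0
            total += weight * dr * dtheta
    return total


right = disk_area_polar(1.0)                   # close to pi
wrong = disk_area_polar(1.0, include_r=False)  # close to 2 * pi
```

Leaving out the ‘r’ doesn’t give a slightly-off area. It gives the area of the rectangle that r and θ sweep out on the dartboard grid, which is a different shape altogether.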

We get this explained as a “change of variables”. If we want to go from one set of coordinates to a different one, we have to do something fiddly. The extra ‘r’ in “r dr dθ” is what we get going from Cartesian to polar coordinates. And we get formulas to describe what we should do if we need other kinds of coordinates. It’s some work that introduces us to the Jacobian, which looks like the most tedious possible calculation ever at that time. (In Intro to Differential Equations we learn we were wrong, and the Wronskian is the most tedious possible calculation ever. This is also wrong, but it might as well be true.) We typically move on after this and count ourselves lucky it got no worse than that.
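For the Cartesian-to-polar case the Jacobian calculation is short enough to write out. This sketch assumes the standard conversion x = r cos θ, y = r sin θ:

```latex
\frac{\partial(x, y)}{\partial(r, \theta)}
= \begin{vmatrix}
    \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial \theta} \\[1ex]
    \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial \theta}
  \end{vmatrix}
= \begin{vmatrix}
    \cos\theta & -r\sin\theta \\
    \sin\theta & r\cos\theta
  \end{vmatrix}
= r\cos^2\theta + r\sin^2\theta
= r
```

That determinant is the extra ‘r’, so that dx dy becomes r dr dθ.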

None of this is wrong, even from the perspective of more advanced mathematics. It’s not even misleading, which is a refreshing change. But we can look a little deeper, and get something good from doing so.

The deeper perspective looks at “differential forms”. These are about how to encode information about how your coordinate system represents space. They’re tensors. I don’t blame you for wondering if they would be. A differential form uses interactions between some of the directions in a space. A volume form is a differential form that uses all the directions in a space. And satisfies some other rules too. I’m skipping those because some of the symbols involved I don’t even know how to look up, much less make WordPress present.
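For the plane, at least, the symbols are manageable. Here is a sketch of the same Cartesian-to-polar bookkeeping written as a volume form. It assumes x = r cos θ and y = r sin θ, and it uses the wedge product ∧, which is antisymmetric: dr ∧ dr = 0, dθ ∧ dθ = 0, and dθ ∧ dr = -dr ∧ dθ.

```latex
\begin{aligned}
dx &= \cos\theta \, dr - r \sin\theta \, d\theta \\
dy &= \sin\theta \, dr + r \cos\theta \, d\theta \\
dx \wedge dy
  &= r\cos^2\theta \, (dr \wedge d\theta) - r\sin^2\theta \, (d\theta \wedge dr) \\
  &= r \, dr \wedge d\theta
\end{aligned}
```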

What’s important is the volume form carries information compactly. As symbols it tells us that this represents a chunk of space that’s constant no matter what the coordinates look like. This makes it possible to do analysis on how functions work. It also tells us what we would need to do to calculate specific kinds of problem. This makes it possible to describe, for example, how something moving in space would change.

The volume form, and the tools to do anything useful with it, demand a lot of supporting work. You can dodge having to explicitly work with tensors. But you’ll need a lot of tensor-related materials, like wedge products and exterior derivatives and stuff like that. If you’ve never taken freshman calculus don’t worry: the people who have taken freshman calculus never heard of those things either. So what makes this worthwhile?

Yes, person who called out “polynomials”. Good instinct. Polynomials are usually a reason for any mathematics thing. This is one of maybe four exceptions. I have to appeal to my other standard answer: “group theory”. These volume forms match up naturally with groups. There’s not only information about how coordinates describe a space to consider. There’s ways to set up coordinates that tell us things.

That isn’t all. These volume forms can give us new invariants. Invariants are what mathematicians say instead of “conservation laws”. They’re properties whose value for a given problem is constant. This can make it easier to work out how one variable depends on another, or to work out specific values of variables.

For example, classical physics problems like how a bunch of planets orbit a sun often have a “symplectic manifold” that matches the problem. This is a description of how the positions and momentums of all the things in the problem relate. The symplectic manifold has a volume form. That volume is going to be constant as time progresses. That is, there’s this way of representing the positions and speeds of all the planets that does not change, no matter what. It’s much like the conservation of energy or the conservation of angular momentum. And this has practical value. It’s the subject that brought my and Elke Stangl’s blogs into contact, years ago. It also has broader applicability.

There’s no way to provide an exact answer for the movement of, like, the sun and nine-ish planets and a couple major moons and all that. So there’s no known way to answer the question of whether the Earth’s orbit is stable. All the planets are always tugging one another, changing their orbits a little. Could this converge in a weird way suddenly, on geologic timescales? Might the planet go flying off out of the solar system? It doesn’t seem like the solar system could be all that unstable, or it would have already. But we can’t rule out that some freaky alignment of Jupiter, Saturn, and Halley’s Comet might tweak the Earth’s orbit just far enough for catastrophe to unfold. Granted there’s nothing we could do about the Earth flying out of the solar system, but it would be nice to know if we face it, we tell ourselves.

But we can answer this numerically. We can set a computer to simulate the movement of the solar system. But there will always be numerical errors. For example, we can’t use the exact value of π in a numerical computation. 3.141592 (and more digits) might be good enough for projecting stuff out a day, a week, a thousand years. But if we’re looking at millions of years? The difference can add up. We can imagine compensating for not having the value of π exactly right. But what about compensating for something we don’t know precisely, like, where Jupiter will be in 16 million years and two months?

Symplectic forms can help us. The volume form represented by this space has to be conserved. So we can rewrite our simulation so that these forms are conserved, by design. This does not mean we avoid making errors. But it means we avoid making certain kinds of errors. We’re more likely to make what we call “phase” errors. We predict Jupiter’s location in 16 million years and two months. Our simulation puts it thirty degrees farther in its circular orbit than it actually would be. This is a less serious mistake to make than putting Jupiter, say, eight-tenths as far from the Sun as it would really be.
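A toy version of this fits in a few lines of Python. This is only a sketch, and not what real solar-system simulations use. It assumes a single unit mass on a unit spring, with position q and momentum p, instead of planets. It compares the plain explicit Euler step with the “symplectic Euler” step, which differs only in using the freshly updated momentum to move the position. The function names are made up for this illustration.

```python
def explicit_euler_step(q, p, h):
    """Ordinary Euler: both updates use the old values of q and p."""
    return q + h * p, p - h * q


def symplectic_euler_step(q, p, h):
    """Symplectic Euler: update p first, then move q with the new p."""
    p_new = p - h * q
    q_new = q + h * p_new
    return q_new, p_new


def energy(q, p):
    """Total energy of the unit-mass, unit-spring oscillator."""
    return 0.5 * (q * q + p * p)


def final_energy(step, q=1.0, p=0.0, h=0.01, n_steps=10_000):
    """Run n_steps of the given stepping rule and report the energy at the end."""
    for _ in range(n_steps):
        q, p = step(q, p, h)
    return energy(q, p)


drifting = final_energy(explicit_euler_step)    # grows well past the true 0.5
bounded = final_energy(symplectic_euler_step)   # stays close to 0.5
```

The explicit Euler step stretches every patch of the (q, p) plane by a factor of 1 + h² each time, so the energy creeps upward forever. The symplectic step keeps areas in that plane exactly the same, and the energy only wobbles around its true value. The oscillator still ends up a bit ahead of or behind where it should be in its cycle, which is the phase error.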

Volume forms seem, at first, a lot of mechanism for a small problem. And, unfortunately for students, they are. They’re more trouble than they’re worth for changing Cartesian to polar coordinates, or similar problems. You know, ones that the student already has some feel for. They pay off on more abstract problems. Tracking the movement of a dozen interacting things, say, or describing a space that’s very strangely shaped. Those make the effort to learn about forms worthwhile.

The Summer 2017 Mathematics A To Z: Topology


Today’s glossary entry comes from Elke Stangl, author of the Elkemental Force blog. I’ll do my best, although it would have made my essay a bit easier if I’d had the chance to do another topic first. We’ll get there.

Topology.

Start with a universe. Nice thing to have around. Call it ‘M’. I’ll get to why that name.

I’ve talked a fair bit about weird mathematical objects that need some bundle of traits to be interesting. So this will change the pace some. Here, I request only that the universe have a concept of “sets”. OK, that carries a little baggage along with it. We have to have intersections and unions. Those come about from having pairs of sets. The intersection of two sets is all the things that are in both sets simultaneously. The union of two sets is all the things that are in one set, or the other, or both simultaneously. But it’s hard to think of something that could have sets that couldn’t have intersections and unions.

So from your universe ‘M’ create a new collection of things. Call it ‘T’. I’ll get to why that name. But if you’ve formed a guess about why, then you know. So I suppose I don’t need to say why, now. ‘T’ is a collection of subsets of ‘M’. Now let’s suppose these four things are true.

First. ‘M’ is one of the sets in ‘T’.

Second. The empty set ∅ (which has nothing at all in it) is one of the sets in ‘T’.

Third. Whenever two sets are in ‘T’, their intersection is also in ‘T’.

Fourth. Whenever two (or more) sets are in ‘T’, their union is also in ‘T’.

Got all that? I imagine a lot of shrugging and head-nodding out there. So let’s take that. Your universe ‘M’ and your collection of sets ‘T’ are a topology. And that’s that.

Yeah, that’s never that. Let me put in some more text. Suppose we have a universe that consists of two symbols, say, ‘a’ and ‘b’. There’s four distinct topologies you can make of that. Take the universe plus the collection of sets ∅, {a}, {b}, and {a, b}. That’s a topology. Try it out. That’s the first collection you would probably think of.

Here’s another collection. Take this two-thing universe and the collection of sets ∅, {a}, and {a, b}. That’s another topology and you might want to double-check that. Or there’s this one: the universe and the collection of sets ∅, {b}, and {a, b}. Last one: the universe and the collection of sets ∅ and {a, b} and nothing else. That one barely looks legitimate, but it is. Not a topology: the universe and the collection of sets ∅, {a}, and {b}.

The number of topologies grows surprisingly with the number of things in the universe. Like, if we had three symbols, ‘a’, ‘b’, and ‘c’, there would be 29 possible topologies. The universe of the three symbols and the collection of sets ∅, {a}, {b, c}, and {a, b, c}, for example, would be a topology. But the universe and the collection of sets ∅, {a}, {b}, {c}, and {a, b, c} would not. It’s a good thing to ponder if you need something to occupy your mind while awake in bed.
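If you’d rather have a computer do the double-checking, here is a small Python sketch of the four rules. The function name is made up for this illustration. It leans on the universe being finite: with only finitely many sets around, checking unions two at a time is enough to cover unions of any number of them.

```python
from itertools import combinations


def is_topology(universe, collection):
    """Check the four rules for a collection of subsets of a finite universe."""
    M = frozenset(universe)
    T = {frozenset(s) for s in collection}
    # First and second rules: the universe and the empty set both belong to T.
    if M not in T or frozenset() not in T:
        return False
    # Third and fourth rules: intersections and unions of members stay in T.
    for A, B in combinations(T, 2):
        if (A & B) not in T or (A | B) not in T:
            return False
    return True


# The two-symbol examples from the text.
is_topology("ab", [set(), {"a"}, {"b"}, {"a", "b"}])  # True
is_topology("ab", [set(), {"a"}, {"a", "b"}])         # True
is_topology("ab", [set(), {"a", "b"}])                # True
is_topology("ab", [set(), {"a"}, {"b"}])              # False: the universe is missing

# The three-symbol examples.
is_topology("abc", [set(), {"a"}, {"b", "c"}, {"a", "b", "c"}])      # True
is_topology("abc", [set(), {"a"}, {"b"}, {"c"}, {"a", "b", "c"}])    # False: {a, b} is missing
```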

With four symbols, there’s 355 possibilities. Good luck working those all out before you fall asleep. Five symbols have 6,942 possibilities. You realize this doesn’t look like any expected sequence. After ‘4’ the count of topologies isn’t anything obvious like “two to the number of symbols” or “the number of symbols factorial” or something.
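These counts can be checked by brute force, at least for small universes. Here is a minimal Python sketch, with a made-up function name. It tries every collection of subsets that contains the empty set and the whole universe, and keeps the ones closed under pairwise intersection and union. That is enough for a finite universe. It is hopeless much past four symbols, since the number of collections to try doubles with every extra subset.

```python
from itertools import combinations


def count_topologies(n):
    """Count the topologies on a universe of n labelled points by brute force."""
    points = range(n)
    M = frozenset(points)
    # Every subset other than the empty set and the whole universe.
    middle = [frozenset(c) for k in range(1, n) for c in combinations(points, k)]
    count = 0
    for mask in range(2 ** len(middle)):
        # The empty set and the universe are always included.
        T = {frozenset(), M}
        T.update(s for i, s in enumerate(middle) if (mask >> i) & 1)
        if all((A & B) in T and (A | B) in T for A, B in combinations(T, 2)):
            count += 1
    return count


counts = [count_topologies(n) for n in (2, 3, 4)]
```

On my reading of the rules this should reproduce the 4, 29, and 355 quoted above.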

Are you getting ready to call me on being inconsistent? In the past I’ve talked about topology as studying what we can know about geometry without involving the idea of distance. How’s that got anything to do with this fiddling about with sets and intersections and stuff?

So now we come to that name ‘M’, and what it’s finally mnemonic for. I have to touch on something Elke Stangl hoped I’d write about, but for a letter someone else bid on first. That would be a manifold. I come from an applied-mathematics background so I’m not sure I ever got a proper introduction to manifolds. They appeared one day in the background of some talk about physics problems. I think they were introduced as “it’s a space that works like normal space”, and that was it. We were supposed to pretend we had always known about them. (I’m translating. What we were actually told would be that it “works like R³”. That’s how mathematicians say “like normal space”.) That was all we needed.

Properly, a manifold is … eh. It’s something that works kind of like normal space. That is, it’s a set, something that can be a universe. And it has to be something we can define “open sets” on. The open sets for the manifold follow the rules I gave for a topology above. You can make a collection of these open sets. And the empty set has to be in that collection. So does the whole universe. The intersection of two open sets in that collection is itself in that collection. The union of open sets in that collection is in that collection. If all that’s true, then we have a topological space. If, besides that, every small enough patch of it works like a patch of normal space, then we have a manifold.

And now the piece that makes every pop mathematics article about topology talk about doughnuts and coffee cups. It’s possible that two topologies might be homeomorphic to each other. “Homeomorphic” is a term of art. But you understand it if you remember that “morph” means shape, and suspect that “homeo” is probably close to “homogenous”. Two things being homeomorphic means you can match their parts up. In the matching there’s nothing left over in the first thing or the second. And the relations between the parts of the first thing are the same as the relations between the parts of the second thing.

So. Imagine the snippet of the number line for the numbers larger than -π and smaller than π. Think of all the open sets you can use to cover that. It will have a set like “the numbers bigger than 0 and less than 1”. A set like “the numbers bigger than -π and smaller than 2.1”. A set like “the numbers bigger than 0.01 and smaller than 0.011”. And so on.

Now imagine the points that exist on a circle, if you’ve omitted one point. Let’s say it’s the unit circle, centered on the origin, and that what we’re leaving out is the point that’s exactly to the left of the origin. The open sets for this are the arcs that cover some part of this punctured circle. There’s the arc that corresponds to the angles from 0 to 1 radian measure. There’s the arc that corresponds to the angles from -π to 2.1 radians. There’s the arc that corresponds to the angles from 0.01 to 0.011 radians. You see where this is going. You see why I say we can match those sets on the number line to the arcs of this punctured circle. There’s some details to fill in here. But you probably believe me this could be done if I had to.
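One way to fill in some of those details, as a sketch: send the number t on the line to the point on the circle at angle t, and come back with the two-argument arctangent. The function names here are made up for this illustration, and I’m taking it on trust that both directions are continuous, which is what the matching of open sets needs.

```python
import math


def to_circle(t):
    """Send a number strictly between -pi and pi to a point on the punctured unit circle."""
    if not -math.pi < t < math.pi:
        raise ValueError("t must be strictly between -pi and pi")
    return math.cos(t), math.sin(t)


def to_line(x, y):
    """Send a point on the punctured unit circle back to its angle on the number line."""
    return math.atan2(y, x)


# Round trip: the endpoints of the arc from 0 to 1 radian come back as 0 and 1.
start = to_line(*to_circle(0.0))
end = to_line(*to_circle(1.0))
```

The omitted point, exactly to the left of the origin, is the one place where the angle would have to jump between -π and π. Leaving it out is what lets the matching work.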

There’s two (or three) great branches of topology. One is called “algebraic topology”. It’s the one that makes for fun pop mathematics articles about imaginary rubber sheets. It’s called “algebraic” because this field makes it natural to study the holes in a sheet. And those holes tend to form groups and rings, basic pieces of Not That Algebra. The field (I’m told) can be interpreted as looking at functors on groups and rings. This makes for some neat tying-together of subjects this A To Z round.

The other branch is called “differential topology”, which is a great field to study because it sounds like what Mister Spock is thinking about. It inspires awestruck looks where saying you study, like, Bayesian probability gets blank stares. Differential topology is about differentiable functions on manifolds. This gets deep into mathematical physics.

As you study mathematical physics, you stop worrying about ever solving specific physics problems. Specific problems are petty stuff. What you like is solving whole classes of problems. A steady trick for this is to try to find some properties that are true about the problem regardless of what exactly it’s doing at the time. This amounts to finding a manifold that relates to the problem. Consider a central-force problem, for example, with planets orbiting a sun. A planet can’t move just anywhere. It can only be in places and moving in directions that give the system the same total energy that it had to start. And the same linear momentum. And the same angular momentum. We can match these constraints to manifolds. Whatever the planet does, it does it without ever leaving these manifolds. To know the shapes of these manifolds — how they are connected — and what kinds of functions are defined on them tells us something of how the planets move.

The maybe-third branch is “low-dimensional topology”. This is what differential topology is for two- or three- or four-dimensional spaces. You know, shapes we can imagine with ease in the real world. Maybe imagine with some effort, for four dimensions. This kind of branches out of differential topology because having so few dimensions to work in makes a lot of problems harder. We need specialized theoretical tools that only work for these cases. Is that enough to count as a separate branch? It depends what topologists you want to pick a fight with. (I don’t want a fight with any of them. I’m over here in numerical mathematics when I’m not merely blogging. I’m happy to provide space for anyone wishing to defend her branch of topology.)

But each grows out of this quite general, quite abstract idea, also known as “point-set topology”, that’s all about sets and collections of sets. There is much that we can learn from thinking about how to collect the things that are possible.

The Summer 2017 Mathematics A To Z: Ricci Tensor


Today’s is technically a request from Elke Stangl, author of the Elkemental Force blog. I think it’s also me setting out my own petard for self-hoisting, as my recollection is that I tossed off a mention of “defining the Ricci Tensor” as the sort of thing that’s got a deep beauty that’s hard to share with people. And that set off the search for where I had written about the Ricci Tensor. I hadn’t, and now look what trouble I’m in. Well, here goes.

Ricci Tensor.

Imagine if nothing existed.

You’re not doing that right, by the way. I expect what you’re thinking of is a universe that’s a big block of space that doesn’t happen to have any things clogging it up. Maybe you have a natural sense of volume in it, so that you know something is there. Maybe you even imagine something with grid lines or reticules or some reference points. What I imagine after a command like that is a sort of great rectangular expanse, dark and faintly purple-tinged, with small dots to mark its expanse. That’s fine. This is what I really want. But it’s not really imagining nothing existing. There’s space. There’s some sense of where things would be, if they happened to be in there. We’d have to get rid of the space to have “nothing” exist. And even then we have logical problems that sound like word games. (How can nothing have a property like “existing”? Or a property like “not existing”?) This is dangerous territory. Let’s not step there.

So take the empty space that’s what mathematics and physics people mean by “nothing”. What do we know about it? Unless we’re being difficult, it’s got some extent. There are points in it. There’s some idea of distance between these points. There’s probably more than one dimension of space. There’s probably some sense of time, too. At least we’re used to the expectation that things would change if we watched. It’s a tricky sense to have, though. It’s hard to say exactly what time is. We usually fall back on the idea that we know time has passed if we see something change. But if there isn’t anything to see change? How do we know there’s still time passing?

You maybe already answered. We know time is passing because we can see space changing. One of the legs of Modern Physics is geometry, how space is shaped and how its shape changes. This tells us how gravity works, and how electricity and magnetism propagate. If there were no matter, no energy, no things in the universe there would still be some kind of physics. And interesting physics, since the mathematics describing this stuff is even subtler and more challenging to the intuition than normal Euclidean space. If you’re going to read a pop mathematics blog like this, you’re very used to this idea.

Probably haven’t looked very hard at the idea, though. How do you tell whether space is changing if there’s nothing in it? It’s all right to imagine a coordinate system put on empty space. Coordinates are our concept. They don’t affect the space any more than the names we give the squirrels in the yard affect their behavior. But how to make the coordinates move with the space? It seems question-begging at least.

We have a mathematical gimmick to resolve this. Of course we do. We call it a name like a “test mass” or a “test charge” or maybe just “test particle”. Imagine that we drop into space a thing. But it’s only barely a thing. It’s tiny in extent. It’s tiny in mass. It’s tiny in charge. It’s tiny in energy. It’s so slight in every possible trait that it can’t sully our nothingness. All it does is let us detect it. It’s a good question how. We have good eyes. But now, we could see the particle moving as the space it’s in moves.

But again we can ask how. Just one point doesn’t seem to tell us much. We need a bunch of test particles, a whole cloud of them. They don’t interact. They don’t carry energy or mass or anything. They just carry the sense of place. This is how we would perceive space changing in time. We can ask questions meaningfully.

Here’s an obvious question: how much volume does our cloud take up? If we’re going to be difficult about this, none at all, since it’s a finite number of particles that all have no extent. But you know what we mean. Draw a ball, or at least an ellipsoid, around the test particles. How big is that? Wait a while. Draw another ball around the now-moved test particles. How big is that now?

Here’s another question: has the cloud rotated any? The test particles, by definition, don’t have mass or anything. So they don’t have angular momentum. They aren’t pulling one another to the side any. If they rotate it’s because space has rotated, and that’s interesting to consider. And another question: might they swap positions? Could a pair of particles that go left-to-right swap so they go right-to-left? That I ask admits that I want to allow the possibility.

These are questions about coordinates. They’re about how one direction shifts to other directions. How it stretches or shrinks. That is to say, these are questions of tensors. Tensors are tools for many things, most of them about how things transmit through different directions. In this context, time is another direction.

All our questions about how space moves we can describe as curvature. How do directions fall away from being perpendicular to one another? From being parallel to themselves? How do their directions change in time? If we have three dimensions in space and one in time — a four-dimensional “manifold” — then there’s 20 different “directions” each with maybe their own curvature to consider. This may seem a lot. Every point on this manifold has this set of twenty numbers describing the curvature of space around it. There’s not much to do but accept that, though. If we could do with fewer numbers we would, but trying cheats us out of physics.

Ten of the numbers in that set are themselves a tensor. It’s known as the Weyl Tensor. It describes gravity’s equivalent to light waves. It’s about how the shape of our cloud will change as it moves. The other ten numbers form another tensor. That is, a thousand words into the essay, the Ricci Tensor. The Ricci Tensor describes how the volume of our cloud will change as the test particles move along. It may seem odd to need ten numbers for this, but that’s what we need. For three-dimensional space and one-dimensional time, anyway. We need fewer for two-dimensional space; more, for more dimensions of space.
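The counting here can be put in a few lines of Python. This is a sketch using the standard formulas for the number of independent components in n dimensions, which I’m assuming rather than deriving: n²(n² - 1)/12 for the full curvature, n(n + 1)/2 for the (symmetric) Ricci Tensor, and whatever is left over for the Weyl Tensor. The function names are made up for this illustration. The split is only meaningful for three or more dimensions, so the last function refuses anything smaller.

```python
def curvature_components(n):
    """Independent numbers in the full curvature of an n-dimensional manifold."""
    return n * n * (n * n - 1) // 12


def ricci_components(n):
    """Independent entries of a symmetric n-by-n tensor such as the Ricci Tensor."""
    return n * (n + 1) // 2


def weyl_components(n):
    """What is left of the curvature once the Ricci part is accounted for."""
    if n < 3:
        raise ValueError("this split assumes at least three dimensions")
    return curvature_components(n) - ricci_components(n)


# Three dimensions of space plus one of time.
total, ricci, weyl = curvature_components(4), ricci_components(4), weyl_components(4)
```

For n = 4 that is 20 in all, split ten and ten. For n = 3 the Weyl part comes out to zero, so the Ricci Tensor carries all the curvature.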

The Ricci Tensor is a geometric construct. Most of us come to it, if we do, by way of physics. It’s a useful piece of general relativity. It has uses outside this, though. It appears in the study of Ricci Flows. Here space moves in ways akin to how heat flows. And the Ricci Tensor appears in projective geometry, in the study of what properties of shapes don’t depend on how we present them.

It’s still tricky stuff to get a feeling for. I’m not sure I have a good feel for it myself. There’s a long trail of mathematical symbols leading up to these tensors. The geometry of them becomes more compelling in four or more dimensions, which taxes the imagination. Yann Ollivier has a paper that attempts to provide visual explanations for many of the curvatures and tensors that are part of the field. It might help.

The Summer 2017 Mathematics A To Z: Open Set


Today’s glossary entry is another request from Elke Stangl, author of the Elkemental Force blog. I’m hoping this also turns out to be a well-received entry. Half of that is up to you, the kind reader. At least I hope you’re a reader. It’s already gone wrong, as it was supposed to be Friday’s entry. I discovered I hadn’t actually scheduled it while I was too far from my laptop to do anything about that mistake. This spoils the nice Monday-Wednesday-Friday routine of these glossary entries that dates back to the first one I ever posted and just means I have to quit forever and not show my face ever again. Sorry, Ulam Spiral. Someone else will have to think of you.

Open Set.

Mathematics likes to present itself as being universal truths. And it is. At least if we allow that the rules of logic by which mathematics works are universal. Suppose them to be true and the rest follows. But we start out with intuition, with things we observe in the real world. We’re happy when we can remove the stuff that’s clearly based on idiosyncratic experience. We find something that’s got to be universal.

Sets are pretty abstract things, as mathematicians use the term. They get to be hard to talk about; we run out of simpler words that we can use. A set is … a bunch of things. The things are … stuff that could be in a set, or else that we’d rule out of a set. We can end up better understanding things by drawing a picture. We draw the universe, which is a rectangular block, sometimes with dashed lines as the edges. The set is some blotch drawn on the inside of it. Some shade it in to emphasize which stuff we want in the set. If we need to pick out a couple things in the universe we drop in dots or numerals. If we’re rigorous about the drawing we could create a Venn Diagram.

When we do this, we’re giving up on the pure mathematical abstraction of the set. We’re replacing it with a territory on a map. Several territories, if we have several sets. The territories can overlap or be completely separate. We’re subtly letting our sense of geography, our sense of the spaces in which we move, infiltrate our understanding of sets. That’s all right. It can give us useful ideas. Later on, we’ll try to separate out the ideas that are too bound to geography.

A set is open if whenever you’re in it, you can’t be on its boundary. We never quite have this in the real world, with territories. The border between, say, New Jersey and New York becomes this infinitesimally slender thing, as wide in space as midnight is in time. But we can, with some effort, imagine the state. Imagine being as tiny in every direction as the border between two states. Then we can imagine the difference between being on the border and being away from it.

And not being on the border matters. If we are not on the border we can imagine the problem of getting to the border. Pick any direction; we can move some distance while staying inside the set. It might be a lot of distance, it might be a tiny bit. But we stay inside however we might move. If we are on the border, then there’s some direction in which any movement, however small, drops us out of the set. That’s a difference in kind between a set that’s open and a set that isn’t.

I say “a set that’s open and a set that isn’t”. There are such things as closed sets. A set doesn’t have to be either open or closed. It can be neither, a set that includes some of its borders but not other parts of it. It can even be both open and closed simultaneously. The whole universe, for example, is both an open and a closed set. The empty set, with nothing in it, is both open and closed. (This looks like a semantic trick. OK, if you’re in the empty set you’re not on its boundary. But you can’t be in the empty set. So what’s going on? … The usual. It makes other work easier if we call the empty set ‘open’. And the extra work we’d have to do to rule out the empty set doesn’t seem to get us anything interesting. So we accept what might be a trick.) The definitions of ‘open’ and ‘closed’ don’t exclude one another.

I’m not sure how this confusing state of affairs developed. My hunch is that the words ‘open’ and ‘closed’ evolved independent of each other. Why do I think this? An open set has its openness from, well, not containing its boundaries; from the inside there’s always a little more to it. A closed set has its closedness from sequences. That is, you can consider a string of points inside a set. Are these points leading somewhere? Is that point inside your set? If a string of points always leads to somewhere, and that somewhere is inside the set, then you have closure. You have a closed set. I’m not sure that the terms were derived with that much thought. But it does explain, at least in terms a mathematician might respect, why a set that isn’t open isn’t necessarily closed.

Back to open sets. What does it mean to not be on the boundary of the set? How do we know if we’re on it? We can define sets by all sorts of complicated rules: complex-valued numbers of size less than five, say. Rational numbers whose denominator (in lowest form) is no more than ten. Points in space from which a satellite dropped would crash into the moon rather than into the Earth or Sun. If we have an idea of distance we could measure how far it is from a point to the nearest part of the boundary. Do we need distance, though?

No, it turns out. We can get the idea of open sets without using distance. Introduce a neighborhood of a point. A neighborhood of a point is an open set that contains that point. It doesn’t have to be small, but that’s the connotation. And we get to thinking of little N-balls, circle or sphere-like constructs centered on the target point. It doesn’t have to be N-balls. But we think of them so much that we might as well say it’s necessary. If every point in a set has a neighborhood around it that’s also inside the set, then the set’s open.
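When we do have distance, the neighborhood test can be sketched in a few lines of Python. This is only an illustration, using the "complex-valued numbers of size less than five" set mentioned above. The function names and the choice of a 99% safety margin are my own assumptions, not anything standard.

```python
import math

def in_open_disc(z: complex, radius: float = 5.0) -> bool:
    """True when z lies strictly inside the disc |z| < radius."""
    return abs(z) < radius

def safe_radius(z: complex, radius: float = 5.0) -> float:
    """Radius of a ball around z that stays inside the open disc.

    Only meaningful for points already in the disc. The gap to the
    boundary is positive there, which is what makes the set open.
    """
    if not in_open_disc(z, radius):
        raise ValueError("point is not in the open disc")
    return radius - abs(z)

def ball_stays_inside(z: complex, steps: int = 360) -> bool:
    """Step 99% of the safe radius in many directions and check each landing spot."""
    r = 0.99 * safe_radius(z)
    angles = (2 * math.pi * k / steps for k in range(steps))
    return all(in_open_disc(z + r * complex(math.cos(t), math.sin(t))) for t in angles)

print(safe_radius(3 + 2j))           # room to move around a point well inside
print(ball_stays_inside(4.999 + 0j)) # even very near the edge there is some room
```

A point exactly on the boundary, such as 5 + 0j, is rejected by `safe_radius`: there is no positive radius that would work, and the open disc does not contain it anyway.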

You’re going to accuse me of begging the question. Fair enough. I was using open sets to define open sets. This use is all right for an intuitive idea of what makes a set open, but it’s not rigorous. We can give in and say we have to have distance. Then we have N-balls and we can build open sets out of balls that don’t contain the edges. Or we can try to drive distance out of our idea of open sets.

We can do it this way. Start off by saying the whole universe is an open set. Also that the union of any number of open sets is also an open set. And that the intersection of any finite number of open sets is also an open set. Does this sound weak? So it sounds weak. It’s enough. We get the open sets we were thinking of all along from this.
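On a universe with only finitely many points these rules can be checked by brute force. The sketch below is a minimal illustration under that assumption; the function name `is_topology` and the three-point examples are made up for the purpose. It also asks for the empty set to be open, in line with the convention described earlier.

```python
from itertools import combinations

def is_topology(universe, open_sets):
    """Check the open-set rules on a finite universe.

    universe:  an iterable of points.
    open_sets: an iterable of collections of points, the candidate open sets.
    """
    universe = frozenset(universe)
    opens = {frozenset(s) for s in open_sets}

    # The whole universe must be open, and so must the empty set.
    if universe not in opens or frozenset() not in opens:
        return False

    # With finitely many open sets, checking pairs is enough: any larger
    # union or intersection can be built up two sets at a time.
    for a, b in combinations(opens, 2):
        if a | b not in opens or a & b not in opens:
            return False
    return True

X = {1, 2, 3}
print(is_topology(X, [set(), {1}, {1, 2}, X]))  # nested sets: the rules hold
print(is_topology(X, [set(), {1}, {2}, X]))     # the union {1, 2} is missing
```

Nothing in the check mentions distance. It only asks whether the collection of sets is closed under the two operations.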

This works for the sets that look like territories on a map. It also works for sets for which we have some idea of distance, however strange it is to our everyday distances. It even works if we don’t have any idea of distance. This lets us talk about topological spaces, and study what geometry looks like if we can’t tell how far apart two points are. We can, for example, at least tell that two points are different. Can we find a neighborhood of one that doesn’t contain the other? Then we know they’re some distance apart, even without knowing what distance is.

That we reached so abstract an idea of what an open set is without losing the idea’s usefulness suggests we’re doing well. So we are. It also shows why Nicolas Bourbaki, the famous nonexistent mathematician, thought set theory and its related ideas were the core of mathematics. Today category theory is a more popular candidate for the core of mathematics. But set theory is still close to the core, and much of analysis is about what we can know from the fact of sets being open. Open sets let us explain a lot.

The Summer 2017 Mathematics A To Z: N-Sphere/N-Ball


Today’s glossary entry is a request from Elke Stangl, author of the Elkemental Force blog, which among other things has made me realize how much there is interesting to say about heat pumps. Well, you never know what’s interesting before you give it serious thought.


N-Sphere/N-Ball.

I’ll start with space. Mathematics uses a lot of spaces. They’re inspired by geometry, by the thing that fills up our room. Sometimes we make them different by simplifying them, by thinking of the surface of a table, or what geometry looks like along a thread. Sometimes we make them bigger, imagining a space with more directions than we have. Sometimes we make them very abstract. We realize that we can think of polynomials, or functions, or shapes as if they were points in space. We can describe things that work like distance and direction and angle for these more abstract things.

What are useful things we know about space? Many things. Whole books full of things. Let me pick one of them. Start with a point. Suppose we have a sense of distance, of how far one thing is from another. Then we can have an idea of the neighborhood. We can talk about some chunk of space that’s near our starting point.

So let’s agree on a space, and on some point in that space. You give me a distance. I give back to you — well, two obvious choices. One of them is all the points in that space that are exactly that distance from our agreed-on point. We know what this is, at least in the two kinds of space we grow up comfortable with. In three-dimensional space, this is a sphere. A shell, at least, centered around whatever that first point was. In two-dimensional space, on our desktop, it’s a circle. We know it can look a little weird: if we started out in a one-dimensional space, there’d be only two points, one on either side of the original center point. But it won’t look too weird. Imagine a four-dimensional space. Then we can speak of a hypersphere. And we can imagine that as being somehow a ball that’s extremely spherical. Maybe it pokes out of the rendering we try making of it, like a cartoon character falling out of the movie screen. We can imagine a five-dimensional space, or a ten-dimensional one, or something with even more dimensions. And we can conclude there’s a sphere for even that much space. Well, let it.

What are spheres good for? Well, they’re nice familiar shapes. Even if they’re in a weird number of dimensions. They’re useful, too. A lot of what we do in calculus, and in analysis, is about dealing with difficult points. Points where a function is discontinuous. Points where the function doesn’t have a value. One of calculus’s reliable tricks, though, is that we can swap information about the edge of things for information about the interior. We can replace a point with a sphere and find our work is easier.

The other thing I could give you. It’s a ball. That’s all the points that aren’t more than your distance away from our point. It’s the inside, the whole planet rather than just the surface of the Earth.

And here’s an ambiguity. Is the surface a part of the ball? Should we include the edge, or do we just want the inside? And that depends on what we want to do. Either might be right. If we don’t need the edge, then we have an open set (stick around for Friday). This gives us the open ball. If we do need the edge, then we have a closed set, and so, the closed ball.
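The three choices differ only in the comparison used. Here is a small sketch in Python, assuming the ordinary straight-line distance and points given as coordinate tuples. The function names and the tolerance in `on_sphere` are my own choices for illustration.

```python
import math

def distance(p, q):
    """Ordinary straight-line distance between two points given as coordinate tuples."""
    return math.dist(p, q)

def in_open_ball(p, center, radius):
    """Strictly closer than radius: the edge is left out."""
    return distance(p, center) < radius

def in_closed_ball(p, center, radius):
    """No farther than radius: the edge is included."""
    return distance(p, center) <= radius

def on_sphere(p, center, radius, tol=1e-12):
    """At the given distance, up to a small tolerance for rounding."""
    return math.isclose(distance(p, center), radius, abs_tol=tol)

origin = (0, 0, 0)
edge_point = (1, 0, 0)
print(in_open_ball(edge_point, origin, 1))    # False: the open ball omits the surface
print(in_closed_ball(edge_point, origin, 1))  # True: the closed ball keeps it
print(on_sphere(edge_point, origin, 1))       # True
```

The same functions work in any number of dimensions, so long as both points have the same number of coordinates.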

Balls are so useful. Take a chunk of space that you find interesting for whatever reason. We can represent that space as the joining together (the “union”) of a bunch of balls. Probably not all the same size, but that’s all right. We might need infinitely many of these balls to get the chunk precisely right, or as close to right as can be. But that’s all right. We can still do it. Most anything we want to analyze is easier to prove on any one of these balls. And since we can describe the complicated shape as this combination of balls, then we can know things about the whole complicated shape. It’s much the way we can know things about polygons by breaking them into triangles, and showing things are true about triangles.

Sphere or ball, whatever you like. We can describe how many dimensions of space the thing occupies with the prefix. The 3-ball is everything close enough to a point that’s in a three-dimensional space. The 2-ball is everything close enough in a two-dimensional space. The 10-ball is everything close enough to a point in a ten-dimensional space. The 3-sphere is … oh, all right. Here we have a little squabble. People doing geometry prefer this to be the sphere in three dimensions. People doing topology prefer this to be the sphere whose surface has three dimensions, that is, the sphere in four dimensions. Usually which you mean will be clear from context: are you reading a geometry or a topology paper? If you’re not sure, oh, look for anything hinting at the number of spatial dimensions. If nothing gives you a hint maybe it doesn’t matter.

Either way, we do want to talk about the family of shapes without committing ourselves to any particular number of dimensions. And so that’s why we fall back on ‘N’. ‘N’ is a good name for “the number of dimensions we’re working in”, and so we use it. Then we have the N-sphere and the N-ball, a sphere-like shape, or a ball-like shape, that’s in however much space we need for the problem.

I mentioned something early on that I bet you paid no attention to. That was that we need a space, and a point inside the space, and some idea of distance. One of the surprising things mathematics teaches us about distance is … there’s a lot of ideas of distance out there. We have what I’ll call an instinctive idea of distance. It’s the one that matches what holding a ruler up to stuff tells us. But we don’t have to have that.

I sense the grumbling already. Yes, sure, we can define distance by some screwball idea, but do we ever need it? To which the mathematician answers, well, what if you’re trying to figure out how far away something in midtown Manhattan is? Where you can only walk along streets or avenues and we pretend Broadway doesn’t exist? Huh? How about that? Oh, fine, the skeptic might answer. Grant that there can be weird cases where the straight-line ruler distance is less enlightening than some other scheme is.

Well, there are. There exists a whole universe of different ideas of distance. There’s a handful of useful ones. The ordinary straight-line ruler one, the Euclidean distance, you get in a method so familiar it’s worth saying what you do. You find the coordinates of your two given points. Take the pairs of corresponding coordinates: the x-coordinates of the two points, the y-coordinates of the two points, the z-coordinates, and so on. Find the differences between corresponding coordinates. Take the absolute value of those differences. Square all those absolute-value differences. Add up all those squares. Take the square root of that. Fine enough.
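That recipe translates almost word for word into Python. This is a plain sketch of the steps as listed, with a function name of my own choosing. The absolute-value step is kept to mirror the description, though squaring would erase the sign anyway.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two points given as coordinate tuples."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Differences between corresponding coordinates, made non-negative.
    differences = [abs(a - b) for a, b in zip(p, q)]
    # Square each difference, add the squares, take the square root.
    squares = [d ** 2 for d in differences]
    return math.sqrt(sum(squares))

print(euclidean_distance((0, 0), (3, 4)))  # 5.0, the familiar 3-4-5 triangle
```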

There’s a lot of novelty acts. For example, do that same thing, only instead of raising the differences to the second power, raise them to the 26th power. When you get the sum, instead of the square root, take the 26th root. There. That’s a legitimate distance. No, you will never need this, but your analysis professor might give you it as a homework problem sometime.

Some are useful, though. Raising to the first power, and then eventually taking the first root, gives us something useful. Yes, raising to a first power and taking a first root isn’t doing anything. We just say we’re doing that for the sake of consistency. Raising to an infinitely large power, and then taking an infinitely great root, inspires angry glares. But we can make that idea rigorous. When we do it gives us something useful.

And here’s a new, amazing thing. We can still make “spheres” for these other distances. On a two-dimensional space, the “sphere” with this first-power-based distance will look like a diamond. The “sphere” with this infinite-power-based distance will look like a square. On a three-dimensional space the “sphere” with the first-power-based distance looks like a … well, more complicated, three-dimensional diamond. The “sphere” with the infinite-power-based distance looks like a box. The “balls” in all these cases look like what you expect from knowing the spheres.
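A short Python sketch can check a few points against these shapes. It assumes the power-based recipe described above, and it treats the infinite-power case as the largest coordinate difference, which is the usual way of making that limit rigorous. The function name `p_distance` is hypothetical.

```python
import math

def p_distance(p, q, power):
    """Distance built from the given power; power may be math.inf."""
    differences = [abs(a - b) for a, b in zip(p, q)]
    if power == math.inf:
        # Assumed limiting case: only the largest difference survives.
        return max(differences)
    return sum(d ** power for d in differences) ** (1.0 / power)

origin = (0, 0)

# (0.5, 0.5) sits on the first-power "sphere" of radius 1: an edge of the diamond.
print(p_distance(origin, (0.5, 0.5), 1))

# (1, 0.3) sits on the infinite-power "sphere" of radius 1: a side of the square.
print(p_distance(origin, (1, 0.3), math.inf))

# The corner (1, 1) is on the square, but lies outside the diamond and the circle.
for power in (1, 2, 26, math.inf):
    print(power, p_distance(origin, (1, 1), power))
```

As the power grows, the distance to the corner (1, 1) shrinks toward 1, which is one way to see the "sphere" swelling from a diamond through a circle toward a square.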

As with the ordinary ideas of spheres and balls these shapes let us understand space. Spheres offer a natural path to understanding difficult points. Balls offer a natural path to understanding complicated shapes. The different ideas of distance change how we represent these, and how complicated they are, but not the fact that we can do it. And it allows us to start thinking of what spheres and balls for more abstract spaces, universes made of polynomials or formed of trig functions, might be. They’re difficult to visualize. But we have the grammar that lets us speak about them now.

And for a postscript: I also wrote about spheres and balls as part of my Set Tour a couple years ago. Here’s the essay about the N-sphere, although I didn’t exactly call it that. And here’s the essay about the N-ball, again not quite called that.

The Summer 2017 Mathematics A To Z: Morse Theory


Today’s A To Z entry is a change of pace. It dives deeper into analysis than this round has been. The term comes from Mr Wu, of the Singapore Maths Tuition blog, whom I thank for the request.


Morse Theory.

An old joke, as most of my academia-related ones are. The young scholar says to his teacher how amazing it was in the old days, when people were foolish, and thought the Sun and the Stars moved around the Earth. How fortunate we are to know better. The elder says, ah yes, but what would it look like if it were the other way around?

There are many things to ponder packed into that joke. For one, the elder scholar’s awareness that our ancestors were no less smart or perceptive or clever than we are. For another, the awareness that there is a problem. We want to know about the universe. But we can only know what we perceive now, where we are at this moment. Even a note we’ve written in the past, or a message from a trusted friend, we can’t take uncritically. What we know is that we perceive this information in this way, now. When we pay attention to our friends in the philosophy department we learn that knowledge is even harder than we imagine. But I’ll stop there. The problem is hard enough already.

We can put it in a mathematical form, one that seems immune to many of the worst problems of knowledge. In this form it looks something like this: what can we know about the universe, if all we really know is what things in that universe are doing near us? The things that we look at are functions. The universe we’re hoping to understand is the domain of the functions. One filter we use to see the universe is Morse Theory.

We don’t look at every possible function. Functions are too varied and weird for that. We look at functions whose range is the real numbers. And they must be smooth. This is a term of art. It means the function has derivatives. It has to be continuous. It can’t have sharp corners. And it has to have lots of derivatives. The first derivative of a smooth function has to also be continuous, and has to also lack corners. And the derivative of that first derivative has to be continuous, and to lack corners. And the derivative of that derivative has to be the same. A smooth function we can differentiate over and over again, infinitely many times. None of those derivatives can have corners or jumps or missing patches or anything. This is what makes it smooth.

Most functions are not smooth, in much the same way most shapes are not circles. That’s all right. There are many smooth functions anyway, and they describe things we find interesting. Or we think they’re interesting, anyway. Smooth functions are easy for us to work with, and to know things about. There’s plenty of smooth functions. If you’re interested in something else there’s probably a smooth function that’s close enough for practical use.

Morse Theory builds on the “critical points” of these smooth functions. A critical point, in this context, is one where the derivative is zero. Derivatives being zero usually signal something interesting going on. Often they show where the function changes behavior. In freshman calculus they signal where a function changes from increasing to decreasing, so the critical point is a maximum. In physics they show where a moving body no longer has an acceleration, so the critical point is an equilibrium. Or where a system changes from one kind of behavior to another. And here — well, many things can happen.

So take a smooth function. And take a critical point that it’s got. (And, erg. Technical point. The derivative of your smooth function, at that critical point, shouldn’t be having its own critical point going on at the same spot. That makes stuff more complicated.) It’s possible to approximate your smooth function near that critical point with, of course, a polynomial. It’s always polynomials. The shape of these polynomials gives you an index for these points. And that can tell you something about the shape of the domain you’re on.

At least, it tells you something about what the shape is where you are. The universal model for this — based on skimming texts and papers and popularizations of this — is of a torus standing vertically. Like a doughnut that hasn’t tipped over, or like a tire on a car that’s working as normal. I suspect this is the best shape to use for teaching, as anyone can understand it while it still shows the different behaviors. I won’t resist.

Imagine slicing this tire horizontally. Slice it close to the bottom, below the central hole, and the part that drops down is a disc. At least, it could be flattened out tolerably well to a disc.

Slice it somewhere that intersects the hole, though, and you have a different shape. You can’t squash that down to a disc. You have a noodle shape. A cylinder at least. That’s different from what you got the first slice.

Slice the tire somewhere higher. Somewhere above the central hole, and you have … well, it’s still a tire. It’s got a hole in it, but you could imagine patching it and driving on. There’s another different shape that we’ve gotten from this.

Imagine we were confined to the surface of the tire, but did not know what surface it was. That we start at the lowest point on the tire and ascend it. From the way the smooth functions around us change we can tell how the surface we’re on has changed. We can see its change from “basically a disc” to “basically a noodle” to “basically a doughnut”. We could work out what the surface we’re on has to be, thanks to how these smooth functions around us change behavior.
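The climb can be sketched numerically. What follows is an illustration under my own assumptions: the smooth function is the height of a point on the tire, the tire is described by two angles `u` (around the wheel) and `v` (around the tube), and the radii `R` and `r` are arbitrary. The index of a critical point is taken to be the number of negative eigenvalues of the matrix of second derivatives there, which is the usual definition.

```python
import math

R, r = 2.0, 1.0  # assumed radii: wheel and tube, with R > r

def height(u, v):
    """Height above the axle of the point on the standing tire at angles (u, v)."""
    return (R + r * math.cos(v)) * math.sin(u)

def hessian(u, v):
    """Second derivatives of height, worked out by hand from the formula above."""
    f_uu = -(R + r * math.cos(v)) * math.sin(u)
    f_uv = -r * math.sin(v) * math.cos(u)
    f_vv = -r * math.cos(v) * math.sin(u)
    return f_uu, f_uv, f_vv

def morse_index(u, v):
    """Count the negative eigenvalues of the 2x2 symmetric Hessian."""
    f_uu, f_uv, f_vv = hessian(u, v)
    mean = (f_uu + f_vv) / 2
    spread = math.hypot((f_uu - f_vv) / 2, f_uv)
    return sum(1 for e in (mean - spread, mean + spread) if e < 0)

# The four places where both first derivatives vanish, listed bottom to top.
critical_points = [
    (-math.pi / 2, 0.0),      # bottom of the tire
    (-math.pi / 2, math.pi),  # bottom of the central hole
    (math.pi / 2, math.pi),   # top of the central hole
    (math.pi / 2, 0.0),       # top of the tire
]

for u, v in critical_points:
    print(height(u, v), morse_index(u, v))
```

The indices come out as 0, 1, 1, 2 on the way up. Passing the index-0 point gives the disc, the first index-1 point turns it into the noodle, the second gives the tire with a hole in it, and the index-2 point at the top is the patch.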

Occasionally we mathematical-physics types want to act as though we’re not afraid of our friends in the philosophy department. So we deploy the second thing we know about Immanuel Kant. He observed that knowing the force of gravity falls off as the square of the distance between two things implies that the things should exist in a three-dimensional space. (Source: I dunno, I never read his paper or book or whatever and dunno I ever heard anyone say they did.) It’s a good observation. Geometry tells us what physics can happen, but what physics does happen tells us what geometry they happen in. And it tells the philosophy department that we’ve heard of Immanuel Kant. This impresses them greatly, we tell ourselves.

Morse Theory is a manifestation of how observable physics teaches us the geometry they happen on. And in an urgent way, too. Some of Edward Witten’s pioneering work in superstring theory was in bringing Morse Theory to quantum field theory. He showed a set of problems called the Morse Inequalities gave us insight into supersymmetric quantum mechanics. The link between physics and doughnut-shapes may seem vague. This is because you’re not remembering that mathematical physics sees “stuff happening” as curves drawn on shapes which represent the kind of problem you’re interested in. Learning what the shapes representing the problem look like is solving the problem.

If you’re interested in the substance of this, the universally-agreed reference is J Milnor’s 1963 text Morse Theory. I confess it’s hard going to read, because it’s a symbols-heavy textbook written before the existence of LaTeX. Each page reminds one why typesetters used to get hazard pay, and not enough of it.

The Summer 2017 Mathematics A To Z: Klein Bottle


Gaurish, of the For The Love Of Mathematics blog, takes me back into topology today. And it’s a challenging one, because what can I say about a shape this involved when I’m too lazy to draw pictures or include photographs most of the time?


In 1958 Clifton Fadiman, an open public intellectual and panelist on many fine old-time radio and early TV quiz shows, edited the book Fantasia Mathematica. It’s a pleasant read and you likely can find a copy in a library or university library nearby. It’s a collection of mathematically-themed stuff. Mostly short stories, a few poems, some essays, even that bit where Socrates works through a proof. And some of it is science fiction, this from an era when science fiction was really disreputable.

If there’s a theme to the science fiction stories included it is: Möbius Strips, huh? There are so many stories in the book that amount to, “what is this crazy bizarre freaky weird ribbon-like structure that only has the one side? Huh?” As I remember even one of the non-science-fiction stories is a Möbius Strip story.

I don’t want to sound hard on the writers, nor on Fadiman for collecting what he has. A story has to be about people doing something, even if it’s merely exploring some weird phenomenon. You can imagine people dealing with weird shapes. It’s hard to imagine what story you could tell about an odd perfect number. (Well, that isn’t “here’s how we discovered the odd perfect number”, which amounts to a lot of thinking and false starts. Or that doesn’t make the odd perfect number a MacGuffin, the role equally well served by letters of transit or a heap of gold or whatever.) Many of the stories that aren’t about the Möbius Strip are about four- and higher-dimensional shapes that people get caught in or pass through. One of the hyperdimensional stories, A J Deutsch’s “A Subway Named Möbius”, even pulls in the Möbius Strip. The name doesn’t fit, but it is catchy, and is one of the two best tall tales about the Boston subway system.

Besides, it’s easy to see why the Möbius Strip is interesting. It’s a ribbon where both sides are the same side. What’s not neat about that? It forces us to realize that while we know what “sides” are, there’s stuff about them that isn’t obvious. That defies intuition. It’s so easy to make that it holds another mystery. How is this not a figure known to the ancients and used as a symbol of paradox for millennia? I have no idea; it’s hard to guess why something was not noticed when it could easily have been. It dates to 1858, when August Ferdinand Möbius and Johann Benedict Listing independently published on it.

The Klein Bottle is newer by a generation. Felix Klein, who used group theory to enlighten geometry and vice-versa, described the surface in 1882. It has much in common with the Möbius Strip. It’s a thing that looks like a solid. But it’s impossible to declare one side to be outside and the other in, at least not in any logically coherent way. Take one and dab a spot with a magic marker. You could trace, with the marker, a continuous curve that gets around to the same spot on the “other” “side” of the thing. You see why I have to put quotes around “other” and “side”. I believe you know what I mean when I say this. But taken literally, it’s nonsense.

The Klein Bottle’s a two-dimensional surface. By that I mean that you could cover it with what look like lines of longitude and latitude. Those coordinates would tell you, without confusion, where a point on the surface is. But it’s embedded in a four-dimensional space. (Or a higher-dimensional space, but everything past the fourth dimension is extravagance.) We have never seen a Klein Bottle in its whole. I suppose there are skilled people who can imagine it faithfully, but how would anyone else ever know?

Big deal. We’ve never seen a tesseract either, but we know the shadow it casts in three-dimensional space. So it is with the Klein Bottle. Visit any university mathematics department. If they haven’t got a glass replica of one in the dusty cabinets welcoming guests to the department, never fear. At least one of the professors has one on an office shelf, probably beside some exams from eight years ago. They make nice-looking jars. Klein Bottles don’t have to. There are different shapes their projection into three dimensions can take. But the only really different one is this sort of figure-eight helical shape that looks like a roller coaster gone vicious. (There’s also a mirror image of this, the helix winding the opposite way.) These representations have the surface cross through itself. In four dimensions, it does no such thing, any more than the edges of a cube cross one another. It’s just the lines in a picture on a piece of paper that cross.

The Möbius Strip is good practice for learning about the Klein Bottle. We can imagine creating a Bottle by the correct stitching-together of two strips. Or, if you feel destructive, we can start with a Bottle and slice it, producing a pair of Möbius Strips. Both are non-orientable. We can’t make a division between one side and another that reflects any particular feature of the shape. One of the helix-like representations of the Klein Bottle also looks like a pool toy-ring version of the Möbius Strip.

And strange things happen on these surfaces. You might remember the four-color map theorem. Four colors are enough to color any two-dimensional map without adjacent territories having to share a color. (This isn’t actually so, as the territories have to be contiguous, with no enclaves of one territory inside another. Never mind.) This is so for territories on the sphere. It’s hard to prove (although the five-color theorem is easy). Not so for the Möbius Strip: territories on it might need as many as six colors. And likewise for the Klein Bottle. That’s a particularly neat result, as the Heawood Conjecture tells us the Klein Bottle might need seven. The Heawood Conjecture is otherwise dead-on in telling us how many colors different kinds of surfaces need for their map-colorings. The Klein Bottle is a strange surface. And yes, it was easier to prove the six-color theorem on the Klein Bottle than it was to prove the four-color theorem on the plane or sphere.

(Though it’s got the tentative-sounding name of conjecture, the Heawood Conjecture is proven. Heawood put it out as a conjecture in 1890. It took to 1968 for the whole thing to be finally proved. I imagine all those decades of being thought but not proven true gave it a reputation. It’s not wrong for Klein Bottles. If six colors are enough for these maps, then so are seven colors. It’s just that Klein Bottles are the lone case where the bound is tighter than Heawood suggests.)
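For the curious, Heawood's bound is easy to compute. The essay doesn't state the formula; the sketch below uses the standard one, written in terms of the Euler characteristic of a closed surface, and the function name is my own. The Euler characteristics listed are the standard values for these surfaces.

```python
import math

def heawood_number(euler_characteristic: int) -> int:
    """Heawood's bound on map colors for a closed surface.

    Computes floor((7 + sqrt(49 - 24 * chi)) / 2) using integer arithmetic.
    """
    return (7 + math.isqrt(49 - 24 * euler_characteristic)) // 2

surfaces = {
    "sphere": 2,
    "projective plane": 1,
    "torus": 0,
    "Klein bottle": 0,
    "two-holed torus": -2,
}

for name, chi in surfaces.items():
    print(name, heawood_number(chi))
```

The Klein Bottle and the torus share an Euler characteristic, so the formula gives both of them seven. The torus really can need seven colors; the Klein Bottle is the case where six turn out to be enough.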

All that said, do we care? Do Klein Bottles represent something of particular mathematical interest? Or are they imagination-capturing things we don’t really use? I confess I’m not enough of a topologist to say how useful they are. They are easily-understood examples of algebraic or geometric constructs. These are things with names like “quotient spaces” and “deck transformations” and “fiber bundles”. The thought of the essay I would need to write to say what a fiber bundle is makes me appreciate having good examples of the thing around. So if nothing else they are educationally useful.

And perhaps they turn up more than I realize. The geometry of Möbius Strips turns up in many surprising places: music theory and organic chemistry, superconductivity and roller coasters. It would seem out of place if the kinds of connections which make a Klein Bottle don’t turn up in our twisty world.

The Summer 2017 Mathematics A To Z: Jordan Canonical Form


I made a mistake! I thought we had got to the end of the block of A To Z topics suggested by Gaurish, of the For The Love Of Mathematics blog. Not so and, indeed, I wonder if it wouldn’t be a viable writing strategy around here for me to just ask Gaurish to throw out topics and I have two weeks to write about them. I don’t think there’s a single unpromising one in the set.


Jordan Canonical Form.

Before you ask, yes, this is named for the Camille Jordan.

So this is a thing from algebra. Particularly, linear algebra. And more particularly, matrices. Matrices are so much of linear algebra that you could be forgiven thinking they’re all of linear algebra. The thing is, matrices are a really good way of describing linear transformations. That is, where you take a block of space and stretch it out, or squash it down, or rotate it, or do some combination of these things. And stretching and squashing and rotating is a lot of what you’d ever want to do. Refer to any book on how to draw animated cartoons. The only thing matrices can’t do is have their eyes bug out huge when an attractive region of space walks past.

Thing about a matrix is if you want to do something with it, you’re going to write it as a grid of numbers. It doesn’t have to be a grid of numbers. But about all the matrices anyone does anything with are grids of numbers. And that’s fine. They do an incredible lot of stuff. What’s not fine is that on looking at a huge block of numbers, the mind sees: huh. That’s a big block of numbers. Good luck finding what’s meaningful in them. To help find meaning we have a set of standard forms. We call them “canonical” or “normal” or some other approving term. They rearrange and change the terms in the matrix so that more interesting stuff is more obvious.

Now you’re justified asking: how can we rearrange and change the terms in a matrix without changing what the matrix is? We can get away with doing this because we can show some rearrangements don’t change what we’re interested in. That covers the “how dare we” part of “how”. We do it by using matrix multiplication. You might remember from high school algebra that matrix multiplication is this agonizing process of multiplying every pair of numbers that ever existed together, then adding them all up, and then maybe you multiply something by minus one because you’re thinking of determinants, and it all comes out wrong anyway and you have to do it over? Yeah. Well, matrix multiplication is defined hard because it makes stuff like this work out. So that covers the “by what technique” part of “how”. We start out with some matrix, let me imaginatively name it A . And then we find some transformation matrix for which, eh, let’s say P is a good enough name. I’ll say why in a moment. Then we use that matrix and its multiplicative inverse P^{-1} . And we evaluate the product P^{-1} A P . This won’t just be the same old matrix we started with. Not usually. Promise. But what this will be, if we chose our matrix P correctly, is some new matrix that’s easier to read.
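Here is a minimal sketch of that product, done with exact fractions so nothing gets lost to rounding. The matrices A and P are ones I picked by hand for the illustration, and the helper names `matmul` and `inverse_2x2` are made up here; nothing about them is special.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Invert a 2-by-2 matrix exactly (assumes a nonzero determinant)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[4, 1], [2, 3]]     # the matrix we start with
P = [[1, 1], [1, -2]]    # a transformation matrix chosen to suit this A

# Evaluate P^{-1} A P.
new_A = matmul(inverse_2x2(P), matmul(A, P))
for row in new_A:
    print([int(entry) for entry in row])
```

For this choice of P the product comes out with 5 and 2 on the diagonal and zeroes everywhere else. Pick a different P and you get a different, probably less readable, matrix.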

The matrices involved here have to follow some rules. Most important, they’re all going to be square matrices. There’ll be more rules that your linear algebra textbook will tell you. Or your instructor will, after checking the textbook.

So what makes a matrix easy to read? Zeroes. Lots and lots of zeroes. When we have a standardized form of a matrix it’s nearly all zeroes. This is for a good reason: zeroes are easy to multiply stuff by. And they’re easy to add stuff to. And almost everything we do with matrices, as a calculation, is a lot of multiplication and addition of the numbers in the matrix.

What also makes a matrix easy to read? Everything important being on the diagonal. The diagonal is one of the two things you would imagine if you were told “here’s a grid of numbers, pick out the diagonal”. In particular it’s the one that goes from the upper left to the bottom right, that is, row one column one, and row two column two, and row three column three, and so on up to row 86 column 86 (or whatever). If everything is on the diagonal the matrix is incredibly easy to work with. If it can’t all be on the diagonal at least everything should be close to it. As close as possible.

In the Jordan Canonical Form not everything is on the diagonal. I mean, it can be, but you shouldn’t count on that. But everything either will be on the diagonal or else it’ll be one row up from the diagonal. That is, row one column two, row two column three, row 85 column 86. Like that. There are a few other important pieces.

First is that the thing in the row above the diagonal will be either 1 or 0. Second is that on the diagonal you’ll have a sequence of all the same number. Like, you’ll get four instances of the number ‘2’ along this string of the diagonal. Third is that you’ll get a 1 in the row above every instance of this particular number except the first. Fourth is that you’ll get a 0 in the row above the first instance of this number.

Yeah, that’s fussy to visualize. This is one of those things easiest to show in a picture. A Jordan canonical form is a matrix that looks like this:

2 1 0 0 0 0 0 0 0 0 0 0
0 2 1 0 0 0 0 0 0 0 0 0
0 0 2 1 0 0 0 0 0 0 0 0
0 0 0 2 0 0 0 0 0 0 0 0
0 0 0 0 3 1 0 0 0 0 0 0
0 0 0 0 0 3 0 0 0 0 0 0
0 0 0 0 0 0 4 1 0 0 0 0
0 0 0 0 0 0 0 4 1 0 0 0
0 0 0 0 0 0 0 0 4 0 0 0
0 0 0 0 0 0 0 0 0 -1 0 0
0 0 0 0 0 0 0 0 0 0 -2 1
0 0 0 0 0 0 0 0 0 0 0 -2

This may have you dazzled. It dazzles mathematicians too. When we have to write a matrix that’s almost all zeroes like this we drop nearly all the zeroes. If we have to write anything we just write a really huge 0 in the upper-right and the lower-left corners.

What makes this the Jordan Canonical Form is that the matrix looks like it’s put together from what we call Jordan Blocks. Look around the diagonals. Here’s the first Jordan Block:

2 1 0 0
0 2 1 0
0 0 2 1
0 0 0 2

Here’s the second:

3 1
0 3

Here’s the third:

4 1 0
0 4 1
0 0 4

Here’s the fourth:

-1

And here’s the fifth:

-2 1
0 -2

And we can represent the whole matrix as this might-as-well-be-diagonal thing:

First Block 0 0 0 0
0 Second Block 0 0 0
0 0 Third Block 0 0
0 0 0 Fourth Block 0
0 0 0 0 Fifth Block

These blocks can be as small as a single number. They can be as big as however many rows and columns you like. Each individual block is some repeated number on the diagonal, and a repeated one in the row above the diagonal. You can call this the “superdiagonal”.
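The block structure is easy to build by machine. This is a sketch that assembles the twelve-by-twelve example out of its five blocks. The function names `jordan_block` and `block_diagonal` are invented for this illustration.

```python
def jordan_block(value, size):
    """One Jordan Block: `value` on the diagonal, 1 on the superdiagonal."""
    return [[value if col == row else 1 if col == row + 1 else 0
             for col in range(size)]
            for row in range(size)]

def block_diagonal(blocks):
    """Place square blocks along the diagonal of a matrix of zeroes."""
    n = sum(len(block) for block in blocks)
    matrix = [[0] * n for _ in range(n)]
    offset = 0
    for block in blocks:
        for r, row in enumerate(block):
            for c, entry in enumerate(row):
                matrix[offset + r][offset + c] = entry
        offset += len(block)
    return matrix

J = block_diagonal([
    jordan_block(2, 4),    # first block
    jordan_block(3, 2),    # second block
    jordan_block(4, 3),    # third block
    jordan_block(-1, 1),   # fourth block, a single number
    jordan_block(-2, 2),   # fifth block
])

for row in J:
    print(" ".join(str(entry) for entry in row))
```

Notice the zero above the first entry of each new block. That gap in the superdiagonal is how you spot where one block stops and the next starts.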

(Mathworld, and Wikipedia, assert that sometimes the row below the diagonal — the “subdiagonal” — gets the 1’s instead of the superdiagonal. That’s fine if you like it that way, and it won’t change any of the real work. I have not seen these subdiagonal 1’s in the wild. But I admit I don’t do a lot of this field and maybe there’s times it’s more convenient.)

Using the Jordan Canonical Form for a matrix is a lot like putting an object in a standard reference pose for photographing. This is a good metaphor. We get a Jordan Canonical Form by matrix multiplication, which works like rotating and scaling volumes of space. You can view the Jordan Canonical Form for a matrix as how you represent the original matrix from a new viewing angle that makes it easy to recognize. And this is why P is not a bad name for the matrix that does this work. We can see all this as “projecting” the matrix we started with into a new frame of reference. The new frame is maybe rotated and stretched and squashed and whatnot, compared to how we started. But it’s as valid a basis. Projecting a mathematical object from one frame of reference to another usually involves calculating something that looks like P^{-1} A P so, projection. That’s our name.

Mathematicians will speak of “the” Jordan Canonical Form for a matrix as if there were such a thing. I don’t mean that Jordan Canonical Forms don’t exist. They exist just as much as matrices do. It’s the “the” that misleads. You can put the Jordan Blocks in any order and have as valid, and as useful, a Jordan Canonical Form. But it’s easy to swap the orders of these blocks around — it’s another matrix multiplication, and a blessedly easy one — so it doesn’t matter which form you have. Get any one and you have them all.

I haven’t said anything about what these numbers on the diagonal are. They’re the eigenvalues of the original matrix. I hope that clears things up.

Yeah, not to anyone who didn’t know what a Jordan Canonical Form was to start with. Rather than get into calculations let me go to well-established metaphor. Take a sample of an unknown chemical and set it on fire. Put the light from this through a prism and photograph the spectrum. There will be lines, interruptions in the progress of colors. The locations of those lines and how intense they are tell you what the chemical is made of, and in what proportions. These are much like the eigenvectors and eigenvalues of a matrix. The eigenvectors tell you what the matrix is made of, and the eigenvalues how much of the matrix is those. This stuff gets you very far in proving a lot of great stuff. And part of what makes the Jordan Canonical Form great is that you get the eigenvalues right there in neat order, right where anyone can see them.

So! All that’s left is finding the things. The best way to find the Jordan Canonical Form for a given matrix is to become an instructor for a class on linear algebra and assign it as homework. The second-best way is to give the problem to your TA, who will type it in to Mathematica and return the result. It’s too much work to do most of the time. Almost all the stuff you could learn from having the thing in the Jordan Canonical Form you work out in the process of finding the matrix P that would let you calculate what the Jordan Canonical Form is. And once you had that, why go on?
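For the tiniest possible case the work can be shown in full. This is a sketch for one hand-picked two-by-two matrix with a repeated eigenvalue, so it is an illustration and not a general algorithm. The vectors v and w were found by hand: v is an eigenvector, and w is a “generalized” eigenvector, chosen so that (A - 2I) w = v. The helper names are made up here.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Invert a 2-by-2 matrix exactly (assumes a nonzero determinant)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def apply(M, x):
    """Multiply a 2-by-2 matrix by a column vector."""
    return [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]

A = [[3, 1], [-1, 1]]          # characteristic polynomial is (t - 2)^2
eigenvalue = 2
N = [[A[0][0] - eigenvalue, A[0][1]],
     [A[1][0], A[1][1] - eigenvalue]]   # this is A - 2I

v = [1, -1]   # eigenvector: N sends it to zero
w = [1, 0]    # generalized eigenvector: N sends it to v
assert apply(N, v) == [0, 0]
assert apply(N, w) == v

# The columns of P are v and then w.
P = [[v[0], w[0]], [v[1], w[1]]]
J = matmul(inverse_2x2(P), matmul(A, P))
for row in J:
    print([int(entry) for entry in row])
```

The result is a single Jordan Block with 2 on the diagonal and a 1 above it. By the time you have v and w in hand you already know the eigenvalue and how the matrix behaves, which is rather the point made above.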

Where the Jordan Canonical Form shines is in doing proofs about what matrices can do. We can always put a square matrix into a Jordan Canonical Form, at least if we let complex numbers into it. So if we want to show something is true about matrices in general, we can show that it’s true for the simpler-to-work-with Jordan Canonical Form. Then show that shifting a matrix to or from the Jordan Canonical Form doesn’t change whether the thing we’re interested in is true. It exists in that strange space: it is quite useful, but never on a specific problem.

Oh, all right. Yes, it’s the same Camille Jordan of the Jordan Curve and also of the Jordan Curve Theorem. That fellow.

The Summer 2017 Mathematics A To Z: Integration


One more mathematics term suggested by Gaurish for the A-To-Z today, and then I’ll move on to a couple of others. Today’s is a good one.

Integration.

Stand on the edge of a plot of land. Walk along its boundary. As you walk the edge pay attention. Note how far you walk before changing direction, even in the slightest. When you return to where you started consult your notes. Contained within them is the area you circumnavigated.

If that doesn’t startle you perhaps you haven’t thought about how odd that is. You don’t ever touch the interior of the region. You never do anything like see how many standard-size tiles would fit inside. You walk a path that is as close to one-dimensional as your feet allow. And encoded in there somewhere is an area. Stare at that incongruity and you realize why integrals baffle the student so. They have a deep strangeness embedded in them.
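For a plot with straight sides the trick can be sketched in a few lines. This assumes the notes are a list of pairs: how far you walked, then how many degrees you turned to the left before the next leg. It also assumes the walk really does close up. The function name `area_from_walk` is invented for the illustration, and the arithmetic is the standard “shoelace” formula for a polygon’s area.

```python
import math

def area_from_walk(legs):
    """Area enclosed by a closed walk given as (distance, left_turn_degrees) pairs."""
    x, y, heading = 0.0, 0.0, 0.0
    twice_area = 0.0
    for distance, left_turn_degrees in legs:
        next_x = x + distance * math.cos(heading)
        next_y = y + distance * math.sin(heading)
        # Each leg adds one shoelace term; the interior is never visited.
        twice_area += x * next_y - next_x * y
        x, y = next_x, next_y
        heading += math.radians(left_turn_degrees)
    return abs(twice_area) / 2

# A rectangular plot, 4 units by 3 units, walked once around.
rectangle = [(4, 90), (3, 90), (4, 90), (3, 90)]
print(round(area_from_walk(rectangle), 6))
```

Only positions on the boundary ever go into the sum. A curved boundary needs an integral in place of the sum, but the idea is the same.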

We who do mathematics have always liked integration. Integrals grow, in the western tradition, out of geometry. Given a shape, what is a square that has the same area? There are shapes it’s easy to find the area for, given only straightedge and compass: a rectangle? Easy. A triangle? Just as straightforward. A polygon? If you know triangles then you know polygons. A lune, the crescent-moon shape formed by taking a circular cut out of a circle? We can do that. (If the cut is the right size.) A circle? … All right, we can’t do that, but we spent two thousand years trying before we found that out for sure. And we can do some excellent approximations.

That bit of finding-a-square-with-the-same-area was called “quadrature”. The name survives, mostly in the phrase “numerical quadrature”. We use that to mean that we computed an integral’s approximate value, instead of finding a formula that would get it exactly. The otherwise obvious choice of “numerical integration” we use already. It describes computing the solution of a differential equation. We’re not trying to be difficult about this. Solving a differential equation is a kind of integration, and we need to do that a lot. We could recast a solving-a-differential-equation problem as a find-the-area problem, and vice-versa. But that’s bother, if we don’t need to, and so we talk about numerical quadrature and numerical integration.

Integrals are built on two infinities. This is part of why it took so long to work out their logic. One is the infinity of number; we find an integral’s value, in principle, by adding together infinitely many things. The other is an infinity of smallness. The things we add together are infinitesimally small. That we need to take things, each smaller than any number yet somehow not zero, and in such quantity that they add up to something, seems paradoxical. Their geometric origins had to be merged into that of arithmetic, of algebra, and it is not easy. Bishop George Berkeley made a steady name for himself in calculus textbooks by pointing this out. We have worked out several logically consistent schemes for evaluating integrals. They work, mostly, by showing that we can make the error caused by approximating the integral smaller than any margin we like. This is a standard trick, or at least it is, now that we know it.

That “in principle” above is important. We don’t actually work out an integral by finding the sum of infinitely many, infinitely tiny, things. It’s too hard. I remember in grad school the analysis professor working out by the proper definitions the integral of 1. This is as easy an integral as you can do without just integrating zero. He escaped with his life, but it was a close scrape. He offered the integral of x as a way to test our endurance, without actually doing it. I’ve never made it through that.
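A machine can at least show what the professor was up against. This is a sketch, not the proper definition: it chops the interval from 0 to 1 into n equal pieces and adds up rectangles for the function x, using exact fractions. The choice of interval, the right-endpoint rectangles, and the name `riemann_sum_of_x` are all assumptions made for the illustration.

```python
from fractions import Fraction

def riemann_sum_of_x(n):
    """Right-endpoint sum for f(x) = x on [0, 1] with n equal pieces."""
    width = Fraction(1, n)
    return sum((k * width) * width for k in range(1, n + 1))

sums = {n: riemann_sum_of_x(n) for n in (1, 10, 100, 1000)}
gaps = {n: total - Fraction(1, 2) for n, total in sums.items()}

for n in sums:
    print(n, sums[n], gaps[n])
```

The sum works out to (n + 1) / (2n), so the gap from one-half is 1 / (2n). We can make that smaller than any margin we like by taking enough pieces, which is the heart of the logically consistent schemes. Proving it for every way of chopping up the interval is where the endurance comes in.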

But we do integrals anyway. We have tools on our side. We can show, for example, that if a function obeys some common rules then we can use simpler formulas. Ones that don’t demand so many symbols in such tight formation. Ones that we can use in high school. Also, ones we can adapt to numerical computing, so that we can let machines give us answers which are near enough right. We get to choose how near is “near enough”. But then the machines decide how long we’ll have to wait to get that answer.

The greatest tool we have on our side is the Fundamental Theorem of Calculus. Even the name promises it’s the greatest tool we might have. This rule tells us how to connect integrating a function to differentiating another function. If we can find a function whose derivative is the thing we want to integrate, then we have a formula for the integral. It’s that function we found. What a fantastic result.
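Here is a quick numerical sanity check of that, as a sketch. It integrates the cosine from 0 to π/2 two ways: by adding up ten thousand thin slices, and by using the sine, whose derivative is the cosine. The function, the interval, the number of slices, and the name `midpoint_sum` are all choices made for the illustration.

```python
import math

def midpoint_sum(f, a, b, pieces=10_000):
    """Add up thin rectangles, each sampled at the middle of its slice."""
    width = (b - a) / pieces
    return sum(f(a + (k + 0.5) * width) for k in range(pieces)) * width

a, b = 0.0, math.pi / 2

by_adding_slices = midpoint_sum(math.cos, a, b)
by_antiderivative = math.sin(b) - math.sin(a)   # sine's derivative is cosine

print(abs(by_adding_slices - by_antiderivative) < 1e-6)
```

The second way takes two function evaluations. The first takes ten thousand and is still only approximately right.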

The trouble is it’s so hard to find functions whose derivatives are the thing we wanted to integrate. There are a lot of functions we can find, mind you. If we want to integrate a polynomial it’s easy. Sine and cosine and even tangent? Yeah. Logarithms? A little tedious but all right. A constant number raised to the power x? Also tedious but doable. A constant number raised to the power x^2? Hold on there, that’s madness. No, we can’t do that.
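When there’s no formula to be had, numerical quadrature still gets an answer. This is a sketch using the trapezoid rule and a simple keep-doubling-the-pieces stopping rule. Both are my choices for the illustration, as are the function names and the tolerance. The integrand is 2 raised to the power x squared, over 0 to 1.

```python
def trapezoid(f, a, b, pieces):
    """Composite trapezoid rule with `pieces` equal slices."""
    width = (b - a) / pieces
    inner = sum(f(a + k * width) for k in range(1, pieces))
    return width * (0.5 * f(a) + inner + 0.5 * f(b))

def integrate_to_tolerance(f, a, b, tolerance=1e-8, max_doublings=20):
    """Double the number of slices until the estimate stops moving much."""
    pieces = 1
    estimate = trapezoid(f, a, b, pieces)
    for _ in range(max_doublings):
        pieces *= 2
        refined = trapezoid(f, a, b, pieces)
        if abs(refined - estimate) < tolerance:
            return refined, pieces
        estimate = refined
    # Gave up: this is the best estimate found, not a guaranteed one.
    return estimate, pieces

value, pieces_used = integrate_to_tolerance(lambda x: 2 ** (x * x), 0.0, 1.0)
print(value, pieces_used)
```

We pick the tolerance. The machine then decides how many slices, and so how much waiting, that costs. Note that “the estimate stopped moving” is a rule of thumb and not a proof that the error is that small.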

There is a weird grab-bag of functions we can find these integrals for. They’re mostly ones we can find some integration trick for. An integration trick is some way to turn the integral we’re interested in into a couple of integrals we can do and then mix back together. A lot of a Freshman Calculus course is a heap of tricks we’ve learned. They have names like “u-substitution” and “integration by parts” and “trigonometric substitution”. Some of them are really exotic, such as turning a single integral into a double integral because that leads us to something we can do. And there’s something called “differentiation under the integral sign” that I don’t know of anyone actually using. People know of it because Richard Feynman, in his fun memoir Surely You’re Joking, Mr. Feynman!: 250 Pages Of How Awesome I Was In Every Situation Ever, mentions how awesome it made him in so many situations. Mathematics, physics, and engineering nerds are required to read this at an impressionable age, so we fall in love with a technique no textbook ever mentions. Sorry.

I’ve written about all this as if we were interested just in areas. We’re not. We like calculating lengths and volumes and, if we dare venture into more dimensions, hypervolumes and the like. That’s all right. If we understand how to calculate areas, we have the tools we need. We can adapt them to as many or as few dimensions as we need. By weighting integrals we can do calculations that tell us about centers of mass and moments of inertia, about the most and least probable values of something, about all quantum mechanics.

As often happens, this powerful tool starts with something anyone might ponder: what size square has the same area as this other shape? And then think seriously about it.

The Summer 2017 Mathematics A To Z: Elliptic Curves


Gaurish, of the For The Love Of Mathematics blog, gives me another subject today. It’s one that isn’t about ellipses. Sad to say it’s also not about elliptic integrals. This is sad to me because I have a cute little anecdote about a time I accidentally gave my class an impossible problem. I did apologize. No, nobody solved it anyway.

Elliptic Curves.

Elliptic Curves start, of course, with polynomials. Particularly, they’re polynomials with two variables. We call them ‘x’ and ‘y’ because we have no reason to be difficult. They’re of at most third degree. That is, we can have terms like ‘x’ and ‘y^2’ and ‘x^2 y’ and ‘y^3’. Something with higher powers, like ‘x^4’ or ‘x^2 y^2’ (a fourth power, all together) is right out. Doesn’t matter. Start from this and we can do some slick changes of variables so that we can rewrite it to look like this:

y^2 = x^3 + Ax + B

Here, ‘A’ and ‘B’ are some numbers that don’t change for this particular curve. Also, we need it to be true that 4A^3 + 27B^2 doesn’t equal zero. It avoids problems. What we’ll be looking at are coordinates, values of ‘x’ and ‘y’ together which make this equation true. That is, it’s points on the curve. If you pick some real numbers ‘A’ and ‘B’ and draw all the values of ‘x’ and ‘y’ that make the equation true you get … well, there’s different shapes. They all look like those microscope photos of a water drop emerging and falling from a tap, only rotated clockwise ninety degrees.

So. Pick any of these curves that you like. Pick a point. I’m going to name your point ‘P’. Now pick a point once more. I’m going to name that point ‘Q’. Now draw a line from P through Q. Keep drawing it. It’ll cross the original elliptic curve again. And that point is … not actually special. What is special is the reflection of that point. That is, the same x-coordinate, but flip the plus or minus sign for the y-coordinate. (WARNING! Do not call it “the reflection” at your thesis defense! Call it the “conjugate” point. It means “reflection”.) Your elliptic curve will be symmetric around the x-axis. If, say, the point with x-coordinate 4 and y-coordinate 3 is on the curve, so is the point with x-coordinate 4 and y-coordinate -3. So that reflected point is … something special.

Kind of a curved-out less-than-sign shape.
y^2 = x^3 - 1 . The water drop bulges out from the surface.

This lets us do something wonderful. We can think of this reflected point as the sum of your ‘P’ and ‘Q’. You can ‘add’ any two points on the curve and get a third point. This means we can do something that looks like addition for points on the elliptic curve. And this means the points on this curve are a group, and we can bring all our group-theory knowledge to studying them. It’s a commutative group, too; ‘P’ added to ‘Q’ leads to the same point as ‘Q’ added to ‘P’.

Let me head off some clever thoughts that make fair objections. What if ‘P’ and ‘Q’ are already reflections, so the line between them is vertical? That never touches the original elliptic curve again, right? Yeah, fair complaint. We patch this by saying that there’s one more point, ‘O’, that’s off “at infinity”. Where is infinity? It’s wherever your vertical lines end. Shut up, this can too be made rigorous. In any case it’s a common hack for this sort of problem. When we add that, everything’s nice. The ‘O’ serves the role in this group that zero serves in arithmetic: the sum of point ‘O’ and any point ‘P’ is going to be ‘P’ again.

Second clever thought to head off: what if ‘P’ and ‘Q’ are the same point? There’s infinitely many lines that go through a single point so how do we pick one to find an intersection with the elliptic curve? Huh? If you did that, then we pick the tangent line to the elliptic curve that touches ‘P’, and carry on as before.
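All of this fits in a short function. What follows is a sketch using the standard chord-and-tangent formulas, with exact fractions, on the curve y^2 = x^3 + 1 (so A is 0 and B is 1). The name `add_points`, the use of `None` for the point ‘O’, and the sample points are my choices for the illustration. The function trusts you to hand it points that are actually on the curve.

```python
from fractions import Fraction

O = None  # stands in for the extra point "at infinity"

def add_points(P, Q, A):
    """'Add' two points on y^2 = x^3 + A x + B. (B never enters the formulas.)"""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = (Fraction(t) for t in P)
    x2, y2 = (Fraction(t) for t in Q)
    if x1 == x2 and y1 == -y2:
        return O                                  # vertical line: the sum is O
    if x1 == x2:
        slope = (3 * x1 * x1 + A) / (2 * y1)      # same point: use the tangent
    else:
        slope = (y2 - y1) / (x2 - x1)             # line from P through Q
    x3 = slope * slope - x1 - x2                  # where the line meets the curve again
    y3 = slope * (x1 - x3) - y1                   # ... and then reflect it
    return (x3, y3)

A = 0
P = (2, 3)    # 3^2 = 2^3 + 1
Q = (0, 1)    # 1^2 = 0^3 + 1

p_plus_q = add_points(P, Q, A)
q_plus_p = add_points(Q, P, A)
p_plus_p = add_points(P, P, A)
p_plus_reflection = add_points(P, (2, -3), A)
print(p_plus_q, p_plus_p, p_plus_reflection)
```

Here P plus Q lands on the point with x-coordinate -1 and y-coordinate 0, in either order. P plus itself is Q, and P plus its own reflection is ‘O’.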

The curved-out less-than-sign shape has a noticeable c-shaped bulge on the end.
y^2 = x^3 + 1 . The water drop is close to breaking off, but surface tension has not yet pinched off the falling form.

There’s more. What kind of number is ‘x’? Or ‘y’? I’ll bet that you figured they were real numbers. You know, ordinary stuff. I didn’t say what they were, so left it to our instinct, and that usually runs toward real numbers. Those are what I meant, yes. But we didn’t have to. ‘x’ and ‘y’ could be in other sets of numbers too. They could be complex-valued numbers. They could be just the rational numbers. They could even be part of a finite collection of possible numbers. As long as the equation y^2 = x^3 + Ax + B is something meaningful (and some technical points are met) we can carry on. The elliptic curves, and the points we “add” on them, might not look like the curves we started with anymore. They might not look like anything recognizable anymore. But the logic continues to hold. We still create these groups out of the points on these lines intersecting a curve.
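To see what a curve looks like over a finite collection of numbers, here is a sketch that simply tries every pair. It uses the whole numbers 0 through 4 with arithmetic done modulo 5, and the curve y^2 = x^3 + 1. Both are toy choices of mine, and `points_mod_p` is a made-up name.

```python
def points_mod_p(A, B, p):
    """Every (x, y) with y^2 = x^3 + A x + B, doing arithmetic modulo the prime p."""
    # The same condition as before, 4A^3 + 27B^2 not zero, now read modulo p.
    assert (4 * A**3 + 27 * B**2) % p != 0
    return [(x, y)
            for x in range(p)
            for y in range(p)
            if (y * y - (x**3 + A * x + B)) % p == 0]

points = points_mod_p(0, 1, 5)
print(points)
```

That gives five scattered points, which with the extra point ‘O’ makes a group of six elements. It looks nothing like a water drop, but the adding still works.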

By now you probably admit this is neat stuff. You may also think: so what? We can take this thing you never thought about, draw points and lines on it, and make it look very loosely kind of like just adding numbers together. Why is this interesting? No appreciation just for the beauty of the structure involved? Well, we live in a fallen world.

It comes back to number theory. The modern study of Diophantine equations grows out of studying elliptic curves on the rational numbers. It turns out the group of points you get for that looks like a finite collection of points with some collection of integers hanging on. How long that collection of numbers is is called the ‘rank’, and there are deep mysteries at work. We know there are elliptic equations that have a rank as big as 28. Nobody knows if the rank can be arbitrarily high, though. And I believe we don’t even know if there are any curves with rank of, like, 27, or 25.

Yeah, I’m still sensing skepticism out there. Fine. We’ll go back to the only part of number theory everybody agrees is useful. Encryption. We have roughly the same goals for every encryption scheme. We want it to be easy to encode a message. We want it to be easy to decode the message if you have the key. We want it to be hard to decode the message if you don’t have the key.

The curved-out sign has a bulge with convex loops to it, so that it resembles the cut of a jigsaw puzzle piece.
y^2 = 3x^2 - 3x + 3 . The water drop is almost large enough that its weight overcomes the surface tension holding it to the main body of water.

Take something inside one of these elliptic curve groups. Especially one that’s got a finite field. Let me call your thing ‘g’. It’s really easy for you, knowing what ‘g’ is and what your field is, to raise it to a power. You can pretty well impress me by sharing the value of ‘g’ raised to some whole number ‘m’. Call that ‘h’.

Why am I impressed? Because even if I know both ‘g’ and ‘h’, I have a heck of a time figuring out what ‘m’ is. Especially on these finite field groups there’s no obvious connection between how big ‘h’ is and how big ‘g’ is and how big ‘m’ is. Start with a big enough finite field and you can encode messages in ways that are crazy hard to crack.
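The lopsidedness can be sketched on a toy curve. I’m assuming the curve y^2 = x^3 + 2x + 2 with arithmetic modulo 17, the starting point (5, 1), and the secret number 7. None of those come from anything above and all are far too small for real use. “Raising to a power” here means adding the point to itself over and over. The function names are made up, and `pow(value, -1, p)` for modular inverses needs Python 3.8 or later.

```python
O = None  # the point "at infinity"

def add_points(P, Q, A, p):
    """Add two points on y^2 = x^3 + A x + B with arithmetic modulo the prime p."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if x1 == x2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % p, -1, p) % p
    else:
        slope = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (slope * slope - x1 - x2) % p
    y3 = (slope * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_multiply(m, P, A, p):
    """The easy direction: combine P with itself m times by double-and-add."""
    result, addend = O, P
    while m > 0:
        if m & 1:
            result = add_points(result, addend, A, p)
        addend = add_points(addend, addend, A, p)
        m >>= 1
    return result

def brute_force_log(g, h, A, p, limit):
    """The hard direction, done the dumb way: try m = 1, 2, 3, ... in turn."""
    current = O
    for m in range(1, limit + 1):
        current = add_points(current, g, A, p)
        if current == h:
            return m
    return None

A, p = 2, 17
g = (5, 1)
h = scalar_multiply(7, g, A, p)          # a handful of doublings and additions
recovered = brute_force_log(g, h, A, p, limit=19)
print(h, recovered)
```

Double-and-add needs a number of steps that grows like the count of digits in ‘m’. The brute-force search needs a number of steps that grows like ‘m’ itself. On a group this small the search is instant. On a group with an astronomically large number of points, it isn’t.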

We trust. At least, if there are any ways to break the code quickly, nobody’s shared them. And there’s one of those enormous-money-prize awards waiting for someone who does know how to break such a code quickly. (I don’t know which. I’m going by what I expect from people.)

And then there’s fame. These were used to prove Fermat’s Last Theorem. Suppose there are some non-boring numbers ‘a’, ‘b’, and ‘c’, so that for some prime number ‘p’ that’s five or larger, it’s true that a^p + b^p = c^p . (We can separately prove Fermat’s Last Theorem for a power that isn’t a prime number, or a power that’s 3 or 4.) Then this implies properties about the elliptic curve:

y^2 = x(x - a^p)(x + b^p)

This is a convenient way of writing things since it showcases the a^p and b^p. It’s equal to:

y^2 = x^3 + \left(b^p - a^p\right)x^2 - a^p b^p x

(I was so tempted to leave an arithmetic error in there so I could make sure someone commented.)

A little ball off to the side of a curved-out less-than-sign shape.
y^2 = 3x^3 - 4x . The water drop has broken off, and the remaining surface rebounds to its normal meniscus.

If there’s a solution to Fermat’s Last Theorem, then this elliptic equation can’t be modular. I don’t have enough words to explain what ‘modular’ means here. Andrew Wiles and Richard Taylor showed that the equation was modular. So there is no solution to Fermat’s Last Theorem except the boring ones. (Like, where ‘b’ is zero and ‘a’ and ‘c’ equal each other.) And it all comes from looking close at these neat curves, none of which looks like an ellipse.

They’re named elliptic curves because we first noticed them when Carl Jacobi (yes, that Carl Jacobi) was studying the length of arcs of an ellipse. That’s interesting enough on its own. But it is hard. Maybe I could have fit in that anecdote about giving my class an impossible problem after all.

Reading the Comics, August 5, 2017: Lazy Summer Week Edition


It wasn’t like the week wasn’t busy. Comic Strip Master Command sent out as many mathematically-themed comics as I might be able to use. But they were again ones that don’t leave me much to talk about. I’ll try anyway. It was looking like an anthropomorphic-symbols sort of week, too.

Tom Thaves’s Frank and Ernest for the 30th of July is an anthropomorphic-symbols joke. The tick marks used for counting make an appearance and isn’t that enough? Maybe.

Dan Thompson’s Brevity for the 31st is another entry in the anthropomorphic-symbols joke contest. This one sticks to mathematical symbols, so if the Frank and Ernest makes the cut this week so must this one.

Eric the Circle for the 31st, this installment by “T daug”, gives the slightly anthropomorphic geometric figure a joke that at least mentions a radius, and isn’t that enough? What catches my imagination about this panel particularly is that the “fractured radius” is not just a legitimate pun but also resembles a legitimate geometry drawing. Drawing a diameter line is sensible enough. Drawing some other point on the circle and connecting that to the ends of the diameter is also something we might do.

Scott Hilburn’s The Argyle Sweater for the 1st of August is one of the logical mathematics jokes you could make about snakes. The more canonical one runs like this: God in the Garden of Eden makes all the animals and bids them to be fruitful. And God inspects them all and finds rabbits and doves and oxen and fish and fowl all growing in number. All but a pair of snakes. God asks why they haven’t bred and they say they can’t, not without help. What help? They need some thick tree branches chopped down. The bemused God grants them this. God checks back in some time later and finds an abundance of baby snakes in the Garden. But why the delay? “We’re adders,” explain the snakes, “so we need logs to multiply”. This joke absolutely killed them in the mathematics library up to about 1978. I’m told.

John Deering’s Strange Brew for the 1st is a monkeys-at-typewriters joke. It faintly reminds me that I might have pledged to retire mentions of the monkeys-at-typewriters joke. But I don’t remember so I’ll just have to depend on saying I don’t think I retired the monkeys-at-typewriters jokes and trust that someone will tell me if I’m wrong.

Dana Simpson’s Ozy and Millie rerun for the 2nd name-drops multiplication tables as the sort of thing a nerd child wants to know. They may have fit the available word balloon space better than “know how to diagram sentences” would.

Mark Anderson’s Andertoons for the 3rd is the reassuringly normal appearance of Andertoons for this week. It is a geometry class joke about rays, line segments with one point where there’s an end and … a direction where it just doesn’t. And it riffs on the notion of the existence of mathematical things. At least I can see it that way.

Dad: 'How many library books have you read this summer, Hammie?' Hammie: 'About 47.' Zoe: 'HA!' Dad: 'Hammie ... ' Hammie: 'Okay ... two.' Dad: 'Then why did you say 47?' Hammie: 'I was rounding up.' Zoe: 'NOW he understands math!'
Rick Kirkman and Jerry Scott’s Baby Blues for the 5th of August, 2017. Hammie totally blew it by saying “about forty-seven”. Too specific a number to be a plausible lie. “About forty” or “About fifty”, something you can see as the result of rounding off, yes. He needs to know there are rules about how to cheat.

Rick Kirkman and Jerry Scott’s Baby Blues for the 5th is a rounding-up joke that isn’t about herds of 198 cattle.

Stephen Bentley’s Herb and Jamaal for the 5th tosses off a mention of the New Math as something well out of fashion. There are fashions in mathematics, as in all human endeavors. It startles many to learn this.

The Summer 2017 Mathematics A To Z: Cohomology


Today’s A To Z topic is another request from Gaurish, of the For The Love Of Mathematics blog. Also part of what looks like a quest to make me become a topology blogger, at least for a little while. It’s going to be exciting and I hope not to faceplant as I try this.

Also, a note about Thomas K Dye, who’s drawn the banner art for this and for the Why Stuff Can Orbit series: the publisher for collections of his comic strip is having a sale this weekend.

Cohomology.

The word looks intimidating, and faintly of technobabble. It’s less cryptic than it appears. We see parts of it in non-mathematical contexts. In biology class we would see “homology”, the sharing of structure in body parts that look superficially very different. We also see it in art class. The instructor points out that a dog’s leg looks like that because they stand on their toes. What looks like a backward-facing knee is just the ankle, and if we stand on our toes we see that in ourselves. We might see it in chemistry, as many interesting organic compounds differ only in how long or how numerous the boring parts are. The stuff that does work is the same, or close to the same. And this is a hint to what a mathematician means by cohomology. It’s something in shapes. It’s particularly something in how different things might have similar shapes. Yes, I am using a homology in language here.

I often talk casually about the “shape” of mathematical things. Or their “structures”. This sounds weird and abstract to start and never really gets better. We can get some footing if we think about drawing the thing we’re talking about. Could we represent the thing we’re working on as a figure? Often we can. Maybe we can draw a polygon, with the vertices of the shape matching the pieces of our mathematical thing. We get the structure of our thing from thinking about what we can do to that polygon without changing the way it looks. Or without changing the way we can do whatever our original mathematical thing does.

This leads us to homologies. We get them by looking for stuff that’s true even if we moosh up the original thing. The classic homology comes from polyhedrons, three-dimensional shapes. There’s a relationship between the number of vertices, the number of edges, and the number of faces of a polyhedron. It doesn’t change even if you stretch the shape out longer, or squish it down, or for that matter slice off a corner. It only changes if you punch a new hole through the middle of it. Or if you plug one up. That would be unsporting. A homology describes something about the structure of a mathematical thing. It might even be literal. Topology, the study of what we know about shapes without bringing distance into it, has the number of holes that go through a thing as a homology. This gets feeling like a comfortable, familiar idea now.
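The relationship for polyhedrons is usually written as vertices minus edges plus faces, and for a shape with no holes it comes out to 2. Here is a small sketch that checks it. The counts are the standard ones for each shape, the function name `euler_characteristic` is my own choice for illustration, and the last entry is a torus tiled by a 3-by-3 grid of quadrilaterals, standing in for a shape with one hole punched through it.

```python
# Vertex, edge, and face counts (V, E, F) for a few shapes.
shapes = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    # Slicing one corner off a cube removes 1 vertex and adds 3 vertices,
    # 3 edges, and 1 triangular face.
    "cube with one corner sliced off": (8 - 1 + 3, 12 + 3, 6 + 1),
    # A torus tiled by a 3-by-3 grid of quadrilaterals: one hole through it.
    "3-by-3 quadrilateral torus": (9, 18, 9),
}


def euler_characteristic(vertices, edges, faces):
    """Return V - E + F."""
    return vertices - edges + faces


results = {name: euler_characteristic(*counts) for name, counts in shapes.items()}
for name, value in results.items():
    print(name, value)
```

Stretching or squishing never changes any of the counts, and slicing a corner changes all three in a way that cancels. Only the shape with the hole gives a different number.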

But that isn’t a cohomology. That ‘co’ prefix looks dangerous. At least it looks significant. When the ‘co’ prefix has turned up before it’s meant something is shaped by how it refers to something else. Coordinates aren’t just number lines; they’re collections of number lines that we can use to say where things are. If ‘a’ is a factor of the number ‘x’, its cofactor is the number you multiply ‘a’ by in order to get ‘x’. (For real numbers that’s just x divided by a. For other stuff it might be weirder.) A codomain is a set that a function maps a domain into (and must contain the range, at least). Cosets aren’t just sets; they’re ways we can divide (for example) the counting numbers into odds and evens.

So what’s the ‘co’ part for a homology? I’m sad to say we start losing that comfortable feeling now. We have to look at something we’re used to thinking of as a process as though it were a thing. These things are morphisms: what are the ways we can match one mathematical structure to another? Sometimes the morphisms are easy. We can match the even numbers up with all the integers: match 0 with 0, match 2 with 1, match -6 with -3, and so on. Addition on the even numbers matches with addition on the integers: 4 plus 6 is 10; 2 plus 3 is 5. For that matter, we can match the integers with the multiples of three: match 1 with 3, match -1 with -3, match 5 with 15. 1 plus -2 is -1; 3 plus -6 is -3.

What happens if we look at the sets of matchings that we can do as if that were a set of things? That is, not some human concept like ‘2’ but rather ‘match a number with one-half its value’? And ‘match a number with three times its value’? These can be the population of a new set of things.

And these things can interact. Suppose we “match a number with one-half its value” and then immediately “match a number with three times its value”. Can we do that? … Sure, easily. 4 matches to 2 which goes on to 6. 8 matches to 4 which goes on to 12. Can we write that as a single matching? Again, sure. 4 matches to 6. 8 matches to 12. -2 matches to -3. We can write this as “match a number with three-halves its value”. We’ve taken “match a number with one-half its value” and combined it with “match a number with three times its value”. And it’s given us the new “match a number with three-halves its value”. These things we can do to the integers are themselves things that can interact.
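If it helps to see the matchings treated as things, here is a minimal sketch in Python. The helper names `scale_by` and `compose` are mine, made up for this illustration. It uses exact fractions so that halving never loses anything to rounding.

```python
from fractions import Fraction


def scale_by(factor):
    """Build the matching 'match a number with `factor` times its value'."""
    factor = Fraction(factor)
    return lambda x: factor * x


def compose(first, second):
    """Build the single matching that does `first` and then `second`."""
    return lambda x: second(first(x))


halve = scale_by(Fraction(1, 2))
triple = scale_by(3)

# Combining two matchings gives a new matching...
halve_then_triple = compose(halve, triple)
# ...which should agree with 'match a number with three-halves its value'.
three_halves = scale_by(Fraction(3, 2))

for x in (4, 8, -2):
    print(x, "matches to", halve_then_triple(x), "and directly to", three_halves(x))
```

The point of the sketch is that `halve` and `triple` get passed around and combined just like numbers would be.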

This is a good moment to pause and let the dizziness pass.

It isn’t just you. There is something weird thinking of “doing stuff to a set” as a thing. And we have to get a touch more abstract than even this. We should be all right, but please do not try to use this to defend your thesis in category theory. Just use it to not look forlorn when talking to your friend who’s defending her thesis in category theory.

Now, we can take this collection of all the ways we can relate one set of things to another. And we can combine this with an operation that works kind of like addition. Some way to “add” one way-to-match-things to another and get a way-to-match-things. There’s also something that works kind of like multiplication. It’s a different way to combine these ways-to-match-things. This forms a ring, which is a kind of structure that mathematicians learn about in Introduction to Not That Kind Of Algebra. There are many constructs that are rings. The integers, for example, are also a ring, with addition and multiplication the same old processes we’ve always used.

And just as we can sort the integers into odds and evens — or into other groupings, like “multiples of three” and “one plus a multiple of three” and “two plus a multiple of three” — so we can sort the ways-to-match-things into new collections. And this is our cohomology. It’s the ways we can sort and classify the different ways to manipulate whatever we started on.

I apologize that this sounds so abstract as to barely exist. I admit we’re far from a nice solid example such as “six”. But the abstractness is what gives cohomologies explanatory power. We depend very little on the specifics of what we might talk about. And therefore what we can prove is true for very many things. It takes a while to get there, is all.

Why Stuff Can Orbit, Part 10: Where Time Comes From And How It Changes Things


Why Stuff Can Orbit, featuring a dazed-looking coati (it's a raccoon-like creature from Latin America) and a starry background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work.

And again my thanks to Thomas K Dye, creator of the web comic Newshounds, for the banner art. He has a Patreon to support his creative habit.

In the last installment I introduced perturbations. These are orbits that are a little off from the circles that make equilibriums. And they introduce something that’s been lurking, unnoticed, in all the work done before. That’s time.

See, how do we know time exists? … Well, we feel it, so, it’s hard for us not to notice time exists. Let me rephrase it then, and put it in contemporary technology terms. Suppose you’re looking at an animated GIF. How do you know it’s started animating? Or that it hasn’t stalled out on some frame?

If the picture changes, then you know. It has to be going. But if it doesn’t change? … Maybe it’s stalled out. Maybe it hasn’t. You don’t know. You know there’s time when you can see change. And that’s one of the little practical insights of physics. You can build an understanding of special relativity by thinking hard about that. Also think about the observation that the speed of light (in vacuum) doesn’t change.

When something physical’s in equilibrium, it isn’t changing. That’s how we found equilibriums to start with. And that means we stop keeping track of time. It’s one more thing to keep track of that doesn’t tell us anything new. Who needs it?

For the planet orbiting a sun, in a perfect circle, or its other little variations, we do still need time. At least some. How far the planet is from the sun doesn’t change, no, but where it is on the orbit will change. We can track where it is by setting some reference point. Where the planet is at the start of our problem. How big is the angle between where the planet is now, the sun (the center of our problem’s universe), and that origin point? That will change over time.

But it’ll change in a boring way. The angle will keep increasing in magnitude at a constant speed. Suppose it takes five time units for the angle to grow from zero degrees to ten degrees. Then it’ll take ten time units for the angle to grow from zero to twenty degrees. It’ll take twenty time units for the angle to grow from zero to forty degrees. Nice to know if you want to know when the planet is going to be at a particular spot, and how long it’ll take to get back to the same spot. At this rate it’ll be 180 time units before the angle grows to 360 degrees, which looks the same as zero degrees. But it’s not anything interesting happening.
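The arithmetic is simple enough to check by hand, but here is a tiny sketch anyway. It takes the rate from the example above, ten degrees in five time units. The function name `time_to_reach` is just something I made up for this.

```python
def time_to_reach(angle_degrees, rate_degrees_per_time_unit):
    """Time needed for a steadily growing angle to reach `angle_degrees`."""
    return angle_degrees / rate_degrees_per_time_unit


rate = 10 / 5  # ten degrees every five time units

times = {angle: time_to_reach(angle, rate) for angle in (10, 20, 40, 360)}
for angle, t in times.items():
    print(angle, "degrees after", t, "time units")
```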

We’ll label this sort of change, where time passes, yeah, but it’s too dull to notice as a “dynamic equilibrium”. There’s change, but it’s so steady and predictable it’s not all that exciting. And I’d set up the circular orbits so that we didn’t even have to notice it. If the radius of the planet’s orbit doesn’t change, then the rate at which its apsidal angle changes, its “angular velocity”, also doesn’t change.

Now, with perturbations, the distance between the planet and the center of the universe will change in time. That was the stuff at the end of the last installment. But also the apsidal angle is going to change. I’ve used ‘r(t)’ to represent the radial distance between the planet and the sun before, and to note that what value it is depends on the time. I need some more symbols.

There’s two popular symbols to use for angles. Both are Greek letters because, I dunno, they’ve always been. (Florian Cajori’s A History of Mathematical Notation doesn’t seem to have anything. And when my default go-to for explaining mathematicians’ choices tells me nothing, what can I do? Look at Wikipedia? Sure, but that doesn’t enlighten me either.) One is to use theta, θ. The other is to use phi, φ. Both are good, popular choices, and in three-dimensional problems we’ll often need both. Here we don’t need both. The orbit of something moving under a central force might be complicated, but it’s going to be in a single plane of movement. The conservation of angular momentum gives us that. It’s not the last thing angular momentum will give us. The orbit might happen not to be in a horizontal plane. But that’s all right. We can tilt our heads until it is.

So I’ll reach deep into the universe of symbols for angles and call on θ for the apsidal angle. θ will change with time, so, ‘θ(t)’ is the angular counterpart to ‘r(t)’.

I’d said before the apsidal angle is the angle made between the planet, the center of the universe, and some reference point. What is my reference point? I dunno. It’s wherever θ(0) is, that is, where the planet is when my time ‘t’ is zero. There’s probably a bootstrapping fallacy here. I’ll cover it up by saying, you know, the reference point doesn’t matter. It’s like the choice of prime meridian. We have to have one, but we can pick whatever one is convenient. So why not pick one that gives us the nice little identity that ‘θ(0) = 0’? If you don’t buy that and insist I pick a reference point first, fine, go ahead. But you know what? The labels on my time axis are arbitrary. There’s no difference in the way physics works whether ‘t’ is ‘0’ or ‘2017’ or ‘21350’. (At least as long as I adjust any time-dependent forces, which there aren’t here.) So we get back to ‘θ(0) = 0’.

For a circular orbit, the dynamic equilibrium case, these are pretty boring, but at least they’re easy to write. They’re:

r(t) = a

\theta(t) = \omega t

Here ‘a’ is the radius of the circular orbit. And ω is a constant number, the angular velocity. It’s how much a bit of time changes the apsidal angle. And this set of equations is pretty dull. You can see why it barely rates a mention.

The perturbed case gets more interesting. We know how ‘r(t)’ looks. We worked that out last time. It’s some function like:

r(t) = a + A \cos\left(\sqrt{\frac{k}{m}} t\right) + B \sin\left(\sqrt{\frac{k}{m}} t\right)

Here ‘A’ and ‘B’ are some numbers telling us how big the perturbation is, and ‘m’ is the mass of the planet, and ‘k’ is something related to how strong the central force is. And ‘a’ is that radius of the circular orbit, the thing we’re perturbed around.

What about ‘θ(t)’? How’s that look? … We don’t seem to have a lot to go on. We could go back to Newton and all that force equalling the change in momentum over time stuff. We can always do that. It’s tedious, though. We have something better. It’s another gift from the conservation of angular momentum. When we can turn a forces-over-time problem into a conservation-of-something problem we’re usually doing the right thing. The conservation-of-something is typically a lot easier to set up and to track. We’ve used it in the conservation of energy, before, and we’ll use it again. The conservation of ordinary, ‘linear’, momentum helps other problems, though not, I’ll grant, this one. The conservation of angular momentum will help us here.

So what is angular momentum? … It’s something about ice skaters twirling around and your high school physics teacher sitting on a bar stool spinning a bike wheel. All right. But it’s also a quantity. We can get some idea of it by looking at the formula for calculating linear momentum:

\vec{p} = m\vec{v}

The linear momentum of a thing is its inertia times its velocity. This is if the thing isn’t moving fast enough we have to notice relativity. Also if it isn’t, like, an electric or a magnetic field so we have to notice it’s not precisely a thing. Also if it isn’t a massless particle like a photon because see previous sentence. I’m talking about ordinary things like planets and blocks of wood on springs and stuff. The inertia, ‘m’, is rather happily the same thing as its mass. The velocity is how fast something is travelling and which direction it’s going in.

Angular momentum, meanwhile, we calculate with this radically different-looking formula:

\vec{L} = I\vec{\omega}

Here, again, we’re talking about stuff that isn’t moving so fast we have to notice relativity. That isn’t electric or magnetic fields. That isn’t massless particles. And so on. Here ‘I’ is the “moment of inertia” and \vec{\omega} is the angular velocity. The angular velocity is a vector that describes for us how fast the spinning is and what direction the axis around which the thing spins is. The moment of inertia describes how easy or hard it is to make the thing spin around each axis. It’s a tensor because real stuff can be easier to spin in some directions than in others. If you’re not sure that’s actually so, try tossing some stuff in the air so it spins in each of the three major directions. You’ll see.

We’re fortunate. For central force problems the moment of inertia is easy to calculate. We don’t need the tensor stuff. And we don’t even need to notice that the angular velocity is a vector. We know what axis the planet’s rotating around; it’s the one pointing out of the plane of motion. We can focus on the size of the angular velocity, the number ‘ω’. See how they’re different, what with one not having an arrow over the symbol. The arrow-less version is easier. For a planet, or other object, with mass ‘m’ that’s orbiting a distance ‘r’ from the sun, the moment of inertia is:

I = mr^2

So we know this number is going to be constant:

L = mr^2\omega

The mass ‘m’ doesn’t change. We’re not doing those kinds of problem. So however ‘r’ changes in time, the angular velocity ‘ω’ has to change with it, so that this product stays constant. The angular velocity is how the apsidal angle ‘θ’ changes over time. So since we know ‘L’ doesn’t change, and ‘m’ doesn’t change, then the way ‘r’ changes must tell us something about how ‘θ’ changes. We’ll get into that next time.
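Before next time, here is a rough sketch of what that constant product already implies. All the numbers are made up for illustration: a mass of 1, a circular orbit of radius 1, and an angular velocity of 2 on that circular orbit. The function name `angular_velocity` is mine too.

```python
def angular_velocity(L, m, r):
    """Solve L = m r^2 omega for omega."""
    return L / (m * r**2)


# Hypothetical values, chosen only to have something to compute with.
m = 1.0
a = 1.0
omega_circular = 2.0
L = m * a**2 * omega_circular  # the angular momentum, fixed from here on

omegas = {r: angular_velocity(L, m, r) for r in (0.9, 1.0, 1.1)}
for r, omega in omegas.items():
    # The last column is m r^2 omega, which should always come back to L.
    print(r, omega, m * r**2 * omega)
```

When the planet is closer in than the circular orbit it has to sweep around faster, and when it is farther out it has to sweep around slower.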

Great Stuff By David Hilbert That I’ll Never Finish Reading


And then this came across my Twitter feed (@Nebusj, for the record):

It is to Project Gutenberg’s edition of David Hilbert’s The Foundations Of Geometry. David Hilbert you may know as the guy who gave us 20th Century mathematics. He had help. But he worked hard on the axiomatizing of mathematics, getting rid of intuition and relying on nothing but logical deduction for all mathematical results. “Didn’t we do that already, like, with the Ancient Greeks and all?” you may ask. We aimed for that since the Ancient Greeks, yes, but it’s really hard to do. The Foundations Of Geometry is an example of Hilbert’s work of looking very critically at all of the things we assume, and all of the things that we need, and all of the things we need defined, and trying to get at it all.

Hilbert gave much of 20th Century Mathematics its shape with a list presented at the 1900 International Congress of Mathematicians in Paris. This formed a great list of important unsolved problems. Some of them have been solved since. Some are still unsolved. Some have been proven unsolvable. Each of these results is very interesting. This tells you something about how great his questions were; only a great question is interesting however it turns out.

The Project Gutenberg edition of The Foundations Of Geometry is, mercifully, not a stitched-together PDF version of an ancient library copy. It’s a PDF compiled by, if I’m reading the credits correctly, Joshua Hutchinson, Roger Frank, and David Starner. The text was copied into LaTeX, an incredibly powerful and standard mathematics-writing tool, and compiled into something that … looks a little bit like every mathematics paper and thesis you’ll read these days. It’s a bit odd for a 120-year-old text to look quite like that. But it does mean the formatting looks familiar, if you’re the sort of person who reads mathematics regularly.

(There are a couple lines that read weird to me, but I can’t judge whether that owes to a typo in the preparation of the document or just that the translation from Hilbert’s original German to English produced odd effects. I’m thinking here of Axiom I, 2, shown on page 2, which I understand but feel weird about. Roll with it.)

Why Stuff Can Orbit, Part 9: How The Spring In The Cosmos Behaves


Why Stuff Can Orbit, featuring a dazed-looking coati (it's a raccoon-like creature from Latin America) and a starry background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work.

First, I thank Thomas K Dye for the banner art I have for this feature! Thomas is the creator of the longrunning web comic Newshounds. He’s hoping soon to finish up special editions of some of the strip’s stories and to publish a definitive edition of the comic’s history. He’s also got a Patreon account to support his art habit. Please give his creations some of your time and attention.

Now back to central forces. I’ve run out of obvious fun stuff to say about a mass that’s in a circular orbit around the center of the universe. Before you question my sense of fun, remember that I own multiple pop histories about the containerized cargo industry and last month I read another one that’s changed my mind about some things. These sorts of problems cover a lot of stuff. They cover planets orbiting a sun and blocks of wood connected to springs. That’s about all we do in high school physics anyway. Well, there’s spheres colliding, but there’s no making a central force problem out of those. You can also make some things that look like bad quantum mechanics models out of that. The mathematics is interesting even if the results don’t match anything in the real world.

But I’m sticking with central forces that look like powers. These have potential energy functions with rules that look like V(r) = C r^n. So far, ‘n’ can be any real number. It turns out ‘n’ has to be larger than -2 for a circular orbit to be stable, but that’s all right. There are lots of numbers larger than -2. ‘n’ carries the connotation of being an integer, a whole (positive or negative) number. But if we want to let it be any old real number like 0.1 or π or 18 and three-sevenths that’s fine. We make a note of that fact and remember it right up to the point we stop pretending to care about non-integer powers. I estimate that’s like two entries off.

We get a circular orbit by setting the thing that orbits in … a circle. This sounded smarter before I wrote it out like that. Well. We set it moving perpendicular to the “radial direction”, which is the line going from wherever it is straight to the center of the universe. This perpendicular motion means there’s a non-zero angular momentum, which we write as ‘L’ for some reason. For each angular momentum there’s a particular radius that allows for a circular orbit. Which radius? It’s whatever one is a minimum for the effective potential energy:

V_{eff}(r) = Cr^n + \frac{L^2}{2m}r^{-2}

This we can find by taking the first derivative of ‘V_{eff}’ with respect to ‘r’ and finding where that first derivative is zero. This is standard mathematics stuff, quite routine. We can do this with any function whether it represents something physical or not. So:

\frac{dV_{eff}}{dr} = nCr^{n-1} - 2\frac{L^2}{2m}r^{-3} = 0

And after some work, this gets us to the circular orbit’s radius:

r = \left(\frac{L^2}{nCm}\right)^{\frac{1}{n + 2}}
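Here is a quick numerical sanity check of that radius, as a sketch. The values of ‘n’, ‘C’, ‘L’, and ‘m’ are made up, chosen only so that ‘n’ times ‘C’ is positive and the radius is a real number. The function names are mine.

```python
def v_eff(r, n, C, L, m):
    """Effective potential energy: C r^n + (L^2 / 2m) r^-2."""
    return C * r**n + L**2 / (2.0 * m) * r**-2


def dv_eff_dr(r, n, C, L, m):
    """First derivative of the effective potential with respect to r."""
    return n * C * r**(n - 1) - L**2 / m * r**-3


def circular_radius(n, C, L, m):
    """Radius at which the first derivative is zero."""
    return (L**2 / (n * C * m)) ** (1.0 / (n + 2))


# Hypothetical (n, C) pairs; L and m are also just illustrative numbers.
L, m = 1.2, 2.0
checks = []
for n, C in ((-1, -1.0), (1, 1.0), (2, 0.5)):
    r = circular_radius(n, C, L, m)
    slope = dv_eff_dr(r, n, C, L, m)
    # Compare the potential at r with the potential a little to either side.
    is_minimum = (v_eff(r, n, C, L, m) < v_eff(0.99 * r, n, C, L, m)
                  and v_eff(r, n, C, L, m) < v_eff(1.01 * r, n, C, L, m))
    checks.append((n, r, slope, is_minimum))
    print(n, r, slope, is_minimum)
```

The slope should print as zero, give or take floating-point fuzz, and the radius should be a minimum of the effective potential for each of these powers.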

What I’d like to talk about is if we’re not quite at that radius. If we set the planet (or whatever) a little bit farther from the center of the universe. Or a little closer. Same angular momentum though, so the equilibrium, the circular orbit, should be in the same spot. It happens there isn’t a planet there.

This enters us into the world of perturbations, which is where most of the big money in mathematical physics is. A perturbation is a little nudge away from an equilibrium. What happens in response to the little nudge is interesting stuff. And here we already know, qualitatively, what’s going to happen: the planet is going to rock around the equilibrium. This is because the circular orbit is a stable equilibrium. I’d described that qualitatively last time. So now I want to talk quantitatively about how the perturbation changes given time.

Before I get there I need to introduce another bit of notation. It is so convenient to be able to talk about the radius of the circular orbit that would be the equilibrium. I’d called that ‘r’ up above. But I also need to be able to talk about how far the perturbed planet is from the center of the universe. That’s also really hard not to call ‘r’. Something has to give. Since the radius of the circular orbit is not going to change I’m going to give that a new name. I’ll call it ‘a’. There’s several reasons for this. One is that ‘a’ is commonly used for describing the size of ellipses, which turn up in actual real-world planetary orbits. That’s something we know because this is like the thirteenth part of an essay series about the mathematics of orbits. You aren’t reading this if you haven’t picked up a couple things about orbits on your own. Also we’ve used ‘a’ before, in these sorts of approximations. It was handy in the last supplemental as the point of expansion’s name. So let me make that unmistakable:

a \equiv r = \left(\frac{L^2}{nCm}\right)^{\frac{1}{n + 2}}

The \equiv there means “defined to be equal to”. You might ask how this is different from “equals”. It seems like more emphasis to me. Also, there are other names for the circular orbit’s radius that I could have used. ‘r_e’ would be good enough, as the subscript would suggest “radius of equilibrium”. Or ‘r_0’ would be another popular choice, the 0 suggesting that this is something of key, central importance and also looking kind of like a circle. (That’s probably coincidence.) I like the ‘a’ better there because I know how easy it is to drop a subscript. If you’re working on a problem for yourself that’s easy to fix, with enough cursing and redoing your notes. On a board in front of class it’s even easier to fix since someone will ask about the lost subscript within three lines. In a post like this? It would be a mess.

So now I’m going to look at possible values of the radius ‘r’ that are close to ‘a’. How close? Close enough that ‘V_{eff}’, the effective potential energy, looks like a parabola. If it doesn’t look much like a parabola then I look at values of ‘r’ that are even closer to ‘a’. (Do you see how the game is played? If you don’t, look closer. Yes, this is actually valid.) If ‘r’ is that close to ‘a’, then we can get away with this polynomial expansion:

V_{eff}(r) \approx V_{eff}(a) + m\cdot(r - a) + \frac{1}{2} m_2 (r - a)^2

where

m = \frac{dV_{eff}}{dr}\left(a\right)

m_2 = \frac{d^2V_{eff}}{dr^2}\left(a\right)

The “approximate” there is because this is an approximation. V_{eff}(r) is in truth equal to the thing on the right-hand-side there plus something that isn’t (usually) zero, but that is small.

I am sorry beyond my ability to describe that I didn’t make that ‘m’ and ‘m_2’ consistent last week. That’s all right. One of these is going to disappear right away.

Now, what is V_{eff}(a)? Well, that’s whatever you get from putting in ‘a’ wherever you start out seeing ‘r’ in the expression for V_{eff}(r). I’m not going to bother with that. Call it math, fine, but that’s just a search-and-replace on the character ‘r’. Also, where I’m going next, it’s going to disappear, never to be seen again, so who cares? What’s important is that this is a constant number. If ‘r’ changes, the value of V_{eff}(a) does not, because ‘r’ doesn’t appear anywhere in V_{eff}(a).

How about ‘m’? That’s the value of the first derivative of ‘V_{eff}’ with respect to ‘r’, evaluated when ‘r’ is equal to ‘a’. That might be something. It’s not, because of what ‘a’ is. It’s the value of ‘r’ which would make \frac{dV_{eff}}{dr}(r) equal to zero. That’s why ‘a’ has that value instead of some other, any other.

So we’ll have a constant part ‘V_{eff}(a)’, plus a zero part, plus a part that’s a parabola. This is normal, by the way, when we do expansions around an equilibrium. At least it’s common. Good to see it. To find ‘m_2’ we have to take the second derivative of ‘V_{eff}(r)’ and then evaluate it when ‘r’ is equal to ‘a’ and ugh but here it is.

\frac{d^2V_{eff}}{dr^2}(r) = n (n - 1) C r^{n - 2} + 3\cdot\frac{L^2}{m}r^{-4}

And at the point of approximation, where ‘r’ is equal to ‘a’, it’ll be:

m_2 = \frac{d^2V_{eff}}{dr^2}(a) = n (n - 1) C a^{n - 2} + 3\cdot\frac{L^2}{m}a^{-4}

We know exactly what ‘a’ is so we could write that out in a nice big expression. You don’t want to. I don’t want to. It’s a bit of a mess. I mean, it’s not hard, but it has a lot of symbols in it and oh all right. Here. Look fast because I’m going to get rid of that as soon as I can.

m_2 = \frac{d^2V_{eff}}{dr^2}(a) = n (n - 1) C \left(\frac{L^2}{n C m}\right)^{\frac{n - 2}{n + 2}} + 3\cdot\frac{L^2}{m}\left(\frac{L^2}{n C m}\right)^{-\frac{4}{n + 2}}

For the values of ‘n’ that we actually care about because they turn up in real actual physics problems this expression simplifies some. Enough, anyway. If we pretend we know nothing about ‘n’ besides that it is a number bigger than -2 then … ugh. We don’t have a lot that can clean it up.

Here’s how. I’m going to define an auxiliary little function. Its role is to contain our symbolic sprawl. It has a legitimate role too, though. At least it represents something that it makes sense to give a name. It will be a new function, named ‘F’ and that depends on the radius ‘r’:

F(r) \equiv -\frac{dV}{dr}

Notice that’s the derivative of the original ‘V’, not the angular-momentum-equipped ‘V_{eff}’. This is the secret of its power. It doesn’t do anything to make V_{eff}(r) easier to work with. It starts being good when we take its derivatives, though:

\frac{dV_{eff}}{dr} = -F(r) - \frac{L^2}{m}r^{-3}

That already looks nicer, doesn’t it? It’s going to be really slick when you think about what ‘F(a)’ is. Remember that ‘a’ is the value for ‘r’ which makes the derivative of ‘V_{eff}’ equal to zero. So … I may not know much, but I know this:

0 = \frac{dV_{eff}}{dr}(a) = -F(a) - \frac{L^2}{m}a^{-3}

F(a) = -\frac{L^2}{ma^3}

I’m not going to say what value F(r) has for other values of ‘r’ because I don’t care. But now look at what it does for the second derivative of ‘V_{eff}’:

\frac{d^2 V_{eff}}{dr^2}(r) = -F'(r) + 3\frac{L^2}{mr^4}

Here the ‘F'(r)’ is a shorthand way of writing ‘the derivative of F with respect to r’. You can do this when there’s only the one free variable to consider. And now something magic that happens when we look at the second derivative of ‘V_{eff}’ when ‘r’ is equal to ‘a’ …

\frac{d^2 V_{eff}}{dr^2}(a) = -F'(a) - \frac{3}{a} F(a)

We get away with this because we happen to know that ‘F(a)’ is equal to -\frac{L^2}{ma^3} and doesn’t that work out great? We’ve turned a symbolic mess into a … less symbolic mess.
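If the magic feels too much like magic, here is a sketch that checks it with numbers for the power-law potential V(r) = C r^n, for which F(r) = -n C r^{n - 1}. The values of ‘n’, ‘C’, ‘L’, and ‘m’ are made up, and the function name is mine.

```python
import math


def second_derivative_two_ways(n, C, L, m):
    """Return the second derivative of V_eff at r = a, computed directly
    and computed through F, plus F(a) and -L^2 / (m a^3) for comparison."""
    a = (L**2 / (n * C * m)) ** (1.0 / (n + 2))
    F_a = -n * C * a**(n - 1)                 # F(a), minus the derivative of V
    F_prime_a = -n * (n - 1) * C * a**(n - 2)  # F'(a)
    direct = n * (n - 1) * C * a**(n - 2) + 3.0 * L**2 / (m * a**4)
    via_force = -F_prime_a - 3.0 / a * F_a
    return direct, via_force, F_a, -L**2 / (m * a**3)


# Hypothetical values; each pair has n times C positive so 'a' is real.
L, m = 1.5, 0.7
comparisons = []
for n, C in ((-1, -2.0), (1, 3.0), (2, 0.5)):
    direct, via_force, F_a, expected_F_a = second_derivative_two_ways(n, C, L, m)
    comparisons.append((direct, via_force, F_a, expected_F_a))
    print(n, math.isclose(direct, via_force), math.isclose(F_a, expected_F_a))
```

Both columns should print True for every power tried. That is no proof, but it is reassuring.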

Now why do I say it’s legitimate to introduce ‘F(r)’ here? It’s because minus the derivative of the potential energy with respect to the position of something can be something of actual physical interest. It’s the amount of force exerted on the particle by that potential energy at that point. The amount of force on a thing is something that we could imagine being interested in. Indeed, we’d have used that except potential energy is usually so much easier to work with. I’ve avoided it up to this point because it wasn’t giving me anything I needed. Here, I embrace it because it will save me from some awful lines of symbols.

Because with this expression in place I can write the approximation to the potential energy of:

V_{eff}(r) \approx V_{eff}(a) + \frac{1}{2} \left( -F'(a) - \frac{3}{a}F(a) \right) (r - a)^2
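To get a feel for how good the approximation is, here is a sketch comparing the real effective potential with its parabola. I’ve picked n = -1 and made-up values for ‘C’, ‘L’, and ‘m’ that put the circular orbit at a radius of 1. The function names are mine.

```python
def v_eff(r, n, C, L, m):
    """Effective potential energy: C r^n + L^2 / (2 m r^2)."""
    return C * r**n + L**2 / (2.0 * m * r**2)


def quadratic_approximation(r, n, C, L, m):
    """V_eff(a) plus one-half the second derivative at a, times (r - a)^2."""
    a = (L**2 / (n * C * m)) ** (1.0 / (n + 2))
    m_2 = n * (n - 1) * C * a**(n - 2) + 3.0 * L**2 / (m * a**4)
    return v_eff(a, n, C, L, m) + 0.5 * m_2 * (r - a)**2


# Hypothetical values, chosen so that a = 1.
n, C, L, m = -1, -1.0, 1.0, 1.0
a = (L**2 / (n * C * m)) ** (1.0 / (n + 2))

errors = {}
for fraction in (0.1, 0.01):
    r = a * (1.0 + fraction)
    errors[fraction] = abs(v_eff(r, n, C, L, m)
                           - quadratic_approximation(r, n, C, L, m))
    print("r is", fraction, "of a away from a; error is", errors[fraction])
```

Moving ten times closer to ‘a’ shrinks the error by a lot more than ten times. That is the sense in which looking closer makes the parabola good enough.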

So if ‘r’ is close to ‘a’, then the polynomial on the right is a good enough approximation to the effective potential energy. And that potential energy has the shape of a spring’s potential energy. We can use what we know about springs to describe its motion. Particularly, we’ll have this be true:

\frac{dp}{dt} = -\frac{dV_{eff}}{dr}(r) = \left( F'(a) + \frac{3}{a} F(a)\right) (r - a)

Here, ‘p’ is the (linear) momentum of whatever’s orbiting, which we can treat as equal to ‘m dr/dt’, the mass of the orbiting thing times how fast its distance from the center is changing. You may sense in me some reluctance about doing this, what with that ‘we can treat as equal to’ talk. There’s reasons for this and I’d have to get deep into geometry to explain why. I can get away with specifically this use because the problem allows it. If you’re trying to do your own original physics problem inspired by this thread, and it’s not orbits like this, be warned. This is a spot that could open up to a gigantic danger pit, lined at the bottom with sharp spikes and angry poison-clawed mathematical tigers and I bet it’s raining down there too.

So we can rewrite all this as

m\frac{d^2r}{dt^2} = -\frac{dV_{eff}}{dr}(r) = \left( F'(a) + \frac{3}{a} F(a)\right) (r - a)

And when we learned everything interesting there was to know about springs we learned what the solutions to this look like. Oh, in that essay the variable that changed over time was called ‘x’ and here it’s ‘r - a’, how far the planet is from the circular orbit, but that’s not an actual difference. ‘r’ will be ‘a’ plus some sinusoidal curve:

r(t) = a + A \cos\left(\sqrt{\frac{k}{m}} t\right) + B \sin\left(\sqrt{\frac{k}{m}} t\right)

where, here, ‘k’ is equal to minus that whole mass of constants on the right-hand side:

k = -\left( F'(a) + \frac{3}{a} F(a)\right)

I don’t know what ‘A’ and ‘B’ are. It’ll depend on just what the perturbation is like, how far the planet is from the circular orbit. But I can tell you what the behavior is like. The planet will wobble back and forth around the circular orbit, sometimes closer to the center, sometimes farther away. It’ll spend as much time closer to the center than the circular orbit as it does farther away. And the period of that oscillation will be

T = 2\pi\sqrt{\frac{m}{k}} = 2\pi\sqrt{\frac{m}{-\left(F'(a) + \frac{3}{a}F(a)\right)}}
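Here is a sketch that works out this period for a power-law potential V(r) = C r^n, using F(r) = -n C r^{n - 1}. The two cases and all the numbers in them are made up for illustration: n = -1 with C = -1, and n = 2 with C = 1/2, each with ‘L’ and ‘m’ equal to 1 so that the circular orbit has radius 1. The function names are mine.

```python
import math


def spring_constant(n, C, L, m):
    """k = -(F'(a) + (3 / a) F(a)) for the potential V(r) = C r^n."""
    a = (L**2 / (n * C * m)) ** (1.0 / (n + 2))
    F_a = -n * C * a**(n - 1)
    F_prime_a = -n * (n - 1) * C * a**(n - 2)
    return -(F_prime_a + 3.0 / a * F_a)


def oscillation_period(n, C, L, m):
    """T = 2 pi sqrt(m / k), the period of the wobble around the circle."""
    return 2.0 * math.pi * math.sqrt(m / spring_constant(n, C, L, m))


# Hypothetical cases, both arranged to have a = 1.
period_inverse_r = oscillation_period(-1, -1.0, 1.0, 1.0)
period_r_squared = oscillation_period(2, 0.5, 1.0, 1.0)
print(period_inverse_r, period_r_squared)
```

With these numbers the first case gives ‘k’ equal to 1 and the second gives ‘k’ equal to 4, so the second wobbles in half the time of the first.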

This tells us something about what the orbit of a thing not in a circular orbit will be like. Yes, I see you in the back there, quivering with excitement about how we’ve got to elliptical orbits. You’re moving too fast. We haven’t got that. There will be elliptical orbits, yes, but only for a very particular power ‘n’ for the potential energy. Not for most of them. We’ll see.

It might strike you there’s something in that square root. We need to take the square root of a positive number, so maybe this will tell us something about what kinds of powers we’re allowed. It’s a good thought. It turns out not to tell us anything useful, though. Suppose we started with V(r) = Cr^n . Then F(r) = -nCr^{n - 1}, and F'(r) = -n(n - 1)Cr^{n - 2} . Sad to say, this leads us to a journey which reveals that we need ‘n’ to be larger than -2 or else we don’t get oscillations around a circular orbit. We already knew that, though. We already found we needed it to have a stable equilibrium before. We can see there not being a period for these oscillations around the circular orbit as another expression of the circular orbit not being stable. Sad to say, we haven’t got something new out of this.
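To spell out the first steps of that journey, as a sketch under the sign conventions used above, put the power-law force into the formula for ‘k’:

k = -\left( F'(a) + \frac{3}{a} F(a)\right) = n(n - 1)Ca^{n - 2} + 3nCa^{n - 2} = n(n + 2)Ca^{n - 2}

For the force to pull toward the center we need F(a) = -nCa^{n - 1} to be negative, so nC has to be positive. Then ‘k’ is positive exactly when n + 2 is positive, which is to say when ‘n’ is larger than -2.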

We will get to new stuff, though. Maybe even ellipses.

My Mathematics Reading For The 13th of June


I’m working on the next Why Stuff Can Orbit post, this one to feature a special little surprise. In the meanwhile here’s some of the things I’ve read recently and liked.

The Theorem of the Day is just what the name offers. They’re fit onto single slides, so there’s not much text to read. I’ll grant some of them might be hard reading at once, though, if you’re not familiar with the lingo. Anyway, this particular theorem, the Lindemann-Weierstrass Theorem, is one of the famous ones. Also one of the best-named ones. Karl Weierstrass is one of those names you find all over analysis. Over the latter half of the 19th century he attacked the logical problems that had bugged calculus for the previous two centuries and beat them all. I’m lying, but not by much. Ferdinand von Lindemann’s name turns up less often, but he’s known in mathematics circles for proving that π is transcendental (and so, ultimately, that the circle can’t be squared by compass and straightedge). And he was David Hilbert’s thesis advisor.

The Lindemann-Weierstrass Theorem is one of those little utility theorems that’s neat on its own, yes, but is good for proving other stuff. This theorem says that if a given number is algebraic (ask about that some A To Z series) then e raised to that number has to be transcendental. (The exception: e raised to 0 is equal to 1.) Turned around, if e raised to some number other than 0 is algebraic, then that number has to be transcendental. The page also mentions one of those fun things you run across when you have a scientific calculator and can repeat an operation on whatever the result of the last operation was.

I’ve mentioned Maths By A Girl before, but it’s worth checking in again. This is a piece about Apéry’s Constant, which is one of those numbers mathematicians have heard of, but which nobody knows to be transcendental or not. It’s hard proving numbers are transcendental. If you go out trying to build a transcendental number it’s easy, but otherwise, you have to hope you know your number is the exponential of an algebraic number.

I forget which Twitter feed brought this to my attention, but here’s a couple geometric theorems demonstrated and explained some by Dave Richeson. There’s something wonderful in a theorem that’s mostly a picture. It feels so supremely mathematical to me.

And last, Katherine Bourzac writing for Nature.com reports the creation of a two-dimensional magnet. This delights me since one of the classic problems in statistical mechanics is a thing called the Ising model. It’s a basic model for the mathematics of how magnets would work. The one-dimensional version is simple enough that you can give it to undergrads and have them work through the whole problem. The two-dimensional version is a lot harder to solve and I’m not sure I ever saw it laid out even in grad school. (Mind, I went to grad school for mathematics, not physics, and the subject is a lot more physics.) The four- and higher-dimensional model can be solved by a clever approach called mean field theory. The three-dimensional model … I don’t think has any exact solution, which seems odd given how that’s the version you’d think was most useful.
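To give a sense of how tractable the one-dimensional version is, here’s a minimal Python sketch. It assumes an open chain of N spins, no external field, and a nearest-neighbor coupling J. Those choices, and the function names, are mine for illustration and aren’t from the article. It adds up the partition function the slow way, over every arrangement of spins, and compares that with the closed form an undergrad would derive for this setup.

```python
import itertools
import math

def ising_1d_partition_brute(n_spins, coupling, beta):
    """Sum exp(-beta * E) over all 2**n_spins states of an open chain, zero field."""
    total = 0.0
    for spins in itertools.product((-1, 1), repeat=n_spins):
        # Energy counts each neighboring pair once: E = -J * sum of s_i * s_(i+1).
        energy = -coupling * sum(a * b for a, b in zip(spins, spins[1:]))
        total += math.exp(-beta * energy)
    return total

def ising_1d_partition_exact(n_spins, coupling, beta):
    """Closed form for the same chain: Z = 2 * (2 cosh(beta * J))**(N - 1)."""
    return 2.0 * (2.0 * math.cosh(beta * coupling)) ** (n_spins - 1)

brute = ising_1d_partition_brute(8, 1.0, 0.5)
exact = ising_1d_partition_exact(8, 1.0, 0.5)
print(brute, exact)
```

The two numbers should agree up to rounding. The brute-force sum doubles in size with every added spin, which is why having the closed form matters.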

That there’s a real two-dimensional magnet (well, a one-molecule-thick magnet) doesn’t really affect the model of two-dimensional magnets. The model is interesting enough for its mathematics, which teaches us about all kinds of phase transitions. And it’s close enough to the way certain aspects of real-world magnets behave to enlighten our understanding. The topic couldn’t avoid drawing my eye, is all.

Reading the Comics, May 31, 2017: Feast Week Edition


You know we’re getting near the end of the (United States) school year when Comic Strip Master Command orders everyone to clear out their mathematics jokes. I’m assuming that’s what happened here. Or else a lot of cartoonists had word problems on their minds eight weeks ago. Also eight weeks ago plus whenever they originally drew the comics, for those that are deep in reruns. It was busy enough to split this week’s load into two pieces and might have been worth splitting into three, if I thought I had publishing dates free for all that.

Larry Wright’s Motley Classics for the 28th of May, a rerun from 1989, is a joke about using algebra. Occasionally mathematicians try to use the ability of people to catch things in midair as evidence of the sort of differential-equation solving that we all can do, if imperfectly, in our heads. But I’m not aware of evidence that anyone does anything that sophisticated. I would be stunned if we didn’t really work by a process of making a guess of where the thing should be and refining it as time allows, with experience helping us make better guesses. There’s good stuff to learn in modeling how to catch stuff, though.

Michael Jantze’s The Norm Classics rerun for the 28th opines about why in algebra you had to not just have an answer but explain why that was the answer. I suppose mathematicians get trained to stop thinking about individual problems and instead look to classes of problems. Is it possible to work out a scheme that works for many cases instead of one? If it isn’t, can we at least say something interesting about why it’s not? And perhaps that’s part of what makes algebra classes hard. To think about a collection of things is usually harder than to think about one, and maybe instructors aren’t always clear about how to turn the specific into the general.

Also I want to say some very good words about Jantze’s graphical design. The mock textbook cover for the title panel on the left is so spot-on for a particular era in mathematics textbooks it’s uncanny. The all-caps Helvetica, the use of two slightly different tans, the minimalist cover art … I know shelves stuffed full in the university mathematics library where every book looks like that. Plus, “[Mathematics Thing] And Their Applications” is one of the roughly four standard approved mathematics book titles. He paid good attention to his references.

Gary Wise and Lance Aldrich’s Real Life Adventures for the 28th deploys a big old whiteboard full of equations for the “secret” of the universe. This makes a neat change from finding the “meaning” of the universe, or of life. The equations themselves look mostly like gibberish to me, but Wise and Aldrich make good use of their symbols. The symbol \vec{B} , a vector-valued quantity named B, turns up a lot. This symbol we often use to represent magnetic flux. The B without a little arrow above it would represent the intensity of the magnetic field. Similarly an \vec{H} turns up. This we often use for magnetic field strength. While I didn’t spot a \vec{E} (electric field), which would be the natural partner to all this, there are plenty of bare E symbols. Those would represent electric potential. And many of the other symbols are what would naturally turn up if you were trying to model how something is tossed around by a magnetic field. Q, for example, is often the electric charge. ω is a common symbol for how fast an electromagnetic wave oscillates. (It’s not the frequency, but it’s related to the frequency.) The use of symbols is consistent enough, in fact, that I wonder if Wise and Aldrich did use a legitimate sprawl of equations and I’m missing the referenced problem.

John Graziano’s Ripley’s Believe It Or Not for the 28th mentions how many symbols are needed to write out the numbers from 1 to 100. Is this properly mathematics? … Oh, who knows. It’s just neat to know.

Mark O’Hare’s Citizen Dog rerun for the 29th has the dog Fergus struggle against a word problem. Ordinary setup and everything, but I love the way O’Hare draws Fergus in that outfit and thinking hard.

The Eric the Circle rerun for the 29th by ACE10203040 is a mistimed Pi Day joke.

Bill Amend’s FoxTrot Classics for the 31st, a rerun from the 7th of June, 2006, shows the conflation of “genius” and “good at mathematics” in everyday use. Amend has picked a quixotic but in-character thing for Jason Fox to try doing. Euclid’s Fifth Postulate is one of the classic obsessions of mathematicians throughout history. Euclid admitted the thing (a confusing-reading mess of propositions) as a postulate because … well, there’s interesting geometry you can’t do without it, and there doesn’t seem any way to prove it from the rest of his geometric postulates. So it must be assumed to be true.

There isn’t a way to prove it from the rest of the geometric postulates, but it took mathematicians over two thousand years of work at that to be convinced of the fact. But I know I went through a time of wanting to try finding a proof myself. It was a mercifully short-lived time that ended in my humbly understanding that as smart as I figured I was, I wasn’t that smart. We can suppose Euclid’s Fifth Postulate to be false and get interesting geometries out of that, particularly the geometries of the surface of the sphere, and the geometry of general relativity. Jason will surely sometime learn.

Reading the Comics, May 27, 2017: Panels Edition


Can’t say this was too fast or too slow a week for mathematically-themed comic strips. A bunch of the strips were panel comics, so that’ll do for my theme.

Norm Feuti’s Retail for the 21st mentions every (not that) algebra teacher’s favorite vague introduction to group theory, the Rubik’s Cube. Well, the ways you can rotate the various sides of the cube do form a group, which is something that acts like arithmetic without necessarily being numbers. And it gets into value judgements. There exist algorithms to solve Rubik’s cubes. Is it a show of intelligence that someone can learn an algorithm and solve any cube? — But then, how is solving a Rubik’s cube, with or without the help of an algorithm, a show of intelligence? At least of any intelligence more than the bit of spatial recognition that’s good for rotating cubes around?

'Rubik's cube, huh? I never could solve one of those.' 'I'm just fidgeting with it. I never bothered learning the algorithm either.' 'What algorithm?' 'The pattern you use to solve it.' 'Wait. All you have to do to solve it is memorize a pattern?' 'Of course. How did you think people solved it?' 'I always thought you had to be super smart to figure it out.' 'Well, memorizing the pattern does take a degree of intelligence.' 'Yeah, but that's not the same thing as solving it on your own.' 'I'm sure some people figured out the algorithm without help.' 'I KNEW Chad Gustafson was a liar! He was no eighth-grade prodigy, he just memorized the pattern!' 'Sounds like you and the CUBE have some unresolved issues.'
Norm Feuti’s Retail for the 21st of May, 2017. A few weeks ago I ran across a book about the world of competitive Rubik’s Cube solving. I haven’t had the chance to read it, but am interested by the ways people form rules for what would seem like a naturally shapeless feature such as solving Rubik’s Cubes. Not featured: the early 80s Saturday morning cartoon that totally existed because somehow that made sense back then.

I don’t see that learning an algorithm for a problem is a lack of intelligence. No more than using a photo reference shows a lack of drawing skill. It’s still something you need to learn, and to apply, and to adapt to the cube as you have it to deal with. Anyway, I never learned any techniques for solving it either. Would just play for the joy of it. Here’s a page with one approach to solving the cube, if you’d like to give it a try yourself. Good luck.

Bob Weber Jr and Jay Stephens’s Oh, Brother! for the 22nd is a word-problem avoidance joke. It’s a slight thing to include, but the artwork is nice.

Brian and Ron Boychuk’s Chuckle Brothers for the 23rd is a very slight thing to include, but it’s looking like a slow week. I need something here. If you don’t see it then things picked up. They similarly tried sprucing things up the 27th, with another joke for taping onto the door.

Nate Fakes’s Break of Day for the 24th features the traditional whiteboard full of mathematics scrawls as a sign of intelligence. The scrawl on the whiteboard looks almost meaningful. The integral, particularly, looks like it might have been copied from a legitimate problem in polar or cylindrical coordinates. I say “almost” because while I think that some of the r symbols there are r’ I’m not positive those aren’t just stray marks. If they are r’ symbols, it’s the sort of integral that comes up when you look at surfaces of spheres. It would be the electric field of a conductive metal ball given some charge, or the gravitational field of a shell. These are tedious integrals to solve, but fortunately after you do them in a couple of introductory physics-for-majors classes you can just look up the answers instead.

Samson’s Dark Side of the Horse for the 26th is the Roman numerals joke for this installment. I feel like it ought to be a pie chart joke too, but I can’t find a way to make it one.

Izzy Ehnes’s The Best Medicine Cartoon for the 27th is the anthropomorphic numerals joke for this paragraph.

Getting Into Shapes


This is, in part, a post for myself. They all are, but this is moreso. My day job includes some Geographic Information Services stuff, which is how we say “maps” when we want to be taken seriously as Information Technology professionals. When we make maps, what we really do is have a computer draw polygons, and then put dots on them. A common need is to put a dot in the middle of a polygon. Yes, this sounds silly, but describe your job this abstractly and see how it comes out.

The trouble is polygons can be complicated stuff. Can be, not are. If the polygon is, like, the border of your building’s property it’s probably not too crazy. It’s probably a rectangle, or at least a trapezoid. Maybe there’s a curved boundary. If you need a dot, such as to place the street address or a description of the property, you can make a good guess about where to put it so it’s inside the property and not too close to an edge.

But you can’t always. The polygons can be complicated. Especially if you’re representing stuff that reflects government or scientific or commercial interest. There’s good reasons to be interested in the boundaries between the low-low tide and the high-high tide lines of a beach, but that’s not going to look like anything simple for any realistic property. Finding a representative spot to fix labels or other business gets tricky.

So this crossed my Twitter feed and I’ll probably want to refer back to it at some point. It’s an algorithm, published last August by Vladimir Agafonkin at Mapbox, which uses some computation tricks to find a reasonable center.

The approach is, broadly, of a kind with many numerical methods. It tries to find an answer by taking a guess and then seeing if any obvious variations will make it a little better. If you can, then, repeat these variations. Eventually, usually, you’ll get to a pretty good answer. It may not be the exact best possible answer, but that’s all right. We accept that we’ll have a merely approximate answer, but we’ll get it more quickly than we otherwise would have. Often this is fine. Nobody will be upset that the label on a map would be “better” moved one pixel to the right if they get the map ten seconds faster. Optimization is often like that.
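To make the guess-and-refine idea concrete, here’s a minimal Python sketch. I want to be loud about this: it is not the Mapbox algorithm, which I haven’t tried yet. It’s a plain hill-climb of my own devising, with made-up function names. It assumes the polygon is a simple list of (x, y) corners with no holes. It starts at the middle of the bounding box, nudges the point up, down, left, and right, keeps any nudge that gets farther from the nearest edge, and halves the nudge size whenever nothing helps.

```python
import math

def point_to_segment(px, py, ax, ay, bx, by):
    """Distance from point (px, py) to the segment from (ax, ay) to (bx, by)."""
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project the point onto the segment's line, clamped to the segment's ends.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def signed_distance(px, py, polygon):
    """Distance to the nearest edge: positive inside the polygon, negative outside."""
    inside = False
    nearest = math.inf
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        # Ray casting: toggle each time a ray going right from the point crosses an edge.
        if (ay > py) != (by > py) and px < (bx - ax) * (py - ay) / (by - ay) + ax:
            inside = not inside
        nearest = min(nearest, point_to_segment(px, py, ax, ay, bx, by))
    return nearest if inside else -nearest

def refine_label_point(polygon, tolerance=1e-3):
    """Hill-climb toward a point far from every edge. May stop at a merely good spot."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    # First guess: the middle of the bounding box, which might not even be inside.
    x = (min(xs) + max(xs)) / 2
    y = (min(ys) + max(ys)) / 2
    best = signed_distance(x, y, polygon)
    step = max(max(xs) - min(xs), max(ys) - min(ys)) / 4
    while step > tolerance:
        improved = False
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            candidate = signed_distance(x + dx, y + dy, polygon)
            if candidate > best:
                x, y, best = x + dx, y + dy, candidate
                improved = True
        if not improved:
            step /= 2  # no nudge helped, so try smaller nudges
    return x, y, best

l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]
print(refine_label_point(l_shape))
```

For this L-shaped lot the first guess lands outside the polygon, and the refinement walks it into one arm of the L, half a unit from the nearest edges. That isn’t the best possible spot. A point tucked into the corner of the L gets about 0.59 units of clearance. But it’s a fine place for a label, which is the trade the paragraph above describes.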

I have not tried putting this code into mine yet; I’ve just now read it and I have some higher-priority tasks at work. But I’m hoping to remember that this exists and to see whether I can use it.