Consider a book. It’s a collection. It’s easy to see the ordered setting of words, maybe pictures, possibly numbers or even equations. The important thing is the ideas those all represent.
Set the book in a library. How can this change the book?
Perhaps the comparison to other books shows us something the original book neglected. Perhaps something in the original book we now realize was a brilliantly-presented insight. The way we appreciate the book may change.
What can’t change is the content of the original book. The words stay the same, in the same order. If it’s a physical book, the number of pages stays the same, as does the size of the page. The ideas expressed remain the same.
So now you understand embedding. It’s a broad concept, something that can have meaning for any mathematical structure. A structure here is a bunch of items and some things you can do with them. A group, for example, is a good structure to use with this sort of thing. So, for example, the integers and regular addition. This original structure’s embedded in another when everything in the original structure is in the new, and everything you can do with the original structure you can do in the new and get the same results. So, for example, the group you get by taking the integers and regular addition? That’s embedded in the group you get by taking the rational numbers and regular addition. 4 + 8 is 12 whether or not you consider 6.5 a topic fit for discussion. It’s an embedding that expands the set of elements, and that modifies the things you can do to match.
The group you get from the integers and addition is embedded in other things. For example, it’s embedded in the ring you get from the integers and regular addition and regular multiplication. 4 + 8 remains 12 whether or not you can multiply 4 by 8. This embedding doesn’t add any new elements, just new things you can do with them.
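To make the integers-inside-the-rationals example concrete, here is a minimal sketch in Python. The function name `embed` and the sample range are choices made for this illustration. It checks, on a finite sample only, that different integers stay different and that adding before or after the embedding gives the same answer.

```python
from fractions import Fraction

def embed(n: int) -> Fraction:
    """Send the integer n to the rational number n/1 (illustrative name)."""
    return Fraction(n, 1)

sample = range(-20, 21)

# Distinct integers land on distinct rationals: the map is one-to-one.
one_to_one = len({embed(n) for n in sample}) == len(sample)

# Adding and then embedding agrees with embedding and then adding.
preserves_addition = all(
    embed(a + b) == embed(a) + embed(b) for a in sample for b in sample
)

print(embed(4) + embed(8))            # 12
print(one_to_one, preserves_addition)
```

A finite check like this is not a proof, but it shows what the two requirements of an embedding look like when written down.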
Once you have the name, you see embedding everywhere. When we first learn arithmetic we — I, anyway — learn it as adding whole numbers together. Then we embed that into whole numbers with addition and multiplication. And then the (nonnegative) rational numbers with addition and multiplication. At some point (I forget when) the negative numbers came in. So did the whole set of real numbers. Eventually the real numbers got embedded into the complex numbers. And the complex numbers got embedded into the quaternions, although we found real and complex numbers enough for most of our work. I imagine something similar goes on these days.
There’s never only one embedding possible. Consider, for example, two-dimensional geometry, the shapes of figures on a sheet of paper. It’s easy to put that in three dimensions, by setting the paper on the floor, and expand it by drawing in chalk on the wall. Or you can set the paper on the wall, and extend its figures by drawing in chalk on the floor. Or set the paper at an angle to the floor. What you use depends on what’s most convenient. And that can be driven by laziness. It’s easy to match, say, the point in two dimensions at coordinates (3, 4) with the point in three dimensions at coordinates (3, 4, 0), even though (0, 3, 4) or (4, 0, 3) are as valid.
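The three placements of the point (3, 4) can be sketched as three small functions. The function names are made up for this illustration. Each one keeps distances between points the same as they were on the sheet of paper, which is what makes all three equally valid.

```python
import math

# Three hypothetical ways to place the point (x, y) of the plane into space.
def on_floor(p):        # (3, 4) -> (3, 4, 0)
    x, y = p
    return (x, y, 0)

def on_wall(p):         # (3, 4) -> (0, 3, 4)
    x, y = p
    return (0, x, y)

def on_other_wall(p):   # (3, 4) -> (4, 0, 3)
    x, y = p
    return (y, 0, x)

a, b = (3, 4), (0, 0)
flat_distance = math.dist(a, b)

for embed in (on_floor, on_wall, on_other_wall):
    same = math.isclose(math.dist(embed(a), embed(b)), flat_distance)
    print(embed(a), same)
```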
Why embed something in another thing? For the same reasons we do any transformation in mathematics. One is that we figure to embed the thing we’re working on into something easier to deal with. A famous example of this is the Nash embedding theorem. It describes when certain manifolds can be embedded into something that looks like normal space. And that’s useful because it can turn nonlinear partial differential equations — the most insufferable equations — into something solvable.
Another good reason, though, is the one implicit in that early arithmetic education. We started with whole-numbers-with-addition. And then we added the new operation of multiplication. And then new elements, like fractions and negative numbers. If we follow this trail we get to some abstract, tricky structures like octonions. But by small steps in which we have great experience guiding us into new territories.
Elkement, who’s been a longtime supporter of my blogging here, has been thinking about stereographic projection recently. This comes from playing with complex-valued numbers. It’s hard to start thinking about something like “what is and not get into the projection. The projection itself Elkement describes a bit in this post, from early in August. It’s one of the ways to try to match the points on a sphere to the points on the entire, infinite plane. One common way to imagine it, and to draw it, is to imagine setting the sphere on the plane. Imagine sitting on the top of the sphere. Draw the line connecting the top of the sphere with whatever point you find interesting on the sphere, and then extend that line until it intersects the plane. Match your point on the sphere with that point on the plane. You can use this to trace out shapes on the sphere and find their matching shapes on the plane.
This distorts the shapes, as you’d expect. Well, the sphere has a finite area, the plane an infinite one. We can’t possibly preserve the areas of shapes in this transformation. But this transformation does something amazing that offends students when they first encounter it. It preserves circles: a circle on the original sphere becomes a circle on the plane, and vice-versa. I know, you want it to turn something into ellipses, at least. She takes a turn at thinking out reasons why this should be reasonable. There are abundant proofs of this, but it helps the intuition to see different ways to make the argument. And to have rough proofs, that outline the argument you mean to make. We need rigorous proofs, yes, but a good picture that makes the case convincing helps a good deal.
I made a mistake! I thought we had got to the end of the block of A To Z topics suggested by Gaurish, of the For The Love Of Mathematics blog. Not so and, indeed, I wonder if it wouldn’t be a viable writing strategy around here for me to just ask Gaurish to throw out topics and I have two weeks to write about them. I don’t think there’s a single unpromising one in the set.
Before you ask, yes, this is named for the Camille Jordan.
So this is a thing from algebra. Particularly, linear algebra. And more particularly, matrices. Matrices are so much of linear algebra that you could be forgiven thinking they’re all of linear algebra. The thing is, matrices are a really good way of describing linear transformations. That is, where you take a block of space and stretch it out, or squash it down, or rotate it, or do some combination of these things. And stretching and squashing and rotating is a lot of what you’d ever want to do. Refer to any book on how to draw animated cartoons. The only thing matrices can’t do is have their eyes bug out huge when an attractive region of space walks past.
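For a feel of stretching, squashing, and rotating as matrices, here is a small sketch in two dimensions. The helper `apply` and the particular matrices are illustrative choices, not anything standard.

```python
def apply(matrix, vector):
    """Multiply a 2x2 matrix (a list of rows) by a 2-vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)

stretch = [[2, 0], [0, 1]]         # doubles the horizontal direction
squash = [[1, 0], [0, 0.5]]        # halves the vertical direction
quarter_turn = [[0, -1], [1, 0]]   # rotates 90 degrees counterclockwise

corner = (1, 1)
print(apply(stretch, corner))        # (2, 1)
print(apply(squash, corner))         # (1, 0.5)
print(apply(quarter_turn, corner))   # (-1, 1)
```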
Thing about a matrix is if you want to do something with it, you’re going to write it as a grid of numbers. It doesn’t have to be a grid of numbers. But about all the matrices anyone does anything with are grids of numbers. And that’s fine. They do an incredible lot of stuff. What’s not fine is that on looking at a huge block of numbers, the mind sees: huh. That’s a big block of numbers. Good luck finding what’s meaningful in them. To help find meaning we have a set of standard forms. We call them “canonical” or “normal” or some other approving term. They rearrange and change the terms in the matrix so that more interesting stuff is more obvious.
Now you’re justified asking: how can we rearrange and change the terms in a matrix without changing what the matrix is? We can get away with doing this because we can show some rearrangements don’t change what we’re interested in. That covers the “how dare we” part of “how”. We do it by using matrix multiplication. You might remember from high school algebra that matrix multiplication is this agonizing process of multiplying every pair of numbers that ever existed together, then adding them all up, and then maybe you multiply something by minus one because you’re thinking of determinants, and it all comes out wrong anyway and you have to do it over? Yeah. Well, matrix multiplication is defined hard because it makes stuff like this work out. So that covers the “by what technique” part of “how”. We start out with some matrix, let me imaginatively name it $A$. And then we find some transformation matrix for which, eh, let’s say $P$ is a good enough name. I’ll say why in a moment. Then we use that matrix and its multiplicative inverse $P^{-1}$. And we evaluate the product $P^{-1} A P$. This won’t just be the same old matrix we started with. Not usually. Promise. But what this will be, if we chose our matrix $P$ correctly, is some new matrix that’s easier to read.
The matrices involved here have to follow some rules. Most important, they’re all going to be square matrices. There’ll be more rules that your linear algebra textbook will tell you. Or your instructor will, after checking the textbook.
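Here is a minimal sketch of that multiply-by-a-matrix-and-its-inverse step on a 2-by-2 example. The names `A` and `P`, the helper functions, and the specific numbers are all assumptions picked for this illustration. `P` was worked out by hand for this particular `A`, and exact fractions are used so nothing gets lost to rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(M):
    """Invert a 2x2 matrix exactly, assuming its determinant is nonzero."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[3, 1], [-1, 1]]   # illustrative starting matrix
P = [[1, 1], [-1, 0]]   # hand-picked transformation matrix for this A

new_form = matmul(inverse_2x2(P), matmul(A, P))
print(new_form == [[2, 1], [0, 2]])   # True
```

The result has far more structure on show than the matrix we started with: one repeated number on the diagonal, a 1 beside it, and a 0 in the corner.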
So what makes a matrix easy to read? Zeroes. Lots and lots of zeroes. When we have a standardized form of a matrix it’s nearly all zeroes. This is for a good reason: zeroes are easy to multiply stuff by. And they’re easy to add stuff to. And almost everything we do with matrices, as a calculation, is a lot of multiplication and addition of the numbers in the matrix.
What also makes a matrix easy to read? Everything important being on the diagonal. The diagonal is one of the two things you would imagine if you were told “here’s a grid of numbers, pick out the diagonal”. In particular it’s the one that goes from the upper left to the bottom right, that is, row one column one, and row two column two, and row three column three, and so on up to row 86 column 86 (or whatever). If everything is on the diagonal the matrix is incredibly easy to work with. If it can’t all be on the diagonal at least everything should be close to it. As close as possible.
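One way to see how easy a diagonal matrix is to work with: raising it to a power only needs each diagonal number raised to that power. The sketch below compares that shortcut with honest repeated multiplication. The helper names and the sample matrix are made up for the illustration.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def diagonal(entries):
    """Build a square matrix with `entries` on the diagonal, zero elsewhere."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

D = diagonal([2, 3, -1])

# The slow way: start from the identity and multiply by D five times.
power = diagonal([1, 1, 1])
for _ in range(5):
    power = matmul(power, D)

# The fast way: raise each diagonal entry to the fifth power.
shortcut = diagonal([2 ** 5, 3 ** 5, (-1) ** 5])
print(power == shortcut)   # True
```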
In the Jordan Canonical Form not everything is on the diagonal. I mean, it can be, but you shouldn’t count on that. But everything either will be on the diagonal or else it’ll be one row up from the diagonal. That is, row one column two, row two column three, row 85 column 86. Like that. There are a few other important pieces.
First is that the thing in the row above the diagonal will be either 1 or 0. Second is that on the diagonal you’ll have a sequence of all the same number. Like, you’ll get four instances of the number ‘2’ along this string of the diagonal. Third is that you’ll get a 1 in the row above every instance of this number except the first. Fourth is that you’ll get a 0 in the row above the first instance of this number.
Yeah, that’s fussy to visualize. This is one of those things easiest to show in a picture. A Jordan canonical form is a matrix that looks like this:
2
1
0
0
0
0
0
0
0
0
0
0
0
2
1
0
0
0
0
0
0
0
0
0
0
0
2
1
0
0
0
0
0
0
0
0
0
0
0
2
0
0
0
0
0
0
0
0
0
0
0
0
3
1
0
0
0
0
0
0
0
0
0
0
0
3
0
0
0
0
0
0
0
0
0
0
0
0
4
1
0
0
0
0
0
0
0
0
0
0
0
4
1
0
0
0
0
0
0
0
0
0
0
0
4
0
0
0
0
0
0
0
0
0
0
0
0
-1
0
0
0
0
0
0
0
0
0
0
0
0
-2
1
0
0
0
0
0
0
0
0
0
0
0
-2
This may have you dazzled. It dazzles mathematicians too. When we have to write a matrix that’s almost all zeroes like this we drop nearly all the zeroes. If we have to write anything we just write a really huge 0 in the upper-right and the lower-left corners.
What makes this the Jordan Canonical Form is that the matrix looks like it’s put together from what we call Jordan Blocks. Look around the diagonals. Here’s the first Jordan Block:
\[
\begin{bmatrix}
2 & 1 & 0 & 0 \\
0 & 2 & 1 & 0 \\
0 & 0 & 2 & 1 \\
0 & 0 & 0 & 2
\end{bmatrix}
\]
Here’s the second:
\[
\begin{bmatrix}
3 & 1 \\
0 & 3
\end{bmatrix}
\]
Here’s the third:
\[
\begin{bmatrix}
4 & 1 & 0 \\
0 & 4 & 1 \\
0 & 0 & 4
\end{bmatrix}
\]
Here’s the fourth:
\[
\begin{bmatrix}
-1
\end{bmatrix}
\]
And here’s the fifth:
\[
\begin{bmatrix}
-2 & 1 \\
0 & -2
\end{bmatrix}
\]
And we can represent the whole matrix as this might-as-well-be-diagonal thing:
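\[
\begin{bmatrix}
\text{First Block} & 0 & 0 & 0 & 0 \\
0 & \text{Second Block} & 0 & 0 & 0 \\
0 & 0 & \text{Third Block} & 0 & 0 \\
0 & 0 & 0 & \text{Fourth Block} & 0 \\
0 & 0 & 0 & 0 & \text{Fifth Block}
\end{bmatrix}
\]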
These blocks can be as small as a single number. They can be as big as however many rows and columns you like. Each individual block is some repeated number on the diagonal, and a repeated one in the row above the diagonal. You can call this the “superdiagonal”.
(Mathworld, and Wikipedia, assert that sometimes the row below the diagonal — the “subdiagonal” — gets the 1’s instead of the superdiagonal. That’s fine if you like it that way, and it won’t change any of the real work. I have not seen these subdiagonal 1’s in the wild. But I admit I don’t do a lot of this field and maybe there’s times it’s more convenient.)
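The block-by-block description can be sketched in code: build each Jordan Block, then set the blocks along the diagonal of a grid of zeroes. The function names are invented for this illustration. The values and sizes are read off the five blocks of the twelve-by-twelve example, with the 1’s on the superdiagonal.

```python
def jordan_block(value, size):
    """A size-by-size block: `value` on the diagonal, 1 on the superdiagonal."""
    return [[value if i == j else 1 if j == i + 1 else 0
             for j in range(size)] for i in range(size)]

def block_diagonal(blocks):
    """Place square blocks along the diagonal of a matrix of zeroes."""
    n = sum(len(block) for block in blocks)
    result = [[0] * n for _ in range(n)]
    offset = 0
    for block in blocks:
        for i, row in enumerate(block):
            for j, entry in enumerate(row):
                result[offset + i][offset + j] = entry
        offset += len(block)
    return result

# (diagonal value, block size) for the five blocks of the example.
sizes = [(2, 4), (3, 2), (4, 3), (-1, 1), (-2, 2)]
J = block_diagonal([jordan_block(value, size) for value, size in sizes])

for row in J:
    print(" ".join(f"{entry:2d}" for entry in row))
```

Printing `J` gives back the big grid of numbers, zeroes and all.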
Using the Jordan Canonical Form for a matrix is a lot like putting an object in a standard reference pose for photographing. This is a good metaphor. We get a Jordan Canonical Form by matrix multiplication, which works like rotating and scaling volumes of space. You can view the Jordan Canonical Form for a matrix as how you represent the original matrix from a new viewing angle that makes it easy to recognize. And this is why $P$ is not a bad name for the matrix that does this work. We can see all this as “projecting” the matrix we started with into a new frame of reference. The new frame is maybe rotated and stretched and squashed and whatnot, compared to how we started. But it’s as valid a base. Projecting a mathematical object from one frame of reference to another usually involves calculating something that looks like $P^{-1} A P$ so, projection. That’s our name.
Mathematicians will speak of “the” Jordan Canonical Form for a matrix as if there were such a thing. I don’t mean that Jordan Canonical Forms don’t exist. They exist just as much as matrices do. It’s the “the” that misleads. You can put the Jordan Blocks in any order and have as valid, and as useful, a Jordan Canonical Form. But it’s easy to swap the orders of these blocks around — it’s another matrix multiplication, and a blessedly easy one — so it doesn’t matter which form you have. Get any one and you have them all.
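Swapping the order of the blocks really is one easy multiplication. A minimal sketch, on a small made-up example with a two-by-two block followed by a one-by-one block: the matrix `Q` below is a permutation matrix, a hypothetical name for the matrix that just lists the same basis vectors in a new order, and its inverse is its transpose.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(M):
    return [list(column) for column in zip(*M)]

# A small form: a 2x2 block for the number 2, then a 1x1 block for 3.
J = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 3]]

# Permutation matrix whose columns list the basis vectors in the new order
# (third, first, second). Its inverse is its transpose.
Q = [[0, 1, 0],
     [0, 0, 1],
     [1, 0, 0]]

swapped = matmul(transpose(Q), matmul(J, Q))
print(swapped)   # [[3, 0, 0], [0, 2, 1], [0, 0, 2]]
```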
I haven’t said anything about what these numbers on the diagonal are. They’re the eigenvalues of the original matrix. I hope that clears things up.
Yeah, not to anyone who didn’t know what a Jordan Canonical Form was to start with. Rather than get into calculations let me go to well-established metaphor. Take a sample of an unknown chemical and set it on fire. Put the light from this through a prism and photograph the spectrum. There will be lines, interruptions in the progress of colors. The locations of those lines and how intense they are tell you what the chemical is made of, and in what proportions. These are much like the eigenvectors and eigenvalues of a matrix. The eigenvectors tell you what the matrix is made of, and the eigenvalues how much of the matrix is those. This stuff gets you very far in proving a lot of great stuff. And part of what makes the Jordan Canonical Form great is that you get the eigenvalues right there in neat order, right where anyone can see them.
So! All that’s left is finding the things. The best way to find the Jordan Canonical Form for a given matrix is to become an instructor for a class on linear algebra and assign it as homework. The second-best way is to give the problem to your TA, who will type it in to Mathematica and return the result. It’s too much work to do most of the time. Almost all the stuff you could learn from having the thing in the Jordan Canonical Form you work out in the process of finding the matrix that would let you calculate what the Jordan Canonical Form is. And once you had that, why go on?
Where the Jordan Canonical Form shines is in doing proofs about what matrices can do. We can always put a square matrix into a Jordan Canonical Form. So if we want to show something is true about matrices in general, we can show that it’s true for the simpler-to-work-with Jordan Canonical Form. Then show that shifting a matrix to or from the Jordan Canonical Form doesn’t change whether the thing we’re interested in is true. It exists in that strange space: it is quite useful, but never on a specific problem.
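As a small worked instance of that strategy, here is a sketch for the determinant. The symbols are labels chosen for this illustration: $A$ is the original matrix, $P$ the transformation matrix, $J = P^{-1} A P$ the Jordan Canonical Form, and $\lambda_1, \ldots, \lambda_n$ the numbers on the diagonal of $J$. It leans on two standard facts: the determinant of a product is the product of the determinants, and a matrix with nothing below the diagonal has the product of its diagonal as its determinant.

\[
\det J = \det(P^{-1} A P) = \det(P^{-1}) \, \det(A) \, \det(P) = \frac{1}{\det P} \, \det(A) \, \det(P) = \det A,
\qquad
\det J = \lambda_1 \lambda_2 \cdots \lambda_n .
\]

So the determinant of a square matrix is the product of its eigenvalues, counted with repetition and allowing complex-valued ones. The first chain shows the shift to the Jordan Canonical Form doesn’t change the determinant. The second is the easy calculation on the simpler form.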
To my surprise nobody requested any terms beginning with ‘R’ for this A To Z. So I take this free day to pick on a concept I’d imagine nobody saw coming.
Riemann Sphere.
We need to start with the complex plane. This is just, well, a plane. All the points on the plane correspond to a complex-valued number. That’s a real number plus a real number times i. And i is one of those numbers which, squared, equals -1. It’s like the real number line, only in two directions at once.
Take that plane. Now put a sphere on it. The sphere has radius one-half. And it sits on top of the plane. Its lowest point, the south pole, sits on the origin. That’s whatever point corresponds to the number 0 + 0i, or as humans know it, “zero”.
We’re going to do something amazing with this. We’re going to make a projection, something that matches each point on the sphere with a point on the plane, and vice-versa. In other words, we can match every complex-valued number to one point on the sphere. And (with one exception, which we’ll get to) every point on the sphere to one complex-valued number. Here’s how.
Imagine sitting at the north pole. And imagine that you can see through the sphere. Pick any point on the plane. Look directly at it. Shine a laser beam, if that helps you pick the point out. The laser beam is going to go into the sphere — you’re squatting down to better look through the sphere — and come out somewhere on the sphere, before going on to the point in the plane. The point where the laser beam emerges? That’s the mapping of the point on the plane to the sphere.
There’s one point with an obvious match. The south pole is going to match zero. They touch, after all. Other points … it’s less obvious. But some are easy enough to work out. The equator of the sphere, for instance, is going to match all the points a distance of 1 from the origin. So it’ll have the point matching the number 1 on it. It’ll also have the point matching the number -1, and the point matching i, and the point matching -i. And some other numbers.
All the numbers that are less than 1 from the origin, in fact, will have matches somewhere in the southern hemisphere. If you don’t see why that is, draw some sketches and think about it. You’ll convince yourself. If you write down what convinced you and sprinkle the word “continuity” in here and there, you’ll convince a mathematician. (WARNING! Don’t actually try getting through your Intro to Complex Analysis class doing this. But this is what you’ll be doing.)
What about the numbers more than 1 from the origin? … Well, they all match to points on the northern hemisphere. And tell me that doesn’t stagger you. It’s one thing to match the southern hemisphere to all the points in a circle of radius 1 away from the origin. But we can match everything outside that little circle to the northern hemisphere. And it all fits in!
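The laser-beam construction can be turned into a formula, and here is a sketch of it. The function name `plane_to_sphere` and the coordinate choices are assumptions for this illustration: the plane is at height 0, the sphere of radius one-half rests on the origin, and the north pole is the point (0, 0, 1). The line from the north pole to the number z meets the sphere a fraction 1/(1 + |z|²) of the way along.

```python
def plane_to_sphere(z: complex):
    """Match a complex number to a point (x, y, height) on the sphere of
    radius 1/2 resting on the origin, sighting from the north pole (0, 0, 1)."""
    r2 = abs(z) ** 2
    t = 1 / (1 + r2)   # fraction of the way from the pole to z
    return (t * z.real, t * z.imag, r2 * t)

print(plane_to_sphere(0j))   # (0.0, 0.0, 0.0), the south pole

# Numbers a distance 1 from the origin land at height 0.5, the equator.
for z in (1 + 0j, -1 + 0j, 1j, -1j):
    print(z, plane_to_sphere(z)[2])

# Closer than 1: southern hemisphere. Farther than 1: northern hemisphere.
print(plane_to_sphere(0.5 + 0j)[2] < 0.5, plane_to_sphere(3 + 4j)[2] > 0.5)
```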
Not amazed enough? How about this: draw a circle on the plane. Then look at the points on the Riemann sphere that match it. That set of points? It’s also a circle. A line on the plane? That’s a circle on the sphere too, one that runs through the north pole. (If the line passes through the origin, it’s a geodesic. That’s the thing that looks like a line, on spheres.)
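A numerical spot-check of the circles claim, as a sketch and not a proof. It assumes the same setup as the construction above (sphere of radius one-half resting on the origin, north pole at (0, 0, 1)), and the helper names and the particular circle are made up for the illustration. Points on a sphere that all lie in one flat plane form a circle, so the code checks that the matched points are coplanar.

```python
import math

def plane_to_sphere(z: complex):
    """Match a complex number to a point on the sphere (assumed setup)."""
    r2 = abs(z) ** 2
    t = 1 / (1 + r2)
    return (t * z.real, t * z.imag, r2 * t)

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Sample a circle in the plane: centre 2 + 1j, radius 1.5 (arbitrary choices).
centre, radius = 2 + 1j, 1.5
angles = [2 * math.pi * k / 12 for k in range(12)]
images = [plane_to_sphere(centre + radius * complex(math.cos(a), math.sin(a)))
          for a in angles]

# Three of the images fix a flat plane in space; measure how far the
# others stray from it.
normal = cross(sub(images[4], images[0]), sub(images[8], images[0]))
worst = max(abs(dot(normal, sub(p, images[0]))) for p in images)
print(worst)   # zero up to rounding error
```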
How about this? Take a pair of intersecting lines or circles in the plane. Look at what they map to. That mapping, squashed as it might be to the northern hemisphere of the sphere? The projection of the lines or circles will intersect at the same angles as the original. As much as space gets stretched out (near the south pole) or squashed down (near the north pole), angles stay intact.
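The angle claim can be spot-checked the same way. This sketch assumes the same setup (sphere of radius one-half resting on the origin, north pole at (0, 0, 1)). The crossing point, the directions, and the step size `h` are arbitrary choices for the illustration. It takes two lines crossing at an angle of 1 radian, follows each a tiny step, and measures the angle between the matching steps on the sphere.

```python
import math

def plane_to_sphere(z: complex):
    """Match a complex number to a point on the sphere (assumed setup)."""
    r2 = abs(z) ** 2
    t = 1 / (1 + r2)
    return (t * z.real, t * z.imag, r2 * t)

def step(p, q):
    """The vector from point p to point q."""
    return tuple(b - a for a, b in zip(p, q))

def angle_between(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    length_u = math.sqrt(sum(a * a for a in u))
    length_v = math.sqrt(sum(b * b for b in v))
    return math.acos(dot / (length_u * length_v))

z = 1.5 + 0.5j                               # where the two lines cross
d1 = complex(1, 0)                           # direction of the first line
d2 = complex(math.cos(1.0), math.sin(1.0))   # second line, 1 radian away
h = 1e-6                                     # a tiny step along each line

base = plane_to_sphere(z)
u = step(base, plane_to_sphere(z + h * d1))
v = step(base, plane_to_sphere(z + h * d2))
angle_on_sphere = angle_between(u, v)
print(angle_on_sphere)   # close to 1.0
```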
OK, but besides being stunning, what good is all this?
Well, one is that it’s a good thing to learn on. Geometry gets interested in things that look, at least in places, like planes, but aren’t necessarily. These spheres are, and the way a sphere matches a plane is obvious. We can learn the tools for geometry on the Möbius strip or the Klein bottle or other exotic creations by the tools we prove out on this.
And then physics comes in, being all weird. Much of quantum mechanics makes sense if you imagine it as things on the sphere. (I admit I don’t know exactly how. I went to grad school in mathematics, not in physics, and I didn’t get to the physics side of mathematics much at that time.) The strange ways distance can get mushed up or stretched out have echoes in relativity. They’ll continue having these echoes in other efforts to explain physics as geometry, the way that string theory will.
Also important is that the sphere has a top, the north pole. That point matches … well, what? It’s got to be something infinitely far away from the origin. And this makes sense. We can use this projection to make a logically coherent, sensible description of things “approaching infinity”, the way we want to when we first learn about infinitely big things. Wrapping all the complex-valued numbers to this ball makes the vast manageable.
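To watch numbers approach the north pole, here is a short sketch. It assumes the sphere of radius one-half resting on the origin, so the north pole is at height 1, and the function name is invented for the illustration. A number a distance d from the origin matches a point at height d²/(1 + d²), which creeps up toward 1 without reaching it.

```python
def height_on_sphere(distance: float) -> float:
    """Height above the plane of the sphere point matched to a number this
    far from the origin (sphere of radius 1/2, north pole at height 1)."""
    return distance ** 2 / (1 + distance ** 2)

for distance in (1, 10, 1_000, 1_000_000):
    gap = 1 - height_on_sphere(distance)
    print(distance, gap)   # the gap to the north pole keeps shrinking
```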
It’s also good numerical practice. Computer simulations have problems with infinitely large things, for the obvious reason. We have a couple of tools to handle this. One is to model a really big but not infinitely large space and hope we aren’t breaking anything. One is to create a “tiling”, making the space we are able to simulate repeat itself in a perfect grid forever and ever. But recasting the problem from the infinitely large plane onto the sphere can also work. This requires some ingenuity, to be sure we do the recasting correctly, but that’s all right. If we need to run a simulation over all of space, we can often get away with doing a simulation on a sphere. And isn’t that also grand?
The Riemann named here is Bernhard Riemann, yet another of those absurdly prolific 19th century mathematicians, especially considering how young he was when he died. His name is all over the fundamentals of analysis and geometry. When you take Introduction to Calculus you get introduced pretty quickly to the Riemann Sum, which is how we first learn how to calculate integrals. It’s that guy. General relativity, and much of modern physics, is based on advanced geometries that again fall back on principles Riemann noticed or set out or described so well that we still think of them as he discovered.
So with several examples I’ve managed to prove what nobody really questioned, that it’s possible to imagine a complicated curve like the route of the New York Thruway and assign to all the points on it, or at least to the center line of the road, a unique number that no other point on the road has. And, more, it’s possible to assign these unique numbers in many different ways, from any lower bound we like to any upper bound we like. It’s a nice system, particularly if we’re short on numbers to tell us when we approach Loudonville.
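That “any lower bound to any upper bound” business can be sketched as a one-line relabeling. The function name, the 496-mile figure used for the length, and the whole-mile sample points are simplifying assumptions for the illustration. Each range gives every sampled point its own number.

```python
ROAD_MILES = 496   # approximate length of the road, assumed for this sketch

def label(miles_from_start: float, low: float, high: float) -> float:
    """Assign a number between low and high to the point this far along."""
    return low + (high - low) * miles_from_start / ROAD_MILES

mile_markers = range(0, ROAD_MILES + 1)
for low, high in [(0, 496), (0, 1), (-1000, 1000)]:
    labels = [label(m, low, high) for m in mile_markers]
    all_different = len(set(labels)) == len(labels)
    print(low, high, all_different)
```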
But I’m feeling ambitious right now and want to see how ridiculously huge, positive or negative, a number I can assign to some point on the road. Since we’d measured distances from a reference point by miles before and got a range of about 500, or by millimeters and got a range of about 800,000,000, obviously we could get to any number, however big or small, just by measuring distance using the appropriate unit: lay megaparsecs or angstroms down on the Thruway, or even use some awkward or contrived units. I want to shoot for infinitely big numbers. I’ll start by dividing the road in two.
After all, there are two halves to the Thruway, a northern and a southern end, both arranged like upside-down u’s across the state. Instead of thinking of the center line of the whole Thruway, then, think of the center lines of the northern road and of the southern. They’re both about the same 496-mile length, but it’d be remarkable if they were exactly the same length. Let’s suppose the northern belt is 497 miles, and the southern 495. Pretty naturally the northern belt we can give numbers from 0 to 497, based on how far they are from the south-eastern end of the road; similarly, the southern belt gets numbers from 0 to 495, from the same reference point.