My All 2020 Mathematics A to Z: Complex Numbers


Mr Wu, author of the Singapore Maths Tuition blog, suggested complex numbers for a theme. I wrote long ago a bit about what complex numbers are and how to work with them. But that hardly exhausts the subject, and I’m happy revisiting it.

Color cartoon illustration of a coati in a beret and neckerchief, holding up a director's megaphone and looking over the Hollywood hills. The megaphone has the symbols + x (division obelus) and = on it. The Hollywood sign is, instead, the letters MATHEMATICS. In the background are spotlights, with several of them crossing so as to make the letters A and Z; one leg of the spotlights has 'TO' in it, so the art reads out, subtly, 'Mathematics A to Z'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Complex Numbers.

A throwaway joke somewhere in The Hitchhiker’s Guide To The Galaxy has Marvin The Paranoid Android grumble that he’s invented a square root for minus one. Marvin’s gone and rejiggered all of mathematics while waiting for something better to do. Nobody cares. It reminds us while Douglas Adams established much of a particular generation of nerd humor, he was not himself a nerd. The nerds who read The Hitchhiker’s Guide To The Galaxy obsessively know we already did that, centuries ago. Marvin’s creation was as novel as inventing “one-half”. (It may be that Adams knew, and intended Marvin working so hard on the already known as the joke.)

Anyone who’d read a pop mathematics blog like this likely knows the rough story of complex numbers in Western mathematics. The desire to find roots of polynomials. The discovery of formulas to find roots. Polynomials with numbers whose formulas demanded the square roots of negative numbers. And the discovery that sometimes, if you carried on as if the square root of a negative number made sense, the ugly terms vanished. And you got correct answers in the end. And, eventually, mathematicians relented. These things were unsettling enough to get unflattering names. To call a number “imaginary” may be more pejorative than even “negative”. It hints at the treatment of these numbers as falsework, never to be shown in the end. To call the sum of a “real” number and an “imaginary” “complex” is to warn. An expert might use these numbers only with care and deliberation. But we can count them as numbers.

I mentioned when writing about quaternions how when I learned of complex numbers I wanted to do the same trick again. My suspicion is many mathematicians do. The example of complex numbers teases us with the possibilites of other numbers. If we’ve defined \imath to be “a number that, squared, equals minus one”, what next? Could we define a \sqrt{\imath} ? How about a \log{\imath} ? Maybe something else? An arc-cosine of \imath ?

You can try any of these. They turn out to be redundant. The real numbers and \imath already let you describe any of those new numbers. You might have a flash of imagination: what if there were another number that, squared, equalled minus one, and that wasn’t equal to \imath ? Numbers that look like a + b\imath + c\jmath ? Here, and later on, a and b and c are some real numbers. b\imath means “multiply the real number b by whatever \imath is”, and we trust that this makes sense. There’s a similar setup for c and \jmath . And if you just try that, with a + b\imath + c\jmath , you get some interesting new mathematics. Then you get stuck on what the product of these two different square roots should be.

If you think of that. If all you think of is addition and subtraction and maybe multiplication by a real number? a + b\imath + c\jmath works fine. You only spot trouble if you happen to do multiplication. Granted, multiplication is to us not an exotic operation. Take that as a warning, though, of how trouble could develop. How do we know, say, that complex numbers are fine as long as you don’t try to take the log of the haversine of one of them, or some other obscurity? And that then they produce gibberish? Or worse, produce that most dread construct, a contradiction?

Here I am indebted to an essay that ten minutes ago I would have sworn was in one of the two books I still have out from the university library. I’m embarrassed to learn my error. It was about the philosophy of complex numbers and it gave me fresh perspectives. When the university library reopens for lending I will try to track back through my borrowing and find the original. I suspect, without confirming, that it may have been in Reuben Hersh’s What Is Mathematics, Really?.

The insight is that we can think of complex numbers in several ways. One fruitful way is to match complex numbers with points in a two-dimensional space. It’s common enough to pair, for example, the number 3 + 4\imath with the point at Cartesian coordinates (3, 4) . Mathematicians do this so often it can take a moment to remember that is just a convention. And there is a common matching between points in a Cartesian coordinate system and vectors. Chaining together matches like this can worry. Trust that we believe our matches are sound. Then we notice that adding two complex numbers does the same work as adding ordered coordinate pairs. If we trust that adding coordinate pairs makes sense, then we need to accept that adding complex numbers makes sense. Adding coordinate pairs is the same work as adding real numbers. It’s just a lot of them. So we’re lead to trust that if addition for real numbers works then addition for complex numbers does.

Multiplication looks like a mess. A different perspective helps us. A different way to look at where point are on the plane is to use polar coordinates. That is, the distance a point is from the origin, and the angle between the positive x-axis and the line segment connecting the origin to the point. In this format, multiplying two complex numbers is easy. Let the first complex number have polar coordinates (r_1, \theta_1) . Let the second have polar coordinates (r_2, \theta_2) . Their product, by the rules of complex numbers, is a point with polar coordinates (r_1\cdot r_2, \theta_1 + \theta_2) . These polar coordinates are real numbers again. If we trust addition and multiplication of real numbers, we can trust this for complex numbers.

If we’re confident in adding complex numbers, and confident in multiplying them, then … we’re in quite good shape. If we can add and multiply, we can do polynomials. And everything is polynomials.

We might feel suspicious yet. Going from complex numbers to points in space is calling on our geometric intuitions. That might be fooling ourselves. Can we find a different rationalization? The same result by several different lines of reasoning makes the result more believable. Is there a rationalization for complex numbers that never touches geometry?

We can. One approach is to use the mathematics of matrices. We can match the complex number a + b\imath to the sum of the matrices

a \left[\begin{tabular}{c c} 1 & 0 \\ 0 & 1 \end{tabular}\right] + b \left[\begin{tabular}{c c} 0 & 1 \\ -1 & 0  \end{tabular}\right]

Adding matrices is compelling. It’s the same work as adding ordered pairs of numbers. Multiplying matrices is tedious, though it’s not so bad for matrices this small. And it’s all done with real-number multiplication and addition. If we trust that the real numbers work, we can trust complex numbers do. If we can show that our new structure can be understood as a configuration of the old, we convince ourselves the new structure is meaningful.

The process by which we learn to trust them as numbers, guides us to learning how to trust any new mathematical structure. So here is a new thing that complex numbers can teach us, years after we have learned how to divide them. Do not attempt to divide complex numbers. That’s too much work.

The Summer 2017 Mathematics A To Z: Volume Forms


I’ve been reading Elke Stangl’s Elkemental Force blog for years now. Sometimes I even feel social-media-caught-up enough to comment, or at least to like posts. This is relevant today as I discuss one of the Stangl’s suggestions for my letter-V topic.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Volume Forms.

So sometime in pre-algebra, or early in (high school) algebra, you start drawing equations. It’s a simple trick. Lay down a coordinate system, some set of axes for ‘x’ and ‘y’ and maybe ‘z’ or whatever letters are important. Look to the equation, made up of x’s and y’s and maybe z’s and so. Highlight all the points with coordinates whose values make the equation true. This is the logical basis for saying (eg) that the straight line “is” y = 2x + 1 .

A short while later, you learn about polar coordinates. Instead of using ‘x’ and ‘y’, you have ‘r’ and ‘θ’. ‘r’ is the distance from the center of the universe. ‘θ’ is the angle made with respect to some reference axis. It’s as legitimate a way of describing points in space. Some classrooms even have a part of the blackboard (whiteboard, whatever) with a polar-coordinates “grid” on it. This looks like the lines of a dartboard. And you learn that some shapes are easy to describe in polar coordinates. A circle, centered on the origin, is ‘r = 2’ or something like that. A line through the origin is ‘θ = 1’ or whatever. The line that we’d called y = 2x + 1 before? … That’s … some mess. And now r = 2\theta + 1 … that’s not even a line. That’s some kind of spiral. Two spirals, really. Kind of wild.

And something to bother you a while. y = 2x + 1 is an equation that looks the same as r = 2\theta + 1 . You’ve changed the names of the variables, but not how they relate to each other. But one is a straight line and the other a spiral thing. How can that be?

The answer, ultimately, is that the letters in the equations aren’t these content-neutral labels. They carry meaning. ‘x’ and ‘y’ imply looking at space a particular way. ‘r’ and ‘θ’ imply looking at space a different way. A shape has different representations in different coordinate systems. Fair enough. That seems to settle the question.

But if you get to calculus the question comes back. You can integrate over a region of space that’s defined by Cartesian coordinates, x’s and y’s. Or you can integrate over a region that’s defined by polar coordinates, r’s and θ’s. The first time you try this, you find … well, that any region easy to describe in Cartesian coordinates is painful in polar coordinates. And vice-versa. Way too hard. But if you struggle through all that symbol manipulation, you get … different answers. Eventually the calculus teacher has mercy and explains. If you’re integrating in Cartesian coordinates you need to use “dx dy”. If you’re integrating in polar coordinates you need to use “r dr dθ”. If you’ve never taken calculus, never mind what this means. What is important is that “r dr dθ” looks like three things multiplied together, while “dx dy” is two.

We get this explained as a “change of variables”. If we want to go from one set of coordinates to a different one, we have to do something fiddly. The extra ‘r’ in “r dr dθ” is what we get going from Cartesian to polar coordinates. And we get formulas to describe what we should do if we need other kinds of coordinates. It’s some work that introduces us to the Jacobian, which looks like the most tedious possible calculation ever at that time. (In Intro to Differential Equations we learn we were wrong, and the Wronskian is the most tedious possible calculation ever. This is also wrong, but it might as well be true.) We typically move on after this and count ourselves lucky it got no worse than that.

None of this is wrong, even from the perspective of more advanced mathematics. It’s not even misleading, which is a refreshing change. But we can look a little deeper, and get something good from doing so.

The deeper perspective looks at “differential forms”. These are about how to encode information about how your coordinate system represents space. They’re tensors. I don’t blame you for wondering if they would be. A differential form uses interactions between some of the directions in a space. A volume form is a differential form that uses all the directions in a space. And satisfies some other rules too. I’m skipping those because some of the symbols involved I don’t even know how to look up, much less make WordPress present.

What’s important is the volume form carries information compactly. As symbols it tells us that this represents a chunk of space that’s constant no matter what the coordinates look like. This makes it possible to do analysis on how functions work. It also tells us what we would need to do to calculate specific kinds of problem. This makes it possible to describe, for example, how something moving in space would change.

The volume form, and the tools to do anything useful with it, demand a lot of supporting work. You can dodge having to explicitly work with tensors. But you’ll need a lot of tensor-related materials, like wedge products and exterior derivatives and stuff like that. If you’ve never taken freshman calculus don’t worry: the people who have taken freshman calculus never heard of those things either. So what makes this worthwhile?

Yes, person who called out “polynomials”. Good instinct. Polynomials are usually a reason for any mathematics thing. This is one of maybe four exceptions. I have to appeal to my other standard answer: “group theory”. These volume forms match up naturally with groups. There’s not only information about how coordinates describe a space to consider. There’s ways to set up coordinates that tell us things.

That isn’t all. These volume forms can give us new invariants. Invariants are what mathematicians say instead of “conservation laws”. They’re properties whose value for a given problem is constant. This can make it easier to work out how one variable depends on another, or to work out specific values of variables.

For example, classical physics problems like how a bunch of planets orbit a sun often have a “symplectic manifold” that matches the problem. This is a description of how the positions and momentums of all the things in the problem relate. The symplectic manifold has a volume form. That volume is going to be constant as time progresses. That is, there’s this way of representing the positions and speeds of all the planets that does not change, no matter what. It’s much like the conservation of energy or the conservation of angular momentum. And this has practical value. It’s the subject that brought my and Elke Stangl’s blogs into contact, years ago. It also has broader applicability.

There’s no way to provide an exact answer for the movement of, like, the sun and nine-ish planets and a couple major moons and all that. So there’s no known way to answer the question of whether the Earth’s orbit is stable. All the planets are always tugging one another, changing their orbits a little. Could this converge in a weird way suddenly, on geologic timescales? Might the planet might go flying off out of the solar system? It doesn’t seem like the solar system could be all that unstable, or it would have already. But we can’t rule out that some freaky alignment of Jupiter, Saturn, and Halley’s Comet might not tweak the Earth’s orbit just far enough for catastrophe to unfold. Granted there’s nothing we could do about the Earth flying out of the solar system, but it would be nice to know if we face it, we tell ourselves.

But we can answer this numerically. We can set a computer to simulate the movement of the solar system. But there will always be numerical errors. For example, we can’t use the exact value of π in a numerical computation. 3.141592 (and more digits) might be good enough for projecting stuff out a day, a week, a thousand years. But if we’re looking at millions of years? The difference can add up. We can imagine compensating for not having the value of π exactly right. But what about compensating for something we don’t know precisely, like, where Jupiter will be in 16 million years and two months?

Symplectic forms can help us. The volume form represented by this space has to be conserved. So we can rewrite our simulation so that these forms are conserved, by design. This does not mean we avoid making errors. But it means we avoid making certain kinds of errors. We’re more likely to make what we call “phase” errors. We predict Jupiter’s location in 16 million years and two months. Our simulation puts it thirty degrees farther in its circular orbit than it actually would be. This is a less serious mistake to make than putting Jupiter, say, eight-tenths as far from the Sun as it would really be.

Volume forms seem, at first, a lot of mechanism for a small problem. And, unfortunately for students, they are. They’re more trouble than they’re worth for changing Cartesian to polar coordinates, or similar problems. You know, ones that the student already has some feel for. They pay off on more abstract problems. Tracking the movement of a dozen interacting things, say, or describing a space that’s very strangely shaped. Those make the effort to learn about forms worthwhile.

Why Stuff Can Orbit, Part 1: Laying Some Groundwork


My recent talking about central forces got me going. There’s interesting stuff about what central forces allow things to orbit one another, and what forces allow for closed orbits. And I feel like trying out a bit of real mathematics, the kind that physics majors do as undergraduates, around here. I should get something for the student loans I’m still paying off and I’ll accept “showing off on my meager little blog here” as something.

Central forces are, uh, forces. Pairs of particles attract each other. The strength of the attraction depends on how far apart they are. The direction of the attraction is exactly towards the other in the pair. So it works like gravity or electric attraction. It might follow a different rule, although I know I’m going to casually refer to things as “gravity” or “gravitational” because that’s just too familiar a reference. I’m formally talking about a problem in classical mechanics, but the ideas and approaches come from orbital mechanics. The language of orbital mechanics comes along with it.

And it is too possible that the force would point some other way. Electric charges in a magnetic field feel a force perpendicular to the magnet. And we can represent vortices, things that swirl around the way cyclones do, as particles pushing each other in perpendicular directions. We’re not going to deal with those.

The easiest kind of orbit to find is a circular one, made by a single pair of particles. I so want to describe that, but if I do, I’m just going to make things more confusing. It’s an orbit that’s a circle. And we’re sticking to a single pair of particles because it turns out it’s easy to describe the central-force movement of two particles. And it’s kind of impossible to describe the central-force movement three particles. So, let’s stick to two.

When we start thinking about what we need to describe the system it’s easy to despair. We need the x, y, and z coordinates for two particles. Plus there’s the mass of both particles. Plus there’s some gravitational constant, however strong the force itself is. That’s at least nine things to keep track of.

We don’t need all that. Physics helps us. Ever hear of the Conservation of Angular Momentum? It’s that thing that makes an ice skater twirling around speed up by pulling in his arms and slow down by reaching them out again. In an argument I’m not dealing with here, the Conservation of Angular Momentum tells us the two particles are going to keep to a single plane. They can move together or apart, but they’ll trace out paths in a two-dimensional slice of space. We can, without loss of generality, suppose it to be the horizontal plane. That is, that the z-coordinate for both planets starts as zero and stays there. So we’re down to seven things to keep track of.

We can simplify some other stuff. For example, suppose we have one really big mass and one really small one: a sun and a planet, or a planet and a satellite. The sun isn’t going to move very much; the planet hasn’t got enough gravity to matter. We can pretend the sun doesn’t move. We’ll make a little error, but it’ll be small enough we don’t have to care. So we’re down to five things to keep track of.

And we’ll do better. The strength of the attractive force isn’t going to change because we don’t need a universe that complicated. The mass of the sun and the planet? Well, that could change, if we wanted to work out how rockets behave. We don’t. So their masses are not going to change. So that’s three things whose value we might not have, but which aren’t going to change. We’ll give those numbers labels that will be letters, but there’s nothing to keep track of. They don’t change. We only have to worry about the x- and y-coordinates of the planet.

But we don’t even have to do that, not really. The force between the sun and the planet depends on how far apart they are. This almost begs us to use polar coordinates instead of Cartesian coordinates. In polar coordinates we identify a point by two things. First is how far it is from the origin. Second is what angle the line from the origin to that point makes with some reference line. And if we’re looking for a circular orbit, then we don’t care what the angle is. It’s going to start at some arbitrary value and increase (or decrease) steadily in time. We don’t have to keep track of it. The only thing that changes that we have to keep track of is the distance between the sun and the planet. Since this is a distance, we naturally call this ‘r’. Well, it’s the radius of the circle traced out by the planet. That’s why it makes sense.

So we have one thing that changes, r. And we have a couple things whose value we don’t know, but which aren’t going to change during the problem. This is getting manageable. (Later on, when we’ll want to allow for elliptic or other funny-shaped orbits, we’ll need an angle θ. But by then we’ll be so comfortable with one variable we’ll be looking to get the thrill of the challenge back.)

When I pick this up again I mean to introduce all the kinds of central forces that we might possibly look at. And then how right away we can see there’s no such thing as an orbit. Should be fun.

%d bloggers like this: