## As I Try To Make Wronski’s Formula For Pi Into Something I Like

Previously:

I remain fascinated with Józef Maria Hoëne-Wronski’s attempted definition of π. It had started out like this:

$\pi = \frac{4\infty}{\sqrt{-1}}\left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$

And I’d translated that into something that modern mathematicians would accept without flinching. That is to evaluate the limit of a function that looks like this:

$\displaystyle \lim_{x \to \infty} f(x)$

where

$f(x) = -4 \imath x \left\{ \left(1 + \imath\right)^{\frac{1}{x}} - \left(1 - \imath\right)^{\frac{1}{x}} \right\}$

So. I don’t want to deal with that f(x) as it’s written. I can make it better. One thing that bothers me is seeing the complex number $1 + \imath$ raised to a power. I’d like to work with something simpler than that. And I can’t see that number without also noticing that I’m subtracting from it $1 - \imath$ raised to the same power. $1 + \imath$ and $1 - \imath$ are a “conjugate pair”. It’s usually nice to see those. It often hints at ways to make your expression simpler. That’s one of those patterns you pick up from doing a lot of problems as a mathematics major, and that then look like magic to the lay audience.

Here’s the first way I figure to make my life simpler. It’s in rewriting that $1 + \imath$ and $1 - \imath$ stuff so it’s simpler. It’ll be simpler by using exponentials. Shut up, it will too. I get there through Gauss, Descartes, and Euler.

At least I think it was Gauss who pointed out how you can match complex-valued numbers with points on the two-dimensional plane. On a sheet of graph paper, if you like. The number $1 + \imath$ matches to the point with x-coordinate 1, y-coordinate 1. The number $1 - \imath$ matches to the point with x-coordinate 1, y-coordinate -1. Yes, yes, this doesn’t sound like much of an insight Gauss had, but his work goes on. I’m leaving it off here because that’s all that I need for right now.

So these two numbers that offended me I can think of as points. They have Cartesian coordinates (1, 1) and (1, -1). But there’s never only one coordinate system for something. There may be only one that’s good for the problem you’re doing. I mean that makes the problem easier to study. But there are always infinitely many choices. For points on a flat surface like a piece of paper, and where the points don’t represent any particular physics problem, there’s two good choices. One is the Cartesian coordinates. In it you refer to points by an origin, an x-axis, and a y-axis. How far is the point from the origin in a direction parallel to the x-axis? (And in which direction? This gives us a positive or a negative number) How far is the point from the origin in a direction parallel to the y-axis? (And in which direction? Same positive or negative thing.)

The other good choice is polar coordinates. For that we need an origin and a positive x-axis. We refer to points by how far they are from the origin, heedless of direction. And then to get direction, what angle the line segment connecting the point with the origin makes with the positive x-axis. The first of these numbers, the distance, we normally label ‘r’ unless there’s compelling reason otherwise. The other we label ‘θ’. ‘r’ is always going to be a positive number or, possibly, zero. ‘θ’ might be any number, positive or negative. By convention, we measure angles so that positive numbers are counterclockwise from the x-axis. I don’t know why. I guess it seemed less weird for, say, the point with Cartesian coordinates (0, 1) to have a positive angle rather than a negative angle. That angle would be $\frac{\pi}{2}$, because mathematicians like radians more than degrees. They make other work easier.

So. The point $1 + \imath$ corresponds to the polar coordinates $r = \sqrt{2}$ and $\theta = \frac{\pi}{4}$. The point $1 - \imath$ corresponds to the polar coordinates $r = \sqrt{2}$ and $\theta = -\frac{\pi}{4}$. Yes, the θ coordinates being negative one times each other is common in conjugate pairs. Also, if you have doubts about my use of the word “the” before “polar coordinates”, well-spotted. If you’re not sure about that thing where ‘r’ is not negative, again, well-spotted. I intend to come back to that.

With the polar coordinates ‘r’ and ‘θ’ to describe a point I can go back to complex numbers. I can match the point to the complex number with the value given by $r e^{\imath\theta}$, where ‘e’ is that old 2.71828something number. Superficially, this looks like a big dumb waste of time. I had some problem with imaginary numbers raised to powers, so now, I’m rewriting things with a number raised to imaginary powers. Here’s why it isn’t dumb.

It’s easy to raise a number written like this to a power. $r e^{\imath\theta}$ raised to the n-th power is going to be equal to $r^n e^{\imath\theta \cdot n}$. (Because $(a \cdot b)^n = a^n \cdot b^n$ and we’re going to go ahead and assume this stays true if ‘b’ is a complex-valued number. It does, but you’re right to ask how we know that.) And this turns into raising a real-valued number to a power, which we know how to do. And it involves dividing a number by that power, which is also easy.

And we can get back to something that looks like $1 + \imath$ too. That is, something that’s a real number plus $\imath$ times some real number. This is through one of the many Euler’s Formulas. The one that’s relevant here is that $e^{\imath \phi} = \cos(\phi) + \imath \sin(\phi)$ for any real number ‘φ’. So, that’s true also for ‘θ’ times ‘n’. Or, looking to where everybody knows we’re going, also true for ‘θ’ divided by ‘x’.

OK, on to the people so anxious about all this. I talked about the angle made between the line segment that connects a point and the origin and the positive x-axis. “The” angle. “The”. If that wasn’t enough explanation of the problem, mention how your thinking’s done a 360 degree turn and you see it different now. In an empty room, if you happen to be in one. Your pedantic know-it-all friend is explaining it now. There’s an infinite number of angles that correspond to any given direction. They’re all separated by 360 degrees or, to a mathematician, 2π.

And more. What’s the difference between going out five units of distance in the direction of angle 0 and going out minus-five units of distance in the direction of angle -π? That is, between walking forward five paces while facing east and walking backward five paces while facing west? Yeah. So if we let ‘r’ be negative we’ve got twice as many infinitely many sets of coordinates for each point.

This complicates raising numbers to powers. θ times n might match with some point that’s very different from θ-plus-2-π times n. There might be a whole ring of powers. This seems … hard to work with, at least. But it’s, at heart, the same problem you get thinking about the square root of 4 and concluding it’s both plus 2 and minus 2. If you want “the” square root, you’d like it to be a single number. At least if you want to calculate anything from it. You have to pick out a preferred θ from the family of possible candidates.

For me, that’s whatever set of coordinates has ‘r’ that’s positive (or zero), and that has ‘θ’ between -π and π. Or between 0 and 2π. It could be any strip of numbers that’s 2π wide. Pick what makes sense for the problem you’re doing. It’s going to be the strip from -π to π. Perhaps the strip from 0 to 2π.

What this all amounts to is that I can turn this:

$f(x) = -4 \imath x \left\{ \left(1 + \imath\right)^{\frac{1}{x}} - \left(1 - \imath\right)^{\frac{1}{x}} \right\}$

into this:

$f(x) = -4 \imath x \left\{ \left(\sqrt{2} e^{\imath \frac{\pi}{4}}\right)^{\frac{1}{x}} - \left(\sqrt{2} e^{-\imath \frac{\pi}{4}} \right)^{\frac{1}{x}} \right\}$

without changing its meaning any. Raising a number to the one-over-x power looks different from raising it to the n power. But the work isn’t different. The function I wrote out up there is the same as this function:

$f(x) = -4 \imath x \left\{ \sqrt{2}^{\frac{1}{x}} e^{\imath \frac{\pi}{4}\cdot\frac{1}{x}} - \sqrt{2}^{\frac{1}{x}} e^{-\imath \frac{\pi}{4}\cdot\frac{1}{x}} \right\}$

I can’t look at that number, $\sqrt{2}^{\frac{1}{x}}$, sitting there, multiplied by two things added together, and leave that. (OK, subtracted, but same thing.) I want to something something distributive law something and that gets us here:

$f(x) = -4 \imath x \sqrt{2}^{\frac{1}{x}} \left\{ e^{\imath \frac{\pi}{4}\cdot\frac{1}{x}} - e^{- \imath \frac{\pi}{4}\cdot\frac{1}{x}} \right\}$

Also, yeah, that square root of two raised to a power looks weird. I can turn that square root of two into “two to the one-half power”. That gets to this rewrite:

$f(x) = -4 \imath x 2^{\frac{1}{2}\cdot \frac{1}{x}} \left\{ e^{\imath \frac{\pi}{4}\cdot\frac{1}{x}} - e^{- \imath \frac{\pi}{4}\cdot\frac{1}{x}} \right\}$

And then. Those parentheses. e raised to an imaginary number minus e raised to minus-one-times that same imaginary number. This is another one of those magic tricks that mathematicians know because they see it all the time. Part of what we know from Euler’s Formula, the one I waved at back when I was talking about coordinates, is this:

$\sin\left(\phi\right) = \frac{e^{\imath \phi} - e^{-\imath \phi}}{2\imath }$

That’s good for any real-valued φ. For example, it’s good for the number $\frac{\pi}{4}\cdot\frac{1}{x}$. And that means we can rewrite that function into something that, finally, actually looks a little bit simpler. It looks like this:

$f(x) = -2 x 2^{\frac{1}{2}\cdot \frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot \frac{1}{x}\right)$

And that’s the function whose limit I want to take at ∞. No, really.

## As I Try To Figure Out What Wronski Thought ‘Pi’ Was

A couple weeks ago I shared a fascinating formula for π. I got it from Carl B Boyer’s The History of Calculus and its Conceptual Development. He got it from Józef Maria Hoëne-Wronski, early 19th-century Polish mathematician. His idea was that an absolute, culturally-independent definition of π would come not from thinking about circles and diameters but rather this formula:

$\pi = \frac{4\infty}{\sqrt{-1}}\left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$

Now, this formula is beautiful, at least to my eyes. It’s also gibberish. At least it’s ungrammatical. Mathematicians don’t like to write stuff like “four times infinity”, at least not as more than a rough draft on the way to a real thought. What does it mean to multiply four by infinity? Is arithmetic even a thing that can be done on infinitely large quantities? Among Wronski’s problems is that they didn’t have a clear answer to this. We’re a little more advanced in our mathematics now. We’ve had a century and a half of rather sound treatment of infinitely large and infinitely small things. Can we save Wronski’s work?

Start with the easiest thing. I’m offended by those $\sqrt{-1}$ bits. Well, no, I’m more unsettled by them. I would rather have $\imath$ in there. The difference? … More taste than anything sound. I prefer, if I can get away with it, using the square root symbol to mean the positive square root of the thing inside. There is no positive square root of -1, so, pfaugh, away with it. Mere style? All right, well, how do you know whether those $\sqrt{-1}$ terms are meant to be $\imath$ or its additive inverse, $-\imath$? How do you know they’re all meant to be the same one? See? … As with all style preferences, it’s impossible to be perfectly consistent. I’m sure there are times I accept a big square root symbol over a negative or a complex-valued quantity. But I’m not forced to have it here so I’d rather not. First step:

$\pi = \frac{4\infty}{\imath}\left\{ \left(1 + \imath\right)^{\frac{1}{\infty}} - \left(1 - \imath\right)^{\frac{1}{\infty}} \right\}$

Also dividing by $\imath$ is the same as multiplying by $-\imath$ so the second easy step gives me:

$\pi = -4 \imath \infty \left\{ \left(1 + \imath\right)^{\frac{1}{\infty}} - \left(1 - \imath\right)^{\frac{1}{\infty}} \right\}$

Now the hard part. All those infinities. I don’t like multiplying by infinity. I don’t like dividing by infinity. I really, really don’t like raising a quantity to the one-over-infinity power. Most mathematicians don’t. We have a tool for dealing with this sort of thing. It’s called a “limit”.

Mathematicians developed the idea of limits over … well, since they started doing mathematics. In the 19th century limits got sound enough that we still trust the idea. Here’s the rough way it works. Suppose we have a function which I’m going to name ‘f’ because I have better things to do than give functions good names. Its domain is the real numbers. Its range is the real numbers. (We can define functions for other domains and ranges, too. Those definitions look like what they do here.)

I’m going to use ‘x’ for the independent variable. It’s any number in the domain. I’m going to use ‘a’ for some point. We want to know the limit of the function “at a”. ‘a’ might be in the domain. But — and this is genius — it doesn’t have to be. We can talk sensibly about the limit of a function at some point where the function doesn’t exist. We can say “the limit of f at a is the number L”. I hadn’t introduced ‘L’ into evidence before, but … it’s a number. It has some specific set value. Can’t say which one without knowing what ‘f’ is and what its domain is and what ‘a’ is. But I know this about it.

Pick any error margin that you like. Call it ε because mathematicians do. However small this (positive) number is, there’s at least one neighborhood in the domain of ‘f’ that surrounds ‘a’. Check every point in that neighborhood other than ‘a’. The value of ‘f’ at all those points in that neighborhood other than ‘a’ will be larger than L – ε and smaller than L + ε.

Yeah, pause a bit there. It’s a tricky definition. It’s a nice common place to crash hard in freshman calculus. Also again in Intro to Real Analysis. It’s not just you. Perhaps it’ll help to think of it as a kind of mutual challenge game. Try this.

1. You draw whatever error bar, as big or as little as you like, around ‘L’.
2. But I always respond by drawing some strip around ‘a’.
3. You then pick absolutely any ‘x’ inside my strip, other than ‘a’.
4. Is f(x) always within the error bar you drew?

Suppose f(x) is. Suppose that you can pick any error bar however tiny, and I can answer with a strip however tiny, and every single ‘x’ inside my strip has an f(x) within your error bar … then, L is the limit of f at a.

Again, yes, tricky. But mathematicians haven’t found a better definition that doesn’t break something mathematicians need.

To write “the limit of f at a is L” we use the notation:

$\displaystyle \lim_{x \to a} f(x) = L$

The ‘lim’ part probably makes perfect sense. And you can see where ‘f’ and ‘a’ have to enter into it. ‘x’ here is a “dummy variable”. It’s the falsework of the mathematical expression. We need some name for the independent variable. It’s clumsy to do without. But it doesn’t matter what the name is. It’ll never appear in the answer. If it does then the work went wrong somewhere.

What I want to do, then, is turn all those appearances of ‘∞’ in Wronski’s expression into limits of something at infinity. And having just said what a limit is I have to do a patch job. In that talk about the limit at ‘a’ I talked about a neighborhood containing ‘a’. What’s it mean to have a neighborhood “containing ∞”?

The answer is exactly what you’d think if you got this question and were eight years old. The “neighborhood of infinity” is “all the big enough numbers”. To make it rigorous, it’s “all the numbers bigger than some finite number that let’s just call N”. So you give me an error bar around ‘L’. I’ll give you back some number ‘N’. Every ‘x’ that’s bigger than ‘N’ has f(x) inside your error bars. And note that I don’t have to say what ‘f(∞)’ is or even commit to the idea that such a thing can be meaningful. I only ever have to think directly about values of ‘f(x)’ where ‘x’ is some real number.

So! First, let me rewrite Wronski’s formula as a function, defined on the real numbers. Then I can replace each ∞ with the limit of something at infinity and … oh, wait a minute. There’s three ∞ symbols there. Do I need three limits?

Ugh. Yeah. Probably. This can be all right. We can do multiple limits. This can be well-defined. It can also be a right pain. The challenge-and-response game needs a little modifying to work. You still draw error bars. But I have to draw multiple strips. One for each of the variables. And every combination of values inside all those strips has give an ‘f’ that’s inside your error bars. There’s room for great mischief. You can arrange combinations of variables that look likely to break ‘f’ outside the error bars.

So. Three independent variables, all taking a limit at ∞? That’s not guaranteed to be trouble, but I’d expect trouble. At least I’d expect something to keep the limit from existing. That is, we could find there’s no number ‘L’ so that this drawing-neighborhoods thing works for all three variables at once.

Let’s try. One of the ∞ will be a limit of a variable named ‘x’. One of them a variable named ‘y’. One of them a variable named ‘z’. Then:

$f(x, y, z) = -4 \imath x \left\{ \left(1 + \imath\right)^{\frac{1}{y}} - \left(1 - \imath\right)^{\frac{1}{z}} \right\}$

Without doing the work, my hunch is: this is utter madness. I expect it’s probably possible to make this function take on many wildly different values by the judicious choice of ‘x’, ‘y’, and ‘z’. Particularly ‘y’ and ‘z’. You maybe see it already. If you don’t, you maybe see it now that I’ve said you maybe see it. If you don’t, I’ll get there, but not in this essay. But let’s suppose that it’s possible to make f(x, y, z) take on wildly different values like I’m getting at. This implies that there’s not any limit ‘L’, and therefore Wronski’s work is just wrong.

Thing is, Wronski wouldn’t have thought that. Deep down, I am certain, he thought the three appearances of ∞ were the same “value”. And that to translate him fairly we’d use the same name for all three appearances. So I am going to do that. I shall use ‘x’ as my variable name, and replace all three appearances of ∞ with the same variable and a common limit. So this gives me the single function:

$f(x) = -4 \imath x \left\{ \left(1 + \imath\right)^{\frac{1}{x}} - \left(1 - \imath\right)^{\frac{1}{x}} \right\}$

And then I need to take the limit of this at ∞. If Wronski is right, and if I’ve translated him fairly, it’s going to be π. Or something easy to get π from.

I hope to get there next week.

## The Summer 2017 Mathematics A To Z: Zeta Function

Today Gaurish, of For the love of Mathematics, gives me the last subject for my Summer 2017 A To Z sequence. And also my greatest challenge: the Zeta function. The subject comes to all pop mathematics blogs. It comes to all mathematics blogs. It’s not difficult to say something about a particular zeta function. But to say something at all original? Let’s watch.

# Zeta Function.

The spring semester of my sophomore year I had Intro to Complex Analysis. Monday Wednesday 7:30; a rare evening class, one of the few times I’d eat dinner and then go to a lecture hall. There I discovered something strange and wonderful. Complex Analysis is a far easier topic than Real Analysis. Both are courses about why calculus works. But why calculus for complex-valued numbers works is a much easier problem than why calculus for real-valued numbers works. It’s dazzling. Part of this is that Complex Analysis, yes, builds on Real Analysis. So Complex can take for granted some things that Real has to prove. I didn’t mind. Given the way I crashed through Intro to Real Analysis I was glad for a subject that was, relatively, a breeze.

As we worked through Complex Variables and Applications so many things, so very many things, got to be easy. The basic unit of complex analysis, at least as we young majors learned it, was in contour integrals. These are integrals whose value depends on the values of a function on a closed loop. The loop is in the complex plane. The complex plane is, well, your ordinary plane. But we say the x-coordinate and the y-coordinate are parts of the same complex-valued number. The x-coordinate is the real-valued part. The y-coordinate is the imaginary-valued part. And we call that summation ‘z’. In complex-valued functions ‘z’ serves the role that ‘x’ does in normal mathematics.

So a closed loop is exactly what you think. Take a rubber band and twist it up and drop it on the table. That’s a closed loop. Suppose you want to integrate a function, ‘f(z)’. If you can always take its derivative on this loop and on the interior of that loop, then its contour integral is … zero. No matter what the function is. As long as it’s “analytic”, as the terminology has it. Yeah, we were all stunned into silence too. (Granted, mathematics classes are usually quiet, since it’s hard to get a good discussion going. Plus many of us were in post-dinner digestive lulls.)

Integrating regular old functions of real-valued numbers is this tedious process. There’s sooooo many rules and possibilities and special cases to consider. There’s sooooo many tricks that get you the integrals of some functions. And then here, with complex-valued integrals for analytic functions, you know the answer before you even look at the function.

As you might imagine, since this is only page 113 of a 341-page book there’s more to it. Most functions that anyone cares about aren’t analytic. At least they’re not analytic everywhere inside regions that might be interesting. There’s usually some points where an interesting function ‘f(z)’ is undefined. We call these “singularities”. Yes, like starships are always running into. Only we rarely get propelled into other universes or other times or turned into ghosts or stuff like that.

So much of the rest of the course turns into ways to avoid singularities. Sometimes you can spackel them over. This is when the function happens not to be defined somewhere, but you can see what it ought to be. Sometimes you have to do something more. This turns into a search for “removable” singularities. And this does something so brilliant it looks illicit. You modify your closed loop, so that it comes up very close, as close as possible, to the singularity, but studiously avoids it. Follow this game of I’m-not-touching-you right and you can turn your integral into two parts. One is the part that’s equal to zero. The other is the part that’s a constant times whatever the function is at the singularity you’re removing. And that ought to be easy to find the value for. (Being able to find a function’s value doesn’t mean you can find its derivative.)

Those tricks were hard to master. Not because they were hard. Because they were easy, in a context where we expected hard. But after that we got into how to move singularities. That is, how to do a change of variables that moved the singularities to where they’re more convenient for some reason. How could this be more convenient? Because of chapter five, series. In regular old calculus we learn how to approximate well-behaved functions with polynomials. In complex-variable calculus, we learn the same thing all over again. They’re polynomials of complex-valued variables, but it’s the same sort of thing. And not just polynomials, but things that look like polynomials except they’re powers of $\frac{1}{z}$ instead. These open up new ways to approximate functions, and to remove singularities from functions.

And then we get into transformations. These are about turning a problem that’s hard into one that’s easy. Or at least different. They’re a change of variable, yes. But they also change what exactly the function is. This reshuffles the problem. Makes for a change in singularities. Could make ones that are easier to work with.

One of the useful, and so common, transforms is called the Laplace-Stieltjes Transform. (“Laplace” is said like you might guess. “Stieltjes” is said, or at least we were taught to say it, like “Stilton cheese” without the “ton”.) And it tends to create functions that look like a series, the sum of a bunch of terms. Infinitely many terms. Each of those terms looks like a number times another number raised to some constant times ‘z’. As the course came to its conclusion, we were all prepared to think about these infinite series. Where singularities might be. Which of them might be removable.

These functions, these results of the Laplace-Stieltjes Transform, we collectively call ‘zeta functions’. There are infinitely many of them. Some of them are relatively tame. Some of them are exotic. One of them is world-famous. Professor Walsh — I don’t mean to name-drop, but I discovered the syllabus for the course tucked in the back of my textbook and I’m delighted to rediscover it — talked about it.

That world-famous one is, of course, the Riemann Zeta function. Yes, that same Riemann who keeps turning up, over and over again. It looks simple enough. Almost tame. Take the counting numbers, 1, 2, 3, and so on. Take your ‘z’. Raise each of the counting numbers to that ‘z’. Take the reciprocals of all those numbers. Add them up. What do you get?

A mass of fascinating results, for one. Functions you wouldn’t expect are concealed in there. There’s strips where the real part is zero. There’s strips where the imaginary part is zero. There’s points where both the real and imaginary parts are zero. We know infinitely many of them. If ‘z’ is -2, for example, the sum is zero. Also if ‘z’ is -4. -6. -8. And so on. These are easy to show, and so are dubbed ‘trivial’ zeroes. To say some are ‘trivial’ is to say that there are others that are not trivial. Where are they?

Professor Walsh explained. We know of many of them. The nontrivial zeroes we know of all share something in common. They have a real part that’s equal to 1/2. There’s a zero that’s at about the number $\frac{1}{2} - \imath 14.13$. Also at $\frac{1}{2} + \imath 14.13$. There’s one at about $\frac{1}{2} - \imath 21.02$. Also about $\frac{1}{2} + \imath 21.02$. (There’s a symmetry, you maybe guessed.) Every nontrivial zero we’ve found has a real component that’s got the same real-valued part. But we don’t know that they all do. Nobody does. It is the Riemann Hypothesis, the great unsolved problem of mathematics. Much more important than that Fermat’s Last Theorem, which back then was still merely a conjecture.

What a prospect! What a promise! What a way to set us up for the final exam in a couple of weeks.

I had an inspiration, a kind of scheme of showing that a nontrivial zero couldn’t be within a given circular contour. Make the size of this circle grow. Move its center farther away from the z-coordinate $\frac{1}{2} + \imath 0$ to match. Show there’s still no nontrivial zeroes inside. And therefore, logically, since I would have shown nontrivial zeroes couldn’t be anywhere but on this special line, and we know nontrivial zeroes exist … I leapt enthusiastically into this project. A little less enthusiastically the next day. Less so the day after. And on. After maybe a week I went a day without working on it. But came back, now and then, prodding at my brilliant would-be proof.

The Riemann Zeta function was not on the final exam, which I’ve discovered was also tucked into the back of my textbook. It asked more things like finding all the singular points and classifying what kinds of singularities they were for functions like $e^{-\frac{1}{z}}$ instead. If the syllabus is accurate, we got as far as page 218. And I’m surprised to see the professor put his e-mail address on the syllabus. It was merely “bwalsh@math”, but understand, the Internet was a smaller place back then.

I finished the course with an A-, but without answering any of the great unsolved problems of mathematics.

## The Summer 2017 Mathematics A To Z: Gaussian Primes

Once more do I have Gaurish to thank for the day’s topic. (There’ll be two more chances this week, providing I keep my writing just enough ahead of deadline.) This one doesn’t touch category theory or topology.

# Gaussian Primes.

I keep touching on group theory here. It’s a field that’s about what kinds of things can work like arithmetic does. A group is a set of things that you can add together. At least, you can do something that works like adding regular numbers together does. A ring is a set of things that you can add and multiply together.

There are many interesting rings. Here’s one. It’s called the Gaussian Integers. They’re made of numbers we can write as $a + b\imath$, where ‘a’ and ‘b’ are some integers. $\imath$ is what you figure, that number that multiplied by itself is -1. These aren’t the complex-valued numbers, you notice, because ‘a’ and ‘b’ are always integers. But you add them together the way you add complex-valued numbers together. That is, $a + b\imath$ plus $c + d\imath$ is the number $(a + c) + (b + d)\imath$. And you multiply them the way you multiply complex-valued numbers together. That is, $a + b\imath$ times $c + d\imath$ is the number $(a\cdot c - b\cdot d) + (a\cdot d + b\cdot c)\imath$.

We created something that has addition and multiplication. It picks up subtraction for free. It doesn’t have division. We can create rings that do, but this one won’t, any more than regular old integers have division. But we can ask what other normal-arithmetic-like stuff these Gaussian integers do have. For instance, can we factor numbers?

This isn’t an obvious one. No, we can’t expect to be able to divide one Gaussian integer by another. But we can’t expect to divide a regular old integer by another, not and get an integer out of it. That doesn’t mean we can’t factor them. It means we divide the regular old integers into a couple classes. There’s prime numbers. There’s composites. There’s the unit, the number 1. There’s zero. We know prime numbers; they’re 2, 3, 5, 7, and so on. Composite numbers are the ones you get by multiplying prime numbers together: 4, 6, 8, 9, 10, and so on. 1 and 0 are off on their own. Leave them there. We can’t divide any old integer by any old integer. But we can say an integer is equal to this string of prime numbers multiplied together. This gives us a handle by which we can prove a lot of interesting results.

We can do the same with Gaussian integers. We can divide them up into Gaussian primes, Gaussian composites, units, and zero. The words mean what they mean for regular old integers. A Gaussian composite can be factored into the multiples of Gaussian primes. Gaussian primes can’t be factored any further.

If we know what the prime numbers are for regular old integers we can tell whether something’s a Gaussian prime. Admittedly, knowing all the prime numbers is a challenge. But a Gaussian integer $a + b\imath$ will be prime whenever a couple simple-to-test conditions are true. First is if ‘a’ and ‘b’ are both not zero, but $a^2 + b^2$ is a prime number. So, for example, $5 + 4\imath$ is a Gaussian prime.

You might ask, hey, would $-5 - 4\imath$ also be a Gaussian prime? That’s also got components that are integers, and the squares of them add up to a prime number (41). Well-spotted. Gaussian primes appear in quartets. If $a + b\imath$ is a Gaussian prime, so is $-a -b\imath$. And so are $-b + a\imath$ and $b - a\imath$.

There’s another group of Gaussian primes. These are the numbers $a + b\imath$ where either ‘a’ or ‘b’ is zero. Then the other one is, if positive, three more than a whole multiple of four. If it’s negative, then it’s three less than a whole multiple of four. So ‘3’ is a Gaussian prime, as is -3, and as is $3\imath$ and so is $-3\imath$.

This has strange effects. Like, ‘3’ is a prime number in the regular old scheme of things. It’s also a Gaussian prime. But familiar other prime numbers like ‘2’ and ‘5’? Not anymore. Two is equal to $(1 + \imath) \cdot (1 - \imath)$; both of those terms are Gaussian primes. Five is equal to $(2 + \imath) \cdot (2 - \imath)$. There are similar shocking results for 13. But, roughly, the world of composites and prime numbers translates into Gaussian composites and Gaussian primes. In this slightly exotic structure we have everything familiar about factoring numbers.

You might have some nagging thoughts. Like, sure, two is equal to $(1 + \imath) \cdot (1 - \imath)$. But isn’t it also equal to $(1 + \imath) \cdot (1 - \imath) \cdot \imath \cdot (-\imath)$? One of the important things about prime numbers is that every composite number is the product of a unique string of prime numbers. Do we have to give that up for Gaussian integers?

Good nag. But no; the doubt is coming about because you’ve forgotten the difference between “the positive integers” and “all the integers”. If we stick to positive whole numbers then, yeah, (say) ten is equal to two times five and no other combination of prime numbers. But suppose we have all the integers, positive and negative. Then ten is equal to either two times five or it’s equal to negative two times negative five. Or, better, it’s equal to negative one times two times negative one times five. Or suffix times any even number of negative ones.

Remember that bit about separating ‘one’ out from the world of primes and composites? That’s because the number one screws up these unique factorizations. You can always toss in extra factors of one, to taste, without changing the product of something. If we have positive and negative integers to use, then negative one does almost the same trick. We can toss in any even number of extra negative ones without changing the product. This is why we separate “units” out of the numbers. They’re not part of the prime factorization of any numbers.

For the Gaussian integers there are four units. 1 and -1, $\imath$ and $-\imath$. They are neither primes nor composites, and we don’t worry about how they would otherwise multiply the number of factorizations we get.

But let me close with a neat, easy-to-understand puzzle. It’s called the moat-crossing problem. In the regular old integers it’s this: imagine that the prime numbers are islands in a dangerous sea. You start on the number ‘2’. Imagine you have a board that can be set down and safely crossed, then picked up to be put down again. Could you get from the start and go off to safety, which is infinitely far away? If your board is some, fixed, finite length?

No, you can’t. The problem amounts to how big the gap between one prime number and the next largest prime number can be. It turns out there’s no limit to that. That is, you give me a number, as small or as large as you like. I can find some prime number that’s more than your number less than its successor. There are infinitely large gaps between prime numbers.

Gaussian primes, though? Since a Gaussian prime might have nearest neighbors in any direction? Nobody knows. We know there are arbitrarily large gaps. Pick a moat size; we can (eventually) find a Gaussian prime that’s at least that far away from its nearest neighbors. But this does not say whether it’s impossible to get from the smallest Gaussian primes — $1 + \imath$ and its companions $-1 + \imath$ and on — infinitely far away. We know there’s a moat of width 6 separating the origin of things from infinity. We don’t know that there’s bigger ones.

You’re not going to solve this problem. Unless I have more brilliant readers than I know about; if I have ones who can solve this problem then I might be too intimidated to write anything more. But there is surely a pleasant pastime, maybe a charming game, to be made from this. Try finding the biggest possible moats around some set of Gaussian prime island.

Ellen Gethner, Stan Wagon, and Brian Wick’s A Stroll Through the Gaussian Primes describes this moat problem. It also sports some fine pictures of where the Gaussian primes are and what kinds of moats you can find. If you don’t follow the reasoning, you can still enjoy the illustrations.

## What Is The Logarithm of a Negative Number?

Learning of imaginary numbers, things created to be the square roots of negative numbers, inspired me. It probably inspires anyone who’s the sort of person who’d become a mathematician. The trick was great. I wondered could I do it? Could I find some other useful expansion of the number system?

The square root of a complex-valued number sounded like the obvious way to go, until a little later that week when I learned that’s just some other complex-valued numbers. The next thing I hit on: how about the logarithm of a negative number? Couldn’t that be a useful expansion of numbers?

No. It turns out you can make a sensible logarithm of negative, and complex-valued, numbers using complex-valued numbers. Same with trigonometric and inverse trig functions, tangents and arccosines and all that. There isn’t anything we can do with the normal mathematical operations that needs something bigger than the complex-valued numbers to play with. It’s possible to expand on the complex-valued numbers. We can make quaternions and some more elaborate constructs there. They don’t solve any particular shortcoming in complex-valued numbers, but they’ve got their uses. I never got anywhere near reinventing them. I don’t regret the time spent on that. There’s something useful in trying to invent something even if it fails.

One problem with mathematics — with all intellectual fields, really — is that it’s easy, when teaching, to give the impression that this stuff is the Word of God, built into the nature of the universe and inarguable. It’s so not. The stuff we find interesting and how we describe those things are the results of human thought, attempts to say what is interesting about a thing and what is useful. And what best approximates our ideas of what we would like to know. So I was happy to see this come across my Twitter feed:

The links to a 12-page paper by Deepak Bal, Leibniz, Bernoulli, and the Logarithms of Negative Numbers. It’s a review of how the idea of a logarithm of a negative number got developed over the course of the 18th century. And what great minds, like Gottfried Leibniz and John (I) Bernoulli argued about as they find problems with the implications of what they were doing. (There were a lot of Bernoullis doing great mathematics, and even multiple John Bernoullis. The (I) is among the ways we keep them sorted out.) It’s worth a read, I think, even if you’re not all that versed in how to calculate logarithms. (but if you’d like to be better-versed, here’s the tail end of some thoughts about that.) The process of how a good idea like this comes to be is worth knowing.

Also: it turns out there’s not “the” logarithm of a complex-valued number. There’s infinitely many logarithms. But they’re a family, all strikingly similar, so we can pick one that’s convenient and just use that. Ask if you’re really interested.

## The End 2016 Mathematics A To Z: Xi Function

I have today another request from gaurish, who’s also been good enough to give me requests for ‘Y’ and ‘Z’. I apologize for coming to this a day late. But it was Christmas and many things demanded my attention.

## Xi Function.

We start with complex-valued numbers. People discovered them because they were useful tools to solve polynomials. They turned out to be more than useful fictions, if numbers are anything more than useful fictions. We can add and subtract them easily. Multiply and divide them less easily. We can even raise them to powers, or raise numbers to them.

If you become a mathematics major then somewhere in Intro to Complex Analysis you’re introduced to an exotic, infinitely large sum. It’s spoken of reverently as the Riemann Zeta Function, and it connects to something named the Riemann Hypothesis. Then you remember that you’ve heard of this, because if you’re willing to become a mathematics major you’ve read mathematics popularizations. And you know the Riemann Hypothesis is an unsolved problem. It proposes something that might be true or might be false. Either way has astounding implications for the way numbers fit together.

Riemann here is Bernard Riemann, who’s turned up often in these A To Z sequences. We saw him in spheres and in sums, leading to integrals. We’ll see him again. Riemann just covered so much of 19th century mathematics; we can’t talk about calculus without him. Zeta, Xi, and later on, Gamma are the famous Greek letters. Mathematicians fall back on them because the Roman alphabet just hasn’t got enough letters for our needs. I’m writing them out as English words instead because if you aren’t familiar with them they look like an indistinct set of squiggles. Even if you are familiar, sometimes. I got confused in researching this some because I did slip between a lowercase-xi and a lowercase-zeta in my mind. All I can plead is it’s been a hard week.

Riemann’s Zeta function is famous. It’s easy to approach. You can write it as a sum. An infinite sum, but still, those are easy to understand. Pick a complex-valued number. I’ll call it ‘s’ because that’s the standard. Next take each of the counting numbers: 1, 2, 3, and so on. Raise each of them to the power ‘s’. And take the reciprocal, one divided by those numbers. Add all that together. You’ll get something. Might be real. Might be complex-valued. Might be zero. We know many values of ‘s’ what would give us a zero. The Riemann Hypothesis is about characterizing all the possible values of ‘s’ that give us a zero. We know some of them, so boring we call them trivial: -2, -4, -6, -8, and so on. (This looks crazy. There’s another way of writing the Riemann Zeta function which makes it obvious instead.) The Riemann Hypothesis is about whether all the proper, that is, non-boring values of ‘s’ that give us a zero are 1/2 plus some imaginary number.

It’s a rare thing mathematicians have only one way of writing. If something’s been known and studied for a long time there are usually variations. We find different ways to write the problem. Or we find different problems which, if solved, would solve the original problem. The Riemann Xi function is an example of this.

I’m going to spare you the formula for it. That’s in self-defense. I haven’t found an expression of the Xi function that isn’t a mess. The normal ways to write it themselves call on the Zeta function, as well as the Gamma function. The Gamma function looks like factorials, for the counting numbers. It does its own thing for other complex-valued numbers.

That said, I’m not sure what the advantages are in looking at the Xi function. The one that people talk about is its symmetry. Its value at a particular complex-valued number ‘s’ is the same as its value at the number ‘1 – s’. This may not seem like much. But it gives us this way of rewriting the Riemann Hypothesis. Imagine all the complex-valued numbers with the same imaginary part. That is, all the numbers that we could write as, say, ‘x + 4i’, where ‘x’ is some real number. If the size of the value of Xi, evaluated at ‘x + 4i’, always increases as ‘x’ starts out equal to 1/2 and increases, then the Riemann hypothesis is true. (This has to be true not just for ‘x + 4i’, but for all possible imaginary numbers. So, ‘x + 5i’, and ‘x + 6i’, and even ‘x + 4.1 i’ and so on. But it’s easier to start with a single example.)

Or another way to write it. Suppose the size of the value of Xi, evaluated at ‘x + 4i’ (or whatever), always gets smaller as ‘x’ starts out at a negative infinitely large number and keeps increasing all the way to 1/2. If that’s true, and true for every imaginary number, including ‘x – i’, then the Riemann hypothesis is true.

And it turns out if the Riemann hypothesis is true we can prove the two cases above. We’d write the theorem about this in our papers with the start ‘The Following Are Equivalent’. In our notes we’d write ‘TFAE’, which is just as good. Then we’d take which ever of them seemed easiest to prove and find out it isn’t that easy after all. But if we do get through we declare ourselves fortunate, sit back feeling triumphant, and consider going out somewhere to celebrate. But we haven’t got any of these alternatives solved yet. None of the equivalent ways to write it has helped so far.

We know some some things. For example, we know there are infinitely many roots for the Xi function with a real part that’s 1/2. This is what we’d need for the Riemann hypothesis to be true. But we don’t know that all of them are.

The Xi function isn’t entirely about what it can tell us for the Zeta function. The Xi function has its own exotic and wonderful properties. In a 2009 paper on arxiv.org, for example, Drs Yang-Hui He, Vishnu Jejjala, and Djordje Minic describe how if the zeroes of the Xi function are all exactly where we expect them to be then we learn something about a particular kind of string theory. I admit not knowing just what to say about a genus-one free energy of the topological string past what I have read in this paper. In another paper they write of how the zeroes of the Xi function correspond to the description of the behavior for a quantum-mechanical operator that I just can’t find a way to describe clearly in under three thousand words.

But mathematicians often speak of the strangeness that mathematical constructs can match reality so well. And here is surely a powerful one. We learned of the Riemann Hypothesis originally by studying how many prime numbers there are compared to the counting numbers. If it’s true, then the physics of the universe may be set up one particular way. Is that not astounding?

## A Leap Day 2016 Mathematics A To Z: Quaternion

I’ve got another request from Gaurish today. And it’s a word I had been thinking to do anyway. When one looks for mathematical terms starting with ‘q’ this is one that stands out. I’m a little surprised I didn’t do it for last summer’s A To Z. But here it is at last:

## Quaternion.

I remember the seizing of my imagination the summer I learned imaginary numbers. If we could define a number i, so that i-squared equalled negative 1, and work out arithmetic which made sense out of that, why not do it again? Complex-valued numbers are great. Why not something more? Maybe we could also have some other non-real number. I reached deep into my imagination and picked j as its name. It could be something else. Maybe the logarithm of -1. Maybe the square root of i. Maybe something else. And maybe we could build arithmetic with a whole second other non-real number.

My hopes of this brilliant idea petered out over the summer. It’s easy to imagine a super-complex number, something that’s “1 + 2i + 3j”. And it’s easy to work out adding two super-complex numbers like this together. But multiplying them together? What should i times j be? I couldn’t solve the problem. Also I learned that we didn’t need another number to be the logarithm of -1. It would be π times i. (Or some other numbers. There’s some surprising stuff in logarithms of negative or of complex-valued numbers.) We also don’t need something special to be the square root of i, either. $\frac{1}{2}\sqrt{2} + \frac{1}{2}\sqrt{2}\imath$ will do. (So will another number.) So I shelved the project.

Even if I hadn’t given up, I wouldn’t have invented something. Not along those lines. Finer minds had done the same work and had found a way to do it. The most famous of these is the quaternions. It has a famous discovery. Sir William Rowan Hamilton — the namesake of “Hamiltonian mechanics”, so you already know what a fantastic mind he was — had a flash of insight that’s come down in the folklore and romance of mathematical history. He had the idea on the 16th of October, 1843, while walking with his wife along the Royal Canal, in Dublin, Ireland. While walking across the bridge he saw what was missing. It seems he lacked pencil and paper. He carved it into the bridge:

$i^2 = j^2 = k^2 = ijk = -1$

The bridge now has a plaque commemorating the moment. You can’t make a sensible system with two non-real numbers. But three? Three works.

And they are a mysterious three! i, j, and k are somehow not the same number. But each of them, multiplied by themselves, gives us -1. And the product of the three is -1. They are even more mysterious. To work sensibly, i times j can’t be the same thing as j times i. Instead, i times j equals minus j times i. And j times k equals minus k times j. And k times i equals minus i times k. We must give up commutivity, the idea that the order in which we multiply things doesn’t matter.

But if we’re willing to accept that the order matters, then quaternions are well-behaved things. We can add and subtract them just as we would think to do if we didn’t know they were strange constructs. If we keep the funny rules about the products of i and j and k straight, then we can multiply them as easily as we multiply polynomials together. We can even divide them. We can do all the things we do with real numbers, only with these odd sets of four real numbers.

The way they look, that pattern of 1 + 2i + 3j + 4k, makes them look a lot like vectors. And we can use them like vectors pointing to stuff in three-dimensional space. It’s not quite a comfortable fit, though. That plain old real number at the start of things seems like it ought to signify something, but it doesn’t. In practice, it doesn’t give us anything that regular old vectors don’t. And vectors allow us to ponder not just three- or maybe four-dimensional spaces, but as many as we need. You might wonder why we need more than four dimensions, even allowing for time. It’s because if we want to track a lot of interacting things, it’s surprisingly useful to put them all into one big vector in a very high-dimension space. It’s hard to draw, but the mathematics is nice. Hamiltonian mechanics, particularly, almost beg for it.

That’s not to call them useless, or even a niche interest. They do some things fantastically well. One of them is rotations. We can represent rotating a point around an arbitrary axis by an arbitrary angle as the multiplication of quaternions. There are many ways to calculate rotations. But if we need to do three-dimensional rotations this is a great one because it’s easy to understand and easier to program. And as you’d imagine, being able to calculate what rotations do is useful in all sorts of applications.

They’ve got good uses in number theory too, as they correspond well to the different ways to solve problems, often polynomials. They’re also popular in group theory. They might be the simplest rings that work like arithmetic but that don’t commute. So they can serve as ways to learn properties of more exotic ring structures.

Knowing of these marvelous exotic creatures of the deep mathematics your imagination might be fired. Can we do this again? Can we make something with, say, four unreal numbers? No, no we can’t. Four won’t work. Nor will five. If we keep going, though, we do hit upon success with seven unreal numbers.

This is a set called the octonions. Hamilton had barely worked out the scheme for quaternions when John T Graves, a friend of his at least up through the 16th of December, 1843, wrote of this new scheme. (Graves didn’t publish before Arthur Cayley did. Cayley’s one of those unspeakably prolific 19th century mathematicians. He has at least 967 papers to his credit. And he was a lawyer doing mathematics on the side for about 250 of those papers. This depresses every mathematician who ponders it these days.)

But where quaternions are peculiar, octonions are really peculiar. Let me call a couple quaternions p, q, and r. p times q might not be the same thing as q times r. But p times the product of q and r will be the same thing as the product of p and q itself times r. This we call associativity. Octonions don’t have that. Let me call a couple quaternions s, t, and u. s times the product of t times u may be either positive or negative the product of s and t times u. (It depends.)

Octonions have some neat mathematical properties. But I don’t know of any general uses for them that are as catchy as understanding rotations. Not rotations in the three-dimensional world, anyway.

Yes, yes, we can go farther still. There’s a construct called “sedenions”, which have fifteen non-real numbers on them. That’s 16 terms in each number. Where octonions are peculiar, sedenions are really peculiar. They work even less like regular old numbers than octonions do. With octonions, at least, when you multiply s by the product of s and t, you get the same number as you would multiplying s by s and then multiplying that by t. Sedenions don’t even offer that shred of normality. Besides being a way to learn about abstract algebra structures I don’t know what they’re used for.

I also don’t know of further exotic terms along this line. It would seem to fit a pattern if there’s some 32-term construct that we can define something like multiplication for. But it would presumably be even less like regular multiplication than sedenion multiplication is. If you want to fiddle about with that please do enjoy yourself. I’d be interested to hear if you turn up anything, but I don’t expect it’ll revolutionize the way I look at numbers. Sorry. But the discovery might be the fun part anyway.

## The Set Tour, Part 7: Matrices

I feel a bit odd about this week’s guest in the Set Tour. I’ve been mostly concentrating on sets that get used as the domains or ranges for functions a lot. The ones I want to talk about here don’t tend to serve the role of domain or range. But they are used a great deal in some interesting functions. So I loosen my rule about what to talk about.

## Rm x n and Cm x n

Rm x n might explain itself by this point. If it doesn’t, then this may help: the “x” here is the multiplication symbol. “m” and “n” are positive whole numbers. They might be the same number; they might be different. So, are we done here?

Maybe not quite. I was fibbing a little when I said “x” was the multiplication symbol. R2 x 3 is not a longer way of saying R6, an ordered collection of six real-valued numbers. The x does represent a kind of product, though. What we mean by R2 x 3 is an ordered collection, two rows by three columns, of real-valued numbers. Say the “x” here aloud as “by” and you’re pronouncing it correctly.

What we get is called a “matrix”. If we put into it only real-valued numbers, it’s a “real matrix”, or a “matrix of reals”. Sometimes mathematical terminology isn’t so hard to follow. Just as with vectors, Rn, it matters just how the numbers are organized. R2 x 3 means something completely different from what R3 x 2 means. And swapping which positions the numbers in the matrix occupy changes what matrix we have, as you might expect.

You can add together matrices, exactly as you can add together vectors. The same rules even apply. You can only add together two matrices of the same size. They have to have the same number of rows and the same number of columns. You add them by adding together the numbers in the corresponding slots. It’s exactly what you would do if you went in without preconceptions.

You can also multiply a matrix by a single number. We called this scalar multiplication back when we were working with vectors. With matrices, we call this scalar multiplication. If it strikes you that we could see vectors as a kind of matrix, yes, we can. Sometimes that’s wise. We can see a vector as a matrix in the set R1 x n or as one in the set Rn x 1, depending on just what we mean to do.

It’s trickier to multiply two matrices together. As with vectors multiplying the numbers in corresponding positions together doesn’t give us anything. What we do instead is a time-consuming but not actually hard process. But according to its rules, something in Rm x n we can multiply by something in Rn x k. “k” is another whole number. The second thing has to have exactly as many rows as the first thing has columns. What we get is a matrix in Rm x k.

I grant you maybe didn’t see that coming. Also a potential complication: if you can multiply something in Rm x n by something in Rn x k, can you multiply the thing in Rn x k by the thing in Rm x n? … No, not unless k and m are the same number. Even if they are, you can’t count on getting the same product. Matrices are weird things this way. They’re also gateways to weirder things. But it is a productive weirdness, and I’ll explain why in a few paragraphs.

A matrix is a way of organizing terms. Those terms can be anything. Real matrices are surely the most common kind of matrix, at least in mathematical usage. Next in common use would be complex-valued matrices, much like how we get complex-valued vectors. These are written Cm x n. A complex-valued matrix is different from a real-valued matrix. The terms inside the matrix can be complex-valued numbers, instead of real-valued numbers. Again, sometimes, these mathematical terms aren’t so tricky.

I’ve heard occasionally of people organizing matrices of other sets. The notation is similar. If you’re building a matrix of “m” rows and “n” columns out of the things you find inside a set we’ll call H, then you write that as Hm x n. I’m not saying you should do this, just that if you need to, that’s how to tell people what you’re doing.

Now. We don’t really have a lot of functions that use matrices as domains, and I can think of fewer that use matrices as ranges. There are a couple of valuable ones, ones so valuable they get special names like “eigenvalue” and “eigenvector”. (Don’t worry about what those are.) They take in Rm x n or Cm x n and return a set of real- or complex-valued numbers, or real- or complex-valued vectors. Not even those, actually. Eigenvectors and eigenfunctions are only meaningful if there are exactly as many rows as columns. That is, for Rm x m and Cm x m. These are known as “square” matrices, just as you might guess if you were shaken awake and ordered to say what you guessed a “square matrix” might be.

They’re important functions. There are some other important functions, with names like “rank” and “condition number” and the like. But they’re not many. I believe they’re not even thought of as functions, any more than we think of “the length of a vector” as primarily a function. They’re just properties of these matrices, that’s all.

So why are they worth knowing? Besides the joy that comes of knowing something, I mean?

Here’s one answer, and the one that I find most compelling. There is cultural bias in this: I come from an applications-heavy mathematical heritage. We like differential equations, which study how stuff changes in time and in space. It’s very easy to go from differential equations to ordered sets of equations. The first equation may describe how the position of particle 1 changes in time. It might describe how the velocity of the fluid moving past point 1 changes in time. It might describe how the temperature measured by sensor 1 changes as it moves. It doesn’t matter. We get a set of these equations together and we have a majestic set of differential equations.

Now, the dirty little secret of differential equations: we can’t solve them. Most interesting physical phenomena are nonlinear. Linear stuff is easy. Small change 1 has effect A; small change 2 has effect B. If we make small change 1 and small change 2 together, this has effect A plus B. Nonlinear stuff, though … it just doesn’t work. Small change 1 has effect A; small change 2 has effect B. Small change 1 and small change 2 together has effect … A plus B plus some weird A times B thing plus some effect C that nobody saw coming and then C does something with A and B and now maybe we’d best hide.

There are some nonlinear differential equations we can solve. Those are the result of heroic work and brilliant insights. Compared to all the things we would like to solve there’s not many of them. Methods to solve nonlinear differential equations are as precious as ways to slay krakens.

But here’s what we can do. What we usually like to know about in systems are equilibriums. Those are the conditions in which the system stops changing. Those are interesting. We can usually find those points by boring but not conceptually challenging calculations. If we can’t, we can declare x0 represents the equilibrium. If we still care, we leave calculating its actual values to the interested reader or hungry grad student.

But what’s really interesting is: what happens if we’re near but not exactly at the equilibrium? Sometimes, we stay near it. Think of pushing a swing. However good a push you give, it’s going to settle back to the boring old equilibrium of dangling straight down. Sometimes, we go racing away from it. Think of trying to balance a pencil on its tip; if we did this perfectly it would stay balanced. It never does. We’re never perfect, or there’s some wind or somebody walks by and the perfect balance is foiled. It falls down and doesn’t bounce back up. Sometimes, whether it it stays near or goes away depends on what way it’s away from the equilibrium.

And now we finally get back to matrices. Suppose we are starting out near an equilibrium. We can, usually, approximate the differential equations that describe what will happen. The approximation may only be good if we’re just a tiny bit away from the equilibrium, but that might be all we really want to know. That approximation will be some linear differential equations. (If they’re not, then we’re just wasting our time.) And that system of linear differential equations we can describe using matrices.

If we can write what we are interested in as a set of linear differential equations, then we have won. We can use the many powerful tools of matrix arithmetic — linear algebra, specifically — to tell us everything we want to know about the system. We can say whether a small push away from the equilibrium stays small, or whether it grows, or whether it depends. We can say how fast the small push shrinks, or grows (for a while). We can say how the system will change, approximately.

This is what I love in matrices. It’s not everything there is to them. But it’s enough to make matrices important to me.

## C

The square root of negative one. Everybody knows it doesn’t exist; there’s no real number you can multiply by itself and get negative one out. But then sometime in algebra, deep in a section about polynomials, suddenly we come out and declare there is such a thing. It’s an “imaginary number” that we call “i”. It’s hard to blame students for feeling betrayed by this. To make it worse, we throw real and imaginary numbers together and call the result “complex numbers”. It’s as if we’re out to tease them for feeling confused.

It’s an important set of things, though. It turns up as the domain, or the range, of functions so often that one of the major fields of analysis is called, “Complex Analysis”. If the course listing allows for more words, it’s called “Analysis of Functions of a Complex Variable” or something like that. Despite the connotations of the word “complex”, though, the field is a delight. It’s considerably easier to understand than Real Analysis, the study of functions of mere real numbers. When there is a theorem that has a version in Real Analysis and a version in Complex Analysis, the Complex Analysis side is usually easier to prove and easier to understand. It’s uncanny.

The set of all complex numbers is denoted C, in parallel to the set of real numbers, R. To make it clear that we mean this set, and not some piddling little common set that might happen to share the name C, add a vertical stroke to the left of the letter. This is just as we add a vertical stroke to R to emphasize we mean the Real Numbers. We should approach the set with respect, removing our hats, thinking seriously about great things. It would look silly to add a second curve to C though, so we just add a straight vertical stroke on the left side of the letter C. This makes it look a bit like it’s an Old English typeface (the kind you call Gothic until you learn that means “sans serif”) pared down to its minimum.

Why do we teach people there’s no such thing as a square root of minus one, and then one day, teach them there is? Part of it is that whether there is a square root depends on your context. If you are interested only in the real numbers, there’s nothing that, squared, gives you minus one. This is exactly the way that it’s not possible to equally divide five objects between two people if you aren’t allowed to cut the objects in half. But if you are willing to allow half-objects to be things, then you can do what was previously forbidden. What you can do depends on what the rules you set out are.

And there’s surely some echo of the historical discovery of imaginary and complex numbers at work here. They were noticed when working out the roots of third- and fourth-degree polynomials. These can be done by way of formulas that nobody ever remembers because there are so many better things to remember. These formulas would sometimes require one to calculate a square root of a negative number, a thing that obviously didn’t exist. Except that if you pretended it did, you could get out correct answers, just as if these were ordinary numbers. You can see why this may be dubbed an “imaginary” number. The name hints at the suspicion with which it’s viewed. It’s much as “negative” numbers look like some trap to people who’re just getting comfortable with fractions.

It goes against the stereotype of mathematicians to suppose they’d accept working with something they don’t understand because the results are all right, afterwards. But, actually, mathematicians are willing to accept getting answers by any crazy method. If you have a plausible answer, you can test whether it’s right, and if all you really need this minute is the right answer, good.

But we do like having methods; they’re more useful than mere answers. And we can imagine this set called the complex numbers. They contain … well, all the possible roots, the solutions, of all polynomials. (The polynomials might have coefficients — the numbers in front of the variable — of integers, or rational numbers, or irrational numbers. If we already accept the idea of complex numbers, the coefficients can be complex numbers too.)

It’s exceedingly common to think of the complex numbers by starting off with a new number called “i”. This is a number about which we know nothing except that i times i equals minus one. Then we tend to think of complex numbers as “a real number plus i times another real number”. The first real number gets called “the real component”, and is usually denoted as either “a” or “x”. The second real number gets called “the imaginary component”, and is usually denoted as either “b” or “y”. Then the complex number is written “a + i*b” or “x + i*y”. Sometimes it’s written “a + b*i” or “x + y*i”; that’s a mere matter of house style. Don’t let it throw you.

Writing a complex number this way has advantages. Particularly, it makes it easy to see how one would add together (or subtract) complex numbers: “a + b*i + x + y*i” almost suggests that the sum should be “(a + x) + (b + y)*i”. What we know from ordinary arithmetic gives us guidance. And if we’re comfortable with binomials, then we know how to multiply complex numbers. Start with “(a + b*i) * (x + y*i)” and follow the distributive law. We get, first, “a*x + a*y*i + b*i*x + b*y*i*i”. But “i*i” equals minus one, so this is the same as “a*x + a*y*i + b*i*x – b*y”. Move the real components together, and move the imaginary components together, and we have “(a*x – b*y) + (a*y + b*x)*i”.

That’s the most common way of writing out complex numbers. It’s so common that Eric W Weisstein’s Mathworld encyclopedia even says that’s what complex numbers are. But it isn’t the only way to construct, or look at, complex numbers. A common alternate way to look at complex numbers is to match a complex number to a point on the plane, or if you prefer, a point in the set R2.

It’s surprisingly natural to think of the real component as how far to the right or left of an origin your complex number is, and to think of the imaginary component as how far above or below the origin it is. Much complex-number work makes sense if you think of complex numbers as points in space, or directions in space. The language of vectors trips us up only a little bit here. We speak of a complex number as corresponding to a point on the “complex plane”, just as we might speak of a real number as a point on the “(real) number line”.

But there are other descriptions yet. We can represent complex numbers as a pair of numbers with a scheme that looks like polar coordinates. Pick a point on the complex plane. We can say where that is by two points of information. The first is the amplitude, or magnitude: how far the point is from the origin. The second is the phase, or angle: draw the line segment connecting the origin and your point. What angle does that make with the positive horizontal axis?

This representation is called the “phasor” representation. It’s tolerably popular in physics and I hear tell of engineers liking it. We represent numbers then not as “x + i*y” but instead as “r * e”, with r the magnitude and θ the angle. “e” is the base of the natural logarithm, which you get very comfortable with if you do much mathematics or physics. And “i” is just what we’ve been talking about here. This is a pretty natural way to write about complex numbers that represent stuff that oscillates, such as alternating current or the probability function in quantum mechanics. A lot of stuff oscillates, if you study it through the right lens. So numbers that look like this keep creeping in, and into unexpected places. It’s quite easy to multiply numbers in phasor form — just multiply the magnitude parts, and add the angle parts — although addition and subtraction become a pain.

Mathematicians generally use the letter “z” to represent a complex-valued number whose identity is not known. As best I can tell, this is because we do think so much of a complex number as the sum “x + y*i”. So if we used familiar old “x” for an unknown number, it would carry the connotations of “the real component of our complex-valued number” and mislead the unwary mathematician. The connection is so common that a mathematician might carelessly switch between “z” and the real and imaginary components “x” and “y” without specifying that “z” is another way of writing “x + y*i”. A good copy editor or an alert student should catch this.

Complex numbers work very much like real numbers do. They add and multiply in natural-looking ways, and you can do subtraction and division just as well. You can take exponentials, and can define all the common arithmetic functions — sines and cosines, square roots and logarithms, integrals and differentials — on them just as well as you can with real numbers. And you can embed the real numbers within the complex numbers: if you have a real number x, you can match that perfectly with the complex number “x + 0*i”.

But that doesn’t mean complex numbers are exactly like the real numbers. For example, it’s possible to order the real numbers. You can say that the number “a” is less than the number “b”, and have that mean something. That’s not possible to do with complex numbers. You can’t say that “a + b*i” is less than, or greater than, “x + y*i” in a logically consistent way. You can say the magnitude of one complex-valued number is greater than the magnitude of another. But the magnitudes are real numbers. For all that complex numbers give us there are things they’re not good for.

## Reading the Comics, July 1, 2012

This will be a hastily-written installment since I married just this weekend and have other things occupying me. But there’s still comics mentioning math subjects so let me summarize them for you. The first since my last collection of these, on the 13th of June, came on the 15th, with Dave Whamond’s Reality Check, which goes into one of the minor linguistic quirks that bothers me: the claim that one can’t give “110 percent,” since 100 percent is all there is. I don’t object to phrases like “110 percent”, though, since it seems to me the baseline, the 100 percent, must be to some standard reference performance. For example, the Space Shuttle Main Engines routinely operated at around 104 percent, not because they were exceeding their theoretical limits, but because the original design thrust was found to be not quite enough, and the engines were redesigned to deliver more thrust, and it would have been far too confusing to rewrite all the documentation so that the new design thrust was the new 100 percent. Instead 100 percent was the design capacity of an engine which never flew but which existed in paper form. So I’m forgiving of “110 percent” constructions, is the important thing to me.

## Reading The Comics, May 20, 2012

Since I suspect that the comics roundup posts are the most popular ones I post, I’m very glad to see there was a bumper crop of strips among the ones I read regularly (from King Features Syndicate and from gocomics.com) this past week. Some of those were from cancelled strips in perpetual reruns, but that’s fine, I think: there aren’t any particular limits on how big an electronic comics page one can have, after all, and while it’s possible to read a short-lived strip long enough that you see all its entries, it takes a couple go-rounds to actually have them all memorized.

The first entry, and one from one of these cancelled strips, comes from Mark O’Hare’s Citizen Dog, a charmer of a comic set in a world-plus-talking-animals strip. In this case Fergus has taken the place of Maggie, a girl who’s not quite ready to come back from summer vacation. It’s also the sort of series of questions that it feels like come at the start of any class where a homework assignment’s due.