A Leap Day 2016 Mathematics A To Z: Transcendental Number

I’m down to the last seven letters in the Leap Day 2016 A To Z. It’s also the next-to-the-last of Gaurish’s requests. This was a fun one.

Transcendental Number.

Take a huge bag and stuff all the real numbers into it. Give the bag a good solid shaking. Stir up all the numbers until they’re thoroughly mixed. Reach in and grab just the one. There you go: you’ve got a transcendental number. Enjoy!

OK, I detect some grumbling out there. The first is that you tried doing this in your head because you somehow don’t have a bag large enough to hold all the real numbers. And you imagined pulling out some number like “2” or “37” or maybe “one-half”. And you may not be exactly sure what a transcendental number is. But you’re confident the strangest number you extracted, “minus 8”, isn’t it. And you’re right. None of those are transcendental numbers.

I regret saying this, but that’s your own fault. You’re lousy at picking random numbers from your head. So am I. We all are. Don’t believe me? Think of a positive whole number. I predict you probably picked something between 1 and 10. Almost surely something between 1 and 100. Surely something less than 10,000. You didn’t even consider picking something between 10,012,002,214,473,325,937,775 and 10,012,002,214,473,325,937,785. Challenged to pick a number, people will select nice and familiar ones. The nice familiar numbers happen not to be transcendental.

I detect some secondary grumbling there. Somebody picked π. And someone else picked e. Very good. Those are transcendental numbers. They’re also nice familiar numbers, at least to people who like mathematics a lot. So they attract attention.

Still haven’t said what they are. What they are traces back, of course, to polynomials. Take a polynomial that’s got one variable, which we call ‘x’ because we don’t want to be difficult. Suppose that all the coefficients of the polynomial, the constant numbers we presumably know or could find out, are integers. What are the roots of the polynomial? That is, for what values of x is the polynomial a complicated way of writing ‘zero’?

For example, try the polynomial x^2 – 6x + 5. If x = 1, then that polynomial is equal to zero. If x = 5, the polynomial’s equal to zero. Or how about the polynomial x^2 + 4x + 4? That’s equal to zero if x is equal to -2. So a polynomial with integer coefficients can certainly have positive and negative integers as roots.

How about the polynomial 2x – 3? Yes, that is so a polynomial. This is almost easy. That’s equal to zero if x = 3/2. How about the polynomial (2x – 3)(4x + 5)(6x – 7)? It’s my polynomial and I want to write it so it’s easy to find the roots. That polynomial will be zero if x = 3/2, or if x = -5/4, or if x = 7/6. So a polynomial with integer coefficients can have positive and negative rational numbers as roots.
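If you'd rather not take my word for those roots, exact fractions make the check painless. This is only a sketch using Python's standard `fractions` module; the names `p` and `p_expanded` are mine, made up for the illustration.

```python
from fractions import Fraction

def p(x):
    """The polynomial (2x - 3)(4x + 5)(6x - 7), evaluated exactly."""
    return (2 * x - 3) * (4 * x + 5) * (6 * x - 7)

def p_expanded(x):
    """The same polynomial multiplied out, showing its integer coefficients."""
    return 48 * x**3 - 68 * x**2 - 76 * x + 105

# The three rational roots claimed in the text.
roots = [Fraction(3, 2), Fraction(-5, 4), Fraction(7, 6)]

# Each value should be exactly zero, with no rounding involved.
values = [p(r) for r in roots]
```

Using `Fraction` rather than floating point matters here: 7/6 has no exact binary representation, so a floating-point check could come out as a tiny nonzero number instead of zero.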

How about the polynomial x^2 – 2? That’s equal to zero if x is the square root of 2, about 1.414. It’s also equal to zero if x is minus the square root of 2, about -1.414. And the square root of 2 is irrational. So we can certainly have irrational numbers as roots.

So if we can have whole numbers, and rational numbers, and irrational numbers as roots, how can there be anything else? Yes, complex numbers, I see you raising your hand there. We’re not talking about complex numbers just now. Only real numbers.

It isn’t hard to work out why we can get any whole number, positive or negative, from a polynomial with integer coefficients. Or why we can get any rational number. The irrationals, though … it turns out we can only get some of them this way. We can get square roots and cube roots and fourth roots and all that. We can get combinations of those. But we can’t get everything. There are irrational numbers that are there but that even polynomials can’t reach.

It’s all right to be surprised. It’s a surprising result. Maybe even unsettling. Transcendental numbers have something peculiar about them. The 19th Century French mathematician Joseph Liouville first proved the things must exist, in 1844. (He used continued fractions to show there must be such things.) It would be seven years later that he gave an example of one in nice, easy-to-understand decimals. This is the number 0.110 001 000 000 000 000 000 001 000 000 (et cetera). This number is zero almost everywhere. But there’s a 1 in the n-th digit past the decimal if n is the factorial of some number. That is, 1! is 1, so the 1st digit past the decimal is a 1. 2! is 2, so the 2nd digit past the decimal is a 1. 3! is 6, so the 6th digit past the decimal is a 1. 4! is 24, so the 24th digit past the decimal is a 1. The next 1 will appear in spot number 5!, which is 120. After that, 6! is 720 so we wait for the 720th digit to be 1 again.
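The digit rule is simple enough to hand to a computer. Here is a minimal sketch that rebuilds the opening digits from the factorial rule described above; the function name `liouville_digits` is my own invention for the example.

```python
from math import factorial

def liouville_digits(count):
    """Return the first `count` digits after the decimal point as a string."""
    # Collect the 1-based positions that hold a 1: 1!, 2!, 3!, and so on.
    ones = set()
    k = 1
    while factorial(k) <= count:
        ones.add(factorial(k))
        k += 1
    return "".join("1" if n in ones else "0" for n in range(1, count + 1))

first_30 = liouville_digits(30)

# Group in threes, the way the digits are written out in the text.
grouped = " ".join(first_30[i:i + 3] for i in range(0, 30, 3))
```

Asking for 120 digits instead of 30 picks up the fifth 1, sitting in the very last spot, just as the factorial rule says it should.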

And what is this Liouville number 0.110 001 000 000 000 000 000 001 000 000 (et cetera) used for, besides showing that a transcendental number exists? Not a thing. It’s of no other interest. And this plagued the transcendental numbers until 1873. The only examples anyone had of transcendental numbers were ones built to show that they existed. In 1873 Charles Hermite showed finally that e, the base of the natural logarithm, was transcendental. e is a much more interesting number; we have reasons to care about it. Every exponential growth or decay or oscillating process has e lurking in it somewhere. In 1882 Ferdinand von Lindemann showed that π was transcendental, and that’s an even more interesting number.

That bit about π has interesting implications. One goes back to the ancient Greeks. Is it possible, using straightedge and compass, to create a square that’s exactly the same size as a given circle? This is equivalent to saying, if I give you a line segment, can you create another line segment that’s exactly the square root of π times as long? This geometric problem is equivalent to an algebraic one. That problem: can you create a polynomial, with integer coefficients, that has the square root of π as a root? (WARNING: I’m skipping some important points for the sake of clarity. DO NOT attempt to use this to pass your thesis defense without putting those points back in.) We want the square root of π because … well, what’s the area of a square whose sides are the square root of π long? That’s right. So we start with a line segment that’s equal to the radius of the circle and we can do that, surely. Once we have the radius, can’t we make a line that’s the square root of π times the radius, and from that make a square with area exactly π times the radius squared? Since π is transcendental, then, no. We can’t. Sorry. One of the great problems of ancient mathematics, and one that still has the power to attract the casual mathematician, got its final answer in 1882.

Georg Cantor is a name even non-mathematicians might recognize. He showed there have to be some infinite sets bigger than others, and that there must be more real numbers than there are rational numbers. Four years after showing that, he proved there are as many transcendental numbers as there are real numbers.

They’re everywhere. They permeate the real numbers so much that we can understand the real numbers as the transcendental numbers plus some dust. They’re almost the dark matter of mathematics. We don’t actually know all that many of them. Wolfram MathWorld has a table listing numbers proven to be transcendental, and the fact we can list that on a single web page is remarkable. Some of them are large sets of numbers, yes, like $e^{\pi \sqrt{d}}$ for every positive whole number d. And we can infer many more from them; if π is transcendental then so is 2π, and so is 5π, and so is -20.38π, and so on. But the table of numbers proven to be transcendental is still just 25 rows long.

There are even mysteries about obvious numbers. π is transcendental. So is e. We know that at least one of π times e and π plus e is transcendental. Perhaps both are. We don’t know which one is, or if both are. We don’t know whether π^π is transcendental. We don’t know whether e^e is, either. Don’t even ask if π^e is.

How, by the way, does this fit with my claim that everything in mathematics is polynomials? — Well, we found these numbers in the first place by looking at polynomials. The set is defined, even to this day, by how a particular kind of polynomial can’t reach them. Thinking about a particular kind of polynomial makes visible this interesting set.

The Set Tour, Part 12: What Can You Do With Functions?

I want to resume my tour of sets that turn up a lot as domains and ranges. But I need to spend some time explaining stuff before the next bunch. I want to talk about things that aren’t so familiar as “numbers” or “shapes”. We get into more abstract things.

We have to start out with functions. Functions are built of three parts: a set that’s the domain, a set that’s the range, and a rule that matches things in the domain to things in the range. But what’s a set? Sets are bunches of things. (If we want to avoid logical chaos we have to be more exact. But we’re not going near the zones of logical chaos. So we’re all right going with “sets are bunches of things”. WARNING: do not try to pass this off at your thesis defense.)

So if a function is a thing, can’t we have a set that’s made up of functions? Sure, why not? We can get a set by describing the collection of things we want in it. At least if we aren’t doing anything weird. (See above warning.)

Let’s pick out a set of functions. Put together a group of functions that all have the same set as their domain, and that have compatible sets as their range. The real numbers are a good pick for a domain. They’re also good for a range.

Is this an interesting set? Generally, a set is boring unless we can do something with the stuff in it. That something is, almost always, taking a pair of the things in the set and relating it to something new. Whole numbers, for example, would be trivia if we weren’t able to add them together. Real numbers would be a complicated pile of digits if we couldn’t multiply them together. Having things is nice. Doing stuff with things is all that’s meaningful.

So what can we do with a couple of functions, if they have the same domains and ranges? Let’s pick one out. Give it the name ‘f’. That’s a common name for functions. It was given to us by Leonhard Euler, who was brilliant in every field of mathematics, including in creating notation. Now let’s pick out a function again. Give this new one the name ‘g’. That’s a common name for functions, given to us by every mathematician who needed something besides ‘f’. (There are alternatives. One is to start using subscripts, like f_1 and f_2. That’s too hard for me to type. Another is to use different typefaces. Again, too hard for me. Another is to use lower- and upper-case letters, ‘f’ and ‘F’. Using alternate-case forms usually connotes that these two functions are related in some way. I don’t want to suggest that they are related here. So, ‘g’ it is.)

We can do some obvious things. We can add them together. We can create a new function, imaginatively named ‘f + g’. It’ll have the same domain and the same range as f and g did. What rule defines how it matches things in the domain to things in the range?

Mathematicians throw the term “obvious” around a lot. Also “intuitive”. What they mean is “what makes sense to me but I don’t want to write it down”. Saying that is fine if your mathematician friend knows roughly what you’d think makes sense. It can be catastrophic if she’s much smarter than you, or thinks in weird ways, and is always surprised other people don’t think like her. It’s hard to better describe it than “obvious”, though. Well, here goes.

Let me pick something that’s in the domain of both f and g. I’m going to call that x, which mathematicians have been doing ever since René Descartes gave us the idea. So “f(x)” is something in the range of f, and “g(x)” is something in the range of g. I said, way up earlier, that both of these ranges are the same set and suggested the real numbers there. That is, f(x) is some real number and I don’t care which just now. g(x) is also some real number and again I don’t care right now just which.

The function we call “f + g” matches the thing x, in the domain, to something in the range. What thing? The number f(x) + g(x). I told you, I can’t see any fair way to describe that besides being “obvious” and “intuitive”.

Another thing we’ll want to do is multiply a function by a real number. Suppose we have a function f, just like above. Give me a real number. We’ll call that real number ‘a’ because I don’t remember if you can do the alpha symbol easily on web pages. Anyway, we can define a function, ‘af’, the multiplication of the real number a by the function f. It has the same domain as f, and the same range as f. What’s its rule?

Let me say x is something in the domain of f. So f(x) is some real number. Then the new function ‘af’ matches the x in the domain with a real number. That number is what you get by multiplying ‘a’ by whatever ‘f(x)’ is. So there are major parts of your mathematician friend from college’s classes that you could have followed without trouble.

(Her class would have covered many more things, mind you, and covered these more cryptically.)
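Both rules are short enough to write out as code, which may make "obvious" a little more honest. This is a sketch only: the helper names `add` and `scale`, and the particular example functions, are my own choices and not anything standard.

```python
import math

def add(f, g):
    """Return the function f + g, whose rule is x -> f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(a, f):
    """Return the function af, whose rule is x -> a * f(x)."""
    return lambda x: a * f(x)

f = lambda x: x * x      # an arbitrary example function on the reals
g = math.sin             # another arbitrary example function

h = add(f, g)            # h(x) is x*x + sin(x)
k = scale(3.0, f)        # k(x) is 3 * x*x
```

Notice that `add` and `scale` take functions in and hand a function back. The new function has the same domain as the old ones, because all it ever does with x is pass it along to f and g.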

There’s more stuff we would like to do with functions. But for now, this is enough. This lets us turn a set of functions into a “vector space”. Vector spaces are kinds of things that work, at least a bit, like arithmetic. And mathematicians have studied these kinds of things. We have a lot of potent tools that work on vector spaces. So mathematicians develop a habit of finding vector spaces in what they study.

And I’m subject to that too. This is why I’ve spent such time talking about what we can do with functions rather than naming particular sets. I’ll pick up from that.

The Set Tour, Part 7: Matrices

I feel a bit odd about this week’s guest in the Set Tour. I’ve been mostly concentrating on sets that get used as the domains or ranges for functions a lot. The ones I want to talk about here don’t tend to serve the role of domain or range. But they are used a great deal in some interesting functions. So I loosen my rule about what to talk about.

R^(m x n) and C^(m x n)

R^(m x n) might explain itself by this point. If it doesn’t, then this may help: the “x” here is the multiplication symbol. “m” and “n” are positive whole numbers. They might be the same number; they might be different. So, are we done here?

Maybe not quite. I was fibbing a little when I said “x” was the multiplication symbol. R^(2 x 3) is not a longer way of saying R^6, an ordered collection of six real-valued numbers. The x does represent a kind of product, though. What we mean by R^(2 x 3) is an ordered collection, two rows by three columns, of real-valued numbers. Say the “x” here aloud as “by” and you’re pronouncing it correctly.

What we get is called a “matrix”. If we put into it only real-valued numbers, it’s a “real matrix”, or a “matrix of reals”. Sometimes mathematical terminology isn’t so hard to follow. Just as with vectors, R^n, it matters just how the numbers are organized. R^(2 x 3) means something completely different from what R^(3 x 2) means. And swapping which positions the numbers in the matrix occupy changes what matrix we have, as you might expect.

You can add together matrices, exactly as you can add together vectors. The same rules even apply. You can only add together two matrices of the same size. They have to have the same number of rows and the same number of columns. You add them by adding together the numbers in the corresponding slots. It’s exactly what you would do if you went in without preconceptions.
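For anyone who does want it spelled out, here is a minimal sketch of that addition, with a matrix stored as a list of rows. The name `mat_add` is hypothetical, and the numbers are just ones I made up.

```python
def mat_add(a, b):
    """Add two matrices given as lists of rows; the sizes must match."""
    same_rows = len(a) == len(b)
    same_cols = all(len(ra) == len(rb) for ra, rb in zip(a, b))
    if not (same_rows and same_cols):
        raise ValueError("matrices must have the same number of rows and columns")
    # Add the numbers in corresponding slots.
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

a = [[1, 2, 3],
     [4, 5, 6]]          # two rows by three columns
b = [[10, 20, 30],
     [40, 50, 60]]       # also two rows by three columns

total = mat_add(a, b)
```

Handing `mat_add` a two-by-three matrix and a three-by-two matrix raises an error, which is the code's way of saying the sum isn't defined.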

You can also multiply a matrix by a single number. We called this scalar multiplication back when we were working with vectors. With matrices, we call this scalar multiplication. If it strikes you that we could see vectors as a kind of matrix, yes, we can. Sometimes that’s wise. We can see a vector as a matrix in the set R^(1 x n) or as one in the set R^(n x 1), depending on just what we mean to do.

It’s trickier to multiply two matrices together. As with vectors, multiplying the numbers in corresponding positions together doesn’t give us anything. What we do instead is a time-consuming but not actually hard process. But according to its rules, something in R^(m x n) we can multiply by something in R^(n x k). “k” is another whole number. The second thing has to have exactly as many rows as the first thing has columns. What we get is a matrix in R^(m x k).

I grant you maybe didn’t see that coming. Also a potential complication: if you can multiply something in R^(m x n) by something in R^(n x k), can you multiply the thing in R^(n x k) by the thing in R^(m x n)? … No, not unless k and m are the same number. Even if they are, you can’t count on getting the same product. Matrices are weird things this way. They’re also gateways to weirder things. But it is a productive weirdness, and I’ll explain why in a few paragraphs.
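The size rules are easier to believe once you watch them happen. The sketch below uses the usual row-times-column rule for the product, which I haven't spelled out in the text; `mat_mul` is a made-up name and matrices are again lists of rows.

```python
def mat_mul(a, b):
    """Multiply an (m by n) matrix by an (n by k) matrix, giving (m by k)."""
    n = len(a[0])
    if len(b) != n:
        raise ValueError("columns of the first must equal rows of the second")
    k = len(b[0])
    # Entry (i, j) is row i of a times column j of b, summed term by term.
    return [[sum(row[t] * b[t][j] for t in range(n)) for j in range(k)]
            for row in a]

a = [[1, 2, 3],
     [4, 5, 6]]            # 2 by 3
b = [[1, 0],
     [0, 1],
     [2, 2]]               # 3 by 2

ab = mat_mul(a, b)         # 2 by 2
ba = mat_mul(b, a)         # 3 by 3, a different size entirely

# Even square matrices of the same size need not give the same product.
p = [[1, 1], [0, 1]]
q = [[1, 0], [1, 1]]
pq = mat_mul(p, q)
qp = mat_mul(q, p)
```

Here `ab` and `ba` both exist only because k and m are both 2, and they aren't even the same shape. The square pair `p` and `q` shows the other half of the warning: the same sizes, and still two different products.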

A matrix is a way of organizing terms. Those terms can be anything. Real matrices are surely the most common kind of matrix, at least in mathematical usage. Next in common use would be complex-valued matrices, much like how we get complex-valued vectors. These are written C^(m x n). A complex-valued matrix is different from a real-valued matrix. The terms inside the matrix can be complex-valued numbers, instead of real-valued numbers. Again, sometimes, these mathematical terms aren’t so tricky.

I’ve heard occasionally of people organizing matrices of other sets. The notation is similar. If you’re building a matrix of “m” rows and “n” columns out of the things you find inside a set we’ll call H, then you write that as H^(m x n). I’m not saying you should do this, just that if you need to, that’s how to tell people what you’re doing.

Now. We don’t really have a lot of functions that use matrices as domains, and I can think of fewer that use matrices as ranges. There are a couple of valuable ones, ones so valuable they get special names like “eigenvalue” and “eigenvector”. (Don’t worry about what those are.) They take in R^(m x n) or C^(m x n) and return a set of real- or complex-valued numbers, or real- or complex-valued vectors. Not even those, actually. Eigenvalues and eigenvectors are only meaningful if there are exactly as many rows as columns. That is, for R^(m x m) and C^(m x m). These are known as “square” matrices, just as you might guess if you were shaken awake and ordered to say what you guessed a “square matrix” might be.

They’re important functions. There are some other important functions, with names like “rank” and “condition number” and the like. But they’re not many. I believe they’re not even thought of as functions, any more than we think of “the length of a vector” as primarily a function. They’re just properties of these matrices, that’s all.

So why are they worth knowing? Besides the joy that comes of knowing something, I mean?

Here’s one answer, and the one that I find most compelling. There is cultural bias in this: I come from an applications-heavy mathematical heritage. We like differential equations, which study how stuff changes in time and in space. It’s very easy to go from differential equations to ordered sets of equations. The first equation may describe how the position of particle 1 changes in time. It might describe how the velocity of the fluid moving past point 1 changes in time. It might describe how the temperature measured by sensor 1 changes as it moves. It doesn’t matter. We get a set of these equations together and we have a majestic set of differential equations.

Now, the dirty little secret of differential equations: we can’t solve them. Most interesting physical phenomena are nonlinear. Linear stuff is easy. Small change 1 has effect A; small change 2 has effect B. If we make small change 1 and small change 2 together, this has effect A plus B. Nonlinear stuff, though … it just doesn’t work. Small change 1 has effect A; small change 2 has effect B. Small change 1 and small change 2 together has effect … A plus B plus some weird A times B thing plus some effect C that nobody saw coming and then C does something with A and B and now maybe we’d best hide.
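A toy calculation shows the difference. The two response functions below are invented purely for illustration: one is a straight proportion, the other has a squared term tacked on.

```python
def linear_effect(change):
    """A linear response: the effect is proportional to the change."""
    return 3.0 * change

def nonlinear_effect(change):
    """A nonlinear response: the linear part plus a squared term."""
    return 3.0 * change + change ** 2

c1, c2 = 0.5, 0.25   # two small changes

# Linear: the effect of both changes together is the sum of the effects.
linear_gap = linear_effect(c1 + c2) - (linear_effect(c1) + linear_effect(c2))

# Nonlinear: a cross term, 2 * c1 * c2, is left over.
nonlinear_gap = (nonlinear_effect(c1 + c2)
                 - (nonlinear_effect(c1) + nonlinear_effect(c2)))
```

The linear gap is zero. The nonlinear gap is the "weird A times B thing", and this is about the tamest nonlinearity there is.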

There are some nonlinear differential equations we can solve. Those are the result of heroic work and brilliant insights. Compared to all the things we would like to solve there’s not many of them. Methods to solve nonlinear differential equations are as precious as ways to slay krakens.

But here’s what we can do. What we usually like to know about in systems are equilibriums. Those are the conditions in which the system stops changing. Those are interesting. We can usually find those points by boring but not conceptually challenging calculations. If we can’t, we can declare x_0 represents the equilibrium. If we still care, we leave calculating its actual values to the interested reader or hungry grad student.

But what’s really interesting is: what happens if we’re near but not exactly at the equilibrium? Sometimes, we stay near it. Think of pushing a swing. However good a push you give, it’s going to settle back to the boring old equilibrium of dangling straight down. Sometimes, we go racing away from it. Think of trying to balance a pencil on its tip; if we did this perfectly it would stay balanced. It never does. We’re never perfect, or there’s some wind or somebody walks by and the perfect balance is foiled. It falls down and doesn’t bounce back up. Sometimes, whether it stays near or goes away depends on what way it’s away from the equilibrium.

And now we finally get back to matrices. Suppose we are starting out near an equilibrium. We can, usually, approximate the differential equations that describe what will happen. The approximation may only be good if we’re just a tiny bit away from the equilibrium, but that might be all we really want to know. That approximation will be some linear differential equations. (If they’re not, then we’re just wasting our time.) And that system of linear differential equations we can describe using matrices.

If we can write what we are interested in as a set of linear differential equations, then we have won. We can use the many powerful tools of matrix arithmetic — linear algebra, specifically — to tell us everything we want to know about the system. We can say whether a small push away from the equilibrium stays small, or whether it grows, or whether it depends. We can say how fast the small push shrinks, or grows (for a while). We can say how the system will change, approximately.
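Here is a sketch of what that looks like for the swing and the pencil. It leans on a standard fact I haven't explained: for a linear system, a small push dies away if every eigenvalue of the matrix has a negative real part, and grows if any eigenvalue has a positive one. The matrices are assumptions of mine, with made-up friction and units chosen so the numbers stay simple.

```python
import cmath

def eigenvalues_2x2(m):
    """Eigenvalues of a 2 by 2 matrix, from its trace and determinant."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def classify(m):
    """Say whether a small push shrinks, grows, or can't be settled this way."""
    reals = [lam.real for lam in eigenvalues_2x2(m)]
    if all(r < 0 for r in reals):
        return "stable"
    if any(r > 0 for r in reals):
        return "unstable"
    return "undecided"

# The state is (displacement, velocity); each row says how one of them changes.
swing = [[0.0, 1.0], [-1.0, -0.5]]    # hanging straight down, with friction
pencil = [[0.0, 1.0], [1.0, -0.5]]    # balanced on its tip, with friction
```

Calling `classify(swing)` gives "stable" and `classify(pencil)` gives "unstable". The pencil's matrix has one positive and one negative eigenvalue, which is the linear-algebra version of "it depends which way you push".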

This is what I love in matrices. It’s not everything there is to them. But it’s enough to make matrices important to me.

The Set Tour, Part 6: One Big One Plus Some Rubble

I have a couple of sets for this installment of the Set Tour. It’s still an unusual installment because only one of the sets is that important for my purposes here. The rest I mention because they appear a lot, even if they aren’t much used in these contexts.

I, or J, or maybe Z

The important set here is the integers. You know the integers: they’re the numbers everyone knows. They’re the numbers we count with. They’re 1 and 2 and 3 and a hundred million billion. As we get older we come to accept 0 as an integer, and even the negative integers like “negative 12” and “minus 40” and all that. The integers might be the easiest mathematical construct to know. The positive integers, anyway. The negative ones are still a little suspicious.

The set of integers has several shorthand names. I is a popular and common one. As with the real-valued numbers R and the complex-valued numbers C it gets written by hand, and typically typeset, with a double vertical stroke. And we’ll put horizontal serifs on the top and bottom of the symbol. That’s a concession to readability. You see the same effect in comic strip lettering. A capital “I” in the middle of a word will often be written without serifs, while the word by itself needs the extra visual bulk.

The next popular symbol is J, again with a double vertical stroke. This gets used if we want to reserve “I”, or the word “I”, for some other purpose. J probably gets used because it’s so very close to I, and it’s only quite recently (in historic terms) that they’ve even been seen as different letters.

The symbol that seems to come out of nowhere is Z. It comes less from nowhere than it does from German. The symbol derives from “Zahl”, meaning “number”. It seems to have got into mathematics by way of Nicolas Bourbaki, the renowned imaginary French mathematician. The Z gets written with a double diagonal stroke.

Personally, I like Z most of this set, but on trivial grounds. It’s a more fun letter to write, especially since I write it with the middle horizontal stroke. I’ve got no good cultural or historical reason for this. I just picked it up as a kid and never set it back down.

In these Set Tour essays I’m trying to write about sets that get used often as domains and ranges for functions. The integers get used a fair bit, although not nearly as often as real numbers do. The integers are a natural way to organize sequences of numbers. If the record of a week’s temperatures (in Fahrenheit) is “58, 45, 49, 54, 58, 60, 64”, there’s an almost compelling temperature function here. f(1) = 58, f(2) = 45, f(3) = 49, f(4) = 54, f(5) = 58, f(6) = 60, f(7) = 64. This is a function that has as its domain the integers. It happens that the range here is also integers, although you might be able to imagine a day when the temperature reading was 54.5.
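If it helps to see the function as an object, here is a small sketch. Storing it as a dictionary keyed by day number is just one convenient choice of mine.

```python
# The week's readings, in Fahrenheit, as given in the text.
readings = [58, 45, 49, 54, 58, 60, 64]

# Day numbers 1 through 7 are the inputs; the readings are the outputs.
f = {day: temp for day, temp in enumerate(readings, start=1)}

days_used = sorted(f)                      # the integers we actually feed in
values_taken = sorted(set(f.values()))     # the distinct readings that come out
```

Looking up `f[3]` gives 49, matching f(3) above. Only seven integers ever get used as inputs, which is part of why the domain is so easy to overlook.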

Sequences turn up a lot. We are almost required to measure things we are interested in in discrete samples. So mathematical work with sequences uses integers as the domain almost by default. The use of integers as a domain gets done so often that it often becomes invisible, though. Someone studying my temperature data above might write the data as f_1, f_2, f_3, and so on. One might reasonably never even notice there’s a function there, or a domain.

And that’s fine. A tool can be so useful it disappears. Attend a play; the stage is in light and the audience in darkness. The roles the light and darkness play disappear unless the director chooses to draw attention to this choice.

And to be honest, integers are a lousy domain for functions. It’s achingly hard to prove things for functions defined just on the integers. The easiest way to do anything useful is typically to find an equivalent problem for a related function that’s got the real numbers as a domain. Then show the answer for that gives you your best-possible answer for the original question.

If all we want are the positive integers, we put a little superscript + to our symbol: I^+ or J^+ or Z^+. That’s a popular choice if we’re using the integers as an index. If we just want the negative numbers that’s a little weird, but, change the plus sign to a minus: I^-.

Now for some trouble.

Sometimes we want the positive numbers and zero, or in the lingo, the “nonnegative numbers”. Good luck with that. Mathematicians haven’t quite settled on what this should be called, or abbreviated. The “Natural numbers” is a common name for the numbers 0, 1, 2, 3, 4, and so on, and this makes perfect sense and gets abbreviated N. You can double-brace the left vertical stroke, or the diagonal stroke, as you like and that will be understood by everybody.

That is, everybody except the people who figure “natural numbers” should be 1, 2, 3, 4, and so on, and that zero has no place in this set. After all, every human culture counts with 1 and 2 and 3, and for that matter crows and raccoons understand the concept of “four”. Yet it took thousands of years for anyone to think of “zero”, so how natural could that be?

So we might resort to speaking of the “whole numbers” instead. More good luck with that. Besides leaving open the question of whether zero should be considered “whole” there’s the linguistic problem. “Whole” number carries, for many, the implication of a number that is an integer with no fractional part. We already have the word “integer” for that, yes. But the fact people will talk about rounding off to a whole number suggests the phrase “whole number” serves some role that the word “integer” doesn’t. Still, W is sitting around not doing anything useful.

Then there’s “counting numbers”. I would be willing to endorse this as a term for the integers 0, 1, 2, 3, 4, and so on, except. Have you ever met anybody who starts counting from zero? Yes, programmers for some — not all! — computer languages. You know which computer languages. They’re the languages which baffle new students because why on earth would we start counting things from zero all of a sudden? And the obvious single-letter abbreviation C is no good because we need that for complex numbers, a set that people actually use for domains a lot.

There is a good side to this, if you aren’t willing to sit out the 150 years or so mathematicians are going to need to sort this all out. You can set out a symbol that makes sense to you, early on in your writing, and stick with it. If you find you don’t like it, you can switch to something else in your next paper and nobody will protest. If you figure out a good one, people may imitate you. If you figure out a really good one, people will change it just a tiny bit so that their usage drives you crazy. Life is like that.

Eric Weisstein’s MathWorld recommends using Z* for the nonnegative integers. I don’t happen to care for that. I usually associate superscript * symbols with some operations involving complex-valued numbers and with the duals of sets, neither of which is in play here. But it’s not like he’s wrong and I’m right. If I were forced to pick a symbol right now I’d probably give Z_0^+. And for the nonpositive integers (the negative integers and zero) Z_0^- presents itself. I fully understand there are people who would be driven stark raving mad by this. Maybe you have a better one. I’d believe that.

Let me close with something non-controversial.

These are some sets that are too important to go unmentioned. But they don’t get used much in the domain-and-range role I’ve been using as basis for these essays. They are, in the terrain of these essays, some rubble.

You know the rational numbers? They’re the things you can write as fractions: 1/2, 5/13, 32/7, -6/7, 0 (think about it). This is a quite useful set, although it doesn’t get used much for the domain or range of functions, at least not in the fields of mathematics I see. It gets abbreviated as Q, though. There’s an extra vertical stroke on the left side of the loop, just as a vertical stroke gets added to the C for complex-valued numbers. Why Q? Well, “R” is already spoken for, as we need it for the real numbers. The key here is that every rational number can be written as the quotient of one integer divided by another. So, this is the set of Quotients. This abbreviation we get thanks to Bourbaki, the same folks who gave us Z for integers. If it strikes you that the imaginary French mathematician Bourbaki used a lot of German words, all I can say is I think that might have been part of the fun of the Bourbaki project. (Well, and German mathematicians gave us many breakthroughs in the understanding of sets in the late 19th and early 20th centuries. We speak with their language because they spoke so well.)

If you’re comfortable with real numbers and with rational numbers, you know of irrational numbers. These are (most) square roots, and pi and e, and the golden ratio and a lot of cosines of angles. Strangely, there really isn’t any common shorthand name or common notation for the irrational numbers. If we need to talk about them, we have the shorthand “R \ Q”. This means “the real numbers except for the rational numbers”. Or we have the shorthand “Q^c”. This means “everything except the rational numbers”. That “everything” carries the implication “everything in the real numbers”. The “c” in the superscript stands for “complement”, everything outside the set we’re talking about. These are ungainly, yes. And it’s a bit odd considering that most real numbers are irrational numbers. The rational numbers are a most ineffable cloud of dust in the atmosphere of the real numbers.

But, mostly, we don’t need to talk about functions that have an irrational-number domain. We can do our work with a real-number domain instead. So we leave that set with a clumsy symbol. If there’s ever a gold rush of fruitful mathematics to be done with functions on irrational domains then we’ll put in some better notation. Until then, there are better jobs for our letters to do.

The Set Tour, Part 5: C^n

The next piece in this set tour is a hybrid. It mixes properties of the last two sets. And I’ll own up now that while it’s a set that gets used a lot, it’s one that gets used a lot in just some corners of mathematics. It’s got a bit of that “Internet fame”. In particular circles it’s well-known; venture outside those circles even a little, and it’s not. But it leads us into other, useful places.

C^n

C here is the set of complex-valued numbers. We may have feared them once, but now they’re friends, or at least something we can work peacefully with. n here is some counting number, just as it is with R^n. n could be one or two or forty or a hundred billion. It’ll be whatever fits the problem we’re doing, if we need to pin down its value at all.

The reference to R^n, another friend, probably tipped you off to the rest. The items in C^n are n-tuples, ordered sets of some number n of numbers. Each of those numbers is itself a complex-valued number, something from C. C^n gets typeset in bold, and often with that extra vertical stroke on the left side of the C arc. It’s handwritten that way, too.

As with R^n we can add together things in C^n. Suppose that we are in C^2 so that I don’t have to type too much. Suppose the first number is (2 + i, -3 – 3*i) and the second number is (6 – 2*i, 2 + 9*i). There could be fractions or irrational numbers in the real and imaginary components, but I don’t want to type that much. The work is the same. Anyway, the sum will be another number in C^n. The first term in that sum will be the sum of the first term in the first number, 2 + i, and the first term in the second number, 6 – 2*i. That in turn will be the sum of the real and of the imaginary components, so, 2 + 6 + i – 2*i, or 8 – i all told. The second term of the sum will be the sum of the second term of the first number, -3 – 3*i, and the second term of the second number, 2 + 9*i, which will be -3 – 3*i + 2 + 9*i or, all told, -1 + 6*i. The sum is the n-tuple (8 – i, -1 + 6*i).
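If you would rather let a computer do the typing, here is a small sketch of that same term-by-term addition in Python, which writes the imaginary unit as `j` rather than `i`. The helper name `add_tuples` is my own invention for illustration, not a standard function.

```python
def add_tuples(u, v):
    """Add two n-tuples of complex numbers term by term (hypothetical helper)."""
    if len(u) != len(v):
        raise ValueError("both tuples need the same number of terms")
    return tuple(a + b for a, b in zip(u, v))

# The two numbers from the example above.
first = (2 + 1j, -3 - 3j)
second = (6 - 2j, 2 + 9j)

total = add_tuples(first, second)
```

The result, `total`, holds 8 – i in its first slot and -1 + 6*i in its second, matching the work done by hand.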

And also as with R^n there really isn’t multiplying of one term of C^n by another. Generally, we can’t do this in any useful way. We can multiply something in C^n by a scalar, a single real (or, why not, complex-valued) number, though.

So let’s start out with (8 – i, -1 + 6*i), a number in C^2. And then pick a scalar, say, 2 + 2*i. It doesn’t have to be complex-valued, but, why not? The product of this scalar and this number will be another number in C^2. Its first term will be the scalar, 2 + 2*i, multiplied by the first term in it, 8 – i. That’s (2 + 2*i) * (8 – i), or 2*8 – 2*i + 16*i – 2*i*i, or 2*8 – 2*i + 16*i + 2, or 18 + 14*i. And then its second term will be the scalar 2 + 2*i multiplied by the second term, -1 + 6*i. That’s (2 + 2*i)*(-1 + 6*i), or 2*(-1) + 2*6*i – 2*i + 2*6*i*i. And that’s -2 + 12*i – 2*i – 12, or -14 + 10*i. So the product is (18 + 14*i, -14 + 10*i).
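The scalar multiplication can be checked the same way. This is only a sketch; `scale_tuple` is a made-up helper name, and Python again spells the imaginary unit `j`.

```python
def scale_tuple(scalar, v):
    """Multiply every term of an n-tuple by one scalar (hypothetical helper)."""
    return tuple(scalar * term for term in v)

# The number and the scalar from the example above.
number = (8 - 1j, -1 + 6j)
scalar = 2 + 2j

product = scale_tuple(scalar, number)
```

Here `product` holds 18 + 14*i and -14 + 10*i, the same pair as worked out above. A real scalar such as `2` would work just as well in place of `2 + 2j`.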

So as with R^n, C^n creates a “vector space”. These spaces are useful in complex analysis. They’re also useful in the study of affine geometry, a corner of geometry that I’m sad to admit falls outside what I studied. I have tried reading up on them on my own, and I run aground each time. I understand the basic principles but never quite grasp why they are interesting. That’s my own failing, of course, and I’d be glad for a pointer that explained in ways I understood why they’re so neat.

I do understand some of what’s neat about them: affine geometry tells us what we can know about shapes without using the concept of “distance”. When you discover that we can know anything about shapes without the idea of “distance” your imagination should be fired. Mine is, too. I just haven’t followed from that to feel comfortable with the terminology and symbols of the field.

You could, if you like, think of C^n as being a specially-delineated version of R^(2*n). This is just as you can see a complex number as an ordered pair of real numbers. But sometimes information is usefully thought of as a single, complex-valued number. And there is a value in introducing the idea of ordered sets of things that are not real numbers. We will see the concept again.
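To make the “n complex numbers are 2*n real numbers” idea concrete, here is one possible sketch. Both function names are hypothetical, and putting each real part ahead of its imaginary part is just one convention among several you could choose.

```python
def to_real_tuple(v):
    """Flatten an n-tuple of complex numbers into 2*n real numbers."""
    flat = []
    for z in v:
        z = complex(z)
        flat.extend([z.real, z.imag])
    return tuple(flat)

def to_complex_tuple(flat):
    """Rebuild the n-tuple of complex numbers from 2*n real numbers."""
    if len(flat) % 2 != 0:
        raise ValueError("need an even count of real numbers")
    return tuple(complex(flat[k], flat[k + 1]) for k in range(0, len(flat), 2))

flattened = to_real_tuple((8 - 1j, -1 + 6j))
restored = to_complex_tuple(flattened)
```

Going there and back again returns the pair we started with, so nothing is lost in the translation. What the flattened form does lose is any reminder that the numbers come in real-and-imaginary pairs.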

Also, the heck did I write an 800-word essay about the family of sets of complex-valued n-tuples and have Hemingway Editor judge it to be at the “Grade 3” reading level? I rarely get down to “Grade 6” when I do a Reading the Comics post explaining how Andertoons did a snarky-word-problem-answers panel. That’s got to be a temporary glitch.

C

The square root of negative one. Everybody knows it doesn’t exist; there’s no real number you can multiply by itself and get negative one out. But then sometime in algebra, deep in a section about polynomials, suddenly we come out and declare there is such a thing. It’s an “imaginary number” that we call “i”. It’s hard to blame students for feeling betrayed by this. To make it worse, we throw real and imaginary numbers together and call the result “complex numbers”. It’s as if we’re out to tease them for feeling confused.

It’s an important set of things, though. It turns up as the domain, or the range, of functions so often that one of the major fields of analysis is called, “Complex Analysis”. If the course listing allows for more words, it’s called “Analysis of Functions of a Complex Variable” or something like that. Despite the connotations of the word “complex”, though, the field is a delight. It’s considerably easier to understand than Real Analysis, the study of functions of mere real numbers. When there is a theorem that has a version in Real Analysis and a version in Complex Analysis, the Complex Analysis side is usually easier to prove and easier to understand. It’s uncanny.

The set of all complex numbers is denoted C, in parallel to the set of real numbers, R. To make it clear that we mean this set, and not some piddling little common set that might happen to share the name C, add a vertical stroke to the left of the letter. This is just as we add a vertical stroke to R to emphasize we mean the Real Numbers. We should approach the set with respect, removing our hats, thinking seriously about great things. It would look silly to add a second curve to C though, so we just add a straight vertical stroke on the left side of the letter C. This makes it look a bit like it’s an Old English typeface (the kind you call Gothic until you learn that means “sans serif”) pared down to its minimum.

Why do we teach people there’s no such thing as a square root of minus one, and then one day, teach them there is? Part of it is that whether there is a square root depends on your context. If you are interested only in the real numbers, there’s nothing that, squared, gives you minus one. This is exactly the way that it’s not possible to equally divide five objects between two people if you aren’t allowed to cut the objects in half. But if you are willing to allow half-objects to be things, then you can do what was previously forbidden. What you can do depends on what the rules you set out are.
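Python’s standard library happens to act out this point about context. The sketch below asks the same question, “what is the square root of minus one?”, under two sets of rules: `math` sticks to real numbers and `cmath` allows complex ones. The variable names are mine.

```python
import cmath
import math

# Under real-number rules the question has no answer.
try:
    math.sqrt(-1)
    real_root_exists = True
except ValueError:
    real_root_exists = False

# Under complex-number rules it does: Python writes the answer as 1j.
complex_root = cmath.sqrt(-1)
squared = complex_root * complex_root
```

The first attempt fails, and the second gives a number whose square is minus one. Neither library is wrong; they just set out different rules.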

And there’s surely some echo of the historical discovery of imaginary and complex numbers at work here. They were noticed when working out the roots of third- and fourth-degree polynomials. These can be done by way of formulas that nobody ever remembers because there are so many better things to remember. These formulas would sometimes require one to calculate a square root of a negative number, a thing that obviously didn’t exist. Except that if you pretended it did, you could get out correct answers, just as if these were ordinary numbers. You can see why this may be dubbed an “imaginary” number. The name hints at the suspicion with which it’s viewed. It’s much as “negative” numbers look like some trap to people who’re just getting comfortable with fractions.

It goes against the stereotype of mathematicians to suppose they’d accept working with something they don’t understand because the results are all right, afterwards. But, actually, mathematicians are willing to accept getting answers by any crazy method. If you have a plausible answer, you can test whether it’s right, and if all you really need this minute is the right answer, good.

But we do like having methods; they’re more useful than mere answers. And we can imagine this set called the complex numbers. They contain … well, all the possible roots, the solutions, of all polynomials. (The polynomials might have coefficients — the numbers in front of the variable — of integers, or rational numbers, or irrational numbers. If we already accept the idea of complex numbers, the coefficients can be complex numbers too.)

It’s exceedingly common to think of the complex numbers by starting off with a new number called “i”. This is a number about which we know nothing except that i times i equals minus one. Then we tend to think of complex numbers as “a real number plus i times another real number”. The first real number gets called “the real component”, and is usually denoted as either “a” or “x”. The second real number gets called “the imaginary component”, and is usually denoted as either “b” or “y”. Then the complex number is written “a + i*b” or “x + i*y”. Sometimes it’s written “a + b*i” or “x + y*i”; that’s a mere matter of house style. Don’t let it throw you.

Writing a complex number this way has advantages. Particularly, it makes it easy to see how one would add together (or subtract) complex numbers: “a + b*i + x + y*i” almost suggests that the sum should be “(a + x) + (b + y)*i”. What we know from ordinary arithmetic gives us guidance. And if we’re comfortable with binomials, then we know how to multiply complex numbers. Start with “(a + b*i) * (x + y*i)” and follow the distributive law. We get, first, “a*x + a*y*i + b*i*x + b*y*i*i”. But “i*i” equals minus one, so this is the same as “a*x + a*y*i + b*i*x – b*y”. Move the real components together, and move the imaginary components together, and we have “(a*x – b*y) + (a*y + b*x)*i”.
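Here is a quick sketch that follows those two formulas literally, treating a complex number as nothing more than a pair of real numbers, and then compares against Python’s built-in complex type. The function names `add_parts` and `multiply_parts` are hypothetical.

```python
def add_parts(a, b, x, y):
    """Real and imaginary parts of (a + b*i) + (x + y*i)."""
    return (a + x, b + y)

def multiply_parts(a, b, x, y):
    """Real and imaginary parts of (a + b*i) * (x + y*i)."""
    return (a * x - b * y, a * y + b * x)

# Try (2 + 2*i) and (8 - i).
sum_parts = add_parts(2, 2, 8, -1)
product_parts = multiply_parts(2, 2, 8, -1)

# Python's own complex arithmetic, for comparison.
builtin_product = complex(2, 2) * complex(8, -1)
```

The hand-built product is the pair (18, 14), and the built-in product is 18 + 14*i, so the distributive-law bookkeeping holds up.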

That’s the most common way of writing out complex numbers. It’s so common that Eric W Weisstein’s Mathworld encyclopedia even says that’s what complex numbers are. But it isn’t the only way to construct, or look at, complex numbers. A common alternate way to look at complex numbers is to match a complex number to a point on the plane, or if you prefer, a point in the set R^2.

It’s surprisingly natural to think of the real component as how far to the right or left of an origin your complex number is, and to think of the imaginary component as how far above or below the origin it is. Much complex-number work makes sense if you think of complex numbers as points in space, or directions in space. The language of vectors trips us up only a little bit here. We speak of a complex number as corresponding to a point on the “complex plane”, just as we might speak of a real number as a point on the “(real) number line”.

But there are other descriptions yet. We can represent complex numbers as a pair of numbers with a scheme that looks like polar coordinates. Pick a point on the complex plane. We can say where that is by two pieces of information. The first is the amplitude, or magnitude: how far the point is from the origin. The second is the phase, or angle: draw the line segment connecting the origin and your point. What angle does that make with the positive horizontal axis?

This representation is called the “phasor” representation. It’s tolerably popular in physics and I hear tell of engineers liking it. We represent numbers then not as “x + i*y” but instead as “r * e^(i*θ)”, with r the magnitude and θ the angle. “e” is the base of the natural logarithm, which you get very comfortable with if you do much mathematics or physics. And “i” is just what we’ve been talking about here. This is a pretty natural way to write about complex numbers that represent stuff that oscillates, such as alternating current or the probability function in quantum mechanics. A lot of stuff oscillates, if you study it through the right lens. So numbers that look like this keep creeping in, and into unexpected places. It’s quite easy to multiply numbers in phasor form (just multiply the magnitude parts, and add the angle parts), although addition and subtraction become a pain.
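The “multiply the magnitudes, add the angles” rule is easy to test. This sketch uses `cmath.polar` and `cmath.rect` from Python’s standard library to move between the two forms; the helper `multiply_phasors` and the sample values are my own, chosen only for illustration.

```python
import cmath

def multiply_phasors(p, q):
    """Multiply two (magnitude, angle) pairs: magnitudes multiply, angles add."""
    (r1, theta1), (r2, theta2) = p, q
    return (r1 * r2, theta1 + theta2)

z = 1 + 1j   # magnitude sqrt(2), angle pi/4
w = 1j       # magnitude 1, angle pi/2

# Convert to phasor form, multiply there, and convert back.
r, theta = multiply_phasors(cmath.polar(z), cmath.polar(w))
via_phasors = cmath.rect(r, theta)

# The ordinary product, for comparison.
direct = z * w
```

Up to floating-point rounding, `via_phasors` and `direct` agree: both are -1 + i. Adding two phasors, by contrast, has no such shortcut; you would convert back to “x + i*y” form first.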

Mathematicians generally use the letter “z” to represent a complex-valued number whose identity is not known. As best I can tell, this is because we do think so much of a complex number as the sum “x + y*i”. So if we used familiar old “x” for an unknown number, it would carry the connotations of “the real component of our complex-valued number” and mislead the unwary mathematician. The connection is so common that a mathematician might carelessly switch between “z” and the real and imaginary components “x” and “y” without specifying that “z” is another way of writing “x + y*i”. A good copy editor or an alert student should catch this.

Complex numbers work very much like real numbers do. They add and multiply in natural-looking ways, and you can do subtraction and division just as well. You can take exponentials, and can define all the common arithmetic functions — sines and cosines, square roots and logarithms, integrals and differentials — on them just as well as you can with real numbers. And you can embed the real numbers within the complex numbers: if you have a real number x, you can match that perfectly with the complex number “x + 0*i”.

But that doesn’t mean complex numbers are exactly like the real numbers. For example, it’s possible to order the real numbers. You can say that the number “a” is less than the number “b”, and have that mean something. That’s not possible to do with complex numbers. You can’t say that “a + b*i” is less than, or greater than, “x + y*i” in a logically consistent way. You can say the magnitude of one complex-valued number is greater than the magnitude of another. But the magnitudes are real numbers. For all that complex numbers give us there are things they’re not good for.
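Python, for what it’s worth, takes the same position: it declines to say whether one complex number is less than another, but it will happily compare magnitudes. This is a sketch of that behavior with two numbers I picked arbitrarily. It shows a design choice in one language, not a proof of the mathematical fact.

```python
a = 3 + 4j
b = 1 - 7j

# Asking which complex number is smaller raises an error.
try:
    a < b
    comparable = True
except TypeError:
    comparable = False

# Magnitudes are real numbers, so they can be ordered.
magnitude_a = abs(a)   # 5.0
magnitude_b = abs(b)   # the square root of 50
a_is_smaller_in_magnitude = magnitude_a < magnitude_b
```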

A Venn Diagram of the Real Number System

I’m aware that it isn’t properly exactly a Venn diagram, now, but the mathematics-artist Robert Austin has a nice picture of the real numbers, and the most popular subsets of the real numbers, and how they relate. The bubbles aren’t to scale — there’s just as many counting numbers (1, 2, 3, 4, et cetera) as there are rational numbers, and there are far more irrational numbers than there are rational numbers — but if you don’t mind that, then, this is at least a nice little illustration.

Augustin-Louis Cauchy’s birthday

The Maths History feed on Twitter mentioned that the 21st of August was the birthday of Augustin-Louis Cauchy, who lived from 1789 to 1857. His is one of those names you get to know very well when you’re a mathematics major, since he published 789 papers in his life, and did very well at publishing important papers, ones that established concepts people would actually use.

He’s got an intriguing biography, as he lived (mostly) in France during the time of the Revolution, the Directorate, Napoleon, the Bourbon Restoration, the July Monarchy, the Revolutions of 1848, the Second Republic, and the Second Empire, and had a career which got inextricably tangled with the political upheavals of the era. I note that, according to the MacTutor biography linked to earlier this paragraph, he followed the deposed King Charles X to Prague in order to tutor his grandson, but might not have had the right temperament for it: at least once he got annoyed at the grandson’s confusion and screamed and yelled, with the Queen, Marie Thérèse, sometimes telling him, “too loud, not so loud”. But we’ve all had students that frustrate us.

Cauchy’s name appears on many theorems and principles and definitions of interesting things — I just checked Mathworld and his name returned 124 different items — though I’ll admit I’m stumped how to describe what the Cauchy-Frobenius Lemma is without scaring readers off. So let me talk about something simpler.

How Big Charlotte Was In 1975

[ I cannot and do not try to explain it, but yesterday was a busier-than-average day around these parts, with a surprising number of references coming from an Entertainment Weekly article about the House series finale for some reason. In this context a “surprising” number is “any number other than zero” since I don’t know why anyone would go from there to here. I watched House, sometimes, sure, and liked it, but kind of drifted away when there was other stuff to do, you know? ]

That’s enough time spent establishing the heck out of the idea of a polynomial. Let’s actually put one in place. My goal back when was estimating what the population of Charlotte, North Carolina, was around 1975. I had some old Census data from 1970 and 1980 giving its population on the first of April, the earlier year, as 840,347; and the first of April, 1980, as 971,391.