From my Second A-To-Z: Transcendental Number

The second time I did one of these A-to-Z’s, I hit on the idea of asking people for suggestions. It was a good move as it opened up subjects I had not come close to considering. I didn’t think to include the instructions for making your own transcendental number, though. You never get craft projects in mathematics, not after you get past the stage of making construction-paper rhombuses or something. I am glad to see my schtick of including a warning about using this stuff at your thesis defense was established by then.

I’m down to the last seven letters in the Leap Day 2016 A To Z. It’s also the next-to-the-last of Gaurish’s requests. This was a fun one.

Transcendental Number.

Take a huge bag and stuff all the real numbers into it. Give the bag a good solid shaking. Stir up all the numbers until they’re thoroughly mixed. Reach in and grab just the one. There you go: you’ve got a transcendental number. Enjoy!

OK, I detect some grumbling out there. The first is that you tried doing this in your head because you somehow don’t have a bag large enough to hold all the real numbers. And you imagined pulling out some number like “2” or “37” or maybe “one-half”. And you may not be exactly sure what a transcendental number is. But you’re confident the strangest number you extracted, “minus 8”, isn’t it. And you’re right. None of those are transcendental numbers.

I regret saying this, but that’s your own fault. You’re lousy at picking random numbers from your head. So am I. We all are. Don’t believe me? Think of a positive whole number. I predict you probably picked something between 1 and 10. Almost surely something between 1 and 100. Surely something less than 10,000. You didn’t even consider picking something between 10,012,002,214,473,325,937,775 and 10,012,002,214,473,325,937,785. Challenged to pick a number, people will select nice and familiar ones. The nice familiar numbers happen not to be transcendental.

I detect some secondary grumbling there. Somebody picked π. And someone else picked e. Very good. Those are transcendental numbers. They’re also nice familiar numbers, at least to people who like mathematics a lot. So they attract attention.

Still haven’t said what they are. What they are traces back, of course, to polynomials. Take a polynomial that’s got one variable, which we call ‘x’ because we don’t want to be difficult. Suppose that all the coefficients of the polynomial, the constant numbers we presumably know or could find out, are integers. What are the roots of the polynomial? That is, for what values of x is the polynomial a complicated way of writing ‘zero’?

For example, try the polynomial x2 – 6x + 5. If x = 1, then that polynomial is equal to zero. If x = 5, the polynomial’s equal to zero. Or how about the polynomial x2 + 4x + 4? That’s equal to zero if x is equal to -2. So a polynomial with integer coefficients can certainly have positive and negative integers as roots.

How about the polynomial 2x – 3? Yes, that is so a polynomial. This is almost easy. That’s equal to zero if x = 3/2. How about the polynomial (2x – 3)(4x + 5)(6x – 7)? It’s my polynomial and I want to write it so it’s easy to find the roots. That polynomial will be zero if x = 3/2, or if x = -5/4, or if x = 7/6. So a polynomial with integer coefficients can have positive and negative rational numbers as roots.

How about the polynomial x2 – 2? That’s equal to zero if x is the square root of 2, about 1.414. It’s also equal to zero if x is minus the square root of 2, about -1.414. And the square root of 2 is irrational. So we can certainly have irrational numbers as roots.

So if we can have whole numbers, and rational numbers, and irrational numbers as roots, how can there be anything else? Yes, complex numbers, I see you raising your hand there. We’re not talking about complex numbers just now. Only real numbers.

It isn’t hard to work out why we can get any whole number, positive or negative, from a polynomial with integer coefficients. Or why we can get any rational number. The irrationals, though … it turns out we can only get some of them this way. We can get square roots and cube roots and fourth roots and all that. We can get combinations of those. But we can’t get everything. There are irrational numbers that are there but that even polynomials can’t reach.

It’s all right to be surprised. It’s a surprising result. Maybe even unsettling. Transcendental numbers have something peculiar about them. The 19th Century French mathematician Joseph Liouville first proved the things must exist, in 1844. (He used continued fractions to show there must be such things.) It would be seven years later that he gave an example of one in nice, easy-to-understand decimals. This is the number 0.110 001 000 000 000 000 000 001 000 000 (et cetera). This number is zero almost everywhere. But there’s a 1 in the n-th digit past the decimal if n is the factorial of some number. That is, 1! is 1, so the 1st digit past the decimal is a 1. 2! is 2, so the 2nd digit past the decimal is a 1. 3! is 6, so the 6th digit past the decimal is a 1. 4! is 24, so the 24th digit past the decimal is a 1. The next 1 will appear in spot number 5!, which is 120. After that, 6! is 720 so we wait for the 720th digit to be 1 again.

And what is this Liouville number 0.110 001 000 000 000 000 000 001 000 000 (et cetera) used for, besides showing that a transcendental number exists? Not a thing. It’s of no other interest. And this plagued the transcendental numbers until 1873. The only examples anyone had of transcendental numbers were ones built to show that they existed. In 1873 Charles Hermite showed finally that e, the base of the natural logarithm, was transcendental. e is a much more interesting number; we have reasons to care about it. Every exponential growth or decay or oscillating process has e lurking in it somewhere. In 1882 Ferdinand von Lindemann showed that π was transcendental, and that’s an even more interesting number.

That bit about π has interesting implications. One goes back to the ancient Greeks. Is it possible, using straightedge and compass, to create a square that’s exactly the same size as a given circle? This is equivalent to saying, if I give you a line segment, can you create another line segment that’s exactly the square root of π times as long? This geometric problem is equivalent to an algebraic one. That problem: can you create a polynomial, with integer coefficients, that has the square root of π as a root? (WARNING: I’m skipping some important points for the sake of clarity. DO NOT attempt to use this to pass your thesis defense without putting those points back in.) We want the square root of π because … well, what’s the area of a square whose sides are the square root of π long? That’s right. So we start with a line segment that’s equal to the radius of the circle and we can do that, surely. Once we have the radius, can’t we make a line that’s the square root of π times the radius, and from that make a square with area exactly π times the radius squared? Since π is transcendental, then, no. We can’t. Sorry. One of the great problems of ancient mathematics, and one that still has the power to attract the casual mathematician, got its final answer in 1882.

Georg Cantor is a name even non-mathematicians might recognize. He showed there have to be some infinite sets bigger than others, and that there must be more real numbers than there are rational numbers. Four years after showing that, he proved there are as many transcendental numbers as there are real numbers.

They’re everywhere. They permeate the real numbers so much that we can understand the real numbers as the transcendental numbers plus some dust. They’re almost the dark matter of mathematics. We don’t actually know all that many of them. Wolfram MathWorld has a table listing numbers proven to be transcendental, and the fact we can list that on a single web page is remarkable. Some of them are large sets of numbers, yes, like $e^{\pi \sqrt{d}}$ for every positive whole number d. And we can infer many more from them; if π is transcendental then so is 2π, and so is 5π, and so is -20.38π, and so on. But the table of numbers proven to be irrational is still just 25 rows long.

There are even mysteries about obvious numbers. π is transcendental. So is e. We know that at least one of π times e and π plus e is transcendental. Perhaps both are. We don’t know which one is, or if both are. We don’t know whether ππ is transcendental. We don’t know whether ee is, either. Don’t even ask if πe is.

How, by the way, does this fit with my claim that everything in mathematics is polynomials? — Well, we found these numbers in the first place by looking at polynomials. The set is defined, even to this day, by how a particular kind of polynomial can’t reach them. Thinking about a particular kind of polynomial makes visible this interesting set.

My Little 2021 Mathematics A-to-Z: Analysis

I’m fortunate this week to have another topic suggested again by Mr Wu, blogger and Singaporean mathematics tutor. It’s a big field, so forgive me not explaining the entire subject.

Analysis.

Analysis is about proving why the rest of mathematics works. It’s a hard field. My experience, a typical one, included crashing against real analysis as an undergraduate and again as a graduate student. It turns out mathematics works by throwing a lot of $\epsilon$ symbols around.

Let me give an example. If you read pop mathematics blogs you know about the number represented by $0.999999\cdots$. You’ve seen proofs, some of them even convincing, that this number equals 1. Not a tiny bit less than 1, but exactly 1. Here’s a real-analysis treatment. And — I may regret this — I recommend you don’t read it. Not closely, at least. Instead, look at its shape. Look at the words and symbols as graphic design elements, and trust that what I say is not nonsense. Resume reading after the horizontal rule.

It’s convenient to have a name for the number $0.999999\cdots$. I’ll call that $r$, for “repeating”. 1 we’ll call 1. I think you’ll grant that whatever r is, it can’t be more than 1. I hope you’ll accept that if the difference between 1 and r is zero, then r equals 1. So what is the difference between 1 and r?

Give me some number $\epsilon$. It has to be a positive number. The implication in the letter $\epsilon$ is that it’s a small number. This isn’t actually required in general. We expect it. We feel surprise and offense if it’s ever not the case.

I can show that the difference between 1 and r is less than $\epsilon$. I know there is some smallest counting number N so that $\epsilon > \frac{1}{10^{N}}$. For example, say $\epsilon$ is 0.125. Then we can let N = 1, and $0.125 > \frac{1}{10^{1}}$. Or suppose $\epsilon$ is 0.00625. But then if N = 3, $0.00625 > \frac{1}{10^{3}}$. (If $\epsilon$ is bigger than 1, let N = 1.) Now we have to ask why I want this N.

Whatever the value of r is, I know that it is more than 0.9. And that it is more than 0.99. And that it is more than 0.999. In fact, it’s more than the number you get by truncating r after any whole number N of digits. Let me call $r_N$ the number you get by truncating r after N digits. So, $r_1 = 0.9$ and $r_2 = 0.99$ and $r_5 = 0.99999$ and so on.

Since $r > r_N$, it has to be true that $1 - r < 1 - r_N$. And since we know what $r_N$ is, we can say exactly what $1 - r_N$ is. It's $\frac{1}{10^{N}}$. And we picked N so that $\frac{1}{10^{N}} < \epsilon$. So $1 - r < 1 - r_N = \frac{1}{10^{N}} < \epsilon$. But all we know of $\epsilon$ is that it's a positive number. It can be any positive number. So $1 - r$ has to be smaller than each and every positive number. The biggest number that’s smaller than every positive number is zero. So the difference between 1 and r must be zero and so they must be equal.

That is a compelling argument. Granted, it compels much the way your older brother kneeling on your chest and pressing your head into the ground compels. But this argument gives the flavor of what much of analysis is like.

For one, it is fussy, leaning to technical. You see why the subject has the reputation of driving off all but the most intent mathematics majors. If you get comfortable with this sort of argument it’s hard to notice anymore.

For another, the argument shows that the difference between two things is less than every positive number. Therefore the difference is zero and so the things are equal. This is one of mathematics’ most important tricks. And another point, there’s a lot of talk about $\epsilon$. And about finding differences that are, it usually turns out, smaller than some $\epsilon$. (As an undergraduate I found something wasteful in how the differences were so often so much less than $\epsilon$. We can’t exhaust the small numbers, though. It still feels uneconomic.)

Something this misses is another trick, though. That’s adding zero. I couldn’t think of a good way to use that here. What we often get is the need to show that, say, function $f$ and function $g$ are equal. That is, that they are less than $\epsilon$ apart. What we can often do is show that $f$ is close to some related function, which let me call $f_n$.

I know what you’re suspecting: $f_n$ must be a polynomial. Good thought! Although in my experience, it’s actually more likely to be a piecewise constant function. That is, it’s some number, eg, “2”, for part of the domain, and then “2.5” in some other region, with no transition between them. Some other values, even values not starting with “2”, in other parts of the domain. Usually this is easier to prove stuff about than even polynomials are.

But get back to $g_n$. It’s got the same deal as $f_n$, some approximation easier to prove stuff about. Then we want to show that $g$ is close to some $g_n$. And then show that $f_n$ is close to $g_n$. So — watch this trick. Or, again, watch the shape of this trick. Read again after the horizontal rule.

The difference $| f - g |$ is equal to $| f - f_n + f_n - g |$ since adding zero, that is, adding the number $( -f_n + f_n )$, can’t change a quantity. And $| f - f_n + f_n - g |$ is equal to $| f - f_n + f_n -g_n + g_n - g |$. Same reason: $( -g_n + g_n )$ is zero. So:

$| f - g | = |f - f_n + f_n -g_n + g_n - g |$

Now we use the “triangle inequality”. If a, b, and c are the lengths of a triangle’s sides, the sum of any two of those numbers is larger than the third. And that tells us:

$|f - f_n + f_n -g_n + g_n - g | \le |f - f_n| + |f_n - g_n| + | g_n - g |$

And then if you can show that $| f - f_n |$ is less than $\frac{1}{3}\epsilon$? And that $| f_n - g_n |$ is also $\frac{1}{3}\epsilon$? And you see where this is going for $| g_n - g |$? Then you’ve shown that $| f - g | \le \epsilon$. With luck, each of these little pieces is something you can prove.

Don’t worry about what all this means. It’s meant to give a flavor of what you do in an analysis course. It looks hard, but most of that is because it’s a different sort of work than you’d done before. If you hadn’t seen the adding-zero and triangle-inequality tricks? I don’t know how long you’d need to imagine them.

There are other tricks too. An old reliable one is showing that one thing is bounded by the other. That is, that $f \le g$. You use this trick all the time because if you can also show that $g \le f$, then those two have to be equal.

The good thing — and there is good — is that once you get the hang of these tricks analysis starts to come together. And even get easier. The first course you take as a mathematics major is real analysis, all about functions of real numbers. The next course in this track is complex analysis, about functions of complex-valued numbers. And it is easy. Compared to what comes before, yes. But also on its own. Every theorem in complex analysis named after Augustin-Louis Cauchy. They all show that the integral of your function, calculated along a closed loop, is zero. I exaggerate by $\epsilon$.

In grad school, if you make it, you get to functional analysis, which examines functions on functions and other abstractions like that. This, too, is easy, possibly because all the basic approaches you’ve seen several courses over. Or it feels easy after all that mucking around with the real numbers.

This is not the entirety of explaining how mathematics works. Since all these proofs depend on how numbers work, we need to show how numbers work. How logic works. But those are subjects we can leave for grad school, for someone who’s survived this gauntlet.

I hope to return in a week with a fresh A-to-Z essay. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all this year’s essays, and all A-to-Z essays from past years, should be at this link. Thank you once more for reading.

My Little 2021 Mathematics A-to-Z: Embedding

Elkement, who’s one of my longest blog-friends here, put forth this suggestion for an ‘E’ topic. It’s a good one. They’re author of the Theory and Practice of Trying to Combine Just Anything blog. Their blog has recently been exploring complex-valued numbers and how to represent rotations.

Embedding.

Consider a book. It’s a collection. It’s easy to see the ordered setting of words, maybe pictures, possibly numbers or even equations. The important thing is the ideas those all represent.

Set the book in a library. How can this change the book?

Perhaps the comparison to other books shows us something the original book neglected. Perhaps something in the original book we now realize was a brilliantly-presented insight. The way we appreciate the book may change.

What can’t change is the content of the original book. The words stay the same, in the same order. If it’s a physical book, the number of pages stays the same, as does the size of the page. The ideas expressed remain the same.

So now you understand embedding. It’s a broad concept, something that can have meaning for any mathematical structure. A structure here is a bunch of items and some things you can do with them. A group, for example, is a good structure to use with this sort of thing. So, for example, the integers and regular addition. This original structure’s embedded in another when everything in the original structure is in the new, and everything you can do with the original structure you can do in the new and get the same results. So, for example, the group you get by taking the integers and regular addition? That’s embedded in the group you get by taking the rational numbers and regular addition. 4 + 8 is 12 whether or not you consider 6.5 a topic fit for discussion. It’s an embedding that expands the set of elements, and that modifies the things you can do to match.

The group you get from the integers and addition is embedded in other things. For example, it’s embedded in the ring you get from the integers and regular addition and regular multiplication. 4 + 8 remains 12 whether or not you can multiply 4 by 8. This embedding doesn’t add any new elements, just new things you can do with them.

Once you have the name, you see embedding everywhere. When we first learn arithmetic we — I, anyway — learn it as adding whole numbers together. Then we embed that into whole numbers with addition and multiplication. And then the (nonnegative) rational numbers with addition and multiplication. At some point (I forget when) the negative numbers came in. So did the whole set of real numbers. Eventually the real numbers got embedded into the complex numbers. And the complex numbers got embedded into the quaternions, although we found real and complex numbers enough for most of our work. I imagine something similar goes on these days.

There’s never only one embedding possible. Consider, for example, two-dimensional geometry, the shapes of figures on a sheet of paper. It’s easy to put that in three dimensions, by setting the paper on the floor, and expand it by drawing in chalk on the wall. Or you can set the paper on the wall, and extend its figures by drawing in chalk on the floor. Or set the paper at an angle to the floor. What you use depends on what’s most convenient. And that can be driven by laziness. It’s easy to match, say, the point in two dimensions at coordinates (3, 4) with the point in three dimensions at coordinates (3, 4, 0), even though (0, 3, 4) or (4, 0, 3) are as valid.

Why embed something in another thing? For the same reasons we do any transformation in mathematics. One is that we figure to embed the thing we’re working on into something easier to deal with. A famous example of this is the Nash embedding theorem. It describes when certain manifolds can be embedded into something that looks like normal space. And that’s useful because it can turn nonlinear partial differential equations — the most insufferable equations — into something solvable.

Another good reason, though, is the one implicit in that early arithmetic education. We started with whole-numbers-with-addition. And then we added the new operation of multiplication. And then new elements, like fractions and negative numbers. If we follow this trail we get to some abstract, tricky structures like octonions. But by small steps in which we have great experience guiding us into new territories.

I hope to return in a week with a fresh A-to-Z essay. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all A-to-Z essays from past years, should be at this link. Thank you once more for reading.

My 2019 Mathematics A To Z: Infimum

Today’s A To Z term is a free pick. I didn’t notice any suggestions for a mathematics term starting with this letter. I apologize if you did submit one and I missed it. I don’t mean any insult.

What I’ve picked is a concept from analysis. I’ve described this casually as the study of why calculus works. That’s a good part of what it is. Analysis is also about why real numbers work. Later on you also get to why complex numbers and why functions work. But it’s in the courses about Real Analysis where a mathematics major can expect to find the infimum, and it’ll stick around on the analysis courses after that.

Infimum.

The infimum is the thing you mean when you say “lower bound”. It applies to a set of things that you can put in order. The order has to work the way less-than-or-equal-to works with whole numbers. You don’t have to have numbers to put a number-like order on things. Otherwise whoever made up the Alphabet Song was fibbing to us all. But starting out with numbers can let you get confident with the idea, and we’ll trust you can go from numbers to other stuff, in case you ever need to.

A lower bound would start out meaning what you’d imagine if you spoke English. Let me call it L. It’ll make my sentences so much easier to write. Suppose that L is less than or equal to all the elements in your set. Then, great! L is a lower bound of your set.

You see the loophole here. It’s in the article “a”. If L is a lower bound, then what about L – 1? L – 10? L – 1,000,000,000½? Yeah, they’re all lower bounds, too. There’s no end of lower bounds. And that is not what you mean be a lower bound, in everyday language. You mean “the smallest thing you have to deal with”.

But you can’t just say “well, the lower bound of a set is the smallest thing in the set”. There’s sets that don’t have a smallest thing. The iconic example is positive numbers. No positive number can be a lower bound of this. All the negative numbers are lowest bounds of this. Zero can be a lower bound of this.

For the postive numbers, it’s obvious: zero is the lower bound we want. It’s smaller than all of the positive numbers. And there’s no greater number that’s also smaller than all the positive numbers. So this is the infimum of the positive numbers. It’s the greatest lower bound of the set.

The infimum of a set may or may not be part of the original set. But. Between the infimum of a set and the infimum plus any positive number, however tiny that is? There’s always at least one thing in the set.

And there isn’t always an infimum. This is obvious if your set is, like, the set of all the integers. If there’s no lower bound at all, there can’t be a greatest lower bound. So that’s obvious enough.

Infimums turn up in a good number of proofs. There are a couple reasons they do. One is that we want to prove a boundary between two kinds of things exist. It’s lurking in the proof, for example, of the intermediate value theorem. This is the proposition that if you have a continuous function on the domain [a, b], and range of real numbers, and pick some number g that’s between f(a) and f(b)? There’ll be at least one point c, between a and b, where f(c) equals g. You can structure this: look at the set of numbers x in the domain [a, b] whose f(x) is larger than g. So what’s the infimum of this set? What does f have to be for that infimum?

It also turns up a lot in proofs about calculus. Proofs about functions, particularly, especially integrating functions. A proof like this will, generically, not deal with the original function, which might have all kinds of unpleasant aspects. Instead it’ll look at a sequence of approximations of the original function. Each approximation is chosen so it has no unpleasant aspect. And then prove that we could make arbitrarily tiny the difference between the result for the function we want and the result for the sequence of functions we make. Infimums turn up in this, since we’ll want a minimum function without being sure that the minimum is in the sequence we work with.

This is the terminology of stuff to work as lower bounds. There’s a similar terminology to work with upper bounds. The upper-bound equivalent of the infimum is the supremum. They’re abbreviated as inf and sup. The supremum turns up most every time an infimum does, and for the reasons you’d expect.

If an infimum does exist, it’s unique; there can’t be two different ones. Same with the supremum.

And things can get weird. It’s possible to have lower bounds but no infimum. This seems bizarre. This is because we’ve been relying on the real numbers to guide our intuition. And the real numbers have a useful property called being “complete”. So let me break the real numbers. Imagine the real numbers except for zero. Call that the set R’. Now look at the set of positive numbers inside R’. What’s the infimum of the positive numbers, within R’? All we can do is shrug and say there is none, even though there are plenty of lower bounds. The infimum of a set depends on the set. It also depends on what bigger set that the set is within. That something depends both on a set and what the bigger set of things is, is another thing that turns up all the time in analysis. It’s worth becoming familiar with.

Thanks for reading this. All of Fall 2019 A To Z posts should be at this link. Later this week I should have my ‘J’ post. All of my past A To Z essays should be available at this link and when I get a free afternoon I’ll make that “should be” into “are”. For tomorrow I hope to finish off last week’s comic strips. See you then.

A Leap Day 2016 Mathematics A To Z: Transcendental Number

I’m down to the last seven letters in the Leap Day 2016 A To Z. It’s also the next-to-the-last of Gaurish’s requests. This was a fun one.

Transcendental Number.

Take a huge bag and stuff all the real numbers into it. Give the bag a good solid shaking. Stir up all the numbers until they’re thoroughly mixed. Reach in and grab just the one. There you go: you’ve got a transcendental number. Enjoy!

OK, I detect some grumbling out there. The first is that you tried doing this in your head because you somehow don’t have a bag large enough to hold all the real numbers. And you imagined pulling out some number like “2” or “37” or maybe “one-half”. And you may not be exactly sure what a transcendental number is. But you’re confident the strangest number you extracted, “minus 8”, isn’t it. And you’re right. None of those are transcendental numbers.

I regret saying this, but that’s your own fault. You’re lousy at picking random numbers from your head. So am I. We all are. Don’t believe me? Think of a positive whole number. I predict you probably picked something between 1 and 10. Almost surely something between 1 and 100. Surely something less than 10,000. You didn’t even consider picking something between 10,012,002,214,473,325,937,775 and 10,012,002,214,473,325,937,785. Challenged to pick a number, people will select nice and familiar ones. The nice familiar numbers happen not to be transcendental.

I detect some secondary grumbling there. Somebody picked π. And someone else picked e. Very good. Those are transcendental numbers. They’re also nice familiar numbers, at least to people who like mathematics a lot. So they attract attention.

Still haven’t said what they are. What they are traces back, of course, to polynomials. Take a polynomial that’s got one variable, which we call ‘x’ because we don’t want to be difficult. Suppose that all the coefficients of the polynomial, the constant numbers we presumably know or could find out, are integers. What are the roots of the polynomial? That is, for what values of x is the polynomial a complicated way of writing ‘zero’?

For example, try the polynomial x2 – 6x + 5. If x = 1, then that polynomial is equal to zero. If x = 5, the polynomial’s equal to zero. Or how about the polynomial x2 + 4x + 4? That’s equal to zero if x is equal to -2. So a polynomial with integer coefficients can certainly have positive and negative integers as roots.

How about the polynomial 2x – 3? Yes, that is so a polynomial. This is almost easy. That’s equal to zero if x = 3/2. How about the polynomial (2x – 3)(4x + 5)(6x – 7)? It’s my polynomial and I want to write it so it’s easy to find the roots. That polynomial will be zero if x = 3/2, or if x = -5/4, or if x = 7/6. So a polynomial with integer coefficients can have positive and negative rational numbers as roots.

How about the polynomial x2 – 2? That’s equal to zero if x is the square root of 2, about 1.414. It’s also equal to zero if x is minus the square root of 2, about -1.414. And the square root of 2 is irrational. So we can certainly have irrational numbers as roots.

So if we can have whole numbers, and rational numbers, and irrational numbers as roots, how can there be anything else? Yes, complex numbers, I see you raising your hand there. We’re not talking about complex numbers just now. Only real numbers.

It isn’t hard to work out why we can get any whole number, positive or negative, from a polynomial with integer coefficients. Or why we can get any rational number. The irrationals, though … it turns out we can only get some of them this way. We can get square roots and cube roots and fourth roots and all that. We can get combinations of those. But we can’t get everything. There are irrational numbers that are there but that even polynomials can’t reach.

It’s all right to be surprised. It’s a surprising result. Maybe even unsettling. Transcendental numbers have something peculiar about them. The 19th Century French mathematician Joseph Liouville first proved the things must exist, in 1844. (He used continued fractions to show there must be such things.) It would be seven years later that he gave an example of one in nice, easy-to-understand decimals. This is the number 0.110 001 000 000 000 000 000 001 000 000 (et cetera). This number is zero almost everywhere. But there’s a 1 in the n-th digit past the decimal if n is the factorial of some number. That is, 1! is 1, so the 1st digit past the decimal is a 1. 2! is 2, so the 2nd digit past the decimal is a 1. 3! is 6, so the 6th digit past the decimal is a 1. 4! is 24, so the 24th digit past the decimal is a 1. The next 1 will appear in spot number 5!, which is 120. After that, 6! is 720 so we wait for the 720th digit to be 1 again.

And what is this Liouville number 0.110 001 000 000 000 000 000 001 000 000 (et cetera) used for, besides showing that a transcendental number exists? Not a thing. It’s of no other interest. And this plagued the transcendental numbers until 1873. The only examples anyone had of transcendental numbers were ones built to show that they existed. In 1873 Charles Hermite showed finally that e, the base of the natural logarithm, was transcendental. e is a much more interesting number; we have reasons to care about it. Every exponential growth or decay or oscillating process has e lurking in it somewhere. In 1882 Ferdinand von Lindemann showed that π was transcendental, and that’s an even more interesting number.

That bit about π has interesting implications. One goes back to the ancient Greeks. Is it possible, using straightedge and compass, to create a square that’s exactly the same size as a given circle? This is equivalent to saying, if I give you a line segment, can you create another line segment that’s exactly the square root of π times as long? This geometric problem is equivalent to an algebraic one. That problem: can you create a polynomial, with integer coefficients, that has the square root of π as a root? (WARNING: I’m skipping some important points for the sake of clarity. DO NOT attempt to use this to pass your thesis defense without putting those points back in.) We want the square root of π because … well, what’s the area of a square whose sides are the square root of π long? That’s right. So we start with a line segment that’s equal to the radius of the circle and we can do that, surely. Once we have the radius, can’t we make a line that’s the square root of π times the radius, and from that make a square with area exactly π times the radius squared? Since π is transcendental, then, no. We can’t. Sorry. One of the great problems of ancient mathematics, and one that still has the power to attract the casual mathematician, got its final answer in 1882.

Georg Cantor is a name even non-mathematicians might recognize. He showed there have to be some infinite sets bigger than others, and that there must be more real numbers than there are rational numbers. Four years after showing that, he proved there are as many transcendental numbers as there are real numbers.

They’re everywhere. They permeate the real numbers so much that we can understand the real numbers as the transcendental numbers plus some dust. They’re almost the dark matter of mathematics. We don’t actually know all that many of them. Wolfram MathWorld has a table listing numbers proven to be transcendental, and the fact we can list that on a single web page is remarkable. Some of them are large sets of numbers, yes, like $e^{\pi \sqrt{d}}$ for every positive whole number d. And we can infer many more from them; if π is transcendental then so is 2π, and so is 5π, and so is -20.38π, and so on. But the table of numbers proven to be irrational is still just 25 rows long.

There are even mysteries about obvious numbers. π is transcendental. So is e. We know that at least one of π times e and π plus e is transcendental. Perhaps both are. We don’t know which one is, or if both are. We don’t know whether ππ is transcendental. We don’t know whether ee is, either. Don’t even ask if πe is.

How, by the way, does this fit with my claim that everything in mathematics is polynomials? — Well, we found these numbers in the first place by looking at polynomials. The set is defined, even to this day, by how a particular kind of polynomial can’t reach them. Thinking about a particular kind of polynomial makes visible this interesting set.

The Set Tour, Part 12: What Can You Do With Functions?

I want to resume my tour of sets that turn up a lot as domains and ranges. But I need to spend some time explaining stuff before the next bunch. I want to talk about things that aren’t so familiar as “numbers” or “shapes”. We get into more abstract things.

We have to start out with functions. Functions are built of three points, a set that’s the domain, a set that’s the range, and a rule that matches things in the domain to things in the range. But what’s a set? Sets are bunches of things. (If we want to avoid logical chaos we have to be more exact. But we’re not going near the zones of logical chaos. So we’re all right going with “sets are bunches of things”. WARNING: do not try to pass this off at your thesis defense.)

So if a function is a thing, can’t we have a set that’s made up of functions? Sure, why not? We can get a set by describing the collection of things we want in it. At least if we aren’t doing anything weird. (See above warning.)

Let’s pick out a set of functions. Put together a group of functions that all have the same set as their domain, and that have compatible sets as their range. The real numbers are a good pick for a domain. They’re also good for a range.

Is this an interesting set? Generally, a set is boring unless we can do something with the stuff in it. That something is, almost always, taking a pair of the things in the set and relating it to something new. Whole numbers, for example, would be trivia if we weren’t able to add them together. Real numbers would be a complicated pile of digits if we couldn’t multiply them together. Having things is nice. Doing stuff with things is all that’s meaningful.

So what can we do with a couple of functions, if they have the same domains and ranges? Let’s pick one out. Give it the name ‘f’. That’s a common name for functions. It was given to us by Leonhard Euler, who was brilliant in every field of mathematics, including in creating notation. Now let’s pick out a function again. Give this new one the name ‘g’. That’s a common name for functions, given to us by every mathematician who needed something besides ‘f’. (There are alternatives. One is to start using subscripts, like f1 and f2. That’s too hard for me to type. Another is to use different typefaces. Again, too hard for me. Another is to use lower- and upper-case letters, ‘f’ and ‘F’. Using alternate-case forms usually connotes that these two functions are related in some way. I don’t want to suggest that they are related here. So, ‘g’ it is.)

We can do some obvious things. We can add them together. We can create a new function, imaginatively named f + g’. It’ll have the same domain and the same range as f and g did. What rule defines how it matches things in the domain to things in the range?

Mathematicians throw the term “obvious” around a lot. Also “intuitive”. What they mean is “what makes sense to me but I don’t want to write it down”. Saying that is fine if your mathematician friend knows roughly what you’d think makes sense. It can be catastrophic if she’s much smarter than you, or thinks in weird ways, and is always surprised other people don’t think like her. It’s hard to better describe it than “obvious”, though. Well, here goes.

Let me pick something that’s in the domain of both f and g. I’m going to call that x, which mathematicians have been doing ever since René Descartes gave us the idea. So “f(x)” is something in the range of f, and “g(x) is something in the range of g. I said, way up earlier, that both of these ranges are the same set and suggested the real numbers there. That is, f(x) is some real number and I don’t care which just now. g(x) is also some real number and again I don’t care right now just which.

The function we call “f + g” matches the thing x, in the domain, to something in the range. What thing? The number f(x) + g(x). I told you, I can’t see any fair way to describe that besides being “obvious” and “intuitive”.

Another thing we’ll want to do is multiply a function by a real number. Suppose we have a function f, just like above. Give me a real number. We’ll call that real number ‘a’ because I don’t remember if you can do the alpha symbol easily on web pages. Anyway, we can define a function, af’, the multiplication of the real number a by the function f. It has the same domain as f, and the same range as f. What’s its rule?

Let me say x is something in the domain of f. So f(x) is some real number. Then the new function af’ matches the x in the domain with a real number. That number is what you get by multiplying a’ by whatever `f(x)’ is. So there are major parts of your mathematician friend from college’s classes that you could have followed without trouble.

(Her class would have covered many more things, mind you, and covered these more cryptically.)

There’s more stuff we would like to do with functions. But for now, this is enough. This lets us turn a set of functions into a “vector space”. Vector spaces are kinds of things that work, at least a bit, like arithmetic. And mathematicians have studied these kinds of things. We have a lot of potent tools that work on vector spaces. So mathematicians develop a habit of finding vector spaces in what they study.

And I’m subject to that too. This is why I’ve spent such time talking about what we can do with functions rather than naming particular sets. I’ll pick up from that.

The Set Tour, Part 7: Matrices

I feel a bit odd about this week’s guest in the Set Tour. I’ve been mostly concentrating on sets that get used as the domains or ranges for functions a lot. The ones I want to talk about here don’t tend to serve the role of domain or range. But they are used a great deal in some interesting functions. So I loosen my rule about what to talk about.

Rm x n and Cm x n

Rm x n might explain itself by this point. If it doesn’t, then this may help: the “x” here is the multiplication symbol. “m” and “n” are positive whole numbers. They might be the same number; they might be different. So, are we done here?

Maybe not quite. I was fibbing a little when I said “x” was the multiplication symbol. R2 x 3 is not a longer way of saying R6, an ordered collection of six real-valued numbers. The x does represent a kind of product, though. What we mean by R2 x 3 is an ordered collection, two rows by three columns, of real-valued numbers. Say the “x” here aloud as “by” and you’re pronouncing it correctly.

What we get is called a “matrix”. If we put into it only real-valued numbers, it’s a “real matrix”, or a “matrix of reals”. Sometimes mathematical terminology isn’t so hard to follow. Just as with vectors, Rn, it matters just how the numbers are organized. R2 x 3 means something completely different from what R3 x 2 means. And swapping which positions the numbers in the matrix occupy changes what matrix we have, as you might expect.

You can add together matrices, exactly as you can add together vectors. The same rules even apply. You can only add together two matrices of the same size. They have to have the same number of rows and the same number of columns. You add them by adding together the numbers in the corresponding slots. It’s exactly what you would do if you went in without preconceptions.

You can also multiply a matrix by a single number. We called this scalar multiplication back when we were working with vectors. With matrices, we call this scalar multiplication. If it strikes you that we could see vectors as a kind of matrix, yes, we can. Sometimes that’s wise. We can see a vector as a matrix in the set R1 x n or as one in the set Rn x 1, depending on just what we mean to do.

It’s trickier to multiply two matrices together. As with vectors multiplying the numbers in corresponding positions together doesn’t give us anything. What we do instead is a time-consuming but not actually hard process. But according to its rules, something in Rm x n we can multiply by something in Rn x k. “k” is another whole number. The second thing has to have exactly as many rows as the first thing has columns. What we get is a matrix in Rm x k.

I grant you maybe didn’t see that coming. Also a potential complication: if you can multiply something in Rm x n by something in Rn x k, can you multiply the thing in Rn x k by the thing in Rm x n? … No, not unless k and m are the same number. Even if they are, you can’t count on getting the same product. Matrices are weird things this way. They’re also gateways to weirder things. But it is a productive weirdness, and I’ll explain why in a few paragraphs.

A matrix is a way of organizing terms. Those terms can be anything. Real matrices are surely the most common kind of matrix, at least in mathematical usage. Next in common use would be complex-valued matrices, much like how we get complex-valued vectors. These are written Cm x n. A complex-valued matrix is different from a real-valued matrix. The terms inside the matrix can be complex-valued numbers, instead of real-valued numbers. Again, sometimes, these mathematical terms aren’t so tricky.

I’ve heard occasionally of people organizing matrices of other sets. The notation is similar. If you’re building a matrix of “m” rows and “n” columns out of the things you find inside a set we’ll call H, then you write that as Hm x n. I’m not saying you should do this, just that if you need to, that’s how to tell people what you’re doing.

Now. We don’t really have a lot of functions that use matrices as domains, and I can think of fewer that use matrices as ranges. There are a couple of valuable ones, ones so valuable they get special names like “eigenvalue” and “eigenvector”. (Don’t worry about what those are.) They take in Rm x n or Cm x n and return a set of real- or complex-valued numbers, or real- or complex-valued vectors. Not even those, actually. Eigenvectors and eigenfunctions are only meaningful if there are exactly as many rows as columns. That is, for Rm x m and Cm x m. These are known as “square” matrices, just as you might guess if you were shaken awake and ordered to say what you guessed a “square matrix” might be.

They’re important functions. There are some other important functions, with names like “rank” and “condition number” and the like. But they’re not many. I believe they’re not even thought of as functions, any more than we think of “the length of a vector” as primarily a function. They’re just properties of these matrices, that’s all.

So why are they worth knowing? Besides the joy that comes of knowing something, I mean?

Here’s one answer, and the one that I find most compelling. There is cultural bias in this: I come from an applications-heavy mathematical heritage. We like differential equations, which study how stuff changes in time and in space. It’s very easy to go from differential equations to ordered sets of equations. The first equation may describe how the position of particle 1 changes in time. It might describe how the velocity of the fluid moving past point 1 changes in time. It might describe how the temperature measured by sensor 1 changes as it moves. It doesn’t matter. We get a set of these equations together and we have a majestic set of differential equations.

Now, the dirty little secret of differential equations: we can’t solve them. Most interesting physical phenomena are nonlinear. Linear stuff is easy. Small change 1 has effect A; small change 2 has effect B. If we make small change 1 and small change 2 together, this has effect A plus B. Nonlinear stuff, though … it just doesn’t work. Small change 1 has effect A; small change 2 has effect B. Small change 1 and small change 2 together has effect … A plus B plus some weird A times B thing plus some effect C that nobody saw coming and then C does something with A and B and now maybe we’d best hide.

There are some nonlinear differential equations we can solve. Those are the result of heroic work and brilliant insights. Compared to all the things we would like to solve there’s not many of them. Methods to solve nonlinear differential equations are as precious as ways to slay krakens.

But here’s what we can do. What we usually like to know about in systems are equilibriums. Those are the conditions in which the system stops changing. Those are interesting. We can usually find those points by boring but not conceptually challenging calculations. If we can’t, we can declare x0 represents the equilibrium. If we still care, we leave calculating its actual values to the interested reader or hungry grad student.

But what’s really interesting is: what happens if we’re near but not exactly at the equilibrium? Sometimes, we stay near it. Think of pushing a swing. However good a push you give, it’s going to settle back to the boring old equilibrium of dangling straight down. Sometimes, we go racing away from it. Think of trying to balance a pencil on its tip; if we did this perfectly it would stay balanced. It never does. We’re never perfect, or there’s some wind or somebody walks by and the perfect balance is foiled. It falls down and doesn’t bounce back up. Sometimes, whether it it stays near or goes away depends on what way it’s away from the equilibrium.

And now we finally get back to matrices. Suppose we are starting out near an equilibrium. We can, usually, approximate the differential equations that describe what will happen. The approximation may only be good if we’re just a tiny bit away from the equilibrium, but that might be all we really want to know. That approximation will be some linear differential equations. (If they’re not, then we’re just wasting our time.) And that system of linear differential equations we can describe using matrices.

If we can write what we are interested in as a set of linear differential equations, then we have won. We can use the many powerful tools of matrix arithmetic — linear algebra, specifically — to tell us everything we want to know about the system. We can say whether a small push away from the equilibrium stays small, or whether it grows, or whether it depends. We can say how fast the small push shrinks, or grows (for a while). We can say how the system will change, approximately.

This is what I love in matrices. It’s not everything there is to them. But it’s enough to make matrices important to me.

The Set Tour, Part 6: One Big One Plus Some Rubble

I have a couple of sets for this installment of the Set Tour. It’s still an unusual installment because only one of the sets is that important for my purposes here. The rest I mention because they appear a lot, even if they aren’t much used in these contexts.

I, or J, or maybe Z

The important set here is the integers. You know the integers: they’re the numbers everyone knows. They’re the numbers we count with. They’re 1 and 2 and 3 and a hundred million billion. As we get older we come to accept 0 as an integer, and even the negative integers like “negative 12” and “minus 40” and all that. The integers might be the easiest mathematical construct to know. The positive integers, anyway. The negative ones are still a little suspicious.

The set of integers has several shorthand names. I is a popular and common one. As with the real-valued numbers R and the complex-valued numbers C it gets written by hand, and typically typeset, with a double vertical stroke. And we’ll put horizontal serifs on the top and bottom of the symbol. That’s a concession to readability. You see the same effect in comic strip lettering. A capital “I” in the middle of a word will often be written without serifs, while the word by itself needs the extra visual bulk.

The next popular symbol is J, again with a double vertical stroke. This gets used if we want to reserve “I”, or the word “I”, for some other purpose. J probably gets used because it’s so very close to I, and it’s only quite recently (in historic terms) that they’ve even been seen as different letters.

The symbol that seems to come out of nowhere is Z. It comes less from nowhere than it does from German. The symbol derives from “Zahl”, meaning “number”. It seems to have got into mathematics by way of Nicolas Bourbaki, the renowned imaginary French mathematician. The Z gets written with a double diagonal stroke.

Personally, I like Z most of this set, but on trivial grounds. It’s a more fun letter to write, especially since I write it with the middle horizontal stroke that. I’ve got no good cultural or historical reason for this. I just picked it up as a kid and never set it back down.

In these Set Tour essays I’m trying to write about sets that get used often as domains and ranges for functions. The integers get used a fair bit, although not nearly as often as real numbers do. The integers are a natural way to organize sequences of numbers. If the record of a week’s temperatures (in Fahrenheit) are “58, 45, 49, 54, 58, 60, 64”, there’s an almost compelling temperature function here. f(1) = 58, f(2) = 45, f(3) = 49, f(4) = 54, f(5) = 58, f(6) = 60, f(7) = 64. This is a function that has as its domain the integers. It happens that the range here is also integers, although you might be able to imagine a day when the temperature reading was 54.5.

Sequences turn up a lot. We are almost required to measure things we are interested in in discrete samples. So mathematical work with sequences uses integers as the domain almost by default. The use of integers as a domain gets done so often that it often becomes invisible, though. Someone studying my temperature data above might write the data as f1, f2, f3, and so on. One might reasonably never even notice there’s a function there, or a domain.

And that’s fine. A tool can be so useful it disappears. Attend a play; the stage is in light and the audience in darkness. The roles the light and darkness play disappear unless the director chooses to draw attention to this choice.

And to be honest, integers are a lousy domain for functions. It’s achingly hard to prove things for functions defined just on the integers. The easiest way to do anything useful is typically to find an equivalent problem for a related function that’s got the real numbers as a domain. Then show the answer for that gives you your best-possible answer for the original question.

If all we want are the positive integers, we put a little superscript + to our symbol: I+ or J+ or Z+. That’s a popular choice if we’re using the integers as an index. If we just want the negative numbers that’s a little weird, but, change the plus sign to a minus: I.

Now for some trouble.

Sometimes we want the positive numbers and zero, or in the lingo, the “nonnegative numbers”. Good luck with that. Mathematicians haven’t quite settled on what this should be called, or abbreviated. The “Natural numbers” is a common name for the numbers 0, 1, 2, 3, 4, and so on, and this makes perfect sense and gets abbreviated N. You can double-brace the left vertical stroke, or the diagonal stroke, as you like and that will be understood by everybody.

That is, everybody except the people who figure “natural numbers” should be 1, 2, 3, 4, and so on, and that zero has no place in this set. After all, every human culture counts with 1 and 2 and 3, and for that matter crows and raccoons understand the concept of “four”. Yet it took thousands of years for anyone to think of “zero”, so how natural could that be?

So we might resort to speaking of the “whole numbers” instead. More good luck with that. Besides leaving open the question of whether zero should be considered “whole” there’s the linguistic problem. “Whole” number carries, for many, the implication of a number that is an integer with no fractional part. We already have the word “integer” for that, yes. But the fact people will talk about rounding off to a whole number suggests the phrase “whole number” serves some role that the word “integer” doesn’t. Still, W is sitting around not doing anything useful.

Then there’s “counting numbers”. I would be willing to endorse this as a term for the integers 0, 1, 2, 3, 4, and so on, except. Have you ever met anybody who starts counting from zero? Yes, programmers for some — not all! — computer languages. You know which computer languages. They’re the languages which baffle new students because why on earth would we start counting things from zero all of a sudden? And the obvious single-letter abbreviation C is no good because we need that for complex numbers, a set that people actually use for domains a lot.

There is a good side to this, if you aren’t willing to sit out the 150 years or so mathematicians are going to need to sort this all out. You can set out a symbol that makes sense to you, early on in your writing, and stick with it. If you find you don’t like it, you can switch to something else in your next paper and nobody will protest. If you figure out a good one, people may imitate you. If you figure out a really good one, people will change it just a tiny bit so that their usage drives you crazy. Life is like that.

Eric Weisstein’s Mathworld recommends using Z* for the nonnegative integers. I don’t happen to care for that. I usually associate superscript * symbols with some operations involving complex-valued numbers and with the duals of sets, neither of which is in play here. But it’s not like he’s wrong and I’m right. If I were forced to pick a symbol right now I’d probably give Z0+. And for the nonpositive itself — the negative integers and zero — Z0- presents itself. I fully understand there are people who would be driven stark raving mad by this. Maybe you have a better one. I’d believe that.

Let me close with something non-controversial.

These are some sets that are too important to go unmentioned. But they don’t get used much in the domain-and-range role I’ve been using as basis for these essays. They are, in the terrain of these essays, some rubble.

You know the rational numbers? They’re the things you can write as fractions: 1/2, 5/13, 32/7, -6/7, 0 (think about it). This is a quite useful set, although it doesn’t get used much for the domain or range of functions, at least not in the fields of mathematics I see. It gets abbreviated as Q, though. There’s an extra vertical stroke on the left side of the loop, just as a vertical stroke gets added to the C for complex-valued numbers. Why Q? Well, “R” is already spoken for, as we need it for the real numbers. The key here is that every rational number can be written as the quotient of one integer divided by another. So, this is the set of Quotients. This abbreviation we get thanks to Bourbaki, the same folks who gave us Z for integers. If it strikes you that the imaginary French mathematician Bourbaki used a lot of German words, all I can say is I think that might have been part of the fun of the Bourbaki project. (Well, and German mathematicians gave us many breakthroughs in the understanding of sets in the late 19th and early 20th centuries. We speak with their language because they spoke so well.)

If you’re comfortable with real numbers and with rational numbers, you know of irrational numbers. These are (most) square roots, and pi and e, and the golden ratio and a lot of cosines of angles. Strangely, there really isn’t any common shorthand name or common notation for the irrational numbers. If we need to talk about them, we have the shorthand “R \ Q”. This means “the real numbers except for the rational numbers”. Or we have the shorthand “Qc”. This means “everything except the rational numbers”. That “everything” carries the implication “everything in the real numbers”. The “c” in the superscript stands for “complement”, everything outside the set we’re talking about. These are ungainly, yes. And it’s a bit odd considering that most real numbers are irrational numbers. The rational numbers are a most ineffable cloud of dust the atmosphere of the real numbers.

But, mostly, we don’t need to talk about functions that have an irrational-number domain. We can do our work with a real-number domain instead. So we leave that set with a clumsy symbol. If there’s ever a gold rush of fruitful mathematics to be done with functions on irrational domains then we’ll put in some better notation. Until then, there are better jobs for our letters to do.

The Set Tour, Part 5: C^n

The next piece in this set tour is a hybrid. It mixes properties of the last two sets. And I’ll own up now that while it’s a set that gets used a lot, it’s one that gets used a lot in just some corners of mathematics. It’s got a bit of that “Internet fame”. In particular circles it’s well-known; venture outside those circles even a little, and it’s not. But it leads us into other, useful places.

Cn

C here is the set of complex-valued numbers. We may have feared them once, but now they’re friends, or at least something we can work peacefully with. n here is some counting number, just as it is with Rn. n could be one or two or forty or a hundred billion. It’ll be whatever fits the problem we’re doing, if we need to pin down its value at all.

The reference to Rn, another friend, probably tipped you off to the rest. The items in Cn are n-tuples, ordered sets of some number n of numbers. Each of those numbers is itself a complex-valued number, something from C. Cn gets typeset in bold, and often with that extra vertical stroke on the left side of the C arc. It’s handwritten that way, too.

As with Rn we can add together things in Cn. Suppose that we are in C2 so that I don’t have to type too much. Suppose the first number is (2 + i, -3 – 3*i) and the second number is (6 – 2*i, 2 + 9*i). There could be fractions or irrational numbers in the real and imaginary components, but I don’t want to type that much. The work is the same. Anyway, the sum will be another number in Cn. The first term in that sum will be the sum of the first term in the first number, 2 + i, and the first term in the second number, 6 – 2*i. That in turn will be the sum of the real and of the imaginary components, so, 2 + 6 + i – 2*i, or 8 – i all told. The second term of the sum will be the second term of the first number, -3 – 3*i, and the second term of the second number, 2 + 9*i, which will be -3 – 3*i + 2 + 9*i or, all told, -1 + 6*i. The sum is the n-tuple (8 – i, -1 + 6*i).

And also as with Rn there really isn’t multiplying of one term of Cn by another. Generally, we can’t do this in any useful way. We can multiply something in Cn by a scalar, a single real — or, why not, complex-valued — number, though.

So let’s start out with (8 – i, -1 + 6*i), a number in C2. And then pick a scalar, say, 2 + 2*i. It doesn’t have to be complex-valued, but, why not? The product of this scalar and this term will be another number in C2. Its first term will the scalar, 2 + 2*i, multiplied by the first term in it, 8 – i. That’s (2 + 2*i) * (8 – i), or 2*8 – 2*i + 16*i – 2*i*i, or 2*8 – 2*i + 16*i + 2, or 18 + 14*i. And then its second term will be the scalar 2 + 2*i multiplied by the second term, -1 + 6*i. That’s (2 + 2*i)*(-1 + 6*i), or 2*(-1) + 2*6*i -2*i + 2*6*i*i. And that’s -2 + 12*i – 2*i -12, or -14 + 10*i. So the product is (18 + 14*i, -14 + 10*i).

So as with Rn, Cn creates a “vector space”. These spaces are useful in complex analysis. They’re also useful in the study of affine geometry, a corner of geometry that I’m sad to admit falls outside what I studied. I have tried reading up on them on my own, and I run aground each time. I understand the basic principles but never quite grasp why they are interesting. That’s my own failing, of course, and I’d be glad for a pointer that explained in ways I understood why they’re so neat.

I do understand some of what’s neat about them: affine geometry tells us what we can know about shapes without using the concept of “distance”. When you discover that we can know anything about shapes without the idea of “distance” your imagination should be fired. Mine is, too. I just haven’t followed from that to feel comfortable with the terminology and symbols of the field.

You could, if you like, think of Cn as being a specially-delineated version of R2*n. This is just as you can see a complex number as an ordered pair of real numbers. But sometimes information is usefully thought of as a single, complex-valued number. And there is a value in introducing the idea of ordered sets of things that are not real numbers. We will see the concept again.

Also, the heck did I write an 800-word essay about the family of sets of complex-valued n-tuples and have Hemingway Editor judge it to be at the “Grade 3” reading level? I rarely get down to “Grade 6” when I do a Reading the Comics post explaining how Andertoons did a snarky-word-problem-answers panel. That’s got to be a temporary glitch.

C

The square root of negative one. Everybody knows it doesn’t exist; there’s no real number you can multiply by itself and get negative one out. But then sometime in algebra, deep in a section about polynomials, suddenly we come out and declare there is such a thing. It’s an “imaginary number” that we call “i”. It’s hard to blame students for feeling betrayed by this. To make it worse, we throw real and imaginary numbers together and call the result “complex numbers”. It’s as if we’re out to tease them for feeling confused.

It’s an important set of things, though. It turns up as the domain, or the range, of functions so often that one of the major fields of analysis is called, “Complex Analysis”. If the course listing allows for more words, it’s called “Analysis of Functions of a Complex Variable” or something like that. Despite the connotations of the word “complex”, though, the field is a delight. It’s considerably easier to understand than Real Analysis, the study of functions of mere real numbers. When there is a theorem that has a version in Real Analysis and a version in Complex Analysis, the Complex Analysis side is usually easier to prove and easier to understand. It’s uncanny.

The set of all complex numbers is denoted C, in parallel to the set of real numbers, R. To make it clear that we mean this set, and not some piddling little common set that might happen to share the name C, add a vertical stroke to the left of the letter. This is just as we add a vertical stroke to R to emphasize we mean the Real Numbers. We should approach the set with respect, removing our hats, thinking seriously about great things. It would look silly to add a second curve to C though, so we just add a straight vertical stroke on the left side of the letter C. This makes it look a bit like it’s an Old English typeface (the kind you call Gothic until you learn that means “sans serif”) pared down to its minimum.

Why do we teach people there’s no such thing as a square root of minus one, and then one day, teach them there is? Part of it is that whether there is a square root depends on your context. If you are interested only in the real numbers, there’s nothing that, squared, gives you minus one. This is exactly the way that it’s not possible to equally divide five objects between two people if you aren’t allowed to cut the objects in half. But if you are willing to allow half-objects to be things, then you can do what was previously forbidden. What you can do depends on what the rules you set out are.

And there’s surely some echo of the historical discovery of imaginary and complex numbers at work here. They were noticed when working out the roots of third- and fourth-degree polynomials. These can be done by way of formulas that nobody ever remembers because there are so many better things to remember. These formulas would sometimes require one to calculate a square root of a negative number, a thing that obviously didn’t exist. Except that if you pretended it did, you could get out correct answers, just as if these were ordinary numbers. You can see why this may be dubbed an “imaginary” number. The name hints at the suspicion with which it’s viewed. It’s much as “negative” numbers look like some trap to people who’re just getting comfortable with fractions.

It goes against the stereotype of mathematicians to suppose they’d accept working with something they don’t understand because the results are all right, afterwards. But, actually, mathematicians are willing to accept getting answers by any crazy method. If you have a plausible answer, you can test whether it’s right, and if all you really need this minute is the right answer, good.

But we do like having methods; they’re more useful than mere answers. And we can imagine this set called the complex numbers. They contain … well, all the possible roots, the solutions, of all polynomials. (The polynomials might have coefficients — the numbers in front of the variable — of integers, or rational numbers, or irrational numbers. If we already accept the idea of complex numbers, the coefficients can be complex numbers too.)

It’s exceedingly common to think of the complex numbers by starting off with a new number called “i”. This is a number about which we know nothing except that i times i equals minus one. Then we tend to think of complex numbers as “a real number plus i times another real number”. The first real number gets called “the real component”, and is usually denoted as either “a” or “x”. The second real number gets called “the imaginary component”, and is usually denoted as either “b” or “y”. Then the complex number is written “a + i*b” or “x + i*y”. Sometimes it’s written “a + b*i” or “x + y*i”; that’s a mere matter of house style. Don’t let it throw you.

Writing a complex number this way has advantages. Particularly, it makes it easy to see how one would add together (or subtract) complex numbers: “a + b*i + x + y*i” almost suggests that the sum should be “(a + x) + (b + y)*i”. What we know from ordinary arithmetic gives us guidance. And if we’re comfortable with binomials, then we know how to multiply complex numbers. Start with “(a + b*i) * (x + y*i)” and follow the distributive law. We get, first, “a*x + a*y*i + b*i*x + b*y*i*i”. But “i*i” equals minus one, so this is the same as “a*x + a*y*i + b*i*x – b*y”. Move the real components together, and move the imaginary components together, and we have “(a*x – b*y) + (a*y + b*x)*i”.

That’s the most common way of writing out complex numbers. It’s so common that Eric W Weisstein’s Mathworld encyclopedia even says that’s what complex numbers are. But it isn’t the only way to construct, or look at, complex numbers. A common alternate way to look at complex numbers is to match a complex number to a point on the plane, or if you prefer, a point in the set R2.

It’s surprisingly natural to think of the real component as how far to the right or left of an origin your complex number is, and to think of the imaginary component as how far above or below the origin it is. Much complex-number work makes sense if you think of complex numbers as points in space, or directions in space. The language of vectors trips us up only a little bit here. We speak of a complex number as corresponding to a point on the “complex plane”, just as we might speak of a real number as a point on the “(real) number line”.

But there are other descriptions yet. We can represent complex numbers as a pair of numbers with a scheme that looks like polar coordinates. Pick a point on the complex plane. We can say where that is by two points of information. The first is the amplitude, or magnitude: how far the point is from the origin. The second is the phase, or angle: draw the line segment connecting the origin and your point. What angle does that make with the positive horizontal axis?

This representation is called the “phasor” representation. It’s tolerably popular in physics and I hear tell of engineers liking it. We represent numbers then not as “x + i*y” but instead as “r * e”, with r the magnitude and θ the angle. “e” is the base of the natural logarithm, which you get very comfortable with if you do much mathematics or physics. And “i” is just what we’ve been talking about here. This is a pretty natural way to write about complex numbers that represent stuff that oscillates, such as alternating current or the probability function in quantum mechanics. A lot of stuff oscillates, if you study it through the right lens. So numbers that look like this keep creeping in, and into unexpected places. It’s quite easy to multiply numbers in phasor form — just multiply the magnitude parts, and add the angle parts — although addition and subtraction become a pain.

Mathematicians generally use the letter “z” to represent a complex-valued number whose identity is not known. As best I can tell, this is because we do think so much of a complex number as the sum “x + y*i”. So if we used familiar old “x” for an unknown number, it would carry the connotations of “the real component of our complex-valued number” and mislead the unwary mathematician. The connection is so common that a mathematician might carelessly switch between “z” and the real and imaginary components “x” and “y” without specifying that “z” is another way of writing “x + y*i”. A good copy editor or an alert student should catch this.

Complex numbers work very much like real numbers do. They add and multiply in natural-looking ways, and you can do subtraction and division just as well. You can take exponentials, and can define all the common arithmetic functions — sines and cosines, square roots and logarithms, integrals and differentials — on them just as well as you can with real numbers. And you can embed the real numbers within the complex numbers: if you have a real number x, you can match that perfectly with the complex number “x + 0*i”.

But that doesn’t mean complex numbers are exactly like the real numbers. For example, it’s possible to order the real numbers. You can say that the number “a” is less than the number “b”, and have that mean something. That’s not possible to do with complex numbers. You can’t say that “a + b*i” is less than, or greater than, “x + y*i” in a logically consistent way. You can say the magnitude of one complex-valued number is greater than the magnitude of another. But the magnitudes are real numbers. For all that complex numbers give us there are things they’re not good for.

A Venn Diagram of the Real Number System

I’m aware that it isn’t properly exactly a Venn diagram, now, but the mathematics-artist Robert Austin has a nice picture of the real numbers, and the most popular subsets of the real numbers, and how they relate. The bubbles aren’t to scale — there’s just as many counting numbers (1, 2, 3, 4, et cetera) as there are rational numbers, and there are far more irrational numbers than there are rational numbers — but if you don’t mind that, then, this is at least a nice little illustration.

Augustin-Louis Cauchy’s birthday

The Maths History feed on Twitter mentioned that the 21st of August was the birthday of Augustin-Louis Cauchy, who lived from 1789 to 1857. His is one of those names you get to know very well when you’re a mathematics major, since he published 789 papers in his life, and did very well at publishing important papers, ones that established concepts people would actually use.

He’s got an intriguing biography, as he lived (mostly) in France during the time of the Revolution, the Directorate, Napoleon, the Bourbon Restoration, the July Monarchy, the Revolutions of 1848, the Second Republic, and the Second Empire, and had a career which got inextricably tangled with the political upheavals of the era. I note that, according to the MacTutor biography linked to earlier this paragraph, he followed the deposed King Charles X to Prague in order to tutor his grandson, but might not have had the right temperament for it: at least once he got annoyed at the grandson’s confusion and screamed and yelled, with the Queen, Marie Thérèse, sometimes telling him, “too loud, not so loud”. But we’ve all had students that frustrate us.

Cauchy’s name appears on many theorems and principles and definitions of interesting things — I just checked Mathworld and his name returned 124 different items — though I’ll admit I’m stumped how to describe what the Cauchy-Frobenius Lemma is without scaring readers off. So let me talk about something simpler.

How Big Charlotte Was In 1975

[ I cannot and do not try to explain it, but yesterday was a busier-than-average day around these parts, with a surprising number of references coming from an Entertainment weekly article about the House series finale for some reason. In this context a “surprising” number is “any number other than zero” since I don’t know why anyone would go from there to here. I watched House, sometimes, sure, and liked it, but kind of drifted away when there was other stuff to do, you know? ]

That’s enough time spent establishing the heck out of the idea of a polynomial. Let’s actually put one in place. My goal back when was estimating what the population of Charlotte, North Carolina, was around 1975. I had some old Census data from 1970 and 1980 giving its population on the first of April, the earlier year, as 840,347; and the first of April, 1980, as 971,391.