The Set Tour, Part 13: Continuity


I hope we’re all comfortable with the idea of looking at sets of functions. If not we can maybe get comfortable soon. What’s important about functions is that we can add them together, and we can multiply them by real numbers. They work in important ways like regular old numbers would. They also work the way vectors do. So all we have to do is be comfortable with vectors. Then we have the background to talk about functions this way. And so, my first example of an oft-used set of functions:

C[a, b]

People like continuity. It’s comfortable. It’s reassuring, even. Most situations, most days, most things are pretty much like they were before, and that’s how we want it. Oh, we cast some hosannas towards the people who disrupt the steady progression of stuff. But we’re lying. Think of the worst days of your life. They were the ones that were very much not like the day before. If the day is discontinuous enough, then afterwards, people ask one another what they were doing when the discontinuous thing happened.

(OK, there are some good days which are very much not like the day before. But imagine someone who seems informed assures you that tomorrow will completely change your world. Do you feel anticipation or dread?)

Mathematical continuity isn’t so fraught with social implications. What we mean by a continuous function is — well, skip the precise definition. Calculus I students see it, stare at it, and run away. It comes back to the mathematics majors in Intro to Real Analysis. Then it comes back again in Real Analysis. Mathematics majors come to accept it sometime around Real Analysis II, because the alternative is Functional Analysis. The definition’s in truth not so bad. But it’s fussy, and if you get any part wrong, silly consequences follow.

If you’re not a mathematics major, or if you’re a mathematics major not taking a test in Real Analysis, you can get away with this. We’re talking here, and we’re going to keep talking, about functions with real numbers as the domain and real numbers as the range. Later, we can go to complex-valued numbers, or even vectors of numbers. The arguments get a bit longer but don’t change much, so if you learn this you’ve got most of the way to learning everything.

A continuous function is one whose graph you can draw without having to lift your pen. We like continuous functions, mathematically, because they are so much easier to work with. Why are they easy? Well, because if you know the value of your function at one point, you know approximately what it is at nearby points. There’s predictability to the function’s values. You can see why this would make it easier to do calculations. But it makes analysis easy too. We want to do a lot of proofs which involve arithmetic with the values functions have. It gets so much easier that we can say the function’s actual value is something like the value it has at some point we happen to know.
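For the record, the fussy definition being skipped isn’t secret. In the real-number setting we’re using, f is continuous at a point x0 of its domain when:

```latex
\forall \varepsilon > 0 \;\; \exists \delta > 0 : \quad |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon
```

Demand any tolerance ε on the function’s values, and there’s some closeness δ on the inputs that meets it. Being in C[a, b] means this holds at every point from a to b.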

So if we want to work with functions, we usually want to work with continuous functions. They behave more predictably, and more like we hope they will.

The set C[a, b] is the set of all continuous real-valued functions whose domain is the set of real numbers from a to b. For example, pick a function that’s in C[-1, 1]. Let me call it f. Then f is a real-valued function. And its domain is the real numbers from -1 to 1. In the absence of other information about what its range is, we assume it to be the real numbers R. We can have any real numbers as the boundaries; C[-1000, π] is legitimate if eccentric.

There are some intervals that are particularly popular as domains. All the real numbers is one. That might get written C(R) for shorthand. C[0, 1], functions on the interval from 0 to 1, is popular and easy to work with. C[-1, 1] is almost as good and has the advantage of giving us negative numbers. C[-π, π] is also liked because it meshes well with the trigonometric functions. You remember those: sines and cosines and tangent functions, plus some unpopular ones we try to not talk about. We don’t often talk about other intervals. We can change, say, C[0, 1] into C[0, 10] exactly the way you’d imagine. Re-scaling numbers, and even shifting them up or down some, requires so little work we don’t bother doing it.

C[-1, 1] is a different set of functions from, say, C[0, 1]. There are many functions in one set that have the same rule as a function in another set. But the functions in C[-1, 1] have a different domain from the functions in C[0, 1]. So they can’t be the same functions. The rule might be meaningful outside the domain. If the rule is “f:x -> 3*x”, well, that makes sense whatever x should be. But a function is the rule, the domain, and the range together. If any of the parts changes, we have a different function.

The way I’ve written the symbols, with straight brackets [a, b], means that both the numbers a and b are in the domain of these functions. If I want to omit the boundaries — have every number greater than a but not a itself, and have every number less than b but not b itself — then we change to parentheses. That would be C(-1, 1). If I want to include one boundary but not the other, use a straight bracket for the boundary to include, and a parenthesis for the boundary to omit. C[-1, 1) says functions in that set have a domain that includes -1 but does not include 1. It also drives my text editor crazy having unmatched parentheses and brackets like that. We must suffer for our mathematical arts.

The Set Tour, Part 12: What Can You Do With Functions?


I want to resume my tour of sets that turn up a lot as domains and ranges. But I need to spend some time explaining stuff before the next bunch. I want to talk about things that aren’t so familiar as “numbers” or “shapes”. We get into more abstract things.

We have to start out with functions. Functions are built of three parts: a set that’s the domain, a set that’s the range, and a rule that matches things in the domain to things in the range. But what’s a set? Sets are bunches of things. (If we want to avoid logical chaos we have to be more exact. But we’re not going near the zones of logical chaos. So we’re all right going with “sets are bunches of things”. WARNING: do not try to pass this off at your thesis defense.)

So if a function is a thing, can’t we have a set that’s made up of functions? Sure, why not? We can get a set by describing the collection of things we want in it. At least if we aren’t doing anything weird. (See above warning.)

Let’s pick out a set of functions. Put together a group of functions that all have the same set as their domain, and that have compatible sets as their range. The real numbers are a good pick for a domain. They’re also good for a range.

Is this an interesting set? Generally, a set is boring unless we can do something with the stuff in it. That something is, almost always, taking a pair of the things in the set and relating it to something new. Whole numbers, for example, would be trivia if we weren’t able to add them together. Real numbers would be a complicated pile of digits if we couldn’t multiply them together. Having things is nice. Doing stuff with things is all that’s meaningful.

So what can we do with a couple of functions, if they have the same domains and ranges? Let’s pick one out. Give it the name ‘f’. That’s a common name for functions. It was given to us by Leonhard Euler, who was brilliant in every field of mathematics, including in creating notation. Now let’s pick out a function again. Give this new one the name ‘g’. That’s a common name for functions, given to us by every mathematician who needed something besides ‘f’. (There are alternatives. One is to start using subscripts, like f1 and f2. That’s too hard for me to type. Another is to use different typefaces. Again, too hard for me. Another is to use lower- and upper-case letters, ‘f’ and ‘F’. Using alternate-case forms usually connotes that these two functions are related in some way. I don’t want to suggest that they are related here. So, ‘g’ it is.)

We can do some obvious things. We can add them together. We can create a new function, imaginatively named “f + g”. It’ll have the same domain and the same range as f and g did. What rule defines how it matches things in the domain to things in the range?

Mathematicians throw the term “obvious” around a lot. Also “intuitive”. What they mean is “what makes sense to me but I don’t want to write it down”. Saying that is fine if your mathematician friend knows roughly what you’d think makes sense. It can be catastrophic if she’s much smarter than you, or thinks in weird ways, and is always surprised other people don’t think like her. It’s hard to better describe it than “obvious”, though. Well, here goes.

Let me pick something that’s in the domain of both f and g. I’m going to call that x, which mathematicians have been doing ever since René Descartes gave us the idea. So “f(x)” is something in the range of f, and “g(x)” is something in the range of g. I said, way up earlier, that both of these ranges are the same set and suggested the real numbers there. That is, f(x) is some real number and I don’t care which just now. g(x) is also some real number and again I don’t care right now just which.

The function we call “f + g” matches the thing x, in the domain, to something in the range. What thing? The number f(x) + g(x). I told you, I can’t see any fair way to describe that besides being “obvious” and “intuitive”.
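If it helps to see that rule as code, here’s a minimal sketch in Python. The function names are my own, nothing standard:

```python
def add_functions(f, g):
    """Build the function "f + g": it matches x to f(x) + g(x)."""
    return lambda x: f(x) + g(x)

f = lambda x: 3 * x        # the rule f: x -> 3*x
g = lambda x: x * x        # the rule g: x -> x*x
h = add_functions(f, g)
print(h(2))                # f(2) + g(2) = 6 + 4 = 10
```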

Another thing we’ll want to do is multiply a function by a real number. Suppose we have a function f, just like above. Give me a real number. We’ll call that real number ‘a’ because I don’t remember if you can do the alpha symbol easily on web pages. Anyway, we can define a function, “af”, the multiplication of the real number a by the function f. It has the same domain as f, and the same range as f. What’s its rule?

Let me say x is something in the domain of f. So f(x) is some real number. Then the new function “af” matches the x in the domain with a real number. That number is what you get by multiplying ‘a’ by whatever ‘f(x)’ is. So there are major parts of your mathematician friend from college’s classes that you could have followed without trouble.

(Her class would have covered many more things, mind you, and covered these more cryptically.)
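And here’s the multiply-by-a-real-number rule in the same kind of sketch, names mine again:

```python
def scale_function(a, f):
    """Build the function "af": it matches x to a times f(x)."""
    return lambda x: a * f(x)

f = lambda x: 3 * x
af = scale_function(2, f)
print(af(5))               # 2 * f(5) = 2 * 15 = 30
```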

There’s more stuff we would like to do with functions. But for now, this is enough. This lets us turn a set of functions into a “vector space”. Vector spaces are kinds of things that work, at least a bit, like arithmetic. And mathematicians have studied these kinds of things. We have a lot of potent tools that work on vector spaces. So mathematicians develop a habit of finding vector spaces in what they study.

And I’m subject to that too. This is why I’ve spent such time talking about what we can do with functions rather than naming particular sets. I’ll pick up from that.

The Set Tour, Part 11: Doughnuts And Lots Of Them


I’ve been slow getting back to my tour of commonly-used domains for several reasons. It’s been a busy season. It’s so much easier to plan out writing something than it is to write something. The usual. But one of my excuses this time is that I’m not sure the set I want to talk about is that common. But I like it, and I imagine a lot of people will like it. So that’s enough.

T and Tn

T stands for the torus. Or the toroid, if you prefer. It’s a fun name. You know the shape. It’s a doughnut. Take a cylindrical tube and curl it around back on itself. Don’t rip it or fold it. That’s hard to do with paper or a sheet of clay or other real-world stuff. But we can imagine it easily enough. I suppose we can make a computer animation of it, if by ‘we’ we mean ‘you’.

We don’t use the whole doughnut shape for T. And no, we don’t use the hole either. What we use is the surface of the doughnut, the part that could get glazed. We ignore the inside, just the same way we had S represent the surface of a sphere (or the edge of a circle, or the boundary of a hypersphere). If there is a common symbol for the torus including the interior I don’t know it. I’d be glad to hear of one if someone knows it.

What good is the surface of a torus, though? Well, it’s a neat shape. Slice it in one direction, the way you’d cut a bagel in half, and at the slice you get the shape of a washer, the kind you fit around a nut and bolt. (An annulus, to use the trade term.) Slice it perpendicular to that, the way you’d cut it if you’re one of those people who eats half doughnuts to the amazement of the rest of us, and at the slice you get two detached circles. If you start from any point on the torus shape you can go in one direction and make a circle that loops around the doughnut’s central hole. You can go the perpendicular direction and make a circle that brushes up against but doesn’t go around the central hole. There’s some neat topology in it.

There’s also video games in it. The topology of this is just like old-fashioned video games where if you go off the edge of the screen to the right you come back around on the left, and if you go off the top you come back from the bottom. (And if you go off to the left you come back around the right, and off the bottom you come back to the top.) To go from the flat screen to the surface of a doughnut requires imagining some stretching and scrunching up of the surface, but that’s all right. (OK, in an old video game it was a kind-of flat screen.) We can imagine a nice flexible screen that just behaves.

This is a common trick to deal with boundaries. (I first wrote “to avoid having to deal with boundaries”. But this is dealing with them, by a method that often makes sense.) You just make each boundary match up with a logical other boundary. It’s not just useful in video games. Often we’ll want to study some phenomenon where the current state of things depends on the immediate neighborhood, but it’s hard to say what a logical boundary ought to be. This particularly comes up if we want to model an infinitely large surface without dealing with infinitely large things. The trick will turn up a lot in numerical simulations for that reason. (In that case, we’re in truth working with a numerical approximation of T, but that’ll be close enough.)
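Here is the wrap-around trick as a few lines of Python, a toy version of the video-game screen (the names and numbers are mine):

```python
def wrap(x, y, width, height):
    """Send a point that wandered off one edge back in from the opposite edge."""
    return (x % width, y % height)

print(wrap(85, -3, 80, 60))   # (5, 57): off two edges, back around
```

Python’s % operator handles the negative coordinate for us, which is exactly the behavior we want here.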

Tn, meanwhile, is a vector of things, each of which is a point on a torus. It’s akin to Rn or S2 x n. They’re ordered sets of things that are themselves things. There can be as many as you like. n, here, is whatever positive whole number you need.

You might wonder how big the doughnut is. When we talked about the surface of the sphere, S2, or the surface and interior, B3, we figured on a sphere with radius of 1 unless we heard otherwise. Toruses would seem to have two parameters. There’s how big the outer diameter is and how big the inner diameter is. Which do we pick?

We don’t actually care. It’s much the way we can talk about a point on the surface of a planet by the latitude and longitude of the point, and never care about how big the planet is. We can describe a point on the surface of the torus without needing to refer to how big the whole shape is or how big the hole in the middle is. A popular scheme to describe points is one that looks a lot like latitude and longitude.

Imagine the torus sitting as flat as it gets on the table. Pick a point that you find interesting.

We use some reference point that’s as good as an equator and a prime meridian. One coordinate is the angle you make going horizontally, possibly around the hole in the middle, from the reference point to the point we’re interested in. The other coordinate is the angle you make vertically, going in a loop that doesn’t go around the hole in the middle, from the reference point to the point we’re interested in. The reference point has coordinates 0, 0, as it must. If this sounds confusing it’s because I’m not using a picture. I thought making some pictures would be too much work. I’m a fool. But if you think of real torus-shaped objects it’ll come to you.

In this scheme the coordinates are both angles. Normal people would measure that in degrees, from 0 to 360, or maybe from -180 to 180. Mathematicians would measure as radians, from 0 to 2π, or from -π to +π. Whatever it is, it’s the same as the coordinates of a point on the edge of the circle, what we called S1 a few essays back. So it’s fair to say you can think of T as S1 x S1, an ordered set of points on circles.
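If you do want to pin the torus into ordinary three-dimensional space, there’s a standard recipe turning those two angles, call them θ for the small loop and φ for the loop around the central hole, into coordinates. It needs a big radius R and a little radius r, the sizes we just said we don’t care about:

```latex
x = (R + r\cos\theta)\cos\varphi, \qquad
y = (R + r\cos\theta)\sin\varphi, \qquad
z = r\sin\theta
```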

I’ve written of these toruses as three-dimensional things. Well, two-dimensional surfaces wrapped up to suggest three-dimensional objects. You don’t have to stick with these dimensions if you don’t want or if your problem needs something else. You can make a torus that’s a three-dimensional shape in four dimensions. For me that’s easiest to imagine as a cube where the left edge and the right edge loop back and meet up, the lower and the upper edges meet up, and the front and the back edges meet up. This works well to model an infinitely large space with a nice and small block.

I like to think I can imagine a four-dimensional doughnut where every cross-section is a sphere. I may be kidding myself. There could also be a five-dimensional torus and you’re on your own working that out, or working out what to do with it.

I’m not sure there is a common standard notation for that, though. Probably the mathematician wanting to make clear she’s working with a torus in four dimensions just says so in text, and trusts that the context of her mathematics makes it clear this is no ordinary torus.

I’ve also written of these toruses as circular, as rounded shapes. That’s the most familiar torus. It’s a doughnut shape, or an O-ring shape, or an inner tube’s shape. It’s the shape you produce by taking a circle and looping it around an axis not on the ring. That’s common and that’s usually all we need.

But if you need some other torus, produced by rotating some other shape around an axis not inside it, go ahead. You’ll need to make clear what that original shape, the generator, is. You’ve seen examples of this in the washers that fit around nuts and bolts. They’re typically rectangles in cross-section. Or you might have seen that image of someone who fit together a couple dozen iMac boxes to make a giant wheel. I don’t know why you would need this, but it’s your problem, not mine. If these shapes are useful for your work, by all means, use them.

I’m not sure there is a standard notation for that sort of shape. My hunch is to say you’d define your generating shape and give it a name such as A or D. Then name the torus based on that as T(A) or T(D). But I would recommend spelling it out in text before you start using symbols like this.

The Set Tour, Part 10: Lots of Spheres


The next exhibit on the Set Tour here builds on a couple of the previous ones. First is the set Sn, that is, the surface of a hypersphere in n+1 dimensions. Second is Bn, the ball — the interior — of a hypersphere in n dimensions. Yeah, it bugs me too that Sn isn’t the surface of Bn. But it’d be too much work to change things now. The third has lurked implicitly since all the way back to Rn, a set of n real numbers for which the ordering of the numbers matters. (That is, that the set of numbers 2, 3 probably means something different than the set 3, 2.) And fourth is a bit of writing we picked up with matrices. The selection is also dubiously relevant to my own thesis from back in the day.

Sn x m and Bn x m

Here ‘n’ and ‘m’ are whole numbers, and I’m not saying which ones because I don’t need to tie myself down. Just as with Rn and with matrices this is a whole family of sets. Each different pair of n and m gives us a different set Sn x m or Bn x m, but they’ll all look quite similar.

The multiplication symbol here is a kind of multiplication, just as it was in matrices. That kind is called a “direct product”. What we mean by Sn x m is that we have a collection of items. We have the number m of them. Each one of those items is in Sn. That’s the surface of the hypersphere in n+1 dimensions. And we want to keep track of the order of things; we can’t swap items around and suppose they mean the same thing.

So suppose I write S2 x 7. This is an ordered collection of seven items, every one of which is on the surface of a three-dimensional sphere. That is, it’s the location of seven spots on the surface of the Earth. S2 x 8 offers similar prospects for talking about the location of eight spots.

With that written out, you should have a guess what Bn x m means. Your guess is correct. It’s a collection of m things, each of them within the interior of the n-dimensional ball.

Now the dubious relevance to my thesis. My problem was modeling a specific layer of planetary atmospheres. The model used for this was to pretend the atmosphere was made up of some large number of vortices, of whirlpools. Just like the little whirlpools you see trailing behind your hand when you slide it through the water. The winds could be worked out as the sum of the winds produced by all these little vortices.

In the model, each of these vortices was confined to a single distance from the center of the planet. That’s close enough to true for planetary atmospheres. A layer in the atmosphere is not thick at all, compared to the planet. So every one of these vortices could be represented as a point in S2, the surface of a three-dimensional sphere. There would be some large number of these points. Most of my work used a nice round 256 points. So my model of a planetary atmosphere represented the system as a point in the domain S2 x 256. I was particularly interested in the energy of this set of 256 vortices. That was a function which had, as its domain, S2 x 256, and as range, the real numbers R.

But the connection to my actual work is dubious. I was doing numerical work, for the most part. I don’t think my advisor or I ever wrote S2 x 256 or anything like that when working out what I ought to do, much less what I actually did. Had I done a more analytic thesis I’d surely have needed to name this set. But I didn’t. It was lurking there behind my work nevertheless.

The energy of this system of vortices looked a lot like the potential energy for a bunch of planets attracting each other gravitationally, or like point charges repelling each other electrically. We work it out by looking at each pair of vortices. Work out the potential energy of those two vortices being that strong and that far apart. We call that a pairwise interaction. Then add up all the pairwise interactions. That’s it. [1] The pairwise interaction is stronger as each vortex is stronger; it gets weaker as the vortices get farther apart.
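That recipe is short enough to write as code. Here’s a sketch, with my own names; the interaction function is whatever the physics hands you:

```python
import math

def total_energy(strengths, positions, interaction):
    """Add up the pairwise interactions over every pair of vortices."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            distance = math.dist(positions[i], positions[j])
            energy += strengths[i] * strengths[j] * interaction(distance)
    return energy

# For the vortex case described next, the interaction would be
# lambda d: -math.log(d).
```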

In gravity or electricity problems the strength falls off as the reciprocal of the distance between points. In vortices, the strength falls off as minus one times the logarithm of the distance between points. That’s a difference, and it meant that a lot of analytical results known for electric charges didn’t apply to my problem exactly. That was all right. I didn’t need many. But it does mean that I was fibbing up above, when I said I was working with S2 x 256. Pause a moment. Do you see what the fib was?

I’ll put what would otherwise be a footnote here so folks have a harder time reading right through to the answer.

[1] Physics majors may be saying something like: “wait, I see how this would be the potential energy of these 256 vortices, but where’s the kinetic energy?” The answer is, there is none. It’s all potential energy. The dynamics of point vortices are weird. I didn’t have enough grounding in mechanics when I went into them.

That’s all to the footnote.

Here’s where the fib comes in. If I’m really picking sets of vortices from all of the set S2 x 256, then, can two of them be in the exact same place? Sure they can. Why couldn’t they? For precedent, consider R3. In the three-dimensional vectors I can have the first and third numbers “overlap” and have the same value: (1, 2, 1) is a perfectly good vector. Why would that be different for an ordered set of points on the surface of the sphere? Why can’t vortex 1 and vortex 3 happen to have the same value in S2?

The problem is if two vortices were in the exact same position then the energy would be infinitely large. That’s not unique to vortices. It would be true for masses and gravity, or electric charges, if they were brought perfectly on top of each other. Infinitely large energies are a problem. We really don’t want to deal with them.

We could deal with this by pretending it doesn’t happen. Imagine if you dropped 256 poker chips across the whole surface of the Earth. Would you expect any two to be on top of each other? Would you expect two to be exactly and perfectly on top of each other, neither one even slightly overhanging the other? That’s so unlikely you could safely ignore it, for the same reason you could ignore the chance you’ll toss a coin and have it come up tails 56 times in a row.

And if you were interested in modeling the vortices moving it would be incredibly unlikely to have one vortex collide with another. They’d circle around each other, very fast, almost certainly. So ignoring the problem is defensible in this case.

Or we could be proper and responsible and say, “no overlaps” and “no collisions”. We would define some set that represents “all the possible overlaps and arrangements that give us a collision”. Then we’d say we’re looking at S2 x 256 except for those. I don’t think there’s a standard convention for “all the possible overlaps and collisions”, but Ω is a reasonable choice. Then our domain would be S2 x 256 \ Ω. The backslash means “except for the stuff after this”. This might seem unsatisfying. We don’t explicitly say what combinations we’re excluding. But go ahead and try listing all the combinations that would produce trouble. Try something simple, like S2 x 4. This is why we hide all the complicated stuff under a couple ordinary sentences.

It’s not hard to describe “no overlaps” mathematically. (You would say something like “vortex number j and vortex number k are not at the same position”, with maybe a rider of “unless j and k are the same number”. Or you’d put it in symbols that mean the same thing.) “No collisions” is harder. For gravity or electric charge problems we can describe at least some of them. And I realize now I’m not sure if there is an easy way to describe vortices that collide. I have difficulty imagining how they might, since vortices that are close to one another are pushing each other sideways quite intently. I don’t think that I can say they can’t, though. Not without more thought.
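The “no overlaps” condition, put into symbols, might be rendered something like this (my notation, not a standard one):

```latex
x_j \neq x_k \quad \text{for all } j \neq k, \qquad x_j, x_k \in S^2
```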

The Set Tour, Part 9: Balls, Only The Insides


Last week in the tour of often-used domains I talked about Sn, the surfaces of spheres. These correspond naturally to stuff like the surfaces of planets, or the edges of circles. They are also natural fits if you have a quantity that’s made up of a couple of components, and some total amount of the quantity is fixed. More physical systems do that than you might have guessed.

But this is all the surfaces. The great interior of a planet is by definition left out of Sn. This gives away the heart of what this week’s entry in the set tour is.

Bn

Bn is the domain that’s the interior of a sphere. That is, B3 would be all the points in a three-dimensional space that are less than a particular radius from the origin, from the center of space. If we don’t say what the particular radius is, then we mean “1”. That’s just as with Sn, where we meant the radius to be “1” unless someone specifically said otherwise. In practice, I don’t remember anyone ever saying otherwise when I was in grad school. I suppose they might if we were doing a numerical simulation of something like the interior of a planet. You know, something where it could make a difference what the radius is.

It may have struck you that B3 is just the points that are inside S2. Alternatively, it might have struck you that S2 is the points that are on the edge of B3. Either way is right. Bn and Sn-1, for any positive whole number n, are tied together, one the edge and the other the interior.

Bn we tend to call the “ball” or the “n-ball”. Probably we hope that suggests bouncing balls and baseballs and other objects that are solid throughout. Sn we tend to call the “sphere” or the “n-sphere”, though I admit that doesn’t make a strong case for ruling out the inside of the sphere. Maybe we should think of it as the surface. We don’t even have to change the letter representing it.

As the “n” suggests, there are balls for as many dimensions of space as you like. B2 is a circle, filled in. B1 is just a line segment, stretching out from -1 to 1. B3 is what’s inside a planet or an orange or an amusement park’s glass light fixture. B4 is more work than I want to do today.

So here’s a natural question: does Bn include Sn-1? That is, when we talk about a ball in three dimensions, do we mean the surface and everything inside it? Or do we just mean the interior, stopping ever so short of the surface? This is a division very much like dividing the real numbers into negative and positive; do you include zero in either set?

Typically, I think, mathematicians don’t. If a mathematician speaks of B3 without saying otherwise, she probably means the interior of a three-dimensional ball. She’s not saying anything one way or the other about the surface. This we name the “open ball”, and if she wants to avoid any ambiguity she will say “the open ball Bn”.

“Open” here means the same thing it does when speaking of an “open set”. That may not communicate well to people who don’t remember their set theory. It means that the edges aren’t included. (Warning! Not actual set theory! Do not attempt to use that at your thesis defense. That description was only a reference to what’s important about this property in this particular context.)

If a mathematician wants to talk about the ball and the surface, she might say “the closed ball Bn”. This means to take the surface and the interior together. “Closed”, again, here means what it does in set theory. It pretty much means “include the edges”. (Warning! See above warning.)

Balls work well as domains for functions that have to describe the interiors of things. They also work if we want to talk about a constraint that’s made up of a couple of components, and that can be up to some size but not larger. For example, suppose you may put up to a certain budget cap into (say) six different projects, but you aren’t required to use the entire budget. We could model your budgeting as finding the point in B6 that gets the best result. How you measure the best is a problem for your operations research people. All I’m telling you is how we might represent the study of the thing you’re doing.
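A sketch of that membership test, with the budget re-scaled so the cap is 1 and with made-up numbers:

```python
import math

def in_open_ball(point, radius=1.0):
    """True when the point is strictly inside the ball of the given radius."""
    return math.hypot(*point) < radius

budget_split = (0.3, 0.1, 0.2, 0.4, 0.05, 0.15)   # six projects, a point in B6
print(in_open_ball(budget_split))                  # True: strictly within the cap
```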

The Set Tour, Part 8: Balls, Only Made Harder


I haven’t forgotten or given up on the Set Tour, don’t worry or celebrate. I just expected there to be more mathematically-themed comic strips the last couple days. Really, three days in a row without anything at ComicsKingdom or GoComics to talk about? That’s unsettling stuff. Ah well.

Sn

We are also starting to get into often-used domains that are a bit stranger. We are going to start seeing domains that strain the imagination more. But this isn’t strange quite yet. We’re looking at the surface of a sphere.

The surface of a sphere we call S2. The “S” suggests a sphere. The “2” means that we have a two-dimensional surface, which matches what we see with the surface of the Earth, or a beach ball, or a soap bubble. All these are sphere enough for our needs. If we want to say where we are on the surface of the Earth, it’s most convenient to do this with two numbers. These are a latitude and a longitude. The latitude is the angle made between the point we’re interested in and the equator. The longitude is the angle made between the point we’re interested in and a reference prime longitude.

There are some variations. We can replace the latitude, for example, with the colatitude. That’s the angle between our point and the north pole. Or we might replace the latitude with the cosine of the colatitude. That has some nice analytic properties that you have to be well into grad school to care about. It doesn’t matter. The details may vary but it’s all the same. We put in a number for the east-west distance and another for the north-south distance.

It may seem pompous to use the same system to say where a point is on the surface of a beach ball. But can you think of a better one? Pointing to the ball and saying “there”, I suppose. But that requires we go around with the beach ball pointing out spots. Giving two numbers saves us having to go around pointing.

(Some weenie may wish to point out that if we were clever we could describe a point exactly using only a single number. This is true. Nobody does that unless they’re weenies trying to make a point. This essay is long enough without describing what mathematicians really mean by “dimension”. “How many numbers normal people use to identify a point in it” is good enough.)

S2 is a common domain. If we talk about something that varies with your position on the surface of the earth, we’re probably using S2 as the domain. If we talk about the temperature as it varies with position, or the height above sea level, or the population density, we have functions with a domain of S2 and a range in R. If we talk about the wind speed and direction we have a function with domain of S2 and a range in R3, because the wind might be moving in any direction.

Of course, I wrote down Sn rather than just S2. As with Rn and with Rm x n, there is really a family of similar domains. They are common enough to share a basic symbol, and the superscript is enough to differentiate them.

What we mean by Sn is “the collection of points in Rn+1 that are all the same distance from the origin”. Let me unpack that a little. The “origin” is some point in space that we pick to measure stuff from. On the number line we just call that “zero”. On your normal two-dimensional plot that’s where the x- and y-axes intersect. On your normal three-dimensional plot that’s where the x- and y- and z-axes intersect.

And by “the same distance” we mean some set, fixed distance. Usually we call that the radius. If we don’t specify some distance then we mean “1”. In fact, this is so regularly the radius I’m not sure how we would specify a different one. Maybe we would write Snr for a radius of “r”. Anyway, Sn, the surface of the sphere with radius 1, is commonly called the “unit sphere”. “Unit” gets used a fair bit for shapes. You’ll see references to a “unit cube” or “unit disc” or so on. A unit cube has sides length 1. A unit disc has radius 1. If you see “unit” in a mathematical setting it usually means “this thing measures out at 1”. (The other thing it may mean is “a unit of measure, but we’re not saying which one”. For example, “a unit of distance” doesn’t commit us to saying whether the distance is one inch, one meter, one million light-years, or one angstrom. We use that when we don’t care how big the unit is, and only wonder how many of them we have.)
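Put into symbols, with that radius-1 convention, the definition of Sn reads:

```latex
S^n = \left\{ x \in \mathbb{R}^{n+1} \,:\, \| x \| = 1 \right\}
```

where ‖x‖ is the distance from x to the origin.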

S1 is an exotic name for a familiar thing. It’s all the points in two-dimensional space that are a distance 1 from the origin. Real people call this a “circle”. So do mathematicians unless they’re comparing it to other spheres or hyperspheres.

This is a one-dimensional figure. We can identify a single point on it easily with just one number, the angle made with respect to some reference direction. The reference direction is almost always that of the positive x-axis. That’s the line that starts at the center of the circle and points off to the right.

S3 is the first hypersphere we encounter. It’s a surface that’s three-dimensional, and it takes a four-dimensional space to see it. You might be able to picture this in your head. When I try I imagine something that looks like the regular old surface of the sphere, only it has fancier shading and maybe some extra lines to suggest depth. That’s all right. We can describe the thing even if we can’t imagine it perfectly. S4, well, that’s something taking five dimensions of space to fit in. I don’t blame you if you don’t bother trying to imagine what that looks like exactly.

The need for S4 itself tends to be rare. If we want to prove something about a function on a hypersphere we usually make do with Sn. This doesn’t tell us how many dimensions we’re working with. But we can imagine that as a regular old sphere only with a most fancy job of drawing lines on it.

If we want to talk about Sn aloud, or if we just want some variation in our prose, we might call it an n-sphere instead. So the 2-sphere is the surface of the regular old sphere that’s good enough for everybody but mathematicians. The 1-sphere is the circle. The 3-sphere and so on are harder to imagine. Wikipedia asserts that 3-spheres and higher-dimension hyperspheres are sometimes called “glomes”. I have not heard this word before, and I would expect it to start a fight if I tried to play it in Scrabble. However, I do not do mathematics that often requires discussion of hyperspheres. I leave this space open to people who do and who can say whether “glome” is a thing.

Something that all these Sn sets have in common is that they are the surfaces of spheres. They are just the boundary, and omit the interior. If we want a function that’s defined on the interior of the Earth we need to find a different domain.

The Set Tour, Part 7: Matrices


I feel a bit odd about this week’s guest in the Set Tour. I’ve been mostly concentrating on sets that get used as the domains or ranges for functions a lot. The ones I want to talk about here don’t tend to serve the role of domain or range. But they are used a great deal in some interesting functions. So I loosen my rule about what to talk about.

Rm x n and Cm x n

Rm x n might explain itself by this point. If it doesn’t, then this may help: the “x” here is the multiplication symbol. “m” and “n” are positive whole numbers. They might be the same number; they might be different. So, are we done here?

Maybe not quite. I was fibbing a little when I said “x” was the multiplication symbol. R2 x 3 is not a longer way of saying R6, an ordered collection of six real-valued numbers. The x does represent a kind of product, though. What we mean by R2 x 3 is an ordered collection, two rows by three columns, of real-valued numbers. Say the “x” here aloud as “by” and you’re pronouncing it correctly.

What we get is called a “matrix”. If we put into it only real-valued numbers, it’s a “real matrix”, or a “matrix of reals”. Sometimes mathematical terminology isn’t so hard to follow. Just as with vectors, Rn, it matters just how the numbers are organized. R2 x 3 means something completely different from what R3 x 2 means. And swapping which positions the numbers in the matrix occupy changes what matrix we have, as you might expect.

You can add together matrices, exactly as you can add together vectors. The same rules even apply. You can only add together two matrices of the same size. They have to have the same number of rows and the same number of columns. You add them by adding together the numbers in the corresponding slots. It’s exactly what you would do if you went in without preconceptions.
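Here’s that slot-by-slot addition as a Python sketch, with matrices stored as lists of rows (my representation, nothing official):

```python
def add_matrices(a, b):
    """Add two same-sized matrices, slot by slot."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("need the same number of rows and columns")
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))]
            for i in range(len(a))]

print(add_matrices([[1, 2, 3], [4, 5, 6]],
                   [[10, 20, 30], [40, 50, 60]]))
# [[11, 22, 33], [44, 55, 66]]
```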

You can also multiply a matrix by a single number. We called this scalar multiplication back when we were working with vectors. With matrices, we call this scalar multiplication. If it strikes you that we could see vectors as a kind of matrix, yes, we can. Sometimes that’s wise. We can see a vector as a matrix in the set R1 x n or as one in the set Rn x 1, depending on just what we mean to do.

It’s trickier to multiply two matrices together. As with vectors multiplying the numbers in corresponding positions together doesn’t give us anything. What we do instead is a time-consuming but not actually hard process. But according to its rules, something in Rm x n we can multiply by something in Rn x k. “k” is another whole number. The second thing has to have exactly as many rows as the first thing has columns. What we get is a matrix in Rm x k.
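And the time-consuming-but-not-hard multiplication, same toy representation as above:

```python
def multiply_matrices(a, b):
    """Multiply an m-by-n matrix by an n-by-k matrix; the result is m-by-k."""
    if len(a[0]) != len(b):
        raise ValueError("first matrix's columns must equal second matrix's rows")
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# A 2-by-3 matrix times a 3-by-2 matrix gives a 2-by-2 matrix:
print(multiply_matrices([[1, 2, 3], [4, 5, 6]],
                        [[1, 0], [0, 1], [1, 1]]))
# [[4, 5], [10, 11]]
```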

I grant you maybe didn’t see that coming. Also a potential complication: if you can multiply something in Rm x n by something in Rn x k, can you multiply the thing in Rn x k by the thing in Rm x n? … No, not unless k and m are the same number. Even if they are, you can’t count on getting the same product. Matrices are weird things this way. They’re also gateways to weirder things. But it is a productive weirdness, and I’ll explain why in a few paragraphs.

A matrix is a way of organizing terms. Those terms can be anything. Real matrices are surely the most common kind of matrix, at least in mathematical usage. Next in common use would be complex-valued matrices, much like how we get complex-valued vectors. These are written Cm x n. A complex-valued matrix is different from a real-valued matrix. The terms inside the matrix can be complex-valued numbers, instead of real-valued numbers. Again, sometimes, these mathematical terms aren’t so tricky.

I’ve heard occasionally of people organizing matrices of other sets. The notation is similar. If you’re building a matrix of “m” rows and “n” columns out of the things you find inside a set we’ll call H, then you write that as Hm x n. I’m not saying you should do this, just that if you need to, that’s how to tell people what you’re doing.

Now. We don’t really have a lot of functions that use matrices as domains, and I can think of fewer that use matrices as ranges. There are a couple of valuable ones, ones so valuable they get special names like “eigenvalue” and “eigenvector”. (Don’t worry about what those are.) They take in Rm x n or Cm x n and return a set of real- or complex-valued numbers, or real- or complex-valued vectors. Not even those, actually. Eigenvalues and eigenvectors are only meaningful if there are exactly as many rows as columns. That is, for Rm x m and Cm x m. These are known as “square” matrices, just as you might guess if you were shaken awake and ordered to say what you guessed a “square matrix” might be.

They’re important functions. There are some other important functions, with names like “rank” and “condition number” and the like. But they’re not many. I believe they’re not even thought of as functions, any more than we think of “the length of a vector” as primarily a function. They’re just properties of these matrices, that’s all.

So why are they worth knowing? Besides the joy that comes of knowing something, I mean?

Here’s one answer, and the one that I find most compelling. There is cultural bias in this: I come from an applications-heavy mathematical heritage. We like differential equations, which study how stuff changes in time and in space. It’s very easy to go from differential equations to ordered sets of equations. The first equation may describe how the position of particle 1 changes in time. It might describe how the velocity of the fluid moving past point 1 changes in time. It might describe how the temperature measured by sensor 1 changes as it moves. It doesn’t matter. We get a set of these equations together and we have a majestic set of differential equations.

Now, the dirty little secret of differential equations: we can’t solve them. Most interesting physical phenomena are nonlinear. Linear stuff is easy. Small change 1 has effect A; small change 2 has effect B. If we make small change 1 and small change 2 together, this has effect A plus B. Nonlinear stuff, though … it just doesn’t work. Small change 1 has effect A; small change 2 has effect B. Small change 1 and small change 2 together has effect … A plus B plus some weird A times B thing plus some effect C that nobody saw coming and then C does something with A and B and now maybe we’d best hide.

There are some nonlinear differential equations we can solve. Those are the result of heroic work and brilliant insights. Compared to all the things we would like to solve there’s not many of them. Methods to solve nonlinear differential equations are as precious as ways to slay krakens.

But here’s what we can do. What we usually like to know about in systems are equilibriums. Those are the conditions in which the system stops changing. Those are interesting. We can usually find those points by boring but not conceptually challenging calculations. If we can’t, we can declare x0 represents the equilibrium. If we still care, we leave calculating its actual values to the interested reader or hungry grad student.

But what’s really interesting is: what happens if we’re near but not exactly at the equilibrium? Sometimes, we stay near it. Think of pushing a swing. However good a push you give, it’s going to settle back to the boring old equilibrium of dangling straight down. Sometimes, we go racing away from it. Think of trying to balance a pencil on its tip; if we did this perfectly it would stay balanced. It never does. We’re never perfect, or there’s some wind or somebody walks by and the perfect balance is foiled. It falls down and doesn’t bounce back up. Sometimes, whether it stays near or goes away depends on what way it’s away from the equilibrium.

And now we finally get back to matrices. Suppose we are starting out near an equilibrium. We can, usually, approximate the differential equations that describe what will happen. The approximation may only be good if we’re just a tiny bit away from the equilibrium, but that might be all we really want to know. That approximation will be some linear differential equations. (If they’re not, then we’re just wasting our time.) And that system of linear differential equations we can describe using matrices.

If we can write what we are interested in as a set of linear differential equations, then we have won. We can use the many powerful tools of matrix arithmetic — linear algebra, specifically — to tell us everything we want to know about the system. We can say whether a small push away from the equilibrium stays small, or whether it grows, or whether it depends. We can say how fast the small push shrinks, or grows (for a while). We can say how the system will change, approximately.
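To make that concrete, here’s a sketch of reading stability off a matrix, using numpy and a two-by-two linearization I made up for the occasion:

```python
import numpy as np

# dx/dt = A x, a linear approximation near some equilibrium.
A = np.array([[0.0, 1.0],
              [-1.0, -0.1]])   # behaves like a gently damped swing

eigenvalues = np.linalg.eigvals(A)
# All eigenvalues with negative real part: small pushes die away.
# Any eigenvalue with positive real part: small pushes grow.
print(all(ev.real < 0 for ev in eigenvalues))   # True for this A
```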

This is what I love in matrices. It’s not everything there is to them. But it’s enough to make matrices important to me.

The Set Tour, Part 6: One Big One Plus Some Rubble


I have a couple of sets for this installment of the Set Tour. It’s still an unusual installment because only one of the sets is that important for my purposes here. The rest I mention because they appear a lot, even if they aren’t much used in these contexts.

I, or J, or maybe Z

The important set here is the integers. You know the integers: they’re the numbers everyone knows. They’re the numbers we count with. They’re 1 and 2 and 3 and a hundred million billion. As we get older we come to accept 0 as an integer, and even the negative integers like “negative 12” and “minus 40” and all that. The integers might be the easiest mathematical construct to know. The positive integers, anyway. The negative ones are still a little suspicious.

The set of integers has several shorthand names. I is a popular and common one. As with the real-valued numbers R and the complex-valued numbers C it gets written by hand, and typically typeset, with a double vertical stroke. And we’ll put horizontal serifs on the top and bottom of the symbol. That’s a concession to readability. You see the same effect in comic strip lettering. A capital “I” in the middle of a word will often be written without serifs, while the word by itself needs the extra visual bulk.

The next popular symbol is J, again with a double vertical stroke. This gets used if we want to reserve “I”, or the word “I”, for some other purpose. J probably gets used because it’s so very close to I, and it’s only quite recently (in historic terms) that they’ve even been seen as different letters.

The symbol that seems to come out of nowhere is Z. It comes less from nowhere than it does from German. The symbol derives from “Zahl”, meaning “number”. It seems to have got into mathematics by way of Nicolas Bourbaki, the renowned imaginary French mathematician. The Z gets written with a double diagonal stroke.

Personally, I like Z most of this set, but on trivial grounds. It’s a more fun letter to write, especially since I write it with a middle horizontal stroke. I’ve got no good cultural or historical reason for this. I just picked it up as a kid and never set it back down.

In these Set Tour essays I’m trying to write about sets that get used often as domains and ranges for functions. The integers get used a fair bit, although not nearly as often as real numbers do. The integers are a natural way to organize sequences of numbers. If the record of a week’s temperatures (in Fahrenheit) is “58, 45, 49, 54, 58, 60, 64”, there’s an almost compelling temperature function here. f(1) = 58, f(2) = 45, f(3) = 49, f(4) = 54, f(5) = 58, f(6) = 60, f(7) = 64. This is a function that has as its domain the integers. It happens that the range here is also integers, although you might be able to imagine a day when the temperature reading was 54.5.
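In code the function is nearly nothing at all; this is just the data from above:

```python
temperatures = {1: 58, 2: 45, 3: 49, 4: 54, 5: 58, 6: 60, 7: 64}

def f(n):
    """The temperature function: its domain is the integers 1 through 7."""
    return temperatures[n]

print(f(3))   # 49
```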

Sequences turn up a lot. We are almost required to measure things we are interested in in discrete samples. So mathematical work with sequences uses integers as the domain almost by default. The use of integers as a domain gets done so often that it often becomes invisible, though. Someone studying my temperature data above might write the data as f1, f2, f3, and so on. One might reasonably never even notice there’s a function there, or a domain.

And that’s fine. A tool can be so useful it disappears. Attend a play; the stage is in light and the audience in darkness. The roles the light and darkness play disappear unless the director chooses to draw attention to this choice.

And to be honest, integers are a lousy domain for functions. It’s achingly hard to prove things for functions defined just on the integers. The easiest way to do anything useful is typically to find an equivalent problem for a related function that’s got the real numbers as a domain. Then show the answer for that gives you your best-possible answer for the original question.

If all we want are the positive integers, we put a little superscript + to our symbol: I+ or J+ or Z+. That’s a popular choice if we’re using the integers as an index. If we just want the negative numbers that’s a little weird, but, change the plus sign to a minus: I-.

Now for some trouble.

Sometimes we want the positive numbers and zero, or in the lingo, the “nonnegative numbers”. Good luck with that. Mathematicians haven’t quite settled on what this should be called, or abbreviated. The “natural numbers” is a common name for the numbers 0, 1, 2, 3, 4, and so on, and this makes perfect sense and gets abbreviated N. You can double up the left vertical stroke, or the diagonal stroke, as you like and that will be understood by everybody.

That is, everybody except the people who figure “natural numbers” should be 1, 2, 3, 4, and so on, and that zero has no place in this set. After all, every human culture counts with 1 and 2 and 3, and for that matter crows and raccoons understand the concept of “four”. Yet it took thousands of years for anyone to think of “zero”, so how natural could that be?

So we might resort to speaking of the “whole numbers” instead. More good luck with that. Besides leaving open the question of whether zero should be considered “whole” there’s the linguistic problem. “Whole” number carries, for many, the implication of a number that is an integer with no fractional part. We already have the word “integer” for that, yes. But the fact people will talk about rounding off to a whole number suggests the phrase “whole number” serves some role that the word “integer” doesn’t. Still, W is sitting around not doing anything useful.

Then there’s “counting numbers”. I would be willing to endorse this as a term for the integers 0, 1, 2, 3, 4, and so on, except. Have you ever met anybody who starts counting from zero? Yes, programmers for some — not all! — computer languages. You know which computer languages. They’re the languages which baffle new students because why on earth would we start counting things from zero all of a sudden? And the obvious single-letter abbreviation C is no good because we need that for complex numbers, a set that people actually use for domains a lot.

There is a good side to this, if you aren’t willing to sit out the 150 years or so mathematicians are going to need to sort this all out. You can set out a symbol that makes sense to you, early on in your writing, and stick with it. If you find you don’t like it, you can switch to something else in your next paper and nobody will protest. If you figure out a good one, people may imitate you. If you figure out a really good one, people will change it just a tiny bit so that their usage drives you crazy. Life is like that.

Eric Weisstein’s Mathworld recommends using Z* for the nonnegative integers. I don’t happen to care for that. I usually associate superscript * symbols with some operations involving complex-valued numbers and with the duals of sets, neither of which is in play here. But it’s not like he’s wrong and I’m right. If I were forced to pick a symbol right now I’d probably give Z0+. And for the nonpositive integers — the negative integers and zero — Z0- presents itself. I fully understand there are people who would be driven stark raving mad by this. Maybe you have a better one. I’d believe that.

Let me close with something non-controversial.

These are some sets that are too important to go unmentioned. But they don’t get used much in the domain-and-range role I’ve been using as basis for these essays. They are, in the terrain of these essays, some rubble.

You know the rational numbers? They’re the things you can write as fractions: 1/2, 5/13, 32/7, -6/7, 0 (think about it). This is a quite useful set, although it doesn’t get used much for the domain or range of functions, at least not in the fields of mathematics I see. It gets abbreviated as Q, though. There’s an extra vertical stroke on the left side of the loop, just as a vertical stroke gets added to the C for complex-valued numbers. Why Q? Well, “R” is already spoken for, as we need it for the real numbers. The key here is that every rational number can be written as the quotient of one integer divided by another. So, this is the set of Quotients. This abbreviation we get thanks to Bourbaki, the same folks who gave us Z for integers. If it strikes you that the imaginary French mathematician Bourbaki used a lot of German words, all I can say is I think that might have been part of the fun of the Bourbaki project. (Well, and German mathematicians gave us many breakthroughs in the understanding of sets in the late 19th and early 20th centuries. We speak with their language because they spoke so well.)

If you’re comfortable with real numbers and with rational numbers, you know of irrational numbers. These are (most) square roots, and pi and e, and the golden ratio and a lot of cosines of angles. Strangely, there really isn’t any common shorthand name or common notation for the irrational numbers. If we need to talk about them, we have the shorthand “R \ Q”. This means “the real numbers except for the rational numbers”. Or we have the shorthand “Qc”. This means “everything except the rational numbers”. That “everything” carries the implication “everything in the real numbers”. The “c” in the superscript stands for “complement”, everything outside the set we’re talking about. These are ungainly, yes. And it’s a bit odd considering that most real numbers are irrational numbers. The rational numbers are a most ineffable cloud of dust in the atmosphere of the real numbers.

But, mostly, we don’t need to talk about functions that have an irrational-number domain. We can do our work with a real-number domain instead. So we leave that set with a clumsy symbol. If there’s ever a gold rush of fruitful mathematics to be done with functions on irrational domains then we’ll put in some better notation. Until then, there are better jobs for our letters to do.

The Set Tour, Part 5: C^n


The next piece in this set tour is a hybrid. It mixes properties of the last two sets. And I’ll own up now that while it’s a set that gets used a lot, it’s one that gets used a lot in just some corners of mathematics. It’s got a bit of that “Internet fame”. In particular circles it’s well-known; venture outside those circles even a little, and it’s not. But it leads us into other, useful places.

Cn

C here is the set of complex-valued numbers. We may have feared them once, but now they’re friends, or at least something we can work peacefully with. n here is some counting number, just as it is with Rn. n could be one or two or forty or a hundred billion. It’ll be whatever fits the problem we’re doing, if we need to pin down its value at all.

The reference to Rn, another friend, probably tipped you off to the rest. The items in Cn are n-tuples, ordered sets of some number n of numbers. Each of those numbers is itself a complex-valued number, something from C. Cn gets typeset in bold, and often with that extra vertical stroke on the left side of the C arc. It’s handwritten that way, too.

As with Rn we can add together things in Cn. Suppose that we are in C2 so that I don’t have to type too much. Suppose the first number is (2 + i, -3 – 3*i) and the second number is (6 – 2*i, 2 + 9*i). There could be fractions or irrational numbers in the real and imaginary components, but I don’t want to type that much. The work is the same. Anyway, the sum will be another number in C2. The first term in that sum will be the sum of the first term in the first number, 2 + i, and the first term in the second number, 6 – 2*i. That in turn will be the sum of the real and of the imaginary components, so, 2 + 6 + i – 2*i, or 8 – i all told. The second term of the sum will be the sum of the second term of the first number, -3 – 3*i, and the second term of the second number, 2 + 9*i, which will be -3 – 3*i + 2 + 9*i or, all told, -1 + 6*i. The sum is the n-tuple (8 – i, -1 + 6*i).
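If it helps to see that carried out in code, here’s a tiny sketch in Python, which happens to have a complex-number type built in. Python writes the imaginary unit as j rather than i, and the variable names are my own invention.

```python
# Adding two elements of C^2, term by term.
u = (2 + 1j, -3 - 3j)
v = (6 - 2j, 2 + 9j)

# The sum adds matching terms, each a complex-valued number.
total = tuple(a + b for a, b in zip(u, v))

print(total)  # ((8-1j), (-1+6j))
```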

And also as with Rn, there really isn’t a way to multiply one element of Cn by another. Generally, we can’t do this in any useful way. We can, though, multiply something in Cn by a scalar: a single real number or, why not, a complex-valued one.

So let’s start out with (8 – i, -1 + 6*i), a number in C2. And then pick a scalar, say, 2 + 2*i. It doesn’t have to be complex-valued, but, why not? The product of this scalar and this number will be another number in C2. Its first term will be the scalar, 2 + 2*i, multiplied by the first term of our number, 8 – i. That’s (2 + 2*i) * (8 – i), or 2*8 – 2*i + 16*i – 2*i*i, or 2*8 – 2*i + 16*i + 2, or 18 + 14*i. And then its second term will be the scalar 2 + 2*i multiplied by the second term, -1 + 6*i. That’s (2 + 2*i)*(-1 + 6*i), or 2*(-1) + 2*6*i – 2*i + 2*6*i*i. And that’s -2 + 12*i – 2*i – 12, or -14 + 10*i. So the product is (18 + 14*i, -14 + 10*i).
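And the scalar multiplication, in the same spirit, with Python again writing i as j:

```python
# Multiplying an element of C^2 by a complex-valued scalar, term by term.
w = (8 - 1j, -1 + 6j)
scalar = 2 + 2j

product = tuple(scalar * term for term in w)

print(product)  # ((18+14j), (-14+10j))
```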

So as with Rn, Cn forms a “vector space”. These spaces are useful in complex analysis. They’re also useful in the study of affine geometry, a corner of geometry that I’m sad to admit falls outside what I studied. I have tried reading up on it on my own, and I run aground each time. I understand the basic principles but never quite grasp why they are interesting. That’s my own failing, of course, and I’d be glad for a pointer that explained, in ways I understood, why they’re so neat.

I do understand some of what’s neat about them: affine geometry tells us what we can know about shapes without using the concept of “distance”. When you discover that we can know anything about shapes without the idea of “distance” your imagination should be fired. Mine is, too. I just haven’t followed from that to feel comfortable with the terminology and symbols of the field.

You could, if you like, think of Cn as being a specially-delineated version of R2n. This is just as you can see a complex number as an ordered pair of real numbers. But sometimes information is usefully thought of as a single, complex-valued number. And there is a value in introducing the idea of ordered sets of things that are not real numbers. We will see the concept again.
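Here’s a little sketch of that flattening, with names I’ve made up for the occasion. It unpacks each complex term into its real and imaginary parts, turning a thing in C2 into a thing in R4:

```python
# Viewing an element of C^2 as an element of R^4:
# each complex term contributes its real part, then its imaginary part.
w = (8 - 1j, -1 + 6j)

flattened = tuple(part for z in w for part in (z.real, z.imag))

print(flattened)  # (8.0, -1.0, -1.0, 6.0)
```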


Also, how the heck did I write an 800-word essay about the family of sets of complex-valued n-tuples and have Hemingway Editor judge it to be at the “Grade 3” reading level? I rarely get down to “Grade 6” when I do a Reading the Comics post explaining how Andertoons did a snarky-word-problem-answers panel. That’s got to be a temporary glitch.

The Set Tour, Part 4: Complex Numbers


C

The square root of negative one. Everybody knows it doesn’t exist; there’s no real number you can multiply by itself and get negative one out. But then sometime in algebra, deep in a section about polynomials, suddenly we come out and declare there is such a thing. It’s an “imaginary number” that we call “i”. It’s hard to blame students for feeling betrayed by this. To make it worse, we throw real and imaginary numbers together and call the result “complex numbers”. It’s as if we’re out to tease them for feeling confused.

It’s an important set of things, though. It turns up as the domain, or the range, of functions so often that one of the major fields of analysis is called “Complex Analysis”. If the course listing allows for more words, it’s called “Analysis of Functions of a Complex Variable” or something like that. Despite the connotations of the word “complex”, though, the field is a delight. It’s considerably easier to understand than Real Analysis, the study of functions of mere real numbers. When there is a theorem that has a version in Real Analysis and a version in Complex Analysis, the Complex Analysis side is usually easier to prove and easier to understand. It’s uncanny.

The set of all complex numbers is denoted C, in parallel to the set of real numbers, R. To make it clear that we mean this set, and not some piddling little common set that might happen to share the name C, add a vertical stroke to the left of the letter. This is just as we add a vertical stroke to R to emphasize we mean the Real Numbers. We should approach the set with respect, removing our hats, thinking seriously about great things. It would look silly to add a second curve to C though, so we just add a straight vertical stroke on the left side of the letter C. This makes it look a bit like it’s an Old English typeface (the kind you call Gothic until you learn that means “sans serif”) pared down to its minimum.

Why do we teach people there’s no such thing as a square root of minus one, and then one day teach them there is? Part of it is that whether there is a square root depends on your context. If you are interested only in the real numbers, there’s nothing that, squared, gives you minus one. This is exactly the way it’s not possible to divide five objects equally between two people if you aren’t allowed to cut the objects in half. But if you are willing to allow half-objects to be things, then you can do what was previously forbidden. What you can do depends on the rules you set out.

And there’s surely some echo of the historical discovery of imaginary and complex numbers at work here. They were noticed when working out the roots of third- and fourth-degree polynomials. Those roots can be found by way of formulas that nobody ever remembers because there are so many better things to remember. The formulas would sometimes require one to calculate the square root of a negative number, a thing that obviously didn’t exist. Except that if you pretended it did, you could get out correct answers, just as if these were ordinary numbers. You can see why this may be dubbed an “imaginary” number. The name hints at the suspicion with which it’s viewed. It’s much as “negative” numbers look like some trap to people who’re just getting comfortable with fractions.

It goes against the stereotype of mathematicians to suppose they’d accept working with something they don’t understand because the results are all right, afterwards. But, actually, mathematicians are willing to accept getting answers by any crazy method. If you have a plausible answer, you can test whether it’s right, and if all you really need this minute is the right answer, good.

But we do like having methods; they’re more useful than mere answers. And we can imagine this set called the complex numbers. They contain … well, all the possible roots, the solutions, of all polynomials. (The polynomials might have coefficients, the numbers in front of the variable, that are integers, or rational numbers, or irrational numbers. If we already accept the idea of complex numbers, the coefficients can be complex numbers too.)

It’s exceedingly common to think of the complex numbers by starting off with a new number called “i”. This is a number about which we know nothing except that i times i equals minus one. Then we tend to think of complex numbers as “a real number plus i times another real number”. The first real number gets called “the real component”, and is usually denoted as either “a” or “x”. The second real number gets called “the imaginary component”, and is usually denoted as either “b” or “y”. Then the complex number is written “a + i*b” or “x + i*y”. Sometimes it’s written “a + b*i” or “x + y*i”; that’s a mere matter of house style. Don’t let it throw you.

Writing a complex number this way has advantages. Particularly, it makes it easy to see how one would add together (or subtract) complex numbers: “a + b*i + x + y*i” almost suggests that the sum should be “(a + x) + (b + y)*i”. What we know from ordinary arithmetic gives us guidance. And if we’re comfortable with binomials, then we know how to multiply complex numbers. Start with “(a + b*i) * (x + y*i)” and follow the distributive law. We get, first, “a*x + a*y*i + b*i*x + b*y*i*i”. But “i*i” equals minus one, so this is the same as “a*x + a*y*i + b*i*x – b*y”. Move the real components together, and move the imaginary components together, and we have “(a*x – b*y) + (a*y + b*x)*i”.
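If you don’t trust my algebra, a few lines of Python will check the formula against the language’s built-in complex multiplication. The particular numbers here are arbitrary; any would do.

```python
# Checking the hand-derived product (a*x - b*y) + (a*y + b*x)*i
# against Python's built-in complex multiplication.
a, b = 2.0, 3.0   # the number a + b*i
x, y = -1.0, 4.0  # the number x + y*i

by_formula = complex(a * x - b * y, a * y + b * x)
by_builtin = complex(a, b) * complex(x, y)

print(by_formula, by_builtin)  # (-14+5j) (-14+5j)
assert by_formula == by_builtin
```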

That’s the most common way of writing out complex numbers. It’s so common that Eric W Weisstein’s Mathworld encyclopedia even says that’s what complex numbers are. But it isn’t the only way to construct, or look at, complex numbers. A common alternate way to look at complex numbers is to match a complex number to a point on the plane, or if you prefer, a point in the set R2.

It’s surprisingly natural to think of the real component as how far to the right or left of an origin your complex number is, and to think of the imaginary component as how far above or below the origin it is. Much complex-number work makes sense if you think of complex numbers as points in space, or directions in space. The language of vectors trips us up only a little bit here. We speak of a complex number as corresponding to a point on the “complex plane”, just as we might speak of a real number as a point on the “(real) number line”.

But there are other descriptions yet. We can represent complex numbers as a pair of numbers with a scheme that looks like polar coordinates. Pick a point on the complex plane. We can say where that is with two pieces of information. The first is the amplitude, or magnitude: how far the point is from the origin. The second is the phase, or angle: draw the line segment connecting the origin and your point. What angle does that make with the positive horizontal axis?

This representation is called the “phasor” representation. It’s tolerably popular in physics and I hear tell of engineers liking it. We represent numbers then not as “x + i*y” but instead as “r * e^(i*θ)”, with r the magnitude and θ the angle. “e” is the base of the natural logarithm, which you get very comfortable with if you do much mathematics or physics. And “i” is just what we’ve been talking about here. This is a pretty natural way to write about complex numbers that represent stuff that oscillates, such as alternating current or the probability function in quantum mechanics. A lot of stuff oscillates, if you study it through the right lens. So numbers that look like this keep creeping in, and into unexpected places. It’s quite easy to multiply numbers in phasor form (just multiply the magnitude parts and add the angle parts), although addition and subtraction become a pain.
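Python’s standard cmath module converts between the two forms, if you’d like to play with phasors yourself. The sample numbers here are arbitrary.

```python
import cmath

# Converting between x + i*y form and phasor (magnitude, angle) form.
z = 3 + 4j
r, theta = cmath.polar(z)    # magnitude 5.0, angle about 0.927 radians
print(cmath.rect(r, theta))  # back to (roughly) (3+4j)

# Multiplying in phasor form: multiply magnitudes, add angles.
w_r, w_theta = cmath.polar(1 + 1j)
product = cmath.rect(r * w_r, theta + w_theta)
print(product, z * (1 + 1j))  # the two agree, up to rounding
```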

Mathematicians generally use the letter “z” to represent a complex-valued number whose identity is not known. As best I can tell, this is because we do think so much of a complex number as the sum “x + y*i”. So if we used familiar old “x” for an unknown number, it would carry the connotations of “the real component of our complex-valued number” and mislead the unwary mathematician. The connection is so common that a mathematician might carelessly switch between “z” and the real and imaginary components “x” and “y” without specifying that “z” is another way of writing “x + y*i”. A good copy editor or an alert student should catch this.

Complex numbers work very much like real numbers do. They add and multiply in natural-looking ways, and you can do subtraction and division just as well. You can take exponentials, and can define all the familiar functions (sines and cosines, square roots and logarithms, integrals and derivatives) on them just as well as you can with real numbers. And you can embed the real numbers within the complex numbers: if you have a real number x, you can match that perfectly with the complex number “x + 0*i”.
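Here’s a quick taste of that, using Python’s standard cmath module on an arbitrary sample point:

```python
import cmath

# The familiar functions extend to complex-valued arguments.
z = 1 + 1j
print(cmath.sqrt(z))
print(cmath.sin(z))
print(cmath.log(z))

# Embedding the reals: the real number x matches x + 0*i.
x = 2.5
print(complex(x, 0) == x)  # True
```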

But that doesn’t mean complex numbers are exactly like the real numbers. For example, it’s possible to order the real numbers. You can say that the number “a” is less than the number “b”, and have that mean something. That’s not possible to do with complex numbers. You can’t say that “a + b*i” is less than, or greater than, “x + y*i” in any way that’s consistent with the arithmetic. You can say the magnitude of one complex-valued number is greater than the magnitude of another. But the magnitudes are real numbers. For all that complex numbers give us, there are things they’re not good for.

The Set Tour, Part 3: R^n


After talking about the real numbers last time, I had two obvious sets to use as follow up. Of course I’d overthink the choice of which to make my next common domain-and-range set.

Rn

Rn is pronounced “are enn”, just as you might do if you didn’t know enough mathematics to think the superscript meant something important. It does mean something important; it’s just that there’s no graceful way to say offhand what it means. This is the set of n-tuples of real numbers. That is, anything you pick out of Rn is an ordered set of things, all of which are themselves real numbers. The “n” here is the name for some whole number whose value isn’t going to change during the length of this problem.

So when we speak of Rn we are really speaking of a family of sets, all of them similar in some important ways. The things in R2 look like pairs of real numbers: (3, 4), or (4π, -2e), or (2038, 0.010010001), pairs like that. The things in R3 are triplets of real numbers: (3, 4, 5), or (4π, -2e, 1 + 1/π). The things in R4 are quartets of real numbers: (3, 4, 5, 12) or (4π, -2e, 1 + 1/π, -6) or so. The things in R10 are probably clear enough to not need listing.

It’s possible to add together two things in Rn. At least if they come from the same Rn; you can’t add a pair of numbers to a quartet of numbers, not if you’re being honest. The addition rule is just what you’d come up with if you didn’t know enough mathematics to be devious, though: add the first number of the first thing to the first number of the second thing, and that’s the first number of the sum. Add the second number of the first thing to the second number of the second thing, and that’s the second number of the sum. Add the third number of the first thing to the third number of the second thing, and that’s the third number of the sum. Keep on like this until you run out of numbers in each thing. It’s possible you already have.

You can’t multiply together two things in Rn, though, unless your n is 1. (There may be some conceptual difference between R1 and plain old R. But I don’t recall seeing a mathematician being interested in the difference except when she’s studying the philosophy of mathematics.) The obvious multiplication scheme, multiplying matching numbers the way you do with addition, produces something that doesn’t work enough like multiplication to be interesting. It’s possible for some n’s to work out schemes that act enough like multiplication to be interesting (for n of 2 we get, in effect, the complex numbers; for n of 4, the quaternions), but for the most part we don’t need them.

What we will do, though, is multiply something in Rn by a single real number. That real number is called a “scalar”. You do the multiplication, again, like you’d do if you were too new to mathematics to be clever. Multiply the first number in your thing by the scalar, and that’s the first number in your product. Multiply the second number in your thing by the scalar, and that’s the second number in your product. Multiply the third number in your thing by the scalar, and that’s the third number in your product. Carry on like this until you run out of numbers, and then stop. Usually good advice.
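Both operations fit in a few lines of Python, if code reads more easily than prose for you. The tuples stand in for things in R3; the names are mine.

```python
# Componentwise addition and scalar multiplication in R^3.
u = (3.0, 4.0, 5.0)
v = (4.0, -2.0, 1.0)

vector_sum = tuple(a + b for a, b in zip(u, v))  # (7.0, 2.0, 6.0)
scaled = tuple(2.5 * a for a in u)               # (7.5, 10.0, 12.5)

print(vector_sum, scaled)
```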

That you can add together two things from Rn, and you can multiply anything in Rn by a scalar, makes this a “vector space”. (There are some more requirements, but they amount to addition and multiplication working like you’d expect.) The term means about what you think; a “space” is a … well … something that acts mathematically like ordinary everyday space works. A “vector space” is a space where the things inside it are vectors. Vectors are a combination of a direction and a distance in that direction. They’re very well-represented as n-tuples. They get represented as n-tuples so often it’s easy to forget that’s just a convenient way to write them down.

This vector space property of Rn makes it a really useful set. R2 corresponds naturally to “the points on a flat surface”. R3 corresponds naturally to an idea of “all the points in normal everyday space where something could be”. Or, if you like, it can represent “the speed and direction something is travelling in”. Or the direction and amount of its acceleration, for that matter.

Because of these, mathematicians will often call Rn the “n-dimensional Euclidean space”. The n is about how many components there are in an element of the set. The “space” tells us it’s a space. “Euclidean” tells us that it looks and works like, well, Euclidean geometry. We can talk about the distance between points and use the ideas we had from plane or solid geometry. We can talk about angles and areas and volumes similarly. We can do this so much we might say “n-dimensional space” as if there weren’t anything but Euclidean spaces out there.

And this is useful for more than describing where something happens to be. A great number of physics problems find it convenient to study the position and the velocity of a number of particles which interact. If we have N particles, then, and we’re in a three-dimensional space, and we’re keeping track of positions and velocities for each of them, then we can describe where everything is and how everything is moving as one element in the space R6N. We can describe movement in time as a function that has a domain of R6N and a range of R6N, and see the progression of time as tracing out a path in that space.

We can’t draw that, obviously, and I’d look skeptically at people who say they can visualize it. What we usually draw is a little enclosed space that’s either a rectangle or a blob, and draw out lines — “trajectories” — inside that. The different spots along the trajectory correspond to all the positions and velocities of all the particles in the system at different times.
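To suggest the flavor, here’s a toy sketch in Python. Everything in it is invented for illustration: two particles, no forces at all, and one crude march forward in time. The point is only that the whole system’s state is a single element of R6N.

```python
N = 2  # two particles in three dimensions: the state lives in R^12

# Positions for each particle first, then velocities.
state = [0.0] * (3 * N) + [1.0] * (3 * N)

def step(state, dt=0.01):
    """One crude time step: a map from R^(6N) back into R^(6N)."""
    positions = state[:3 * N]
    velocities = state[3 * N:]
    new_positions = [p + dt * v for p, v in zip(positions, velocities)]
    return new_positions + velocities  # no forces, so velocities stay put

state = step(state)
print(state[:3])  # the first particle has drifted: [0.01, 0.01, 0.01]
```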

Though that’s a fantastic use, it’s not the only one. It’s not required, for example, that a function have the same Rn as both domain and range. It can have different sets. If we want to be clear that the domain and range can be of different sizes, it’s common to call one Rn and the other Rm if we aren’t interested in pinning down just which spaces they are.

But, for example, a perfectly legitimate function would have a domain of R3 and a range of R1, the reals. There’s even an obvious, common one: return the size, the magnitude, of whatever the vector in the domain is. Or we might take as domain R4, and the range R2, following the rule “match an element in the domain to an element in the range that has the same first and third components”. That kind of function is called a “projection”, as it gives what might look like the shadow of the original thing in a smaller space.
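Both functions are quick to write out in Python, which may make the domains and ranges easier to see. The names are my own:

```python
import math

# From R^3 to R: the magnitude of a vector.
def magnitude(v):
    return math.sqrt(sum(a * a for a in v))

# From R^4 to R^2: keep the first and third components.
def project(v):
    return (v[0], v[2])

print(magnitude((3.0, 4.0, 12.0)))     # 13.0
print(project((3.0, 4.0, 5.0, 12.0)))  # (3.0, 5.0)
```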

If we wanted to go the other way, from R2 to R4 as an example, we could. Here we set the rule “match an element in the domain to an element in the range which has the same first and second components, and has ‘3’ and ‘4’ as the third and fourth components”. That’s an “embedding”, giving us the idea that we can put a Euclidean space with fewer dimensions into a space with more. The idea comes naturally to anyone who’s seen a cartoon where a character leaps off the screen and interacts with the real world.
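And the embedding, following that same rule:

```python
# From R^2 to R^4: keep both components, pad with 3 and 4.
def embed(v):
    return (v[0], v[1], 3.0, 4.0)

print(embed((7.0, -2.0)))  # (7.0, -2.0, 3.0, 4.0)
```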

The Set Tour, Stage 2: The Real Star


For the second of my little tour of sets that get commonly used as domains and ranges I want to name the most common of them all.

R

This is the real numbers. In text that’s written with a bold R. By hand, and often in text, it’s written as a capital R with a double stroke for the main vertical line. That’s an easy-to-write way to distinguish it from a plain old civilian R. The double-vertical-stroke convention is used for many of the most common sets of numbers. It gets used for letters like I and J (the integers), or N (the counting numbers). A vertical stroke will even get added to symbols that technically don’t have any vertical strokes, like Q (the rational numbers). There it’s just put inside the loop, on the left side, far enough from the edge that the reader can notice the vertical stroke is there.

R is a big one. It’s not just a big set. It’s also a popular one. It may as well be the default domain and range. If someone fails to tell you what either set is, you can suppose she meant R and be only rarely wrong. The real numbers are familiar and popular and it feels like we know what they are. It’s a bit tricky to define them exactly, though, and you’ll notice that I’m not doing that. You know what I mean, though. It’s whole numbers, and rational numbers, and irrational numbers like the square root of pi, and for that matter pi, and a whole bunch of other boring numbers nobody looks at. Let’s leave it at that.

All the intervals I talked about last time are subsets of R. If we really wanted to, we could turn a function with domain an interval like [0, 1] into a function with a domain of R. That’s a kind of “embedding”. Let me call the function with domain [0, 1] by the name “f”. I’ll then define g, on the domain R, by the rule “whatever f(x) is, if x is from 0 to 1; and some other, harmless value, if x isn’t”. Probably the harmless value is zero. Sometimes we need to change the domain a function’s defined on, and this is a way to do it.
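In code the change of domain is a one-line wrapper. Here’s a sketch, with a rule for f made up just so the thing runs:

```python
# Extend a function f, defined on [0, 1], to all of R,
# using zero as the harmless value everywhere else.
def f(x):
    return x * x  # any rule on [0, 1] would do

def g(x):
    return f(x) if 0.0 <= x <= 1.0 else 0.0

print(g(0.5), g(3.0))  # 0.25 0.0
```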

If we only want to talk about the positive real numbers we can denote that by putting a plus sign in superscript: R+. If we only want the negative numbers we put in a minus sign: R-. Do either of these include zero? My heart tells me neither should, but I wouldn’t be surprised if in practice either did, because zero is often useful to have around. To be careful we might explicitly include zero, using the notations of set theory. Then we might write \textbf{R}^+ \cup \left\{0\right\}.

Sometimes the rule for a function doesn’t make sense for some values. For example, if a function has the rule f: x \mapsto 1 / (x - 1) then you can’t work out a value for f(1). That would require dividing by zero and we dare not do that. A careful mathematician would say the domain of that function f is all the real numbers R except for the number 1. This exclusion gets written as “R \ {1}”. The backslash means “except the numbers in the following set”. It might be a single number, such as in this example. It might be a lot of numbers. The function g: x \mapsto \log\left(1 - x\right) is meaningless for any x that’s equal to or greater than 1. We could write its domain then as “R \ { x: x ≥ 1 }”.
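A careful programmer faces the same worry as a careful mathematician. Here’s one way a Python sketch might enforce those domains explicitly, rather than letting the arithmetic blow up on its own:

```python
import math

# f has domain R \ {1}: rule out the division by zero explicitly.
def f(x):
    if x == 1:
        raise ValueError("1 is not in the domain of f")
    return 1.0 / (x - 1.0)

# g has domain R \ {x : x >= 1}: the logarithm needs 1 - x positive.
def g(x):
    if x >= 1:
        raise ValueError("the domain of g excludes x >= 1")
    return math.log(1.0 - x)

print(f(3.0), g(0.0))  # 0.5 0.0
```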

That’s if we’re being careful. If we get a little careless, or if we’re writing casually, or if the set of non-permitted points is complicated, we might omit that. Mathematical writing includes an assumption of good faith. The author is supposed to be trying to say something interesting and true. The reader is expected to be skeptical but not quarrelsome. Spotting a flaw in the argument because the domain doesn’t explicitly rule out a few points it ought to exclude is tedious. Finding that the interesting thing only holds true for values that are implicitly outside the domain is serious.

The set of real numbers is a group; it has an operation that works like addition. We call it addition. For that matter, it’s a ring. It has an operation that works like multiplication. We call it multiplication. And it’s even more than a ring. Everything in R except for the additive identity — 0, the number you can add to anything without changing what the thing is — has a multiplicative inverse. That is, any number except zero has some number you can multiply it by to get 1. This property makes it a “field”, to people who study (abstract) algebra. This “field” hasn’t got anything to do with gravitational or electrical or baseball or magnetic fields. But the overlap in names does serve to sometimes confuse people.

But having this multiplicative inverse means that we can do something that operates like division. Divide one thing by a second by taking the first thing and multiplying it by the second thing’s multiplicative inverse. We call this division-like operation “division”.
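That sounds fancier than it is. A couple lines of Python show division and multiplication-by-the-inverse agreeing:

```python
# Division as multiplication by the multiplicative inverse.
a, b = 3.0, 8.0
inverse_of_b = 1.0 / b  # every real number except zero has one

print(a * inverse_of_b, a / b)  # 0.375 0.375
```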

It’s not coincidence that the algebraic “addition” and “multiplication” and “division” operations are the ones we call addition and multiplication and division. What makes abstract algebra abstract is that it’s the study of things that work kind of like the real numbers do. The operations we can do on the real numbers inspire us to look for other sets that can let us do similar things.

The Set Tour, Stage 1: Intervals


I keep writing about functions. I’m not exactly sure how. I keep meaning to get to other things but find interesting stuff to say about domains and ranges and the like. These domains and ranges have to be sets. There are some sets that come up all the time in domains and ranges. I thought I’d share some of the common ones. The first family of sets is known as “intervals”.

[ 0, 1 ]

This means all the real numbers from 0 to 1. Writing it with straight brackets like that means the endpoints are included: that is, 0 is in the domain, and so is 1. We don’t always want to include the endpoints in a domain; if we want to omit them, we write parentheses instead. So (0, 1) would mean all the real numbers bigger than zero and smaller than one, but neither zero nor one. We can also include one endpoint but not the other: [0, 1) is fine. It offends copy editors by having its open bracket and its closing parenthesis unmatched, but its meaning is clear enough. It’s the real numbers from zero to one, with zero allowed but one ruled out. We can include or omit either or both endpoints, and we have to keep that straight. But for most of our work it doesn’t matter what we choose, as long as we stay consistent. It changes proofs a bit, but in routine ways.

Zero to one is a popular interval. Negative 1 to 1 is another popular interval. They’re nice round numbers. And the intervals between -π and π, or between 0 and 2π, are also popular. Those match nicely with trigonometric functions such as sine and tangent. You can make an interval that runs from any number to any other number, [ a, b ] if you don’t want to pin down just what numbers you mean. But [ 0, 1 ] and [ -1, 1 ] are popular choices among mathematicians. If you can prove something interesting about a function with a domain that’s either of these intervals, you can then normally prove it’s true for a function whose domain is whatever other interval you want. (I can’t think of an exception offhand, but mathematics is vast and my statement sweeping. There may be trouble.)

Suppose we start out with a function named f that has as its domain the interval [ -16, 48 ]. I’m not saying anything about its range or its rule because they don’t matter. You can make them anything you like, if you need them to be anything. (This is exactly the way we might, in high school algebra, talk about the number named “x” without ever caring whether we find out what number that actually is.) But from this start we can talk about a related function named g, which has as its domain [ -1, 1 ]. A rule for g is g: x \mapsto f\left(32\cdot x + 16\right) ; that is, g(0) is whatever number you get from f(16), for example. So if we can prove something’s true about g on this standard domain, we can almost certainly show the same thing is true about f on the other domain. This is considered a kind of mapping, or a composition of functions. It’s a composition because you can see it as taking your original number called x, seeing what one function does with it (in this case, multiplying it by 32 and adding 16), and then seeing what a second function (the one named f) does with that.

It’s easy to go the other way around. If we know what g is on the domain [ -1, 1 ] then we can define a related function f on the domain [ -16, 48 ]. We can say f: x \mapsto g\left(\frac{1}{32}\cdot\left(x - 16\right)\right). And again, if we can show something’s true on this standard domain, we can almost certainly show the same thing’s true on the other domain.
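Here are both directions of that construction in Python. The rule I give g is arbitrary, just something for the code to chew on:

```python
# g lives on [-1, 1]; f, built from it, lives on [-16, 48].
def g(t):
    return t * t  # any rule on [-1, 1] would do

def f(x):
    return g((x - 16.0) / 32.0)

print(f(16.0), g(0.0))  # 0.0 0.0: f(16) matches g(0)
print(f(48.0), g(1.0))  # 1.0 1.0: f(48) matches g(1)
```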

I gave examples, mappings, that are simple. They’re linear. They don’t have to be. We just have to match everything in one interval to everything in another. For example, we can match the domain (1, ∞) — all the numbers bigger than 1 — to the domain (0, 1). Let’s again call f the function with domain (1, ∞). Then we can say g is the function with domain (0, 1) and defined by the rule g: x \mapsto f\left(\frac{1}{x}\right) . That’s a nonlinear mapping.
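The nonlinear mapping is just as short, again with an arbitrary rule standing in for f:

```python
# f lives on (1, infinity); g, built from it, lives on (0, 1).
def f(x):
    return x + 1.0  # any rule on (1, infinity) would do

def g(x):
    return f(1.0 / x)

print(g(0.5))  # f(2.0), which is 3.0
```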

Linear mappings are easier to deal with than nonlinear mappings. Usually, mathematically, if something is divided into “linear” and “nonlinear” the linear version is easier. Sometimes a nonlinear mapping is the best one to use to match a function on some convenient domain to a function on some other one. The hard part is often a matter of showing that something true for one function on a common domain like (0, 1) will also be true for the other domain, (a, b).

However, showing that the truth holds can often be done without knowing much about your specific function. You maybe need to know what kind of function it is, that it’s continuous or bounded or something like that. But the actual specific rule? Not so important. You can prove that the truth holds ahead of time. Or find an analysis textbook or paper or something where someone else has proven that. So while a domain might be any interval, often in practice you don’t need to work with more than a couple nice familiar ones.