## The End 2016 Mathematics A To Z: Image

It’s another free-choice entry. I’ve got something that I can use to make my Friday easier.

## Image.

So remember a while back I talked about what functions are? I described them the way modern mathematicians like. A function’s got three components to it. One is a set of things called the domain. Another is a set of things called the range. And there’s some rule linking things in the domain to things in the range. In shorthand we’ll write something like “f(x) = y”, where we know that x is in the domain and y is in the range. In a slightly more advanced mathematics class we’ll write $f: x \mapsto y$. That maybe looks a little more computer-y. But I bet you can read that already: “f matches x to y”. Or maybe “f maps x to y”.

We have a couple ways to think about what ‘y’ is here. One is to say that ‘y’ is the image of ‘x’, under ‘f’. The language evokes camera trickery, or at least the way a trick lens might make us see something different. Pretend that the domain is something you could gaze at. If the domain is, say, some part of the real line, or a two-dimensional plane, or the like that’s not too hard to do. Then we can think of the rule part of ‘f’ as some distorting filter. When we look to where ‘x’ would be, we see the thing in the range we know as ‘y’.

At this point you probably imagine this is a pointless word to have. And that it’s backed up by a useless analogy. So it is. As far as I’ve gone this addresses a problem we don’t need to solve. If we want “the thing f matches x to” we can just say “f(x)”. Well, we write “f(x)”. We say “f of x”. Maybe “f at x”, or “f evaluated at x” if we want to emphasize ‘f’ more than ‘x’ or ‘f(x)’.

Where it gets useful is that we start looking at subsets. Bunches of points, not just one. Call ‘D’ some interesting-looking subset of the domain. What would it mean if we wrote the expression ‘f(D)’? Could we make that meaningful?

We do mean something by it. We mean what you might imagine by it. If you haven’t thought about what ‘f(D)’ might mean, take a moment — a short moment — and guess what it might. Don’t overthink it and you’ll have it right. I’ll put the answer just after this little bit so you can ponder.

So. ‘f(D)’ is a set. We make that set by taking, in turn, every single thing that’s in ‘D’. And finding everything in the range that’s matched by ‘f’ to those things in ‘D’. Collect them all together. This set, ‘f(D)’, is “the image of D under f”.
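For finite sets, that definition translates almost word for word into code. This is a minimal sketch; the names `f` and `D` follow the text, and the particular rule and subset are invented for illustration:

```python
def image(f, D):
    """Return f(D): everything in the range matched by f to some x in D."""
    return {f(x) for x in D}

f = lambda x: x * x          # the rule: square each number
D = {-2, -1, 0, 1, 2}        # an interesting-looking subset of the domain

print(image(f, D))           # {0, 1, 4} -- note that -2 and 2 collapse together
```

Notice the image can be smaller than D, since several things in the domain may be matched to the same thing in the range.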

We use images a lot when we’re studying how functions work. A function that maps a simple lump into a simple lump of about the same size is one thing. A function that maps a simple lump into a cloud of disparate particles is a very different thing. A function that describes how physical systems evolve will preserve the volume and some other properties of these lumps of space. But it might stretch out and twist around that space, which is how we discovered chaos.

Properly speaking, the range of a function ‘f’ is just the image of the whole domain under that ‘f’. But we’re not usually that careful about defining ranges. We’ll say something like ‘the domain and range are the sets of real numbers’ even though we only need the positive real numbers in the range. Well, it’s not like we’re paying for unnecessary range. Let me call the whole domain ‘X’, because I went and used ‘D’ earlier. Then the range, let me call that ‘Y’, would be ‘Y = f(X)’.

Images will turn up again. They’re a handy way to let us get at some useful ideas.

## A Leap Day 2016 Mathematics A To Z: Surjective Map

Gaurish today gives me one more request for the Leap Day Mathematics A To Z. And it lets me step away from abstract algebra again, into the world of analysis and what makes functions work. It also hovers around some of my past talk about functions.

## Surjective Map.

This request echoes one of the first terms from my Summer 2015 Mathematics A To Z. Then I’d spent some time on a bijection, or a bijective map. A surjective map is a less complicated concept. But if you understood bijective maps, you picked up surjective maps along the way.

By “map”, in this context, mathematicians don’t mean those diagrams that tell you where things are and how you might get there. Of course we don’t. By a “map” we mean that we have some rule that matches things in one set to things in another. If this sounds to you like what I’ve claimed a function is then you have a good ear. A mapping and a function are pretty much different names for one another. If there’s a difference in connotation I suppose it’s that a “mapping” makes a weaker suggestion that we’re necessarily talking about numbers.

(In some areas of mathematics, a mapping means a function with some extra properties, often some kind of continuity. Don’t worry about that. Someone will tell you when you’re doing mathematics deep enough to need this care. Mind, that person will tell you by way of a snarky follow-up comment picking on some minor point. It’s nothing personal. They just want you to appreciate that they’re very smart.)

So a function, or a mapping, has three parts. One is a set called the domain. One is a set called the range. And then there’s a rule matching things in the domain to things in the range. With functions we’re so used to the domain and range being the real numbers that we often forget to mention those parts. We go on thinking “the function” is just “the rule”. But the function is all three of these pieces.

A function has to match everything in the domain to something in the range. That’s by definition. There are no unused scraps in the domain. If it looks like there are, that’s because we’re being sloppy in defining the domain. Or let’s be charitable. We assumed the reader understands the domain is only the set of things that make sense. And things make sense by being matched to something in the range.

Ah, but now, the range. The range could have unused bits in it. There’s nothing that inherently limits the range to “things matched by the rule to some thing in the domain”.

By now, then, you’ve probably spotted there have to be two kinds of functions. There are ones in which the whole range is used, and ones in which it’s not. Good eye. This is exactly so.

If a function only uses part of the range, if it leaves out anything, even if it’s just a single value out of infinitely many, then the function is called an “into” mapping. If you like, it takes the domain and stuffs it into the range without filling the range.

Ah, but if a function uses every scrap of the range, with nothing left out, then we have an “onto” mapping. The whole of the domain gets sent onto the whole of the range. And this is also known as a “surjective” mapping. We get the term “surjective” from Nicolas Bourbaki. Bourbaki is/was the renowned 20th century mathematics art-collective group which did so much to place rigor and intuition-free bases into mathematics.

The term pairs up with the “injective” mapping. In this, each element in the range matches up with at most one thing in the domain. So if you know the function’s rule, then if you know a thing in the range that’s matched at all, you also know the one and only thing in the domain matched to it. If you don’t feel very French, you might call this sort of function one-to-one. That might be a better name for saying why this kind of function is interesting.

Not every function is injective. But then not every function is surjective either. But if a function is both injective and surjective — if it’s both one-to-one and onto — then we have a bijection. It’s a mapping that can represent the way a system changes and that we know how to undo. That’s pretty comforting stuff.
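When the domain and range are finite sets, all these distinctions can be checked by brute force. A minimal sketch, with the rule written as a Python dict and every name here invented for illustration:

```python
def is_surjective(rule, range_set):
    """Onto: every element of the range is hit by something in the domain."""
    return set(rule.values()) == set(range_set)

def is_injective(rule):
    """One-to-one: no two domain elements map to the same range element."""
    values = list(rule.values())
    return len(values) == len(set(values))

def is_bijective(rule, range_set):
    """Both onto and one-to-one."""
    return is_surjective(rule, range_set) and is_injective(rule)

range_set = {'a', 'b', 'c'}

onto_map = {1: 'a', 2: 'b', 3: 'c'}   # uses every scrap of the range
into_map = {1: 'a', 2: 'a', 3: 'b'}   # leaves 'c' unused

print(is_bijective(onto_map, range_set))   # True
print(is_surjective(into_map, range_set))  # False: an "into" mapping
```

The dict-of-matches picture also makes the "no unused scraps in the domain" rule automatic: every key has a value.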

If we use a mapping to describe how a process changes a system, then knowing it’s an into map, one that is not surjective, tells us something about the process. It tells us the process makes the system settle into a proper subset of all the possible states. That doesn’t mean the thing is stable — that little jolts get worn down. And it doesn’t mean that the thing is settling to a fixed state. But it is a piece of information suggesting that’s possible. This may not seem like a strong conclusion. But considering how little we know about the function it’s impressive to be able to say that much.

## The Set Tour, Part 10: Lots of Spheres

The next exhibit on the Set Tour here builds on a couple of the previous ones. First is the set Sn, that is, the surface of a hypersphere in n+1 dimensions. Second is Bn, the ball — the interior — of a hypersphere in n dimensions. Yeah, it bugs me too that Sn isn’t the surface of Bn. But it’d be too much work to change things now. The third has lurked implicitly since all the way back to Rn, a set of n real numbers for which the ordering of the numbers matters. (That is, that the set of numbers 2, 3 probably means something different than the set 3, 2.) And fourth is a bit of writing we picked up with matrices. The selection is also dubiously relevant to my own thesis from back in the day.

## Sn x m and Bn x m

Here ‘n’ and ‘m’ are whole numbers, and I’m not saying which ones because I don’t need to tie myself down. Just as with Rn and with matrices this is a whole family of sets. Each different pair of n and m gives us a different set Sn x m or Bn x m, but they’ll all look quite similar.

The multiplication symbol here is a kind of multiplication, just as it was in matrices. That kind is called a “direct product”. What we mean by Sn x m is that we have a collection of items. We have the number m of them. Each one of those items is in Sn. That’s the surface of the hypersphere in n+1 dimensions. And we want to keep track of the order of things; we can’t swap items around and suppose they mean the same thing.

So suppose I write S2 x 7. This is an ordered collection of seven items, every one of which is on the surface of a three-dimensional sphere. That is, it’s the location of seven spots on the surface of the Earth. S2 x 8 offers similar prospects for talking about the location of eight spots.

With that written out, you should have a guess what Bn x m means. Your guess is correct. It’s a collection of m things, each of them within the interior of the n-dimensional ball.
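A point of S2 x m can be sketched as an ordered tuple of m unit vectors. Here I use the standard trick of normalizing Gaussian samples to land uniformly on the sphere; the function names are invented for illustration, not anything from the text:

```python
import math
import random

def random_point_on_S2():
    """One point on the unit sphere: a 3-vector of length 1."""
    while True:
        x, y, z = (random.gauss(0, 1) for _ in range(3))
        r = math.sqrt(x * x + y * y + z * z)
        if r > 1e-12:                    # avoid the (wildly unlikely) zero vector
            return (x / r, y / r, z / r)

def random_point_in_S2_x_m(m):
    """An ordered tuple of m sphere points -- one element of S2 x m."""
    return tuple(random_point_on_S2() for _ in range(m))

config = random_point_in_S2_x_m(7)       # like seven spots on the Earth
print(len(config))                       # 7
print(abs(sum(c * c for c in config[0]) - 1.0) < 1e-9)  # True: unit length
```

The tuple keeps the ordering, which is the whole point of the direct product: spot number one is not interchangeable with spot number five.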

Now the dubious relevance to my thesis. My problem was modeling a specific layer of planetary atmospheres. The model used for this was to pretend the atmosphere was made up of some large number of vortices, of whirlpools. Just like the ones you see when you slide your hand through the water and watch the little whirlpools behind it. The winds could be worked out as the sum of the winds produced by all these little vortices.

In the model, each of these vortices was confined to a single distance from the center of the planet. That’s close enough to true for planetary atmospheres. A layer in the atmosphere is not thick at all, compared to the planet. So every one of these vortices could be represented as a point in S2, the surface of a three-dimensional sphere. There would be some large number of these points. Most of my work used a nice round 256 points. So my model of a planetary atmosphere represented the system as a point in the domain S2 x 256. I was particularly interested in the energy of this set of 256 vortices. That was a function which had, as its domain, S2 x 256, and as range, the real numbers R.

But the connection to my actual work is dubious. I was doing numerical work, for the most part. I don’t think my advisor or I ever wrote S2 x 256 or anything like that when working out what I ought to do, much less what I actually did. Had I done a more analytic thesis I’d surely have needed to name this set. But I didn’t. It was lurking there behind my work nevertheless.

The energy of this system of vortices looked a lot like the potential energy for a bunch of planets attracting each other gravitationally, or like point charges repelling each other electrically. We work it out by looking at each pair of vortices. Work out the potential energy of those two vortices being that strong and that far apart. We call that a pairwise interaction. Then add up all the pairwise interactions. That’s it. [1] The pairwise interaction is stronger as each vortex is stronger; it gets weaker as the vortices get farther apart.

In gravity or electricity problems the strength falls off as the reciprocal of the distance between points. In vortices, the strength falls off as minus one times the logarithm of the distance between points. That’s a difference, and it meant that a lot of analytical results known for electric charges didn’t apply to my problem exactly. That was all right. I didn’t need many. But it does mean that I was fibbing up above, when I said I was working with S2 x 256. Pause a moment. Do you see what the fib was?
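That energy sum can be sketched in a few lines, using the minus-the-logarithm kernel just described. The positions and strengths below are illustrative toys, not anything from the actual thesis:

```python
import math

def pairwise_energy(positions, strengths):
    """Sum, over every pair, of -strength_j * strength_k * log(distance)."""
    energy = 0.0
    n = len(positions)
    for j in range(n):
        for k in range(j + 1, n):        # each pair counted exactly once
            d = math.dist(positions[j], positions[k])
            energy += -strengths[j] * strengths[k] * math.log(d)
    return energy

# Three vortices on the unit sphere: the two poles and a point on the equator.
positions = [(0, 0, 1), (0, 0, -1), (1, 0, 0)]
strengths = [1.0, 1.0, 1.0]
print(pairwise_energy(positions, strengths))  # -2 * log(2), about -1.386
```

Note the `math.log(d)` blows up as d goes to zero, which is exactly the trouble with overlapping vortices discussed below the footnote.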

I’ll put what would otherwise be a footnote here so folks have a harder time reading right through to the answer.

[1] Physics majors may be saying something like: “wait, I see how this would be the potential energy of these 256 vortices, but where’s the kinetic energy?” The answer is, there is none. It’s all potential energy. The dynamics of point vortices are weird. I didn’t have enough grounding in mechanics when I went into them.

That’s all to the footnote.

Here’s where the fib comes in. If I’m really picking sets of vortices from all of the set S2 x 256, then, can two of them be in the exact same place? Sure they can. Why couldn’t they? For precedent, consider R3. In the three-dimensional vectors I can have the first and third numbers “overlap” and have the same value: (1, 2, 1) is a perfectly good vector. Why would that be different for an ordered set of points on the surface of the sphere? Why can’t vortex 1 and vortex 3 happen to have the same value in S2?

The problem is if two vortices were in the exact same position then the energy would be infinitely large. That’s not unique to vortices. It would be true for masses and gravity, or electric charges, if they were brought perfectly on top of each other. Infinitely large energies are a problem. We really don’t want to deal with them.

We could deal with this by pretending it doesn’t happen. Imagine if you dropped 256 poker chips across the whole surface of the Earth. Would you expect any two to be on top of each other? Would you expect two to be exactly and perfectly on top of each other, neither one even slightly overhanging the other? That’s so unlikely you could safely ignore it, for the same reason you could ignore the chance you’ll toss a coin and have it come up tails 56 times in a row.

And if you were interested in modeling the vortices moving it would be incredibly unlikely to have one vortex collide with another. They’d circle around each other, very fast, almost certainly. So ignoring the problem is defensible in this case.

Or we could be proper and responsible and say, “no overlaps” and “no collisions”. We would define some set that represents “all the possible overlaps and arrangements that give us a collision”. Then we’d say we’re looking at S2 x 256 except for those. I don’t think there’s a standard convention for “all the possible overlaps and collisions”, but Ω is a reasonable choice. Then our domain would be S2 x 256 \ Ω. The backslash means “except for the stuff after this”. This might seem unsatisfying. We don’t explicitly say what combinations we’re excluding. But go ahead and try listing all the combinations that would produce trouble. Try something simple, like S2 x 4. This is why we hide all the complicated stuff under a couple ordinary sentences.

It’s not hard to describe “no overlaps” mathematically. (You would say something like “vortex number j and vortex number k are not at the same position”, with maybe a rider of “unless j and k are the same number”. Or you’d put it in symbols that mean the same thing.) “No collisions” is harder. For gravity or electric charge problems we can describe at least some of them. And I realize now I’m not sure if there is an easy way to describe vortices that collide. I have difficulty imagining how they might, since vortices that are close to one another are pushing each other sideways quite intently. I don’t think that I can say they can’t, though. Not without more thought.
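The "no overlaps" condition, at least, is as easy in code as in words. A minimal sketch, using exact equality of positions; a real numerical code would compare within some tolerance, and all names here are invented for illustration:

```python
def has_overlap(positions):
    """True if any two distinct vortices share the exact same position."""
    return len(set(positions)) < len(positions)

ok  = ((0, 0, 1), (0, 0, -1), (1, 0, 0))
bad = ((0, 0, 1), (0, 0, 1), (1, 0, 0))   # the first two vortices coincide

print(has_overlap(ok))    # False: this configuration avoids Omega
print(has_overlap(bad))   # True: this one is excluded from the domain
```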

## The Set Tour, Stage 1: Intervals

I keep writing about functions. I’m not exactly sure how. I keep meaning to get to other things but find interesting stuff to say about domains and ranges and the like. These domains and ranges have to be sets. There are some sets that come up all the time in domains and ranges. I thought I’d share some of the common ones. The first family of sets is known as “intervals”.

## [ 0, 1 ]

This means all the real numbers from 0 to 1. Written with square brackets like that, it means to include the matching endpoint — that is, 0 is in the domain, and so is 1. We don’t always want to include the endpoints in a domain; if we want to omit them, we write parentheses instead. So (0, 1) would mean all the real numbers bigger than zero and smaller than one, but neither zero nor one. We can also include one endpoint but not the other: [0, 1) is fine. It offends copy editors, by having its opening bracket and its closing parenthesis be unmatched, but its meaning is clear enough. It’s the real numbers from zero to one, with zero allowed but one ruled out. We can include or omit either or both endpoints, and we have to keep that straight. But for most of our work it doesn’t matter what we choose, as long as we stay consistent. It changes proofs a bit, but in routine ways.
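The endpoint bookkeeping can be sketched in code. The function name and the flags here are invented for illustration:

```python
def in_interval(x, a, b, closed_left=True, closed_right=True):
    """Is x in the interval from a to b, honoring the endpoint choices?"""
    left_ok = (a <= x) if closed_left else (a < x)
    right_ok = (x <= b) if closed_right else (x < b)
    return left_ok and right_ok

print(in_interval(0, 0, 1))                      # True:  0 is in [0, 1]
print(in_interval(1, 0, 1, closed_right=False))  # False: 1 is not in [0, 1)
print(in_interval(0.5, 0, 1, False, False))      # True:  0.5 is in (0, 1)
```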

Zero to one is a popular interval. Negative 1 to 1 is another popular interval. They’re nice round numbers. And the intervals between -π and π, or between 0 and 2π, are also popular. Those match nicely with trigonometric functions such as sine and tangent. You can make an interval that runs from any number to any other number — [ a, b ] if you don’t want to pin down just what numbers you mean. But [ 0, 1 ] and [ -1, 1 ] are popular choices among mathematicians. If you can prove something interesting about a function with a domain that’s either of these intervals, you can then normally prove it’s true for a function whose domain is whatever interval you want. (I can’t think of an exception offhand, but mathematics is vast and my statement sweeping. There may be trouble.)

Suppose we start out with a function named f that has as its domain the interval [ -16, 48 ]. I’m not saying anything about its range or its rule because they don’t matter. You can make them anything you like, if you need them to be anything. (This is exactly the way we might, in high school algebra, talk about the number named ‘x’ without ever caring whether we find out what number that actually is.) But from this start we can talk about a related function named g, which has as its domain [ -1, 1 ]. A rule for g is $g: x \mapsto f\left(32\cdot x + 16\right)$; that is, g(0) is whatever number you get from f(16), for example. So if we can prove something’s true about g on this standard domain, we can almost certainly show the same thing is true about f on the other domain. This is considered a kind of mapping, or a composition of functions. It’s a composition because you can see it as taking your original number called x, seeing what one function does with it — in this case, multiplying it by 32 and adding 16 — and then seeing what a second function — the one named f — does with that.

It’s easy to go the other way around. If we know what g is on the domain [ -1, 1 ] then we can define a related function f on the domain [ -16, 48 ]. We can say $f: x \mapsto g\left(\frac{1}{32}\cdot\left(x - 16\right)\right)$. And again, if we can show something’s true on this standard domain, we can almost certainly show the same thing’s true on the other domain.
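Both constructions can be checked directly. Here `f` is an arbitrary stand-in rule, invented for illustration, since the text says the rule doesn’t matter:

```python
def f(x):
    """Any rule at all will do; the domain [-16, 48] is what matters."""
    return x + 100

def g(x):
    """g: x -> f(32*x + 16), so g has domain [-1, 1]."""
    return f(32 * x + 16)

def f_from_g(x):
    """f: x -> g((x - 16)/32), recovering the domain [-16, 48]."""
    return g((x - 16) / 32)

print(g(-1) == f(-16))        # True: the left endpoints line up
print(g(0) == f(16))          # True, as in the text
print(f_from_g(48) == f(48))  # True: the round trip agrees
```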

I gave examples, mappings, that are simple. They’re linear. They don’t have to be. We just have to match everything in one interval to everything in another. For example, we can match the domain (1, ∞) — all the numbers bigger than 1 — to the domain (0, 1). Let’s again call f the function with domain (1, ∞). Then we can say g is the function with domain (0, 1) and defined by the rule $g: x \mapsto f\left(\frac{1}{x}\right)$. That’s a nonlinear mapping.
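That nonlinear matching can also be sketched in code; `f` here is an illustrative stand-in rule on the domain (1, ∞):

```python
def f(x):
    """Some function with domain (1, infinity)."""
    return x * x

def g(x):
    """g: x -> f(1/x), with domain (0, 1)."""
    return f(1 / x)

print(g(0.5) == f(2))    # True: 1/0.5 is 2
print(g(0.25) == f(4))   # True: 1/0.25 is 4
```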

Linear mappings are easier to deal with than nonlinear mappings. Usually, mathematically, if something is divided into “linear” and “nonlinear” the linear version is easier. Sometimes a nonlinear mapping is the best one to use to match a function on some convenient domain to a function on some other one. The hard part is often a matter of showing that something true for one function on a common domain like (0, 1) will also be true for the other domain, (a, b).

However, showing that the truth holds can often be done without knowing much about your specific function. You maybe need to know what kind of function it is, that it’s continuous or bounded or something like that. But the actual specific rule? Not so important. You can prove that the truth holds ahead of time. Or find an analysis textbook or paper or something where someone else has proven that. So while a domain might be any interval, often in practice you don’t need to work with more than a couple nice familiar ones.