My 2019 Mathematics A To Z: Zeno’s Paradoxes

Today’s A To Z term was nominated by Dina Yagodich, who runs a YouTube channel with a host of mathematics topics. Zeno’s Paradoxes exist in the intersection of mathematics and philosophy. Mathematics majors like to declare that they’re all easy. The Ancient Greeks didn’t understand infinite series or infinitesimals like we do. Now they’re no challenge at all. This reflects a belief that philosophers must be silly people who haven’t noticed that one can, say, exit a room.

This is your classic STEM-attitude of missing the point. We may suppose that Zeno of Elea occasionally exited rooms himself. That is a supposition, though. Zeno, like most philosophers who lived before Socrates, we know from other philosophers making fun of him a century after he died. Or at least trying to explain what they thought he was on about. Modern philosophers are expected to present others’ arguments as well and as strongly as possible. This even — especially — when describing an argument they want to say is the stupidest thing they ever heard. Or, to use the lingo, when they wish to refute it. Ancient philosophers had no such compulsion. They did not mind presenting someone else’s argument sketchily, if they supposed everyone already knew it. Or even badly, if they wanted to make the other philosopher sound ridiculous. Between that and the sparse nature of the record, we have to guess a bit about what Zeno precisely said and what he meant. This is all right. We have some idea of things that might reasonably have bothered Zeno.

And they have bothered philosophers for thousands of years. They are about change. The ones I mean to discuss here are particularly about motion. And there are things we do not understand about change. This essay will not answer what we don’t understand. But it will, I hope, show something about why that’s still an interesting thing to ponder.

When we capture a moment by photographing it we add lies to what we see. We impose a frame on its contents, discarding what is off-frame. We rip an instant out of its context. And that before considering how we stage photographs, making people smile and stop tilting their heads. We forgive many of these lies. The things excluded from or the moments around the one photographed might not alter what the photograph represents. Making everyone smile can convey the emotional average of the event in a way that no individual moment represents. Arranging people to stand in frame can convey the participation in the way a candid photograph would not.

But there remains the lie that a photograph is “a moment”. It is no such thing. We notice this when the photograph is blurred. It records all the light passing through the lens while the shutter is open. A photograph records an eighth of a second. A thirtieth of a second. A thousandth of a second. But still, some time. There is always the ghost of motion in a picture. If we do not see it, it is because our photograph’s resolution is too coarse. If we could photograph something with infinite fidelity we would see, even in still life, the wobbling of the molecules that make up a thing.

Which implies something fascinating to me. Think of a reel of film. Here I mean old-school pre-digital film, the thing that’s a great strip of pictures, a new one shown 24 times per second. Each frame of film is a photograph, recording some split-second of time. How much time is actually in a film, then? How long, cumulatively, was a camera shutter open during a two-hour film? I use pre-digital, strip-of-film movies for convenience. Digital films offer the same questions, but with different technical points. And I do not want the writing burden of describing both analog and digital film technologies. So I will stick to the long sequence of analog photographs model.
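The arithmetic behind that question is short enough to sketch. The 24 frames per second comes from the paragraph above. The exposure time of 1/48 of a second per frame is purely my assumption for illustration; a real camera could use something quite different.

```python
from fractions import Fraction

FRAMES_PER_SECOND = 24        # from the essay
FILM_SECONDS = 2 * 60 * 60    # a two-hour film, in seconds
# ASSUMPTION (not from the essay): each frame is exposed for 1/48 of a second.
EXPOSURE_PER_FRAME = Fraction(1, 48)


def shutter_open_seconds(film_seconds, fps, exposure):
    """Total shutter-open time: number of frames times exposure per frame."""
    frames = film_seconds * fps
    return frames * exposure


frames = FILM_SECONDS * FRAMES_PER_SECOND
total_open = shutter_open_seconds(FILM_SECONDS, FRAMES_PER_SECOND, EXPOSURE_PER_FRAME)
print(frames)      # number of frames in the film
print(total_open)  # seconds of time actually recorded
```

Under that assumed exposure the film records only half of the two hours it depicts. Pick a shorter exposure and the fraction shrinks, but it never reaches zero.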

Let me imagine a movie. One of an ordinary everyday event; an actuality, to use the terminology of 1898. A person overtaking a walking tortoise. Look at the strip of film. There are many frames which show the person behind the tortoise. There are many frames showing the person ahead of the tortoise. When are the person and the tortoise at the same spot?

We have to put in some definitions. Fine; do that. Say we mean when the leading edge of the person’s nose overtakes the leading edge of the tortoise’s, as viewed from our camera. Or, since there must be blur, when the center of the blur of the person’s nose overtakes the center of the blur of the tortoise’s nose.

Do we have the frame when that moment happened? I’m sure we have frames from the moments before, and frames from the moments after. But the exact moment? Are you positive? If we zoomed in, would it actually show the person is a millimeter behind the tortoise? That the person is a hundredth of a millimeter ahead? A thousandth of a hair’s width behind? Suppose that our camera is very good. It can take frames representing as small a time as we need. Does it ever capture that precise moment? To the point that we know, no, it’s not the case that the tortoise is one-trillionth the width of a hydrogen atom ahead of the person?

If we can’t show the frame where this overtaking happened, then how do we know it happened? To put it in terms a STEM major will respect, how can we credit a thing we have not observed with happening? … Yes, we can suppose it happened if we suppose continuity in space and time. Then it follows from the intermediate value theorem. But then we are begging the question. We impose the assumption that there is a moment of overtaking. This does not prove that the moment exists.
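Here is a small sketch of what the continuity assumption buys us. The starting positions and walking speeds are made-up numbers, and `bracket_overtaking` is a hypothetical helper of my own. It does bisection: keep one moment where the person is behind, one where the person is ahead, and halve the gap between them.

```python
def person(t):
    # ASSUMPTION: the person starts at 0 metres and walks 1.5 metres per second.
    return 1.5 * t


def tortoise(t):
    # ASSUMPTION: the tortoise starts 10 metres ahead and walks 0.1 metres per second.
    return 10.0 + 0.1 * t


def gap(t):
    """Negative while the person is behind the tortoise, positive once ahead."""
    return person(t) - tortoise(t)


def bracket_overtaking(t_behind, t_ahead, steps):
    """Halve the time interval `steps` times, keeping the overtaking inside it."""
    assert gap(t_behind) < 0 < gap(t_ahead)
    for _ in range(steps):
        middle = (t_behind + t_ahead) / 2
        if gap(middle) < 0:
            t_behind = middle
        else:
            t_ahead = middle
    return t_behind, t_ahead


t_behind, t_ahead = bracket_overtaking(0.0, 10.0, 40)
print(t_behind, t_ahead, t_ahead - t_behind)
```

The bracket gets as narrow as we like, but after any finite number of halvings it is still a "before" frame and an "after" frame. That the two ends close in on a single moment is the assumption of continuity, not something the procedure observes.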

Fine, then. What if time is not continuous? If there is a smallest moment of time? … If there is, then, we can imagine a frame of film that photographs only that one moment. So let’s look at its footage.

One thing stands out. There’s finally no blur in the picture. There can’t be; there’s no time during which to move. We might not catch the moment that the person overtakes the tortoise. It could “happen” in-between moments. But at least we have a moment to observe at leisure.

So … what is the difference between a picture of the person overtaking the tortoise, and a picture of the person and the tortoise standing still? A movie of the two walking should be different from a movie of the two pretending to be department store mannequins. What, in this frame, is the difference? If there is no observable difference, how does the universe tell whether, next instant, these two should have moved or not?

A mathematical physicist may toss in an answer. Our photograph is only of positions. We should also track momentum. Momentum carries within it the information of how position changes over time. We can’t photograph momentum, not without getting blurs. But analytically? If we interpret a photograph as “really” tracking the positions of a bunch of particles? To the mathematical physicist, momentum is as good a variable as position, and it’s as measurable. We can imagine a hyperspace photograph that gives us an image of positions and momentums. So, STEM types show up the philosophers finally, right?

Hold on. Let’s allow that somehow we get changes in position from the momentum of something. Hold off worrying about how momentum gets into position. Where does a change in momentum come from? In the mathematical physics problems we can do, the change in momentum has a value that depends on position. In the mathematical physics problems we have to deal with, the change in momentum has a value that depends on position and momentum. But that value? Put it in words. That value is the change in momentum. It has the same relationship to acceleration that momentum has to velocity. For want of a real term, I’ll call it acceleration. We need more variables. An even more hyperspatial film camera.

… And does acceleration change? Where does that change come from? That is going to demand another variable, the change-in-acceleration. (The “jerk”, according to people who want to tell you that “jerk” is a commonly used term for the change-in-acceleration, and no one else.) And the change-in-change-in-acceleration. Change-in-change-in-change-in-acceleration. We have to invoke an infinite regression of new variables. We got here because we wanted to suppose it wasn’t possible to divide a span of time infinitely many times. This seems like a lot to build into the universe to distinguish a person walking past a tortoise from a person standing near a tortoise. And then we still must admit not knowing how one variable propagates into another. That a person is wide is not usually enough explanation of how they are growing taller.

Numerical integration can model this kind of system with time divided into discrete chunks. It teaches us some ways that this can make logical sense. It also shows us that our projections will (generally) be wrong. At least unless we do things like have an infinite number of steps of time factor into each projection of the next timestep. Or use the forecast of future timesteps to correct the current one. Maybe use both. These are … not impossible. But being “ … not impossible” is not to say satisfying. (We allow numerical integration to be wrong by quantifying just how wrong it is. We call this an “error”, and have techniques that we can use to keep the error within some tolerated margin.)
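A minimal sketch of that, assuming the simplest scheme there is (explicit Euler) and a toy system of my own choosing: a unit mass on a unit spring, so that the position follows x' = v and v' = -x, with exact position cos(t). None of these choices come from the essay; they are only there to make the error visible.

```python
import math


def euler_oscillator(dt, t_end):
    """Explicit Euler for x' = v, v' = -x, starting from x = 1, v = 0."""
    steps = round(t_end / dt)
    x, v = 1.0, 0.0
    for _ in range(steps):
        # Each projection uses only the current timestep's position and velocity.
        x, v = x + dt * v, v - dt * x
    return x


T_END = 1.0
exact = math.cos(T_END)
errors = {dt: abs(euler_oscillator(dt, T_END) - exact) for dt in (0.1, 0.01, 0.001)}
for dt, err in errors.items():
    print(f"dt={dt}: error={err:.6f}")
```

Every one of those projections is wrong. What we get is a handle on how wrong: chop time more finely and the error shrinks, roughly in proportion to the size of the timestep for this scheme.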

So where has the movement happened? The original scene had movement to it. The movie seems to represent that movement. But that movement doesn’t seem to be in any frame of the movie. Where did it come from?

We can have properties that appear in a mass which don’t appear in any component piece. No molecule of a substance has a color, but a big enough mass does. No atom of iron is ferromagnetic, but a chunk might be. No grain of sand is a heap, but enough of them are. The Ancient Greeks knew this; we call it the Sorites paradox, after Eubulides of Miletus. (“Sorites” means “heap”, as in heap of sand. But if you had to bluff through a conversation about ancient Greek philosophers you could probably get away with making up a quote you credit to Sorites.) Could movement be, in the term mathematical physicists use, an intensive property? But intensive properties are obvious to the outside observer of a thing. We are not outside observers to the universe. It’s not clear what it would mean for there to be an outside observer to the universe. Even if there were, what space and time are they observing in? And aren’t their space and their time and their observations vulnerable to the same questions? We’re in danger of insisting on an infinite regression of “universes” just so a person can walk past a tortoise in ours.

We can say where movement comes from when we watch a movie. It is a trick of perception. Our eyes take some time to understand a new image. Our brains insist on forming a continuous whole story even out of disjoint ideas. Our memory fools us into remembering a continuous line of action. That a movie moves is entirely an illusion.

You see the implication here. Surely Zeno was not trying to lead us to understand all motion, in the real world, as an illusion? … Zeno seems to have been trying to support the work of Parmenides of Elea. Parmenides is another pre-Socratic philosopher. So we have about four words that we’re fairly sure he authored, and we’re not positive what order to put them in. Parmenides was arguing about the nature of reality, and what it means for a thing to come into or pass out of existence. He seems to have been arguing something like that there was a true reality that’s necessary and timeless and changeless. And there’s an apparent reality, the thing our senses observe. And in our sensing, we add lies which make things like change seem to happen. (Do not use this to get through your PhD defense in philosophy. I’m not sure I’d use it to get through your Intro to Ancient Greek Philosophy quiz.) That what we perceive as movement is not what is “really” going on is, at least, imaginable. So it is worth asking questions about what we mean for something to move. What difference there is between our intuitive understanding of movement and what logic says should happen.

(I know someone wishes to throw down the word Quantum. Quantum mechanics is a powerful tool for describing how many things behave. It implies limits on what we can simultaneously know about the position and the time of a thing. But there is a difference between “what time is” and “what we can know about a thing’s coordinates in time”. Quantum mechanics speaks more to the latter. There are also people who would like to say Relativity. Relativity, special and general, implies we should look at space and time as a unified set. But this does not change our questions about continuity of time or space, or where to find movement in both.)

And this is why we are likely never to finish pondering Zeno’s Paradoxes. In this essay I’ve only discussed two of them: Achilles and the Tortoise, and The Arrow. There are two other particularly famous ones: the Dichotomy, and the Stadium. The Dichotomy is the one about how to get somewhere, you have to get halfway there. But to get halfway there, you have to get a quarter of the way there. And an eighth of the way there, and so on. The Stadium is the hardest of the four great paradoxes to explain. This is in part because the earliest writings we have about it don’t make clear what Zeno was trying to get at. I can think of something which seems consistent with what’s described, and contrary-to-intuition enough to be interesting. I’m satisfied to ponder that one. But other people may have different ideas of what the paradox should be.

There are a handful of other paradoxes which don’t get so much love, although one of them is another version of the Sorites Paradox. Some of them the Stanford Encyclopedia of Philosophy dubs “paradoxes of plurality”. These ask how many things there could be. It’s hard to judge just what he was getting at with this. We know that one argument had three parts, and only two of them survive. Trying to fill in that gap is a challenge. We want to fill in the argument we would make, projecting from our modern idea of this plurality. It’s not Zeno’s idea, though, and we can’t know how close our projection is.

I don’t have the space to make a thematically coherent essay describing these all, though. The set of paradoxes has demanded thought, even just to come up with a reason to think they don’t demand thought, for thousands of years. We will, perhaps, have to keep trying again to fully understand what it is we don’t understand.

And with that — I find it hard to believe — I am done with the alphabet! All of the Fall 2019 A-to-Z essays should appear at this link. Additionally, the A-to-Z sequences of this and past years should be at this link. Tomorrow and Saturday I hope to bring up some mentions of specific past A-to-Z essays. Next week I hope to share my typical thoughts about what this experience has taught me, and some other writing about this writing.

Thank you, all who’ve been reading, and who’ve offered topics, comments on the material, or questions about things I was hoping readers wouldn’t notice I was shorting. I’ll probably do this again next year, after I’ve had some chance to rest.

Why I’ll Say 1/x Is A Continuous Function And Why I’ll Say It Isn’t

So let me finally follow up last month’s question. That was whether the function “$\frac{1}{x}$” is continuous. My earlier post lays out what a mathematician means by a “continuous function”. The short version is, we have a good definition for a function being continuous at a point in the domain. If it’s continuous at every point in the domain, it’s a continuous function.

The definition of continuous-at-a-point has some technical stuff that I’m going to skip this essay. The important part is that the stuff ordinary people would call “continuous” mathematicians agree with. Like, if you draw a curve representing the function without having to lift your pen off the paper? That function’s continuous. At least the stretch you drew was.

So is the function “$\frac{1}{x}$” continuous? What if I said absolutely it is, because ‘x’ is a number that happens to be … oh, let’s say it’s 3. And $\frac{1}{3}$ is a constant function; of course that’s continuous. Your sensible response is to ask if I want a punch in the nose. No, I do not.

One of the great breakthroughs of algebra was that we could use letters to represent any number we want, whether or not we know what number it is. So why can’t I get away with this? And the answer is that we live in a society, please. There are rules. At least, there’s conventions. They’re good things. They save us time setting up problems. They help us see things the current problem has with other problems. They help us communicate to people who haven’t been with us through all our past work. As always, these rules are made for our convenience, and we can waive them for good reason. But then you have to say what those reasons are.

What someone expects, if you write ‘x’ without explanation, is that it’s a variable, and usually an independent one. Its value might be any of a set of things, and often, we don’t explicitly know what it is. Letters at the start of the alphabet usually stand for coefficients, some fixed number with a value we don’t want to bother specifying. In making this division (‘a’, ‘b’, ‘c’ for coefficients, ‘x’, ‘y’, ‘z’ for variables) we are following René Descartes, who explained his choice of convention quite well. And there are other letters with connotations. We tend to use ‘t’ as a variable if it seems like we’re looking at something which depends on time. If something seems to depend on a radius, ‘r’ goes into service. We use letters like ‘f’ and ‘g’ and ‘h’ for functions. For indexes, ‘i’ and ‘j’ and ‘k’ get called up. For total counts of things, or for powers, ‘n’ and ‘m’, often capitalized, appear. The result is that any mathematician, looking at the expression

$\sum_{j = i}^{n} a_i f(x_j)$

would have a fair idea what kinds of things she was looking at.

So when someone writes “the function $\frac{1}{x}$” they mean “the function which matches ‘x’, in the domain, with $\frac{1}{x}$, in the range”. We write this as “$f(x) = \frac{1}{x}$”. Or, if we become mathematics majors, and we’re in the right courses, we write “$f:x \rightarrow \frac{1}{x}$”. It’s a format that seems like it’s overcomplicating things. But it’s good at emphasizing the idea that a function can be a map, matching a set in the domain to a set in the range.

This is a tiny point. Why discuss it at any length?

It’s because the question “is $\frac{1}{x}$ a continuous function” isn’t well-formed. There’s important parts not specified. We can make it well-formed by specifying these parts. This is adding assumptions about what we mean. What assumptions we make affect what the answer is.

A function needs three components. One component is a set that’s the domain. One component is a set that’s the range. And one component is a rule that pairs up things in the domain with things in the range. But there are some domains and some ranges that we use all the time. We use them so often we end up not mentioning them. We have a common shorthand for functions which is to just list the rule.

So what are the domain and range?

Barring special circumstances, we usually take the domain that offers the most charitable reading of the rule. What’s the biggest set on which the rule makes sense? The domain is that. The range we find once we have the domain and rule. It’s the set that the rule maps the domain onto.

So, for example, if we have the function “$f(x) = x^2$”? That makes sense if ‘x’ is any real number. If there’s no reason to think otherwise, we suppose the domain is the set of all real numbers. We’d write that as the set R. Whatever ‘x’ is, though, ‘$x^2$’ is either zero or a positive number. So the range is the real numbers greater than or equal to zero. Or the nonnegative real numbers, if you prefer.

And even that reasonably clear guideline hides conventions. Like, who says this should be the real numbers? Can’t you take the square of a complex-valued number? And yes, you absolutely can. Some people even encourage it. So why not use the set C instead?

Convention, again. If we don’t expect to need complex-valued numbers, we don’t tend to use them. I suspect it’s a desire not to invite trouble. The use of ‘x’ as the independent variable is another bit of convention. An ‘x’ can be anything, yes. But if it’s a number, it’s more likely a real-valued number. Same with ‘y’. If we want a complex-valued independent variable we usually label that ‘z’. If we need a second, ‘w’ comes in. Writing “$x^2$” alone suggests real-valued numbers.

And this might head off another question. How do we know that ‘x’ is the only variable? How do we know we don’t need an ordered pair, ‘(x, y)’? This would be from the set called $R^2$, pairs of real-valued numbers. It uses only the first coordinate of the pair, but that’s allowed. How do we know that’s not going on? And we don’t know that from the “$x^2$” part. The “f(x) = ” part gives us that hint. If we thought the problem needed two independent variables, it would usually list them somewhere. Writing “$f(x, y) = x^2$” begs for the domain $R^2$, even if we don’t know what good the ‘y’ does yet. In mapping notation, if we wrote “$f:(x, y) \rightarrow x^2$” we’d be calling for $R^2$. If ‘x’ and ‘z’ both appear, that’s usually a hint that the problem needs coordinates ‘x’, ‘y’, and ‘z’, so that we’d want $R^3$ at least.

So that’s the maybe frustrating heuristic here. The inferred domain is the smallest biggest set that the rule makes sense on. The real numbers, but not ordered pairs of real numbers, and not complex-valued numbers. Something like that.

What does this mean for the function “$f(x) = \frac{1}{x}$”? Well, the variable is ‘x’, so we should think real numbers rather than complex-valued ones. There’s no ‘y’ or ‘z’ or anything, so we don’t need ordered sets. The domain is something in the real numbers, then. And the formula “$\frac{1}{x}$” means something for any real number ‘x’ … well, with the one exception. We try not to divide by zero. It raises questions we’d rather not have brought up.

So from this we infer a domain of “all the real numbers except 0”. And this in turn implies a range of “all the real numbers except 0”.

Is “$f(x) = \frac{1}{x}$” continuous on every point in the domain? That is, whenever ‘x’ is any real number besides zero? And, well, it is. A proper proof would be even more heaps of paragraphs, so I’ll skip it. Informally, you know if you drew a curve representing this function there’s only one point where you would ever lift your pen. And that point is 0 … which is not in this domain. So the function is continuous at every point in the domain. So the function’s continuous. Done.

And, I admit, not quite comfortably done. I feel like there’s some sleight-of-hand anyway. You draw “$\frac{1}{x}$” and you absolutely do lift your pen, after all.

So, I fibbed a little above, when I said the range was “the set that the rule maps the domain onto”. I mean, that’s what it properly is. But finding that is often too much work. You have to find where the function would be its smallest, which is often hard, or at least tedious. You have to find where it’s largest, which is just as tedious. You have to find if there’s anything between the smallest and largest values that it skips. You have to find all these gaps. That’s boring. And what’s the harm done if we declare the range is bigger than that set? If, for example, we say the range of ‘$x^2$’ is all the real numbers, even though we know it’s really only the non-negative numbers?

None at all. Not unless we’re taking an exam about finding the smallest range that lets a function make sense. So in practice we’ll throw in all the negative numbers into that range, even if nothing matches them. I admit this makes me feel wasteful, but that’s my weird issue. It’s not like we use the numbers up. We’ll just overshoot on the range and that’s fine.

You see the trap this has set up. If it doesn’t cost us anything to throw in unneeded stuff in the range, and it makes the problem easier to write about, can we do that with the domain?

Well. Uhm. No. Not if we’re doing this right. The range can have unneeded stuff in it. The domain can’t. It seems unfair, but if we don’t hold to that rule, we make trouble for ourselves. By ourselves I mean mathematicians who study the theory of functions. That’s kind of like ourselves, right? So there’s no declaring that “$\frac{1}{x}$” is a function on “all” the real numbers and trusting nobody to ask what happens when ‘x’ is zero.

But we don’t need for a function’s rule to be a single thing. Or a simple thing. It can have different rules for different parts of the domain. It’s fine to declare, for example, that f(x) is equal to “$\frac{1}{x}$” for every real number where that makes sense, and that it’s equal to 0 everywhere else. Or that it’s 1 everywhere else. That it’s negative a billion and a third everywhere else. Whatever number you like. As long as it’s something in the range.

So I’ll declare that my idea of this function is an ‘f(x)’ that’s equal to “$\frac{1}{x}$” if ‘x’ is not zero, and that’s equal to 2 if ‘x’ is zero. I admit if I weren’t writing for an audience I’d make ‘f(x)’ equal to 0 there. That feels nicely symmetric. But everybody picks 0 when they’re filling in this function. I didn’t get where I am by making the same choices as everybody else, I tell myself, while being far less successful than everybody else.

And now my ‘f(x)’ is definitely not continuous. The domain’s all the real numbers, yes. But at the point where ‘x’ is 0? There’s no drawing that without raising your pen from the paper. I trust you’re convinced. Your analysis professor will claim she’s not convinced, if you write that on your exam. But if you and she were just talking about functions, she’d agree. Since there’s one point in the domain where the function’s not continuous, the function is not continuous.
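If the pen-lifting argument feels too informal, here is a rough numerical sketch of the same thing. It uses the test for real-valued functions: pick a tolerance ε, then look for a distance δ so that every ‘x’ within δ of the point has f(x) within ε of the value there. The helper `delta_works` and the particular δ values tried are my own illustrative choices. Sampling can never prove that a δ works, but a single sampled failure is enough to show that one doesn’t.

```python
def f(x):
    """The patched function from the essay: 1/x away from zero, and 2 at zero."""
    return 2.0 if x == 0 else 1.0 / x


def delta_works(delta, epsilon, a=0.0, samples=1000):
    """Return False as soon as a sampled x within delta of a has f(x) epsilon or more from f(a)."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if abs(f(x) - f(a)) >= epsilon:
                return False
    return True


EPSILON = 1.0
results = {delta: delta_works(delta, EPSILON) for delta in (1.0, 0.1, 0.001, 1e-9)}
print(results)
```

Every δ fails, and it has to: just to the left of zero the function is negative, so it is always more than 2 away from the value 2 that I assigned at zero.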

So there we have it. “$\frac{1}{x}$”, taken in one reasonable way, is a continuous function. “$\frac{1}{x}$”, taken in another reasonable way, is not a continuous function. What you think reasonable is what sets your answer.

Is 1/x a Continuous Function?

So this is a question I got by way of a friend. It’s got me thinking because there is an obviously right answer. And there’s an answer that you get to if you think about it longer. And then longer still and realize there are several answers you could give. So I wanted to put it out to my audience. Figuring out your answer and why you stand on that is the interesting bit.

The question is as asked in the subject line: is $\frac{1}{x}$ a continuous function?

Mathematics majors, or related people like physics majors, already understand the question. Other people will want to know what the question means. This includes people who took a calculus class, who remember three awful weeks where they had to write ε and δ a lot. The era passed, even if they did not. And people who never took a mathematics class, but like their odds at solving a reasoning problem, can get up to speed on this fast.

The colloquial idea of a “continuous function” is, well. Imagine drawing a curve that represents the function. Can you draw the whole thing without lifting your pencil off the page? That is, no gaps, no jumps? Then it’s continuous. That’s roughly the idea we want to capture by talking about a “continuous function”. It needs some logical rigor to pass as mathematics, though. So here we go.

A function is continuous if, and only if, it’s continuous at every point in the function’s domain. That I start out with that may inspire a particular feeling. That feeling is, “our Game Master grinned ear-to-ear and took out four more dice and a booklet when we said we were sure”.

But our best definition of continuity builds on functions at particular points. Which is fair. We can imagine a function that’s continuous in some places but that’s not continuous somewhere else. The ground can be very level and smooth right up to the cliff. And we have a nice, easy enough, idea of what it is to be continuous at a point.

I’ll get there in a moment. My life will be much easier if I can give you some more vocabulary. They’re all roughly what you might imagine the words meant if I didn’t tell you they were mathematics words.

The first is ‘map’. A function ‘maps’ something in its domain to something in its range. Like if ‘a’ is a point in the domain, ‘f’ maps that point to ‘f(a)’, in its range. Like, if your function is ‘$f(x) = x^2$’, then f maps 2 to 4. It maps 3 to 9. It maps -2 to 4 again, and that’s all right. There’s no reason you can’t map several things to one thing.

The next is ‘image’. Take something in the domain. It might be a single point. It might be a couple of points. It might be an interval. It might be several intervals. It’s a set, as big or as empty as you like. The ‘image’ of that set is all the points in the range that any point in the original set gets mapped to. So, again play with $f(x) = x^2$. The image of the interval from 0 to 2 is the interval from 0 to 4. The image of the interval from 3 to 4 is the interval from 9 to 16. The image of the interval from -3 to 1 is the interval from 0 to 9.
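If you’d like to poke at images yourself, here is a rough sketch. The helper `sampled_image` is a hypothetical name of mine. It only samples evenly spaced points and reports the smallest and largest outputs, so it is an approximation, and it quietly assumes the function has no gaps or jumps on the interval.

```python
def square(x):
    return x * x


def sampled_image(func, left, right, samples=10001):
    """Approximate the image of [left, right] by the min and max over sampled points."""
    values = [func(left + (right - left) * k / (samples - 1)) for k in range(samples)]
    return min(values), max(values)


for left, right in [(0.0, 2.0), (3.0, 4.0), (-3.0, 1.0)]:
    print((left, right), "->", sampled_image(square, left, right))
```

The third interval is the instructive one. The smallest output doesn’t come from either endpoint; it comes from 0, in the middle of the interval.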

That’s as much vocabulary as I need. Thank you for putting up with that. Now I can say what it means to be continuous at a point.

Is a function continuous at a point? Let me call that point ‘a’. It is continuous at ‘a’ if we can do this. Take absolutely any open set in the range that contains ‘f(a)’. I’m going to call that open set ‘R’. Is there an open set, that I’ll call ‘D’, inside the domain, that contains ‘a’, and with an image that’s inside ‘R’? ‘D’ doesn’t have to be big. It can be ridiculously tiny; it just has to be an open set. If there always is a ‘D’ like this, no matter how big or how small ‘R’ is, then ‘f’ is continuous at ‘a’. If there is not, if there’s even just the one exception, then ‘f’ is not continuous at ‘a’.

I realize that’s going back and forth a lot. It’s as good as we can hope for, though. It does really well at capturing things that seem like they should be continuous. And it never rules as not-continuous something that people agree should be continuous. It does label “continuous” some things that seem like they shouldn’t be. We accept this because labelling as non-continuous something that ought to be continuous would be worse.

And all this talk about open sets and images gets a bit abstract. It’s written to cover all kinds of functions on all kinds of things. It’s hard to master, but, if you get it, you’ve got a lot of things. It works for functions on all kinds of domains and ranges. And it doesn’t need very much. You need to have an idea of what an ‘open set’ is, on the domain and range, and that’s all. This is what gives it universality.

But it does mean there’s the challenge of figuring out how to start doing anything. If we promise that we’re talking about a function with domain and range of real numbers we can simplify things. This is where that ε and δ talk comes from. But here’s how we can define “continuous at a point” for a function in the special case that its domain and range are both real numbers.

Take any positive ε. Is there some positive δ, so that, whenever ‘x’ is a number less than δ away from ‘a’, we know that f(x) is less than ε away from f(a)? If there always is, no matter how large or small ε is, then f is continuous at a. If there ever is not, even for a single exceptional ε, then f is not continuous at a.
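To make the ε and δ game concrete, here is a sketch for 1/x at a point ‘a’ that is not zero. The formula inside `delta_for` is one choice of δ that I worked out for this function; it is not the only one that works. The function `check` only samples points, so read it as a sanity check rather than a proof.

```python
def f(x):
    return 1.0 / x


def delta_for(a, epsilon):
    """One delta that works for f(x) = 1/x at a nonzero point a.

    If |x - a| < |a|/2 then |x| > |a|/2, so
    |1/x - 1/a| = |x - a| / (|x| |a|) < 2 |x - a| / a**2,
    which is below epsilon once |x - a| < epsilon * a**2 / 2.
    """
    return min(abs(a) / 2, epsilon * a * a / 2)


def check(a, epsilon, samples=1000):
    """Sample points within delta of a and confirm f(x) stays within epsilon of f(a)."""
    delta = delta_for(a, epsilon)
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (a - offset, a + offset):
            if abs(f(x) - f(a)) >= epsilon:
                return False
    return True


for a, epsilon in [(2.0, 0.1), (0.001, 1.0), (-3.0, 0.01)]:
    print(a, epsilon, delta_for(a, epsilon), check(a, epsilon))
```

Notice how δ has to shrink as ‘a’ gets close to zero. That is allowed. The definition only asks that some positive δ exist for each point and each ε, not that one δ serve every point.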

That definition is tailored for real-valued functions. But that’s enough if you want to answer the original question. Which, you might remember, is, “is 1/x a continuous function”?

That I ask the question, for a function simple and familiar enough a lot of people don’t even need to draw it, may give away what I think the answer is. But what’s interesting is, of course, why the answer. So I’ll leave that for an essay next week.

This attractive little tweet came across my feed yesterday:

This function — I guess it’s the “popcorn” function — is a challenge to our ideas about what a “continuous” function is. I’ve mentioned “continuous” functions before and said something like they’re functions you could draw without lifting your pen from the paper. That’s the colloquial, and the intuitive, idea of what they mean. And that’s all right for ordinary uses.

But the best definition mathematicians have thought of for a “continuous function” has some quirks. And here’s one of them. Define a function named ‘f’. Its domain is the real numbers. Its range is the real numbers. And the rule matching things in the domain to things in the range is, as pictured:

• If ‘x’ is zero then $f(x) = 1$
• If ‘x’ is an irrational number then $f(x) = 0$
• If ‘x’ is a rational number, then it’s equal in lowest terms to the whole number ‘p’ divided by the positive whole number ‘q’. And for this ‘x’, then $f(x) = \frac{1}{q}$

And as the tweet from Fermat’s Library says, this is a function that’s continuous on all the irrational numbers. It’s not continuous on any rational numbers. This seems like a prank. But it’s a common approach to finding intuition-testing ideas about continuity. Setting different rules for rational and irrational numbers works well for making these strange functions. And thinking of rational numbers as their representation in lowest terms is also common. (Writing it as ‘p divided by q’ suggests that ‘p’ and ‘q’ are going to be prime, but, no! Think of $\frac{3}{8}$ or of $\frac{4}{9}$.) If you stare at the plot you can maybe convince yourself that “continuous on the irrational numbers” makes sense here. That heavy line of dots at the bottom looks like it’s approaching a continuous blur, at least.
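Here is a rough sketch of why “continuous on the irrational numbers” is plausible, using Python’s `fractions.Fraction`, which stores rationals in lowest terms. The names `popcorn` and `biggest_value_near`, and the choice of $\sqrt{2}$ as the irrational to look near, are mine. The idea: close to an irrational number, every rational has a large denominator, so the function’s values there are all small.

```python
from fractions import Fraction
import math


def popcorn(x: Fraction) -> Fraction:
    """Value at a rational x. Fraction keeps x in lowest terms, and 0 is 0/1."""
    return Fraction(1, x.denominator)


def biggest_value_near(center: float, delta: float, max_q: int = 10_000):
    """Largest popcorn value taken at any rational within delta of center.

    Scans denominators q = 1, 2, 3, ... and stops at the first q that has
    some p/q inside the window; that rational gives the largest value, 1/q.
    """
    for q in range(1, max_q + 1):
        p = round(center * q)  # p/q is the multiple of 1/q nearest to center
        if abs(p / q - center) < delta:
            return Fraction(1, q)
    return None  # nothing found with a denominator up to max_q


print(popcorn(Fraction(3, 8)), popcorn(Fraction(4, 9)), popcorn(Fraction(0)))
for delta in (0.1, 0.01, 0.001):
    print(delta, biggest_value_near(math.sqrt(2), delta))
```

Shrinking the window around $\sqrt{2}$ pushes the largest nearby value toward zero, which is what f does at $\sqrt{2}$ itself. Try the same thing centered on a rational number and the value at the center stays stubbornly at 1/q while everything around it sinks.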

It can get weirder. It’s possible to create a function that’s continuous at only a single point of all the real numbers. This is why Real Analysis is such a good subject to crash hard against. But we accept weird conclusions like this because the alternative is to give up as “continuous” functions that we just know have to be continuous. Mathematical definitions are things we make for our use.

The End 2016 Mathematics A To Z: Weierstrass Function

I’ve teased this one before.

Weierstrass Function.

So you know how the Earth is a sphere, but from our normal vantage point right up close to its surface it looks flat? That happens with functions too. Here I mean the normal kinds of functions we deal with, ones with domains that are the real numbers or a Euclidean space. And ranges that are real numbers. The functions you can draw on a sheet of paper with some wiggly bits. Let the function wiggle as much as you want. Pick a part of it and zoom in close. That zoomed-in part will look straight. If it doesn’t look straight, zoom in closer.

We rely on this. Functions that are straight, or at least straight enough, are easy to work with. We can do calculus on them. We can do analysis on them. Functions with plots that look like straight lines are easy to work with. Often the best approach to working with the function you’re interested in is to approximate it with an easy-to-work-with function. I bet it’ll be a polynomial. That serves us well. Polynomials are these continuous functions. They’re differentiable. They’re smooth.

That thing about the Earth looking flat, though? That’s a lie. I’ve never been to any of the really great cuts in the Earth’s surface, but I have been to some decent gorges. I went to grad school in the Hudson River Valley. I’ve driven I-80 over Pennsylvania’s scariest bridges. There’s points where the surface of the Earth just drops a great distance between one footstep and the next.

Functions do that too. We can have points where a function isn’t differentiable, where it’s impossible to define the direction it’s headed. We can have points where a function isn’t continuous, where it jumps from one region of values to another region. Everyone knows this. We can’t dismiss those as aberrations not worthy of the name “function”; too many of them are too useful. Typically we handle this by admitting there’s points that aren’t continuous and we chop the function up. We make it into a couple of functions, each stretching from discontinuity to discontinuity. Between them we have continuous regions and we can go about our business as before.

Then came the 19th century when things got crazy. This particular craziness we credit to Karl Weierstrass. Weierstrass’s name is all over 19th century analysis. He had that talent for probing the limits of our intuition about basic mathematical ideas. We have a calculus that is logically rigorous because he found great counterexamples to what we had assumed without proving.

The Weierstrass function challenges this idea that any function is going to eventually level out. Or that we can even smooth a function out into basically straight, predictable chunks in-between sudden changes of direction. The function is continuous everywhere; you can draw it perfectly without lifting your pen from paper. But it always looks like a zig-zag pattern, jumping around like it was always randomly deciding whether to go up or down next. Zoom in on any patch and it still jumps around, zig-zagging up and down. There’s never an interval where it’s always moving up, or always moving down, or even just staying constant.

Despite being continuous it’s not differentiable. I’ve described that casually as it being impossible to predict where the function is going. That’s an abuse of words, yes. The function is defined. Its value at a point isn’t any more random than the value of $x^2$ is for any particular x. The unpredictability I’m talking about here is a side effect of ignorance. Imagine I showed you a plot of $x^2$ with a part of it concealed and asked you to fill in the gap. You’d probably do pretty well estimating it. The Weierstrass function, though? No; your guess would be lousy. My guess would be lousy too.
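For anyone who wants to poke at this numerically, here is a small Python sketch. I haven’t written a formula for the function in this essay, so take the form used here as an assumption: the classical series $\sum_{n \ge 0} a^n \cos(b^n \pi x)$, with the illustrative choices a = 0.5 and b = 13, cut off after 12 terms. A cut-off sum is a perfectly smooth function, so this can only hint at the real thing. What it shows is that the slopes of secant lines at x = 0 grow in size as the step shrinks, instead of settling toward a limit.

```python
import math


def weierstrass_partial(x: float, a: float = 0.5, b: int = 13, terms: int = 12) -> float:
    """Partial sum of a**n * cos(b**n * pi * x) for n = 0 .. terms - 1."""
    return sum(a**n * math.cos(b**n * math.pi * x) for n in range(terms))


def difference_quotient(x: float, h: float) -> float:
    """Slope of the secant line from x to x + h."""
    return (weierstrass_partial(x + h) - weierstrass_partial(x)) / h


# Steps of 1/13, 1/13**2, 1/13**3: each one is matched to a finer layer of wiggles.
for m in (1, 2, 3):
    h = 13.0 ** -m
    print(h, difference_quotient(0.0, h))
```

For $x^2$ the same experiment gives slopes that calm down to a single number. Here each finer step picks up a new, steeper layer of zig-zag.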

That’s a weird thing to have happen. A century and a half later it’s still weird. It gets weirder. The Weierstrass function isn’t differentiable generally. But there are exceptions. There are little dots of differentiability, where the rate at which the function changes is known. Not intervals, though. Single points. This is crazy. Derivatives are about how a function changes. We work out what they should even mean by thinking of a function’s value on strips of the domain. Those strips are small, but they’re still, you know, strips. But on almost all of that strip the derivative isn’t defined. It’s only at isolated points, a set with measure zero, that this derivative even exists. It evokes the medieval Mysteries, of how we are supposed to try, even though we know we shall fail, to understand how God can have contradictory properties.

It’s not quite that Mysterious here. Properties like this challenge our intuition, if we’ve gotten any. Once we’ve laid out good definitions for ideas like “derivative” and “continuous” and “limit” and “function” we can work out whether results like this make sense. And they — well, they follow. We can avoid weird conclusions like this, but at the cost of messing up our definitions for what a “function” and other things are. Making those useless. For the mathematical world to make sense, we have to change our idea of what quite makes sense.

That’s all right. When we look close we realize the Earth around us is never flat. Even reasonably flat areas have slight rises and falls. The ends of properties are marked with curbs or ditches, and bordered by streets that rise to a center. Look closely even at the dirt and we notice that as level as it gets there are still rocks and scratches in the ground, clumps of dirt an infinitesimal bit higher here and lower there. The flatness of the Earth around us is a useful tool, but we miss a lot by pretending it’s everything. The Weierstrass function is one of the ways a student mathematician learns that while smooth, predictable functions are essential, there is much more out there.

The End 2016 Mathematics A To Z: Smooth

Mathematicians affect a pose of objectivity. We justify this by working on things whose truth we can know, and which must be true whenever we accept certain rules of deduction and certain definitions and axioms. This seems fair. But we choose to pay attention to things that interest us for particular reasons. We study things we like. My A To Z glossary term for today is about one of those things we like.

Smooth.

Functions. Not everything mathematicians do is functions. But functions turn up a lot. We need to set some rules. “A function” is so generic a thing we can’t handle it much. Narrow it down. Pick functions with domains that are numbers. Range too. By numbers I mean real numbers, maybe complex numbers. That gives us something.

There’s functions that are hard to work with. This is almost all of them, so we don’t touch them unless we absolutely must. But they’re functions that aren’t continuous. That means what you imagine. The value of the function at some point is wholly unrelated to its value at some nearby point. It’s hard to work with anything that’s unpredictable like that. Functions as well as people.

We like functions that are continuous. They’re predictable. We can make approximations. We can estimate the function’s value at some point using its value at some more convenient point. It’s easy to see why that’s useful for numerical mathematics, for calculations to approximate stuff. The dazzling thing is it’s useful analytically. We step into the Platonic-ideal world of pure mathematics. We have tools that let us work as if we had infinitely many digits of precision, for infinitely many numbers at once. And yet we use estimates and approximations and errors. We use them in ways to give us perfect knowledge; we get there by estimates.

Continuous functions are nice. Well, they’re nicer to us than functions that aren’t continuous. But there are even nicer functions. Functions nicer to us. A continuous function, for example, can have corners; it can change direction suddenly and without warning. A differentiable function is more predictable. It can’t have corners like that. Knowing the function well at one point gives us more information about what it’s like nearby.

The derivative of a function doesn’t have to be continuous. Grumble. It’s nice when it is, though. It makes the function easier to work with. It’s really nice for us when the derivative itself has a derivative. Nothing guarantees that the derivative of a derivative is continuous. But maybe it is. Maybe the derivative of the derivative has a derivative. That’s a function we can do a lot with.

A function is “smooth” if it has as many derivatives as we need for whatever it is we’re doing. And if those derivatives are continuous. If this seems loose that’s because it is. A proof for whatever we’re interested in might need only the original function and its first derivative. It might need the original function and its first, second, third, and fourth derivatives. It might need hundreds of derivatives. If we look through the details of the proof we might find exactly how many derivatives we need and how many of them need to be continuous. But that’s tedious. We save ourselves considerable time by saying the function is “smooth”, as in, “smooth enough for what we need”.

If we do want to specify how many continuous derivatives a function has we call it a “$C^k$ function”. The C here means continuous. The ‘k’ means there are the number ‘k’ continuous derivatives of it. This is completely different from a “$\mathbf{C}^k$ function”, which would be one that’s a k-dimensional vector. Whether the “C” is boldface or not is important. A function might have infinitely many continuous derivatives. That we call a “$C^\infty$ function”. That’s got wonderful properties, especially if the domain and range are complex-valued numbers. We couldn’t do Complex Analysis without it. Complex Analysis is the course students take after wondering how they’ll ever survive Real Analysis. It’s much easier than Real Analysis. Mathematics can be strange.

The End 2016 Mathematics A To Z: Local

Today’s is another of those words that means nearly what you would guess. There are still seven letters left, by the way, which haven’t had any requested terms. If you’d like something described please try asking.

Local.

Stops at every station, rather than just the main ones.

OK, I’ll take it seriously.

So a couple years ago I visited Niagara Falls, and I stepped into the river, just above the really big drop.

I didn’t have any plans to go over the falls, and didn’t, but I liked the thrill of claiming I had. I’m not crazy, though; I picked a spot I knew was safe to step in. It’s only in the retelling I went into the Niagara River just above the falls.

Because yes, there is surely danger in certain spots of the Niagara River. But there are also spots that are perfectly safe. And not isolated spots either. I wouldn’t have been less safe if I’d stepped into the river a few feet closer to the edge. Nor if I’d stepped in a few feet farther away. Where I stepped in was locally safe.

Over in mathematics we do a lot of work on stuff that’s true or false depending on what some parameters are. We can look at bunches of those parameters, and they often look something like normal everyday space. There’s some values that are close to what we started from. There’s others that are far from that.

So, a “neighborhood” of some point is that point and some set of points containing it. It needs to be an “open” set, which means it doesn’t contain its boundary. So, like, everything less than one minute’s walk away, but not the stuff that’s precisely one minute’s walk away. (If we include boundaries we break stuff that we don’t want broken is why.) And certainly not the stuff more than one minute’s walk away. A neighborhood could have any shape. It’s easy to think of it as a little disc around the point you want. That’s usually the easiest to describe in a proof, because it’s “everything a distance less than (something) away”. (That “something” is either ‘δ’ or ‘ε’. Both Greek letters are called in to mean “a tiny distance”. They have different connotations about what the tiny distance is in.) It’s easiest to draw as a little amoeba-like blob around a point, and contained inside a bigger amoeba-like blob.

Anyway, something is true “locally” to a point if it’s true in that neighborhood. That means true for everything in that neighborhood. Which is what you’d expect. “Local” means just that. It’s the stuff that’s close to where we started out.

Often we would like to know something “globally”, which means … er … everywhere. Universally so. But it’s usually easier to prove a thing locally. I suppose having a point where we know something is so makes it easier to prove things about what’s nearby. Distant stuff, who knows?

“Local” serves as an adjective for many things. We think of a “local maximum”, for example, or “local minimum”. This is where whatever we’re studying has a value bigger (or smaller) than anywhere else nearby has. Or we speak of a function being “locally continuous”, meaning that we know it’s continuous near this point and we make no promises away from it. It might be “locally differentiable”, meaning we can take derivatives of it close to some interesting point. We say nothing about what happens far from it.

Unless we do. We can talk about something being “local to infinity”. Your first reaction to that should probably be to slap the table and declare that’s it, we’re done. But we can make it sensible, at least to other mathematicians. We do it by starting with a neighborhood that contains the origin, zero, that point in the middle of everything. So, what’s the inverse of that? It’s everything that’s far enough away from the origin. (Don’t include the boundary, we don’t need those headaches.) So why not call that the “neighborhood of infinity”? Other than that it’s a weird set of words to put together? And if something is true in that “neighborhood of infinity”, what is that thing other than true “local to infinity”?

I don’t blame you for being skeptical.

Theorem Thursday: A First Fixed Point Theorem

I’m going to let the Mean Value Theorem slide a while. I feel more like a Fixed Point Theorem today. As with the Mean Value Theorem there’s several of these. Here I’ll start with an easy one.

The Fixed Point Theorem.

Back when the world and I were young I would play with electronic calculators. They encouraged play. They made it so easy to enter a number and hit an operation, and then hit that operation again, and again and again. Patterns appeared. Start with, say, ‘2’ and hit the ‘squared’ button, the smaller ‘2’ raised up from the key’s baseline. You got 4. And again: 16. And again: 256. And again and again and you got ever-huger numbers. This happened whenever you started from a number bigger than 1. Start from something smaller than 1, however tiny, and it dwindled down to zero, whatever you tried. Start at ‘1’ and it just stays there. The results were similar if you started with negative numbers. The first squaring put you in positive numbers and everything carried on as before.

This sort of thing happened a lot. Keep hitting the mysterious ‘exp’ and the numbers would keep growing forever. Keep hitting ‘sqrt’; if you started above 1, the numbers dwindled to 1. Start below and the numbers rise to 1. Or you started at zero, but who’s boring enough to do that? ‘log’ would start with positive numbers and keep dropping until it turned into a negative number. The next step was the calculator’s protest we were unleashing madness on the world.

But you didn’t always get zero, one, infinity, or madness, from repeatedly hitting the calculator button. Sometimes, some functions, you’d get an interesting number. If you picked any old number and hit cosine over and over the digits would eventually settle down to around 0.739085. Cosine’s great. Tangent … tangent is weird. Tangent does all sorts of bizarre stuff. But at least cosine is there, giving us this interesting number.
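If you don’t have a calculator handy, the button-mashing is easy to imitate. This is a minimal Python sketch; `press_repeatedly` is a name I made up, and one hundred presses is an arbitrary choice that happens to be plenty for these examples. Note that `math.cos` works in radians.

```python
import math


def press_repeatedly(button, start: float, presses: int = 100) -> float:
    """Apply the same one-argument function over and over, like a calculator key."""
    x = start
    for _ in range(presses):
        x = button(x)
    return x


print(press_repeatedly(math.cos, 2.0))             # settles near 0.739085
print(press_repeatedly(math.sqrt, 5.0))            # creeps down toward 1
print(press_repeatedly(lambda x: x * x, 0.5))      # dwindles to zero
```

Change the starting number for cosine and the digits still settle on the same value. Start squaring from anything bigger than 1 and Python gives up with an overflow error, which is its version of the calculator’s protest.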

(Something you might wonder: this is the cosine of an angle measured in radians, which is how mathematicians naturally think of angles. Normal people measure angles in degrees, and that will have a different fixed point. We write both the cosine-in-radians and the cosine-in-degrees using the shorthand ‘cos’. We get away with this because people who are confused by this are too embarrassed to call us out on it. If we’re thoughtful we write, say, ‘cos x’ for radians and ‘cos x°’ for degrees. This makes the difference obvious. It doesn’t really, but at least we gave some hint to the reader.)

This all is an example of a fixed point theorem. Fixed point theorems turn up in a lot of fields. They were most impressed upon me in dynamical systems, studying how a complex system changes in time. A fixed point, for these problems, is an equilibrium. It’s where things aren’t changed by a process. You can see where that’s interesting.

In this series I haven’t stated theorems exactly much, and I haven’t given them real proofs. But this is an easy one to state and to prove. Start off with a function, which I’ll name ‘f’, because yes that is exactly how much effort goes into naming functions. It has as a domain the interval [a, b] for some real numbers ‘a’ and ‘b’. And it has as range the same interval, [a, b]. It might use the whole range; it might use only a subset of it. And we have to require that f is continuous.

Then there has to be at least one fixed point. There must be at least one number ‘c’, somewhere in the interval [a, b], for which f(c) equals c. There may be more than one; we don’t say anything about how many there are. And it can happen that c is equal to a. Or that c equals b. We don’t know that it is or that it isn’t. We just know there’s at least one ‘c’ that makes f(c) equal c.

You get that in my various examples. If the function f has the rule that any given x is matched to $x^2$, then we do get two fixed points: $f(0) = 0^2 = 0$ and $f(1) = 1^2 = 1$. Or if f has the rule that any given x is matched to the square root of x, then again we have: $f(0) = \sqrt{0} = 0$ and $f(1) = \sqrt{1} = 1$. Same old boring fixed points. The cosine is a little more interesting. For that we have $f(0.739085...) = \cos\left(0.739085...\right) = 0.739085...$.

How to prove it? The easiest way I know is to summon the Intermediate Value Theorem. Since I wrote a couple hundred words about that a few weeks ago I can assume you to understand it perfectly and have no question about how it makes this problem easy. I don’t even need to go on, do I?

… Yeah, fair enough. Well, here’s how to do it. We’ll take the original function f and create, based on it, a new function. We’ll dig deep in the alphabet and name that ‘g’. It has the same domain as f, [a, b]. Its range is … oh, well, something in the real numbers. Don’t care. The wonder comes from the rule we use.

The rule for ‘g’ is this: match the given number ‘x’ with the number ‘f(x) – x’. That is, g(a) equals whatever f(a) would be, minus a. g(b) equals whatever f(b) would be, minus b. We’re allowed to define a function in terms of some other function, as long as the symbols are meaningful. And we aren’t doing anything wrong like dividing by zero or taking the logarithm of a negative number or asking for f where it isn’t defined.

You might protest that we don’t know what the rule for f is. We’re told there is one, and that it’s a continuous function, but nothing more. So how can I say I’ve defined g in terms of a function I don’t know?

In the first place, I already know everything about f that I need to. I know it’s a continuous function defined on the interval [a, b]. I won’t use any more than that about it. And that’s great. A theorem that doesn’t require knowing much about a function is one that applies to more functions. It’s like the difference between being able to say something true of all living things in North America, and being able to say something true of all persons born in Redbank, New Jersey, on the 18th of February, 1944, who are presently between 68 and 70 inches tall and working on their rock operas. Both things may be true, but one of those things you probably use more.

In the second place, suppose I gave you a specific rule for f. Let me say, oh, f matches x with the arccosecant of x. Are you feeling any more enlightened now? Didn’t think so.

Back to g. Here’s some things we can say for sure about it. g is a function defined on the interval [a, b]. That’s how we set it up. Next point: g is a continuous function on the interval [a, b]. Remember, g is just the function f, which was continuous, minus x, which is also continuous. The difference of two continuous functions is still going to be continuous. (This is obvious, although it may take some considered thinking to realize why it is obvious.)

Now some interesting stuff. What is g(a)? Well, it’s whatever number f(a) is minus a. I can’t tell you what number that is. But I can tell you this: it’s not negative. Remember that f(a) has to be some number in the interval [a, b]. That is, it’s got to be no smaller than a. So the smallest f(a) can be is equal to a, in which case f(a) minus a is zero. And f(a) might be larger than a, in which case f(a) minus a is positive. So g(a) is either zero or a positive number.

(If you’ve just realized where I’m going and gasped in delight, well done. If you haven’t, don’t worry. You will. You’re just out of practice.)

What about g(b)? Since I don’t know what f(b) is, I can’t tell you what specific number it is. But I can tell you it’s not a positive number. The reasoning is just like above: f(b) is some number on the interval [a, b]. So the biggest number f(b) can equal is b. And in that case f(b) minus b is zero. If f(b) is any smaller than b, then f(b) minus b is negative. So g(b) is either zero or a negative number.

(Smiling at this? Good job. If you aren’t, again, not to worry. This sort of argument is not the kind of thing you do in Boring Algebra. It takes time and practice to think this way.)

And now the Intermediate Value Theorem works. g(a) is a positive number. g(b) is a negative number. g is continuous from a to b. Therefore, there must be some number ‘c’, between a and b, for which g(c) equals zero. And remember what g(c) means: f(c) – c equals 0. Therefore f(c) has to equal c. There has to be a fixed point.

And some tidying up. Like I said, g(a) might be positive. It might also be zero. But if g(a) is zero, then f(a) – a = 0. So a would be a fixed point. And similarly if g(b) is zero, then f(b) – b = 0. So then b would be a fixed point. The important thing is there must be at least some fixed point.
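For anyone who likes to see the whole argument in one place, here it is in symbols. Nothing here is new; it is the same chain of steps as the paragraphs above, with the tidying-up cases folded in by using “at least” and “at most” instead of strictly positive and negative.

```latex
\begin{aligned}
g(x) &= f(x) - x, \qquad x \in [a, b] \\
g(a) &= f(a) - a \ge 0 \quad \text{since } f(a) \ge a \\
g(b) &= f(b) - b \le 0 \quad \text{since } f(b) \le b \\
&\Rightarrow\ \text{there is a } c \in [a, b] \text{ with } g(c) = 0, \text{ that is, } f(c) = c
\end{aligned}
```

The last line is the Intermediate Value Theorem applied to the continuous function g when g(a) and g(b) are both nonzero. When one of them is zero, that endpoint is itself the fixed point.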

Now that calculator play starts taking on purposeful shape. Squaring a number could find a fixed point only if you started with a number from -1 to 1. The square of a number outside this range, such as ‘2’, would be bigger than you started with, and the Fixed Point Theorem doesn’t apply. Similarly with exponentials. But square roots? The square root of any number from 0 to a positive number ‘b’ is a number between 0 and ‘b’, at least as long as b was bigger than 1. So there was a fixed point, at 1. The cosine of a real number is some number between -1 and 1, and the cosines of all the numbers between -1 and 1 are themselves between -1 and 1. The Fixed Point Theorem applies. Tangent isn’t a continuous function. And the calculator play never settles on anything.

As with the Intermediate Value Theorem, this is an existence proof. It guarantees there is a fixed point. It doesn’t tell us how to find one. Calculator play does, though. Start from any old number that looks promising and work out f for that number. Then take that and put it back into f. And again. And again. This is known as “fixed point iteration”. It won’t give you the exact answer.

Not usually, anyway. In some freak cases it will. But what it will give, provided some extra conditions are satisfied, is a sequence of values that get closer and closer to the fixed point. When you’re close enough, then you stop calculating. How do you know you’re close enough? If you know something about the original f you can work out some logically rigorous estimates. Or you just keep calculating until all the decimal points you want stop changing between iterations. That’s not logically sound, but it’s easy to program.

That won’t always work. It’ll only work if the function f is differentiable on the interval (a, b). That is, it can’t have corners. And there have to be limits on how fast the function changes on the interval (a, b). If the function changes too fast, iteration can’t be guaranteed to work. But often if we’re interested in a function at all then these conditions will be true, or we can think of a related function for which they are true.

And even if it works it won’t always work well. It can take an enormous pile of calculations to get near the fixed point. But this is why we have computers, and why we can leave them to work overnight.

And yet such a simple idea works. It appears in ancient times, in a formula for finding the square root of an arbitrary positive number ‘N’. (Find the fixed point for $f(x) = \frac{1}{2}\left(\frac{N}{x} + x\right)$). It creeps into problems that don’t look like fixed points. Calculus students learn of something called the Newton-Raphson Iteration. It finds roots, points where a function f(x) equals zero. Mathematics majors learn of numerical methods to solve ordinary differential equations. The most stable of these are again fixed-point iteration schemes, albeit in disguise.
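The square root formula makes a nice small program. This is a sketch, not a library-grade routine: the name `babylonian_sqrt`, the starting guess of 1, the tolerance, and the cap on steps are all my own assumptions, and it expects a positive N and a positive starting guess. It stops the lazy way, when the digits stop changing between iterations.

```python
import math


def babylonian_sqrt(N: float, x: float = 1.0, tol: float = 1e-12, max_steps: int = 100) -> float:
    """Fixed point iteration on f(x) = (N / x + x) / 2, for positive N and x."""
    for _ in range(max_steps):
        nxt = 0.5 * (N / x + x)
        # Stop once the relative change between iterations is negligible.
        if abs(nxt - x) <= tol * nxt:
            return nxt
        x = nxt
    return x  # best estimate if the step cap was reached


for N in (2.0, 10.0, 1e6):
    print(N, babylonian_sqrt(N), math.sqrt(N))
```

Why is the square root a fixed point? If $x = \frac{1}{2}\left(\frac{N}{x} + x\right)$ then $2x = \frac{N}{x} + x$, so $x = \frac{N}{x}$, and $x^2 = N$.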

Theorem Thursday: The Intermediate Value Theorem

I am still taking requests for this Theorem Thursdays sequence. I intend to post each Thursday in June and July an essay talking about some theorem and what it means and why it’s important. I have gotten a couple of requests in, but I’m happy to take more; please just give me a little lead time. But I want to start with one that delights me.

The Intermediate Value Theorem

I own a Scion tC. It’s a pleasant car, about 2400 percent more sporty than I am in real life. I got it because it met my most important criteria: it wasn’t expensive and it had a sun roof. That it looks stylish is an unsought bonus.

But being a car, and a black one at that, it has a common problem. Leave it parked a while, then get inside. In the winter, it gets so cold that snow can fall inside it. In the summer, it gets so hot that the interior, never mind the passengers, risks melting. While pondering this slight inconvenience I wondered, isn’t there any outside temperature that leaves my car comfortable?

Of course there is. We know this before thinking about it. The sun heats the car, yes. When the outside temperature is low enough, there’s enough heat flowing out that the car gets cold. When the outside temperature’s high enough, not enough heat flows out. The car stays warm. There must be some middle temperature where just enough heat flows out that the interior doesn’t get particularly warm or cold. Not just one middle temperature, come to that. There is a range of temperatures that are comfortable to sit in. But that just means there’s a range of outside temperatures for which the car’s interior stays comfortable. We know this range as late April, early May, here. Most years, anyway.

The reasoning that lets us know there is a comfort-producing outside temperature we can see as a use of the Intermediate Value Theorem. It addresses a function f with domain [a, b], and range of the real numbers. The domain is closed; that is, the numbers we call ‘a’ and ‘b’ are both in the set. And f has to be a continuous function. If you want to draw it, you can do so without having to lift pen from paper. (WARNING: Do not attempt to pass your Real Analysis course with that definition. But that’s what the proper definition means.)

So look at the numbers f(a) and f(b). Pick some number between them, and I’ll call that number ‘g’. There must be at least one number ‘c’, that’s between ‘a’ and ‘b’, and for which f(c) equals g.

Bernard Bolzano, an early-19th century mathematician/logician/theologist/priest, gets the credit for first proving this theorem. Bolzano’s version was a little different. It supposes that f(a) and f(b) are of opposite sign. That is, f(a) is a positive and f(b) a negative number. Or f(a) is negative and f(b) is positive. And Bolzano’s theorem says there must be some number ‘c’ for which f(c) is zero.

You can prove this by drawing any wiggly curve at all and then a horizontal line in the middle of it. Well, that doesn’t prove it to a mathematician’s satisfaction. But it will prove the matter in the sense that you’ll be convinced. It’ll also convince anyone you try explaining this to.

You might wonder why anyone needed this proved at all. It’s a bit like proving that as you pour water into the sink there’ll come a time the last dish gets covered with water. So it is. The need for a proof came about from the ongoing attempt to make mathematics rigorous. We have an intuitive idea of what it means for functions to be continuous; see my above comment about lifting pens from paper. Can that be put in terms that don’t depend on physical intuition? … Yes, it can. And we can divorce the Intermediate Value Theorem from our physical intuitions. We can know something that’s true even if we never see a car or a sink.

This theorem might leave you feeling a little hollow inside. Proving that there is some ‘c’ for which f(c) equals g, or even equals zero, doesn’t seem to tell us much about how to find it. It doesn’t even tell us that there’s only one ‘c’, rather than two or three or a hundred million candidates that meet our criteria. Fair enough. The Intermediate Value Theorem is more about proving the existence of solutions, rather than how to find them.

But knowing there is a solution can help us find it. The Intermediate Value Theorem as we know it grew out of finding roots for polynomials. One numerical method, easy to set up for any problem, is the bisection method. If you know that somewhere between ‘a’ and ‘b’ the function goes from positive to negative, then find the midpoint, ‘c’. The function is equal to zero either between ‘a’ and ‘c’, or between ‘c’ and ‘b’. Pick the side that it’s on, the one whose endpoints still give values of opposite sign, and bisect that. Pick the half of that which the zero must be in. Bisect that half. And repeat until you get close enough to the answer for your needs. (The same reasoning applies to a lot of problems in which you divide the search range in two each time until the answer appears.)
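The bisection recipe is short enough to sketch in Python. This is a minimal illustration, not a polished root-finder: the name bisect_root, the tolerance, the iteration cap, and the sample function x**3 - 2 are all assumptions made for the example.

```python
def bisect_root(f, a, b, tol=1e-10, max_iter=200):
    """Approximate a zero of a continuous f on [a, b].

    Assumes f(a) and f(b) have opposite signs, so the Intermediate
    Value Theorem promises at least one zero in between.
    """
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2          # midpoint of the current interval
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:
            return c
        # Keep the half whose endpoints still have opposite signs.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return (a + b) / 2


# Hypothetical example: the cube root of 2 is the zero of x**3 - 2 on [0, 2].
root = bisect_root(lambda x: x**3 - 2, 0.0, 2.0)
print(root)
```

Each pass halves the interval, so the uncertainty shrinks by a factor of two per step. That is slow compared to fancier methods, but it cannot fail as long as the function is continuous and the starting signs differ.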

We can get some pretty heady results from the Intermediate Value Theorem, too, even if we don’t know where any of them are. An example you’ll see everywhere is that there must be spots on the opposite sides of the globe with the exact same temperature. Or humidity, or daily rainfall, or any other quantity like that. I had thought everyone was ripping that example off from Richard Courant and Herbert Robbins’s masterpiece What Is Mathematics?. But I can’t find this particular example in there. I wonder what we are all ripping it off from.
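Here is a minimal Python sketch of why that has to happen, at least around a single circle such as the equator. The temperature profile is made up purely for illustration, and the names temperature, difference, and find_zero are hypothetical. The idea is to look at the temperature at longitude theta minus the temperature at the opposite point. Going halfway around swaps the two points, so that difference flips sign, and a continuous function that flips sign has to pass through zero.

```python
import math

def temperature(theta):
    """Made-up continuous temperature (deg C) at longitude theta, in radians."""
    return (15 + 8 * math.sin(theta + 0.7)
            + 3 * math.cos(2 * theta)
            + 2 * math.sin(3 * theta + 0.2))

def difference(theta):
    """Temperature at theta minus temperature at the opposite point."""
    return temperature(theta) - temperature(theta + math.pi)

def find_zero(f, lo, hi, tol=1e-12):
    """Bisection; assumes f is continuous and f(lo), f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # sign change is in the lower half
        else:
            lo, f_lo = mid, f_mid    # sign change is in the upper half
    return (lo + hi) / 2

# difference(pi) is the negative of difference(0), so [0, pi] brackets a zero.
theta = find_zero(difference, 0.0, math.pi)
print(theta, temperature(theta), temperature(theta + math.pi))
```

The two printed temperatures should agree to many decimal places. Swapping in any other continuous, periodic profile should work the same way, since the sign flip comes from the geometry and not from the particular formula.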

So here’s a neat example that is ripped off from them. Draw two blobs on the plane. Is there a straight line that bisects both of them at once? Bisecting here means there’s exactly as much of one blob on one side of the line as on the other. There certainly is. The trick is there are any number of lines that will bisect one blob, and then look at what that does to the other.

A similar ripped-off result you can do with a single blob of any shape you like. Draw any line that bisects it. There are a lot of candidates. Can you draw a line perpendicular to that so that the blob gets quartered, divided into four spots of equal area? Yes. Try it.

But surely the best use of the Intermediate Value Theorem is in the problem of wobbly tables. If the table has four legs, all the same length, and the problem is the floor isn’t level it’s all right. There is some way to adjust the table so it won’t wobble. (Well, the ground can’t be angled more than a bit over 35 degrees, but that’s all right. If the ground has a 35 degree angle you aren’t setting a table on it. You’re rolling down it.) Finally a mathematical proof can save us from despair!

Except that the proof doesn’t work if the table legs are uneven which, alas, they often are. But we can’t get everything.

Courant and Robbins put forth one more example that’s fantastic, although it doesn’t quite work. But it’s a train problem unlike those you’ve seen before. Let me give it to you as they set it out:

Suppose a train travels from station A to station B along a straight section of track. The journey need not be of uniform speed or acceleration. The train may act in any manner, speeding up, slowing down, coming to a halt, or even backing up for a while, before reaching B. But the exact motion of the train is supposed to be known in advance; that is, the function s = f(t) is given, where s is the distance of the train from station A, and t is the time, measured from the instant of departure.

On the floor of one of the cars a rod is pivoted so that it may move without friction either forward or backward until it touches the floor. If it does touch the floor, we assume that it remains on the floor henceforth; this will be the case if the rod does not bounce.

Is it possible to place the rod in such a position that, if it is released at the instant when the train starts and allowed to move solely under the influence of gravity and the motion of the train, it will not fall to the floor during the entire journey from A to B?

They argue it is possible, and use the Intermediate Value Theorem to show it. They admit the range of angles it’s safe to start the rod from may be too small to be useful.

But they’re not quite right. Ian Stewart, in the revision of What Is Mathematics?, includes an appendix about this. Stewart credits Tim Poston with pointing out, in 1976, the flaw. It’s possible to imagine a path which causes the rod, from one angle, to just graze tipping over, let’s say forward, and then get yanked back and fall over flat backwards. This would leave no room for any starting angles that avoid falling over entirely.

It’s a subtle flaw. You might expect so. Nobody mentioned it between the book’s original publication in 1941, after which everyone liking mathematics read it, and 1976. And it is one that touches on the complications of spaces. This little Intermediate Value Theorem problem draws us close to chaos theory. It’s one of those ideas that weaves through all mathematics.

Things To Be Thankful For

A couple buildings around town have blackboard paint and a writing prompt on the walls. Here’s one my love and I wandered across the other day while going to Fabiano’s Chocolate for the obvious reason. (The reason was to see their novelty three-foot-tall, 75-pound solid chocolate bunny. Also to buy less huge piles of candy.)

I recognized that mathematics majors had been past. Well, anyone with an interest in popular mathematics might have written they’re grateful for “G. Cantor”. His work’s escaped into the popular imagination, at least a bit. “C. Weirstrauβ”, though, that’s a mathematics major at work.

Karl Weierstrass, the way his name’s rendered in the English-language mathematics books I know, was one of the people who made analysis what it is today. Analysis is, at heart, the study of why calculus works. He attacked the foundations of calculus, which by modern standards weren’t quite rigorous. And he did brilliantly, giving us the modern standards of rigor. He’s terrified generations of mathematics majors by defining what it is for a function to be continuous. Roughly, it means we can draw the graph of a function without having to lift a pencil. He put it in a non-rough manner. He also developed the precise modern idea for what a limit is. Roughly, a limit is exactly what you might think it means; but to be precise takes genius.

Among Weierstrass’s students was Georg Cantor. His is a more familiar name. He proved that just because a set has infinitely many elements in it doesn’t mean that it can’t be quite small compared to other infinitely large sets. His Diagonal Argument shows there must be, in a sense, more real numbers than there are counting numbers. And a child can understand it. Cantor also pioneered the modern idea of set theory. For a while this looked like it might be the best way to understand why arithmetic works like it does. (My understanding is it’s now thought that category theory is more fundamental. But I don’t know category theory well enough to have an informed opinion.)

The person grateful to Michigan State University basketball I assume wrote that before last Sunday, when the school wrecked so many NCAA tournament brackets.

The Set Tour, Part 13: Continuity

I hope we’re all comfortable with the idea of looking at sets of functions. If not we can maybe get comfortable soon. What’s important about functions is that we can add them together, and we can multiply them by real numbers. They work in important ways like regular old numbers would. They also work the way vectors do. So all we have to do is be comfortable with vectors. Then we have the background to talk about functions this way. And so, my first example of an oft-used set of functions:

C[a, b]

People like continuity. It’s comfortable. It’s reassuring, even. Most situations, most days, most things are pretty much like they were before, and that’s how we want it. Oh, we cast some hosannas towards the people who disrupt the steady progression of stuff. But we’re lying. Think of the worst days of your life. They were the ones that were very much not like the day before. If the day is discontinuous enough, then afterwards, people ask one another what they were doing when the discontinuous thing happened.

(OK, there are some good days which are very much not like the day before. But imagine someone who seems informed assures you that tomorrow will completely change your world. Do you feel anticipation or dread?)

Mathematical continuity isn’t so fraught with social implications. What we mean by a continuous function is — well, skip the precise definition. Calculus I students see it, stare at it, and run away. It comes back to the mathematics majors in Intro to Real Analysis. Then it comes back again in Real Analysis. Mathematics majors get to accepting it sometime around Real Analysis II, because the alternative is Functional Analysis. The definition’s in truth not so bad. But it’s fussy and if you get any parts wrong silly consequences follow.

If you’re not a mathematics major, or if you’re a mathematics major not taking a test in Real Analysis, you can get away with this. We’re talking here, and we’re going to keep talking, about functions with real numbers as the domain and real numbers as the range. Later, we can go to complex-valued numbers, or even vectors of numbers. The arguments get a bit longer but don’t change much, so if you learn this you’ve got most of the way to learning everything.

A continuous function is one whose graph you can draw without having to lift your pen. We like continuous functions, mathematically, because they are so much easier to work with. Why are they easy? Well, because if you know the value of your function at one point, you know approximately what it is at nearby points. There’s predictability to the function’s values. You can see why this would make it easier to do calculations. But it makes analysis easy too. We want to do a lot of proofs which involve arithmetic with the values functions have. It gets so much easier that we can say the function’s actual value is something like the value it has at some point we happen to know.

So if we want to work with functions, we usually want to work with continuous functions. They behave more predictably, and more like we hope they will.

The set C[a, b] is the set of all continuous real-valued functions whose domain is the set of real numbers from a to b. For example, pick a function that’s in C[-1, 1]. Let me call it f. Then f is a real-valued function. And its domain is the real numbers from -1 to 1. In the absence of other information about what its range is, we assume it to be the real numbers R. We can have any real numbers as the boundaries; C[-1000, π] is legitimate if eccentric.

There are some domains that are particularly popular. All the real numbers is one. That might get written C(R) for shorthand. C[0, 1], the interval from 0 to 1, is popular and easy to work with. C[-1, 1] is almost as good and has the advantage of giving us negative numbers. C[-π, π] is also liked because it meshes well with the trigonometric functions. You remember those: sines and cosines and tangent functions, plus some unpopular ones we try to not talk about. We don’t often talk about other domains. We can change, say, C[0, 1] into C[0, 10] exactly the way you’d imagine. Re-scaling numbers, and even shifting them up or down some, requires so little work we don’t bother doing it.
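For anyone who does want to see the little bit of work, here is a sketch in Python. The helper name rescale and the sample function are made up for illustration. It takes a function meant for the interval from a to b and hands back one meant for the interval from c to d, by stretching and shifting the input.

```python
def rescale(f, a, b, c, d):
    """Turn a function on [a, b] into one on [c, d] by a linear change of variable."""
    def g(x):
        if not (c <= x <= d):
            raise ValueError("x is outside the domain [c, d]")
        # Map x in [c, d] back to the matching point t in [a, b].
        t = a + (x - c) * (b - a) / (d - c)
        return f(t)
    return g


# Hypothetical example: squaring, moved from [0, 1] to [0, 10].
square_on_unit = lambda t: t * t
square_on_ten = rescale(square_on_unit, 0.0, 1.0, 0.0, 10.0)
print(square_on_ten(5.0))   # uses the value of the original function at 0.5
```

Since the change of variable is itself continuous, the new function is continuous whenever the old one is, so nothing is lost by sticking to one convenient interval.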

C[-1, 1] is a different set of functions from, say, C[0, 1]. There are many functions in one set that have the same rule as a function in another set. But the functions in C[-1, 1] have a different domain from the functions in C[0, 1]. So they can’t be the same functions. The rule might be meaningful outside the domain. If the rule is “f:x -> 3*x”, well, that makes sense whatever x should be. But a function is the rule, the domain, and the range together. If any of the parts changes, we have a different function.

The way I’ve written the symbols, with straight brackets [a, b], means that both the numbers a and b are in the domain of these functions. If I want to omit the boundaries, so the domain has every number greater than a but not a itself, and every number less than b but not b itself, then we change to parentheses. That would be C(-1, 1). If I want to include one boundary but not the other, use a straight bracket for the boundary to include, and a parenthesis for the boundary to omit. C[-1, 1) says functions in that set have a domain that includes -1 but does not include 1. It also drives my text editor crazy having unmatched parentheses and brackets like that. We must suffer for our mathematical arts.

Jump discontinuity.

Analysis is one of the major subjects in mathematics. That’s the study of functions. These usually have numbers as the domain and the range. The domain and range might be the real numbers, or complex numbers, or they might be sets of real or complex numbers. But they’re all numbers. If you asked for an example of one of these functions you’d get something that looked more or less like a function out of high school.

Continuity is one of the things mathematicians look for in functions. To a mathematician continuity means almost what you’d imagine from the everyday definition of the term. You could draw a sketch of a continuous function without having to lift your pen off the paper. (Typically. If you want to, you can define functions that meet the proper mathematical definition of “continuous” but that you really can’t draw. Mathematicians use these functions to keep one another humble.)

Continuous functions tend to be nice ones to work with. Continuity usually makes it easier to prove a function has whatever other properties you’d like. Mathematicians will even talk about continuous functions as being nice and well-behaved and even normal, as though the functions being easier to work with bestowed on them some moral virtue. However, not every function is continuous. Properly speaking, most functions aren’t continuous. This is the same way that most numbers aren’t whole numbers.

There are different ways that a function can be discontinuous. One of the easiest to understand and to work with is called a jump discontinuity. If you draw a plot representing a function with a jump discontinuity, it looks rather like the plot of a nice, well-behaved, continuous function except that at the discontinuity it jumps. From one side of the discontinuity to the other the function suddenly hops upward, or drops downward.

If a function only has jump discontinuities we aren’t badly off. We can write a function with jump discontinuities as the sum of a continuous function and a function made up only of jumps. The continuous function will be easy to work with, since it’s continuous. The function made of jumps isn’t continuous, by definition, but it’s going to be “flat” — it’ll have the same value in-between any two jumps. That’s usually easy to work with, and while the details of these jump functions will be different they’ll all look about the same. They’ll have different heights and jump up or down at different points, but if you know how to understand a function that jumps from being equal to 0 to being equal to 1 when the input goes from just below to just above 2, then you know how to understand a function that jumps from being equal to 0 to being equal to 3 when the input goes from just below 2.5 to just above 2.5.
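A small Python sketch may make the splitting concrete. Everything here is made up for illustration: the names step, continuous_part, and jump_part, the straight line used as the continuous piece, and the convention that the function takes the upper value right at the jump.

```python
def step(x, at, height):
    """A single jump: 0 below `at`, `height` from `at` onward."""
    return height if x >= at else 0.0

def continuous_part(x):
    """Any continuous function will do; a straight line keeps things simple."""
    return 0.5 * x

def jump_part(x):
    """Flat between jumps: up by 1 at x = 2, then up by 3 at x = 2.5."""
    return step(x, 2.0, 1.0) + step(x, 2.5, 3.0)

def f(x):
    """A function whose only discontinuities are the two jumps."""
    return continuous_part(x) + jump_part(x)


for x in (1.75, 2.0, 2.25, 2.5, 2.75):
    print(x, f(x), jump_part(x))

# The jump of height 3 at 2.5 is the jump of height 1 at 2, shifted and scaled.
same_shape = all(
    step(x, 2.5, 3.0) == 3.0 * step(x - 0.5, 2.0, 1.0)
    for x in (0.0, 2.25, 2.5, 2.75, 4.0)
)
print(same_shape)
```

The last check is the point of the paragraph above: every jump is the same basic step, moved sideways and stretched vertically, so understanding one of them covers the rest.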

This won’t let us work with every function. Most functions are going to be discontinuous in ways that we can’t resolve with jump functions. But a lot of the functions we’re naturally interested in, because they model interesting problems, can be. And so we can divide tricky functions into sets of functions that are easier to deal with.

The Intermediacy That Was Overused

However I may sulk, Chiaroscuro did show off a use of the Intermediate Value Theorem that I wanted to talk about because normally the Intermediate Value Theorem occupies a little spot around Chapter 2, Section 6 of the Intro Calculus textbook and it gets a little attention just before the class moves on to this theorem about there being some point where the derivative equals the slope of a secant line which is very testable and leaves the entire class confused.

The theorem is pretty easy to state, and looks obviously true, which is a danger sign. One bit of mathematics folklore is that the only things one should never try to prove are the false and the obvious. But it’s not hard to prove, at least based on my dim memories of the last time I went through the proof. One incarnation of the theorem, one making it look quite obvious, starts off with a function that takes as its input a real number — since we need a label for it we’ll use the traditional variable name x — and returns as output a real number, possibly a different number. And we have to also suppose that the function is continuous, which means just about what you’d expect from the meaning of “continuous” in ordinary human language. It’s a bit tricky to describe exactly, in mathematical terms, and is where students get hopelessly lost either early in Chapter 2 or early in Chapter 3 of the Intro Calculus textbook. We’ll worry about that later if at all. For us it’s enough to imagine it means you can draw a curve representing the function without having to lift your pen from the paper.

Descartes’ Flies

There are a healthy number of legends about René Descartes. Some of them may be true. I know the one I like is the story that this superlative mathematician, philosopher, and theologian (fields not so sharply differentiated in his time as they are today; for that matter, fields still not perfectly sharply differentiated) was so insistent on sleeping late and sufficiently ingenious in forming arguments that while a student at the Jesuit Collège Royal Henry-Le-Grand he convinced his schoolmasters to let him sleep until 11 am. Supposedly he kept to this rather civilized rising hour until the last months of his life, when he needed to tutor Queen Christina of Sweden in the earliest hours of the winter morning.

I suppose this may be true; it’s certainly repeated often enough, and comes to mind often when I do have to wake to the alarm clock. I haven’t studied Descartes’ biography well enough to know whether to believe it, although as it makes for a charming and humanizing touch probably the whole idea is bunk and we’re fools to believe it. I’m comfortable being a little foolish. (I’ve read just the one book which might be described as even loosely biographic of Descartes — Russell Shorto’s Descartes’ Bones — and so, though I have no particular reason to doubt Shorto’s research and no question with his narrative style, suppose I am marginally worse-informed than if I were completely ignorant. It takes a cluster of books on a subject to know it.)

Place the name “Descartes” into the conversation and a few things pop immediately into mind. Those things are mostly “I think, therefore I am”, and some attempts to compose a joke about being “before the horse”. Running up sometime after that is something called “Cartesian coordinates”, which are about the most famous kind of coordinates and the easiest way to get into the problem of describing just where something is in two- or three-dimensional space.