The End 2016 Mathematics A To Z: Jordan Curve


I realize I used this thing in one of my Theorem Thursday posts but never quite said what it was. Let me fix that.

Jordan Curve

Get a rubber band. Well, maybe you can’t just now, even if you wanted to after I gave orders like that. Imagine a rubber band. I apologize to anyone so offended by my imperious tone that they’re refusing. It’s the convention for pop mathematics or science.

Anyway, take your rubber band. Drop it on a table. Fiddle with it so it hasn’t got any loops in it and it doesn’t twist over itself anywhere. I want the whole of one edge of the band touching the table. You can imagine the table too. That is a Jordan Curve, at least as long as the rubber band hasn’t broken.

This may not look much like a circle. It might be close, but I bet it’s got some wriggles in its curves. Maybe it even curves so much the thing looks more like a kidney bean than a circle. Maybe it pinches so much that it looks like a figure eight, a couple of loops connected by a tiny bridge on the interior. Doesn’t matter. You can bring out the circle. Put your finger inside the rubber band’s loops and spiral your finger around. Do this gently and the rubber band won’t jump off the table. It’ll round out to as perfect a circle as the limitations of matter allow.

And for that matter, if we wanted, we could take a rubber band laid down as a perfect circle. Then nudge it here and push it there and wrinkle it up into as complicated a figure as you like. Either way is as possible.

A Jordan Curve is a closed curve, a curve that loops around back to itself. And it’s simple. That is, it doesn’t cross over itself at any point. However weird and loopy this figure is, as long as it doesn’t cross over itself, it’s got in a sense the same shape as a circle. We can imagine a function that matches every point on a true circle to a point on the Jordan Curve. A set of points in order on the original circle will match to points in the same order on the Jordan Curve. There’s nothing missing and there’s no jumps or ambiguous points. And no point on the Jordan Curve matches to two or more on the original circle. (This is why we don’t let the curve cross over itself.)

When I wrote about the Jordan Curve Theorem it was about how to tell how a curve divides a plane into two pieces, an inside and an outside. You can have some pretty complicated-looking figures. I have an example on the Jordan Curve Theorem essay, but you can make your own by doodling. And we can look at it as a circle, as a rubber band, twisted all around.

This all dips into topology, the study of how shapes connect when we don’t care about distance. But there are simple wondrous things to find about them. For example. Draw a Jordan Curve, please. Any that you like. Now draw a triangle. Again, any that you like.

There is some trio of points in your Jordan Curve which connect to a triangle the same shape as the one you drew. It may be bigger than your triangle, or smaller. But it’ll look similar. The angles inside will all be the same as the ones you started with. This should help make doodling during a dull meeting even more exciting.

There may be four points on your Jordan Curve that make a square. I don’t know. Nobody knows for sure. There certainly are if your curve is convex, that is, if no line between any two points on the curve goes outside the curve. And it’s true even for curves that aren’t convex if they are smooth enough. But generally? For an arbitrary curve? We don’t know. It might be true. It might be impossible to find a square in some Jordan Curve. It might be the Jordan Curve you drew. Good luck looking.

Theorem Thursday: The Jordan Curve Theorem


There are many theorems that you have to get fairly far into mathematics to even hear of. Often they involve things that are so abstract and abstruse that it’s hard to parse just what we’re studying. This week’s entry is not one of them.

The Jordan Curve Theorem.

There are a couple of ways to write this. I’m going to fall back on the version that Richard Courant and Herbert Robbins put in the great book What Is Mathematics?. It’s a theorem in the field of topology, the study of how shapes interact. In particular it’s about simple, closed curves on a plane. A curve is just what you figure it should be. It’s closed if it … uh … closes, makes a complete loop. It’s simple if it doesn’t cross itself or have any disconnected bits. So, something you could draw without lifting pencil from paper and without crossing back over yourself. Have all that? Good. Here’s the theorem:

A simple closed curve in the plane divides that plane into exactly two domains, an inside and an outside.

It’s named for Camille Jordan, a French mathematician who lived from 1838 to 1922, and who’s renowned for work in group theory and topology. It’s a different Jordan from the one named in Gauss-Jordan Elimination, which is a matrix thing that’s important but tedious. It’s also a different Jordan from Jordan Algebras, which I remember hearing about somewhere.

The Jordan Curve Theorem is proved by reading its proposition and then saying, “Duh”. This is compelling, although it lacks rigor. It’s obvious if your curve is a circle, or a slightly squished circle, or a rectangle or something like that. It’s less obvious if your curve is a complicated labyrinth-type shape.

A labyrinth drawn in straight and slightly looped lines.
A generic complicated maze shape. Can you pick out which part is the inside and which the outside? Pretend you don’t notice that little peninsula thing in the upper right corner. I didn’t mean the line to overlap itself but I was using too thick a brush in ArtRage and didn’t notice before I’d exported the image.

It gets downright hard if the curve has a lot of corners. This is why a completely satisfying rigorous proof took decades to find. There are curves that are nowhere differentiable, that are nothing but corners, and those are hard to deal with. If you think there’s no such thing, then remember the Koch Snowflake. That’s that triangle sticking up from the middle of a straight line, that itself has triangles sticking up in the middle of its straight lines, and littler triangles still sticking up from the straight lines. Carry that on forever and you have a shape that’s continuous but always changing direction, and this is hard to deal with.

Still, you can have a good bit of fun drawing a complicated figure, then picking a point and trying to work out whether it’s inside or outside the curve. The challenging way to do that is to view your figure as a maze and look for a path leading outside. The easy way is to draw a new line. I recommend doing that in a different color.

In particular, draw a line from your target point to the outside. Some definitely outside point. You need the line to not be parallel to any of the curve’s line segments. And it’s easier if you don’t happen to intersect any vertices, but if you must, we’ll deal with that two paragraphs down.

A dot with a testing line that crosses the labyrinth curve six times, and therefore is outside the curve.
A red dot that turns out to be outside the labyrinth, based on the number of times the testing line, in blue, crosses the curve. I learned doing this that I should have drawn the dot and blue line first and then fit a curve around it so I wouldn’t have to work so hard to find one lousy point and line segment that didn’t have some problems.

So draw your testing line here from the point to something definitely outside. And count how many times your testing line crosses the original curve. If the testing line crosses the original curve an even number of times then the original point was outside the curve. If the testing line crosses the original an odd number of times then the original point was inside of the curve. Done.

If your testing line touches a vertex, well, then it gets fussy. It depends whether the two edges of the curve that go into that vertex stay on the same side as your testing line. If the original curve’s edges stay on the same side of your testing line, then don’t count that as a crossing. If the edges go on opposite sides of the testing line, then that does count as one crossing. With that in mind, carry on like you did before. An even number of crossings means your point was outside. An odd number of crossings means your point was inside.

The testing line touches a corner of the curve. The curve comes up to and goes away from the same side as the testing line.
This? Doesn’t count as the blue testing line crossing the black curve.

The testing line touches a corner of the curve. The curve crosses over, with legs on either side of the testing line at that point.
This? This counts as the blue testing line crossing the black curve.
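If you would rather make a computer do the counting, here is a minimal sketch of the crossing test for a polygon. It assumes the curve is given as a list of (x, y) corners in order, and it always shoots the testing line horizontally to the right. The function name `point_in_polygon` is mine, made up for illustration. The trick of treating a corner that sits exactly on the testing line as if it were just below the line is one common way to get the same answers as the corner rule above: two edges on the same side of the line count zero or two times, and two edges on opposite sides count once.

```python
def point_in_polygon(point, vertices):
    """Even-odd test: cast a ray from `point` toward +x and count edge crossings.

    `vertices` is a list of (x, y) corners of a simple polygon, in order.
    Returns True when the number of crossings is odd (the point is inside).
    """
    px, py = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the curve
        # The edge can only cross the testing line if its endpoints are on
        # opposite sides of it. A corner exactly on the line counts as "below".
        if (y1 > py) != (y2 > py):
            # x-coordinate where this edge meets the horizontal testing line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:  # only crossings to the right of the point count
                inside = not inside
    return inside


# A C-shaped hexagon-ish figure with a notch cut into its right side.
c_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (4, 3), (4, 4), (0, 4)]
in_the_bar = point_in_polygon((0.5, 2), c_shape)    # inside the left bar
in_the_notch = point_in_polygon((2, 2), c_shape)    # in the notch, so outside
grazing_corners = point_in_polygon((-1, 3), c_shape)  # ray runs along an edge
```

Because horizontal edges never satisfy the opposite-sides check, this particular sketch tolerates a testing line that runs along an edge. That is a convenience of this bookkeeping, not something the pencil-and-paper version gives you for free.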

So go ahead and do this a couple times with a few labyrinths and sample points. It’s fun and elevates your doodling to the heights of 19th-century mathematics. Also once you’ve done that a couple times you’ve proved the Jordan curve theorem.

Well, no, not quite. But you are most of the way to proving it for a special case. If the curve is a polygon, a shape made up of a finite number of line segments, then you’ve got almost all the proof done. You have to finish it off by choosing a ray, a direction, that isn’t parallel to any of the polygon’s line segments. (This is one reason this method only works for polygons, and fails for stuff like the Koch Snowflake. It also doesn’t work well with space-filling curves, which are things that exist. Yes, those are what they sound like: lines that squiggle around so much they fill up area. Some can fill volume. I swear. It’s fractal stuff.) Imagine all the lines that are parallel to that ray. Take them one at a time. On each line there’s definitely some point that’s outside the curve. You’ll need that for reference. Classify all the points on that line by whether there’s an even or an odd number of crossings between a starting point and your reference definitely-outside point. Keep doing that for all these many parallel lines.

And that’s it. The mess of points that have an odd number of intersections are the inside. The mess of points that have an even number of intersections are the outside.

You won’t be surprised to know there’s versions of the Jordan curve theorem for solid objects in three-dimensional space. And for hyperdimensional spaces too. You can always work out an inside and an outside, as long as space isn’t being all weird. But it might sound like it’s not much of a theorem. So you can work out an inside and an outside; so what?

But it’s one of those great utility theorems. It pops in to places, the perfect tool for a problem you were just starting to notice existed. If I can get my rhetoric organized I hope to show that off next week, when I figure to do the Five-Color Map Theorem.

Theorem Thursday: A First Fixed Point Theorem


I’m going to let the Mean Value Theorem slide a while. I feel more like a Fixed Point Theorem today. As with the Mean Value Theorem there’s several of these. Here I’ll start with an easy one.

The Fixed Point Theorem.

Back when the world and I were young I would play with electronic calculators. They encouraged play. They made it so easy to enter a number and hit an operation, and then hit that operation again, and again and again. Patterns appeared. Start with, say, ‘2’ and hit the ‘squared’ button, the smaller ‘2’ raised up from the key’s baseline. You got 4. And again: 16. And again: 256. And again and again and you got ever-huger numbers. This happened whenever you started from a number bigger than 1. Start from something smaller than 1, however tiny, and it dwindled down to zero, whatever you tried. Start at ‘1’ and it just stays there. The results were similar if you started with negative numbers. The first squaring put you in positive numbers and everything carried on as before.

This sort of thing happened a lot. Keep hitting the mysterious ‘exp’ and the numbers would keep growing forever. Keep hitting ‘sqrt’; if you started above 1, the numbers dwindled to 1. Start below and the numbers rise to 1. Or you started at zero, but who’s boring enough to do that? ‘log’ would start with positive numbers and keep dropping until it turned into a negative number. The next step was the calculator’s protest we were unleashing madness on the world.

But you didn’t always get zero, one, infinity, or madness, from repeatedly hitting the calculator button. Sometimes, some functions, you’d get an interesting number. If you picked any old number and hit cosine over and over the digits would eventually settle down to around 0.739085. Cosine’s great. Tangent … tangent is weird. Tangent does all sorts of bizarre stuff. But at least cosine is there, giving us this interesting number.

(Something you might wonder: this is the cosine of an angle measured in radians, which is how mathematicians naturally think of angles. Normal people measure angles in degrees, and that will have a different fixed point. We write both the cosine-in-radians and the cosine-in-degrees using the shorthand ‘cos’. We get away with this because people who are confused by this are too embarrassed to call us out on it. If we’re thoughtful we write, say, ‘cos x’ for radians and ‘cos x°’ for degrees. This makes the difference obvious. It doesn’t really, but at least we gave some hint to the reader.)
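If your calculator is in a drawer somewhere, the button-mashing is easy to imitate. This is a small sketch, with a helper name (`press_repeatedly`) I made up for the purpose. It assumes one hundred presses is plenty, which it is for cosine but would not be for every function.

```python
import math


def press_repeatedly(func, start, presses):
    """Apply `func` to `start`, then to the result, and so on, `presses` times."""
    x = start
    for _ in range(presses):
        x = func(x)
    return x


# Cosine with the angle in radians, from a couple of different starting numbers.
from_two = press_repeatedly(math.cos, 2.0, 100)
from_minus_three = press_repeatedly(math.cos, -3.0, 100)

# Cosine with the angle in degrees: convert to radians before calling math.cos.
cos_degrees = lambda x: math.cos(math.radians(x))
degrees_result = press_repeatedly(cos_degrees, 2.0, 100)

print(from_two, from_minus_three, degrees_result)
```

The two radian runs land on the same number, and the degree run lands on a different one, a shade under 1.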

This all is an example of a fixed point theorem. Fixed point theorems turn up in a lot of fields. They were most impressed upon me in dynamical systems, studying how a complex system changes in time. A fixed point, for these problems, is an equilibrium. It’s where things aren’t changed by a process. You can see where that’s interesting.

In this series I haven’t stated theorems exactly much, and I haven’t given them real proofs. But this is an easy one to state and to prove. Start off with a function, which I’ll name ‘f’, because yes that is exactly how much effort goes in to naming functions. It has as a domain the interval [a, b] for some real numbers ‘a’ and ‘b’. And it has as range the same interval, [a, b]. It might use the whole range; it might use only a subset of it. And we have to require that f is continuous.

Then there has to be at least one fixed point. There must be at least one number ‘c’, somewhere in the interval [a, b], for which f(c) equals c. There may be more than one; we don’t say anything about how many there are. And it can happen that c is equal to a. Or that c equals b. We don’t know that it is or that it isn’t. We just know there’s at least one ‘c’ that makes f(c) equal c.

You get that in my various examples. If the function f has the rule that any given x is matched to x^2, then we do get two fixed points: f(0) = 0^2 = 0, and, f(1) = 1^2 = 1. Or if f has the rule that any given x is matched to the square root of x, then again we have: f(0) = \sqrt{0} = 0 and f(1) = \sqrt{1} = 1 . Same old boring fixed points. The cosine is a little more interesting. For that we have f(0.739085...) = \cos\left(0.739085...\right) = 0.739085... .

How to prove it? The easiest way I know is to summon the Intermediate Value Theorem. Since I wrote a couple hundred words about that a few weeks ago I can assume you to understand it perfectly and have no question about how it makes this problem easy. I don’t even need to go on, do I?

… Yeah, fair enough. Well, here’s how to do it. We’ll take the original function f and create, based on it, a new function. We’ll dig deep in the alphabet and name that ‘g’. It has the same domain as f, [a, b]. Its range is … oh, well, something in the real numbers. Don’t care. The wonder comes from the rule we use.

The rule for ‘g’ is this: match the given number ‘x’ with the number ‘f(x) – x’. That is, g(a) equals whatever f(a) would be, minus a. g(b) equals whatever f(b) would be, minus b. We’re allowed to define a function in terms of some other function, as long as the symbols are meaningful. But we aren’t doing anything wrong like dividing by zero or taking the logarithm of a negative number or asking for f where it isn’t defined.

You might protest that we don’t know what the rule for f is. We’re told there is one, and that it’s a continuous function, but nothing more. So how can I say I’ve defined g in terms of a function I don’t know?

In the first place, I already know everything about f that I need to. I know it’s a continuous function defined on the interval [a, b]. I won’t use any more than that about it. And that’s great. A theorem that doesn’t require knowing much about a function is one that applies to more functions. It’s like the difference between being able to say something true of all living things in North America, and being able to say something true of all persons born in Redbank, New Jersey, on the 18th of February, 1944, who are presently between 68 and 70 inches tall and working on their rock operas. Both things may be true, but one of those things you probably use more.

In the second place, suppose I gave you a specific rule for f. Let me say, oh, f matches x with the arccosecant of x. Are you feeling any more enlightened now? Didn’t think so.

Back to g. Here’s some things we can say for sure about it. g is a function defined on the interval [a, b]. That’s how we set it up. Next point: g is a continuous function on the interval [a, b]. Remember, g is just the function f, which was continuous, minus x, which is also continuous. The difference of two continuous functions is still going to be continuous. (This is obvious, although it may take some considered thinking to realize why it is obvious.)

Now some interesting stuff. What is g(a)? Well, it’s whatever number f(a) is minus a. I can’t tell you what number that is. But I can tell you this: it’s not negative. Remember that f(a) has to be some number in the interval [a, b]. That is, it’s got to be no smaller than a. So the smallest f(a) can be is equal to a, in which case f(a) minus a is zero. And f(a) might be larger than a, in which case f(a) minus a is positive. So g(a) is either zero or a positive number.

(If you’ve just realized where I’m going and gasped in delight, well done. If you haven’t, don’t worry. You will. You’re just out of practice.)

What about g(b)? Since I don’t know what f(b) is, I can’t tell you what specific number it is. But I can tell you it’s not a positive number. The reasoning is just like above: f(b) is some number on the interval [a, b]. So the biggest number f(b) can equal is b. And in that case f(b) minus b is zero. If f(b) is any smaller than b, then f(b) minus b is negative. So g(b) is either zero or a negative number.

(Smiling at this? Good job. If you aren’t, again, not to worry. This sort of argument is not the kind of thing you do in Boring Algebra. It takes time and practice to think this way.)

And now the Intermediate Value Theorem works. g(a) is a positive number. g(b) is a negative number. g is continuous from a to b. Therefore, there must be some number ‘c’, between a and b, for which g(c) equals zero. And remember what g(c) means: f(c) – c equals 0. Therefore f(c) has to equal c. There has to be a fixed point.

And some tidying up. Like I said, g(a) might be positive. It might also be zero. But if g(a) is zero, then f(a) – a = 0. So a would be a fixed point. And similarly if g(b) is zero, then f(b) – b = 0. So then b would be a fixed point. The important thing is there must be at least some fixed point.
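The proof only promises that c exists. But the sign change in g also suggests one way to hunt for it: keep cutting the interval in half and keep whichever half still has g positive at one end and negative at the other. This is bisection, a technique I am bringing in for illustration; it is not part of the proof. The sketch below assumes f is continuous and sends [a, b] into [a, b], and the name `find_fixed_point` and the tolerance are my own choices.

```python
import math


def find_fixed_point(f, a, b, tolerance=1e-12):
    """Bisect on g(x) = f(x) - x to approximate a point c with f(c) = c.

    Assumes f is continuous on [a, b] and that f(x) stays inside [a, b],
    so that g(a) >= 0 and g(b) <= 0 as in the argument above.
    """
    def g(x):
        return f(x) - x

    # The tidying-up cases: an endpoint may itself be a fixed point.
    if g(a) == 0:
        return a
    if g(b) == 0:
        return b

    lo, hi = a, b  # from here on, g(lo) > 0 and g(hi) < 0
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        value = g(mid)
        if value == 0:
            return mid
        if value > 0:
            lo = mid  # the sign change is in the upper half
        else:
            hi = mid  # the sign change is in the lower half
    return (lo + hi) / 2


# Cosine sends [0, 1] into [0, 1], so the theorem applies on that interval.
c = find_fixed_point(math.cos, 0.0, 1.0)
```

For squaring or square roots on [0, 1] the sketch stops at once, since the endpoint 0 is already a fixed point.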

Now that calculator play starts taking on purposeful shape. Squaring a number could find a fixed point only if you started with a number from -1 to 1. The square of a number outside this range, such as ‘2’, would be bigger than you started with, and the Fixed Point Theorem doesn’t apply. Similarly with exponentials. But square roots? The square root of any number from 0 to a positive number ‘b’ is a number between 0 and ‘b’, at least as long as b was bigger than 1. So there was a fixed point, at 1. The cosine of a real number is some number between -1 and 1, and the cosines of all the numbers between -1 and 1 are themselves between -1 and 1. The Fixed Point Theorem applies. Tangent isn’t a continuous function. And the calculator play never settles on anything.

As with the Intermediate Value Theorem, this is an existence proof. It guarantees there is a fixed point. It doesn’t tell us how to find one. Calculator play does, though. Start from any old number that looks promising and work out f for that number. Then take that and put it back into f. And again. And again. This is known as “fixed point iteration”. It won’t give you the exact answer.

Not usually, anyway. In some freak cases it will. But what it will give, provided some extra conditions are satisfied, is a sequence of values that get closer and closer to the fixed point. When you’re close enough, then you stop calculating. How do you know you’re close enough? If you know something about the original f you can work out some logically rigorous estimates. Or you just keep calculating until all the decimal points you want stop changing between iterations. That’s not logically sound, but it’s easy to program.

That won’t always work. It’ll only work if the function f is differentiable on the interval (a, b). That is, it can’t have corners. And there have to be limits on how fast the function changes on the interval (a, b). If the function changes too fast, iteration can’t be guaranteed to work. But often if we’re interested in a function at all then these conditions will be true, or we can think of a related function for which they are true.

And even if it works it won’t always work well. It can take an enormous pile of calculations to get near the fixed point. But this is why we have computers, and why we can leave them to work overnight.

And yet such a simple idea works. It appears in ancient times, in a formula for finding the square root of an arbitrary positive number ‘N’. (Find the fixed point for f(x) = \frac{1}{2}\left(\frac{N}{x} + x\right) ). It creeps into problems that don’t look like fixed points. Calculus students learn of something called the Newton-Raphson Iteration. It finds roots, points where a function f(x) equals zero. Mathematics majors learn of numerical methods to solve ordinary differential equations. The most stable of these are again fixed-point iteration schemes, albeit in disguise.
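Here is a sketch of that square-root recipe as fixed point iteration, using the easy-to-program stopping rule from before: quit when the digits you care about stop changing. The function names, the choice of ten decimal places, and the starting guess of 1 are all my assumptions, not anything canonical.

```python
def iterate_to_fixed_point(f, start, digits=10, max_steps=1000):
    """Feed f its own output until `digits` decimal places stop changing.

    Returns the last value and how many steps it took. Gives up with an
    error if nothing settles within `max_steps`.
    """
    x = start
    for step in range(1, max_steps + 1):
        next_x = f(x)
        if round(next_x, digits) == round(x, digits):
            return next_x, step
        x = next_x
    raise RuntimeError("the iteration did not settle down")


def square_root(N, start=1.0):
    """Approximate the square root of a positive N as the fixed point of
    f(x) = (N/x + x) / 2."""
    root, _ = iterate_to_fixed_point(lambda x: 0.5 * (N / x + x), start)
    return root


root_of_two = square_root(2.0)
root_of_nine = square_root(9.0)
```

As the text warns, the stopping rule is not logically sound in general. It happens to behave well here because this particular f closes in on its fixed point very quickly.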

They all share this almost playful backbone.

Theorem Thursday: What Is Cramer’s Rule?


KnotTheorist asked for this one during my appeal for theorems to discuss. And I’m taking an open interpretation of what a “theorem” is. I can do a rule.

Cramer’s Rule

I first learned of Cramer’s Rule in the way I expect most people do. It was an algebra course. I mean high school algebra. By high school algebra I mean you spend roughly eight hundred years learning ways to solve for x or to plot y versus x. Then take a pause for polar coordinates and matrices. Then you go back to finding both x and y.

Cramer’s Rule came up in the context of solving simultaneous equations. You have more than one variable. So x and y. Maybe z. Maybe even a w, before whoever set up the problem gives up and renames everything x1 and x2 and x62 and all that. You also have more than one equation. In fact, you have exactly as many equations as you have variables. Are there any sets of values those variables can have which make all those equations true simultaneously? Thus the imaginative name “simultaneous equations” or the search for “simultaneous solutions”.

If all the equations are linear then we can always say whether there’s simultaneous solutions. By “linear” we mean what we always mean in mathematics, which is, “something we can handle”. But more exactly it means the equations have x and y and whatever other variables only to the first power. No x-squared or square roots of y or tangents of z or anything. (The equations are also allowed to omit a variable. That is, if you have one equation with x, y, and z, and another with just x and z, and another with just y and z, that’s fine. We pretend the missing variable is there and just multiplied by zero, and proceed as before.) One way to find these solutions is with Cramer’s Rule.

Cramer’s Rule sets up some matrices based on the system of equations. If the system has two equations, it sets up three matrices. If the system has three equations, it sets up four matrices. If the system has twelve equations, it sets up thirteen matrices. You see the pattern here. And then you can take the determinant of each of these matrices. Dividing the determinant of one of these matrices by another one tells you what value of x makes all the equations true. Dividing the determinant of another matrix by the determinant of one of these matrices tells you what value of y makes all the equations true. And so on. The Rule tells you which determinants to use. It also says what it means if the determinant you want to divide by equals zero. It means there’s either no set of simultaneous solutions or there’s infinitely many solutions.
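For the curious, here is a minimal sketch of the bookkeeping. The matrix you divide by is the one built from the coefficients. The matrix for each variable is that same matrix with the variable’s column swapped out for the right-hand sides of the equations. The function names are mine, the determinant is worked out by cofactor expansion (fine for small matrices, hopeless for big ones), and the sketch assumes whole-number or fraction inputs so the arithmetic stays exact.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by cofactor expansion along the first row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for col in range(n):
        # Strike out the first row and this column to get the minor.
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def cramer_solve(coefficients, constants):
    """Solve a square linear system with Cramer's Rule.

    `coefficients` is a list of rows (lists); `constants` is the list of
    right-hand sides. Uses n + 1 determinants for n equations.
    """
    main_det = determinant(coefficients)
    if main_det == 0:
        raise ValueError("determinant is zero: no solution, or infinitely many")
    solution = []
    for k in range(len(constants)):
        # Replace column k with the right-hand sides.
        replaced = [row[:k] + [constants[i]] + row[k + 1:]
                    for i, row in enumerate(coefficients)]
        solution.append(Fraction(determinant(replaced), main_det))
    return solution


# 2x + y = 5 and x - y = 1
two_by_two = cramer_solve([[2, 1], [1, -1]], [5, 1])

# x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27
# (the missing x in the second equation is a coefficient of zero)
three_by_three = cramer_solve([[1, 1, 1], [0, 2, 5], [2, 5, -1]], [6, -4, 27])
```

The first system gives x = 2, y = 1, and the second gives x = 5, y = 3, z = -2, which you can check by substituting back in.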

This gets dropped on us students in the vain effort to convince us knowing how to calculate determinants is worth it. It’s not that determinants aren’t worth knowing. It’s just that they don’t seem to tell us anything we care about. Not until we get into mappings and calculus and differential equations and other mathematics-major stuff. We never see it in high school.

And the hard part of determinants is that for all the cool stuff they tell us, they take forever to calculate. The determinant for a matrix with two rows and two columns isn’t bad. Three rows and three columns is getting bad. Four rows and four columns is awful. The determinant for a matrix with five rows and five columns you only ever calculate if you’ve made your teacher extremely cross with you.

So there’s the genius and the first problem with Cramer’s Rule. It takes a lot of calculating. Make any errors along the way with the calculation and your work is wrong. And worse, it won’t be wrong in an obvious way. You can find the error only by going over every single step and hoping to catch the spot where you, somehow, got 36 times -7 minus 21 times -8 wrong.

The second problem is nobody in high school algebra mentions why systems of linear equations should be interesting to solve. Oh, maybe they’ll explain how this is the work you do to figure out where two straight lines intersect. But that just shifts the “and we care because … ?” problem back one step. Later on we might come to understand the lines represent cases where something we’re interested in is true, or where it changes from true to false.

This sort of simultaneous-solution problem turns up naturally in optimization problems. These are problems where you try to find a maximum subject to some constraints. Or find a minimum. Maximums and minimums are the same thing when you think about them long enough. If all the constraints can be satisfied at once and you get a maximum (or minimum, whatever), great! If they can’t … Well, you can study how close it’s possible to get, and what happens if you loosen one or more constraints. That’s worth knowing about.

The third problem with Cramer’s Rule is that, as a method, it kind of sucks. We can be convinced that simultaneous linear equations are worth solving, or at least that we have to solve them to get out of High School Algebra. And we have computers. They can grind away and work out thirteen determinants of twelve-row-by-twelve-column matrices. They might even get an answer back before the end of the term. (The amount of work needed for a determinant grows scary fast as the matrix gets bigger.) But all that work might be meaningless.

The trouble is that Cramer’s Rule is numerically unstable. Before I even explain what that is you already sense it’s a bad thing. Think of all the good things in your life you’ve heard described as unstable. Fair enough. But here’s what we mean by numerically unstable.

Is 1/3 equal to 0.3333333? No, and we know that. But is it close enough? Sure, most of the time. Suppose we need a third of sixty million. 0.3333333 times 60,000,000 equals 19,999,998. That’s a little off of the correct 20,000,000. But I bet you wouldn’t even notice the difference if nobody pointed it out to you. Even if you did notice it you might write off the difference. “If we must, make up the difference out of petty cash”, you might declare, as if that were quite sensible in the context.

And that’s so because this multiplication is numerically stable. Make a small error in either term and you get a proportional error in the result. A small mistake will — well, maybe it won’t stay small, necessarily. But it’ll not grow too fast too quickly.

So now you know intuitively what an unstable calculation is. This is one in which a small error doesn’t necessarily stay proportionally small. It might grow huge, arbitrarily huge, and in few calculations. So your answer might be computed just fine, but actually be meaningless.

This isn’t because of a flaw in the computer per se. That is, it’s working as designed. It’s just that we might need, effectively, infinitely many digits of precision for the result to be correct. You see where there may be problems achieving that.

Cramer’s Rule isn’t guaranteed to be nonsense, and that’s a relief. But it is vulnerable to this. You can set up problems that look harmless but which the computer can’t do. And that’s surely the worst of all worlds, since we wouldn’t bother calculating them numerically if it weren’t too hard to do by hand.
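Here is one made-up example of a harmless-looking problem where a small error does not stay small. The two equations are x + y = 2 and x + 1.0001 y = 2.0001. Then I nudge the last number by one part in twenty thousand. The sketch uses exact fractions, so any swing in the answer comes from the nudge and not from rounding inside the computer. Strictly speaking this shows how touchy the problem itself is, since the determinant being divided by is tiny. It is an illustration of the flavor of the trouble, not a full account of it, and the function name is my own.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d, e, f):
    """Solve a*x + b*y = e and c*x + d*y = f by Cramer's Rule, exactly."""
    det = a * d - b * c  # the determinant we divide by
    x = (e * d - b * f) / det
    y = (a * f - e * c) / det
    return x, y


one = Fraction(1)
nearly_one = Fraction("1.0001")

# x + y = 2, x + 1.0001 y = 2.0001
original = solve_2x2(one, one, one, nearly_one, Fraction(2), Fraction("2.0001"))

# Same system, with the last right-hand side nudged from 2.0001 to 2.0002.
nudged = solve_2x2(one, one, one, nearly_one, Fraction(2), Fraction("2.0002"))
```

The original system has x = 1 and y = 1. The nudged one has x = 0 and y = 2. A change in the fifth significant digit of one input moved both answers by a whole unit.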

(Let me direct the reader who’s unintimidated by mathematical jargon, and who likes seeing a good Wikipedia Editors quarrel, to the Cramer’s Rule Talk Page. Specifically to the section “Cramer’s Rule is useless.”)

I don’t want to get too down on Cramer’s Rule. It’s not like the numerical instability hurts every problem you might use it on. And you can, at the cost of some more work, detect whether a particular set of equations will have instabilities. That requires a lot of calculation, but if we have the computer to do the work, fine. Let it. And a computer can limit its numerical instabilities if it can do symbolic manipulations. That is, if it can use the idea of “one-third” rather than 0.3333333. The software package Mathematica, for example, does symbolic manipulations very well. You can shed many numerical-instability problems, although you gain the problem of paying for a copy of Mathematica.
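A free and much humbler taste of the same idea is exact rational arithmetic. This sketch uses Python’s standard `fractions` module as a stand-in. It is not Mathematica and does far less, but it can hold “one-third” as one-third, and it replays the sixty-million example from earlier.

```python
from fractions import Fraction

exact_third = Fraction(1, 3)               # the idea of "one-third"
truncated_third = Fraction("0.3333333")    # the seven-digit stand-in

exact_result = exact_third * 60_000_000
truncated_result = truncated_third * 60_000_000

# How much would have to come out of petty cash.
shortfall = exact_result - truncated_result

print(exact_result, truncated_result, shortfall)
```

The exact version gives 20,000,000 on the nose, and the truncated version comes up 2 short, matching the arithmetic above.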

If you just care about, or just need, one of the variables then what the heck. Cramer’s Rule lets you solve for just one or just some of the variables. That seems like a niche application to me, but it is there.

And the Rule re-emerges in pure analysis, where numerical instability doesn’t matter. When we look to differential equations, for example, we often find solutions are combinations of several independent component functions. Bases, in fact. Testing whether we have found independent bases can be done through a thing called the Wronskian. That’s a way that Cramer’s Rule appears in differential equations.

Wikipedia also asserts the use of Cramer’s Rule in differential geometry. I believe that’s a true statement, and that it will be reflected in many mechanics problems. In these we can use our knowledge that, say, energy and angular momentum of a system are constant values to tell us something of how positions and velocities depend on each other. But I admit I’m not well-read in differential geometry. That’s something which has indeed caused me pain in my scholarly life. I don’t know whether differential geometers thank Cramer’s Rule for this insight or whether they’re just glad to have got all that out of the way. (See the above Wikipedia Editors quarrel.)

I admit for all this talk about Cramer’s Rule I haven’t said what it is. Not in enough detail to pass your high school algebra class. That’s all right. It’s easy to find. MathWorld has the rule in pretty simple form. MathWorld does forget to define what it means by the vector d. (It’s the vector with components d1, d2, et cetera.) But that’s enough technical detail. If you need to calculate something using it, you can probably look closer at the problem and see if you can do it another way instead. Or you’re in high school algebra and just have to slog through it. It’s all right. Eventually you can put x and y aside and do geometry.

Theorem Thursday: The Intermediate Value Theorem


I am still taking requests for this Theorem Thursdays sequence. I intend to post each Thursday in June and July an essay talking about some theorem and what it means and why it’s important. I have gotten a couple of requests in, but I’m happy to take more; please just give me a little lead time. But I want to start with one that delights me.

The Intermediate Value Theorem

I own a Scion tC. It’s a pleasant car, about 2400 percent more sporty than I am in real life. I got it because it met my most important criteria: it wasn’t expensive and it had a sun roof. That it looks stylish is an unsought bonus.

But being a car, and a black one at that, it has a common problem. Leave it parked a while, then get inside. In the winter, it gets so cold that snow can fall inside it. In the summer, it gets so hot that the interior, never mind the passengers, risks melting. While pondering this slight inconvenience I wondered, isn’t there any outside temperature that leaves my car comfortable?

Scion tC covered in snow and ice from a late winter storm.
My Scion tC, here, not too warm.

Of course there is. We know this before thinking about it. The sun heats the car, yes. When the outside temperature is low enough, there’s enough heat flowing out that the car gets cold. When the outside temperature’s high enough, not enough heat flows out. The car stays warm. There must be some middle temperature where just enough heat flows out that the interior doesn’t get particularly warm or cold. Not just one middle temperature, come to that. There is a range of temperatures that are comfortable to sit in. But that just means there’s a range of outside temperatures for which the car’s interior stays comfortable. We know this range as late April, early May, here. Most years, anyway.

The reasoning that lets us know there is a comfort-producing outside temperature we can see as a use of the Intermediate Value Theorem. It addresses a function f with domain [a, b], and range of the real numbers. The domain is closed; that is, the numbers we call ‘a’ and ‘b’ are both in the set. And f has to be a continuous function. If you want to draw it, you can do so without having to lift pen from paper. (WARNING: Do not attempt to pass your Real Analysis course with that definition. But that’s what the proper definition means.)

So look at the numbers f(a) and f(b). Pick some number between them, and I’ll call that number ‘g’. There must be at least one number ‘c’, that’s between ‘a’ and ‘b’, and for which f(c) equals g.

Bernard Bolzano, an early-19th century mathematician/logician/theologian/priest, gets the credit for first proving this theorem. Bolzano’s version was a little different. It supposes that f(a) and f(b) are of opposite sign. That is, f(a) is a positive and f(b) a negative number. Or f(a) is negative and f(b) is positive. And Bolzano’s theorem says there must be some number ‘c’ for which f(c) is zero.

You can prove this by drawing any wiggly curve at all and then a horizontal line in the middle of it. Well, that doesn’t prove it to a mathematician’s satisfaction. But it will prove the matter in the sense that you’ll be convinced. It’ll also convince anyone you try explaining this to.

A generic wiggly function, with vertical lines marking off the domain limits of a and b. Horizontal lines mark off f(a) and f(b), as well as a putative value g. The wiggly function indeed has at least one point for which its value is g.
Any old real-valued function, drawn in blue. The number ‘g’ is something between the numbers f(a) and f(b). And somewhere there’s at least one number, between a and b, for which the function’s equal to g.

You might wonder why anyone needed this proved at all. It’s a bit like proving that as you pour water into the sink there’ll come a time the last dish gets covered with water. So it is. The need for a proof came about from the ongoing attempt to make mathematics rigorous. We have an intuitive idea of what it means for functions to be continuous; see my above comment about lifting pens from paper. Can that be put in terms that don’t depend on physical intuition? … Yes, it can. And we can divorce the Intermediate Value Theorem from our physical intuitions. We can know something that’s true even if we never see a car or a sink.

This theorem might leave you feeling a little hollow inside. Proving that there is some ‘c’ for which f(c) equals g, or even equals zero, doesn’t seem to tell us much about how to find it. It doesn’t even tell us that there’s only one ‘c’, rather than two or three or a hundred million candidates that meet our criteria. Fair enough. The Intermediate Value Theorem is more about proving the existence of solutions, rather than how to find them.

But knowing there is a solution can help us find it. The Intermediate Value Theorem as we know it grew out of finding roots for polynomials. One numerical method, easy to set up for any problem, is the bisection method. If you know that somewhere between ‘a’ and ‘b’ the function goes from positive to negative, then find the midpoint, ‘c’. The function is equal to zero either between ‘a’ and ‘c’, or between ‘c’ and ‘b’. Pick the side that it’s on, and bisect that. Pick the half of that which the zero must be in. Bisect that half. And repeat until you get close enough to the answer for your needs. (The same reasoning applies to a lot of problems in which you divide the search range in two each time until the answer appears.)
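Here is the bisection method as a minimal sketch in Python. The function name bisect, the default tolerance, and the sample function x*x - 2 are my own choices for illustration. The sketch assumes that a is less than b, that f is continuous, and that f(a) and f(b) have opposite signs.

```python
def bisect(f, a, b, tolerance=1e-10):
    """Approximate a zero of a continuous f on [a, b], assuming a < b."""
    fa, fb = f(a), f(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if (fa > 0) == (fb > 0):
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tolerance:
        c = (a + b) / 2  # the midpoint
        fc = f(c)
        if fc == 0:
            return c  # landed exactly on a zero
        if (fc > 0) == (fa > 0):
            # Same sign at a and c, so the sign change is between c and b.
            a, fa = c, fc
        else:
            # Different signs at a and c, so the sign change is between them.
            b = c
    return (a + b) / 2


# The zero of x*x - 2 between 1 and 2 is the square root of 2.
root = bisect(lambda x: x * x - 2, 1.0, 2.0)
print(root)
```

Each pass halves the stretch the zero could be hiding in, so starting from an interval of length 1 it takes about 34 passes to get below the default tolerance. Note that this finds a zero, not necessarily the only one.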

We can get some pretty heady results from the Intermediate Value Theorem, too, even if we don’t know where any of them are. An example you’ll see everywhere is that there must be spots on the opposite sides of the globe with the exact same temperature. Or humidity, or daily rainfall, or any other quantity like that. I had thought everyone was ripping that example off from Richard Courant and Herbert Robbins’s masterpiece What Is Mathematics?. But I can’t find this particular example in there. I wonder what we are all ripping it off from.
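The opposite-sides-of-the-globe claim is easy to try out numerically, at least along one circle such as the equator. What follows is a sketch, not a weather model: the temperature function is entirely made up, and the names temperature, difference, and find_matching_antipodes are mine. The idea is to look at the temperature at a spot minus the temperature at the spot directly opposite. Go halfway around and that difference flips sign, so by the Intermediate Value Theorem it has to pass through zero somewhere. Bisection then hunts the spot down.

```python
import math


def temperature(angle):
    """A made-up temperature around the equator; it repeats every 2*pi."""
    return (20 + 8 * math.sin(angle - 0.7)
            + 3 * math.cos(2 * angle)
            + 2 * math.sin(3 * angle + 0.4))


def difference(angle):
    """Temperature at a spot minus the temperature directly opposite it."""
    return temperature(angle) - temperature(angle + math.pi)


def find_matching_antipodes(tolerance=1e-12):
    """Bisect on [0, pi] for an angle where difference() is zero."""
    low, high = 0.0, math.pi
    g_low = difference(low)
    # difference(pi) is the negative of difference(0), up to rounding,
    # so the two ends have opposite signs unless both are zero.
    if g_low == 0:
        return low
    while high - low > tolerance:
        mid = (low + high) / 2
        g_mid = difference(mid)
        if g_mid == 0:
            return mid
        if (g_mid > 0) == (g_low > 0):
            low, g_low = mid, g_mid
        else:
            high = mid
    return (low + high) / 2


spot = find_matching_antipodes()
print(temperature(spot), temperature(spot + math.pi))
```

The two printed temperatures agree to many decimal places. Swap in any other continuous, periodic temperature function and the same search still has to succeed.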

Two blobby shapes, one of them larger and more complicated, the other looking kind of like the outline of a trefoil, both divided by a magenta line.
Does this magenta line bisect both the red and the greyish blobs simultaneously? … Probably not, unless I’ve been way lucky. But there is some line that does.

So here’s a neat example that is ripped off from them. Draw two blobs on the plane. Is there a straight line that bisects both of them at once? Bisecting here means there’s exactly as much of one blob on one side of the line as on the other. There certainly is. The trick is that any number of lines will bisect one blob, so you look at what each of them does to the other.

A similar ripped-off result you can do with a single blob of any shape you like. Draw any line that bisects it. There are a lot of candidates. Can you draw a line perpendicular to that so that the blob gets quartered, divided into four pieces of equal area? Yes. Try it.

A generic blobby shape with two perpendicular magenta lines crossing over it.
Does this pair of magenta lines split this blue blob into four pieces of exactly the same area? … Probably not, unless I’ve been lucky. But there is some pair of perpendicular lines that will do it. Also, is it me or does that blob look kind of like a butterfly?

But surely the best use of the Intermediate Value Theorem is in the problem of wobbly tables. If the table has four legs, all the same length, and the problem is the floor isn’t level, it’s all right. There is some way to adjust the table so it won’t wobble. (Well, the ground can’t be angled more than a bit over 35 degrees, but that’s all right. If the ground has a 35 degree angle you aren’t setting a table on it. You’re rolling down it.) Finally a mathematical proof can save us from despair!

Except that the proof doesn’t work if the table legs are uneven which, alas, they often are. But we can’t get everything.

Courant and Robbins put forth one more example that’s fantastic, although it doesn’t quite work. But it’s a train problem unlike those you’ve seen before. Let me give it to you as they set it out:

Suppose a train travels from station A to station B along a straight section of track. The journey need not be of uniform speed or acceleration. The train may act in any manner, speeding up, slowing down, coming to a halt, or even backing up for a while, before reaching B. But the exact motion of the train is supposed to be known in advance; that is, the function s = f(t) is given, where s is the distance of the train from station A, and t is the time, measured from the instant of departure.

On the floor of one of the cars a rod is pivoted so that it may move without friction either forward or backward until it touches the floor. If it does touch the floor, we assume that it remains on the floor henceforth; this will be the case if the rod does not bounce.

Is it possible to place the rod in such a position that, if it is released at the instant when the train starts and allowed to move solely under the influence of gravity and the motion of the train, it will not fall to the floor during the entire journey from A to B?

They argue it is possible, and use the Intermediate Value Theorem to show it. They admit the range of angles it’s safe to start the rod from may be too small to be useful.

But they’re not quite right. Ian Stewart, in the revision of What Is Mathematics?, includes an appendix about this. Stewart credits Tim Poston with pointing out, in 1976, the flaw. It’s possible to imagine a path which causes the rod, from one angle, to just graze tipping over, let’s say forward, and then get yanked back and fall over flat backwards. This would leave no room for any starting angles that avoid falling over entirely.

It’s a subtle flaw. You might expect so. Nobody mentioned it between the book’s original publication in 1941, after which everyone liking mathematics read it, and 1976. And it is one that touches on the complications of spaces. This little Intermediate Value Theorem problem draws us close to chaos theory. It’s one of those ideas that weaves through all mathematics.

Any Requests, Theorem Thursdays Edition?


I don’t know just when I’ll have the energy for my next Mathematics A To Z. But I do want to do something. So for June and July I figure to run a Theorem Thursdays bit. Pitch me some theorems and I’ll do my best to explain what they’re about, or why they’re interesting, or how there might be some bit of mathematics-community folklore behind it. That would be the Contraction Mapping Theorem.

While I’m calling it Theorem Thursdays that’s just for the sake of marketing. It doesn’t literally need to have “theorem” in the thing’s name. The only condition I mean to put on it is that I won’t do Cantor’s Diagonal Argument — the proof that there’s more real numbers than there are integers — because it’s already been done so well, so often, by everyone. I don’t have anything to say that could add to its explanation.

Please, put your requests in comments here. I shall try to take the first nine that I see and feel like I can be competent to handle by the end of July. And I hope I’m not doing something soon to be disastrous. I may not know exactly what I’m doing, but then, if anyone ever did know exactly what they were doing they’d never do it.