My 2018 Mathematics A To Z: Limit


I got an irresistible topic for today’s essay. It’s courtesy of Peter Mander, author of Carnot Cycle, “the classical blog about thermodynamics”. It’s bimonthly and it’s one worth waiting for. Some of the essays are historical; some are about statistical mechanics; many are mixtures of the two. You could make a fair argument that thermodynamics is the most important field of physics. It’s certainly one that hasn’t gotten the popularization treatment its importance deserves. Mander is doing something to correct that.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Limit.

It is hard to think of limits without thinking of motion. The language even professional mathematicians use suggests it. We speak of the limit of a function “as x goes to a”, or “as x goes to infinity”. Maybe “as x goes to zero”. But a function is a fixed thing, a relationship between stuff in a domain and stuff in a range. It can’t change any more than January, AD 1988 can change. And ‘x’ here is a dummy variable, part of the scaffolding to let us find what we want to know. I suppose ‘x’ can change, but if we ever see it, something’s gone very wrong. But we want to use it to learn something about a function for a point like ‘a’ or ‘infinity’ or ‘zero’.

The language of motion helps us learn, to a point. We can do little experiments: if f(x) = \frac{\sin(x)}{x} , then, what should we expect it to be for x near zero? It’s irresistible to try out the calculator. Let x be 0.1. 0.01. 0.001. 0.0001. The numbers say this f(x) gets closer and closer to 1. That’s good, right? We know we can’t just put in an x of zero, because that makes trouble. But we can imagine creeping up on the zero we really wanted. We might spot some obvious prospects for mischief: what if x is negative? We should try -0.1, -0.01, -0.001 and so on. And maybe we won’t get exactly the right answer. But if all we care about is the first (say) three digits and we try out a bunch of x’s and the corresponding f(x)’s agree to those three digits, that’s good enough, right?
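If you would rather let the computer push the calculator buttons, here’s a little Python sketch of that experiment. The particular x values are just the ones listed above, plus their negatives.

import math

def f(x):
    # The function we're probing; note we never ask for f(0), which isn't defined.
    return math.sin(x) / x

for x in [0.1, 0.01, 0.001, 0.0001, -0.1, -0.01, -0.001, -0.0001]:
    print(x, f(x))

Every printed value crowds up against 1, which is the behavior we’re about to try to pin down.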

This is good for giving an idea of what to expect a limit to look like. It should be, well, what it really, really, really looks like the function’s value should be. It takes some thinking to see where it might go wrong. It might go to different numbers based on which side you approach from. But that seems like something you can rationalize. Indeed, we do; we can speak of functions having different limits based on what direction you approach from. Sometimes that’s the best one can say about them.

But it can get worse. It’s possible to make functions that do crazy weird things. Some of these look like you’re just trying to be difficult. Like, set f(x) equal to 1 if x is rational and 0 if x is irrational. If you don’t expect that to be weird you’re not paying attention. Can’t blame someone for deciding that falls outside the realm of stuff you should be able to find limits for. And who would make, say, an f(x) that was 1 if x was 0.1 raised to some power, but 2 if x was 0.2 raised to some power, and 3 otherwise? Besides someone trying to prove a point?

Fine. But you can make a function that looks innocent and yet acts weird if the domain is two-dimensional. Or more. It makes sense to say that the functions I wrote in the above paragraph should be ruled out of consideration. But the limit of f(x, y) = \frac{x^3 y}{x^6 + y^2} at the origin? You get different results approaching along different paths. And the function doesn’t give obvious signs of imminent danger here.
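Here’s a quick Python sketch of that trouble, if you want to see numbers for it. I’m approaching the origin two ways: along the straight line y = x, and along the curve y = x^3. The paths are my own picks, chosen to show the contrast.

def f(x, y):
    # The innocent-looking function from the paragraph above.
    return (x**3 * y) / (x**6 + y**2)

for t in [0.1, 0.01, 0.001]:
    print(t, f(t, t), f(t, t**3))   # along y = x the values sink toward 0; along y = x^3 they sit at 0.5

Straight lines through the origin all give 0, but that one sneaky curve gives one-half, so no single number can be the limit.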

We need a better idea. And we even have one. This took centuries of mathematical wrangling and arguments about what should and shouldn’t be allowed. This should inspire sympathy with Intro Calc students who don’t understand all this by the end of week three. But here’s what we have.

I need a supplementary idea first. That is the neighborhood. A neighborhood of a point is some open set that contains it. We represent this by drawing a little blob around the point we care about. If we’re looking at the neighborhood of a real number, then this is a little interval, that’s all. When we actually get around to calculating, we make these neighborhoods little circles. Maybe balls. But when we’re doing proofs about how limits work, or how we use them to prove things, we make blobs. This “neighborhood” idea looks simple, but we need it, so here we go.

So start with a function, named ‘f’. It has a domain, which I’ll call ‘D’. And a range, which I want to call ‘R’, but I don’t think I need the shorthand. Now pick some point ‘a’. This is the point at which we want to evaluate the limit. This seems like it ought to be called the “limit point” and it’s not. I’m sorry. Mathematicians use “limit point” to talk about something else. And, unfortunately, it makes so much sense in that context that we aren’t going to change away from that.

‘a’ might be in the domain ‘D’. It might not. It might be on the border of ‘D’. All that’s important is that there be a neighborhood of ‘a’ whose points, except possibly ‘a’ itself, are all inside ‘D’.

I don’t know what f(a) is. There might not even be an f(a), if a is on the boundary of the domain ‘D’. But I do know that everything inside the neighborhood of ‘a’, apart from ‘a’, is in the domain. So we can look at the values of f(x) for all the x’s in this neighborhood. This will create a set, in the range, that’s known as the image of the neighborhood. It might be a continuous chunk in the range. It might be a couple of chunks. It might be a single point. It might be some crazy-quilt set. Depends on ‘f’. And the neighborhood. No matter.

Now I need you to imagine the reverse. Pick a point in the range. And then draw a neighborhood around it. Then pick out what we call the pre-image of it. That’s all the points in the domain that get matched to values inside that neighborhood. Don’t worry about trying to do it; that’s for the homework practice. Would you agree with me that you can imagine it?

I hope so because I’m about to describe the part where Intro Calc students think hard about whether they need this class after all.

OK. Ready?

All right. Then I want something in the range. I’m going to call it ‘L’. And it’s special. It’s the limit of ‘f’ at ‘a’ if this following bit is true:

Think of every neighborhood you could pick of ‘L’. Can be big, can be small. Just has to be a neighborhood of ‘L’. Now think of the pre-image of that neighborhood. Is there always a neighborhood of ‘a’ inside that pre-image? It’s okay if it’s a tiny neighborhood. Just has to be an open neighborhood. It doesn’t have to contain ‘a’. You can allow a pinpoint hole there.

If you can always do this, however tiny the neighborhood of ‘L’ is, then the limit of ‘f’ at ‘a’ is ‘L’. If you can’t always do this — if there’s even a single exception — then there is no limit of ‘f’ at ‘a’.
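For real-valued functions the neighborhoods are just intervals, so this definition turns into the familiar epsilon-delta game. Here’s a Python sketch of playing that game for the sin(x)/x example, supposing the limit at 0 is 1. Sampling a bunch of points can’t prove anything, so take this as a spot-check of the definition rather than a proof; the choice of delta is a guess that happens to work for this function.

import math

def f(x):
    return math.sin(x) / x

a, L = 0.0, 1.0
for eps in [0.1, 0.01, 0.001]:
    # eps sets the size of the neighborhood around L; delta is our candidate neighborhood around a.
    delta = eps
    xs = [a + delta * k / 1000 for k in range(-1000, 1001) if k != 0]   # skip a itself: the pinpoint hole
    print(eps, delta, all(abs(f(x) - L) < eps for x in xs))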

I know. I felt like that the first couple times through the subject too. The definition feels backward. Worse, it feels like it begs the question. We suppose there’s an ‘L’ and then test these properties about it and then if it works we say we’re done? I know. It’s a pain when you start calculating this with specific formulas and all that, too. But supposing there is an answer and then learning properties about it, including whether it can exist? That’s a slick trick. We can use it.

Thing is, the pain is worth it. We can calculate with it and not have to out-think tricky functions. It works for domains with as many dimensions as you need. It works for limits that aren’t inside the domain. It works with domains and ranges that aren’t real numbers. It works for functions with weird and complicated domains. We can adapt it if we want to consider limits that are constrained in some way. It won’t be fooled by tricks like I put up above, the f(x) with different rules for the rational and irrational numbers.

So mathematicians shrug, and do enough problems that they get the hang of it, and use this definition. It’s worth it, once you get there.


This and other Fall 2018 Mathematics A-To-Z posts can be read at this link. And I’m still taking nominations for discussion topics, if you’d like to see mathematics terms explained. I know I would.


Reading the Comics, April 25, 2018: Coronet Blue Edition


You know what? Sometimes there just isn’t any kind of theme for the week’s strips. I can use an arbitrary name.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 21st of April, 2018 would have gone in last week if I weren’t preoccupied on Saturday. The joke is aimed at freshman calculus students and then intro Real Analysis students. The talk about things being “arbitrarily small” turns up a lot in these courses. Why? Well, in them we usually want to show that one thing equals another. But it’s hard to do that. What we can show is some estimate of how different the first thing can be from the second. And if you can show that that difference can be made smaller than any positive number you care to name, great. You’ve shown the two things are equal.

Delta and epsilon turn up in these a lot. In the generic proof of this you say you want to show the difference between the thing you can calculate and the thing you want is smaller than epsilon. So you have the thing you can calculate parameterized by delta. Then your problem becomes showing that if delta is small enough, the difference between what you can do and what you want is smaller than epsilon. This is why it’s an appropriately-formed joke to show someone squeezed by a delta and an epsilon. These are the lower-case delta and epsilon, which is why it’s not a triangle on the left there.

Mad scientist cackling at a man being crushed between giant delta and epsilon figure: 'And now, good doctor, we will see how you fit between this delta and this epsilon!' Caption: Soon, soon the calculus teacher would become arbitrarily small.
Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 21st of April, 2018. I feel vaguely personally called out by the calculus teacher wearing cargo shorts, tall white socks, and sandals.

For example, suppose you want to know how long the perimeter of an ellipse is. But all you can calculate is the perimeter of a polygon. I would expect to make a proof of it look like this. Give me an epsilon that’s how much error you’ll tolerate between the polygon’s perimeter and the ellipse’s perimeter. I would then try to find, for that epsilon, a corresponding delta, such that if the edges of the polygon are never farther than delta from a point on the ellipse, then the perimeter of the polygon and that of the ellipse are less than epsilon away from each other. And that’s Calculus and Real Analysis.
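Here’s a rough Python sketch of the spirit of that argument. It inscribes polygons with more and more vertices in an ellipse and watches the perimeters settle down. The axis lengths are example numbers, and since the exact perimeter needs an elliptic integral I’m using a very fine polygon as a stand-in for the true value.

import math

def polygon_perimeter(n, a=3.0, b=2.0):
    # Perimeter of an n-gon whose vertices sit on the ellipse x = a cos(t), y = b sin(t).
    pts = [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n)) for k in range(n)]
    return sum(math.dist(pts[k], pts[(k + 1) % n]) for k in range(n))

reference = polygon_perimeter(100000)   # a very fine polygon, standing in for the true perimeter
for n in [8, 32, 128, 512]:
    p = polygon_perimeter(n)
    print(n, p, abs(p - reference))

More vertices means the edges hug the ellipse more closely, and the error column shrinks toward zero, which is the epsilon-and-delta story told in numbers.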

John Zakour and Scott Roberts’s Maria’s Day for the 22nd is the anthropomorphic numerals joke for this week. I’m curious whether the 1 had a serif that could be wrestled or whether the whole number had to be flopped over, as though it were a ruler or a fat noodle.

Maria at her desk challenges a giant number 4 to arm wrestling; she slams its 'arm' down easily. Other numerals flee as she yells out: 'Okay, anyone else wanna take me on? Huh? --- Yeah, didn't think so!' Reality: she's at her desk with a book and some paper and says, "Whew! This math homework was tough --- but I think I got it down.'
John Zakour and Scott Roberts’s Maria’s Day for the 22nd of April, 2018. I’m curious whether Zakour and Roberts deliberately put 2 and 3 to the left, with pain stars indicating they’ve been beaten already, while the bigger numbers are off to the side. Or was it just an arbitrary choice? The numbers are almost in order, left to right, except that the 7’s out of place. So maybe the ordering is just coincidence?

Anthony Blades’s Bewley for the 23rd offers advice for what to do if you’ve not got your homework. This strip’s already been run, and mentioned here. I might drop this from my reading if it turns out the strip is done and I’ve exhausted all the topics it inspires.

Bea: 'Aaaah! I forgot to do my maths homework!' Tonus: 'I did mine.' Bea: 'Can I copy yours?' Tonus: 'Of course you can. I didn't know the answers so I drew a picture of a scary dinosaur.' [ Silent penultimate panel. ] Bea: 'Better than nothing.' Tonus: 'Remember the big teeth. Big teeth make it scary.'
Anthony Blades’s Bewley for the 23rd of April, 2018. Whenever a comic strip with this setup begins I think of the time in geometry class when I realized I hadn’t done any homework and wondered if I could get something done in the time it took papers to be passed up. This in a class of twelve students. No, there was not, even though the teacher started from the other side of the classroom.

Dave Whamond’s Reality Check for the 23rd is designed for the doors of mathematics teachers everywhere. It does incidentally express one of those truths you barely notice: that statisticians and mathematicians don’t seem to be quite in the same field. They’ve got a lot of common interest, certainly. But they’re often separate departments in a college or university. When they do share a department it’s named the Department of Mathematics and Statistics, itself an acknowledgement that they’re not quite the same thing. (Also it seems to me it’s always Mathematics-and-Statistics. If there’s a Department of Statistics-and-Mathematics somewhere I don’t know of it and would be curious.) This has to reflect historical influence. Statistics, for all that it uses the language of mathematics and that logical rigor and ideas about proofs and all, comes from a very practical, applied, even bureaucratic source. It grew out of asking questions about the populations of nations and the reliable manufacture of products. Mathematics, even the mathematics that is about real-world problems, is different. A mathematician might specialize in the equations that describe fluid flows, for example. But it could plausibly be because they have interesting and strange analytical properties. It’d be only incidental that they might also say something enlightening about why the plumbing is stopped up.

[ Clown Statistician vs Clown mathematician. ] The Clown Statistician holds up a pie chart, ready to throw it. The mathematician holds up a pi symbol, ready to throw it. Corner squirrel's comment: 'There's always room for more pi.'
Dave Whamond’s Reality Check for the 23rd of April, 2018. I’m not sure I’ve laughed more at a dumb joke than I have at this in a long while.

Neal Rubin and Rod Whigham’s Gil Thorp for the 24th seems to be setting out the premise for the summer storyline. It’s sabermetrics. Or at least the idea that sports performance can be quantified, measured, and improved. The principle behind that is sound enough. The trick is figuring out what are the right things to measure, and what can be done to improve them. Also another trick is don’t be a high school student trying to lecture classmates about geometry. Seriously. They are not going to thank you. Even if you turn out to be right. I’m not sure how you would have much control of the angle your ball comes off the bat, but that’s probably my inexperience. I’ve learned a lot about how to control a pinball hitting the flipper. I’m not sure I could quantify any of it, but I admit I haven’t made a serious attempt to try either. Also, when you start doing baseball statistics you run a roughly 45% chance of falling into a deep well of calculation and acronyms of up to twelve letters from which you never emerge. Be careful. (This is a new comic strip tag.)

[ With rain delaying (baseball) practice, Kevin Pelwecki expounds on his new favorite subject --- ] Kevin: 'Launch angle! You want the ball coming off the bat at 25 degrees.' Teammate: 'Anyone else notice we're taking math lessons --- from a guy who barely passed geometry?'
Neal Rubin and Rod Whigham’s Gil Thorp for the 24th of April, 2018. Are … both word balloons coming from the same guy? In the last panel there. I understand one guy starting and another closing a thought but that’s usually something you do with an established in-joke that anyone can feed and anyone else can finish. A spontaneous insult like this seems like it only needs the one person, but the word balloon tails are weird if they’re both from the same guy.

Randy Glasbergen’s Glasbergen Cartoons rerun for the 25th feels a little like a slight against me. Well, no matter. Use the things that get you in the mood you need to do well. (Not a new comic strip tag because I’m filing it under ‘Randy Glasbergen’ which I guess I used before?)

Kid with guitar: 'I start every song by counting 1-2-3-4 because it reminds me of math. Math depresses me and that helps me sing the blues.'
Randy Glasbergen’s Glasbergen Cartoons rerun for the 25th of April, 2018. OK, but what’s his guitar plugged in to?

Someone Else’s Homework: Some More Thoughts


I wanted to get back to my friend’s homework problem. And a question my friend had about the question. It’s a question I figure is good for another essay.

But I also had second thoughts about the answer I gave. Not that it’s wrong, but that it could be better. Also that I’m not doing as well in spelling “range” as I had always assumed I would. This is what happens when I don’t run an essay through Hemingway App to check whether my sentences are too convoluted. I also catch smaller word glitches.

Let me re-state the problem: Suppose you have a function f, with domain of the integers Z and range of the integers Z. And also you know that f has the property that for any two integers ‘a’ and ‘b’, f(a + b) equals f(a) + f(b). And finally, suppose that for some odd number ‘c’, you know that f(c) is even. The challenge: prove that f is even for all the integers.

Like I say, the answer I gave on Tuesday is right. That’s fine. I just thought of a better answer. This often happens. There are very few interesting mathematical truths that only have a single proof. The ones that have only a single proof are on the cutting edge, new mathematics in a context we don’t understand well enough yet. (Yes, I am overlooking the obvious exception of ______ .) But a question so well-chewed-over that it’s fit for undergraduate homework? There’s probably dozens of ways to attack that problem.

And yes, you might only see one proof of something. Sometimes there’s an approach that works so well it’s silly to consider alternatives. Or the problem isn’t big enough to need several different proofs. There’s something to regret in that. Re-thinking an argument can make it better. As instructors we might recommend rewriting an assignment before turning it in. But I’m not sure that encourages re-thinking the assignment. It’s too easy to just copy-edit and catch obvious mistakes. Which is valuable, yes. But it’s good for communication, not for the mathematics itself.

So here’s my revised argument. It’s much cleaner, as I realized it while showering Wednesday morning.

Give me an integer. Let’s call it m. Well, m has to be either an even or an odd number. I’m supposing nothing about whether it’s positive or negative, by the way. This means what I show will work whether m is greater than, less than, or equal to zero.

Suppose that m is an even number. Then m has to equal 2*k for some integer k. (And yeah, k might be positive, might be negative, might be zero. Don’t know. Don’t care.) That is, m has to equal k + k. So f(m) = f(k) + f(k). That’s one of the two things we know about the function f. And f(k) + f(k) is 2 * f(k). And f(k) is an integer: the integers are the function’s range. So 2 * f(k) is an even integer. So if m is an even number then f(m) has to be even.

All right. Suppose that m isn’t an even integer. Then it’s got to be an odd integer. So this means m has to be equal to c plus some even number, which I’m going ahead and calling 2*k. Remember c? We were given information about f for that element c in the domain. And again, k might be positive. Might be negative. Might be zero. Don’t know, and don’t need to know. So since m = c + 2*k, we know that f(m) = f(c) + f(2*k). And the other thing we know about f is that f(c) is even. f(2*k) is also even. f(c), which is even, plus f(2*k), which is even, has to be even. So if m is an odd number, then f(m) has to be even.

And so, as long as m is an integer, f(m) is even.

You see why I like that argument better. It’s shorter. It breaks things up into fewer cases. None of those cases have to worry about whether m is positive or negative or zero. Each of the cases is short, and moves straight to its goal. This is the proof I’d be happy submitting. Today, anyway. No telling what tomorrow will make me think.
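If you want to watch the claim in action, here’s a tiny Python sketch. Any function with the addition property on the integers has to look like f(n) = n times f(1), so I’ve made up one such example with f(1) even, and then checked both the hypotheses and the conclusion over a range of integers.

def f(n):
    # A made-up example satisfying the hypotheses: additive, with f(3) = 18 even for the odd number c = 3.
    return 6 * n

# The addition property, checked on a grid of pairs.
assert all(f(a + b) == f(a) + f(b) for a in range(-20, 21) for b in range(-20, 21))
# The hypothesis: f(c) is even for some odd c.
assert f(3) % 2 == 0
# The conclusion: f(m) is even for every integer m in a sample range.
assert all(f(m) % 2 == 0 for m in range(-100, 101))
print("All checks pass for this example function.")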

Someone Else’s Homework: A Solution


I have a friend who’s been taking mathematical logic. While talking over the past week’s work they mentioned a problem that had stumped them. But they’d figured it out — at least the critical part — about a half-hour after turning it in. And I had fun going over it. Since the assignment’s already turned in and I don’t even know which class it was, I’d like to share it with you.

So here’s the problem. Suppose you have a function f, with domain of the integers Z and range of the integers Z. And also you know that f has the property that for any two integers ‘a’ and ‘b’, f(a + b) equals f(a) + f(b). And finally, suppose that for some odd number ‘c’, you know that f(c) is even. The challenge: prove that f is even for all the integers.

If you want to take a moment to think about that, please do.

A Californian rabbit (white body, grey ears and nose and paws) eating a pile of vegetables. In the background is the sunlit outside in the window, with a small rabbit statue silhouetted behind the rabbit's back.
So you can ponder without spoilers here’s a picture of the rabbit we’re fostering for the month, who’s having lunch. The silhouette behind her back is of a little statue decoration and not some outsider trying to lure our foster rabbit to freedom outside, so far as we know. (Don’t set domesticated rabbits outside. It won’t go well for them. And domesticated rabbits aren’t native to North America, I mention for the majority of my readers who are.)

So here’s my thinking about this.

First thing I want to do is show that f(1) is an even number. How? Well, if ‘c’ is an odd number, then ‘c’ has to equal ‘2*k + 1’ for some integer ‘k’. So f(c) = f(2*k + 1). And therefore f(c) = f(2*k) + f(1). And, since 2*k is equal to k + k, then f(2*k) has to equal f(k) + f(k). Therefore f(c) = 2*f(k) + f(1). Whatever f(k) is, 2*f(k) has to be an even number. And we’re given f(c) is even. Therefore f(1) has to be even.

Now I can prove that if ‘k’ is any positive integer, then f(k) has to be even. Why? Because ‘k’ is equal to 1 + 1 + 1 + … + 1. And so f(k) has to equal f(1) + f(1) + f(1) + … + f(1). That is, it’s k * f(1). And if f(1) is even then so is k * f(1). So that covers the positive integers.

How about zero? Can I show that f(0) is even? Oh, sure, easy. Start with ‘c’. ‘c’ equals ‘c + 0’. So f(c) = f(c) + f(0). The only way that’s going to be true is if f(0) is equal to zero, which is an even number.

By the way, here’s an alternate way of arguing this: 0 = 0 + 0. So f(0) = f(0) + f(0). And therefore f(0) = 2 * f(0) and that’s an even number. Incidentally also zero. Submit the proof you like.

What’s not covered yet? Negative integers. It’s hard not to figure, well, we know f(1) is even, we know f(a + b) is f(a) + f(b). Shouldn’t, like, f(-2) just be -2 * f(1)? Oh, it so should. I don’t feel like we have that already proven, though. So let me nail that down. I’m going to use what we know about f(k) for positive ‘k’, and the fact that f(0) is 0.

So give me any negative integer; I’m going to call it ‘-k’. Its additive inverse is ‘k’, which is a positive number. -k + k = 0. And so f(-k + k) = f(-k) + f(k) = f(0). So, f(-k) + f(k) = 0, and f(-k) = -f(k). If f(k) is even — and it is — then f(-k) is also even.

So there we go: whether ‘k’ is a positive, zero, or negative integer, f(k) is even. All the integers are either positive, zero, or negative. So f is even for any integer.

I’ve got some more thoughts about this problem.

What I’ve Been Reading, Mid-March 2018


So here’s some of the interesting mathematical stuff I’ve noticed while being on the Internet.

Here from the end of January is a bit of oddball news. A story problem for 11-year-olds in one district of China set up a problem that couldn’t be solved. Not exactly, anyway. The question — “if a ship had 26 sheep and 10 goats onboard, how old is the ship’s captain?” — squares nicely with that Gil comic strip I discussed the other day. After seeing 26 (something) and 10 (something else) it’s easy to think of what answers might be wanted: 36 (total animals) or 16 (how many more sheep there are than goats) or maybe 104 (how many hooves there are, if they all have the standard four hooves). That the question doesn’t ask anything that the given numbers matter for barely registers unless you read the question again. I like the principle of reminding people not to calculate until you know what you want to do and why. And it’s possible to give partial answers: the BBC News report linked above includes a mention from one commenter that allowed a reasonable lower bound to be set on the ship’s captain’s age.

In something for my mathematics majors, here’s A Regiment of Monstrous Functions as assembled by Rob J Low. This is about functions with a domain and a range that are both real numbers. There’s many kinds of these functions. They match nicely to the kinds of curves you can draw on a sheet of paper. So take a sheet of paper and draw a curve. You’ve probably drawn a continuous curve, one that can be drawn without lifting your pencil off the paper. Good chance you drew a differentiable one, one without corners. But most functions aren’t continuous. And aren’t differentiable. Of those few exceptions that are, many of them are continuous or differentiable only in weird cases. Low reviews some of the many kinds of functions out there. Functions discontinuous at a point. Functions continuous only on one point, and why that’s not a crazy thing to say. Functions continuous on irrational numbers but discontinuous on rational numbers. This is where mathematics majors taking real analysis feel overwhelmed. And then there’s stranger stuff out there.

Here’s a neat one. It’s about finding recognizable, particular, interesting pictures in long enough prime numbers. The secret to it is described in the linked paper. The key is that the eye is very forgiving of slightly imperfect images. This fact should reassure people learning to draw, but will not. And there’s a lot of prime numbers out there. If an exactly-correct image doesn’t happen to be a prime number that’s all right. There’s a number close enough to it that will be. That latter point is something that anyone interested in number theory “knows”, in that we know some stuff about the biggest possible gaps between prime numbers. But that fact isn’t the same as seeing it.

And finally there’s something for mathematics majors. Differential equations are big and important. They appear whenever you want to describe something that changes based on its current state. And this is so much stuff. Finding solutions to differential equations is a whole major field of mathematics. The linked PDF is a slideshow of notes about one way to crack these problems: find symmetries. The only trouble is it’s a PDF of a Powerpoint presentation, one of those where each of the items gets added on in sequence. So each slide appears like eight times, each time with one extra line on it. It’s still good, interesting stuff.

Reading the Comics, December 2, 2017: Showing Intelligence Edition


November closed out with another of those weeks not quite busy enough to justify splitting into two. I blame Friday and Saturday. Nothing mathematically-themed was happening then. I suppose some days are just like that.

Johnny Hart’s Back To BC for the 26th is an example of using mathematical truths as profound statements. I’m not sure that I’d agree with just stating the Pythagorean Theorem as profound, though. It seems like a profound statement has to have some additional surprising, revelatory elements to it. Like, knowing the Pythagorean theorem is true means we can prove there’s exactly one line parallel to a given line and passing through a given point not on that line. Who’d see that coming? I don’t blame Hart for not trying to fit all that into one panel, though. Too slow a joke. The strip originally ran the 4th of September, 1960.

Tom Toles’s Randolph Itch, 2 am rerun for the 26th is a cute little arithmetic-in-real-life panel. I suppose arithmetic-in-real-life. Well, I’m amused and stick around for the footer joke. The strip originally ran the 24th of February, 2002.

Zach Weinersmith’s Saturday Morning Breakfast Cereal makes its first appearance for the week on the 26th. It’s an anthropomorphic-numerals joke and some wordplay. Interesting trivia about the whole numbers that never actually impresses people: a whole number is either a perfect square, like 1 or 4 or 9 or 16 are, or else its square root is irrational. There’s no whole number with a square root that’s, like, 7.745 or something. Maybe I just discuss it with people who’re too old. It seems like the sort of thing to reveal to a budding mathematician when she’s eight.

Saturday Morning Breakfast Cereal makes another appearance the 29th. The joke’s about using the Greek ε, which has a long heritage of use for “a small, positive number”. We use this all the time in analysis. A lot of proofs in analysis are done by using ε in a sort of trick. We want to show something is this value, but it’s too hard to do. Fine. Pick any ε, a positive number of unknown size. So then we’ll find something we can calculate, and show that the difference between the thing we want and the thing we can do is smaller than ε. And that the thing we can calculate has the value we wanted to show. Therefore, the difference between what we want and what we can do is smaller than any positive number. And so the difference between them must be zero, and voila! We’ve proved what we wanted to prove. I have always assumed that we use ε for this for the association with “error”, ideally “a tiny error”. If we need another tiny quantity we usually go to δ, probably because it’s close to ε and ‘d’ is still a letter close to ‘e’. (The next letter after ε is ζ, which carries other connotations with it and is harder to write than δ is.) Anyway, Weinersmith is just doing a ha-ha, your penis is small joke.

Samson’s Dark Side of the Horse for the 28th is a counting-sheep joke. It maybe doesn’t belong here but I really, really like the art of the final panel and I want people to see it.

Arnoldine: 'If you're so SMART, what's the SQUARE ROOT of a million?!' Arnold, after a full panel's thought: 'FIVE!' Arnoldine: 'OK! What's the square root of TWO MILLION?!'
Bud Grace’s Piranha Club for the 29th of November, 2017. So do always remember the old advice for attorneys and people doing investigative commissions: never ask a question you don’t already know the answer to.

Bud Grace’s Piranha Club for the 29th is, as with Back to BC, an attempt at showing intelligence through mathematics. There are some flaws in the system. Fun fact: since one million is a perfect square, Arnold could have answered within a single panel. (Also fun fact: I am completely unqualified to judge whether something is a “fun” fact.)

Jason Chatfield’s Ginger Meggs for the 29th is Ginger subverting the teacher’s questions, like so many teacher-and-student jokes will do.

Dan Thompson’s Brevity for the 30th is the anthropomorphic geometric figures joke for the week.

There seems to be no Mark Anderson’s Andertoons for this week. There’ve been some great ones (like on the 26th or the 28th and the 29th) but they’re not at all mathematical. I apologize for the inconvenience and am launching an investigation into this problem.

The Summer 2017 Mathematics A To Z: Open Set


Today’s glossary entry is another request from Elke Stangl, author of the Elkemental Force blog. I’m hoping this also turns out to be a well-received entry. Half of that is up to you, the kind reader. At least I hope you’re a reader. It’s already gone wrong, as it was supposed to be Friday’s entry. I discovered I hadn’t actually scheduled it while I was too far from my laptop to do anything about that mistake. This spoils the nice Monday-Wednesday-Friday routine of these glossary entries that dates back to the first one I ever posted and just means I have to quit forever and not show my face ever again. Sorry, Ulam Spiral. Someone else will have to think of you.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Open Set.

Mathematics likes to present itself as being universal truths. And it is. At least if we allow that the rules of logic by which mathematics works are universal. Suppose them to be true and the rest follows. But we start out with intuition, with things we observe in the real world. We’re happy when we can remove the stuff that’s clearly based on idiosyncratic experience. We find something that’s got to be universal.

Sets are pretty abstract things, as mathematicians use the term. They get to be hard to talk about; we run out of simpler words that we can use. A set is … a bunch of things. The things are … stuff that could be in a set, or else that we’d rule out of a set. We can end up better understanding things by drawing a picture. We draw the universe, which is a rectangular block, sometimes with dashed lines as the edges. The set is some blotch drawn on the inside of it. Some shade it in to emphasize which stuff we want in the set. If we need to pick out a couple things in the universe we drop in dots or numerals. If we’re rigorous about the drawing we could create a Venn Diagram.

When we do this, we’re giving up on the pure mathematical abstraction of the set. We’re replacing it with a territory on a map. Several territories, if we have several sets. The territories can overlap or be completely separate. We’re subtly letting our sense of geography, our sense of the spaces in which we move, infiltrate our understanding of sets. That’s all right. It can give us useful ideas. Later on, we’ll try to separate out the ideas that are too bound to geography.

A set is open if whenever you’re in it, you can’t be on its boundary. We never quite have this in the real world, with territories. The border between, say, New Jersey and New York becomes this infinitesimally slender thing, as wide in space as midnight is in time. But we can, with some effort, imagine the state. Imagine being as tiny in every direction as the border between two states. Then we can imagine the difference between being on the border and being away from it.

And not being on the border matters. If we are not on the border we can imagine the problem of getting to the border. Pick any direction; we can move some distance while staying inside the set. It might be a lot of distance, it might be a tiny bit. But we stay inside however we might move. If we are on the border, then there’s some direction in which any movement, however small, drops us out of the set. That’s a difference in kind between a set that’s open and a set that isn’t.

I say “a set that’s open and a set that isn’t”. There are such things as closed sets. A set doesn’t have to be either open or closed. It can be neither, a set that includes some of its borders but not other parts of it. It can even be both open and closed simultaneously. The whole universe, for example, is both an open and a closed set. The empty set, with nothing in it, is both open and closed. (This looks like a semantic trick. OK, if you’re in the empty set you’re not on its boundary. But you can’t be in the empty set. So what’s going on? … The usual. It makes other work easier if we call the empty set ‘open’. And the extra work we’d have to do to rule out the empty set doesn’t seem to get us anything interesting. So we accept what might be a trick.) The definitions of ‘open’ and ‘closed’ don’t exclude one another.

I’m not sure how this confusing state of affairs developed. My hunch is that the words ‘open’ and ‘closed’ evolved independent of each other. Why do I think this? An open set has its openness from, well, not containing its boundaries; from the inside there’s always a little more to it. A closed set has its closedness from sequences. That is, you can consider a string of points inside a set. Are these points leading somewhere? Is that point inside your set? If a string of points always leads to somewhere, and that somewhere is inside the set, then you have closure. You have a closed set. I’m not sure that the terms were derived with that much thought. But it does explain, at least in terms a mathematician might respect, why a set that isn’t open isn’t necessarily closed.

Back to open sets. What does it mean to not be on the boundary of the set? How do we know if we’re on it? We can define sets by all sorts of complicated rules: complex-valued numbers of size less than five, say. Rational numbers whose denominator (in lowest form) is no more than ten. Points in space from which a satellite dropped would crash into the moon rather than into the Earth or Sun. If we have an idea of distance we could measure how far it is from a point to the nearest part of the boundary. Do we need distance, though?

No, it turns out. We can get the idea of open sets without using distance. Introduce a neighborhood of a point. A neighborhood of a point is an open set that contains that point. It doesn’t have to be small, but that’s the connotation. And we get to thinking of little N-balls, circle or sphere-like constructs centered on the target point. It doesn’t have to be N-balls. But we think of them so much that we might as well say it’s necessary. If every point in a set has a neighborhood around it that’s also inside the set, then the set’s open.

You’re going to accuse me of begging the question. Fair enough. I was using open sets to define open sets. This use is all right for an intuitive idea of what makes a set open, but it’s not rigorous. We can give in and say we have to have distance. Then we have N-balls and we can build open sets out of balls that don’t contain the edges. Or we can try to drive distance out of our idea of open sets.

We can do it this way. Start off by saying the whole universe is an open set. Also that the union of any number of open sets is also an open set. And that the intersection of any finite number of open sets is also an open set. Does this sound weak? So it sounds weak. It’s enough. We get the open sets we were thinking of all along from this.
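Here’s a small Python sketch of that axiomatic way of doing things, on a universe of just three points. The particular collection of “open” sets is one I made up for illustration; the code only checks that it obeys the rules above, with the empty set included since it’s the union of no sets at all.

from itertools import combinations

universe = frozenset({1, 2, 3})
# A candidate collection of open sets on this universe.
opens = {frozenset(), frozenset({1}), frozenset({2, 3}), universe}

# The whole universe (and the empty set) must be open.
assert universe in opens and frozenset() in opens
# The union of any collection of open sets must be open.
for r in range(1, len(opens) + 1):
    for sets in combinations(opens, r):
        assert frozenset().union(*sets) in opens
# The intersection of any two (hence any finite number of) open sets must be open.
for s, t in combinations(opens, 2):
    assert s & t in opens
print("This collection really is a topology on", set(universe))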

This works for the sets that look like territories on a map. It also works for sets for which we have some idea of distance, however strange it is to our everyday distances. It even works if we don’t have any idea of distance. This lets us talk about topological spaces, and study what geometry looks like if we can’t tell how far apart two points are. We can, for example, at least tell that two points are different. Can we find a neighborhood of one that doesn’t contain the other? Then we know they’re some distance apart, even without knowing what distance is.

That we reached so abstract an idea of what an open set is without losing the idea’s usefulness suggests we’re doing well. So we are. It also shows why Nicolas Bourbaki, the famous nonexistent mathematician, thought set theory and its related ideas were the core of mathematics. Today category theory is a more popular candidate for the core of mathematics. But set theory is still close to the core, and much of analysis is about what we can know from the fact of sets being open. Open sets let us explain a lot.

My Mathematics Reading For The 13th of June


I’m working on the next Why Stuff Can Orbit post, this one to feature a special little surprise. In the meanwhile here’s some of the things I’ve read recently and liked.

The Theorem of the Day is just what the name offers. They’re fit onto single slides, so there’s not much text to read. I’ll grant some of them might be hard reading at once, though, if you’re not familiar with the lingo. Anyway, this particular theorem, the Lindemann-Weierstrass Theorem, is one of the famous ones. Also one of the best-named ones. Karl Weierstrass is one of those names you find all over analysis. Over the latter half of the 19th century he attacked the logical problems that had bugged calculus for the previous three centuries and beat them all. I’m lying, but not by much. Ferdinand von Lindemann’s name turns up less often, but he’s known in mathematics circles for proving that π is transcendental (and so, ultimately, that the circle can’t be squared by compass and straightedge). And he was David Hilbert’s thesis advisor.

The Lindemann-Weierstrass Theorem is one of those little utility theorems that’s neat on its own, yes, but is good for proving other stuff. This theorem says that if a given number is algebraic (ask about that some A To Z series) then e raised to that number has to be transcendental, and vice-versa. (The exception: e raised to 0 is equal to 1.) The page also mentions one of those fun things you run across when you have a scientific calculator and can repeat an operation on whatever the result of the last operation was.

I’ve mentioned Maths By A Girl before, but it’s worth checking in again. This is a piece about Apéry’s Constant, which is one of those numbers mathematicians have heard of, and we don’t know whether it’s transcendental or not. It’s hard proving numbers are transcendental. If you go out trying to build a transcendental number it’s easy, but otherwise, you have to hope you know your number is the exponential of an algebraic number.

I forget which Twitter feed brought this to my attention, but here’s a couple geometric theorems demonstrated and explained some by Dave Richeson. There’s something wonderful in a theorem that’s mostly a picture. It feels so supremely mathematical to me.

And last, Katherine Bourzac writing for Nature.com reports the creation of a two-dimensional magnet. This delights me since one of the classic problems in statistical mechanics is a thing called the Ising model. It’s a basic model for the mathematics of how magnets would work. The one-dimensional version is simple enough that you can give it to undergrads and have them work through the whole problem. The two-dimensional version is a lot harder to solve and I’m not sure I ever saw it laid out even in grad school. (Mind, I went to grad school for mathematics, not physics, and the subject is a lot more physics.) The four- and higher-dimensional model can be solved by a clever approach called mean field theory. The three-dimensional model … I don’t think has any exact solution, which seems odd given how that’s the version you’d think was most useful.

That there’s a real two-dimensional magnet (well, a one-molecule-thick magnet) doesn’t really affect the model of two-dimensional magnets. The model is interesting enough for its mathematics, which teaches us about all kinds of phase transitions. And it’s close enough to the way certain aspects of real-world magnets behave to enlighten our understanding. The topic couldn’t avoid drawing my eye, is all.

What Second Derivatives Are And What They Can Do For You


Previous supplemental reading for Why Stuff Can Orbit:


This is another supplemental piece because it’s too much to include in the next bit of Why Stuff Can Orbit. I need some more stuff about how a mathematical physicist would look at something.

This is also a story about approximations. A lot of mathematics is really about approximations. I don’t mean numerical computing. We all know that when we compute we’re making approximations. We use 0.333333 instead of one-third and we use 3.141592 instead of π. But a lot of precise mathematics, what we call analysis, is also about approximations. We do this by a logical structure that works something like this: take something we want to prove. Now for every positive number ε we can find something — a point, a function, a curve — that’s no more than ε away from the thing we’re really interested in, and which is easier to work with. Then we prove whatever we want to with the easier-to-work-with thing. And since ε can be as tiny a positive number as we want, we can suppose ε is a tinier difference than we can hope to measure. And so the difference between the thing we’re interested in and the thing we’ve proved something interesting about is zero. (This is the part that feels like we’re pulling a scam. We’re not, but this is where it’s worth stopping and thinking about what we mean by “a difference between two things”. When you feel confident this isn’t a scam, continue.) So we proved whatever we proved about the thing we’re interested in. Take an analysis course and you will see this all the time.

When we get into mathematical physics we do a lot of approximating functions with polynomials. Why polynomials? Yes, because everything is polynomials. But also because polynomials make so much mathematical physics easy. Polynomials are easy to calculate, if you need numbers. Polynomials are easy to integrate and differentiate, if you need analysis. Here that’s the calculus that tells you about patterns of behavior. If you want to approximate a continuous function you can always do it with a polynomial. The polynomial might have to be infinitely long to approximate the entire function. That’s all right. You can chop it off after finitely many terms. This finite polynomial is still a good approximation. It’s just good for a smaller region than the infinitely long polynomial would have been.

Necessary qualifiers: pages 65 through 82 of any book on real analysis.

So. Let me get to functions. I’m going to use a function named ‘f’ because I’m not wasting my energy coming up with good names. (When we get back to the main Why Stuff Can Orbit sequence this is going to be ‘U’ for potential energy or ‘E’ for energy.) It’s got a domain that’s the real numbers, and a range that’s the real numbers. To express this in symbols I can write f: \Re \rightarrow \Re . If I have some number called ‘x’ that’s in the domain then I can tell you what number in the range is matched by the function ‘f’ to ‘x’: it’s the number ‘f(x)’. You were expecting maybe 3.5? I don’t know that about ‘f’, not yet anyway. The one thing I do know about ‘f’, because I insist on it as a condition for appearing, is that it’s continuous. It hasn’t got any jumps, any gaps, any regions where it’s not defined. You could draw a curve representing it with a single, if wriggly, stroke of the pen.

I mean to build an approximation to the function ‘f’. It’s going to be a polynomial expansion, a set of things to multiply and add together that’s easy to find. To make this polynomial expansion I need to choose some point to build the approximation around. Mathematicians call this the “point of expansion” because we froze up in panic when someone asked what we were going to name it, okay? But how are we going to make an approximation to a function if we don’t have some particular point we’re approximating around?

(One answer we find in grad school when we pick up some stuff from linear algebra we hadn’t been thinking about. We’ll skip it for now.)

I need a name for the point of expansion. I’ll use ‘a’. Many mathematicians do. Another popular name for it is ‘x0‘. Or if you’re using some other variable name for stuff in the domain then whatever that variable is with subscript zero.

So my first approximation to the original function ‘f’ is … oh, shoot, I should have some new name for this. All right. I’m going to use ‘F0‘ as the name. This is because it’s one of a set of approximations, each of them a little better than the old. ‘F1‘ will be better than ‘F0‘, but ‘F2‘ will be even better, and ‘F2038‘ will be way better yet. I’ll also say something about what I mean by “better”, although you’ve got some sense of that already.

I start off by calling the first approximation ‘F0‘ by the way because you’re going to think it’s too stupid to dignify with a number as big as ‘1’. Well, I have other reasons, but they’ll be easier to see in a bit. ‘F0‘, like all its sibling ‘Fn‘ functions, has a domain of the real numbers and a range of the real numbers. The rule defining how to go from a number ‘x’ in the domain to some real number in the range?

F^0(x) = f(a)

That is, this first approximation is simply whatever the original function’s value is at the point of expansion. Notice that’s an ‘x’ on the left side of the equals sign and an ‘a’ on the right. This seems to challenge the idea of what an “approximation” even is. But it’s legit. Supposing something to be constant is often a decent working assumption. If you failed to check what the weather for today will be like, supposing that it’ll be about like yesterday will usually serve you well enough. If you aren’t sure where your pet is, you look first wherever you last saw the animal. (Or, yes, where your pet most loves to be. A particular spot, though.)

We can make this rigorous. A mathematician thinks this is rigorous: you pick any margin of error you like. Then I can find a region near enough to the point of expansion. The value for ‘f’ for every point inside that region is ‘f(a)’ plus or minus your margin of error. It might be a small region, yes. Doesn’t matter. It exists, no matter how tiny your margin of error was.

But yeah, that expansion still seems too cheap to work. My next approximation, ‘F1‘, will be a little better. I mean that we can expect it will be closer than ‘F0‘ was to the original ‘f’. Or it’ll be as close for a bigger region around the point of expansion ‘a’. What it’ll represent is a line. Yeah, ‘F0‘ was a line too. But ‘F0‘ is a horizontal line. ‘F1‘ might be a line at some completely other angle. If that works better. The second approximation will look like this:

F^1(x) = f(a) + m\cdot\left(x - a\right)

Here ‘m’ serves its traditional yet poorly-explained role as the slope of a line. What the slope of that line should be we learn from the derivative of the original ‘f’. The derivative of a function is itself a new function, with the same domain and the same range. There’s a couple ways to denote this. Each way has its strengths and weaknesses about clarifying what we’re doing versus how much we’re writing down. And trying to write down almost anything can inspire confusion in analysis later on. There’s a part of analysis where you have to shift from thinking about particular problems to thinking about how problems work in general.

So I will define a new function, spoken of as f-prime, this way:

f'(x) = \frac{df}{dx}\left(x\right)

If you look closely you realize there’s two different meanings of ‘x’ here. One is the ‘x’ that appears in parentheses. It’s the value in the domain of f and of f’ where we want to evaluate the function. The other ‘x’ is the one in the lower side of the derivative, in that \frac{df}{dx} . That’s my sloppiness, but it’s not uniquely mine. Mathematicians keep this straight by using the symbols \frac{df}{dx} so much they don’t even see the ‘x’ down there anymore so have no idea there’s anything to find confusing. Students keep this straight by guessing helplessly about what their instructors want and clinging to anything that doesn’t get marked down. Sorry. But what this means is to “take the derivative of the function ‘f’ with respect to its variable, and then, evaluate what that expression is for the value of ‘x’ that’s in parentheses on the left-hand side”. We can do some things that avoid the confusion in symbols there. They all require adding some more variables and some more notation in, and it looks like overkill for a measly definition like this.

Anyway. We really just want the derivative evaluated at one point, the point of expansion. That is:

m = f'(a) = \frac{df}{dx}\left(a\right)

which by the way avoids that overloaded meaning of ‘x’ there. Put this together and we have what we call the tangent line approximation to the original ‘f’ at the point of expansion:

F^1(x) = f(a) + f'(a)\cdot\left(x - a\right)

This is also called the tangent line, because it’s a line that’s tangent to the original function. A plot of ‘F1‘ and the original function ‘f’ are guaranteed to touch one another only at the point of expansion. They might happen to touch again, but that’s luck. The tangent line will be close to the original function near the point of expansion. It might happen to be close again later on, but that’s luck, not design. Most stuff you might want to do with the original function you can do with the tangent line, but the tangent line will be easier to work with. It exactly matches the original function at the point of expansion, and its first derivative exactly matches the original function’s first derivative at the point of expansion.

We can do better. We can find a parabola, a second-order polynomial that approximates the original function. This will be a function ‘F2(x)’ that looks something like:

F^2(x) = f(a) + f'(a)\cdot\left(x - a\right) + \frac12 m_2 \left(x - a\right)^2

What we’re doing is adding a parabola to the approximation. This is that curve that looks kind of like a loosely-drawn U. The ‘m2‘ there measures how spread out the U is. It’s not quite the slope, but it’s kind of like that, which is why I’m using the letter ‘m’ for it. Its value we get from the second derivative of the original ‘f’:

m_2 = f''(a) = \frac{d^2f}{dx^2}\left(a\right)

We find the second derivative of a function ‘f’ by evaluating the first derivative, and then, taking the derivative of that. We can denote it with two ‘ marks after the ‘f’ as long as we aren’t stuck wrapping the function name in ‘ marks to set it out. And so we can describe the function this way:

F^2(x) = f(a) + f'(a)\cdot\left(x - a\right) + \frac12 f''(a) \left(x - a\right)^2

This will be a better approximation to the original function near the point of expansion. Or it’ll make larger the region where the approximation is good.
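Here’s a quick numerical sketch of those first three approximations, in Python. I’m using the exponential function as my example ‘f’, since every one of its derivatives at the point of expansion is just e^a; the point of expansion and the test points are arbitrary picks.

import math

a = 0.5                              # the point of expansion, an arbitrary pick
fa = fpa = fppa = math.exp(a)        # f(a), f'(a), and f''(a); for exp they are all e^a

def F0(x): return fa
def F1(x): return fa + fpa * (x - a)
def F2(x): return fa + fpa * (x - a) + 0.5 * fppa * (x - a)**2

for x in [0.6, 0.9, 1.5]:            # test points at increasing distance from a
    print(x, abs(math.exp(x) - F0(x)), abs(math.exp(x) - F1(x)), abs(math.exp(x) - F2(x)))

Notice how the errors shrink as the order goes up, and shrink fastest for the test point closest to the point of expansion.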

If the first derivative of a function at a point is zero that means the tangent line is horizontal. In physics stuff this is an equilibrium. The second derivative can tell us whether the equilibrium is stable or not. If the second derivative at the equilibrium is positive it’s a stable equilibrium. The function looks like a bowl open at the top. If the second derivative at the equilibrium is negative then it’s an unstable equilibrium.
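As a tiny illustration, here’s a sketch with a made-up potential energy function, U(x) = x^4 - 2 x^2. Its equilibriums are at x = -1, 0, and 1, and the sign of the second derivative sorts them into stable and unstable.

def U_second_derivative(x):
    # For U(x) = x^4 - 2x^2 we get U'(x) = 4x^3 - 4x and U''(x) = 12x^2 - 4.
    return 12 * x**2 - 4

for x_eq in [-1.0, 0.0, 1.0]:        # the points where U'(x) = 0
    kind = "stable" if U_second_derivative(x_eq) > 0 else "unstable"
    print(x_eq, U_second_derivative(x_eq), kind)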

We can make better approximations yet, by using even more derivatives of the original function ‘f’ at the point of expansion:

F^3(x) = f(a) + f'(a)\cdot\left(x - a\right) + \frac12 f''(a) \left(x - a\right)^2 + \frac{1}{3\cdot 2} f'''(a) \left(x - a\right)^3

There’s better approximations yet. You can probably guess what the next, fourth-degree, polynomial would be. Or you can after I tell you the fraction in front of the new term will be \frac{1}{4\cdot 3\cdot 2} . The only big difference is that after about the third derivative we give up on adding ‘ marks after the function name ‘f’. It’s just too many little dots. We start writing, like, ‘f(iv)‘ instead. Or if the Roman numerals are too much then ‘f(2038)‘ instead. Or if we don’t want to pin things down to a specific value ‘f(j)‘ with the understanding that ‘j’ is some whole number.
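And here’s how the general pattern, with those factorial-style fractions out front, looks as a sketch. I’m sticking with the exponential function because its j-th derivative at ‘a’ is always e^a; for any other function you’d swap in its actual derivatives.

import math

a = 0.5                              # point of expansion again, an arbitrary pick

def F(n, x):
    # Degree-n approximation: sum of (1/j!) times the j-th derivative at a times (x - a)^j.
    # For exp, every derivative at a is simply e^a.
    return sum(math.exp(a) * (x - a)**j / math.factorial(j) for j in range(n + 1))

x = 1.5
for n in [0, 1, 2, 3, 6, 10]:
    print(n, abs(math.exp(x) - F(n, x)))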

We don’t need all of them. In physics problems we get equilibriums from the first derivative. We get stability from the second derivative. And we get springs in the second derivative too. And that’s what I hope to pick up on in the next installment of the main series.

Words About A Wordless Induction Proof


This pair of tweets came across my feed. And who doesn’t like a good visual proof of a mathematical fact? I hope you enjoy.

So here’s the proposition.

This is the sort of identity we normally try proving by induction. Induction is a great scheme for proving identities like this. It works by finding some index on the formula. Then show that if the formula is true for one value of the index, then it’s true for the next-higher value of the index. Finally, find some value of the index for which it’s easy to check that the formula’s true. And that proves it’s true for all the values of that index above that base.

In this case the index is ‘n’. It’s really easy to prove the base case, since 1^3 is equal to 1^2, what with ‘1’ being the number everybody likes to raise to powers. Proving that if it’s true in one case — 1^3 + 2^3 + 3^3 + \cdots + n^3 — then it’s true for the next — 1^3 + 2^3 + 3^3 + \cdots + n^3 + (n + 1)^3 — is work. But you can get it done.
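Spelled out, the identity in question is presumably the classic one, and the algebra that carries the inductive step comes down to a single expansion:

1^3 + 2^3 + 3^3 + \cdots + n^3 = \left(\frac{n(n+1)}{2}\right)^2

\left(\frac{n(n+1)}{2}\right)^2 + (n+1)^3 = \left(\frac{(n+1)(n+2)}{2}\right)^2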

And then there’s this, done visually:

It took me a bit of reading before I was confident in what it was showing. But it is all there.

As often happens with these wordless proofs you can ask whether it is properly speaking a proof. A proof is an argument, and to be complete it has to contain every step needed to deduce the conclusion from the premises, each step following one of the rules of inference. Thing is, basically no proof is complete that way, because it takes forever. We elide stuff that seems obvious, confident that if we had to we could fill in the intermediate steps. A wordless proof like this trusts that if we try to describe what is in the picture then we are constructing the argument.

That’s surely enough of my words.

The End 2016 Mathematics A To Z Roundup


As is my tradition for the end of these roundups (see Summer 2015 and then Leap Day 2016) I want to just put up a page listing the whole set of articles. It’s a chance for people who missed a piece to easily see what they missed. And it lets me recover that little bit extra from the experience. Run over the past two months were:

The End 2016 Mathematics A To Z: Xi Function


I have today another request from gaurish, who’s also been good enough to give me requests for ‘Y’ and ‘Z’. I apologize for coming to this a day late. But it was Christmas and many things demanded my attention.

Xi Function.

We start with complex-valued numbers. People discovered them because they were useful tools to solve polynomials. They turned out to be more than useful fictions, if numbers are anything more than useful fictions. We can add and subtract them easily. Multiply and divide them less easily. We can even raise them to powers, or raise numbers to them.

If you become a mathematics major then somewhere in Intro to Complex Analysis you’re introduced to an exotic, infinitely large sum. It’s spoken of reverently as the Riemann Zeta Function, and it connects to something named the Riemann Hypothesis. Then you remember that you’ve heard of this, because if you’re willing to become a mathematics major you’ve read mathematics popularizations. And you know the Riemann Hypothesis is an unsolved problem. It proposes something that might be true or might be false. Either way has astounding implications for the way numbers fit together.

Riemann here is Bernhard Riemann, who’s turned up often in these A To Z sequences. We saw him in spheres and in sums, leading to integrals. We’ll see him again. Riemann just covered so much of 19th century mathematics; we can’t talk about calculus without him. Zeta, Xi, and later on, Gamma are the famous Greek letters. Mathematicians fall back on them because the Roman alphabet just hasn’t got enough letters for our needs. I’m writing them out as English words instead because if you aren’t familiar with them they look like an indistinct set of squiggles. Even if you are familiar, sometimes. I got somewhat confused researching this because I kept slipping between a lowercase-xi and a lowercase-zeta in my mind. All I can plead is it’s been a hard week.

Riemann’s Zeta function is famous. It’s easy to approach. You can write it as a sum. An infinite sum, but still, those are easy to understand. Pick a complex-valued number. I’ll call it ‘s’ because that’s the standard. Next take each of the counting numbers: 1, 2, 3, and so on. Raise each of them to the power ‘s’. And take the reciprocal, one divided by those numbers. Add all that together. You’ll get something. Might be real. Might be complex-valued. Might be zero. We know many values of ‘s’ that would give us a zero. The Riemann Hypothesis is about characterizing all the possible values of ‘s’ that give us a zero. We know some of them, so boring we call them trivial: -2, -4, -6, -8, and so on. (This looks crazy. There’s another way of writing the Riemann Zeta function which makes it obvious instead.) The Riemann Hypothesis is about whether all the proper, that is, non-boring values of ‘s’ that give us a zero are 1/2 plus some imaginary number.
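Written out, that recipe is the sum:

\zeta(s) = \sum_{n = 1}^{\infty} \frac{1}{n^s}

(The sum only makes sense directly when the real part of ‘s’ is bigger than 1; the rest of the function, including those trivial zeroes, comes from extending it carefully.)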

It’s a rare thing mathematicians have only one way of writing. If something’s been known and studied for a long time there are usually variations. We find different ways to write the problem. Or we find different problems which, if solved, would solve the original problem. The Riemann Xi function is an example of this.

I’m going to spare you the formula for it. That’s in self-defense. I haven’t found an expression of the Xi function that isn’t a mess. The normal ways to write it themselves call on the Zeta function, as well as the Gamma function. The Gamma function looks like factorials, for the counting numbers. It does its own thing for other complex-valued numbers.

That said, I’m not sure what the advantages are in looking at the Xi function. The one that people talk about is its symmetry. Its value at a particular complex-valued number ‘s’ is the same as its value at the number ‘1 – s’. This may not seem like much. But it gives us this way of rewriting the Riemann Hypothesis. Imagine all the complex-valued numbers with the same imaginary part. That is, all the numbers that we could write as, say, ‘x + 4i’, where ‘x’ is some real number. If the size of the value of Xi, evaluated at ‘x + 4i’, always increases as ‘x’ starts out equal to 1/2 and increases, then the Riemann hypothesis is true. (This has to be true not just for ‘x + 4i’, but for all possible imaginary numbers. So, ‘x + 5i’, and ‘x + 6i’, and even ‘x + 4.1 i’ and so on. But it’s easier to start with a single example.)
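For the record, that symmetry is the one tidy equation in the whole business:

\xi(s) = \xi\left(1 - s\right)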

Or another way to write it. Suppose the size of the value of Xi, evaluated at ‘x + 4i’ (or whatever), always gets smaller as ‘x’ starts out at a negative infinitely large number and keeps increasing all the way to 1/2. If that’s true, and true for every imaginary number, including ‘x – i’, then the Riemann hypothesis is true.

And it turns out if the Riemann hypothesis is true we can prove the two cases above. We’d write the theorem about this in our papers with the start ‘The Following Are Equivalent’. In our notes we’d write ‘TFAE’, which is just as good. Then we’d take whichever of them seemed easiest to prove and find out it isn’t that easy after all. But if we do get through we declare ourselves fortunate, sit back feeling triumphant, and consider going out somewhere to celebrate. But we haven’t got any of these alternatives solved yet. None of the equivalent ways to write it has helped so far.

We know some things. For example, we know there are infinitely many roots for the Xi function with a real part that’s 1/2. This is what we’d need for the Riemann hypothesis to be true. But we don’t know that all of them are.

The Xi function isn’t entirely about what it can tell us for the Zeta function. The Xi function has its own exotic and wonderful properties. In a 2009 paper on arxiv.org, for example, Drs Yang-Hui He, Vishnu Jejjala, and Djordje Minic describe how if the zeroes of the Xi function are all exactly where we expect them to be then we learn something about a particular kind of string theory. I admit not knowing just what to say about a genus-one free energy of the topological string past what I have read in this paper. In another paper they write of how the zeroes of the Xi function correspond to the description of the behavior for a quantum-mechanical operator that I just can’t find a way to describe clearly in under three thousand words.

But mathematicians often speak of the strangeness that mathematical constructs can match reality so well. And here is surely a powerful one. We learned of the Riemann Hypothesis originally by studying how many prime numbers there are compared to the counting numbers. If it’s true, then the physics of the universe may be set up one particular way. Is that not astounding?

The End 2016 Mathematics A To Z: Weierstrass Function


I’ve teased this one before.

Weierstrass Function.

So you know how the Earth is a sphere, but from our normal vantage point right up close to its surface it looks flat? That happens with functions too. Here I mean the normal kinds of functions we deal with, ones with domains that are the real numbers or a Euclidean space. And ranges that are real numbers. The functions you can draw on a sheet of paper with some wiggly bits. Let the function wiggle as much as you want. Pick a part of it and zoom in close. That zoomed-in part will look straight. If it doesn’t look straight, zoom in closer.

We rely on this. Functions that are straight, or at least straight enough, are easy to work with. We can do calculus on them. We can do analysis on them. Often the best approach to working with the function you’re interested in is to approximate it with an easy-to-work-with function. I bet it’ll be a polynomial. That serves us well. Polynomials are these continuous functions. They’re differentiable. They’re smooth.

That thing about the Earth looking flat, though? That’s a lie. I’ve never been to any of the really great cuts in the Earth’s surface, but I have been to some decent gorges. I went to grad school in the Hudson River Valley. I’ve driven I-80 over Pennsylvania’s scariest bridges. There’s points where the surface of the Earth just drops a great distance between your one footstep and your last.

Functions do that too. We can have points where a function isn’t differentiable, where it’s impossible to define the direction it’s headed. We can have points where a function isn’t continuous, where it jumps from one region of values to another region. Everyone knows this. We can’t dismiss those as aberrations not worthy of the name “function”; too many of them are too useful. Typically we handle this by admitting there’s points that aren’t continuous and we chop the function up. We make it into a couple of functions, each stretching from discontinuity to discontinuity. Between the discontinuities we have continuous regions and we can go about our business as before.

Then came the 19th century when things got crazy. This particular craziness we credit to Karl Weierstrass. Weierstrass’s name is all over 19th century analysis. He had that talent for probing the limits of our intuition about basic mathematical ideas. We have a calculus that is logically rigorous because he found great counterexamples to what we had assumed without proving.

The Weierstrass function challenges this idea that any function is going to eventually level out. Or that we can even smooth a function out into basically straight, predictable chunks in-between sudden changes of direction. The function is continuous everywhere; you can draw it perfectly without lifting your pen from paper. But it always looks like a zig-zag pattern, jumping around like it was always randomly deciding whether to go up or down next. Zoom in on any patch and it still jumps around, zig-zagging up and down. There’s never an interval where it’s always moving up, or always moving down, or even just staying constant.
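For the curious, the example Weierstrass himself used can be written as a sum of ever-faster cosine wiggles:

W(x) = \sum_{n = 0}^{\infty} a^n \cos\left(b^n \pi x\right), \qquad 0 < a < 1, \quad ab > 1 + \frac{3\pi}{2}

with ‘b’ a positive odd whole number. Every single term is as smooth as can be; it’s the sum of all of them together that zig-zags at every scale.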

Despite being continuous it’s not differentiable. I’ve described that casually as it being impossible to predict where the function is going. That’s an abuse of words, yes. The function is defined. Its value at a point isn’t any more random than the value of “x^2” is for any particular x. The unpredictability I’m talking about here is a side effect of ignorance. Imagine I showed you a plot of “x^2” with a part of it concealed and asked you to fill in the gap. You’d probably do pretty well estimating it. The Weierstrass function, though? No; your guess would be lousy. My guess would be lousy too.

That’s a weird thing to have happen. A century and a half later it’s still weird. It gets weirder. The Weierstrass function isn’t differentiable generally. But there are exceptions. There are little dots of differentiability, where the rate at which the function changes is known. Not intervals, though. Single points. This is crazy. Derivatives are about how a function changes. We work out what they should even mean by thinking of a function’s value on strips of the domain. Those strips are small, but they’re still, you know, strips. But on almost all of that strip the derivative isn’t defined. It’s only at isolated points, a set with measure zero, that this derivative even exists. It evokes the medieval Mysteries, of how we are supposed to try, even though we know we shall fail, to understand how God can have contradictory properties.

It’s not quite that Mysterious here. Properties like this challenge our intuition, if we’ve gotten any. Once we’ve laid out good definitions for ideas like “derivative” and “continuous” and “limit” and “function” we can work out whether results like this make sense. And they — well, they follow. We can avoid weird conclusions like this, but at the cost of messing up our definitions for what a “function” and other things are. Making those useless. For the mathematical world to make sense, we have to change our idea of what quite makes sense.

That’s all right. When we look close we realize the Earth around us is never flat. Even reasonably flat areas have slight rises and falls. The ends of properties are marked with curbs or ditches, and bordered by streets that rise to a center. Look closely even at the dirt and we notice that as level as it gets there are still rocks and scratches in the ground, clumps of dirt an infinitesimal bit higher here and lower there. The flatness of the Earth around us is a useful tool, but we miss a lot by pretending it’s everything. The Weierstrass function is one of the ways a student mathematician learns that while smooth, predictable functions are essential, there is much more out there.

The End 2016 Mathematics A To Z: Smooth


Mathematicians affect a pose of objectivity. We justify this by working on things whose truth we can know, and which must be true whenever we accept certain rules of deduction and certain definitions and axioms. This seems fair. But we choose to pay attention to things that interest us for particular reasons. We study things we like. My A To Z glossary term for today is about one of those things we like.

Smooth.

Functions. Not everything mathematicians do is functions. But functions turn up a lot. We need to set some rules. “A function” is so generic a thing we can’t handle it much. Narrow it down. Pick functions with domains that are numbers. Range too. By numbers I mean real numbers, maybe complex numbers. That gives us something.

There are functions that are hard to work with. This is almost all of them, so we don’t touch them unless we absolutely must. These are the functions that aren’t continuous. That means what you imagine: the value of the function at some point is wholly unrelated to its value at some nearby point. It’s hard to work with anything that’s unpredictable like that. Functions as well as people.

We like functions that are continuous. They’re predictable. We can make approximations. We can estimate the function’s value at some point using its value at some more convenient point. It’s easy to see why that’s useful for numerical mathematics, for calculations to approximate stuff. The dazzling thing is it’s useful analytically. We step into the Platonic-ideal world of pure mathematics. We have tools that let us work as if we had infinitely many digits of precision, for infinitely many numbers at once. And yet we use estimates and approximations and errors. We use them in ways to give us perfect knowledge; we get there by estimates.

Continuous functions are nice. Well, they’re nicer to us than functions that aren’t continuous. But there are even nicer functions. Functions nicer to us. A continuous function, for example, can have corners; it can change direction suddenly and without warning. A differentiable function is more predictable. It can’t have corners like that. Knowing the function well at one point gives us more information about what it’s like nearby.

The derivative of a function doesn’t have to be continuous. Grumble. It’s nice when it is, though. It makes the function easier to work with. It’s really nice for us when the derivative itself has a derivative. Nothing guarantees that the derivative of a derivative is continuous. But maybe it is. Maybe the derivative of the derivative has a derivative. That’s a function we can do a lot with.

A function is “smooth” if it has as many derivatives as we need for whatever it is we’re doing. And if those derivatives are continuous. If this seems loose that’s because it is. A proof for whatever we’re interested in might need only the original function and its first derivative. It might need the original function and its first, second, third, and fourth derivatives. It might need hundreds of derivatives. If we look through the details of the proof we might find exactly how many derivatives we need and how many of them need to be continuous. But that’s tedious. We save ourselves considerable time by saying the function is “smooth”, as in, “smooth enough for what we need”.

If we do want to specify how many continuous derivatives a function has we call it a “C^k function”. The C here means continuous. The ‘k’ means the function has ‘k’ continuous derivatives. This is completely different from a “C^k function” written with a boldface C, which would be one that’s a k-dimensional vector. Whether the “C” is boldface or not is important. A function might have infinitely many continuous derivatives. That we call a “C^∞ function”. That’s got wonderful properties, especially if the domain and range are complex-valued numbers. We couldn’t do Complex Analysis without it. Complex Analysis is the course students take after wondering how they’ll ever survive Real Analysis. It’s much easier than Real Analysis. Mathematics can be strange.
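A quick example of the difference, one you can check by hand: the absolute value of x, cubed, has two continuous derivatives but no third one, while the exponential has as many as you could ask for.

f(x) = \left|x\right|^3: \qquad f'(x) = 3x\left|x\right|, \quad f''(x) = 6\left|x\right|, \quad f'''(0) \text{ does not exist}

So this f is a C^2 function but not a C^3 function; e^x is C^∞.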

The End 2016 Mathematics A To Z: Kernel


I told you that Image thing would reappear. Meanwhile I learned something about myself in writing this.

Kernel.

I want to talk about functions again. I’ve been keeping like a proper mathematician to a nice general idea of what a function is. The sort where a function’s this rule matching stuff in a set called the domain with stuff in a set called the range. And I’ve tried not to commit myself to saying anything about what that domain and range are. They could be numbers. They could be other functions. They could be the set of DVDs you own but haven’t watched in more than two years. They could be collections of socks. Haven’t said.

But we know what functions anyone cares about. They’re stuff that have domains and ranges that are numbers. Preferably real numbers. Complex-valued numbers if we must. If we look at more exotic sets they’re ones that stick close to being numbers: vectors made up of an ordered set of numbers. Matrices of numbers. Functions that are themselves about numbers. Maybe we’ll get to something exotic like a rotation, but then what is a rotation but spinning something a certain number of degrees? There are a bunch of unavoidably common domains and ranges.

Fine, then. I’ll stick to functions with ranges that look enough like regular old numbers. By “enough” I mean they have a zero. That is, something that works like zero does. You know, add it to something else and that something else isn’t changed. That’s all I need.

A natural thing to wonder about a function — hold on. “Natural” is the wrong word. Something we learn to wonder about in functions, in pre-algebra class where they’re all polynomials, is where the zeroes are. They’re generally not at zero. Why would we say “zeroes” to mean “zero”? That could let non-mathematicians think they knew what we were on about. By the “zeroes” we mean the things in the domain that get matched to the zero in the range. It might be zero; no reason it couldn’t, until we know what the function’s rule is. Just we can’t count on that.

A polynomial we know has … well, it might have zero zeroes. It might have one, or two, or so on. If it’s an n-th degree polynomial it can have up to n zeroes. And if it’s not a polynomial? Well, then it could have any conceivable number of zeroes and nobody is going to give you a nice little formula to say where they all are. It’s not that we’re being mean. It’s just that there isn’t a nice little formula that works for all possibilities. There aren’t even nice little formulas that work for all polynomials. You have to find zeroes by thinking about the problem. Sorry.

But! Suppose you have a collection of all the zeroes for your function. That’s all the points in the domain that match with zero in the range. Then we have a new name for the thing you have. And that’s the kernel of your function. It’s the biggest subset in the domain with an image that’s just the zero in the range.

So we have a name for the zeroes that isn’t just “the zeroes”. What does this get us?

If we don’t know anything about the kind of function we have, not much. If the function belongs to some common kinds of functions, though, it tells us stuff.

For example. Suppose the function has domain and range that are vectors. And that the function is linear, which is to say, easy to deal with. Let me call the function ‘f’. And let me pick out two things in the domain. I’ll call them ‘x’ and ‘y’ because I’m writing this after Thanksgiving dinner and can’t work up a cleverer name for anything. If f is linear then f(x + y) is the same thing as f(x) + f(y). And now something magic happens. If x and y are both in the kernel, then x + y has to be in the kernel too. Think about it. Meanwhile, if x is in the kernel but y isn’t, then f(x + y) is f(y). Again think about it.

What we can see is that the domain fractures into two directions. One of them, the direction of the kernel, is invisible to the function. You can move however much you like in that direction and f can’t see it. The other direction, perpendicular (“orthogonal”, we say in the trade) to the kernel, is visible. Everything that might change changes in that direction.

This idea threads through vector spaces, and we study a lot of things that turn out to look like vector spaces. It keeps surprising us by letting us solve problems, or find the best-possible approximate solutions. This kernel gives us room to match some fiddly conditions without breaking the real solution. The size of the kernel alone can tell us whether some problems are solvable, or whether they’ll have infinitely large sets of solutions.

In this vector-space construct the kernel often takes on another name, the “null space”. This means the same thing. But it reminds us that superhero comics writers miss out on many excellent pieces of terminology by not taking advanced courses in mathematics.
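Here’s a small sketch of what this looks like computationally, in Python with numpy. The matrix is a throwaway example of mine; the point is that the basis the routine returns spans exactly the directions the map can’t see.

import numpy as np

def null_space_basis(A, tol=1e-12):
    """Return an orthonormal basis for the kernel (null space) of A."""
    _, singular_values, Vt = np.linalg.svd(A)
    # Rows of Vt whose singular values are (numerically) zero span the kernel.
    padded = np.zeros(Vt.shape[0])
    padded[:len(singular_values)] = singular_values
    return Vt[padded <= tol]

# A rank-two map from R^3 to R^3: anything along (1, 1, 1) is invisible to it.
A = np.array([[1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, -1.0]])

kernel = null_space_basis(A)
print(kernel)          # one basis vector, proportional to (1, 1, 1)
print(A @ kernel.T)    # essentially zero: the map can't see this direction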

Kernels also appear in group theory, and again whenever we get into rings. We’re always working with rings. They’re nearly as unavoidable as vector spaces.

You know how you can divide the whole numbers into odd and even? And you can do some neat tricks with that for some problems? You can do that with every ring, using the kernel as a dividing point. This gives us information about how the ring is shaped, and what other structures might look like the ring. This often lets us turn proofs that might be hard into a collection of proofs on individual cases that are, at least, doable. Tricks about odd and even numbers become, in trained hands, subtle proofs of surprising results.

We see vector spaces and rings all over the place in mathematics. Some of that’s selection bias. Vector spaces capture a lot of what’s important about geometry. Rings capture a lot of what’s important about arithmetic. We have understandings of geometry and arithmetic that transcend even our species. Raccoons understand space. Crows understand number. When we look to do mathematics we look for patterns we understand, and these are major patterns we understand. And there are kernels that matter to each of them.

Some mathematical ideas inspire metaphors to me. Kernels are one. Kernels feel to me like the process of holding a polarized lens up to a crystal. This lets one see how the crystal is put together. I realize writing this down that my metaphor is unclear: is the kernel the lens or the structure seen in the crystal? I suppose the function has to be the lens, with the kernel the crystallization planes made clear under it. It’s curious I had enjoyed this feeling about kernels and functions for so long without making it precise. Feelings about mathematical structures can be like that.

The End 2016 Mathematics A To Z: The Fredholm Alternative


Some things are created with magnificent names. My essay today is about one of them. It’s one of my favorite terms and I get a strange little delight whenever it needs to be mentioned in a proof. It’s also the title I shall use for my 1970s Paranoid-Conspiracy Thriller.

The Fredholm Alternative.

So the Fredholm Alternative is about whether this supercomputer with the ability to monitor every commercial transaction in the country falls into the hands of the Parallax Corporation or whether — ahm. Sorry. Wrong one. OK.

The Fredholm Alternative comes from the world of functional analysis. In functional analysis we study sets of functions with tools from elsewhere in mathematics. Some of them you’d be surprised aren’t already in there. There’s adding functions together, multiplying them, the stuff of arithmetic. Some might be a bit surprising, like the stuff we draw from linear algebra. That’s ideas like functions having length, or being at angles to each other. Or how that length and those angles change when we take a function of those functions. This may sound baffling. But a mathematics student who’s got into functional analysis usually has a happy surprise waiting. She discovers the subject is easy. At least, it relies on a lot of stuff she’s learned already, applied to stuff that’s less difficult to work with than, like, numbers.

(This may be a personal bias. I found functional analysis a thoroughgoing delight, even though I didn’t specialize in it. But I got the impression from other grad students that functional analysis was well-liked. Maybe we just got the right instructor for it.)

I’ve mentioned in passing “operators”. These are functions that have a domain that’s a set of functions and a range that’s another set of functions. Suppose you come up to me with some function, let’s say f(x) = x^2 . I give you back some other function — say, F(x) = \frac{1}{3}x^3 - 4 . Then I’m acting as an operator.

Why should I do such a thing? Many operators correspond to doing interesting stuff. Taking derivatives of functions, for example. Or undoing the work of taking a derivative. Describing how changing a condition changes what sorts of outcomes a process has. We do a lot of stuff with these. Trust me.

Let me use the name `T’ for some operator. I’m not going to say anything about what it does. The letter’s arbitrary. We like to use capital letters for operators because it makes the operators look extra important. And we don’t want to use `O’ because that just looks like zero and we don’t need that confusion.

Anyway. We need two functions. One of them will be called ‘f’ because we always call functions ‘f’. The other we’ll call ‘v’. In setting up the Fredholm Alternative we have this important thing: we know what ‘f’ is. We don’t know what ‘v’ is. We’re finding out something about what ‘v’ might be. The operator doing whatever it does to a function we write down as if it were multiplication, that is, like ‘Tv’. We get this notation from linear algebra. There we multiply matrices by vectors. Matrix-times-vector multiplication works like operator-on-a-function stuff. So much so that if we didn’t use the same notation young mathematics grad students would rise in rebellion. “This is absurd,” they would say, in unison. “The connotations of these processes are too alike not to use the same notation!” And the department chair would admit they have a point. So we write ‘Tv’.

If you skipped out on mathematics after high school you might guess we’d write ‘T(v)’ and that would make sense too. And, actually, we do sometimes. But by the time we’re doing a lot of functional analysis we don’t need the parentheses so much. They don’t clarify anything we’re confused about, and they require all the work of parenthesis-making. But I do see it sometimes, mostly in older books. This makes me think mathematicians started out with ‘T(v)’ and then wrote less as people got used to what they were doing.

I admit we might not literally know what ‘f’ is. I mean we know what ‘f’ is in the same way that, for a quadratic equation, “ax^2 + bx + c = 0”, we “know” what ‘a’, ‘b’, and ‘c’ are. Similarly, we don’t know what ‘v’ is in the same way we don’t know what ‘x’ is there. The Fredholm Alternative tells us exactly one of these two things has to be true:

For operators that meet some requirements I don’t feel like getting into, either:

  1. There’s one and only one ‘v’ which makes the equation Tv  = f true.
  2. Or else Tv = 0 for some ‘v’ that isn’t just zero everywhere.

That is, either there’s exactly one solution, or else there’s no solving this particular equation. We can rule out there being two solutions (the way quadratic equations often have), or ten solutions (the way some annoying problems will), or infinitely many solutions (oh, it happens).
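The finite-dimensional shadow of this is a fact about square matrices: either Ax = f has exactly one solution for every f, or Ax = 0 has solutions besides the zero vector. Here’s a small Python sketch of that dichotomy, with matrices I made up for illustration:

import numpy as np

def fredholm_case(A):
    """Report which branch of the matrix version of the alternative holds."""
    if np.linalg.matrix_rank(A) == A.shape[0]:
        return "unique solution for every right-hand side"
    return "nontrivial solutions of Ax = 0 exist"

A_good = np.array([[2.0, 1.0],
                   [1.0, 3.0]])    # invertible: the first branch
A_bad = np.array([[1.0, 2.0],
                  [2.0, 4.0]])     # singular: the second branch

print(fredholm_case(A_good))
print(np.linalg.solve(A_good, np.array([1.0, 0.0])))    # the one and only solution

print(fredholm_case(A_bad))
print(A_bad @ np.array([2.0, -1.0]))    # [0. 0.]: a nonzero vector sent to zero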

It turns up often in boundary value problems. Often before we try solving one we spend some time working out whether there is a solution. You can imagine why it’s worth spending a little time working that out before committing to a big equation-solving project. But it comes up elsewhere. Very often we have problems that, at their core, are “does this operator match anything at all in the domain to a particular function in the range?” When we try to answer we stumble across Fredholm’s Alternative over and over.

Fredholm here was Ivar Fredholm, a Swedish mathematician of the late 19th and early 20th centuries. He worked for Uppsala University, and for the Swedish Social Insurance Agency, and as an actuary for the Skandia insurance company. Wikipedia tells me that his mathematical work was used to calculate buyback prices. I have no idea how.

Reading the Comics, October 1, 2016: Jumble Is Back Edition


Comic Strip Master Command sent another normal-style week for mathematics references. There’s not much that lets me get really chatty or gossipy about mathematics lore. That’s all right. The important thing is: we’ve got Jumble back.

Greg Cravens’s The Buckets for the 25th features a bit of parental nonsense-telling. The rather annoying noise inside a car’s cabin when there’s one window open is the sort of thing fluid mechanics ought to be able to study. I see references claiming this noise to be a Helmholtz Resonance. This is a kind of oscillation in the air that comes from wind blowing across the lone hole in a solid object. Wikipedia says it’s even the same phenomenon producing an ocean-roar in a seashell held up to the ear. It’s named for Hermann von Helmholtz, who described it while studying sound and vortices. Helmholtz is also renowned for making a clear statement of the conservation of energy — an idea many were working towards, mind — and in thermodynamics and electromagnetism and for that matter how the eye works. Also how fast nerves transmit signals. All that said, I’m not sure that all the unpleasant sound heard and pressure felt from a single opened car window is Helmholtz Resonance. Real stuff is complicated and the full story is always more complicated than that. I wouldn’t go farther than saying that Helmholtz Resonance is one thing to look at.

Michael Cavna’s Warped for the 25th uses two mathematics-cliché equations as “amazingly successful formulas”. One can quibble with whether Einstein should be counted under mathematics. Pythagoras, at least for the famous theorem named for him, nobody would argue. John Grisham, I don’t know, the joke seems dated to me but we are talking about the comics.

Tony Carrillo’s F Minus for the 28th uses arithmetic as something no reasonable person can claim is incorrect. I haven’t read the comments, but I am slightly curious whether someone says something snarky about Common Core mathematics — or even the New Math for crying out loud — before or after someone finds a base other than ten that makes the symbols correct.

Cory Thomas’s college-set soap-opera strip Watch Your Head for the 28th name-drops Introduction to Functional Analysis. It won’t surprise you it’s a class nobody would take on impulse. It’s an upper-level undergraduate or a grad-student course, something only mathematics majors would find interesting. But it is very interesting. It’s the reward students have for making it through Real Analysis, the spirit-crushing course about why calculus works. Functional Analysis is about what we can do with functions. We can make them work like numbers. We can define addition and multiplication, we can measure their size, we can create sequences of them. We can treat functions almost as if they were numbers. And while we’re working on things more abstract and more exotic than the ordinary numbers Real Analysis depends on, somehow, Functional Analysis is easier than Real Analysis. It’s a wonder.

Mark Anderson’s Andertoons for the 29th features a student getting worried about the order of arithmetic operations. I appreciate how kids get worried about the feelings of things like that. Although, truly, subtraction doesn’t go “last”; addition and subtraction have the same priority. They share the bottom of the pile, though. Multiplication and division similarly share a priority, above addition-and-subtraction. Many guides to the order of operations say to do addition-and-subtraction in order left to right, but that’s not strictly necessary. Setting a left-to-right order is okay for deciding where to start. But you could work out a string of additions and subtractions in any order and get the same answer, as long as each minus sign stays attached to the number it belongs to.

Four people sitting at a table divided up as a pie chart. The one sitting behind the overwhelming majority of the chunk says, 'C'mon guys ... I feel like I'm doing all the work here.'
Daniel Beyer’s Long Story Short for the 30th of September, 2016. I think Randolph Itch, 2am did this joke too but then had everyone retire to the bar chart.

Daniel Beyer’s Long Story Short for the 30th is a pie chart joke. There’s not a lot of mathematics to it, but I’m amused.

Justin Boyd’s Invisible Bread for the 30th has maybe my favorite dumb joke of the week. It’s just a kite that’s proven its knowledge of mathematics. I’m a little surprised the kite didn’t call out a funnier number, by which I mean 37, but perhaps … no, that doesn’t work, actually. Of course the kite would be comfortable with higher mathematics.

LIPOS O-O-O; PURTE OO---; VONPER -OO---; YETMSS --O-OO. Her students were beginning to understand addition and subtraction OOOO OO OOOO.
David L Hoyt and Jeff Knurek’s Jumble for the 1st of October, 2016. I don’t know that there even is a permanent link for this that would be any good.

And as promised, David L Hoyt and Jeff Knurek’s Jumble for the 1st of October mentions mathematics. That’s enough for me to include here.

Theorem Thursday: Liouville’s Approximation Theorem And How To Make Your Own Transcendental Number


As I get into the second month of Theorem Thursdays I have, I think, the whole roster of weeks sketched out. Today, I want to dive into some real analysis, and the study of numbers. It’s the sort of thing you normally get only if you’re willing to be a mathematics major. I’ll try to be readable by people who aren’t. If you carry through to the end and follow directions you’ll have your very own mathematical construct, too, so enjoy.

Liouville’s Approximation Theorem

It all comes back to polynomials. Of course it does. Polynomials aren’t literally everything in mathematics. They just come close. Among the things we can do with polynomials is divide up the real numbers into different sets. The tool we use is polynomials with integer coefficients. Integers are the positive and the negative whole numbers, stuff like ‘4’ and ‘5’ and ‘-12’ and ‘0’.

A polynomial is the sum of a bunch of products of coefficients multiplied by a variable raised to a power. We can use anything for the variable’s name. So we use ‘x’. Sometimes ‘t’. If we want complex-valued polynomials we use ‘z’. Some people trying to make a point will use ‘y’ or ‘s’ but they’re just showing off. Coefficients are just numbers. If we know the numbers, great. If we don’t know the numbers, or we want to write something that doesn’t commit us to any particular numbers, we use letters from the start of the alphabet. So we use ‘a’, maybe ‘b’ if we must. If we need a lot of numbers, we use subscripts: a_0, a_1, a_2, and so on, up to some a_n for some big whole number n. To talk about one of these without committing ourselves to a specific example we use a subscript of i or j or k: a_j, a_k. It’s possible that a_j and a_k equal each other, but they don’t have to, unless j and k are the same whole number. They might also be zero, but they don’t have to be. They can be any numbers. Or, for this essay, they can be any integers. So we’d write a generic polynomial f(x) as:

f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_{n - 1}x^{n - 1} + a_n x^n

(Some people put the coefficients in the other order, that is, a_n + a_{n - 1}x + a_{n - 2}x^2 and so on. That’s not wrong. The name we give a number doesn’t matter. But it makes it harder to remember what coefficient matches up with, say, x^{14}.)

A zero, or root, is a value for the variable (‘x’, or ‘t’, or what have you) which makes the polynomial equal to zero. It’s possible that ‘0’ is a zero, but don’t count on it. A polynomial of degree n — meaning the highest power to which x is raised is n — can have up to n different real-valued roots. All we’re going to care about is one.

Rational numbers are what we get by dividing one whole number by another. They’re numbers like 1/2 and 5/3 and 6. They’re numbers like -2.5 and 1.0625 and negative a billion. Almost none of the real numbers are rational numbers; they’re exceptional freaks. But they are all the numbers we actually compute with, once we start working out digits. Thus we remember that to live is to live paradoxically.

And every rational number is a root of a first-degree polynomial. That is, there’s some polynomial f(x) = a_0 + a_1 x that’s made zero by your rational number. It’s easy to tell you what it is, too. Pick your rational number. You can write that as the integer p divided by the integer q. Now look at the polynomial f(x) = p – q x. Astounded yet?
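Astounding or not, it checks out in one line:

f\left(\frac{p}{q}\right) = p - q \cdot \frac{p}{q} = p - p = 0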

That trick will work for any rational number. It won’t work for any irrational number. There’s no first-degree polynomial with integer coefficients that has the square root of two as a root. There are polynomials that do, though. There’s f(x) = 2 – x^2. You can find the square root of two as the zero of a second-degree polynomial. You can’t find it as the zero of any lower-degree polynomials. So we say that this is an algebraic number of the second degree.

This goes on higher. Look at the cube root of 2. That’s another irrational number, so no first-degree polynomials have it as a root. And there are no second-degree polynomials that have it as a root, not if we stick to integer coefficients. Ah, but f(x) = 2 – x^3? That’s got it. So the cube root of two is an algebraic number of degree three.

We can go on like this, although I admit examples for higher-order algebraic numbers start getting hard to justify. Most of the numbers people have heard of are either rational or are order-two algebraic numbers. I can tell you truly that the eighth root of two is an eighth-degree algebraic number. But I bet you don’t feel enlightened. At best you feel like I’m setting up for something. The number r(5), the smallest radius a disc can have so that five of them will completely cover a disc of radius 1, is eighth-degree and that’s interesting. But you never imagined the number before and don’t have any idea how big that is, other than “I guess that has to be smaller than 1”. (It’s just a touch less than 0.61.) I sound like I’m wasting your time, although you might start doing little puzzles trying to make smaller coins cover larger ones. Do have fun.

Liouville’s Approximation Theorem is about approximating algebraic numbers with rational ones. Almost everything we ever do is with rational numbers. That’s all right because we can make the difference between the number we want, even if it’s r(5), and the numbers we can compute with, rational numbers, as tiny as we need. We trust that the errors we make from this approximation will stay small. And then we discover chaos science. Nothing is perfect.

For example, suppose we need to estimate π. Everyone knows we can approximate this with the rational number 22/7. That’s about 3.142857, which is all right but nothing great. Some people know we can approximate it as 333/106. (I didn’t until I started writing this paragraph and did some research.) That’s about 3.141509, which is better. Then there’s 355/113, which is not as famous as 22/7 but is a celebrity compared to 333/106. That’s about 3.1415929. Then we get into some numbers only mathematics hipsters know: 103993/33102 and 104348/33215 and so on. Fine.

The Liouville Approximation Theorem is about sequences that converge on an irrational number. So we have our first approximation x_1, that’s the integer p_1 divided by the integer q_1. So, 22 and 7. Then there’s the next approximation x_2, that’s the integer p_2 divided by the integer q_2. So, 333 and 106. Then there’s the next approximation yet, x_3, that’s the integer p_3 divided by the integer q_3. As we look at more and more approximations, x_j’s, we get closer and closer to the actual irrational number we want, in this case π. Also, the denominators, the q_j’s, keep getting bigger.

The theorem speaks of having an algebraic number, call it x, of some degree n greater than 1. Then we have this limit on how good an approximation can be. The difference between the number x that we want, and our best approximation p / q, has to be larger than the number (1/q)^{n + 1}. The approximation might be higher than x. It might be lower than x. But it will be off by at least the n-plus-first power of 1/q.
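Written out as a formula, that bound is:

\left| x - \frac{p}{q} \right| > \left(\frac{1}{q}\right)^{n + 1}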

Polynomials let us separate the real numbers into infinitely many tiers of numbers. They also let us say how well the most accessible tier of numbers, rational numbers, can approximate these more exotic things.

One of the things we learn by looking at numbers through this polynomial screen is that there are transcendental numbers. These are numbers that can’t be the root of any polynomial with integer coefficients. π is one of them. e is another. Nearly all numbers are transcendental. But the proof that any particular number is one is hard. Joseph Liouville showed that transcendental numbers must exist by using continued fractions. But this approximation theorem tells us how to make our own transcendental numbers. This won’t be any number you or anyone else has ever heard of, unless you pick a special case. But it will be yours.

You will need:

  1. a_1, an integer from 1 to 9, such as ‘1’, ‘9’, or ‘5’.
  2. a_2, another integer from 1 to 9. It may be the same as a_1 if you like, but it doesn’t have to be.
  3. a_3, yet another integer from 1 to 9. It may be the same as a_1 or a_2 or, if it so happens, both.
  4. a_4, one more integer from 1 to 9 and you know what? Let’s summarize things a bit.
  5. A whopping great big gob of integers a_j, every one of them from 1 to 9, for every possible integer ‘j’ so technically this is infinitely many of them.
  6. Comfort with the notation n!, which is the factorial of n. For whole numbers that’s the product of every whole number from 1 to n, so, 2! is 1 times 2, or 2. 3! is 1 times 2 times 3, or 6. 4! is 1 times 2 times 3 times 4, or 24. And so on.
  7. Not to be thrown by me writing -n!. By that I mean work out n! and then multiply that by -1. So -2! is -2. -3! is -6. -4! is -24. And so on.

Now, assemble them into your very own transcendental number z, by this formula:

z = a_1 \cdot 10^{-1} + a_2 \cdot 10^{-2!} + a_3 \cdot 10^{-3!} + a_4 \cdot 10^{-4!} + a_5 \cdot 10^{-5!} + a_6 \cdot 10^{-6!} \cdots

If you’ve done it right, this will look something like:

z = 0.a_{1}a_{2}000a_{3}00000000000000000a_{4}0000000 \cdots
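If you’d like to watch the construction happen, here’s a small sketch in Python. It uses the standard fractions module so that nothing gets rounded away, and the digits are yours to edit. (The all-ones choice gives the Liouville Constant mentioned below.)

from fractions import Fraction
from math import factorial

def my_transcendental_number(digits):
    """Sum a_j * 10^(-j!) exactly, for the digits a_1, a_2, ... you chose."""
    z = Fraction(0)
    for j, a_j in enumerate(digits, start=1):
        z += Fraction(a_j, 10 ** factorial(j))
    return z

digits = [1, 1, 1, 1, 1, 1]    # pick your own integers from 1 to 9 here
z = my_transcendental_number(digits)

# Print the first 30 decimal places exactly; the chosen digits land at
# positions 1!, 2!, 3!, 4!, ... and everything in between is zero.
places = 30
print("0." + str(z.numerator * 10**places // z.denominator).zfill(places))
# 0.110001000000000000000001000000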

Ah, but, how do you know this is transcendental? We can prove it is. The proof is by contradiction, which is how a lot of great proofs are done. We show nonsense follows if the thing isn’t true, so the thing must be true. (There are mathematicians that don’t care for proof-by-contradiction. They insist on proof by charging straight ahead and showing a thing is true directly. That’s a matter of taste. I think every mathematician feels that way sometimes, to some extent or on some issues. The proof-by-contradiction is easier, at least in this case.)

Suppose that your z here is not transcendental. Then it’s got to be an algebraic number of degree n, for some finite number n. That’s what it means not to be transcendental. I don’t know what n is; I don’t care. There is some n and that’s enough.

Now, let’s let z_m be a rational number approximating z. We find this approximation by taking the first m! digits after the decimal point. So, z_1 would be just the number 0.a_1. z_2 is the number 0.a_1a_2. z_3 is the number 0.a_1a_2000a_3. I don’t know what m you like, but that’s all right. We’ll pick a nice big m.

So what’s the difference between z and z_m? Well, it can’t be larger than 10 times 10^{-(m + 1)!}. This is for the same reason that π minus 3.14 can’t be any bigger than 0.01.

Now write z_m as a fraction p / q: its numerator is some integer p, and its denominator is q = 10^{m!}. By the Liouville Approximation Theorem, then, the difference between z and z_m has to be at least as big as (1/10^{m!})^{n + 1}.

So we know the difference between z and zm has to be larger than one number. And it has to be smaller than another. Let me write those out.

\frac{1}{10^{m! (n + 1)}} < |z - z_m | < \frac{10}{10^{(m + 1)!}}

We don’t need the z – z_m anymore. That thing on the rightmost side we can rewrite in a way that I’ll swear is a little easier to use. What we have left is:

\frac{1}{10^{m! (n + 1)}} < \frac{1}{10^{(m + 1)! - 1}}

And this will be true whenever the number m! (n + 1) is greater than (m + 1)! – 1.

But there’s the thing. This isn’t true whenever m is greater than n. So the difference between your alleged transcendental number and its best-possible rational approximation has to be simultaneously bigger than a number and smaller than that same number without being equal to it. Supposing your number is anything but transcendental produces nonsense. Therefore, congratulations! You have a transcendental number.

If you chose all 1’s for your a_j’s, then you have what is sometimes called the Liouville Constant. If you didn’t, you may have a transcendental number nobody’s ever noticed before. You can name it after someone if you like. That’s as meaningful as naming a star for someone and cheaper. But you can style it as weaving someone’s name into the universal truth of mathematics. Enjoy!

I’m glad to finally give you a mathematics essay that lets you make something you can keep.

Theorem Thursday: A First Fixed Point Theorem


I’m going to let the Mean Value Theorem slide a while. I feel more like a Fixed Point Theorem today. As with the Mean Value Theorem there’s several of these. Here I’ll start with an easy one.

The Fixed Point Theorem.

Back when the world and I were young I would play with electronic calculators. They encouraged play. They made it so easy to enter a number and hit an operation, and then hit that operation again, and again and again. Patterns appeared. Start with, say, ‘2’ and hit the ‘squared’ button, the smaller ‘2’ raised up from the key’s baseline. You got 4. And again: 16. And again: 256. And again and again and you got ever-huger numbers. This happened whenever you started from a number bigger than 1. Start from something smaller than 1, however tiny, and it dwindled down to zero, whatever you tried. Start at ‘1’ and it just stays there. The results were similar if you started with negative numbers. The first squaring put you in positive numbers and everything carried on as before.

This sort of thing happened a lot. Keep hitting the mysterious ‘exp’ and the numbers would keep growing forever. Keep hitting ‘sqrt’; if you started above 1, the numbers dwindled to 1. Start below and the numbers rise to 1. Or you started at zero, but who’s boring enough to do that? ‘log’ would start with positive numbers and keep dropping until it turned into a negative number. The next step was the calculator’s protest we were unleashing madness on the world.

But you didn’t always get zero, one, infinity, or madness, from repeatedly hitting the calculator button. Sometimes, some functions, you’d get an interesting number. If you picked any old number and hit cosine over and over the digits would eventually settle down to around 0.739085. Cosine’s great. Tangent … tangent is weird. Tangent does all sorts of bizarre stuff. But at least cosine is there, giving us this interesting number.

(Something you might wonder: this is the cosine of an angle measured in radians, which is how mathematicians naturally think of angles. Normal people measure angles in degrees, and that will have a different fixed point. We write both the cosine-in-radians and the cosine-in-degrees using the shorthand ‘cos’. We get away with this because people who are confused by this are too embarrassed to call us out on it. If we’re thoughtful we write, say, ‘cos x’ for radians and ‘cos x°’ for degrees. This makes the difference obvious. It doesn’t really, but at least we gave some hint to the reader.)

This all is an example of a fixed point theorem. Fixed point theorems turn up in a lot of fields. They were most impressed upon me in dynamical systems, studying how a complex system changes in time. A fixed point, for these problems, is an equilibrium. It’s where things aren’t changed by a process. You can see where that’s interesting.

In this series I haven’t stated theorems exactly much, and I haven’t given them real proofs. But this is an easy one to state and to prove. Start off with a function, which I’ll name ‘f’, because yes that is exactly how much effort goes into naming functions. It has as a domain the interval [a, b] for some real numbers ‘a’ and ‘b’. And it has as range the same interval, [a, b]. It might use the whole range; it might use only a subset of it. And we have to require that f is continuous.

Then there has to be at least one fixed point. There must be at least one number ‘c’, somewhere in the interval [a, b], for which f(c) equals c. There may be more than one; we don’t say anything about how many there are. And it can happen that c is equal to a. Or that c equals b. We don’t know that it is or that it isn’t. We just know there’s at least one ‘c’ that makes f(c) equal c.

You get that in my various examples. If the function f has the rule that any given x is matched to x^2, then we do get two fixed points: f(0) = 0^2 = 0, and, f(1) = 1^2 = 1. Or if f has the rule that any given x is matched to the square root of x, then again we have: f(0) = \sqrt{0} = 0 and f(1) = \sqrt{1} = 1 . Same old boring fixed points. The cosine is a little more interesting. For that we have f(0.739085...) = \cos\left(0.739085...\right) = 0.739085... .

How to prove it? The easiest way I know is to summon the Intermediate Value Theorem. Since I wrote a couple hundred words about that a few weeks ago I can assume you understand it perfectly and have no question about how it makes this problem easy. I don’t even need to go on, do I?

… Yeah, fair enough. Well, here’s how to do it. We’ll take the original function f and create, based on it, a new function. We’ll dig deep in the alphabet and name that ‘g’. It has the same domain as f, [a, b]. Its range is … oh, well, something in the real numbers. Don’t care. The wonder comes from the rule we use.

The rule for ‘g’ is this: match the given number ‘x’ with the number ‘f(x) – x’. That is, g(a) equals whatever f(a) would be, minus a. g(b) equals whatever f(b) would be, minus b. We’re allowed to define a function in terms of some other function, as long as the symbols are meaningful. But we aren’t doing anything wrong like dividing by zero or taking the logarithm of a negative number or asking for f where it isn’t defined.

You might protest that we don’t know what the rule for f is. We’re told there is one, and that it’s a continuous function, but nothing more. So how can I say I’ve defined g in terms of a function I don’t know?

In the first place, I already know everything about f that I need to. I know it’s a continuous function defined on the interval [a, b]. I won’t use any more than that about it. And that’s great. A theorem that doesn’t require knowing much about a function is one that applies to more functions. It’s like the difference between being able to say something true of all living things in North America, and being able to say something true of all persons born in Redbank, New Jersey, on the 18th of February, 1944, who are presently between 68 and 70 inches tall and working on their rock operas. Both things may be true, but one of those things you probably use more.

In the second place, suppose I gave you a specific rule for f. Let me say, oh, f matches x with the arccosecant of x. Are you feeling any more enlightened now? Didn’t think so.

Back to g. Here’s some things we can say for sure about it. g is a function defined on the interval [a, b]. That’s how we set it up. Next point: g is a continuous function on the interval [a, b]. Remember, g is just the function f, which was continuous, minus x, which is also continuous. The difference of two continuous functions is still going to be continuous. (This is obvious, although it may take some considered thinking to realize why it is obvious.)

Now some interesting stuff. What is g(a)? Well, it’s whatever number f(a) is minus a. I can’t tell you what number that is. But I can tell you this: it’s not negative. Remember that f(a) has to be some number in the interval [a, b]. That is, it’s got to be no smaller than a. So the smallest f(a) can be is equal to a, in which case f(a) minus a is zero. And f(a) might be larger than a, in which case f(a) minus a is positive. So g(a) is either zero or a positive number.

(If you’ve just realized where I’m going and gasped in delight, well done. If you haven’t, don’t worry. You will. You’re just out of practice.)

What about g(b)? Since I don’t know what f(b) is, I can’t tell you what specific number it is. But I can tell you it’s not a positive number. The reasoning is just like above: f(b) is some number on the interval [a, b]. So the biggest number f(b) can equal is b. And in that case f(b) minus b is zero. If f(b) is any smaller than b, then f(b) minus b is negative. So g(b) is either zero or a negative number.

(Smiling at this? Good job. If you aren’t, again, not to worry. This sort of argument is not the kind of thing you do in Boring Algebra. It takes time and practice to think this way.)

And now the Intermediate Value Theorem works. g(a) is a positive number. g(b) is a negative number. g is continuous from a to b. Therefore, there must be some number ‘c’, between a and b, for which g(c) equals zero. And remember what g(c) means: f(c) – c equals 0. Therefore f(c) has to equal c. There has to be a fixed point.

And some tidying up. Like I said, g(a) might be positive. It might also be zero. But if g(a) is zero, then f(a) – a = 0. So a would be a fixed point. And similarly if g(b) is zero, then f(b) – b = 0. So then b would be a fixed point. The important thing is there must be at least some fixed point.

Now that calculator play starts taking on purposeful shape. Squaring a number could find a fixed point only if you started with a number from -1 to 1. The square of a number outside this range, such as ‘2’, would be bigger than you started with, and the Fixed Point Theorem doesn’t apply. Similarly with exponentials. But square roots? The square root of any number from 0 to a positive number ‘b’ is a number between 0 and ‘b’, at least as long as b was bigger than 1. So there was a fixed point, at 1. The cosine of a real number is some number between -1 and 1, and the cosines of all the numbers between -1 and 1 are themselves between -1 and 1. The Fixed Point Theorem applies. Tangent isn’t a continuous function. And the calculator play never settles on anything.

As with the Intermediate Value Theorem, this is an existence proof. It guarantees there is a fixed point. It doesn’t tell us how to find one. Calculator play does, though. Start from any old number that looks promising and work out f for that number. Then take that and put it back into f. And again. And again. This is known as “fixed point iteration”. It won’t give you the exact answer.

Not usually, anyway. In some freak cases it will. But what it will give, provided some extra conditions are satisfied, is a sequence of values that get closer and closer to the fixed point. When you're close enough, you stop calculating. How do you know you're close enough? If you know something about the original f you can work out some logically rigorous estimates. Or you just keep calculating until all the decimal places you care about stop changing between iterations. That's not logically sound, but it's easy to program.
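
Here is roughly what that looks like as a program. This is a sketch of my own, with an arbitrary tolerance and step limit; it stops when two successive values agree closely enough, which is the not-logically-sound-but-easy test I just described.

import math

def iterate_to_fixed_point(f, start, tolerance=1e-10, max_steps=10000):
    x = start
    for _ in range(max_steps):
        next_x = f(x)
        if abs(next_x - x) < tolerance:   # the digits we care about have stopped changing
            return next_x
        x = next_x
    return x                              # gave up; not every function cooperates

print(iterate_to_fixed_point(math.cos, 0.5))   # about 0.739085...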

That won't always work. One set of conditions that guarantees it: the function f is differentiable on the interval (a, b), which is to say it has no corners, and there are limits on how fast the function changes on that interval. If the function changes too fast, iteration can't be guaranteed to work. But often if we're interested in a function at all then these conditions will be true, or we can think of a related function for which they are true.
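
And here is a little illustration, again my own, of what goes wrong when the function changes too fast. The function f(x) = 4x(1 - x), which I picked just for this purpose, has a fixed point at x = 0.75. But its slope there is -2, steeper than the iteration can tolerate, so the iterates get pushed away instead of pulled in.

def fast(x):
    return 4 * x * (1 - x)    # fixed point at 0.75, but the slope there is -2

x = 0.74
values = []
for _ in range(20):
    x = fast(x)
    values.append(round(x, 4))
print(values)   # the values keep jumping around instead of settling near 0.75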

And even if it works it won’t always work well. It can take an enormous pile of calculations to get near the fixed point. But this is why we have computers, and why we can leave them to work overnight.

And yet such a simple idea works. It appears as early as ancient times, in a formula for finding the square root of an arbitrary positive number 'N'. (Find the fixed point for f(x) = \frac{1}{2}\left(\frac{N}{x} + x\right) ). It creeps into problems that don't look like fixed points. Calculus students learn of something called the Newton-Raphson Iteration. It finds roots, points where a function f(x) equals zero. Mathematics majors learn of numerical methods to solve ordinary differential equations. The most stable of these are again fixed-point iteration schemes, albeit in disguise.
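
If you want to watch the ancient recipe work, here is a tiny sketch of my own that iterates that formula for N equal to 2; the number of steps and the starting guess are arbitrary choices. It settles on the square root of 2 after only a handful of steps.

def sqrt_by_iteration(N, guess=1.0, steps=20):
    x = guess
    for _ in range(steps):
        x = 0.5 * (N / x + x)   # the fixed point of this is the square root of N
    return x

print(sqrt_by_iteration(2))     # 1.41421356..., the square root of 2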

They all share this almost playful backbone.

Theorem Thursday: One Mean Value Theorem Of Many


For this week I have something I want to follow up on. We’ll see if I make it that far.

The Mean Value Theorem.

My subject line disagrees with the header just above here. I want to talk about the Mean Value Theorem. It’s one of those things that turns up in freshman calculus and then again in Analysis. It’s introduced as “the” Mean Value Theorem. But like many things in calculus it comes in several forms. So I figure to talk about one of them here, and another form in a while, when I’ve had time to make up drawings.

Calculus can split effortlessly into two kinds of things. One is differential calculus. This is the study of continuity and smoothness. It studies how a quantity changes if something affecting it changes. It tells us how to optimize things. It tells us how to approximate complicated functions with simpler ones. Usually polynomials. It leads us to differential equations, problems in which the rate at which something changes depends on what value the thing has.

The other kind is integral calculus. This is the study of shapes and areas. It studies how infinitely many things, all infinitely small, add together. It tells us what the net change in things is. It tells us how to go from information about every point in a volume to information about the whole volume.

They aren’t really separate. Each kind informs the other, and gives us tools to use in studying the other. And they are almost mirrors of one another. Differentials and integrals are not quite inverses, but they come quite close. And as a result most of the important stuff you learn in differential calculus has an echo in integral calculus. The Mean Value Theorem is among them.

The Mean Value Theorem is a rule about functions. In this case it’s functions with a domain that’s an interval of the real numbers. I’ll use ‘a’ as the name for the smallest number in the domain and ‘b’ as the largest number. People talking about the Mean Value Theorem often do. The range is also the real numbers, although it doesn’t matter which ones.

I'll call the function 'f' in accord with a long-running tradition of not working too hard to name functions. What does matter is that 'f' is continuous on the interval [a, b]. I've described what 'continuous' means before. It means that here too.

And we need one more thing. The function f has to be differentiable on the interval (a, b). You maybe noticed that before I wrote [a, b], and here I just wrote (a, b). There’s a difference here. We need the function to be continuous on the “closed” interval [a, b]. That is, it’s got to be continuous for ‘a’, for ‘b’, and for every point in-between.

But we only need the function to be differentiable on the "open" interval (a, b). That is, it's got to be differentiable for all the points in-between 'a' and 'b'. If it happens to be differentiable for 'a', or for 'b', or for both, that's great. But we won't turn away a function f for not being differentiable at those points. Only the interior. That sort of distinction between stuff true on the interior and stuff true on the boundaries is common. This is why mathematicians have words for "including the boundaries" ("closed") and "never minding the boundaries" ("open").

As to what “differentiable” is … A function is differentiable at a point if you can take its derivative at that point. I’m sure that clears everything up. There are many ways to describe what differentiability is. One that’s not too bad is to imagine zooming way in on the curve representing a function. If you start with a big old wobbly function it waves all around. But pick a point. Zoom in on that. Does the function stay all wobbly, or does it get more steady, more straight? Keep zooming in. Does it get even straighter still? If you zoomed in over and over again on the curve at some point, would it look almost exactly like a straight line?

If it does, then the function is differentiable at that point. It has a derivative there. The derivative’s value is whatever the slope of that line is. The slope is that thing you remember from taking Boring Algebra in high school. That rise-over-run thing. But this derivative is a great thing to know. You could approximate the original function with a straight line, with slope equal to that derivative. Close to that point, you’ll make a small enough error nobody has to worry about it.

That there will be this straight line approximation isn’t true for every function. Here’s an example. Picture a line that goes up and then takes a 90-degree turn to go back down again. Look at the corner. However close you zoom in on the corner, there’s going to be a corner. It’s never going to look like a straight line; there’s a 90-degree angle there. It can be a smaller angle if you like, but any sort of corner breaks this differentiability. This is a point where the function isn’t differentiable.
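
You can play the zooming-in game numerically, too. Here is a sketch of my own; the function names and the widths are just what I picked. It measures the slope just to the right and just to the left of a point, over smaller and smaller widths. At a point where the sine function is differentiable the two slopes creep toward the same number. At the corner of the absolute-value function they stay stubbornly at 1 and -1, no matter how far in you zoom.

import math

def right_slope(f, point, width):
    return (f(point + width) - f(point)) / width

def left_slope(f, point, width):
    return (f(point) - f(point - width)) / width

for width in (0.1, 0.001, 0.00001):
    print(right_slope(math.sin, 1.0, width), left_slope(math.sin, 1.0, width))
    # both columns creep toward cos(1), about 0.5403

for width in (0.1, 0.001, 0.00001):
    print(right_slope(abs, 0.0, width), left_slope(abs, 0.0, width))
    # always 1.0 and -1.0; the corner never straightens out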

There are functions that are nothing but corners. They can be differentiable nowhere, or only at a tiny set of points that can be ignored. (A set of measure zero, as the dialect would put it.) Mathematicians discovered this over the course of the 19th century. They got into some good arguments about how that can even make sense. It can get worse. Also found in the 19th century were functions that are continuous only at a single point. This smashes just about everyone’s intuition. But we can’t find a definition of continuity that’s as useful as the one we use now and avoids that problem. So we accept that it implies some pathological conclusions and carry on as best we can.

Now I get to the Mean Value Theorem in its differential calculus pelage. It starts with the endpoints, ‘a’ and ‘b’, and the values of the function at those points, ‘f(a)’ and ‘f(b)’. And from here it’s easiest to figure what’s going on if you imagine the plot of a generic function f. I recommend drawing one. Just make sure you draw it without lifting the pen from paper, and without including any corners anywhere. Something wiggly.

Draw the line that connects the ends of the wiggly graph. Formally, we’re adding the line segment that connects the points with coordinates (a, f(a)) and (b, f(b)). That’s coordinate pairs, not intervals. That’s clear in the minds of the mathematicians who don’t see why not to use parentheses over and over like this. (We are short on good grouping symbols like parentheses and brackets and braces.)

Per the Mean Value Theorem, there is at least one point, somewhere between a and b, where the function's derivative is the same as the slope of that line segment. If you were to slide the line up or down, without changing its orientation, you'd find something wonderful. Most of the time this line intersects the curve, crossing from above to below or vice-versa. But there'll be at least one position where the shifted line is "tangent", where it just touches the original curve. Close to that touching point, the "tangent point", the shifted line and the curve blend together and can't be easily told apart. As long as the function is differentiable on the open interval (a, b), and continuous on the closed interval [a, b], this will be true. You might convince yourself of it by drawing a couple of curves and taking a straightedge to the results.
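
Here is a small numerical sketch of my own, to see the theorem in action for one specific function I chose for convenience. For f(x) = x^3 on the interval [0, 2] the chord slope is (8 - 0)/2, which is 4, and the derivative is 3x^2. The theorem promises a point c between 0 and 2 where 3c^2 equals 4, and bisection finds it.

def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
chord_slope = (f(b) - f(a)) / (b - a)   # 4.0

# f_prime(a) is below the chord slope and f_prime(b) is above it,
# so bisection can home in on the promised point.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if f_prime(mid) < chord_slope:
        lo = mid
    else:
        hi = mid
print(lo, f_prime(lo))   # c is about 1.1547, and the derivative there is about 4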

This is an existence theorem. Like the Intermediate Value Theorem, it doesn’t tell us which point, or points, make the thing we’re interested in true. It just promises us that there is some point that does it. So it gets used in other proofs. It lets us mix information about intervals and information about points.

It’s tempting to try using it numerically. It looks as if it justifies a common differential-calculus trick. Suppose we want to know the value of the derivative at a point. We could pick a little interval around that point and find the endpoints. And then find the slope of the line segment connecting the endpoints. And won’t that be close enough to the derivative at the point we care about?

Well. Um. No, we really can't be sure about that. We don't have any idea what interval might make the derivative at the point we care about equal to this line-segment slope. The Mean Value Theorem won't tell us. It won't even tell us if there exists an interval that would let that trick work. We can't invoke the Mean Value Theorem to let us get away with that.

Often, though, we can get away with it. Differentiable functions do have to follow some rules. Among them is that if you do pick a small enough interval then approximations that look like this will work all right. If the function flutters around a lot, we need a smaller interval. But a lot of the functions we’re interested in don’t flutter around that much. So we can get away with it. And there’s some grounds to trust in getting away with it. The Mean Value Theorem isn’t any part of the grounds. It just looks so much like it ought to be.
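
To see the trick earning its keep, here is one more little sketch of my own: approximating the derivative of the exponential function at a point by the slope across a shrinking interval, and comparing against the exact answer, which for the exponential is the function's own value. The point and the interval widths are arbitrary choices.

import math

point = 0.5
exact = math.exp(point)   # the exponential is its own derivative
for h in (0.1, 0.01, 0.001):
    approx = (math.exp(point + h) - math.exp(point - h)) / (2 * h)
    print(h, approx, abs(approx - exact))   # the error shrinks as the interval does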

I hope on a later Thursday to look at an integral-calculus form of the Mean Value Theorem.