## From my Fourth A-to-Z: Zeta Functions

I did not remember how long a buildup there was to my Summer 2017 writings about the Zeta function. But it’s something that takes a lot of setup. I don’t go into why the Riemann Hypothesis is interesting. I might have been saving that for a later A-to-Z. Or I might have trusted that since every pop mathematics blog has a good essay about the Riemann Hypothesis already there wasn’t much I could add.

I realize on re-reading that one might take me to have said that the final exam for my Intro to Complex Analysis course was always in the back of my textbook. I’d meant that after the final, I tucked it into my book and left it there. Probably nobody was confused by this.

Today Gaurish, of For the love of Mathematics, gives me the last subject for my Summer 2017 A To Z sequence. And also my greatest challenge: the Zeta function. The subject comes to all pop mathematics blogs. It comes to all mathematics blogs. It’s not difficult to say something about a particular zeta function. But to say something at all original? Let’s watch.

# Zeta Function.

The spring semester of my sophomore year I had Intro to Complex Analysis. Monday Wednesday 7:30; a rare evening class, one of the few times I’d eat dinner and then go to a lecture hall. There I discovered something strange and wonderful. Complex Analysis is a far easier topic than Real Analysis. Both are courses about why calculus works. But why calculus for complex-valued numbers works is a much easier problem than why calculus for real-valued numbers works. It’s dazzling. Part of this is that Complex Analysis, yes, builds on Real Analysis. So Complex can take for granted some things that Real has to prove. I didn’t mind. Given the way I crashed through Intro to Real Analysis I was glad for a subject that was, relatively, a breeze.

As we worked through Complex Variables and Applications so many things, so very many things, got to be easy. The basic unit of complex analysis, at least as we young majors learned it, was the contour integral. These are integrals whose value depends on the values of a function on a closed loop. The loop is in the complex plane. The complex plane is, well, your ordinary plane. But we say the x-coordinate and the y-coordinate are parts of the same complex-valued number. The x-coordinate is the real-valued part. The y-coordinate is the imaginary-valued part. And we call that sum ‘z’. In complex-valued functions ‘z’ serves the role that ‘x’ does in normal mathematics.

So a closed loop is exactly what you think. Take a rubber band and twist it up and drop it on the table. That’s a closed loop. Suppose you want to integrate a function, ‘f(z)’. If you can always take its derivative on this loop and on the interior of that loop, then its contour integral is … zero. No matter what the function is. As long as it’s “analytic”, as the terminology has it. Yeah, we were all stunned into silence too. (Granted, mathematics classes are usually quiet, since it’s hard to get a good discussion going. Plus many of us were in post-dinner digestive lulls.)
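If you want to see the miracle without any theory, here's a minimal numerical sketch in Python (my illustration, nothing from the course): parametrize the unit circle as z = exp(i t), so dz = i exp(i t) dt, and add up f(z) dz in small steps around the loop.

```python
# A brute-force check of the claim: the loop integral of an analytic
# function around a closed loop comes out zero.
import cmath

def contour_integral(f, n=1000):
    """Brute-force integral of f around the unit circle."""
    dt = 2 * cmath.pi / n
    total = 0.0
    for k in range(n):
        z = cmath.exp(1j * k * dt)
        total += f(z) * 1j * z * dt
    return total

# z**2 is analytic everywhere, so its loop integral should vanish.
print(abs(contour_integral(lambda z: z**2)))  # ~1e-13, i.e. zero
# 1/z is not analytic at the origin, which is inside the loop; no zero here.
print(contour_integral(lambda z: 1 / z))      # ~ 2*pi*i = 6.283...j
```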

Integrating regular old functions of real-valued numbers is this tedious process. There’s sooooo many rules and possibilities and special cases to consider. There’s sooooo many tricks that get you the integrals of some functions. And then here, with complex-valued integrals for analytic functions, you know the answer before you even look at the function.

As you might imagine, since this is only page 113 of a 341-page book there’s more to it. Most functions that anyone cares about aren’t analytic. At least they’re not analytic everywhere inside regions that might be interesting. There’s usually some points where an interesting function ‘f(z)’ is undefined. We call these “singularities”. Yes, like starships are always running into. Only we rarely get propelled into other universes or other times or turned into ghosts or stuff like that.

So much of the rest of the course turns into ways to avoid singularities. Sometimes you can spackle them over. This is when the function happens not to be defined somewhere, but you can see what it ought to be. Sometimes you have to do something more. This turns into a search for “removable” singularities. And this does something so brilliant it looks illicit. You modify your closed loop, so that it comes up very close, as close as possible, to the singularity, but studiously avoids it. Follow this game of I’m-not-touching-you right and you can turn your integral into two parts. One is the part that’s equal to zero. The other is the part that’s a constant times whatever the function is at the singularity you’re removing. And that ought to be easy to find the value for. (Being able to find a function’s value doesn’t mean you can find its derivative.)
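Here's a hedged sketch of that payoff, in the special case the Cauchy integral formula covers (the function and variable names are mine): loop around a singularity of f(z)/(z - a), for f analytic inside, and the integral comes out to a constant, 2πi, times f(a).

```python
# Numerically circle the singularity at a and recover f(a) from the loop.
import cmath

def integral_around(f, a, radius=1.0, n=4096):
    """Brute-force integral of f(z)/(z - a) around a circle centered on a."""
    dt = 2 * cmath.pi / n
    total = 0.0
    for k in range(n):
        w = radius * cmath.exp(1j * k * dt)   # position relative to a
        dz = 1j * w * dt                      # step along the circle
        total += f(a + w) / w * dz
    return total

a = 0.3 + 0.2j
print(integral_around(cmath.exp, a) / (2j * cmath.pi))  # ~ exp(a)
print(cmath.exp(a))                                     # same value, directly
```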

Those tricks were hard to master. Not because they were hard. Because they were easy, in a context where we expected hard. But after that we got into how to move singularities. That is, how to do a change of variables that moved the singularities to where they’re more convenient for some reason. How could this be more convenient? Because of chapter five, “Series”. In regular old calculus we learn how to approximate well-behaved functions with polynomials. In complex-variable calculus, we learn the same thing all over again. They’re polynomials of complex-valued variables, but it’s the same sort of thing. And not just polynomials, but things that look like polynomials except they’re powers of $\frac{1}{z}$ instead. These open up new ways to approximate functions, and to remove singularities from functions.

And then we get into transformations. These are about turning a problem that’s hard into one that’s easy. Or at least different. They’re a change of variable, yes. But they also change what exactly the function is. This reshuffles the problem. Makes for a change in singularities. Could make ones that are easier to work with.

One of the useful, and so common, transforms is called the Laplace-Stieltjes Transform. (“Laplace” is said like you might guess. “Stieltjes” is said, or at least we were taught to say it, like “Stilton cheese” without the “ton”.) And it tends to create functions that look like a series, the sum of a bunch of terms. Infinitely many terms. Each of those terms looks like a number times another number raised to some constant times ‘z’. As the course came to its conclusion, we were all prepared to think about these infinite series. Where singularities might be. Which of them might be removable.

These functions, these results of the Laplace-Stieltjes Transform, we collectively call ‘zeta functions’. There are infinitely many of them. Some of them are relatively tame. Some of them are exotic. One of them is world-famous. Professor Walsh — I don’t mean to name-drop, but I discovered the syllabus for the course tucked in the back of my textbook and I’m delighted to rediscover it — talked about it.

That world-famous one is, of course, the Riemann Zeta function. Yes, that same Riemann who keeps turning up, over and over again. It looks simple enough. Almost tame. Take the counting numbers, 1, 2, 3, and so on. Take your ‘z’. Raise each of the counting numbers to that ‘z’. Take the reciprocals of all those numbers. Add them up. What do you get?
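Before the answer, a rough partial-sum taste of the recipe (my sketch, not anything rigorous), at z = 2, where the series converges to the famous value π²/6:

```python
# Partial sums of the Riemann zeta series: sum of 1/n**z over counting numbers.
import math

def zeta_partial(z, terms=100_000):
    """Sum of 1/n**z for n = 1 .. terms; converges for real part of z > 1."""
    return sum(1 / n**z for n in range(1, terms + 1))

print(zeta_partial(2))   # 1.64492..., creeping up on the value below
print(math.pi**2 / 6)    # 1.6449340668...
```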

A mass of fascinating results, for one. Functions you wouldn’t expect are concealed in there. There’s strips where the real part is zero. There’s strips where the imaginary part is zero. There’s points where both the real and imaginary parts are zero. We know infinitely many of them. If ‘z’ is -2, for example, the sum is zero. Also if ‘z’ is -4. -6. -8. And so on. These are easy to show, and so are dubbed ‘trivial’ zeroes. To say some are ‘trivial’ is to say that there are others that are not trivial. Where are they?

Professor Walsh explained. We know of many of them. The nontrivial zeroes we know of all share something in common. They have a real part that’s equal to 1/2. There’s a zero that’s at about the number $\frac{1}{2} - \imath 14.13$. Also at $\frac{1}{2} + \imath 14.13$. There’s one at about $\frac{1}{2} - \imath 21.02$. Also about $\frac{1}{2} + \imath 21.02$. (There’s a symmetry, you maybe guessed.) Every nontrivial zero we’ve found has the same real part, 1/2. But we don’t know that they all do. Nobody does. It is the Riemann Hypothesis, the great unsolved problem of mathematics. Much more important than that Fermat’s Last Theorem, which back then was still merely a conjecture.
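One caveat I should add that the course glossed over: on the line with real part 1/2 the series as written doesn't converge, and the function there is defined by analytic continuation. Libraries carry that machinery for us. This sketch leans on mpmath, an arbitrary-precision library of my choosing, to poke at the first zero near 1/2 + 14.13i.

```python
# Evaluate the (analytically continued) zeta function near the first
# nontrivial zero, using mpmath's built-in zeta.
from mpmath import mp, zeta

mp.dps = 20                           # work with 20 significant digits
s = mp.mpc(0.5, 14.134725141734694)   # the zero near 1/2 + 14.13i, to more digits
print(zeta(s))                        # ~ (0.0 + 0.0j), within rounding
print(zeta(mp.mpc(0.5, 14.13)))       # a nearby point: small, but not zero
```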

What a prospect! What a promise! What a way to set us up for the final exam in a couple of weeks.

I had an inspiration, a kind of scheme of showing that a nontrivial zero couldn’t be within a given circular contour. Make the size of this circle grow. Move its center farther away from the z-coordinate $\frac{1}{2} + \imath 0$ to match. Show there’s still no nontrivial zeroes inside. And therefore, logically, since I would have shown nontrivial zeroes couldn’t be anywhere but on this special line, and we know nontrivial zeroes exist … I leapt enthusiastically into this project. A little less enthusiastically the next day. Less so the day after. And on. After maybe a week I went a day without working on it. But came back, now and then, prodding at my brilliant would-be proof.

The Riemann Zeta function was not on the final exam, which I’ve discovered was also tucked into the back of my textbook. It asked more things like finding all the singular points and classifying what kinds of singularities they were for functions like $e^{-\frac{1}{z}}$ instead. If the syllabus is accurate, we got as far as page 218. And I’m surprised to see the professor put his e-mail address on the syllabus. It was merely “bwalsh@math”, but understand, the Internet was a smaller place back then.

I finished the course with an A-, but without answering any of the great unsolved problems of mathematics.

## The Summer 2017 Mathematics A To Z: Integration

Today’s glossary entry is a request from Elke Stangl, author of the Elkemental Force blog, which among other things has made me realize how much there is that’s interesting to say about heat pumps. Well, you never know what’s interesting before you give it serious thought.

# Integration.

Stand on the edge of a plot of land. Walk along its boundary. As you walk the edge pay attention. Note how far you walk before changing direction, even in the slightest. When you return to where you started consult your notes. Contained within them is the area you circumnavigated.

If that doesn’t startle you perhaps you haven’t thought about how odd that is. You don’t ever touch the interior of the region. You never do anything like see how many standard-size tiles would fit inside. You walk a path that is as close to one-dimensional as your feet allow. And encoded in there somewhere is an area. Stare at that incongruity and you realize why integrals baffle the student so. They have a deep strangeness embedded in them.
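In the discrete case this walk-the-boundary trick is the "shoelace formula", and it fits in a few lines. A minimal sketch, with names of my own invention:

```python
# Recover a polygon's area purely from the corner points of its boundary.
def shoelace_area(corners):
    """Area enclosed by a polygon, given its vertices walked in order."""
    area = 0.0
    n = len(corners)
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]   # wrap back to the start at the end
        area += x1 * y2 - x2 * y1
    return abs(area) / 2

# A 4-by-3 rectangle, walked counterclockwise: area should be 12.
print(shoelace_area([(0, 0), (4, 0), (4, 3), (0, 3)]))  # 12.0
```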

We who do mathematics have always liked integration. Integrals grow, in the western tradition, out of geometry. Given a shape, what is a square that has the same area? There are shapes it’s easy to find the area for, given only straightedge and compass: a rectangle? Easy. A triangle? Just as straightforward. A polygon? If you know triangles then you know polygons. A lune, the crescent-moon shape formed by taking a circular cut out of a circle? We can do that. (If the cut is the right size.) A circle? … All right, we can’t do that, but we spent two thousand years trying before we found that out for sure. And we can do some excellent approximations.

That bit of finding-a-square-with-the-same-area was called “quadrature”. The name survives, mostly in the phrase “numerical quadrature”. We use that to mean that we computed an integral’s approximate value, instead of finding a formula that would get it exactly. The otherwise obvious choice of “numerical integration” we use already. It describes computing the solution of a differential equation. We’re not trying to be difficult about this. Solving a differential equation is a kind of integration, and we need to do that a lot. We could recast a solving-a-differential-equation problem as a find-the-area problem, and vice-versa. But that’s bother, if we don’t need to, and so we talk about numerical quadrature and numerical integration.
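A tiny example of numerical quadrature in that sense, sketched rather than production-grade: the trapezoid rule, which estimates the area from sampled values instead of hunting for a formula.

```python
# Trapezoid-rule quadrature: sample the function, sum up thin trapezoids.
import math

def trapezoid(f, a, b, n=1000):
    """Trapezoid-rule estimate of the integral of f from a to b."""
    h = (b - a) / n
    total = (f(a) + f(b)) / 2
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

# The integral of sin from 0 to pi is exactly 2.
print(trapezoid(math.sin, 0, math.pi))  # 1.999998..., a few millionths off
```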

Integrals are built on two infinities. This is part of why it took so long to work out their logic. One is the infinity of number; we find an integral’s value, in principle, by adding together infinitely many things. The other is an infinity of smallness. The things we add together are infinitesimally small. That we need to take things, each smaller than any number yet somehow not zero, and in such quantity that they add up to something, seems paradoxical. Their geometric origins had to be merged into those of arithmetic and algebra, and that was not easy. Bishop George Berkeley made a steady name for himself in calculus textbooks by pointing this out. We have worked out several logically consistent schemes for evaluating integrals. They work, mostly, by showing that we can make the error caused by approximating the integral smaller than any margin we like. This is a standard trick, or at least it is, now that we know it.

That “in principle” above is important. We don’t actually work out an integral by finding the sum of infinitely many, infinitely tiny, things. It’s too hard. I remember in grad school the analysis professor working out by the proper definitions the integral of 1. This is as easy an integral as you can do without just integrating zero. He escaped with his life, but it was a close scrape. He offered the integral of x as a way to test our endurance, without actually doing it. I’ve never made it through that.

But we do integrals anyway. We have tools on our side. We can show, for example, that if a function obeys some common rules then we can use simpler formulas. Ones that don’t demand so many symbols in such tight formation. Ones that we can use in high school. Also, ones we can adapt to numerical computing, so that we can let machines give us answers which are near enough right. We get to choose how near is “near enough”. But then the machines decide how long we’ll have to wait to get that answer.

The greatest tool we have on our side is the Fundamental Theorem of Calculus. Even the name promises it’s the greatest tool we might have. This rule tells us how to connect integrating a function to differentiating another function. If we can find a function whose derivative is the thing we want to integrate, then we have a formula for the integral. It’s that function we found. What a fantastic result.
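The theorem is easy to check numerically, too. This sketch (the names are mine) integrates cosine by brute force and compares the result against sine, its antiderivative:

```python
# Fundamental Theorem of Calculus, checked numerically: if F' = f, then
# the integral of f from a to b equals F(b) - F(a).
import math

def integrate(f, a, b, n=100_000):
    """Crude midpoint-rule integral of f from a to b."""
    h = (b - a) / n
    return sum(f(a + (k + 0.5) * h) for k in range(n)) * h

# f = cos has antiderivative F = sin, so the theorem promises a match:
print(integrate(math.cos, 0, 1))   # 0.8414709...
print(math.sin(1) - math.sin(0))   # 0.8414709848...
```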

The trouble is it’s so hard to find functions whose derivatives are the thing we wanted to integrate. There are a lot of functions we can find, mind you. If we want to integrate a polynomial it’s easy. Sine and cosine and even tangent? Yeah. Logarithms? A little tedious but all right. A constant number raised to the power x? Also tedious but doable. A constant number raised to the power $x^2$? Hold on there, that’s madness. No, we can’t do that.

There is a weird grab-bag of functions we can find these integrals for. They’re mostly ones we can find some integration trick for. An integration trick is some way to turn the integral we’re interested in into a couple of integrals we can do and then mix back together. A lot of a Freshman Calculus course is a heap of tricks we’ve learned. They have names like “u-substitution” and “integration by parts” and “trigonometric substitution”. Some of them are really exotic, such as turning a single integral into a double integral because that leads us to something we can do. And there’s something called “differentiation under the integral sign” that I don’t know of anyone actually using. People know of it because Richard Feynman, in his fun memoir What Do You Care What Other People Think: 250 Pages Of How Awesome I Was In Every Situation Ever, mentions how awesome it made him in so many situations. Mathematics, physics, and engineering nerds are required to read this at an impressionable age, so we fall in love with a technique no textbook ever mentions. Sorry.
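Those tricks have since been mechanized. As a hedged illustration, the sympy symbolic-math library (my example, not anything a Freshman Calculus course assigns) runs integration by parts and u-substitution behind the scenes, and it also concedes the point about constants raised to the power $x^2$:

```python
# sympy finds antiderivatives symbolically, when they exist.
from sympy import symbols, integrate, cos, exp

x = symbols('x')
print(integrate(x * cos(x), x))       # x*sin(x) + cos(x): integration by parts
print(integrate(2*x * exp(x**2), x))  # exp(x**2): the u = x**2 substitution
print(integrate(exp(x**2), x))        # no elementary form; sympy answers
                                      # in terms of the special function erfi
```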

I’ve written about all this as if we were interested just in areas. We’re not. We like calculating lengths and volumes and, if we dare venture into more dimensions, hypervolumes and the like. That’s all right. If we understand how to calculate areas, we have the tools we need. We can adapt them to as many or as few dimensions as we need. By weighting integrals we can do calculations that tell us about centers of mass and moments of inertia, about the most and least probable values of something, about all quantum mechanics.

As often happens, this powerful tool starts with something anyone might ponder: what size square has the same area as this other shape? And then think seriously about it.

## C

The square root of negative one. Everybody knows it doesn’t exist; there’s no real number you can multiply by itself and get negative one out. But then sometime in algebra, deep in a section about polynomials, suddenly we come out and declare there is such a thing. It’s an “imaginary number” that we call “i”. It’s hard to blame students for feeling betrayed by this. To make it worse, we throw real and imaginary numbers together and call the result “complex numbers”. It’s as if we’re out to tease them for feeling confused.

It’s an important set of things, though. It turns up as the domain, or the range, of functions so often that one of the major fields of analysis is called “Complex Analysis”. If the course listing allows for more words, it’s called “Analysis of Functions of a Complex Variable” or something like that. Despite the connotations of the word “complex”, though, the field is a delight. It’s considerably easier to understand than Real Analysis, the study of functions of mere real numbers. When there is a theorem that has a version in Real Analysis and a version in Complex Analysis, the Complex Analysis side is usually easier to prove and easier to understand. It’s uncanny.

The set of all complex numbers is denoted C, in parallel to the set of real numbers, R. To make it clear that we mean this set, and not some piddling little common set that might happen to share the name C, add a vertical stroke to the left of the letter. This is just as we add a vertical stroke to R to emphasize we mean the Real Numbers. We should approach the set with respect, removing our hats, thinking seriously about great things. It would look silly to add a second curve to C though, so we just add a straight vertical stroke on the left side of the letter C. This makes it look a bit like it’s an Old English typeface (the kind you call Gothic until you learn that means “sans serif”) pared down to its minimum.

Why do we teach people there’s no such thing as a square root of minus one, and then one day, teach them there is? Part of it is that whether there is a square root depends on your context. If you are interested only in the real numbers, there’s nothing that, squared, gives you minus one. This is exactly the way that it’s not possible to equally divide five objects between two people if you aren’t allowed to cut the objects in half. But if you are willing to allow half-objects to be things, then you can do what was previously forbidden. What you can do depends on what the rules you set out are.

And there’s surely some echo of the historical discovery of imaginary and complex numbers at work here. They were noticed when working out the roots of third- and fourth-degree polynomials. These can be done by way of formulas that nobody ever remembers because there are so many better things to remember. These formulas would sometimes require one to calculate a square root of a negative number, a thing that obviously didn’t exist. Except that if you pretended it did, you could get out correct answers, just as if these were ordinary numbers. You can see why this may be dubbed an “imaginary” number. The name hints at the suspicion with which it’s viewed. It’s much as “negative” numbers look like some trap to people who’re just getting comfortable with fractions.

It goes against the stereotype of mathematicians to suppose they’d accept working with something they don’t understand because the results are all right, afterwards. But, actually, mathematicians are willing to accept getting answers by any crazy method. If you have a plausible answer, you can test whether it’s right, and if all you really need this minute is the right answer, good.

But we do like having methods; they’re more useful than mere answers. And we can imagine this set called the complex numbers. They contain … well, all the possible roots, the solutions, of all polynomials. (The polynomials might have coefficients — the numbers in front of the variable — of integers, or rational numbers, or irrational numbers. If we already accept the idea of complex numbers, the coefficients can be complex numbers too.)
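As a small check on that promise, here's numpy (my choice of tool) finding roots the real numbers alone can't supply:

```python
# Roots of polynomials with real coefficients, which land in the complex plane.
import numpy as np

print(np.roots([1, 0, 1]))    # roots of x**2 + 1: i and -i
print(np.roots([1, -2, 5]))   # roots of x**2 - 2x + 5: 1 + 2i and 1 - 2i
```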

It’s exceedingly common to think of the complex numbers by starting off with a new number called “i”. This is a number about which we know nothing except that i times i equals minus one. Then we tend to think of complex numbers as “a real number plus i times another real number”. The first real number gets called “the real component”, and is usually denoted as either “a” or “x”. The second real number gets called “the imaginary component”, and is usually denoted as either “b” or “y”. Then the complex number is written “a + i*b” or “x + i*y”. Sometimes it’s written “a + b*i” or “x + y*i”; that’s a mere matter of house style. Don’t let it throw you.

Writing a complex number this way has advantages. Particularly, it makes it easy to see how one would add together (or subtract) complex numbers: “a + b*i + x + y*i” almost suggests that the sum should be “(a + x) + (b + y)*i”. What we know from ordinary arithmetic gives us guidance. And if we’re comfortable with binomials, then we know how to multiply complex numbers. Start with “(a + b*i) * (x + y*i)” and follow the distributive law. We get, first, “a*x + a*y*i + b*i*x + b*y*i*i”. But “i*i” equals minus one, so this is the same as “a*x + a*y*i + b*i*x - b*y”. Move the real components together, and move the imaginary components together, and we have “(a*x - b*y) + (a*y + b*x)*i”.
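Python's built-in complex type, where i is spelled 1j, lets us verify that distributive-law arithmetic, with numbers I've picked arbitrarily:

```python
# Check the multiplication formula against the language's own arithmetic.
a, b = 2.0, 3.0      # first number:  a + b*i
x, y = 5.0, -1.0     # second number: x + y*i

by_formula = complex(a*x - b*y, a*y + b*x)   # (a*x - b*y) + (a*y + b*x)*i
by_python  = (a + b*1j) * (x + y*1j)
print(by_formula, by_python)                 # both print (13+13j)
```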

That’s the most common way of writing out complex numbers. It’s so common that Eric W Weisstein’s Mathworld encyclopedia even says that’s what complex numbers are. But it isn’t the only way to construct, or look at, complex numbers. A common alternate way to look at complex numbers is to match a complex number to a point on the plane, or if you prefer, a point in the set $R^2$.

It’s surprisingly natural to think of the real component as how far to the right or left of an origin your complex number is, and to think of the imaginary component as how far above or below the origin it is. Much complex-number work makes sense if you think of complex numbers as points in space, or directions in space. The language of vectors trips us up only a little bit here. We speak of a complex number as corresponding to a point on the “complex plane”, just as we might speak of a real number as a point on the “(real) number line”.

But there are other descriptions yet. We can represent complex numbers as a pair of numbers with a scheme that looks like polar coordinates. Pick a point on the complex plane. We can say where that is with two pieces of information. The first is the amplitude, or magnitude: how far the point is from the origin. The second is the phase, or angle: draw the line segment connecting the origin and your point. What angle does that make with the positive horizontal axis?

This representation is called the “phasor” representation. It’s tolerably popular in physics and I hear tell of engineers liking it. We represent numbers then not as “x + i*y” but instead as “r * e^(i*θ)”, with r the magnitude and θ the angle. “e” is the base of the natural logarithm, which you get very comfortable with if you do much mathematics or physics. And “i” is just what we’ve been talking about here. This is a pretty natural way to write about complex numbers that represent stuff that oscillates, such as alternating current or the probability function in quantum mechanics. A lot of stuff oscillates, if you study it through the right lens. So numbers that look like this keep creeping in, and into unexpected places. It’s quite easy to multiply numbers in phasor form — just multiply the magnitude parts, and add the angle parts — although addition and subtraction become a pain.
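The standard library's cmath module speaks this dialect directly, with polar and rect converting between the two representations. A minimal sketch of multiply-the-magnitudes, add-the-angles:

```python
# Build two numbers from magnitude and angle, multiply, inspect the result.
import cmath

z1 = cmath.rect(2.0, cmath.pi / 6)   # magnitude 2, angle 30 degrees
z2 = cmath.rect(3.0, cmath.pi / 3)   # magnitude 3, angle 60 degrees

product = z1 * z2
r, theta = cmath.polar(product)
print(r, theta)                       # 6.0 and pi/2: magnitudes multiplied,
                                      # angles added
print(cmath.rect(6.0, cmath.pi / 2))  # that's 6i, matching z1 * z2
```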

Mathematicians generally use the letter “z” to represent a complex-valued number whose identity is not known. As best I can tell, this is because we do think so much of a complex number as the sum “x + y*i”. So if we used familiar old “x” for an unknown number, it would carry the connotations of “the real component of our complex-valued number” and mislead the unwary mathematician. The connection is so common that a mathematician might carelessly switch between “z” and the real and imaginary components “x” and “y” without specifying that “z” is another way of writing “x + y*i”. A good copy editor or an alert student should catch this.

Complex numbers work very much like real numbers do. They add and multiply in natural-looking ways, and you can do subtraction and division just as well. You can take exponentials, and can define all the common arithmetic functions — sines and cosines, square roots and logarithms, integrals and differentials — on them just as well as you can with real numbers. And you can embed the real numbers within the complex numbers: if you have a real number x, you can match that perfectly with the complex number “x + 0*i”.

But that doesn’t mean complex numbers are exactly like the real numbers. For example, it’s possible to order the real numbers. You can say that the number “a” is less than the number “b”, and have that mean something. That’s not possible to do with complex numbers. You can’t say that “a + b*i” is less than, or greater than, “x + y*i” in a logically consistent way. You can say the magnitude of one complex-valued number is greater than the magnitude of another. But the magnitudes are real numbers. For all that complex numbers give us there are things they’re not good for.
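Both claims are easy to poke at in Python: the familiar functions extend to complex arguments, while the ordering genuinely does not, and the language refuses rather than guesses.

```python
# Functions extend to complex inputs; comparisons do not.
import cmath

print(cmath.sqrt(-1))      # 1j: the square root exists in this context
print(cmath.sin(1 + 2j))   # sine happily accepts a complex argument
print(abs(3 + 4j))         # magnitudes are real numbers: 5.0

try:
    print((1 + 2j) < (3 + 4j))
except TypeError as err:   # Python agrees that complex numbers don't order
    print("no ordering:", err)
```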

## Do You Have To Understand This?

At least around here school is starting up again and that’s got me thinking about learning mathematics. Particularly, it’s got me on the question: what should you do if you get stuck?

You will get stuck. Much of mathematics is learning a series of arguments. They won’t all make sense, at least not at first. The arguments are almost certainly correct. If you’re reading something from a textbook, especially a textbook with a name like “Introductory” that’s got into its seventh edition, the arguments can be counted on. (On the cutting edge of new mathematical discovery arguments might yet be uncertain.) But just because the arguments are right doesn’t mean you’ll see why they’re right, or even how they work at all.

So is it all right, if you’re stuck on a point, to just accept that this is something you don’t get, and move on, maybe coming back later?

Some will say no. Charles Dodgson — Lewis Carroll — took a rather hard line on this, insisting that one must study the argument until it makes sense. There are good reasons for this attitude. One is that while mathematics is made up of lots of arguments, it’s also made up of lots of very similar arguments. If you don’t understand the proof for (say) Green’s Theorem, it’s rather likely you won’t understand Stokes’s Theorem. And that’s coming in a couple of pages. Nor will you get a number of other theorems built on similar setups and using similar arguments. If you want to progress you have to get this.

Another strong argument is that much of mathematics is cumulative. Green’s Theorem is used as a building block to many other theorems. If you haven’t got an understanding of why that theorem works, then you probably also don’t have a clear idea why its follow-up theorems work. Before long the entire chapter is an indistinct mass of the not-quite-understood.

I’m less hard-line about this. I’m sure that shocks everyone who has never heard me express an opinion on anything, ever. But I have to judge the way I learn stuff to be the best possible way to learn stuff. And that includes, after a certain while of beating my head against the wall, moving on and coming back around later.

Why do I think that’s justified? Well, for one, because I’m not in school anymore. What mathematics I learn is because I find it beautiful or fun, and if I’m making myself miserable then I’m missing the point. This is a good attitude when all mathematics is recreational. It’s not so applicable when the exam is Monday, 9:50 am.

But sometimes it’s easier to understand something when you have experience using it. A simple statement of Green’s Theorem can make it sound too intimidating to be useful. When you see it in use, the “why” and “how” can be clearer. The motivation for the theorem can be compelling. The slightly grim joke we shared as majors was that we never really understood a course until we took its successor. This had dire implications for understanding what we would take senior year.

What about the cumulative nature of mathematical knowledge? That’s so and it’s not disputable. But it seems to me possible to accept “this statement is true, even if I’m not quite sure why” on the way to something that requires it. We always have to depend on things that are true that we can’t quite justify. I don’t even mean the axioms or the assumptions going into a theorem. I’m not sure how to characterize the kind of thing I mean.

I can give examples, though. When I was learning simple harmonic motion, the study of pendulums, I was hung up on a particular point. In describing how the pendulum swings, there’s a point where we substitute the sine of the angle of the pendulum for the measure of the angle of the pendulum. If the angle is small enough these numbers are just about the same. But … why? What justifies going from the exact sine of the angle to the approximation of the angle? Why then and not somewhere else? How do you know to do it there and not somewhere else?
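The numerical fact, at least, is easy to check, even if the judgement of when to invoke it isn't. For small angles measured in radians:

```python
# Compare the sine of an angle with the angle itself as the angle shrinks.
import math

for theta in (0.5, 0.1, 0.01):
    print(theta, math.sin(theta), abs(theta - math.sin(theta)))
# 0.5   0.4794...   difference about 0.02
# 0.1   0.0998...   difference about 0.0002
# 0.01  0.00999...  difference under two ten-millionths
```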

I couldn’t get satisfying answers as a student. If I had refused to move on until I understood the process? Well, I might have earlier had an understanding that these sorts of approximations defy rigor. They’re about judgement, when to approximate and when to not. And they come from experience. You learn that approximating this will give you a solvable interesting problem. But approximating that leaves you too simple a problem to be worth studying. But I would have been quite delayed in understanding simple harmonic motion, which is at least as important. Maybe more important if you’re studying physics problems. There have to be priorities.

Is that right, though? I did get to what I thought was more important at the time. But the making of approximations is important, and I didn’t really learn it then. I’d accepted that we would do this and move on, and I did fill in that gap later. But it is so easy to never get back to the gap.

There’s hope if you’re studying something well-developed. By “well-developed” I mean something like “there are several good textbooks someone teaching this might choose from”. If a subject gets several good textbooks it usually has several independent proofs of anything interesting. If you’re stuck on one point, you usually can find it explained by a different chain of reasoning.

Sometimes even just a different author will help. I survived Introduction to Real Analysis (the study of why calculus works) by accepting that I just didn’t speak the textbook’s language. I borrowed an intro to real analysis textbook that was written in French. I don’t speak or really read French, though I had a couple years of it in middle and high school. But the straightforward grammar of mathematical French, and the common vocabulary, meant I was able to work through at least the harder things to understand. Of course, the difference might have been that I had to slowly consider every sentence to turn it from French text to English reading.

Probably there can’t be a universally right answer. We learn by different methods, for different goals, at different times. Whether it’s all right to skip the difficult part and come back later will depend. But I’d like to know what other people think, and more, what they do.

## Avoiding Monsters and Non-Monsters

R J Lipton has an engaging post which starts from something that rather horrified the mathematics community when it was discovered: it’s a function which is continuous everywhere, but it’s not differentiable anywhere, no matter where you look. Continuity and differentiability are important concepts in mathematics, and have very precise definitions — motivated, in part, by things like the difficult function Lipton discusses here — but they can be put into ordinary language tolerably well.

If you think of a continuous function as being one whose graph you could draw without having to lift the pencil from the paper you’re not doing badly. Similarly a function is differentiable at a point if, from that point, you know what way the curve is going. This function, found by Karl Weierstrass, is one example of the breed.

Lipton points out the somewhat unsettling fact that it’s much more common for functions to be like this than to be neat and well-behaved functions like $y = 4x - 3$ or even $y = e^{-\frac{1}{2}x^2}$, in much the same way a real number is much more likely to be irrational than rational. He goes on to give an example, from an area of mathematics I’m not familiar with, of the “pathological” case being the vastly more common one. Fortunately, it turns out, we can usually approximate the “pathological” or “monster” function with something easier to work with. That’s very fortunate, or we could get done very few computations that reflected anything actually interesting. And that’s another thing we can credit Weierstrass with discovering. What follows is the opening of Lipton’s post at Gödel’s Lost Letter and P=NP:

Karl Weierstrass is often credited with the creation of modern analysis. In his quest for rigor and precision, he also created a shock when he presented his “monster” to the Berlin Academy in 1872.

Today I want to talk about the existence of strange and wonderful math objects—other monsters.

Weierstrass’s monster is defined as

$\displaystyle f(x) = \sum_{k=1}^{\infty} a^{k} \cos(b^{k}\pi x),$

where $0 < a < 1$, $b$ is any odd integer, and $ab > 1 + 3\pi/2$. This function is continuous everywhere, but is differentiable nowhere.

The shock was that most mathematicians at the time thought that a continuous function would have to be differentiable at a significant number of points. Some even had tried to prove this. While Weierstrass was the first to publish this, it was apparently known to others as early as 1830 that such functions existed.

This is a picture of the function—note its recursive structure, which is…

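The excerpt's picture doesn't reproduce here, but we can sample partial sums of the series ourselves. A hedged sketch, with parameter values I picked to satisfy Weierstrass's conditions (0 < a < 1, b odd, ab > 1 + 3π/2):

```python
# Partial sums of Weierstrass's series with a = 0.5, b = 13, so a*b = 6.5,
# which exceeds 1 + 3*pi/2 (about 5.71). Even a truncated sum wiggles on
# every scale you sample it at.
import math

def weierstrass(x, a=0.5, b=13, terms=20):
    """Partial sum of sum_k a**k * cos(b**k * pi * x)."""
    return sum(a**k * math.cos(b**k * math.pi * x) for k in range(1, terms + 1))

# Sample near x = 0.3 at two very different scales; the values bounce
# around just as wildly close up as they do from farther away.
print([round(weierstrass(0.3 + i * 0.01), 3) for i in range(5)])
print([round(weierstrass(0.3 + i * 0.0001), 3) for i in range(5)])
```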

## What Is True Almost Everywhere?

I was reading a thermodynamics book (C Truesdell and S Bharatha’s The Concepts and Logic of Classical Thermodynamics as a Theory of Heat Engines, which is a fascinating read, for the field, and includes a number of entertaining, for the field, snipes at the stuff textbook writers put in because they’re just passing on stuff without rethinking it carefully), and ran across a couple proofs which mentioned equations that were true “almost everywhere”. That’s a construction it might be surprising to know even exists in mathematics, so, let me take a couple hundred words to talk about it.

The idea isn’t really exotic. You’ve seen a kind of version of it when you see an equation containing the note that there’s an exception, such as $\frac{\left(x - 1\right)^2}{\left(x - 1\right)} = x \mbox{ for } x \neq 1$. If the exceptions are tedious to list — because there are many of them to write down, or because they’re wordy to describe (the thermodynamics book mentioned the exceptions were where a particular set of conditions on several differential equations happened simultaneously, if it ever happened) — and if they’re unlikely to come up, then we might just write whatever it is we want to say and add an “almost everywhere”, or for shorthand, put an “ae” after the line. This “almost everywhere” will, except in freak cases, propagate through the rest of the proof, but I only see people writing that when they’re students working through the concept. In publications, the “almost everywhere” gets put in where the condition first stops being true everywhere-everywhere and becomes only almost-everywhere, and taken as read after that.

I introduced this with an equation, but it can apply to any relationship: something is greater than something else, something is less than or equal to something else, even something is not equal to something else. (After all, $x \neq -x$ is true almost everywhere, but there is that nagging exception at zero.) A mathematical proof is normally about things which are true. Whether one thing is equal to another is often incidental to that.

What’s meant by “unlikely to come up” is actually rigorously defined, which is why we can get away with this. It’s otherwise a bit daft to think we can just talk about things that are true except where they aren’t and not even post warnings about where they’re not true. If we say something is true “almost everywhere” on the real number line, for example, that means that the set of exceptions has a total length of zero. So if the only exception is where x equals 1, sure enough, that’s a set with no length. Similarly if the exceptions are where x equals positive 1 or negative 1, that’s still a total length of zero. But if the set of exceptions were all values of x from 0 to 4, well, that’s a set of total length 4 and we can’t say “almost everywhere” for that.

This is all quite like saying that it can’t happen that if you flip a fair coin infinitely many times it will come up tails every single time. It won’t, even though properly speaking there’s no reason that it couldn’t. If something is true almost everywhere, then your chance of picking an exception out of all the possibilities is about like your chance of flipping that fair coin and getting tails infinitely many times over.

## Augustin-Louis Cauchy’s birthday

The Maths History feed on Twitter mentioned that the 21st of August was the birthday of Augustin-Louis Cauchy, who lived from 1789 to 1857. His is one of those names you get to know very well when you’re a mathematics major, since he published 789 papers in his life, and did very well at publishing important papers, ones that established concepts people would actually use.

He’s got an intriguing biography, as he lived (mostly) in France during the time of the Revolution, the Directorate, Napoleon, the Bourbon Restoration, the July Monarchy, the Revolutions of 1848, the Second Republic, and the Second Empire, and had a career which got inextricably tangled with the political upheavals of the era. I note that, according to the MacTutor biography linked earlier in this paragraph, he followed the deposed King Charles X to Prague in order to tutor his grandson, but might not have had the right temperament for it: at least once he got annoyed at the grandson’s confusion and screamed and yelled, with the Queen, Marie Thérèse, sometimes telling him, “too loud, not so loud”. But we’ve all had students that frustrate us.

Cauchy’s name appears on many theorems and principles and definitions of interesting things — I just checked Mathworld and his name returned 124 different items — though I’ll admit I’m stumped how to describe what the Cauchy-Frobenius Lemma is without scaring readers off. So let me talk about something simpler.

## Real Experiments with Grading Mathematics

[ On an unrelated note I see someone’s been going through and grading my essays. I thank you, whoever you are; I’ll take any stars I can get. And I’m also delighted to be near to my 9,500th page view; I’ll try to find something neat to do for either 9,999 or 10,000, whichever feels like the better number. ]

As a math major I staggered through a yearlong course in Real Analysis. My impression is this is the reaction most math majors have to it, as it’s the course in which you study why it is that Calculus works, so it’s everything that’s baffling about Calculus only more so. I’d be interested to know what courses math majors consider their most crushingly difficult; I’d think only Abstract Algebra could rival Real Analysis for the position.

While I didn’t fail, I did have to re-take Real Analysis in graduate school, since you can’t go on to many other important courses without mastering it. Remarkably, courses that sound like they should be harder — Complex Analysis, Functional Analysis and their like — often feel easier. Possibly this is because the most important tricks to studying these fields are all introduced in Real Analysis so that by the fourth semester around the techniques are comfortably familiar. Or Functional Analysis really is easier than Real Analysis.

The second time around went quite well, possibly because a class really is easier the second time around (I don’t have the experience in re-taking classes to compare it to) or possibly because I clicked better with the professor, Dr Harry McLaughlin at Rensselaer Polytechnic Institute. Besides giving what I think might be the best homework assignment I ever received, he also used a grading scheme that I really responded to well, and that I’m sorry I haven’t been able to effectively employ when I’ve taught courses.

His concept — I believe he used it for all his classes, but certainly he put it to use in Real Analysis — came, as I remember it, from his being bored with the routine of grading weekly homeworks and monthly exams and a big final. Instead, students could put together a portfolio, showing their mastery of different parts of the course’s topics. The grade for the course was what he judged your mastery of the subject was, based on the breadth and depth of your portfolio work.

Any slightly different way of running class is a source of anxiety, and he took some steps to keep it from being too terrifying a departure. First is that you could turn in a portfolio for review as you liked mid-course and he’d say what he felt was missing or inadequate or which needed reworking. I believe his official policy was that you could turn it in as often as you liked for review, though I wonder what he would do for the most grade-grabby students, the ones who wrestle obsessively for every half-point on every assignment, and who might turn in portfolio revisions on an hourly basis. Maybe he had a rule about doing at most one review a week per student or something like that.

The other is that he still gave out homework assignments and offered exams, and if you wanted you could have them graded as in a normal course, with the portfolio grade being what the traditional course grade would be. So if you were just too afraid to try this portfolio scheme you could just pretend the whole thing was one of those odd jokes professors will offer and not worry.

I really liked this system and was sorry I didn’t have the chance to take more courses from him. The course work felt easier, no doubt partly because there was no particular need to do homework at the last minute or cram for an exam, and if you just couldn’t get around to one assignment you didn’t need to fear a specific and immediate grade penalty. Or at least the penalty as you estimated it was something you could make up by thinking about the material and working on a similar breadth of work to the assignments and exams offered.

I regret that I haven’t had the courage to try this system on a course I was teaching, although I have tried a couple of non-traditional grading schemes. I’m always interested in hearing of more, though, in case I do get back into teaching and feel secure enough to try something odd.

## What Numbers Equal Zero?

I want to give some examples of showing numbers are equal by showing the difference between them is smaller than any ε. It’s a fairly abstruse idea but when it works amazing things become possible.

The easy example, although one that produces strong resistance, is showing that the number 1 is equal to the number 0.9999…. But here I have to say what I mean by that second number. It’s obvious to me that I mean a number formed by putting a decimal point up, and then filling in a ‘9’ to every digit past the decimal, repeating forever and ever without end. That’s a description so easy to grasp it looks obvious. I can give a more precise, less intuitively obvious, description, though, which makes it easier to prove what I’m going to be claiming.
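Here's that claim made concrete with exact rational arithmetic (the fractions module, my choice, nothing the essay depends on): cutting the string of nines off at the n-th digit leaves a difference from 1 of exactly 1/10^n.

```python
# Finite strings of nines, handled exactly, and their difference from 1.
from fractions import Fraction

for n in (1, 2, 5, 10):
    nines = Fraction(10**n - 1, 10**n)   # 0.9, 0.99, 0.99999, ...
    print(n, float(nines), 1 - nines)    # the difference is exactly 1/10**n
```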

## Introducing a Very Small Number

Last time I talked mathematics I introduced the idea of using some little tolerated difference between quantities. This tolerated difference has an immediately obvious and useful real-world interpretation: if we measure two things and they differ by less than that amount, we’d say they’re equal, or close enough to equal for whatever it is we’re doing. And it has great use in the nice exact proofs of some sophisticated mathematical concepts, most of which I think I can get to without introducing equations, which will make everyone happy. Readers like reading things that don’t have equations (folklore has it that every equation, other than E = mc², cuts book sales in half, although I don’t remember seeing that long-established folklore before Stephen Hawking claimed it in A Brief History Of Time, which sold a hundred million billion trillion copies). Writers like not putting in equations because web standards have evolved so that there are not only no good ways of putting in equations, there aren’t even ways that rate as only lousy. But we can make do.

The tolerated difference is usually written as ε, the Greek lower-case e, at least if we are working on calculus or analysis, and it’s typically taken to mean some small number. The use seems to go back to Augustin-Louis Cauchy, who lived from 1789 to 1857, who paired it with the symbol δ to talk about small quantities. He seems to have meant δ, the Greek lowercase d, to be a small number representing a difference, and ε to be a small number representing an error, and the symbols have been with us ever since.

Cauchy’s an interesting person, although it seems sometimes that every mathematician who lived in France anytime around the Revolution and the era of Napoleon was interesting. He was certainly prolific: the MacTutor biography credits him with 789 published papers, and they covered a wide swath of mathematics: solid geometry, polygonal numbers, waves, inelastic shocks, astronomy, differential equations, matrices, and a powerful tool called the Fourier transform. This is why mathematics majors spend about two years running across all sorts of new things named after Cauchy — the Cauchy-Schwarz inequality, Cauchy sequences, Cauchy convergence, Cauchy-Riemann equations, Cauchy-Kovalevskaya existence, Cauchy integrals, and more — until they almost get interested enough to look up something about who he was. For a while Cauchy was tutor to the grandson of France’s King Charles X, but apparently Cauchy had a tendency to get annoyed and start screaming at the uninterested prince. He has two lunar features (a crater and an escarpment) named for him, indicating, I suppose, that Charles X wasn’t asked for a reference.

## Little Enough Differences

It’s as far from my workplace to home as it is from my workplace to my sister-in-law’s home. That’s a fair coincidence, but nobody thinks it’s precisely true. I don’t think it’s exactly true myself, but let me try to make it a little interesting. I’d be surprised if it were the same number of miles from work to either home. I’d be shocked if it were the same number of miles down to the tenth of the mile. To be precisely the same distance, down to the n-th decimal point, would be just impossibly unlikely. But I’d still make the claim, and most people would accept it, and everyone knows what the claim is supposed to mean and why it’s true. What I mean, and what I imagine anyone hearing the claim takes me to mean, is that the difference between these two quantities, the distance from work to home and the distance from work to my sister-in-law’s home, is smaller than some tolerable margin for error.

That’s a good definition of equality between two things in the practical world. It applies mathematically as well. A good number of proofs, particularly the ones that go into proving calculus works, amount to showing that there is some number in which we are interested, and there is some number which we are actually able to calculate, and the difference between those two numbers is less than some tolerated difference. If we’re just looking for an approximate answer, that’s about where we stop. If we want to prove something rigorously and exactly, then we use a slightly different trick.

Instead of proving that the difference is smaller than some tolerated error — say, that the distance to these two homes is the same plus or minus two miles, or that these two cups of soda have the same amount of drink plus or minus a half-ounce, or so — what we do is prove that we can pick some arbitrary small tolerated difference, and find that the number we want and the number we can calculate must be smaller than that tolerated difference. But that tolerated difference might be any positive number. We weren’t given it up front. If the difference is smaller than any positive number, then, we can, at least in imagination, make sure the difference is smaller than every positive number, however tiny. The conclusion, then, is that if the difference between what-we-want and what-we-have is smaller than every positive number, then the difference must be zero. The two quantities have to be equal.
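That argument has a challenge-and-response shape that a few lines of code can mimic. A toy sketch, reusing the string-of-nines example from the earlier essay: hand over any positive tolerance, however small, and we name a stage past which the difference between 1 and the n-nines decimal is beneath it.

```python
# For any epsilon > 0, find n with 10**(-n) < epsilon: the difference
# between 1 and 0.99...9 (n nines) beats every tolerance eventually,
# which is the engine behind concluding the difference is zero.
def stage_beating(epsilon):
    """Smallest n with 10**(-n) < epsilon, found by counting up."""
    n = 1
    while 10.0 ** (-n) >= epsilon:
        n += 1
    return n

for eps in (0.5, 1e-6, 1e-30):
    n = stage_beating(eps)
    print(eps, n, 10.0 ** (-n) < eps)   # True every time
```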

That probably read fairly smoothly. It’s worth going over and thinking about closely because, at least in my experience, that’s one of the spots where calculus and analysis get really confusing. It’s going to deserve some examples.