This was a week of few mathematically-themed comic strips. I don’t mind. If there was a recurring motif, it was about parents not doing mathematics well, or maybe at all. That’s not a very deep observation, though. Let’s look at what is here.
Liniers’s Macanudo for the 18th puts forth 2020 as “the year most kids realized their parents can’t do math”. Which may be so; if you haven’t had cause to do (say) long division in a while then remembering just how to do it is a chore. This trouble is not unique to mathematics, though. Several decades out of regular practice they likely also have trouble remembering what the 11th Amendment to the US Constitution is for, or what the rule is about using “lie” versus “lay”. Some regular practice would correct that, though. In most cases anyway; my experience suggests I cannot possibly learn the rule about “lie” versus “lay”. I’m also shaky on “set” as a verb.
Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 18th shows a mathematician talking, in the jargon of first and second derivatives, to support the claim there’ll never be a mathematician president. Yes, Weinersmith is aware that James Garfield, 20th President of the United States, is famous in trivia circles for having an original proof of the Pythagorean theorem. It would be a stretch to declare Garfield a mathematician, though, except in the way that anyone capable of reason can be a mathematician. Raymond Poincaré, President of France for most of the 1910s and prime minister before and after that, was not a mathematician. He was cousin to Henri Poincaré, who founded so much of our understanding of dynamical systems and of modern geometry. I do not offhand know what presidents (or prime ministers) of other countries have been like.
Weinersmith’s mathematician uses the jargon of the profession, specifically that of calculus. It’s unlikely to communicate well with the population. The message is an ordinary one, though. The first derivative of something with respect to time means the rate at which that thing is changing. The first derivative of a thing with respect to time being positive means that the quantity of the thing is growing. So, that first half means “things are getting more bad”.
The second derivative of a thing with respect to time, though … this is interesting. The second derivative is the same thing as the first derivative with respect to time of “the first derivative with respect to time”. It’s what the change is in the rate-of-change. If that second derivative is negative, then the first derivative will, in time, change from being positive to being negative. So the rate of increase of the original thing will, in time, go from a positive to a negative number. And so the quantity will eventually decline.
So the mathematician is making a this-is-the-end-of-the-beginning speech. The point at which the second derivative of a quantity changes sign is known as the “inflection point”. Reaching that is often seen as the first important step in, for example, disease epidemics. It is usually the first good news, the promise that there will be a limit to the badness. It’s also sometimes mentioned in economic crises or sometimes demographic trends. “Inflection point” is likely as technical a term as one can expect the general public to tolerate, though. Even that may be pushing things.
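To make the derivative talk concrete, here is a minimal sketch in Python. The bell-shaped count of active cases is made up for illustration (the name `active_cases` and its parameters are assumptions, nothing from the strip), and the derivatives are estimated by finite differences rather than worked out exactly.

```python
import math

def active_cases(t, peak=1000.0, peak_day=10.0, spread=2.0):
    """Hypothetical bell-shaped count of active cases on day t."""
    return peak * math.exp(-((t - peak_day) ** 2) / (2.0 * spread ** 2))

def first_derivative(f, t, h=1e-3):
    """Central-difference estimate of the rate of change of f at t."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def second_derivative(f, t, h=1e-3):
    """Central-difference estimate of the change in the rate of change."""
    return (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)

# Day 7 is before the inflection point (day 8 for these made-up numbers),
# day 9 is between the inflection point and the peak, day 11 is past the peak.
for day in (7.0, 9.0, 11.0):
    print(day,
          first_derivative(active_cases, day),
          second_derivative(active_cases, day))
```

On day 9 the first derivative is still positive while the second is negative: things are getting worse, but more slowly. That is the situation the mathematician in the strip is describing.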
Julie Larson’s The Dinette Set rerun for the 21st fusses around words. Along the way Burl mentions his having learned that two negatives can make a positive, in mathematics. Here it’s (most likely) the way that multiplying or dividing two negative numbers will produce a positive number.
I’m again falling behind the comic strips; I haven’t had the writing time I’d like, and that review of last month’s readership has to go somewhere. So let me try to dig my way back to current. The happy news is I get to do one of those single-day Reading the Comics posts, nearly.
Harley Schwadron’s 9 to 5 for the 7th strongly implies that the kid wearing a lemon juicer for his hat has nearly flunked arithmetic. At the least it’s mathematics symbols used to establish this is a school.
Jef Mallett’s Frazz for the 7th has kids thinking about numbers whose (English) names rhyme. And that there are surprisingly few of them, considering that at least the smaller whole numbers are some of the most commonly used words in the language. It would be interesting if there’s some deeper reason that they don’t happen to rhyme, but I would expect that it’s just, well, why should the names of 6 and 8 (say) have anything to do with each other?
There are, arguably, gaps in Evan and Kevyn’s reasoning, and on the 8th one of the other kids brings them up. Basically, is there any reason to say that thirteen and nineteen don’t rhyme? Or that twenty-one and forty-one don’t? Evan writes this off as pedantry. But I, admittedly inclined to be a pedant, think there’s a fair question here. How many numbers do we have names for? Is there something different between the name we have for 11 and the name we have for 1100? Or 2011?
There isn’t an objectively right or wrong answer; at most there are answers that are more or less logically consistent, or that are more or less convenient. Finding what those differences are can be interesting, and I think it bad faith to shut down the argument as “pedantry”.
Dave Whamond’s Reality Check for the 7th claims “birds aren’t partial to fractions” and shows a bird working out, partially with diagrams, the saying about birds in the hand and what they’re worth in the bush.
The narration box, phrasing the bird as not being “partial to fractions”, intrigues me. I don’t know if the choice is coincidental on Whamond’s part. But there is something called “partial fractions” that you get to learn painfully well in Calculus II. It’s used in integrating functions. It turns out that you often can turn a “rational function”, one whose rule is one polynomial divided by another, into the sum of simpler fractions. The point of that is making the fractions into things easier to integrate. The technique is clever, but it’s hard to learn. And, I must admit, I’m not sure I’ve ever used it to solve a problem of interest to me. But it’s very testable stuff.
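For a sense of what the technique looks like, here is one small worked case. The particular rational function is my own pick for illustration, not anything from the strip.

```latex
\frac{1}{x^2 - 1} = \frac{1}{(x - 1)(x + 1)}
                  = \frac{1}{2}\cdot\frac{1}{x - 1} - \frac{1}{2}\cdot\frac{1}{x + 1}

\int \frac{1}{x^2 - 1}\, dx = \frac{1}{2}\ln|x - 1| - \frac{1}{2}\ln|x + 1| + C
```

Each of the simpler fractions integrates to a logarithm, which is the whole point of splitting the original up.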
Today’s A To Z term was suggested by Dina Yagodich, whose YouTube channel features many topics, including calculus and differential equations, statistics, discrete math, and Matlab. Matlab is especially valuable to know as a good quick calculation can answer many questions.
The Wallis named here is John Wallis, an English clergyman and mathematician and cryptographer. His most tweetable work is how we follow his lead in using the symbol ∞ to represent infinity. But he did much in calculus. And it’s a piece of that which brings us to today. He particularly noticed this:
$\frac{1}{2}\pi = \frac{2\cdot 2}{1\cdot 3}\cdot \frac{4\cdot 4}{3\cdot 5}\cdot \frac{6\cdot 6}{5\cdot 7}\cdot \frac{8\cdot 8}{7\cdot 9}\cdots$
This is an infinite product. It’s multiplication’s answer to the infinite series. It always amazes me when an infinite product works. There are dangers when you do anything with an infinite number of terms. Even the basics of arithmetic, like that you can change the order in which you calculate but still get the same result, break down. Series, in which you add together infinitely many things, are risky, but I’m comfortable with the rules to know when the sum can be trusted. Infinite products seem more mysterious. Then you learn an infinite product converges if and only if the series made from the logarithms of the terms in it also converges. Then infinite products seem less exciting.
There are many infinite products that give us π. Some work quite efficiently, giving us lots of digits for a few terms’ work. Wallis’s formula does not. We need about a thousand terms for it to get us a π of about 3.141. This is a bit much to calculate even today. In 1656, when he published it in Arithmetica Infinitorum, a book I have never read? Wallis was able to do mental arithmetic well. His biography at St Andrews says once when having trouble sleeping he calculated the square root of a 53-digit number in his head, and in the morning, remembered it, and was right. Still, this would be a lot of work. How could Wallis possibly do it? And what work could possibly convince anyone else that he was right?
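How slow is slow? A minimal sketch, assuming we count each paired factor $\frac{2k\cdot 2k}{(2k - 1)(2k + 1)}$ as one term (that counting convention, and the function name `wallis_pi`, are my assumptions):

```python
import math

def wallis_pi(n_terms):
    """Estimate pi from the first n_terms paired factors of Wallis's product."""
    product = 1.0
    for k in range(1, n_terms + 1):
        product *= (2 * k) * (2 * k) / ((2 * k - 1) * (2 * k + 1))
    # The product converges to pi/2, so double it.
    return 2.0 * product

for n in (1, 10, 100, 1000):
    estimate = wallis_pi(n)
    print(n, estimate, math.pi - estimate)
```

The error shrinks roughly in proportion to one over the number of terms, so each extra correct digit costs about ten times the work.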
As is common to striking discoveries it was a mixture of insight and luck and persistence and pattern recognition. He seems to have started with pondering the value of
$\int_0^1 \left(1 - x^2\right)^{\frac{1}{2}} dx$
Happily, he knew exactly what this was: $\frac{1}{4}\pi$. He knew this because of a bit of insight. We can interpret the integral here as asking for the area that’s enclosed, on a Cartesian coordinate system, by the positive x-axis, the positive y-axis, and the set of points which makes true the equation $y = \sqrt{1 - x^2}$. This curve is the upper half of a circle with radius 1 and centered on the origin. The area enclosed by all this is one-fourth the area of a circle of radius 1. So that’s how he could know the value of the integral, without doing any symbol manipulation.
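The question, in modern notation, would be whether he could do that integral. And, for this? He couldn’t. But, unable to do the problem he wanted, he tried doing the most similar problem he could, to see what that proved. $\int_0^1 \left(1 - x^2\right)^{\frac{1}{2}} dx$ was beyond his power to integrate; but what if he swapped those exponents? Worked on $\int_0^1 \left(1 - x^{\frac{1}{2}}\right)^2 dx$ instead? This would not, could not, give him what he was interested in. But it would give him something he could calculate. So can we:
$\int_0^1 \left(1 - x^{\frac{1}{2}}\right)^2 dx = \int_0^1 \left(1 - 2x^{\frac{1}{2}} + x\right) dx = 1 - \frac{4}{3} + \frac{1}{2} = \frac{1}{6}$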
And now here comes persistence. What if it’s not $x^{\frac{1}{2}}$ inside the parentheses there? If it’s x raised to some other unit fraction instead? What if the parentheses aren’t raised to the second power, but to some other whole number? Might that reveal something useful? Each of these integrals is calculable, and he calculated them. He worked out a table for many values of
$\int_0^1 \left(1 - x^{\frac{1}{p}}\right)^q dx$
for different sets of whole numbers p and q. He trusted that if he kept this up, he’d find some interesting pattern. And he does. The integral, for example, always turns out to be a unit fraction. And there’s a deeper pattern. Let me share results for different values of p and q; the integral is the reciprocal of the number inside the table. The topmost row is values of q; the leftmost column is values of p.
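The entries of such a table can be recomputed with a short script. This is a sketch under my own assumptions: it takes the integral to be that of (1 - x^(1/p))^q from 0 to 1, as the swapped-exponent example suggests, expands the power with the binomial theorem, and integrates term by term in exact fractions. The function name `wallis_integral` and the range of p and q shown are made up for the illustration.

```python
from fractions import Fraction
from math import comb

def wallis_integral(p, q):
    """Exact integral of (1 - x**(1/p))**q over [0, 1].

    p is a positive integer and q a non-negative integer.
    """
    total = Fraction(0)
    for k in range(q + 1):
        # The integral of x**(k/p) from 0 to 1 is 1 / (k/p + 1) = p / (k + p).
        total += (-1) ** k * comb(q, k) * Fraction(p, k + p)
    return total

# Print the reciprocals, one row per value of p, with columns q = 0, 1, ..., 5.
for p in range(1, 6):
    print(p, [int(1 / wallis_integral(p, q)) for q in range(6)])
```

Every integral comes out as a unit fraction, so taking the reciprocal and converting to an integer loses nothing.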
There is a deep pattern here, although I’m not sure Wallis noticed that one. Look along the diagonals, running from lower-left to upper-right. These are the coefficients of the binomial expansion. Yang Hui’s triangle, if you prefer. Pascal’s triangle, if you prefer that. Let me call the term in row p, column q of this table $a_{p, q}$. Then
$a_{p, q} = \frac{(p + q)!}{p!\, q!}$
Great material, anyway. The trouble is that it doesn’t help Wallis with the original problem, which, in this notation, would have $p = \frac{1}{2}$ and $q = \frac{1}{2}$. What he really wanted was the Binomial Theorem for fractional powers, but western mathematicians didn’t know it yet. Here a bit of luck comes in. He had noticed there’s a relationship between terms in one column and terms in another, particularly, that
$a_{p, q} = \frac{p + q}{q} a_{p, q - 1}$
So why shouldn’t that hold if p and q aren’t whole numbers? … We would today say, why should it hold? But Wallis was working with a different idea of mathematical rigor. He made assumptions that, it turned out, were correct in this case. Of course, had he been wrong, we wouldn’t have heard of any of this and I would have an essay on some other topic.
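With luck in Wallis’s favor we can go back to making a table. What would the row for $p = \frac{1}{2}$ look like? We’ll need both whole and half-integers. $a_{\frac{1}{2}, 0}$ is easy; its reciprocal is 1. $a_{\frac{1}{2}, \frac{1}{2}}$ is also easy; that’s the insight Wallis had to start with. Its reciprocal is $\frac{1}{4}\pi$. What about the rest? Use the equation just up above, relating $a_{p, q}$ to $a_{p, q - 1}$; then we can start to fill in:
$a_{\frac{1}{2}, 0} = 1$, $a_{\frac{1}{2}, \frac{1}{2}} = \frac{4}{\pi}$, $a_{\frac{1}{2}, 1} = \frac{3}{2}$, $a_{\frac{1}{2}, \frac{3}{2}} = \frac{4}{3}\cdot\frac{4}{\pi}$, $a_{\frac{1}{2}, 2} = \frac{15}{8}$, $a_{\frac{1}{2}, \frac{5}{2}} = \frac{6}{5}\cdot\frac{4}{3}\cdot\frac{4}{\pi}$, $a_{\frac{1}{2}, 3} = \frac{35}{16}$, $a_{\frac{1}{2}, \frac{7}{2}} = \frac{8}{7}\cdot\frac{6}{5}\cdot\frac{4}{3}\cdot\frac{4}{\pi}$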
Anything we can learn from this? … Well, sure. For one, as we go left to right, all these entries are increasing. So, like, the second column is less than the third which is less than the fourth. Here’s a triple inequality for you:
$\frac{4}{\pi} < \frac{3}{2} < \frac{4}{3}\cdot\frac{4}{\pi}$
Multiply all that through by, oh, $\pi$. And then divide it all through by 3. What have we got?
$\frac{2\cdot 2}{1\cdot 3} < \frac{1}{2}\pi < \frac{4}{3}\cdot\frac{2\cdot 2}{1\cdot 3}$
I did some rearranging of terms, but, that’s the pattern. One-half π has to be between $\frac{2\cdot 2}{1\cdot 3}$ and four-thirds that.
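Move over a little. Start from the column where $q = \frac{3}{2}$. This starts us out with
$\frac{4}{3}\cdot\frac{4}{\pi} < \frac{15}{8} < \frac{6}{5}\cdot\frac{4}{3}\cdot\frac{4}{\pi}$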
Multiply everything by , and divide everything by and follow with some symbol manipulation. And here’s a tip which would have saved me some frustration working out my notes: . Also, 6 equals 2 times 3. Later on, you may want to remember that 8 equals 2 times 4. All this gets us eventually to
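Move over to the next terms, starting from $q = \frac{5}{2}$. This will get us eventually to
$\frac{2\cdot 2\cdot 4\cdot 4\cdot 6\cdot 6}{1\cdot 3\cdot 3\cdot 5\cdot 5\cdot 7} < \frac{1}{2}\pi < \frac{8}{7}\cdot\frac{2\cdot 2\cdot 4\cdot 4\cdot 6\cdot 6}{1\cdot 3\cdot 3\cdot 5\cdot 5\cdot 7}$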
You see the pattern here. Whatever the value of $\frac{1}{2}\pi$, it’s squeezed between some number, on the left side of this triple inequality, and that same number times … uh … something like $\frac{4}{3}$ or $\frac{6}{5}$ or $\frac{8}{7}$ or, a great many steps along, $\frac{2n + 2}{2n + 1}$ for some large n. That last one is a number very close to 1. So the conclusion is that $\frac{1}{2}\pi$ has to equal whatever that pattern is making for the number on the left there.
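The squeeze can be checked in exact fractions. A minimal sketch, assuming the pattern continues the way the first three steps suggest: after n steps the lower bound is the product of the first n paired factors, and the upper bound is that times (2n + 2)/(2n + 1). The function name `wallis_bounds` is made up for the illustration.

```python
from fractions import Fraction
import math

def wallis_bounds(n):
    """Return (lower, upper) bounds on pi/2 after n steps of the pattern."""
    lower = Fraction(1)
    for k in range(1, n + 1):
        lower *= Fraction((2 * k) * (2 * k), (2 * k - 1) * (2 * k + 1))
    upper = lower * Fraction(2 * n + 2, 2 * n + 1)
    return lower, upper

for n in (1, 2, 3, 100):
    lower, upper = wallis_bounds(n)
    print(n, float(lower), math.pi / 2, float(upper))
```

The first three rows reproduce the bounds 4/3, 64/45, and 256/175 on the left, and by n = 100 the two sides differ by about one percent.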
We can make this more rigorous. Like, we don’t have to just talk about squeezing the number we want between two nearly-equal values. We can rely on the use of the … Squeeze Theorem … to prove this is okay. And there’s much we have to straighten out. Particularly, we really don’t want to write out expressions like
$\frac{2\cdot 2\cdot 4\cdot 4\cdot 6\cdot 6\cdot 8\cdot 8\cdot 10\cdot 10\cdots}{1\cdot 3\cdot 3\cdot 5\cdot 5\cdot 7\cdot 7\cdot 9\cdot 9\cdot 11\cdots}$
Put that way, it looks like, well, we can divide each 3 in the denominator into a 6 in the numerator to get a 2, each 5 in the denominator to a 10 in the numerator to get a 2, and so on. We get a product that’s infinitely large, instead of anything to do with π. This is that problem where arithmetic on infinitely long strings of things becomes dangerous. To be rigorous, we need to write this product as the limit of a sequence, with finite numerator and denominator, and be careful about how to compose the numerators and denominators.
But this is all right. Wallis found a lovely result and in a way that’s common to much work in mathematics. It used a combination of insight and persistence, with pattern recognition and luck making a great difference. Often when we first find something the proof of it is rough, and we need considerable work to make it rigorous. The path that got Wallis to these products is one we still walk.
These are named for George Green, an English mathematician of the early 19th century. He’s one of those people who gave us our idea of mathematical physics. He’s credited with coining the term “potential”, as in potential energy, and in making people realize how studying this simplified problems. Mostly problems in electricity and magnetism, which were so very interesting back then. On the side also came work in multivariable calculus. His work most famous to mathematics and physics majors connects integrals over the surface of a shape with (different) integrals over the entire interior volume. In more specific problems, he did work on the behavior of water in canals.
There’s a patch of (high school) algebra where you solve systems of equations in a couple variables. Like, you have to do one system where you’re solving, say,
And then maybe later on you get a different problem, one that looks like:
If you solve both of them you notice you’re doing a lot of the same work. All the same hard work. It’s only the part on the right-hand side of the equals signs that is different. Even then, the series of steps you follow on the right-hand-side are the same. They have different numbers is all. What makes the problem distinct is the stuff on the left-hand-side. It’s the set of what coefficients times what variables add together. If you learn enough about matrices and vectors you get in the habit of writing this set of equations as one matrix equation, as
$A\vec{x} = \vec{b}$
Here $A$ holds the coefficients and $\vec{x}$ holds all the unknown variables, your x and y and z and anything else that turns up. Your $\vec{b}$ holds the right-hand side. Do enough of these problems and you notice something. You can describe how to find the solution for these equations before you even know what the right-hand-side is. You can do all the hard work of solving this set of equations for a generic set of right-hand-side constants. Fill them in when you need a particular answer.
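Here is a small sketch of doing the hard work once and filling in right-hand sides afterward. The coefficient matrix and the right-hand sides are invented for illustration, and inverting the matrix outright is only one way to do the reusable part (the helper names `invert` and `apply` are mine).

```python
from fractions import Fraction

def invert(matrix):
    """Gauss-Jordan inverse in exact fractions; assumes the matrix is invertible."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    rows = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
            for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap up a row with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        scale = rows[col][col]
        rows[col] = [v / scale for v in rows[col]]
        # Clear this column from every other row.
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[n:] for row in rows]

def apply(matrix, vector):
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

# Hypothetical coefficients; the hard work happens once, here.
A = [[2, 1, -1],
     [1, 3, 2],
     [1, 0, 1]]
A_inverse = invert(A)

# Different right-hand sides reuse that work.
for b in ([1, 2, 3], [0, 5, -4]):
    x = apply(A_inverse, b)
    print(b, x, apply(A, x) == b)
```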
I mentioned, while writing about Fourier series, how it turns out most of what you do to numbers you can also do to functions. This really proves itself in differential equations. Both partial and ordinary differential equations. A differential equation works with some not-yet-known function u(x). For what I’m discussing here it doesn’t matter whether ‘x’ is a single variable or a whole set of independent variables, like, x and y and z. I’ll use ‘x’ as shorthand for all that. The differential equation takes u(x) and maybe multiplies it by something, and adds to that some derivatives of u(x) multiplied by something. Those somethings can be constants. They can be other, known, functions with independent variable x. They can be functions that depend on u(x) also. But if they are, then this is a nonlinear differential equation and there’s no solving that.
So suppose we have a linear differential equation. Partial or ordinary, whatever you like. There’s terms that have u(x) or its derivatives in them. Move them all to the left-hand-side. Move everything else to the right-hand-side. This right-hand-side might be constant. It might depend on x. Doesn’t matter. This right-hand-side is some function which I’ll call f(x). This f(x) might be constant; that’s fine. That’s still a legitimate function.
Put this way, every differential equation looks like:
$(\text{stuff with } u(x) \text{ and its derivatives}) = f(x)$
That stuff with u(x) and its derivatives we can call an operator. An operator’s a function which has a domain of functions and a range of functions. So we can give that a name. ‘L’ is a good name here, because if it’s not the operator for a linear differential equation (a linear operator) then we’re done anyway. So whatever our differential equation was we can write it:
$L u(x) = f(x)$
Writing it makes it look like we’re multiplying L by u(x). We’re not. We’re really not. This is more like if ‘L’ is the predicate of a sentence and ‘u(x)’ is the object. Read it like, to make up an example, ‘L’ means ‘three times the second derivative plus two x times’ and ‘u(x)’ as ‘u(x)’.
Still, looking at $L u(x) = f(x)$ and then back up at $A\vec{x} = \vec{b}$ tells you what I’m thinking. We can find some set of instructions to, for any $\vec{b}$, find the $\vec{x}$ that makes $A\vec{x} = \vec{b}$ true. So why can’t we find some set of instructions to, for any $f(x)$, find the $u(x)$ that makes $L u(x) = f(x)$ true?
This is where a Green’s function comes in. Or, like everybody says, “the” Green’s function. “The” here we use like we might talk about “the” roots of a polynomial. Every polynomial has different roots. So, too, does every differential equation have a different Green’s function. What the Green’s function is depends on the equation. It can also depend on what domain the differential equation applies to. It can also depend on some extra information called initial values or boundary values.
The Green’s function for a differential equation has twice as many independent variables as the differential equation has. This seems like we’re making a mess of things. It’s all right. These new variables are the falsework, the scaffolding. Once they’ve helped us get our work done they disappear. This kind of thing we call a “dummy variable”. If x is the actual independent variable, then pick something else (s is a good choice) for the dummy variable. It’s from the same domain as the original x, though. So the Green’s function is some $G(x, s)$. All right, but how do you find it?
To get this, you have to solve a particular special case of the differential equation. You have to solve:
$L G(x, s) = \delta(x - s)$
This may look like we’re not getting anywhere. It may even look like we’re getting in more trouble. What is this $\delta(x - s)$, for example? Well, this is a particular and famous thing called the Dirac delta function. It’s called a function as a courtesy to our mathematical physics friends, who don’t care about whether it truly is a function. Dirac is Paul Dirac, from over in physics. The one whose biography is called The Strangest Man. His delta function is a strange function. Let me say that its independent variable is t. Then $\delta(t)$ is zero, unless t is itself zero. If t is zero then $\delta(t)$ is … something. What is that something? … Oh … something big. It’s … just … don’t look directly at it. What’s important is the integral of this function:
$\int_D \delta(t)\, dt = 1$ if 0 is in D, and $\int_D \delta(t)\, dt = 0$ otherwise
I write it this way because there’s delta functions for two-dimensional spaces, three-dimensional spaces, everything. If you integrate over a region that includes the origin, the integral of the delta function is 1. If you integrate over a region that doesn’t, the integral of the delta function is 0.
The delta function has a neat property sometimes called filtering. This is what happens if you integrate some function times the Dirac delta function. Then, so long as the region D includes the origin …
$\int_D \delta(t) f(t)\, dt = f(0)$
This may look dumb. That’s fine. This scheme is so good at getting rid of integrals where you don’t want them. Or at getting integrals in where it’d be convenient to have.
So, I have a mental model of what the Dirac delta function does. It might help you. Think of beating a drum. It can sound like many different things. It depends on how hard you hit it, how fast you hit it, what kind of stick you use, where exactly you hit it. I think of each differential equation as a different drumhead. The Green’s function is then the sound of a specific, uniform, reference hit at a reference position. This produces a sound. I can use that sound to extrapolate how every different sort of drumming would sound on this particular drumhead.
So solving this one differential equation, to find the Green’s function for a particular case, may be hard. Maybe not. Often it’s easier than some particular f(x) because the Dirac delta function is so weird that it becomes kinda easy-ish. But you do have to find one solution to this differential equation, somehow.
Once you do, though? Once you have this $G(x, s)$? That is glorious. Because then, whatever your f is? The solution to $L u(x) = f(x)$ is:
$u(x) = \int G(x, s) f(s)\, ds$
Here the integral is over whatever the domain of the differential equation is, and whatever the domain of f is. This last integral is where the dummy variable finally evaporates. All that remains is x, as we want.
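Here is a minimal numerical sketch of that recipe for one specific, well-known case that I am picking as an illustration: the operator L is “minus the second derivative”, on the domain from 0 to 1, with u required to be zero at both ends. Its Green’s function is x(1 - s) when x is at most s and s(1 - x) otherwise. The function names and the midpoint-rule integration are my assumptions, not a general-purpose solver.

```python
import math

def green(x, s):
    """Green's function for -u'' = f on [0, 1] with u(0) = u(1) = 0."""
    return x * (1.0 - s) if x <= s else s * (1.0 - x)

def solve(f, x, panels=2000):
    """Approximate u(x) = integral of G(x, s) f(s) ds by the midpoint rule."""
    width = 1.0 / panels
    total = 0.0
    for i in range(panels):
        s = (i + 0.5) * width  # midpoint of the i-th panel
        total += green(x, s) * f(s)
    return total * width

# f(x) = 1 has the exact solution u(x) = x (1 - x) / 2.
print(solve(lambda s: 1.0, 0.5), 0.5 * (1 - 0.5) / 2)

# f(x) = sin(pi x) has the exact solution u(x) = sin(pi x) / pi**2.
print(solve(lambda s: math.sin(math.pi * s), 0.3),
      math.sin(math.pi * 0.3) / math.pi ** 2)
```

The same `green` serves both right-hand sides. Only the function f being integrated against it changes.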
A little bit of … arithmetic isn’t the right word. But symbol manipulation will convince you this is right, if you need convincing. (The trick is remembering that ‘x’ and ‘s’ are different variables. When you differentiate with respect to ‘x’, ‘s’ acts like a constant. When you integrate with respect to ‘s’, ‘x’ acts like a constant.)
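The symbol manipulation runs roughly like this. It is a sketch that assumes the Green’s function G(x, s) is defined by L G(x, s) = δ(x - s), that L (which acts on x) can be moved inside the integral over s, and that the filtering property of the delta function applies.

```latex
L u(x) = L \int G(x, s) f(s)\, ds
       = \int \left[ L G(x, s) \right] f(s)\, ds
       = \int \delta(x - s) f(s)\, ds
       = f(x)
```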
What can make a Green’s function worth finding is that we do a lot of the same kinds of differential equations. We do a lot of diffusion problems. A lot of wave transmission problems. A lot of wave-transmission-with-losses problems. So there are many problems that can all use the same tools to solve.
Consider remote detection problems. This can include things like finding things underground. It also includes, like, medical sensors. We would like to know “what kind of thing produces a signal like this?” We can detect the signal easily enough. We can model how whatever it is between the thing and our sensors changes what we could detect. (This kind of thing we call an “inverse problem”, finding the thing that could produce what we know.) Green’s functions are one of the ways we can get at the source of what we can see.
Now, Green’s functions are a powerful and useful idea. They sprawl over a lot of mathematical applications. As they do, they pick up regional dialects. Things like deciding that , for example. None of these are significant differences. But before you go poking into someone else’s field and solving their problems, take a moment. Double-check that their symbols do mean precisely what you think they mean. It’ll save you some petty quarrels.
Also, I really don’t like how those systems of equations turned out up at the top of this essay. But I couldn’t work out how to do arrays of equations all lined up along the equals sign, or other mildly advanced LaTeX stuff like doing a function-definition-by-cases. If someone knows of the Real Official Proper List of what you can and can’t do with the LaTeX that comes from a standard free WordPress.com blog I’d appreciate a heads-up. Thank you.
The Extreme Value Theorem, which I chose to write about, is a fundamental bit of analysis. There is also a similarly-named but completely unrelated Extreme Value Theory. This exists in the world of statistics. That’s about outliers, and about how likely it is you’ll find an even more extreme outlier if you continue sampling. This is valuable in risk assessment: put another way, it’s the question of what neighborhoods you expect to flood based on how the river’s overflowed the last hundred years. Or be in a wildfire, or be hit by a major earthquake, or whatever. The more I think about it the more I realize that’s worth discussing too. Maybe in the new year, if I decide to do some A To Z extras.
And then there are theorems that seem the opposite. Ones that seem so obvious, and so obviously true, that they hardly seem like mathematics. If they’re not axioms, they might as well be. The extreme value theorem is one of these.
It’s a theorem about functions. Here, functions that have a domain and a range that are both real numbers. Even more specifically, about continuous functions. “Continuous” is a tricky idea to make precise, but we don’t have to do it. A century of mathematicians worked out meanings that correspond pretty well to what you’d imagine it should mean. It means you can draw a graph representing the function without lifting the pen. (Do not attempt to use this definition at your thesis defense. I’m skipping a century’s worth of hard thinking about the subject.)
And it’s a theorem about “extreme” values. “Extreme” is a convenient word. It means “maximum or minimum”. We’re often interested in the greatest or least value of a function. Having a scheme to find the maximum is as good as having one to find a minimum. So there’s little point talking about them as separate things. But that forces us to use a bunch of syllables. Or to adopt a convention that “by maximum we always mean maximum or minimum”. We could say we mean that, but I’ll bet a good number of mathematicians, and 95% of mathematics students, would forget the “or minimum” within ten minutes. “Extreme”, then. It’s short and punchy and doesn’t commit us to a maximum or a minimum. It’s simply the most outstanding value we can find.
The Extreme Value Theorem doesn’t help us find them. It only proves to us there is an extreme to find. Particularly, it says that if a continuous function has a domain that’s a closed and bounded interval, then it has to have a maximum and a minimum. And it has to attain the maximum and the minimum at least once each. That is, something in the domain matches to the maximum. And something in the domain matches to the minimum. Could be multiple times, yes.
This might not seem like much of a theorem. Existence proofs rarely do. It’s a bias, I suppose. We like to think we’re out looking for solutions. So we suppose there’s a solution to find. Checking that there is an answer before we start looking? That seems excessive. Before heading to the airport we might check the flight wasn’t delayed. But we almost never check that there is still a Newark to fly to. I’m not sure, in working out problems, that we check it explicitly. We decide early on that we’re working with continuous functions and so we can try out the usual approaches. That we use the theorem becomes invisible.
And that’s sort of the history of this theorem. The Extreme Value Theorem, for example, is part of how we now prove Rolle’s Theorem. Rolle’s theorem is about functions continuous and differentiable on the interval from a to b. And functions that have the same value for a and for b. The conclusion is the function has got a local maximum or minimum in-between these. It’s the theorem depicted in that xkcd comic you maybe didn’t check out a few paragraphs ago. Rolle’s Theorem is named for Michel Rolle, who proved the theorem (for polynomials) in 1691. The Indian mathematician Bhaskara II, in the 12th century, stated the theorem too. (I’m so ignorant of the Indian mathematical tradition that I don’t know whether Bhaskara II stated it for polynomials, or for functions in general, or how it was proved.)
The Extreme Value Theorem was proven around 1860. (There was an earlier proof, by Bernard Bolzano, whose name you’ll find all over talk about limits and functions and continuity and all. But that was unpublished until 1930. The proofs known about at the time were done by Karl Weierstrass. His is the other name you’ll find all over talk about limits and functions and continuity and all. Go on, now, guess who it was proved the Extreme Value Theorem. And guess what theorem, bearing the name of two important 19th-century mathematicians, is at the core of proving that. You need at most two chances!) That is, mathematicians were comfortable using the theorem before it had a clear identity.
Once you know that it’s there, though, the Extreme Value Theorem’s a great one. It’s useful. Rolle’s Theorem I just went through. There’s also the quite similar Mean Value Theorem. This one is about functions continuous and differentiable on an interval. It tells us there’s at least one point where the derivative is equal to the mean slope of the function on that interval. This is another theorem that’s a quick proof once you have the Extreme Value Theorem. Or we can get more esoteric. There’s a technique known as Lagrange Multipliers. It’s a way to find where on a constrained surface a function is at its maximum or minimum. It’s a clever technique, one that I needed time to accept as a thing that could possibly work. And why should it work? Go ahead, guess what the centerpiece of at least one method of proving it is.
Step back from calculus and into real analysis. That’s the study of why calculus works, and how real numbers work. The Extreme Value Theorem turns up again and again. Like, one technique for defining the integral itself is to approximate a function with a “stepwise” function. This is one that looks like a pixellated, rectangular approximation of the function. The definition depends on having a stepwise rectangular approximation that’s as close as you can get to a function while always staying less than it. And another stepwise rectangular approximation that’s as close as you can get while always staying greater than it.
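A rough sketch of those two stepwise approximations, in Python. It estimates the least and greatest value on each piece by sampling, which is only an approximation; that a continuous function really has a least and greatest value on each closed piece is exactly what the Extreme Value Theorem supplies. The function name `step_sums` and its parameters are made up for the illustration.

```python
def step_sums(f, a, b, pieces, samples_per_piece=200):
    """Approximate the lower and upper stepwise sums of f on [a, b]."""
    width = (b - a) / pieces
    lower = 0.0
    upper = 0.0
    for i in range(pieces):
        left = a + i * width
        # Sample the piece, endpoints included, to estimate its min and max.
        values = [f(left + width * j / samples_per_piece)
                  for j in range(samples_per_piece + 1)]
        lower += min(values) * width
        upper += max(values) * width
    return lower, upper

# The area under x**2 from 0 to 1 is 1/3; the two sums close in on it.
for pieces in (10, 100):
    print(pieces, step_sums(lambda x: x * x, 0.0, 1.0, pieces))
```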
And then other results. Often in real analysis we want to know about whether sets are closed and bounded. The Extreme Value Theorem has a neat corollary. Start with a continuous function with domain that’s a closed and bounded interval. Then, this theorem demonstrates, the range is also a closed and bounded interval. I know this sounds like a technical point. But it is the sort of technical point that makes life easier.
The Extreme Value Theorem even takes on meaning when we don’t look at real numbers. We can rewrite it in topological spaces. These are sets of points for which we have an idea of a “neighborhood” of points. We don’t demand that we know what distance is exactly, though. What had been a closed and bounded interval becomes a mathematical construct called a “compact set”. The idea of a continuous function changes into one about the preimage of an open set being another open set. And there is still something recognizably the Extreme Value Theorem. It tells us about things called the supremum and infimum, which are slightly different from the maximum and minimum. Just enough to confuse the student taking real analysis the first time through.
Topological spaces are an abstracted concept. Real numbers are topological spaces, yes. But many other things also are. Neighborhoods and compact sets and open sets are also abstracted concepts. And so this theorem has its same quiet utility in these many spaces. It’s just there quietly supporting more challenging work.
I liked that episode. I’ve got happy memories of the time when I first saw it. I thought the sketch in which Crow T Robot got so volume-obsessed was goofy and dumb in the fun-nerd way.
I accept Mr Kassinger’s challenge only I’m going to take it seriously.
How big is a thing?
There is a legend about Thomas Edison. He was unimpressed with a new hire. So he hazed the college-trained engineer who deeply knew calculus. He demanded the engineer tell him the volume within a light bulb. The engineer went to work, making measurements of the shape of the bulb’s outside. And then started the calculations. This involves a calculus technique called “volumes of rotation”. This can tell the volume within a rotationally symmetric shape. It’s tedious, especially if the outer edge isn’t some special nice shape. Edison, fed up, took the bulb, filled it with water, poured that out into a graduated cylinder and said that was the answer.
I’m skeptical of legends. I’m skeptical of stories about the foolish intellectual upstaged by the practical man-of-action. And I’m skeptical of Edison because, jeez, I’ve read biographies of the man. Even the fawning ones make him out to be yeesh.
But the legend’s Edison had a point. If the volume of a shape is not how much stuff fits inside the shape, what is it? And maybe some object has too complicated a shape to find its volume. Can we think of a way to produce something with the same volume, but that is easier? Sometimes we can. When we do this with straightedge and compass, the way the Ancient Greeks found so classy, we call this “quadrature”. It’s called quadrature from its application in two dimensions. It finds, for a shape, a square with the same area. For a three-dimensional object, we find a cube with the same volume. Cubes are easy to understand.
Straightedge and compass can’t do everything. Indeed, there’s so much they can’t do. Some of it is stuff you’d think it should be able to, like, find a cube with the same volume as a sphere. Integration gives us a mathematical tool for describing how much stuff is inside a shape. It’s even got a beautiful shorthand expression. Suppose that D is the shape. Then its volume V is:
$V = \int\int\int_D dV$
Here “dV” is the “volume form”, a description of how the coordinates we describe a space in relate to the volume. The $\int\int\int$ is jargon, meaning, “integrate over the whole volume”. The subscript “D” modifies that phrase by adding “of D” to it. Writing “D” is shorthand for “these are all the points inside this shape, in whatever coordinate system you use”. If we didn’t do that we’d have to say, on each $\int$ sign, what points are inside the shape, coordinate by coordinate. At this level the equation doesn’t offer much help. It says the volume is the sum of infinitely many, infinitely tiny pieces of volume. True, but that doesn’t give much guidance about whether it’s more or less than two cups of water. We need to get more specific formulas, usually. We need to pick coordinates, for example, and say what coordinates are inside the shape. A lot of the resulting formulas can’t be integrated exactly. Like, an ellipsoid? Maybe you can integrate that. Don’t try without getting hazard pay.
We can approximate this integral. Pick a tiny shape whose volume is easy to know. Fill your shape with duplicates of it. Count the duplicates. Multiply that count by the volume of this tiny shape. Done. This is numerical integration, sometimes called “numerical quadrature”. If we’re being generous, we can say the legendary Edison did this, using water molecules as the tiny shape. And working so that he didn’t need to know the exact count or the volume of individual molecules. Good computational technique.
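A minimal sketch of that counting, in Python. The function name `count_cubes_volume` and the unit sphere as the test shape are assumptions of mine, picked because the exact answer is known. It fills a box with little cubes, counts the ones whose centers land inside the shape, and multiplies by the volume of one little cube.

```python
import math


def count_cubes_volume(inside, lo, hi, n):
    """Approximate the volume of a shape sitting within the box [lo, hi]^3.

    inside(x, y, z) says whether a point is in the shape; n is the number
    of little cubes along each edge of the box.
    """
    side = (hi - lo) / n
    count = 0
    for i in range(n):
        x = lo + (i + 0.5) * side          # center of the little cube
        for j in range(n):
            y = lo + (j + 0.5) * side
            for k in range(n):
                z = lo + (k + 0.5) * side
                if inside(x, y, z):
                    count += 1
    return count * side ** 3               # count times volume of one cube


def in_unit_sphere(x, y, z):
    return x * x + y * y + z * z <= 1.0


estimate = count_cubes_volume(in_unit_sphere, -1.0, 1.0, 60)
print(estimate, 4.0 * math.pi / 3.0)
```

Smaller cubes give a better estimate, at the cost of a lot more counting. The legendary Edison's water molecules are very small cubes indeed.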
It’s hard not to feel we’re begging the question, though. We want the volume of something. So we need the volume of something else. Where does that volume come from?
Well, where does an inch come from? Or a centimeter? Whatever unit you use? You pick something to use as reference. Any old thing will do. Which is why you get fascinating stories about choosing what to use. And bitter arguments about which of several alternatives to use. And we express the length of something as some multiple of this reference length.
Volume works the same way. Pick a reference volume, something that can be one unit-of-volume. Other volumes are some multiple of that unit-of-volume. Possibly a fraction of that unit-of-volume.
Usually we use a reference volume that’s based on the reference length. Typically, we imagine a cube that’s one unit of length on each side. The volume of this cube with sides of length 1 unit-of-length is then 1 unit-of-volume. This seems all nice and orderly and it’s surely not because mathematicians have been paid off by six-sided-dice manufacturers.
Does it have to be?
That we need some reference volume seems inevitable. We can’t very well say the area of something is ten times nothing-in-particular. Does that reference volume have to be a cube? Or even a rectangle or something else? It seems obvious that we need some reference shape that tiles, that can fill up space by itself … right?
What if we don’t?
I’m going to drop out of three dimensions a moment. Not because it changes the fundamentals, but because it makes something easier. Specifically, it makes it easier if you decide you want to get some construction paper, cut out shapes, and try this on your own. What this will tell us about area is just as true for volume. Area, for a two-dimensional space, and volume, for a three-dimensional, describe the same thing. If you’ll let me continue, then, I will.
So draw a figure on a clean sheet of paper. What’s its area? Now imagine you have a whole bunch of shapes with reference areas. A bunch that have an area of 1. That’s by definition. That’s our reference area. A bunch of smaller shapes with an area of one-half. By definition, too. A bunch of smaller shapes still with an area of one-third. Or one-fourth. Whatever. Shapes with areas you know because they’re marked on them.
Here’s one way to find the area. Drop your reference shapes, the ones with area 1, on your figure. How many do you need to completely cover the figure? It’s all right to cover more than the figure. It’s all right to have some of the reference shapes overlap. All you need is to cover the figure completely. … Well, you know how many pieces you needed for that. You can count them up. You can add up the areas of all these pieces needed to cover the figure. So the figure’s area can’t be any bigger than that sum.
Can’t be exact, though, right? Because you might get a different number if you covered the figure differently. If you used smaller pieces. If you arranged them better. This is true. But imagine all the possible reference shapes you had, and all the possible ways to arrange them. There’s some smallest area of those reference shapes that would cover your figure. Is there a more sensible idea for what the area of this figure would be?
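To make the covering idea concrete, here is a small Python sketch. It makes assumptions the text does not: the figure is a disc of radius 1, and the reference shapes are squares laid on a grid, which is only one of the many possible arrangements. The name `covering_area` is invented for the example. Each covering gives a number the disc's area can't exceed, and smaller squares give a smaller number.

```python
import math


def covering_area(radius, side):
    """Total area of the grid squares (edge `side`) that touch a disc.

    The disc has the given radius and is centered at the origin. A square
    is kept if its closest point to the origin is within the radius.
    """
    n = math.ceil(radius / side)
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # Closest point to the origin in the square
            # [i*side, (i+1)*side] x [j*side, (j+1)*side].
            cx = min(max(0.0, i * side), (i + 1) * side)
            cy = min(max(0.0, j * side), (j + 1) * side)
            if cx * cx + cy * cy <= radius * radius:
                count += 1
    return count * side * side


for side in (0.5, 0.1, 0.02):
    print(side, covering_area(1.0, side), math.pi)
```

The numbers creep down toward the disc's true area from above. The definition in the text goes further, taking the smallest total over every possible covering, not just grids.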
And put this into three dimensions. If we start from some reference shapes of volume 1 and maybe 1/2 and 1/3 and whatever other useful fractions there are? Doesn’t this covering make sense as a way to describe the volume? Cubes or rectangles are easy to imagine. Tetrahedrons too. But why not any old thing? Why not, as the Mystery Science Theater 3000 episode had it, turkeys?
This is a nice, flexible, convenient way to define area. So now let’s see where it goes all bizarre. We know this thanks to Giuseppe Peano. He’s among the late-19th/early-20th century mathematicians who shaped modern mathematics. They did this by showing how much of our mathematics broke intuition. Peano was (here) exploring what we now call fractals. And noted a family of shapes that curl back on themselves, over and over. They’re beautiful.
And they fill area. Fill volume, if done in three dimensions. It seems impossible. If we use this covering scheme, and try to find the volume of a straight line, we get zero. Well, we find that any positive number is too big, and from that conclude that it has to be zero. Since a straight line has length, but not volume, this seems fine. But a Peano curve won’t go along with this. A Peano curve winds back on itself so much that there is some minimum volume to cover it.
This unsettles. But this idea of volume (or area) by covering works so well. To throw it away seems to hobble us. So it seems worth the trade. We allow ourselves to imagine a line so long and so curled up that it has a volume. Amazing.
And now I get to relax and unwind and enjoy a long weekend before coming to the letter ‘W’. That’ll be about some topic I figure I can whip out a nice tight 500 words about, and instead, produce some 1541-word monstrosity while I wonder why I’ve had no free time at all since August. Tuesday, give or take, it’ll be available at this link, as are the rest of these glossary posts. Thanks for reading.
While putting together the last comics from a week ago I realized there was a repeat among them. And a pretty recent repeat too. I’m supposing this is a one-off, but who can be sure? We’ll get there. I figure to cover last week’s mathematically-themed comics in posts on Wednesday and Thursday, subject to circumstances.
As fits the joke, the bit of calculus in this textbook paragraph is wrong. does not equal . This is even ignoring that we should expect, with an indefinite integral like this, a constant of integration. An indefinite integral like this is equal to a family of related functions. But it’s common shorthand to write out one representative function. But the indefinite integral of is not . You can confirm that by differentiating . The result is nothing like . Differentiating an indefinite integral should get the original function back. Here are the rules you need to do that for yourself.
As I make it out, a correct indefinite integral would be:
Plus that “constant of integration” the value of which we can’t tell just from the function we want to indefinitely-integrate. I admit I haven’t double-checked that I’m right in my work here. I trust someone will tell me if I’m not. I’m going to feel proud enough if I can get the LaTeX there to display.
Stephen Beals’s Adult Children for the 27th has run already. It turned up in late March of this year. Michael Spivak’s Calculus is a good choice for representative textbook. Calculus holds its terrors, too. Even someone who’s gotten through trigonometry can find the subject full of weird, apparently arbitrary rules. And formulas like those in the above paragraph.
Rob Harrell’s Big Top for the 27th is a strip about the difficulties of splitting a restaurant bill. And they’ve not even got to calculating the tip. (Maybe it’s just a strip about trying to push the group to splitting the bill in a way that lets you off cheap. I haven’t had to face a group bill like this in several years. My skills with it are rusty.)
I got an irresistible topic for today’s essay. It’s courtesy Peter Mander, author of Carnot Cycle, “the classical blog about thermodynamics”. It’s bimonthly and it’s one worth waiting for. Some of the essays are historical; some are statistical-mechanics; many are mixtures of them. You could make a fair argument that thermodynamics is the most important field of physics. It’s certainly one that hasn’t gotten the popularization treatment it deserves, for its importance. Mander is doing something to correct that.
It is hard to think of limits without thinking of motion. The language even professional mathematicians use suggests it. We speak of the limit of a function “as x goes to a”, or “as x goes to infinity”. Maybe “as x goes to zero”. But a function is a fixed thing, a relationship between stuff in a domain and stuff in a range. It can’t change any more than January, AD 1988 can change. And ‘x’ here is a dummy variable, part of the scaffolding to let us find what we want to know. I suppose ‘x’ can change, but if we ever see it, something’s gone very wrong. But we want to use it to learn something about a function for a point like ‘a’ or ‘infinity’ or ‘zero’.
The language of motion helps us learn, to a point. We can do little experiments: if $f(x) = \frac{\sin(x)}{x}$, then, what should we expect it to be for x near zero? It’s irresistible to try out the calculator. Let x be 0.1. 0.01. 0.001. 0.0001. The numbers say this f(x) gets closer and closer to 1. That’s good, right? We know we can’t just put in an x of zero, because there’s some trouble that makes. But we can imagine creeping up on the zero we really wanted. We might spot some obvious prospects for mischief: what if x is negative? We should try -0.1, -0.01, -0.001 and so on. And maybe we won’t get exactly the right answer. But if all we care about is the first (say) three digits and we try out a bunch of x’s and the corresponding f(x)’s agree to those three digits, that’s good enough, right?
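The calculator experiment is easy to run in Python. This sketch assumes, for illustration, that f(x) is sin(x) divided by x, which is one function that behaves the way the paragraph describes: undefined at zero, and apparently heading to 1 nearby.

```python
import math


def f(x):
    # Undefined at x = 0: that would be zero divided by zero.
    return math.sin(x) / x


# Creep up on zero from the positive side, then from the negative side.
for x in (0.1, 0.01, 0.001, 0.0001, -0.1, -0.01, -0.001, -0.0001):
    print(x, f(x))
```

The printed values agree with 1 to more and more digits as x shrinks, from either side. That is evidence for what the limit should be, not yet a proof.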
This is good for giving an idea of what to expect a limit to look like. It should be, well, what it really really really looks like a function should be. It takes some thinking to see where it might go wrong. It might go to different numbers based on which side you approach from. But that seems like something you can rationalize. Indeed, we do; we can speak of functions having different limits based on what direction you approach from. Sometimes that’s the best one can say about them.
But it can get worse. It’s possible to make functions that do crazy weird things. Some of these look like you’re just trying to be difficult. Like, set f(x) equal to 1 if x is rational and 0 if x is irrational. If you don’t expect that to be weird you’re not paying attention. Can’t blame someone for deciding that falls outside the realm of stuff you should be able to find limits for. And who would make, say, an f(x) that was 1 if x was 0.1 raised to some power, but 2 if x was 0.2 raised to some power, and 3 otherwise? Besides someone trying to prove a point?
Fine. But you can make a function that looks innocent and yet acts weird if the domain is two-dimensional. Or more. It makes sense to say that the functions I wrote in the above paragraph should be ruled out of consideration. But the limit of at the origin? You get different results approaching in different directions. And the function doesn’t give obvious signs of imminent danger here.
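As a sketch of what “different results approaching in different directions” looks like, take a standard textbook function, g(x, y) = xy / (x² + y²). That choice is my assumption for illustration and not necessarily the function meant above. The Python below walks toward the origin along several straight lines.

```python
def g(x, y):
    # Undefined at the origin itself, where the denominator is zero.
    return x * y / (x * x + y * y)


for slope in (0.0, 1.0, -1.0, 2.0):
    # Walk toward the origin along the line y = slope * x.
    values = [g(t, slope * t) for t in (0.1, 0.01, 0.001)]
    print(slope, values)
```

Along each line the values settle down at once, but each line settles on a different number. So there is no single value the function is heading for at the origin.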
We need a better idea. And we even have one. This took centuries of mathematical wrangling and arguments about what should and shouldn’t be allowed. This should inspire sympathy with Intro Calc students who don’t understand all this by the end of week three. But here’s what we have.
I need a supplementary idea first. That is the neighborhood. A point has a neighborhood if there’s some open set that contains it. We represent this by drawing a little blob around the point we care about. If we’re looking at the neighborhood of a real number, then this is a little interval, that’s all. When we actually get around to calculating, we make these neighborhoods little circles. Maybe balls. But when we’re doing proofs about how limits work, or how we use them to prove things, we make blobs. This “neighborhood” idea looks simple, but we need it, so here we go.
So start with a function, named ‘f’. It has a domain, which I’ll call ‘D’. And a range, which I want to call ‘R’, but I don’t think I need the shorthand. Now pick some point ‘a’. This is the point at which we want to evaluate the limit. This seems like it ought to be called the “limit point” and it’s not. I’m sorry. Mathematicians use “limit point” to talk about something else. And, unfortunately, it makes so much sense in that context that we aren’t going to change away from that.
‘a’ might be in the domain ‘D’. It might not. It might be on the border of ‘D’. All that’s important is that there be a neighborhood inside ‘D’ that contains ‘a’.
I don’t know what f(a) is. There might not even be an f(a), if a is on the boundary of the domain ‘D’. But I do know that everything inside the neighborhood of ‘a’, apart from ‘a’, is in the domain. So we can look at the values of f(x) for all the x’s in this neighborhood. This will create a set, in the range, that’s known as the image of the neighborhood. It might be a continuous chunk in the range. It might be a couple of chunks. It might be a single point. It might be some crazy-quilt set. Depends on ‘f’. And the neighborhood. No matter.
Now I need you to imagine the reverse. Pick a point in the range. And then draw a neighborhood around it. Then pick out what we call the pre-image of it. That’s all the points in the domain that get matched to values inside that neighborhood. Don’t worry about trying to do it; that’s for the homework practice. Would you agree with me that you can imagine it?
I hope so because I’m about to describe the part where Intro Calc students think hard about whether they need this class after all.
All right. Then I want something in the range. I’m going to call it ‘L’. And it’s special. It’s the limit of ‘f’ at ‘a’ if this following bit is true:
Think of every neighborhood you could pick of ‘L’. Can be big, can be small. Just has to be a neighborhood of ‘L’. Now think of the pre-image of that neighborhood. Is there always a neighborhood of ‘a’ inside that pre-image? It’s okay if it’s a tiny neighborhood. Just has to be an open neighborhood. It doesn’t have to contain ‘a’. You can allow a pinpoint hole there.
If you can always do this, however tiny the neighborhood of ‘L’ is, then the limit of ‘f’ at ‘a’ is ‘L’. If you can’t always do this — if there’s even a single exception — then there is no limit of ‘f’ at ‘a’.
I know. I felt like that the first couple times through the subject too. The definition feels backward. Worse, it feels like it begs the question. We suppose there’s an ‘L’ and then test these properties about it and then if it works we say we’re done? I know. It’s a pain when you start calculating this with specific formulas and all that, too. But supposing there is an answer and then learning properties about it, including whether it can exist? That’s a slick trick. We can use it.
Thing is, the pain is worth it. We can calculate with it and not have to out-think tricky functions. It works for domains with as many dimensions as you need. It works for limits that aren’t inside the domain. It works with domains and ranges that aren’t real numbers. It works for functions with weird and complicated domains. We can adapt it if we want to consider limits that are constrained in some way. It won’t be fooled by tricks like I put up above, the f(x) with different rules for the rational and irrational numbers.
So mathematicians shrug, and do enough problems that they get the hang of it, and use this definition. It’s worth it, once you get there.
I’m back to requests! Today’s comes from commenter Dina Yagodich. I don’t know whether Yagodich has a web site, YouTube channel, or other mathematics-discussion site, but am happy to pass along word if I hear of one.
Let me start by explaining integral calculus in two paragraphs. One of the things done in it is finding a ‘definite integral’. This is itself a function. The definite integral has as its domain the combination of a function, plus some boundaries, and its range is numbers. Real numbers, if nobody tells you otherwise. Complex-valued numbers, if someone says it’s complex-valued numbers. Yes, it could have some other range. But if someone wants you to do that they’re obliged to set warning flares around the problem and precede and follow it with flag-bearers. And you get at least double pay for the hazardous work. The function that gets definite-integrated has its own domain and range. The boundaries of the definite integral have to be within the domain of the integrated function.
For real-valued functions this definite integral has a great physical interpretation. A real-valued function means the domain and range are both real numbers. You see a lot of these. Call the function ‘f’, please. Call its independent variable ‘x’ and its dependent variable ‘y’. Using Euclidean coordinates, or as normal people call it “graph paper”, draw the points that make true the equation “y = f(x)”. Then draw in the x-axis, that is, the points where “y = 0”. The boundaries of the definite integral are going to be two values of ‘x’, a lower and an upper bound. Call that lower bound ‘a’ and the upper bound ‘b’. And heck, call that a “left boundary” and a “right boundary”, because … I mean, look at them. Draw the vertical line at “x = a” and the vertical line at “x = b”. If ‘f(x)’ is always a positive number, then there’s a shape bounded below by “y = 0”, on the left by “x = a”, on the right by “x = b”, and above by “y = f(x)”. And the definite integral is the area of that enclosed space. If ‘f(x)’ is sometimes zero, then there’s several segments, but their combined area is the definite integral. If ‘f(x)’ is sometimes below zero, then there’s several segments. The definite integral is the sum of the areas of parts above “y = 0” minus the area of the parts below “y = 0”.
(Why say “left boundary” instead of “lower boundary”? Taste, pretty much. But I look at the words “lower boundary” and think about the lower edge, that is, the line where “y = 0” here. And “upper boundary” makes sense as a way to describe the curve where “y = f(x)” as well as “x = b”. I’m confusing enough without making the simple stuff ambiguous.)
Don’t try to pass your thesis defense on this alone. But it’s what you need to understand ‘e’. Start out with the function ‘f’, which has domain of the positive real numbers and range of the positive real numbers. For every ‘x’ in the domain, ‘f(x)’ is the reciprocal, one divided by x. This is a shape you probably know well. It’s a hyperbola. Its asymptotes are the x-axis and the y-axis. It’s a nice gentle curve. Its plot passes through such famous points as (1, 1), (2, 1/2), (1/3, 3), and pairs like that. (10, 1/10) and (1/100, 100) too. ‘f(x)’ is always positive on this domain. Use as left boundary the line “x = 1”. And then — let’s think about different right boundaries.
If the right boundary is close to the left boundary, then this area is tiny. If it’s at, like, “x = 1.1” then the area can’t be more than 0.1. (It’s less than that. If you don’t see why that’s so, fit a rectangle of height 1 and width 0.1 around this curve and these boundaries. See?) But if the right boundary is farther out, this area is more. It’s getting bigger if the right boundary is “x = 2” or “x = 3”. It can get bigger yet. Give me any positive number you like. I can find a right boundary so the area inside this is bigger than your number.
Is there a right boundary where the area is exactly 1? … Well, it’s hard to see how there couldn’t be. If a quantity (“area between x = 1 and x = b”) changes from less than one to greater than one, it’s got to pass through 1, right? … Yes, it does, provided some technical points are true, and in this case they are. So that’s nice.
And there is. It’s a number (settle down, I see you quivering with excitement back there, waiting for me to unveil this) a slight bit more than 2.718. It’s a neat number. Carry it out a couple more digits and it turns out to be 2.718281828. So it looks like a great candidate to memorize. It’s not. It’s an irrational number. The digits go off without repeating or falling into obvious patterns after that. It’s a transcendental number, which has to do with polynomials. Nobody knows whether it’s a normal number, because remember, a normal number is just any real number that you never heard of. To be a normal number, every finite string of digits has to appear in the decimal expansion, just as often as every other string of digits of the same length. We can show by clever counting arguments that roughly every number is normal. Trick is it’s hard to show that any particular number is.
So let me do another definite integral. Set the left boundary to this “x = 2.718281828(etc)”. Set the right boundary a little more than that. The enclosed area is less than 1. Set the right boundary way off to the right. The enclosed area is more than 1. What right boundary makes the enclosed area ‘1’ again? … Well, that will be at about “x = 7.389”. That is, at the square of 2.718281828(etc).
Repeat this. Set the left boundary at “x = (2.718281828etc)^2”. Where does the right boundary have to be so the enclosed area is 1? … Did you guess “x = (2.718281828etc)^3”? Yeah, of course. You know my rhetorical tricks. What do you want to guess the area is between, oh, “x = (2.718281828etc)^3” and “x = (2.718281828etc)^5”? (Notice I put a ‘5’ in the superscript there.)
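If you’d rather check these boundaries numerically than take my word for it, here is a rough Python sketch. The function names are made up for this example, and the midpoint rule and bisection are just one convenient way to do the work.

```python
import math


def area_under_reciprocal(a, b, pieces=2000):
    """Midpoint-rule estimate of the area under 1/x between x = a and x = b."""
    width = (b - a) / pieces
    return sum(width / (a + (i + 0.5) * width) for i in range(pieces))


def right_boundary_for_area(left, target=1.0):
    """Find the right boundary b so the area under 1/x from left to b is target."""
    lo = left
    hi = 2.0 * left
    while area_under_reciprocal(left, hi) < target:
        hi *= 2.0                      # push the boundary out until area is enough
    for _ in range(40):                # bisection: area only grows as b grows
        mid = (lo + hi) / 2.0
        if area_under_reciprocal(left, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


first = right_boundary_for_area(1.0)      # about 2.718
second = right_boundary_for_area(first)   # about 7.389
print(first, second)
print(area_under_reciprocal(math.e ** 3, math.e ** 5))
```

The last line is the answer to the guessing game: the area between the third and fifth powers comes out very near 2.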
Now, relationships like this will happen with other functions, and with other left- and right-boundaries. But if you want it to work with a function whose rule is as simple as “f(x) = 1 / x”, and areas of 1, then you’re going to end up noticing this 2.718281828(etc). It stands out. It’s worthy of a name.
Which is why this 2.718281828(etc) is a number you’ve heard of. It’s named ‘e’. Leonhard Euler, whom you will remember as having written or proved the fundamental theorem for every area of mathematics ever, gave it that name. He used it first when writing for his own work. Then (in November 1731) in a letter to Christian Goldbach. Finally (in 1736) in his textbook Mechanica. Everyone went along with him because Euler knew how to write about stuff, and how to pick symbols that worked for stuff.
Once you know ‘e’ is there, you start to see it everywhere. In Western mathematics it seems to have been first noticed by Jacob (I) Bernoulli, who noticed it in toy compound interest problems. (Given this, I’d imagine it has to have been noticed by the people who did finance. But I am ignorant of the history of financial calculations. Writers of the kind of pop-mathematics history I read don’t notice them either.) Bernoulli and Pierre Raymond de Montmort noticed the reciprocal of ‘e’ turning up in what we’ve come to call the ‘hat check problem’. A large number of guests all check one hat each. The person checking hats has no idea who anybody is. What is the chance that nobody gets their correct hat back? … That chance is the reciprocal of ‘e’. The number’s about 0.368. In a connected but not identical problem, suppose something has one chance in some number ‘N’ of happening each attempt. And it’s given ‘N’ attempts for it to happen. What’s the chance that it doesn’t happen? The bigger ‘N’ gets, the closer the chance it doesn’t happen gets to the reciprocal of ‘e’.
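Both of these are easy to poke at with a little Python. This is only a sketch: the function names are mine, the hat check is simulated with 20 guests and a fixed random seed rather than worked out exactly, and the second problem assumes the attempts are independent of one another.

```python
import math
import random


def chance_nobody_gets_own_hat(guests, trials=20000, seed=1):
    """Estimate by simulation the chance that no guest gets their own hat."""
    rng = random.Random(seed)
    hats = list(range(guests))
    misses = 0
    for _ in range(trials):
        rng.shuffle(hats)              # hand the hats back at random
        if all(hat != guest for guest, hat in enumerate(hats)):
            misses += 1
    return misses / trials


def chance_never_happens(n):
    """Chance a one-in-n event fails to happen in n independent attempts."""
    return (1.0 - 1.0 / n) ** n


print(chance_nobody_gets_own_hat(20), 1.0 / math.e)
for n in (10, 100, 1000, 10000):
    print(n, chance_never_happens(n))
```

The simulated hat-check chance lands close to 0.368, and the second list closes in on the same number as ‘N’ grows.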
It comes up in peculiar ways. In high school or freshman calculus you see it defined as what you get if you take $\left(1 + \frac{1}{x}\right)^x$ for ever-larger real numbers ‘x’. (This is the toy-compound-interest problem Bernoulli found.) But you can find the number other ways. You can calculate it, if you have the stamina, by working out the value of
$1 + 1 + \frac{1}{1 \cdot 2} + \frac{1}{1 \cdot 2 \cdot 3} + \frac{1}{1 \cdot 2 \cdot 3 \cdot 4} + \cdots$
There’s a simpler way to write that. There always is. Take all the nonnegative whole numbers — 0, 1, 2, 3, 4, and so on. Take their factorials. That’s 1, 1, 2, 6, 24, and so on. Take the reciprocals of all those. That’s … 1, 1, one-half, one-sixth, one-twenty-fourth, and so on. Add them all together. That’s ‘e’.
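Here is a short Python sketch of both recipes, the sum of reciprocal factorials and the compound-interest expression. The function names and the choice of 15 terms and of one million for ‘x’ are mine.

```python
import math


def e_from_factorials(terms):
    """Add up the reciprocals of 0!, 1!, 2!, ... for the given number of terms."""
    total = 0.0
    factorial = 1                      # 0! is 1
    for n in range(terms):
        if n > 0:
            factorial *= n             # now holds n!
        total += 1.0 / factorial
    return total


def e_from_compounding(x):
    """The compound-interest expression (1 + 1/x) raised to the x."""
    return (1.0 + 1.0 / x) ** x


print(e_from_factorials(15), e_from_compounding(1_000_000.0), math.e)
```

The factorial sum is the stamina-friendly one. Fifteen terms already match ‘e’ to about a dozen digits, while the compounding expression with x of a million is good to only five or six.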
This ‘e’ turns up all the time. Any system whose rate of growth depends on its current value has an ‘e’ lurking in its description. That’s true if it declines, too, as long as the decline depends on its current value. It gets stranger. Cross ‘e’ with complex-valued numbers and you get, not just growth or decay, but oscillations. And many problems that are hard to solve to start with become doable, even simple, if you rewrite them as growths and decays and oscillations. Through ‘e’ problems too hard to do become problems of polynomials, or even simpler things.
Simple problems become that too. That property about the area underneath “f(x) = 1/x” between “x = 1” and “x = b” makes ‘e’ such a natural base for logarithms that we call it the base for natural logarithms. Logarithms let us replace multiplication with addition, and division with subtraction, easier work. They change exponentiation problems to multiplication, again easier. It’s a strange touch, a wondrous one.
There are some numbers interesting enough to attract books about them. π, obviously. 0. The base of imaginary numbers, $\imath$, has a couple. I only know one pop-mathematics treatment of ‘e’, Eli Maor’s e: The Story Of A Number. I believe there’s room for more.
Consider $\left(1 + 9^{-4^{7 \cdot 6}}\right)^{3^{2^{85}}}$. You know, the way anyone’s calculator will let you raise 2 to the 85th power. And then raise 3 to whatever number that is. Anyway. The digits of this will agree with the digits of ‘e’ for the first 18,457,734,525,360,901,453,873,570 decimal digits. One Richard Sabey found that, by what means I do not know, in 2004. The page linked there includes a bunch of other, no less amazing, approximations to numbers like ‘e’ and π and the Euler-Mascheroni Constant.
I haven’t got any good ideas for the title for this collection of mathematically-themed comic strips. But I was reading the Complete Peanuts for 1999-2000 and just ran across one where Rerun talked about consoling his basketball by bringing it to a nice warm gymnasium somewhere. So that’s where that pile of words came from.
Mark Anderson’s Andertoons for the 21st is the Mark Anderson’s Andertoons for this installment. It has Wavehead suggest a name for the subtraction of fractions. It’s not by itself an absurd idea. Many mathematical operations get specialized names, even though we see them as specific cases of some more general operation. This may reflect the accidents of history. We have different names for addition and subtraction, though we eventually come to see them as the same operation.
In calculus we get introduced to Maclaurin Series. These are polynomials that approximate more complicated functions. They’re the best possible approximations for a region around 0 in the domain. They’re special cases of the Taylor Series. Those are polynomials that approximate more complicated functions. But you get to pick where in the domain they should be the best approximation. Maclaurin series are nothing but a Taylor series; we keep the names separate anyway, for the reasons. And slightly baffling ones; James Gregory and Brook Taylor studied Taylor series before Colin Maclaurin did Maclaurin series. But at least Taylor worked on Taylor series, and Maclaurin on Maclaurin series. So for a wonder mathematicians named these things for appropriate people. (Ignoring that Indian mathematicians were poking around this territory centuries before the Europeans were. I don’t know whether English mathematicians of the 18th century could be expected to know of Indian work in the field, in fairness.)
In numerical calculus, we have a scheme for approximating integrals known as the trapezoid rule. It approximates the areas under curves by approximating a curve as a trapezoid. (Any questions?) But this is one of the Runge-Kutta methods. Nobody calls it that except to show they know neat stuff about Runge-Kutta methods. The special names serve to pick out particularly interesting or useful cases of a more generally used thing. Wavehead’s coinage probably won’t go anywhere, but it doesn’t hurt to ask.
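For anyone with questions after all, here is a minimal Python sketch of the trapezoid rule. The function name and the test integral, the sine from 0 to π, are my choices. That integral is exactly 2, which makes the approximation easy to judge.

```python
import math


def trapezoid_rule(f, a, b, pieces):
    """Approximate the integral of f from a to b using `pieces` trapezoids."""
    width = (b - a) / pieces
    # The two end heights count once; every interior height is shared by
    # two neighboring trapezoids, so it counts twice (hence the 0.5 here).
    total = 0.5 * (f(a) + f(b))
    for i in range(1, pieces):
        total += f(a + i * width)
    return total * width


print(trapezoid_rule(math.sin, 0.0, math.pi, 100))
```

With a hundred trapezoids the estimate is a hair under 2, since the straight tops sit just below this particular curve.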
Percy Crosby’s Skippy for the 22nd I admit I don’t quite understand. It mentions arithmetic anyway. I think it’s a joke about a textbook like this being good only if it’s got the questions and the answers. But it’s the rare Skippy that’s as baffling to me as most circa-1930 humor comics are.
Ham’s Life on Earth for the 23rd presents the blackboard full of symbols as an attempt to prove something challenging. In this case, to say something about the existence of God. It’s tempting to suppose that we could say something about the existence or nonexistence of God using nothing but logic. And there are mathematics fields that are very close to pure logic. But our scary friends in the philosophy department have been working on the ontological argument for a long while. They’ve found a lot of arguments that seem good, and that fall short for reasons that seem good. I’ll defer to their experience, and suppose that any mathematics-based proof would have the same problems.
Bill Amend’s FoxTrot Classics for the 23rd deploys a Maclaurin series. If you want to calculate the cosine of an angle, and you know the angle in radians, you can find the value by adding up the terms in an infinitely long series. So if θ is the angle, measured in radians, then its cosine will be:
$$\cos\theta = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \cdots$$
60 degrees is $\frac{\pi}{3}$ in radians and you see from the comic how to turn this series into a thing to calculate. The series does, yes, go on forever. But since the terms alternate in sign (positive then negative then positive then negative) you have a break. Suppose all you want is the answer to within an error margin. Then you can stop adding up terms once you’ve gotten to a term that’s smaller than your error margin. So if you want the answer to within, say, 0.001, you can stop as soon as you find a term with absolute value less than 0.001.
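That stopping rule is easy to try out. What follows is a sketch, with the function name `cos_maclaurin` and the default tolerance being my own choices. It builds each term from the one before and quits at the first term smaller than the error margin. It is meant for modest angles like the ones in high school trig; for large angles the terms grow before they shrink and rounding error creeps in.

```python
import math

def cos_maclaurin(theta, tol=1e-3):
    """Sum the Maclaurin series for cosine, stopping at the first term below tol."""
    total = 0.0
    term = 1.0          # the first term of the series
    n = 0
    while abs(term) >= tol:
        total += term
        n += 1
        # Next term: flip the sign, multiply by theta^2, divide by (2n-1)(2n).
        term *= -theta * theta / ((2 * n - 1) * (2 * n))
    return total

angle = math.pi / 3     # 60 degrees
print(cos_maclaurin(angle), math.cos(angle))
```

For 60 degrees and a margin of 0.001 this needs only four terms before the next one is too small to matter.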
For high school trig, though, this is all overkill. There’s five really interesting angles you’d be expected to know anything about. They’re 0, 30, 45, 60, and 90 degrees. And you need to know about reflections of those across the horizontal and vertical axes. Those give you, like, -30 degrees or 135 degrees. Those reflections don’t change the magnitude of the cosines or sines. They might change the plus-or-minus sign is all. And there’s only three pairs of numbers that turn up for these five interesting angles. There’s 0 and 1. There’s $\frac{1}{2}$ and $\frac{\sqrt{3}}{2}$. There’s $\frac{1}{\sqrt{2}}$ and $\frac{1}{\sqrt{2}}$. Three things to memorize, plus a bit of orienteering, to know whether the cosine or the sine should be the larger size and whether they should be positive or negative. And then you’ve got them all.
You might get asked for, like, the sine of 15 degrees. But that’s someone testing whether you know the angle-addition or angle-subtraction formulas. Or the half-angle and double-angle formulas. Nobody would expect you to know the cosine of 15 degrees. The cosine of 30 degrees, though? Sure. It’s $\frac{\sqrt{3}}{2}$.
Mike Thompson’s Grand Avenue for the 23rd is your basic confused-student joke. People often have trouble going from percentages to decimals to fractions and back again. Me, I have trouble in going from percentage chances to odds, as in, “two to one odds” or something like that. (Well, “one to one odds” I feel confident in, and “two to one” also. But, say, “seven to five odds” I can’t feel sure I understand, other than that the second choice is perceived to be a bit more likely than the first.)
… You know, this would have parsed as the Maclaurin Series Edition, wouldn’t it? Well, if only I were able to throw away words I’ve already written and replace them with better words before publishing, huh?
I hate to disillusion anyone but I lack hard rules about what qualifies as a mathematically-themed comic strip. During a slow week, more marginal stuff makes it. This past week was going slow enough that I tagged Wednesday’s Quincy rerun, from March of 1979 for possible inclusion. And all it does is mention that Quincy’s got a mathematics test due. Fortunately for me the week picked up a little. It cheats me of an excuse to point out Ted Shearer’s art style to people, but that’s not really my blog’s business.
Also it may not surprise you but since I’ve decided I need to include GoComics images I’ve gotten more restrictive. Somehow the bit of work it takes to think of a caption and to describe the text and images of a comic strip feels like that much extra work.
Roy Schneider’s The Humble Stumble for the 13th of May is a logic/geometry puzzle. Is it relevant enough for here? Well, I spent some time working it out. And some time wondering about implicit instructions. Like, if the challenge is to have exactly four equally-sized boxes after two toothpicks are moved, can we have extra stuff? Can we put a toothpick where it’s just a stray edge, part of no particular shape? I can’t speak to how long you stay interested in this sort of puzzle. But you can have some good fun rules-lawyering it.
Jeff Harris’s Shortcuts for the 13th is a children’s informational feature about Aristotle. Aristotle is renowned for his mathematical accomplishments by many people who’ve got him mixed up with Archimedes. Aristotle it’s harder to say much about. He did write great texts that pop-science writers credit as giving us the great ideas about nature and physics and chemistry that the Enlightenment was able to correct in only about 175 years of trying. His mathematics is harder to summarize though. We can say certainly that he knew some mathematics. And that he encouraged thinking of subjects as built on logical deductions from axioms and definitions. So there is that influence.
Dan Thompson’s Brevity for the 15th is a pun, built on the bell curve. This is also known as the Gaussian distribution or the normal distribution. It turns up everywhere. If you plot how likely a particular value is to turn up, you get a shape that looks like a slightly melted bell. In principle the bell curve stretches out infinitely far. In practice, the curve turns into a horizontal line so close to zero you can’t see the difference once you’re not-too-far away from the peak.
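To see how fast the curve hugs zero, here is a small sketch that evaluates the standard bell curve at whole numbers of standard deviations from the peak. The function name `normal_pdf` and its defaults (mean 0, standard deviation 1) are assumptions made for this illustration.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Height of the normal (Gaussian) density at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Heights at 0, 1, 2, ... 6 standard deviations away from the peak.
for x in range(7):
    print(x, normal_pdf(x))
```

By six standard deviations out the height is down in the billionths, which is why a drawing shows it as a flat line.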
Jason Chatfield’s Ginger Meggs for the 16th I assume takes place in a mathematics class. I’m assuming the question is adding together four two-digit numbers. But “what are 26, 24, 33, and 32” seems like it should be open to other interpretations. Perhaps Mr Canehard was asking for some class of numbers those all fit into. Integers, obviously. Counting numbers. Composite numbers rather than primes. I keep wanting to say there’s something deeper, like they’re all multiples of three (or something) but they aren’t. They haven’t got any factors other than 1 in common. I mention this because I’d love to figure out what interesting commonality those numbers have and which I’m overlooking.
Ed Stein’s Freshly Squeezed for the 17th is a story problem strip. Bit of a passive-aggressive one, in-universe. But I understand why it would be formed like that. The problem’s incomplete, as stated. There could be some fun in figuring out what extra bits of information one would need to give an answer. This is another new-tagged comic.
Henry Scarpelli and Craig Boldman’s Archie for the 19th name-drops calculus, credibly, as something high schoolers would be amazed to see one of their own do in their heads. There’s not anything on the blackboard that’s iconically calculus, it happens. Dilton’s writing out a polynomial, more or less, and that’s a fit subject for high school calculus. They’re good examples on which to learn differentiation and integration. They’re a little more complicated than straight lines, but not too weird or abstract. And they follow nice, easy-to-summarize rules. But they turn up in high school algebra too, and can fit into geometry easily. Or any subject, really, as remember, everything is polynomials.
Mark Anderson’s Andertoons for the 19th is Mark Anderson’s Andertoons for the week. Glad that it’s there. Let me explain why it is proper construction of a joke that a Fibonacci Division might be represented with a spiral. Fibonacci’s the name we give to Leonardo of Pisa, who lived in the first half of the 13th century. He’s most important for explaining to the western world why these Hindu-Arabic numerals were worth learning. But his pop-cultural presence owes to the Fibonacci Sequence, the sequence of numbers 1, 1, 2, 3, 5, 8, and so on. Each number’s the sum of the two before it. And this connects to the Golden Ratio, one of pop mathematics’ most popular humbugs. As the terms get bigger and bigger, the ratio between a term and the one before it gets really close to the Golden Ratio, a bit over 1.618.
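The claim about the ratios is easy to watch happen. This is a minimal sketch; the helper name `fibonacci` and the choice of twenty terms are mine.

```python
def fibonacci(count):
    """Return the first `count` Fibonacci numbers, starting 1, 1."""
    terms = [1, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])   # each term is the sum of the two before
    return terms[:count]

terms = fibonacci(20)
# Ratio of each term to the one before it.
ratios = [terms[k] / terms[k - 1] for k in range(1, len(terms))]
golden = (1 + 5 ** 0.5) / 2

for ratio in ratios:
    print(ratio, ratio - golden)
```

The differences bounce between positive and negative while shrinking, so the ratios close in on the Golden Ratio from both sides.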
So. Draw a quarter-circle that connects the opposite corners of a 1×1 square. Connect that to a quarter-circle that connects opposite corners of a 2×2 square. Connect that to a quarter-circle connecting opposite corners of a 3×3 square. And a 5×5 square, and an 8×8 square, and a 13×13 square, and a 21×21 square, and so on. Yes, there are ambiguities in the way I’ve described this. I’ve tried explaining how to do things just right. It makes a heap of boring words and I’m trying to reduce how many of those I write. But if you do it the way I want, guess what shape you have?
And that is why this is a correctly-formed joke about the Fibonacci Division.
This one I saw through John Allen Paulos’s twitter feed. He points out that it’s like the Collatz conjecture but is, in fact, proven. If you try this yourself don’t make the mistake of giving up too soon. You might figure, like start with 12. Sum the squares of its digits and you get 5, which is neither 1 nor anything in that 4-16-37-58-89-145-42-20 cycle. Not so! Square 5 and you get 25. Square those digits and add them and you get 29. Square those digits and add them and you get 85. And what comes next?
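If you would rather let a computer not give up too soon, here is a sketch. The names `digit_square_sum` and `trajectory` are my own; the cycle is the one quoted above.

```python
CYCLE = {4, 16, 37, 58, 89, 145, 42, 20}

def digit_square_sum(n):
    """Sum of the squares of the decimal digits of n."""
    return sum(int(digit) ** 2 for digit in str(n))

def trajectory(n):
    """Follow n until it reaches 1 or lands somewhere in the cycle."""
    path = [n]
    while n != 1 and n not in CYCLE:
        n = digit_square_sum(n)
        path.append(n)
    return path

print(trajectory(12))  # [12, 5, 25, 29, 85, 89]
print(trajectory(7))   # [7, 49, 97, 130, 10, 1]
```

Every positive starting number ends one of those two ways, which is the proven result being described.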
This is about a proof of Fermat’s Theorem of Sums of Two Squares. According to it, an odd prime number (let’s reach deep into the alphabet and call it p) can be written as the sum of two squares if and only if p is one more than a whole multiple of four. It’s a proof by using fixed point methods. This is a fun kind of proof, at least to my sense of fun. It’s an approach that’s got a clear physical interpretation. Imagine picking up a (thin) patch of bread dough, stretching it out some and maybe rotating it, and then dropping it back on the board. There’s at least one bit of dough that’s landed in the same spot it was before. Once you see this you will never be able to just roll out dough the same way. So here the proof involves setting up an operation on integers which has a fixed point, and showing that the fixed point makes the property true.
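This is not the fixed-point proof, just a brute-force check of what the theorem claims for small primes. The helper names `is_prime` and `two_squares` are assumptions of this sketch.

```python
import math

def is_prime(n):
    """Trial-division primality test, fine for small n."""
    if n < 2:
        return False
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

def two_squares(p):
    """Return (a, b) with a*a + b*b == p and 1 <= a <= b, or None if there is none."""
    for a in range(1, math.isqrt(p) + 1):
        rest = p - a * a
        b = math.isqrt(rest)
        if b * b == rest and a <= b:
            return (a, b)
    return None

for p in range(3, 60):
    if is_prime(p):
        print(p, p % 4, two_squares(p))
```

The primes that leave a remainder of 1 on division by four get a pair of squares; the ones that leave a remainder of 3 get `None`.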
John D Cook, who runs a half-dozen or so mathematics-fact-of-the-day Twitter feeds, looks into calculating the volume of an egg. It involves calculus, as finding the volume of many interesting shapes does. I am surprised to learn the volume can be written out as a formula that depends on the shape of the egg. I would have bet that it couldn’t be expressed in “closed form”. This is a slightly flexible term. It’s meant to mean the thing can be written using only normal, familiar functions. However, we pretend that the inverse hyperbolic tangent is a “normal, familiar” function.
For example, there’s the surface area of an egg. This can be worked out too, again using calculus. It can’t be written even with the inverse hyperbolic cotangent, so good luck. You have to get into numerical integration if you want an answer humans can understand.
Comic Strip Master Command spent most of February making sure I could barely keep up. It didn’t slow down the final week of the month either. Some of the comics were those that I know are in eternal reruns. I don’t think I’m repeating things I’ve already discussed here, but it is so hard to be sure.
Bill Amend’s FoxTrot for the 24th of February has a mathematics problem with a joke answer. The approach to finding the area’s exactly right. It’s easy to find areas of simple shapes like rectangles and triangles and circles and half-circles. Cutting a complicated shape into known shapes, finding those areas, and adding them together works quite well, most of the time. And that’s intuitive enough. There are other approaches. If you can describe the outline of a shape well, you can use an integral along that outline to get the enclosed area. And that amazes me even now. One of the wonders of calculus is that you can swap information about a boundary for information about the interior, and vice-versa. It’s a bit much for even Jason Fox, though.
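For a shape with straight sides the boundary trick needs no integral signs at all. The sketch below uses what is commonly called the shoelace formula, which is the polygon version of getting an area from the outline alone. The function name `polygon_area` and the sample L-shaped figure are my own, not the shape in the strip.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its corner points, listed in order around the outline."""
    total = 0.0
    count = len(vertices)
    for k in range(count):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % count]   # wrap around to close the outline
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

# An L-shaped figure: a 2x2 square with a 1x1 corner removed, so the area is 3.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
print(polygon_area(l_shape))
```

Cutting the same figure into a 2×1 rectangle and a 1×1 square and adding the areas gives the same answer, as it should.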
Jef Mallett’s Frazz for the 25th is a dispute between Mrs Olsen and Caulfield about whether it’s possible to give more than 100 percent. I come down, now as always, on the side that argues it depends what you figure 100 percent is of. If you mean “100% of the effort it’s humanly possible to expend” then yes, there’s no making more than 100% of an effort. But there is an amount of effort reasonable to expect for, say, an in-class quiz. It’s far below the effort one could possibly humanly give. And one could certainly give 105% of that effort, if desired. This happens in the real world, of course. Famously, in the right circles, the Space Shuttle Main Engines normally reached 104% of full throttle during liftoff. That’s because the original specifications for what full throttle would be turned out to be lower than was ultimately needed. And it was easier to plan around running the engines at greater-than-100%-throttle than it was to change all the earlier design documents.
Matt Janz’s Out of the Gene Pool rerun for the 25th tosses off a mention of “New Math”. It’s referenced as a subject that’s both very powerful but also impossible for Pop, as an adult, to understand. It’s an interesting denotation. Usually “New Math”, if it’s mentioned at all, is held up as a pointlessly complicated way of doing simple problems. This is, yes, the niche that “Common Core” has taken. But Janz’s strip might be old enough to predate people blaming everything on Common Core. And it might be character, that the father is old enough to have heard of New Math but not anything in the nearly half-century since. It’s an unusual mention in that “New” Math is credited as being good for things. (I’m aware this strip’s a rerun. I had thought I’d mentioned it in an earlier Reading the Comics post, but can’t find it. I am surprised.)
So, I must confess failure. Not about deciphering Józef Maria Hoëne-Wronski’s attempted definition of π. He’d tried this crazy method throwing a lot of infinities and roots of infinities and imaginary numbers together. I believe I translated it into the language of modern mathematics fairly. And my failure is not that I found the formula actually described the number -½π.
Oh, I had an error in there, yes. And I’d found where it was. It was all the way back in the essay which first converted Wronski’s formula into something respectable. It was a small error, first appearing in the last formula of that essay and never corrected from there. This reinforces my suspicion that when normal people see formulas they mostly look at them to confirm there is a formula there. With luck they carry on and read the sentences around them.
My failure is I wanted to write a bit about boring mistakes. The kinds which you make all the time while doing mathematics work, but which you don’t worry about. Dropped signs. Constants which aren’t divided out, or which get multiplied in incorrectly. Stuff like this which you only detect because you know, deep down, that you should have gotten to an attractive simple formula and you haven’t. Mistakes which are tiresome to make, but never make you wonder if you’re in the wrong job.
The trouble is I can’t think of how to make an essay of that. We don’t tend to rate little mistakes like the wrong sign or the wrong multiple or a boring unnecessary added constant as important. This is because they’re not. The interesting stuff in a mathematical formula is usually the stuff representing variations. Change is interesting. The direction of the change? Eh, nice to know. A swapped plus or minus sign alters your understanding of the direction of the change, but that’s all. Multiplying or dividing by a constant wrongly changes your understanding of the size of the change. But that doesn’t alter what the change looks like. Just the scale of the change. Adding or subtracting the wrong constant alters what you think the change is varying from, but not what the shape of the change is. Once more, not a big deal.
But you also know that instinctively, or at least you get it from seeing how it’s worth one or two points on an exam to write -sin where you mean +sin. Or how if you ask the instructor in class about that 2 where a ½ should be, she’ll say, “Oh, yeah, you’re right” and do a hurried bit of erasing before going on.
Thus my failure: I don’t know what to say about boring mistakes that has any insight.
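For the record here’s where I got things wrong. I was creating a function, named ‘f’ and using as a variable ‘x’, to represent Wronski’s formula. I’d gotten to this point:
$$f(x) = -4 i x 2^{\frac{1}{2}\cdot\frac{1}{x}} \left\{ e^{i\frac{\pi}{4}\cdot\frac{1}{x}} - e^{-i\frac{\pi}{4}\cdot\frac{1}{x}} \right\}$$
And then I observed how the stuff in curly braces there is “one of those magic tricks that mathematicians know because they see it all the time”. And I wanted to call in this formula, correctly:
$$\sin(\phi) = \frac{e^{i\phi} - e^{-i\phi}}{2i}$$
So here’s where I went wrong. I took the $-4i$ way off in the front of that first formula and combined it with the stuff in braces to make 2 times a sine of some stuff. I apologize for this. I must have been writing stuff out faster than I was thinking about it. If I had thought, I would have gone through this intermediate step:
$$f(x) = -4 i x 2^{\frac{1}{2}\cdot\frac{1}{x}} \cdot 2i \cdot \left\{ \frac{e^{i\frac{\pi}{4}\cdot\frac{1}{x}} - e^{-i\frac{\pi}{4}\cdot\frac{1}{x}}}{2i} \right\}$$
Because with that form in mind, it’s easy to take the stuff in curled braces and the $2i$ in the denominator. From that we get, correctly, $\sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$. And then the $-4i$ on the far left of that expression and the $2i$ on the right multiply together to produce the number 8.
So the function ought to have been, all along:
$$f(x) = 8 x 2^{\frac{1}{2}\cdot\frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$$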
Not very different, is it? Ah, but it makes a huge difference. Carry through with all the L’Hôpital’s Rule stuff described in previous essays. All the complicated formula work is the same. There’s a different number hanging off the front, waiting to multiply in. That’s all. And what you find, redoing all the work but using this corrected function, is that Wronski’s original mess,
$$\frac{4\infty}{\sqrt{-1}}\left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
comes out to 2π rather than π.
Possibly the book I drew this from misquoted Wronski. It’s at least as good to have a formula for 2π as it is to have one for π. Or Wronski had a mistake in his original formula, and had a constant multiplied out front which he didn’t want. It happens to us all.
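A numerical check makes the size of the slip plain. This is only a sketch, and it rests on assumptions I should name: that the corrected function is 8 x 2^(1/(2x)) sin(π/(4x)), that the mistaken one had -2 out front in place of the 8, and that the starting point was -4i x {(1+i)^(1/x) - (1-i)^(1/x)}. The function names are mine.

```python
import math

def f_corrected(x):
    """Assumed corrected form: 8 x 2^(1/(2x)) sin(pi/(4x))."""
    return 8.0 * x * 2.0 ** (1.0 / (2.0 * x)) * math.sin(math.pi / (4.0 * x))

def f_mistaken(x):
    """Assumed mistaken form: the same thing with -2 out front."""
    return -2.0 * x * 2.0 ** (1.0 / (2.0 * x)) * math.sin(math.pi / (4.0 * x))

def f_complex(x):
    """Assumed starting point, evaluated with complex arithmetic."""
    return -4j * x * ((1 + 1j) ** (1.0 / x) - (1 - 1j) ** (1.0 / x))

for x in (10.0, 1000.0, 1000000.0):
    print(x, f_corrected(x), f_mistaken(x), f_complex(x))

print(2 * math.pi, -math.pi / 2)
```

For large ‘x’ the corrected form and the complex-number form agree and sit near 2π, while the mistaken form sits near -½π.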
Józef Maria Hoëne-Wronski had an idea for a new, universal, culturally-independent definition of π. It was this formula that nobody went along with because they had looked at it:
$$\pi = \frac{4\infty}{\sqrt{-1}}\left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
I made some guesses about what he would want this to mean. And how we might put that in terms of modern, conventional mathematics. I describe those in the above links. In terms of limits of functions, I got this:
$$\lim_{x \to \infty} f(x) \quad \text{where} \quad f(x) = -2 x 2^{\frac{1}{2}\cdot\frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$$
The trouble is that limit took more work than I wanted to do to evaluate. If you try evaluating that ‘f(x)’ at ∞, you get an expression that looks like zero times ∞. This begs for the use of L’Hôpital’s Rule, which tells you how to find the limit for something that looks like zero divided by zero, or like ∞ divided by ∞. Do a little rewriting (replacing that first ‘x’ with ‘$\frac{1}{\frac{1}{x}}$’) and this ‘f(x)’ behaves like L’Hôpital’s Rule needs.
The trouble is, that’s a pain to evaluate. L’Hôpital’s Rule works on functions that look like one function divided by another function. It does this by calculating the derivative of the numerator function divided by the derivative of the denominator function. And I decided that was more work than I wanted to do.
Where trouble comes up is all those parts where $\frac{1}{x}$ turns up. The derivatives of functions with a lot of $\frac{1}{x}$ terms in them get more complicated than the original functions were. Is there a way to get rid of some or all of those?
And there is. Do a change of variables. Let me summon the variable ‘y’, whose value is exactly $\frac{1}{x}$. And then I’ll define a new function, ‘g(y)’, whose value is whatever ‘f’ would be at $\frac{1}{y}$. That is, and this is just a little bit of algebra:
$$g(y) = f\left(\frac{1}{y}\right) = -2 \cdot \frac{2^{\frac{1}{2} y} \sin\left(\frac{\pi}{4} y\right)}{y}$$
The limit of ‘f(x)’ for ‘x’ at ∞ should be the same number as the limit of ‘g(y)’ for ‘y’ at … you’d really like it to be zero. If ‘x’ is incredibly huge, then $\frac{1}{x}$ has to be incredibly small. But we can’t just swap the limit of ‘x’ at ∞ for the limit of ‘y’ at 0. The limit of a function at a point reflects the value of the function at a neighborhood around that point. If the point’s 0, this includes positive and negative numbers. But looking for the limit at ∞ gets at only positive numbers. You see the difference?
… For this particular problem it doesn’t matter. But it might. Mathematicians handle this by taking a “one-sided limit”, or a “directional limit”. The normal limit at 0 of ‘g(y)’ is based on what ‘g(y)’ looks like in a neighborhood of 0, positive and negative numbers. In the one-sided limit, we just look at a neighborhood of 0 that’s all values greater than 0, or less than 0. In this case, I want the neighborhood that’s all values greater than 0. And we write that by adding a little + in superscript to the limit. For the other side, the neighborhood less than 0, we add a little – in superscript. So I want to evaluate:
$$\lim_{y \to 0^+} g(y)$$
Limits and L’Hôpital’s Rule and stuff work for one-sided limits the way they do for regular limits. So there’s that mercy. The first attempt at this limit, seeing what ‘g(y)’ is if ‘y’ happens to be 0, gives $\frac{0}{0}$. A zero divided by a zero is promising. That’s not defined, no, but it’s exactly the format that L’Hôpital’s Rule likes. The numerator is:
And the denominator is:
The first derivative of the denominator is blessedly easy: the derivative of y, with respect to y, is 1. The derivative of the numerator is a little harder. It demands the use of the Product Rule and the Chain Rule, just as last time. But these chains are easier.
The first derivative of the numerator is going to be:
Yeah, this is the simpler version of the thing I was trying to figure out last time. Because this is what’s left if I write the derivative of the numerator over the derivative of the denominator:
And now this is easy. Promise. There’s no expressions of ‘y’ divided by other expressions of ‘y’ or anything else tricky like that. There’s just a bunch of ordinary functions, all of them defined for when ‘y’ is zero. If this limit exists, it’s got to be equal to:
is 0. And the sine of 0 is 0. The cosine of 0 is 1. So all this gets to be a lot simpler, really fast.
And $2^0$ is equal to 1. So the part to the left of the + sign there is all zero. What remains is:
And so, finally, we have it. Wronski’s formula, as best I make it out, is a function whose value is …
… So, what Wronski had been looking for, originally, was π. This is … oh, so very close to right. I mean, there’s π right there, it’s just multiplied by an unwanted -½. The question is, where’s the mistake? Was Wronski wrong to start with? Did I parse him wrongly? Is it possible that the book I copied Wronski’s formula from made a mistake?
Could be any of them. I’d particularly suspect I parsed him wrongly. I returned the library book I had got the original claim from, and I can’t find it again before this is set to publish. But I should check whether Wronski was thinking to find π, the ratio of the circumference to the diameter of a circle. Or might he have looked to find the ratio of the circumference to the radius of a circle? Either is an interesting number worth finding. We’ve settled on the circumference-over-diameter as valuable, likely for practical reasons. It’s much easier to measure the diameter than the radius of a thing. (Yes, I have read the Tau Manifesto. No, I am not impressed by it.) But if you know 2π, then you know π, or vice-versa.
The next question: yeah, but I turned up -½π. What am I talking about 2π for? And the answer there is, I’m not the first person to try working out Wronski’s stuff. You can try putting the expression, as best you parse it, into a tool like Mathematica and see what makes sense. Or you can read, for example, Quora commenters giving answers with way less exposition than I do. And I’m convinced: somewhere along the line I messed up. Not in an important way, but, essentially, doing something equivalent to dividing by -2 when I should have multiplied by that.
I’ve spotted my mistake. I figure to come back around to explaining where it is and how I made it.
So now a bit more on Józef Maria Hoëne-Wronski’s attempted definition of π. I had got it rewritten to this form:
$$\lim_{x \to \infty} f(x) \quad \text{where} \quad f(x) = -2 x 2^{\frac{1}{2}\cdot\frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$$
And I’d tried the first thing mathematicians do when trying to evaluate the limit of a function at a point. That is, take the value of that point and put it in whatever the formula is. If that formula evaluates to something meaningful, then that value is the limit. That attempt gave this:
$$-2 \cdot \infty \cdot 1 \cdot 0$$
Because the limit of ‘x’, for ‘x’ at ∞, is infinitely large. The limit of ‘$2^{\frac{1}{2}\cdot\frac{1}{x}}$’ for ‘x’ at ∞ is 1. The limit of ‘$\sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$’ for ‘x’ at ∞ is 0. We can take limits that are 0, or limits that are some finite number, or limits that are infinitely large. But multiplying a zero times an infinity is dangerous. Could be anything.
Mathematicians have a tool. We know it as L’Hôpital’s Rule. It’s named for the French mathematician Guillaume de l’Hôpital, who discovered it in the works of his tutor, Johann Bernoulli. (They had a contract giving l’Hôpital publication rights. If Wikipedia’s right the preface of the book credited Bernoulli, although it doesn’t appear to be specifically for this. The full story is more complicated and ambiguous. The previous sentence may be said about most things.)
So here’s the first trick. Suppose you’re finding the limit of something that you can write as the quotient of one function divided by another. So, something that looks like this:
$$\lim_{x \to a} \frac{h(x)}{g(x)}$$
(Normally, this gets presented as ‘f(x)’ divided by ‘g(x)’. But I’m already using ‘f(x)’ for another function and I don’t want to muddle what that means.)
Suppose it turns out that at ‘a’, both ‘h(x)’ and ‘g(x)’ are zero, or both ‘h(x)’ and ‘g(x)’ are ∞. Zero divided by zero, or ∞ divided by ∞, looks like danger. It’s not necessarily so, though. If this limit exists, then we can find it by taking the first derivatives of ‘h’ and ‘g’, and evaluating:
$$\lim_{x \to a} \frac{h'(x)}{g'(x)}$$
That ‘ mark is a common shorthand for “the first derivative of this function, with respect to the only variable we have around here”.
This doesn’t look like it should help matters. Often it does, though. There’s an excellent chance that either ‘h'(x)’ or ‘g'(x)’ — or both — aren’t simultaneously zero, or ∞, at ‘a’. And once that’s so, we’ve got a meaningful limit. This doesn’t always work. Sometimes we have to use this l’Hôpital’s Rule trick a second time, or a third or so on. But it works so very often for the kinds of problems we like to do. Reaches the point that if it doesn’t work, we have to suspect we’re calculating the wrong thing.
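Here is a small numerical sketch of the rule at work. The example, the sine of ‘x’ divided by ‘x’ at 0, is my own choice and not Wronski’s formula, and the helper names `derivative` and `lhopital_estimate` are made up for this illustration. The derivatives are estimated with a central difference rather than worked out symbolically.

```python
import math

def derivative(func, a, step=1e-6):
    """Central-difference estimate of the first derivative of func at a."""
    return (func(a + step) - func(a - step)) / (2.0 * step)

def lhopital_estimate(h, g, a):
    """Estimate the limit of h(x)/g(x) at a as h'(a)/g'(a)."""
    return derivative(h, a) / derivative(g, a)

h = math.sin            # numerator, zero at 0

def g(x):               # denominator, also zero at 0
    return x

# Plugging in 0 directly would give 0/0. The derivatives give cos(0)/1.
print(lhopital_estimate(h, g, 0.0))

# Compare with the original quotient evaluated very near 0.
print(h(1e-4) / g(1e-4))
```

Both numbers come out very close to 1, which is the limit.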
But wait, you protest, reasonably. This is fine for problems where the limit looks like 0 divided by 0, or ∞ divided by ∞. What Wronski’s formula got me was 0 times 1 times ∞. And I won’t lie: I’m a little unsettled by having that 1 there. I feel like multiplying by 1 shouldn’t be a problem, but I have doubts.
That zero times ∞ thing, though? That’s easy. Here’s the second trick. Let me put it this way: isn’t ‘x’ really the same thing as $\frac{1}{\frac{1}{x}}$?
I expect your answer is to slam your hand down on the table and glare at my writing with contempt. So be it. I told you it was a trick.
And it’s a perfectly good one. And it’s perfectly legitimate, too. $\frac{1}{x}$ is a meaningful number if ‘x’ is any finite number other than zero. So is $\frac{1}{\frac{1}{x}}$. Mathematicians accept a definition of limit that doesn’t really depend on the value of your expression at a point. So that $\frac{1}{x}$ wouldn’t be meaningful for ‘x’ at zero doesn’t mean we can’t evaluate its limit for ‘x’ at zero. And just because we might not be sure what $\frac{1}{x}$ would mean for infinitely large ‘x’ doesn’t mean we can’t evaluate its limit for ‘x’ at ∞.
I see you, person who figures you’ve caught me. The first thing I tried was putting in the value of ‘x’ at the ∞, all ready to declare that this was the limit of ‘f(x)’. I know my caveats, though. Plugging in the value you want the limit at into the function whose limit you’re evaluating is a shortcut. If you get something meaningful, then that’s the same answer you would get finding the limit properly. Which is done by looking at the neighborhood around but not at that point. So that’s why this reciprocal-of-the-reciprocal trick works.
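So back to my function, which looks like this:
$$f(x) = -2 x 2^{\frac{1}{2}\cdot\frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$$
Do I want to replace ‘x’ with $\frac{1}{\frac{1}{x}}$, or do I want to replace $\sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$ with $\frac{1}{\frac{1}{\sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)}}$? I was going to say something about how many times in my life I’ve been glad to take the reciprocal of the sine of an expression of x. But just writing the symbols out like that makes the case better than being witty would.
So here is a new, L’Hôpital’s Rule-friendly, version of my version of Wronski’s formula:
$$f(x) = -2 \cdot \frac{2^{\frac{1}{2}\cdot\frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)}{\frac{1}{x}}$$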
I put that -2 out in front because it’s not really important. The limit of a constant number times some function is the same as that constant number times the limit of that function. We can put that off to the side, work on other stuff, and hope that we remember to bring it back in later. I manage to remember it about four-fifths of the time.
So these are the numerator and denominator functions I was calling ‘h(x)’ and ‘g(x)’ before:
$$h(x) = 2^{\frac{1}{2}\cdot\frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right) \qquad g(x) = \frac{1}{x}$$
The limit of both of these at ∞ is 0, just as we might hope. So we take the first derivatives. That for ‘g(x)’ is easy. Anyone who’s reached week three in Intro Calculus can do it. This may only be because she’s gotten bored and leafed through the formulas on the inside front cover of the textbook. But she can do it. It’s:
$$g'(x) = -\frac{1}{x^2}$$
When I last looked at Józef Maria Hoëne-Wronski’s attempted definition of π I had gotten it to this. Take the function:
$$f(x) = -2 x 2^{\frac{1}{2}\cdot\frac{1}{x}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{x}\right)$$
And find its limit when ‘x’ is ∞. Formally, you want to do this by proving there’s some number, let’s say ‘L’. And ‘L’ has the property that you can pick any margin-of-error number ε that’s bigger than zero. And whatever that ε is, there’s some number ‘N’ so that whenever ‘x’ is bigger than ‘N’, ‘f(x)’ is larger than ‘L – ε’ and also smaller than ‘L + ε’. This can be a lot of mucking about with expressions to prove.
Fortunately we have shortcuts. There’s work we can do that gets us ‘L’, and we can rely on other proofs that show that this must be the limit of ‘f(x)’ at some value ‘a’. I use ‘a’ because that doesn’t commit me to talking about ∞ or any other particular value. The first approach is to just evaluate ‘f(a)’. If you get something meaningful, great! We’re done. That’s the limit of ‘f(x)’ at ‘a’. This approach is called “substitution” (you’re substituting ‘a’ for ‘x’ in the expression of ‘f(x)’) and it’s great. Except that if your problem’s interesting then substitution won’t work. Still, maybe Wronski’s formula turns out to be lucky. Fit in ∞ where ‘x’ appears and we get:
$$f(\infty) = -2 \cdot \infty \cdot 2^{\frac{1}{2}\cdot\frac{1}{\infty}} \sin\left(\frac{\pi}{4}\cdot\frac{1}{\infty}\right)$$
So … all right. Not quite there yet. But we can get there. For example, $\frac{1}{\infty}$ has to be … well. It’s what you would expect if you were a kid and not worried about rigor: 0. We can make it rigorous if you like. (It goes like this: Pick any ε larger than 0. Then whenever ‘x’ is larger than $\frac{1}{\varepsilon}$ then $\frac{1}{x}$ is less than ε. So the limit of $\frac{1}{x}$ at ∞ has to be 0.) So let’s run with this: replace all those $\frac{1}{\infty}$ expressions with 0. Then we’ve got:
$$f(\infty) = -2 \cdot \infty \cdot 2^{0} \sin(0)$$
The sine of 0 is 0. $2^0$ is 1. So substitution tells us the limit is -2 times ∞ times 1 times 0. That there’s an ∞ in there isn’t a problem. A limit can be infinitely large. Think of the limit of ‘$x^2$’ at ∞. An infinitely large thing times an infinitely large thing is fine. The limit of ‘$x e^x$’ at ∞ is infinitely large. A zero times a zero is fine; that’s zero again. But having an ∞ times a 0? That’s trouble. ∞ times something should be huge; anything times zero should be 0; which term wins?
So we have to fall back on alternate plans. Fortunately there’s a tool we have for limits when we’d otherwise have to face an infinitely large thing times a zero.
I hope to write about this next time. I apologize for not getting through it today but time wouldn’t let me.
I remain fascinated with Józef Maria Hoëne-Wronski’s attempted definition of π. It had started out like this:
$$\pi = \frac{4\infty}{\sqrt{-1}}\left\{ \left(1 + \sqrt{-1}\right)^{\frac{1}{\infty}} - \left(1 - \sqrt{-1}\right)^{\frac{1}{\infty}} \right\}$$
And I’d translated that into something that modern mathematicians would accept without flinching. That is to evaluate the limit of a function that looks like this:
$$\lim_{x \to \infty} f(x) \quad \text{where} \quad f(x) = -4 i x \left\{ \left(1 + i\right)^{\frac{1}{x}} - \left(1 - i\right)^{\frac{1}{x}} \right\}$$
So. I don’t want to deal with that f(x) as it’s written. I can make it better. One thing that bothers me is seeing the complex number $1 + i$ raised to a power. I’d like to work with something simpler than that. And I can’t see that number without also noticing that I’m subtracting from it $1 - i$ raised to the same power. $1 + i$ and $1 - i$ are a “conjugate pair”. It’s usually nice to see those. It often hints at ways to make your expression simpler. That’s one of those patterns you pick up from doing a lot of problems as a mathematics major, and that then look like magic to the lay audience.
Here’s the first way I figure to make my life simpler. It’s in rewriting that $1 + i$ and $1 - i$ stuff so it’s simpler. It’ll be simpler by using exponentials. Shut up, it will too. I get there through Gauss, Descartes, and Euler.
At least I think it was Gauss who pointed out how you can match complex-valued numbers with points on the two-dimensional plane. On a sheet of graph paper, if you like. The number $1 + i$ matches to the point with x-coordinate 1, y-coordinate 1. The number $1 - i$ matches to the point with x-coordinate 1, y-coordinate -1. Yes, yes, this doesn’t sound like much of an insight Gauss had, but his work goes on. I’m leaving it off here because that’s all that I need for right now.
So these two numbers that offended me I can think of as points. They have Cartesian coordinates (1, 1) and (1, -1). But there’s never only one coordinate system for something. There may be only one that’s good for the problem you’re doing. I mean that makes the problem easier to study. But there are always infinitely many choices. For points on a flat surface like a piece of paper, and where the points don’t represent any particular physics problem, there are two good choices. One is the Cartesian coordinates. In it you refer to points by an origin, an x-axis, and a y-axis. How far is the point from the origin in a direction parallel to the x-axis? (And in which direction? This gives us a positive or a negative number.) How far is the point from the origin in a direction parallel to the y-axis? (And in which direction? Same positive or negative thing.)
The other good choice is polar coordinates. For that we need an origin and a positive x-axis. We refer to points by how far they are from the origin, heedless of direction. And then to get direction, what angle the line segment connecting the point with the origin makes with the positive x-axis. The first of these numbers, the distance, we normally label ‘r’ unless there’s compelling reason otherwise. The other we label ‘θ’. ‘r’ is always going to be a positive number or, possibly, zero. ‘θ’ might be any number, positive or negative. By convention, we measure angles so that positive numbers are counterclockwise from the x-axis. I don’t know why. I guess it seemed less weird for, say, the point with Cartesian coordinates (0, 1) to have a positive angle rather than a negative angle. That angle would be $\frac{\pi}{2}$, because mathematicians like radians more than degrees. They make other work easier.
So. The point (1, 1) corresponds to the polar coordinates $r = \sqrt{2}$ and $\theta = \frac{\pi}{4}$. The point (1, -1) corresponds to the polar coordinates $r = \sqrt{2}$ and $\theta = -\frac{\pi}{4}$. Yes, the θ coordinates being negative one times each other is common in conjugate pairs. Also, if you have doubts about my use of the word “the” before “polar coordinates”, well-spotted. If you’re not sure about that thing where ‘r’ is not negative, again, well-spotted. I intend to come back to that.
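Those conversions are easy to check by machine. This is a small sketch using Python’s standard `cmath.polar`, which returns the distance and one particular choice of angle (the one between -π and π). The helper name `to_polar` is mine.

```python
import cmath
import math

def to_polar(z):
    """Return (r, theta) for the complex number z, with theta in [-pi, pi]."""
    return cmath.polar(z)

for z in (1 + 1j, 1 - 1j):
    r, theta = to_polar(z)
    # Compare against sqrt(2) and plus-or-minus pi/4.
    print(z, r, theta, math.sqrt(2), math.pi / 4)
```

That the library hands back a single angle is itself a choice about which of the possible angles to prefer.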
With the polar coordinates ‘r’ and ‘θ’ to describe a point I can go back to complex numbers. I can match the point to the complex number with the value given by $r e^{\imath\theta}$, where ‘e’ is that old 2.71828something number. Superficially, this looks like a big dumb waste of time. I had some problem with imaginary numbers raised to powers, so now, I’m rewriting things with a number raised to imaginary powers. Here’s why it isn’t dumb.
It’s easy to raise a number written like this to a power. $r e^{\imath\theta}$ raised to the n-th power is going to be equal to $r^n e^{\imath\theta \cdot n}$. (Because $\left(a^b\right)^c = a^{b \cdot c}$ and we’re going to go ahead and assume this stays true if ‘b’ is a complex-valued number. It does, but you’re right to ask how we know that.) And this turns into raising a real-valued number to a power, which we know how to do. And it involves multiplying a number by that power, which is also easy.
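Here is a hedged sketch of that recipe in Python: raise the distance to the power, multiply the angle by the power, and convert back. The function name `power_via_polar` is my own, and it uses whatever angle `cmath.polar` picks, which is an assumption about which angle we want.

```python
import cmath

def power_via_polar(z, n):
    """Raise z to the n-th power as r**n at angle theta*n."""
    r, theta = cmath.polar(z)            # theta between -pi and pi
    return cmath.rect(r ** n, theta * n)  # back to a + b*i form

z = 1 + 1j
# A whole-number power, and a fractional one like the 1/x we will need.
print(power_via_polar(z, 3), z ** 3)
print(power_via_polar(z, 0.25), z ** 0.25)
```

The two columns should agree up to rounding, which is some comfort that the polar route isn’t changing the answer.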
And we can get back to something that looks like $a + b\imath$ too. That is, something that’s a real number plus $\imath$ times some real number. This is through one of the many Euler’s Formulas. The one that’s relevant here is that $e^{\imath\phi} = \cos(\phi) + \imath\sin(\phi)$ for any real number ‘φ’. So, that’s true also for ‘θ’ times ‘n’. Or, looking to where everybody knows we’re going, also true for ‘θ’ divided by ‘x’.
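If you’d like reassurance about that Euler’s Formula, a numerical spot check is cheap. This sketch (the name `euler_gap` is mine) measures how far the exponential is from cosine-plus-i-sine at a given angle. It is evidence, not a proof.

```python
import cmath
import math

def euler_gap(phi):
    """Distance between e^(i*phi) and cos(phi) + i*sin(phi)."""
    return abs(cmath.exp(1j * phi) - complex(math.cos(phi), math.sin(phi)))

theta = math.pi / 4
for x in (1.0, 3.0, 50.0):
    # theta divided by x, the case this problem is heading toward
    print(x, euler_gap(theta / x))
```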
OK, on to the people so anxious about all this. I talked about the angle made between the line segment that connects a point and the origin and the positive x-axis. “The” angle. “The”. If that wasn’t enough explanation of the problem, mention how your thinking’s done a 360 degree turn and you see it different now. In an empty room, if you happen to be in one. Your pedantic know-it-all friend is explaining it now. There’s an infinite number of angles that correspond to any given direction. They’re all separated by 360 degrees or, to a mathematician, 2π.
And more. What’s the difference between going out five units of distance in the direction of angle 0 and going out minus-five units of distance in the direction of angle -π? That is, between walking forward five paces while facing east and walking backward five paces while facing west? Yeah. So if we let ‘r’ be negative we’ve got twice as many infinitely many sets of coordinates for each point.
This complicates raising numbers to powers. θ times n might match with some point that’s very different from θ-plus-2-π times n. There might be a whole ring of powers. This seems … hard to work with, at least. But it’s, at heart, the same problem you get thinking about the square root of 4 and concluding it’s both plus 2 and minus 2. If you want “the” square root, you’d like it to be a single number. At least if you want to calculate anything from it. You have to pick out a preferred θ from the family of possible candidates.
For me, that’s whatever set of coordinates has ‘r’ that’s positive (or zero), and that has ‘θ’ between -π and π. Or between 0 and 2π. It could be any strip of numbers that’s 2π wide. Pick what makes sense for the problem you’re doing. It’s going to be the strip from -π to π. Perhaps the strip from 0 to 2π.
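To make the “whole ring of powers” concrete, here is a minimal sketch. The function name `root_candidates` and its `count` parameter are hypothetical, made up for this illustration. It takes the angle θ from the -π to π strip, adds whole multiples of 2π, and only then divides by x.

```python
import cmath
import math

def root_candidates(z, x, count=3):
    """x-th root candidates of z from the angles theta + 2*pi*k, k = 0..count-1."""
    r, theta = cmath.polar(z)  # theta is the preferred angle, between -pi and pi
    return [
        cmath.rect(r ** (1.0 / x), (theta + 2 * math.pi * k) / x)
        for k in range(count)
    ]

# The square root of 4: the preferred angle gives 2, the next one gives -2.
for w in root_candidates(4, 2, count=2):
    print(w)
```

Taking k = 0 is the same as picking the preferred θ. The other values of k are the rest of the family.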
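What this all amounts to is that I can turn this:

$$f(x) = -4 \imath x \left\{ \left(1 + \imath\right)^{\frac{1}{x}} - \left(1 - \imath\right)^{\frac{1}{x}} \right\}$$

into this:

$$f(x) = -4 \imath x \left\{ \left(\sqrt{2} e^{\imath \frac{\pi}{4}}\right)^{\frac{1}{x}} - \left(\sqrt{2} e^{-\imath \frac{\pi}{4}}\right)^{\frac{1}{x}} \right\}$$

without changing its meaning any. Raising a number to the one-over-x power looks different from raising it to the n power. But the work isn’t different. The function I wrote out up there is the same as this function:

$$f(x) = -4 \imath x \left\{ \sqrt{2}^{\frac{1}{x}} e^{\imath \frac{\pi}{4} \cdot \frac{1}{x}} - \sqrt{2}^{\frac{1}{x}} e^{-\imath \frac{\pi}{4} \cdot \frac{1}{x}} \right\}$$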
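I can’t look at that number, $\sqrt{2}^{\frac{1}{x}}$, sitting there, multiplied by two things added together, and leave that. (OK, subtracted, but same thing.) I want to something something distributive law something and that gets us here:

$$f(x) = -4 \imath x \sqrt{2}^{\frac{1}{x}} \left\{ e^{\imath \frac{\pi}{4} \cdot \frac{1}{x}} - e^{-\imath \frac{\pi}{4} \cdot \frac{1}{x}} \right\}$$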
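Also, yeah, that square root of two raised to a power looks weird. I can turn that square root of two into “two to the one-half power”. That gets to this rewrite:

$$f(x) = -4 \imath x 2^{\frac{1}{2} \cdot \frac{1}{x}} \left\{ e^{\imath \frac{\pi}{4} \cdot \frac{1}{x}} - e^{-\imath \frac{\pi}{4} \cdot \frac{1}{x}} \right\}$$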
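And then. Those parentheses. e raised to an imaginary number minus e raised to minus-one-times that same imaginary number. This is another one of those magic tricks that mathematicians know because they see it all the time. Part of what we know from Euler’s Formula, the one I waved at back when I was talking about coordinates, is this:

$$\sin(\phi) = \frac{e^{\imath\phi} - e^{-\imath\phi}}{2\imath}$$

That’s good for any real-valued φ. For example, it’s good for the number $\frac{\pi}{4} \cdot \frac{1}{x}$. So the difference of those two exponentials is $2\imath$ times the sine of that number, and the $-4\imath$ out front times $2\imath$ is 8. And that means we can rewrite that function into something that, finally, actually looks a little bit simpler. It looks like this:

$$f(x) = 8 x 2^{\frac{1}{2} \cdot \frac{1}{x}} \sin\left(\frac{\pi}{4} \cdot \frac{1}{x}\right)$$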
And that’s the function whose limit I want to take at ∞. No, really.
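Since that was a lot of rewriting, a sanity check seems fair. This is a sketch under stated assumptions: I take the starting function to be $f(x) = -4 \imath x \left\{ (1 + \imath)^{1/x} - (1 - \imath)^{1/x} \right\}$, which is Wronski’s $\frac{4\infty}{\sqrt{-1}}$ with x standing in for ∞, and the simplified one to be $8 x 2^{\frac{1}{2x}} \sin\left(\frac{\pi}{4x}\right)$. The names `f_original` and `f_simplified` are mine. Python’s complex powers use the angle between -π and π, matching the choice of θ made here.

```python
import math

def f_original(x):
    """The function as first written, with complex numbers raised to 1/x."""
    bracket = (1 + 1j) ** (1.0 / x) - (1 - 1j) ** (1.0 / x)
    return -4j * x * bracket

def f_simplified(x):
    """The rewritten function, using only real-valued arithmetic."""
    return 8 * x * 2 ** (1.0 / (2 * x)) * math.sin(math.pi / (4 * x))

for x in (1.0, 2.0, 10.0):
    print(x, f_original(x), f_simplified(x))
```

At x = 1 you can do it by hand: the difference in braces is $2\imath$, and $-4\imath$ times $2\imath$ is 8, which is what both versions should report. The imaginary part of the original should come out as zero, or as rounding dust.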
I ran out of time to do my next bit on Wronski’s attempted definition of π. Next week, if all goes well. But I have something to share anyway. The author of the Boxing Pythagoras blog was intrigued by the starting point. And as a fan of studying how people understand infinity and infinitesimals (and how they don’t), this two-century-old example of mixing the numerous and the tiny set his course.
For example, can we speak of a number that’s larger than zero, but smaller than the reciprocal of any positive integer? It’s hard to imagine such a thing. But what if we can show that if we suppose such a number exists, then we can do this logically sound work with it? If you want to say that isn’t enough to show a number exists, then I have to ask how you know imaginary numbers or negative numbers exist.
Standard analysis, you probably guessed, doesn’t do that. It developed over the 19th century when the logical problems of these kinds of numbers seemed unsolvable. Mostly that’s done by limits, showing that a thing must be true whenever some quantity is small enough, or large enough. It seems safe to trust that the infinitesimally small is small enough, and the infinitely large is large enough. And it’s not like mathematicians back then were bad at their job. Mathematicians learned a lot of things about how infinitesimals and infinities work over the late 19th and early 20th century. It makes modern work possible.
Anyway, Boxing Pythagoras goes over what a non-standard analysis treatment of the formula suggests. I think it’s accessible even if you haven’t had much non-standard analysis in your background. At least it worked for me and I haven’t had much of the stuff. I think it’s also accessible if you’re good at following logical argument and won’t be thrown by Greek letters as variables. Most of the hard work is really arithmetic with funny letters. I recommend going and seeing if he did get to π.