Previous supplemental reading for Why Stuff Can Orbit:
This is another supplemental piece because it’s too much to include in the next bit of Why Stuff Can Orbit. I need some more stuff about how a mathematical physicist would look at something.
This is also a story about approximations. A lot of mathematics is really about approximations. I don’t mean numerical computing. We all know that when we compute we’re making approximations. We use 0.333333 instead of one-third and we use 3.141592 instead of π. But a lot of precise mathematics, what we call analysis, is also about approximations. We do this by a logical structure that works something like this: take something we want to prove. Now for every positive number ε we can find something — a point, a function, a curve — that’s no more than ε away from the thing we’re really interested in, and which is easier to work with. Then we prove whatever we want to with the easier-to-work-with thing. And since ε can be as tiny a positive number as we want, we can suppose ε is a tinier difference than we can hope to measure. And so the difference between the thing we’re interested in and the thing we’ve proved something interesting about is zero. (This is the part that feels like we’re pulling a scam. We’re not, but this is where it’s worth stopping and thinking about what we mean by “a difference between two things”. When you feel confident this isn’t a scam, continue.) So we proved whatever we proved about the thing we’re interested in. Take an analysis course and you will see this all the time.
When we get into mathematical physics we do a lot of approximating functions with polynomials. Why polynomials? Yes, because everything is polynomials. But also because polynomials make so much mathematical physics easy. Polynomials are easy to calculate, if you need numbers. Polynomials are easy to integrate and differentiate, if you need analysis. Here that’s the calculus that tells you about patterns of behavior. If you want to approximate a continuous function you can always do it with a polynomial. The polynomial might have to be infinitely long to approximate the entire function. That’s all right. You can chop it off after finitely many terms. This finite polynomial is still a good approximation. It’s just good for a smaller region than the infinitely long polynomial would have been.
Necessary qualifiers: pages 65 through 82 of any book on real analysis.
So. Let me get to functions. I’m going to use a function named ‘f’ because I’m not wasting my energy coming up with good names. (When we get back to the main Why Stuff Can Orbit sequence this is going to be ‘U’ for potential energy or ‘E’ for energy.) It’s got a domain that’s the real numbers, and a range that’s the real numbers. To express this in symbols I can write . If I have some number called ‘x’ that’s in the domain then I can tell you what number in the domain is matched by the function ‘f’ to ‘x’: it’s the number ‘f(x)’. You were expecting maybe 3.5? I don’t know that about ‘f’, not yet anyway. The one thing I do know about ‘f’, because I insist on it as a condition for appearing, is that it’s continuous. It hasn’t got any jumps, any gaps, any regions where it’s not defined. You could draw a curve representing it with a single, if wriggly, stroke of the pen.
I mean to build an approximation to the function ‘f’. It’s going to be a polynomial expansion, a set of things to multiply and add together that’s easy to find. To make this polynomial expansion this I need to choose some point to build the approximation around. Mathematicians call this the “point of expansion” because we froze up in panic when someone asked what we were going to name it, okay? But how are we going to make an approximation to a function if we don’t have some particular point we’re approximating around?
(One answer we find in grad school when we pick up some stuff from linear algebra we hadn’t been thinking about. We’ll skip it for now.)
I need a name for the point of expansion. I’ll use ‘a’. Many mathematicians do. Another popular name for it is ‘x0‘. Or if you’re using some other variable name for stuff in the domain then whatever that variable is with subscript zero.
So my first approximation to the original function ‘f’ is … oh, shoot, I should have some new name for this. All right. I’m going to use ‘F0‘ as the name. This is because it’s one of a set of approximations, each of them a little better than the old. ‘F1‘ will be better than ‘F0‘, but ‘F2‘ will be even better, and ‘F2038‘ will be way better yet. I’ll also say something about what I mean by “better”, although you’ve got some sense of that already.
I start off by calling the first approximation ‘F0‘ by the way because you’re going to think it’s too stupid to dignify with a number as big as ‘1’. Well, I have other reasons, but they’ll be easier to see in a bit. ‘F0‘, like all its sibling ‘Fn‘ functions, has a domain of the real numbers and a range of the real numbers. The rule defining how to go from a number ‘x’ in the domain to some real number in the range?
That is, this first approximation is simply whatever the original function’s value is at the point of expansion. Notice that’s an ‘x’ on the left side of the equals sign and an ‘a’ on the right. This seems to challenge the idea of what an “approximation” even is. But it’s legit. Supposing something to be constant is often a decent working assumption. If you failed to check what the weather for today will be like, supposing that it’ll be about like yesterday will usually serve you well enough. If you aren’t sure where your pet is, you look first wherever you last saw the animal. (Or, yes, where your pet most loves to be. A particular spot, though.)
We can make this rigorous. A mathematician thinks this is rigorous: you pick any margin of error you like. Then I can find a region near enough to the point of expansion. The value for ‘f’ for every point inside that region is ‘f(a)’ plus or minus your margin of error. It might be a small region, yes. Doesn’t matter. It exists, no matter how tiny your margin of error was.
But yeah, that expansion still seems too cheap to work. My next approximation, ‘F1‘, will be a little better. I mean that we can expect it will be closer than ‘F0‘ was to the original ‘f’. Or it’ll be as close for a bigger region around the point of expansion ‘a’. What it’ll represent is a line. Yeah, ‘F0‘ was a line too. But ‘F0‘ is a horizontal line. ‘F1‘ might be a line at some completely other angle. If that works better. The second approximation will look like this:
Here ‘m’ serves its traditional yet poorly-explained role as the slope of a line. What the slope of that line should be we learn from the derivative of the original ‘f’. The derivative of a function is itself a new function, with the same domain and the same range. There’s a couple ways to denote this. Each way has its strengths and weaknesses about clarifying what we’re doing versus how much we’re writing down. And trying to write down almost anything can inspire confusion in analysis later on. There’s a part of analysis when you have to shift from thinking of particular problems to how problems work then.
So I will define a new function, spoken of as f-prime, this way:
If you look closely you realize there’s two different meanings of ‘x’ here. One is the ‘x’ that appears in parentheses. It’s the value in the domain of f and of f’ where we want to evaluate the function. The other ‘x’ is the one in the lower side of the derivative, in that . That’s my sloppiness, but it’s not uniquely mine. Mathematicians keep this straight by using the symbols so much they don’t even see the ‘x’ down there anymore so have no idea there’s anything to find confusing. Students keep this straight by guessing helplessly about what their instructors want and clinging to anything that doesn’t get marked down. Sorry. But what this means is to “take the derivative of the function ‘f’ with respect to its variable, and then, evaluate what that expression is for the value of ‘x’ that’s in parentheses on the left-hand side”. We can do some things that avoid the confusion in symbols there. They all require adding some more variables and some more notation in, and it looks like overkill for a measly definition like this.
Anyway. We really just want the deriviate evaluated at one point, the point of expansion. That is:
which by the way avoids that overloaded meaning of ‘x’ there. Put this together and we have what we call the tangent line approximation to the original ‘f’ at the point of expansion:
This is also called the tangent line, because it’s a line that’s tangent to the original function. A plot of ‘F1‘ and the original function ‘f’ are guaranteed to touch one another only at the point of expansion. They might happen to touch again, but that’s luck. The tangent line will be close to the original function near the point of expansion. It might happen to be close again later on, but that’s luck, not design. Most stuff you might want to do with the original function you can do with the tangent line, but the tangent line will be easier to work with. It exactly matches the original function at the point of expansion, and its first derivative exactly matches the original function’s first derivative at the point of expansion.
We can do better. We can find a parabola, a second-order polynomial that approximates the original function. This will be a function ‘F2(x)’ that looks something like:
What we’re doing is adding a parabola to the approximation. This is that curve that looks kind of like a loosely-drawn U. The ‘m2‘ there measures how spread out the U is. It’s not quite the slope, but it’s kind of like that, which is why I’m using the letter ‘m’ for it. Its value we get from the second derivative of the original ‘f’:
We find the second derivative of a function ‘f’ by evaluating the first derivative, and then, taking the derivative of that. We can denote it with two ‘ marks after the ‘f’ as long as we aren’t stuck wrapping the function name in ‘ marks to set it out. And so we can describe the function this way:
This will be a better approximation to the original function near the point of expansion. Or it’ll make larger the region where the approximation is good.
If the first derivative of a function at a point is zero that means the tangent line is horizontal. In physics stuff this is an equilibrium. The second derivative can tell us whether the equilibrium is stable or not. If the second derivative at the equilibrium is positive it’s a stable equilibrium. The function looks like a bowl open at the top. If the second derivative at the equilibrium is negative then it’s an unstable equilibrium.
We can make better approximations yet, by using even more derivatives of the original function ‘f’ at the point of expansion:
There’s better approximations yet. You can probably guess what the next, fourth-degree, polynomial would be. Or you can after I tell you the fraction in front of the new term will be . The only big difference is that after about the third derivative we give up on adding ‘ marks after the function name ‘f’. It’s just too many little dots. We start writing, like, ‘f(iv)‘ instead. Or if the Roman numerals are too much then ‘f(2038)‘ instead. Or if we don’t want to pin things down to a specific value ‘f(j)‘ with the understanding that ‘j’ is some whole number.
We don’t need all of them. In physics problems we get equilibriums from the first derivative. We get stability from the second derivative. And we get springs in the second derivative too. And that’s what I hope to pick up on in the next installment of the main series.