My Little 2021 Mathematics A-to-Z: Ordinary Differential Equations


Mr Wu, my Singapore Maths Tuition friend, has offered many fine ideas for A-to-Z topics. This week’s is another of them, and I’m grateful for it.

Ordinary Differential Equations

As a rule, if you can do something with a number, you can do the same thing with a function. Not always, of course, but the exceptions are fewer than you might imagine. I’ll start with one of those things you can do to both.

A powerful thing we learn in (high school) algebra is that we can use a number without knowing what it is. We give it a name like ‘x’ or ‘y’ and describe what we find interesting about it. If we want to know what it is, we (usually) find some equation or set of equations and find what value of x could make that true. If we study enough (college) mathematics we learn its equivalent in functions. We give something a name like f or g or Ψ and describe what we know about it. And then try to find functions which make that true.

There are a couple common types of equation for these not-yet-known functions. The kind you expect to learn as a mathematics major involves differential equations. These are ones where your equation (or equations) involve derivatives of the not-yet-known f. A derivative describes the rate at which something changes. If we imagine the original f is a position, the derivative is velocity. Derivatives can have derivatives also; this second derivative would be the acceleration. And then second derivatives can have derivatives also, and so on, into infinity. When an equation involves a function and its derivatives we have a differential equation.

(The second common type is the integral equation, using a function and its integrals. And a third involves both derivatives and integrals. That’s known as an integro-differential equation, and isn’t life complicated enough?)

Differential equations themselves naturally divide into two kinds, ordinary and partial. They serve different roles. Usually, with an ordinary differential equation, we can describe the change from knowing only the current situation. (This may include velocities and accelerations and stuff. We could ask what the velocity at an instant means. But never mind that here.) Usually a partial differential equation bases the change where you are on the neighborhood of your location. If you see holes you can pick in that, you’re right. The precise difference is about the independent variables. If the function f has more than one independent variable, it’s possible to take a partial derivative. This describes how f changes if one variable changes while the others stay fixed. If the function f has only the one independent variable, you can only take ordinary derivatives. So you get an ordinary differential equation.

But let’s speak casually here. If what you’re studying can be fully represented with a dashboard readout? Like, an ordered list of positions and velocities and stuff? You probably have an ordinary differential equation. If you need a picture with a three-dimensional surface or a color map to understand it? You probably have a partial differential equation.

One more metaphor. If you can imagine the thing you’re modeling as a marble rolling around on a hilly table? Odds are that’s an ordinary differential equation. And that representation covers a lot of interesting problems. Marbles on hills, obviously. But also rigid pendulums: we can treat the angle a pendulum makes and the rate at which those change as dimensions of space. The pendulum’s swinging then matches exactly a marble rolling around the right hilly table. Planets in space, too. We need more dimensions — three space dimensions and three velocity dimensions — for each planet. So, like, the Sun-Earth-and-Moon would be rolling around a hilly table with 18 dimensions. That’s all right. We don’t have to draw it. The mathematics works about the same. Just longer.

[ To be precise we need three momentum dimensions for each orbiting body. If they’re not changing mass appreciably, and not moving too near the speed of light, velocity is just momentum times a constant number, so we can use whichever is easier to visualize. ]

We mostly work with ordinary differential equations of either the first or the second order. First order means we have first derivatives in the equation, but never have to deal with more than the original function and its first derivative. Second order means we have second derivatives in the equation, but never have to deal with more than the original function or its first or second derivatives. You’ll never guess what a “third order” differential equation is unless you have experience in reading words. There are some reasons we stick to these low orders like first and second, though. One is that we know of good techniques for solving most first- and second-order ordinary differential equations. For higher-order differential equations we often use techniques that find a related normal old polynomial. Its solution helps with the thing we want. Or we break a high-order differential equation into a set of low-order ones. So yes, again, we search for answers where the light is good. But the good light covers many things we like to look at.

There’s simple harmonic motion, for example. It covers pendulums and springs and perturbations around stable equilibriums and all. This turns out to cover so many problems that, as a physics major, you get a little sick of simple harmonic motion. There’s the Airy function, which started out to describe the rainbow. It turns out to describe particles trapped in a triangular quantum well. The van der Pol equation, about systems where a small oscillation gets energy fed into it while a large oscillation gets energy drained. All kinds of exponential growth and decay problems. Very many functions where pairs of particles interact.

This doesn’t cover everything we would like to do. That’s all right. Ordinary differential equations lend themselves to numerical solutions. It requires considerable study and thought to do these numerical solutions well. But this doesn’t make the subject unapproachable. Few of us could animate the “Pink Elephants on Parade” scene from Dumbo. But could you draw a flip book of two stick figures tossing a ball back and forth? If you’ve had a good rest, a hearty breakfast, and have not listened to the news yet today, so you’re in a good mood?

The flip book ball is a decent example here, too. The animation will look good if the ball moves about the “right” amount between pages. A little faster when it’s first thrown, a bit slower as it reaches the top of its arc, a little faster as it falls back to the catcher. The ordinary differential equation tells us how fast our marble is rolling on this hilly table, and in what direction. So we can calculate how far the marble needs to move, and in what direction, to make the next page in the flip book.
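
If you’d like to see the flip book made in code, here’s a minimal sketch in Python. The time step, the page count, and the throw itself are all numbers I made up for illustration; the scheme — move everything the way the equation says it’s moving right now, one page at a time — is the classic Euler’s method.

    # A flip book of a thrown ball, one page per time step (Euler's method).
    g = -9.8              # gravity, in meters per second squared
    dt = 0.05             # time between pages, in seconds (made up)
    x, y = 0.0, 1.5       # the ball's starting position, in meters (made up)
    vx, vy = 4.0, 5.0     # the starting throw, in meters per second (made up)

    for page in range(60):
        print(f"page {page:2d}: x = {x:5.2f}, y = {y:5.2f}")
        x += vx * dt      # move the ball the way its velocity says
        y += vy * dt
        vy += g * dt      # gravity slows the rise, speeds the fall
        if y < 0:         # the ball is back at the catcher's level
            break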

Almost. The rate at which the marble should move will change, in the interval between one flip-book page and the next. The difference, the error, may not be much. But there is a difference between the exact and the numerical solution. Well, there is a difference between a circle and a regular polygon. We have many ways of minimizing and estimating and controlling the error. Doing that is what makes numerical mathematics the high-paid professional industry it is. Our game of catch we can verify by flipping through the book. The motion of four dozen planets and moons attracting one another is harder to verify.

I said at the top that most anything one can do with numbers one can do with functions also. I would like to close the essay with some great parallel. Like, the way that trying to solve cubic equations made people realize complex numbers were good things to have. I don’t have a good example like that for ordinary differential equations, where the study expanded our ideas of what functions could be. Part of that is that complex numbers are more accessible than the stranger functions. Part of that is that complex numbers have a story behind them. The story features titanic figures like Gerolamo Cardano, Niccolò Tartaglia and Ludovico Ferrari. We see some awesome and weird personalities in 19th century mathematics. But their fights are generally harder to watch from the sidelines and cheer on. And part is that it’s easier to find pop historical treatments of the kinds of numbers. The historiography of what a “function” is is a specialist occupation.

But I can think of a possible case. A tool that’s sometimes used in solving ordinary differential equations is the “Dirac delta function”. Yes, that Paul Dirac. It’s a weird function, written as \delta(x) . It’s equal to zero everywhere, except where x is zero. When x is zero? It’s … we don’t talk about what it is. Instead we talk about what it can do. The integral of that Dirac delta function times some other function can equal that other function at a single point. It strains credibility to call this a function the way we speak of, like, sin(x) or \sqrt{x^2 + 4} being functions. Many will classify it as a distribution instead. But it is so useful, for a particular kind of problem, that it’s impossible to throw away.
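
In symbols, that property — often called the sifting property — reads like this, picking out the value of f at the single point a:

\int_{-\infty}^{\infty} f(x) \delta(x - a) \,dx = f(a)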

So perhaps the parallels between numbers and functions extend that far. Ordinary differential equations can make us notice kinds of functions we would not have seen otherwise.


And with this — I can see the much-postponed end of the Little 2021 Mathematics A-to-Z! You can read all my entries for 2021 at this link, and if you’d like can find all my A-to-Z essays here. How will I finish off the shortest yet most challenging sequence I’ve done yet? Will it be yellow and equivalent to the Axiom of Choice? Answers should come, in a week, if all starts going well.

My All 2020 Mathematics A to Z: Extraneous Solutions


Iva Sallay, the kind author of the Find the Factors recreational mathematics puzzle, suggested this topic for the letter X. It’s a fun chance to look at some of the basics of (high school) algebra again.

Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Extraneous Solutions.

When developing general relativity, Albert Einstein created a convention. He’s not unique in that. All mathematicians create conventions. They use shorthand for an idea that’s complicated or common. What’s rarer is that other people adopted his convention, because it expressed an idea compactly. This was in working with tensors, which look somewhat like matrices and have a lot of indexes. In the equations of general relativity you need to take sums over many combinations of values of these indexes. What indexes there are are the same in most every problem. The possible values of the indexes are constant, problem to problem, too.

So Einstein saved himself, and his publishers’ typesetters, a lot of redundant writing. He did this by setting out, once, the conditions under which “take the sums over these indexes on this range” is implied. This is good for people doing general relativity, and certain kinds of geometry. It’s a problem only when an expression escapes its context. When it’s shown to a student, or to someone who doesn’t know this is a differential-geometry problem, the problem becomes confusing, and they can’t work on it.

This is not to fault the Einstein Summation Convention. It puts common necessary scaffolding out of the way, highlighting the interesting unique parts of a problem. Most conventions aim for that. We have the hazard, though, that we may not notice something breaking the convention.

And this is how we create extraneous solutions. And, as a bonus, how we get missing solutions. We encounter them with the start of (high school) algebra, when we get used to manipulating equations. When we solve an equation what we always want is something clear, like

x = 2

But it never starts that way. It always starts with something like

x^3 - 8x^2 + 24x - 32 + 22\frac{1}{x} = \frac{6}{x}

or worse. We learn how to handle this. We know that we can do six things that do not alter the truth of an equation. We can regroup terms in the equation. We can add the same number to both sides of the equation. We can multiply both sides of the equation by some number besides zero. We can add zero to one side of the equation. We can multiply one side of the equation by 1. We can replace one quantity with another that has the same value. That doesn’t sound like a lot. It covers more than it seems. Multiplying by 1, for example, is the same as multiplying by \frac{x}{x} . If x isn’t zero, then we can multiply both sides of the equation by that x. And x can’t be zero, or else \frac{x}{x} would not be 1.

So with my example there, start off by multiplying the right side by 1, in the guise \frac{x}{x} . Then multiply both sides by that same non-zero x. At this point the right-hand side simplifies to being 6. Add a -6 to both sides. And then with a lot of shuffling around you work out that the equation is the same as

(x - 2)^4 = 0

And that can only be true when x equals 2.
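
If you’d rather not trust my shuffling around, Python’s sympy library can confirm it. A minimal sketch:

    from sympy import symbols, factor

    x = symbols('x')
    left = x**3 - 8*x**2 + 24*x - 32 + 22/x
    right = 6/x

    # Multiply the difference of the two sides by the non-zero x and tidy up;
    # the result should be (x - 2)^4, which is zero only when x equals 2.
    print(factor(((left - right) * x).expand()))    # (x - 2)**4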

It should be easy to catch spurious solutions creeping in. They must result from breaking a rule. The obvious problem is multiplying — or dividing — by zero. We expect those to be trouble. Wikipedia has a fine example:

\frac{1}{x - 2} = \frac{3}{x + 2} - \frac{6x}{(x - 2)(x + 2)}

The obvious step is to multiply this whole mess by (x - 2)(x + 2) , which turns our work into a linear equation. Very soon we find the solution must be x = -2 . Which would make at least two of the denominators in the original equation zero. We know not to want that.
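
Here’s a sketch of that in sympy, if you’d like to watch the candidate appear and then fail:

    from sympy import symbols, Eq, solve

    x = symbols('x')

    # Multiplying the original equation through by (x - 2)(x + 2),
    # as described above, leaves a linear equation:
    print(solve(Eq(x + 2, 3*(x - 2) - 6*x), x))    # [-2]

    # Handed the original equation, denominators intact, sympy
    # rejects x = -2 and reports no solutions at all:
    original = Eq(1/(x - 2), 3/(x + 2) - 6*x/((x - 2)*(x + 2)))
    print(solve(original, x))    # []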

The problems can be subtler, though. Consider:

x - 12 = \sqrt{x}

That’s not hard to solve. Multiply both sides by x - 12 . Although, before working out \sqrt{x}\cdot(x - 12) substitute that x - 12 with something equal to it. We know one thing is equal to it, \sqrt{x} . Then we have

(x - 12)^2 = x

It’s a quadratic equation. A little bit of work shows the roots are 9 and 16. One of those answers is correct and the other spurious. At no point did we divide anything, by zero or anything else.
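
A sketch of this one too, in sympy. (The quadratic is the x^2 - 25x + 144 that appears a little further on.)

    from sympy import symbols, solve, sqrt, Eq

    x = symbols('x')

    # The squared-up version is an honest quadratic with two roots:
    candidates = solve(Eq((x - 12)**2, x), x)
    print(candidates)    # [9, 16]

    # Substitute each back into the original x - 12 = sqrt(x):
    for c in candidates:
        print(c, (c - 12) == sqrt(c))
    # 9 fails, since 9 - 12 is -3 while sqrt(9) is 3; 16 passes.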

So what is happening and what is the necessary rhetorical link to the Einstein Summation Convention?

There are many ways to look at equations. One that’s common is to look at them as functions. This is so common that we’ll elide between an equation and a function representation. This confuses the prealgebra student who wants to know why sometimes we look at

x^2 - 25x + 144 = 0

and sometimes we look at

f(x) = x^2 - 25x + 144

and sometimes at

f(x) = x^2 - 25x + 144 = 0

The advantage of looking at the function which shadows any equation is we have different tools for studying functions. Sometimes that makes solving the equation easier. In this form, we’re looking for what in the domain matches with something particular in the range.

And now we’ve reached the convention. When we write down something like x^2 - 25x + 144 we’re implicitly defining a function. A function has three pieces. It has a set called the domain, from which we draw the independent variable. It has a set called the range. It has a rule matching elements in the domain to an element in the range. We’ve only given the rule. What’s the domain and what’s the range for f(x) = x^2 - 25x + 144 ?

And here are the conventions. If we haven’t said otherwise, the domain and range are usually either the real numbers or the complex numbers. If we used x or y or t as the independent variable, we mean the real numbers. If we used z as the independent variable, and haven’t already put x and y in, we mean the complex numbers. Sometimes we call in s or w or another letter; never mind that. The range can be the whole set of real or complex numbers. It does us no harm to have too large a range.

The domain, though. We do insist that everything in the domain match to something in the range. And, like, \frac{1}{x - 2} ? That can’t mean anything if x equals 2.

So we take an implicit definition of the domain: it’s all the real numbers for which the function’s rule is meaningful. So, \frac{1}{x - 2} would have a domain “real numbers other than 2”. \frac{6x}{(x - 2)(x + 2)} would have a domain “real numbers other than 2 and -2”.

We create extraneous solutions — or we lose some — when our convention changes the domain. An extraneous solution is one that existed outside the original problem’s domain. A missing solution is one that existed in an excised part of the domain. To go from x^2 = 4x to x = 4 by dividing out x is to cut x = 0 out of the space of possible solutions.

A complaint you might raise. What is the domain for x - 12 = \sqrt{x} ? Rewrite that as a function. f(x) = x - 12 - \sqrt{x} would seem to have a domain “x greater than or equal to 0”. The extraneous solution is x = 9 , a number which rumor has it is greater than or equal to 0. What happened?

We have to take that equation-handling more slowly. We had started out with

x - 12 = \sqrt{x}

The domain has to be “x is greater than or equal to 0” here. All right. The next step was multiplying both sides by the same quantity, x - 12 . So:

(x - 12)(x - 12) = \sqrt{x}(x - 12)

The domain is still “x is greater than or equal to 0”. The next step, though, was a substitution. I wanted to replace the (x - 12) on the right with \sqrt{x} . We know, from the original equation, that those are equal. At least, they’re equal wherever the original equation x - 12 = \sqrt{x} is true. What happens when x = 9 , though?

9 - 12 = \sqrt{9}

We start to see the catch. 9 – 12 is -3. And while it’s true that -3 squared will be 9, it’s false that -3 is the square root of 9. The equation x - 12 = \sqrt{x} can only be true, for real numbers, if \sqrt{x} is nonnegative. We can make this rigorous with two supplementary functions. Let me call g(x) = x - 12 and h(x) = \sqrt{x} .

h(x) has an implicit domain of “x greater than or equal to 0”. What’s the domain of g(x) ? If g(x) = h(x) , like we said it does, then they have to agree for every x in either’s domain. So g(x) can’t have in its domain any x for which h(x) isn’t defined. So the domain of g(x) has to be “x for which x – 12 is greater than or equal to 0”. And that’s “x greater than or equal to 12”.

So the domain for the original equation is “x greater than or equal to 12”. When we keep that domain in mind, the extraneous nature of x = 9 is clear, and we avoid trouble.

Not all extraneous solutions come from algebraic manipulations. Sometimes there are constraints on the problem, rather than the numbers, that make a solution absurd. There is a betting strategy called the martingale. This amounts to doubling the bet every time one loses. This makes the first win balance out all the losses leading to it. This solution fails because the player has a finite wallet, and after a few losses any player hasn’t got the money to continue.
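
The failure shows up fast in simulation. Here’s a sketch in Python, with a fair coin and a wallet and bet size I made up:

    import random

    def martingale(wallet=100, base_bet=1, rounds=200):
        # Double the bet after every loss; go back to the base bet after a win.
        bet = base_bet
        for _ in range(rounds):
            if bet > wallet:    # can't cover the doubled bet: strategy dead
                break
            if random.random() < 0.5:    # win
                wallet += bet
                bet = base_bet
            else:                        # loss
                wallet -= bet
                bet *= 2
        return wallet

    trials = [martingale() for _ in range(10_000)]
    print("ended poorer in", sum(w < 100 for w in trials), "of 10,000 trials")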

Or consider a case that may be legend. It concerns the Apollo Guidance Computer. It was designed to take the Lunar Module to a spot at zero altitude above the moon’s surface, with zero velocity. The story is that in early test runs, the computer would not avoid trajectories that dropped to a negative altitude along the way to the surface. One imagines the scene after the first Apollo subway trip. (I have not found a date when such a test run was done, or corrections to the code ordered. If someone knows, I’d appreciate learning specifics.)

The convention, that we trust the domain is “everything which makes sense”, is not to blame here. It’s normally a good convention. Explicitly noting the domain at every step is tedious and, most of the time, unenlightening. It belongs in the background. We also must check our possible solutions, and that they represent things that make sense. We can try to concentrate our thinking on the obvious interesting parts, but must spend some time on the rest also.


I am surprised to be so near the end of the 2020 A-to-Z, and to 2020, I hope. This and all the other glossary essays for the year should be at this link. All the essays from every A-to-Z series should be at this link. Thank you for reading.

The Summer 2017 Mathematics A To Z: Diophantine Equations


I have another request from Gaurish, of the For The Love Of Mathematics blog, today. It’s another change of pace.

Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Diophantine Equations

A Diophantine equation is a polynomial. Well, of course it is. It’s an equation, or a set of equations, setting one polynomial equal to another. Possibly equal to a constant. What makes this different from “any old equation” is the coefficients. These are the constant numbers that you multiply the variables, your x and y and x^2 and z^8 and so on, by. To make a Diophantine equation all these coefficients have to be integers. You know one well, because it’s that x^n + y^n = z^n thing that Fermat’s Last Theorem is all about. And you’ve probably seen ax + by = 1 . It turns up a lot because that’s a line, and we do a lot of stuff with lines.
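
A hint of why ax + by = 1 turns up so kindly: whenever a and b share no common factor, the extended Euclidean algorithm hands you whole-number solutions. A minimal sketch in Python; the 34 and 55 at the bottom are just numbers I picked.

    def extended_gcd(a, b):
        """Return (g, x, y) with a*x + b*y = g, the greatest common divisor."""
        if b == 0:
            return a, 1, 0
        g, x, y = extended_gcd(b, a % b)
        return g, y, x - (a // b) * y

    g, x, y = extended_gcd(34, 55)
    print(g, x, y)          # g is 1, so these x and y solve 34x + 55y = 1
    print(34*x + 55*y)      # 1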

Diophantine equations are interesting. There are a couple of cases that are easy to solve. I mean, at least that we can find solutions for. ax + by = 1 , for example, that’s easy to solve. x^n + y^n = z^n it turns out we can’t solve. Well, we can if n is equal to 1 or 2. Or if x or y or z are zero. These are obvious, that is, they’re quite boring. That one took about four hundred years to solve, and the solution was “there aren’t any solutions”. This may convince you of how interesting these problems are. What, from looking at it, tells you that ax + by = 1 is simple while x^n + y^n = z^n is (most of the time) impossible?

I don’t know. Nobody really does. There are many kinds of Diophantine equation, all different-looking polynomials. Some of them are special one-off cases, like x^n + y^n = z^n . For example, there’s x^4 + y^4 + z^4 = w^4 for some integers x, y, z, and w. Leonhard Euler conjectured this equation had only boring solutions. You’ll remember Euler. He wrote the foundational work for every field of mathematics. It turns out he was wrong. It has infinitely many interesting solutions. But the first one found was 2,682,440^4 + 15,365,639^4 + 18,796,760^4 = 20,615,673^4 and that one took a computer search to find. We can forgive Euler not noticing it.
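
Python’s whole numbers are exact at any size, so checking that solution takes a moment. A quick sketch:

    # Check the first-found counterexample to Euler's conjecture.
    # Python integers never overflow, so this arithmetic is exact.
    x, y, z, w = 2682440, 15365639, 18796760, 20615673
    print(x**4 + y**4 + z**4 == w**4)    # True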

Some are groups of equations that have similar shapes. There’s the Fermat’s Last Theorem formula, for example, which is a different equation for every different integer n. Then there’s what we call Pell’s Equation. This one is x^2 - D y^2 = 1 (or equals -1), for some counting number D. It’s named for the English mathematician John Pell, who did not discover the equation (even in the Western European tradition; Indian mathematicians were familiar with it for a millennium), did not solve the equation, and did not do anything particularly noteworthy in advancing human understanding of the solution. Pell owes his fame in this regard to Leonhard Euler, who mistook Pell’s revision of a translation of a book discussing a solution for Pell’s having authored the solution. I confess Euler isn’t looking very good on Diophantine equations.
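
If you’d like to poke at Pell’s Equation yourself, a brute-force hunt for the smallest interesting solution fits in a few lines. A sketch in Python; it’s fine for friendly values of D (which should not be a perfect square) and hopeless for the famously stubborn ones like D = 61.

    from math import isqrt

    def pell_smallest(D, y_limit=1_000_000):
        """Smallest x, y > 0 with x^2 - D*y^2 = 1, by brute force."""
        for y in range(1, y_limit):
            x_squared = D * y * y + 1
            x = isqrt(x_squared)
            if x * x == x_squared:
                return x, y
        return None

    print(pell_smallest(2))     # (3, 2), since 9 - 2*4 = 1
    print(pell_smallest(61))    # None: the true y is 226,153,980, far past the limit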

But nobody looks very good on Diophantine equations. Make up a Diophantine equation of your own. Use whatever whole numbers, positive or negative, that you like for your equation. Use whatever powers of however many variables you like for your equation. So you get something that looks maybe like this:

7x^2 - 20y + 18y^2 - 38z = 9

Does it have any solutions? I don’t know. Nobody does. There isn’t a general all-around solution. You know how with a quadratic equation we have this formula where you recite some incantation about “b squared minus four a c” and get any roots that exist? Nothing like that exists for Diophantine equations in general. Specific ones, yes. But they’re all specialties, crafted to fit the equation that has just that shape.

So for each equation we have to ask: is there a solution? Is there any solution that isn’t obvious? Are there finitely many solutions? Are there infinitely many? Either way, can we find all the solutions? And we have to answer them anew. What answers these questions have, whether answers are known to exist, whether answers can exist: we have to discover it anew for each kind of equation. Knowing answers for one kind doesn’t help us for any others, except as inspiration. If some trick worked before, maybe it will work this time.

There are a couple usually reliable tricks. Can the equation be rewritten in some way that it becomes the equation for a line? If it can we probably have a good handle on any solutions. Can we apply modulo arithmetic to the equation? If we can, we might be able to reduce the number of possible solutions that the equation has. In particular we might be able to reduce the number of possible solutions until we can just check every case. Can we use induction? That is, can we show there’s some parameter for the equations, and that knowing the solutions for one value of that parameter implies knowing solutions for larger values? And then find some small enough value we can test it out by hand? Or can we show that if there is a solution, then there must be a smaller solution, and smaller yet, until we can either find an answer or show there aren’t any? Sometimes. Not always. The field blends seamlessly into number theory. And number theory is all sorts of problems easy to pose and hard or impossible to solve.
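
The modulo trick deserves one concrete look. The square of any whole number leaves a remainder of 0 or 1 on division by 4, so an equation like x^2 + y^2 = 4z + 3 has no solutions at all: the left side is never 3 more than a multiple of 4. A sketch of the check in Python, as evidence if not proof:

    # Squares modulo 4 only ever come out to 0 or 1 ...
    squares_mod_4 = {(n * n) % 4 for n in range(1000)}
    print(squares_mod_4)    # {0, 1}

    # ... so a sum of two squares is 0, 1, or 2 modulo 4, never 3,
    # and x^2 + y^2 = 4z + 3 has no whole-number solutions to find.
    print({(a + b) % 4 for a in squares_mod_4 for b in squares_mod_4})    # {0, 1, 2}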

We name these equations after Diophantus of Alexandria, a 3rd century Greek mathematician. His writings, what we have of them, discuss how to solve equations. Not general solutions, the way we might want to solve ax^2 + bx + c = 0 , but specific ones, like 1x^2 - 5x + 6 = 0 . His books are among those whose rediscovery shaped the rebirth of mathematics. Pierre de Fermat scribbled his famous note in the too-small margins of Diophantus’s Arithmetica. (Well, a popular translation.)

But the field predates Diophantus, at least if we look at specific problems. Of course it does. In mathematics, as in life, any search for a source ends in a vast, marshy ambiguity. The field stays vital. If we loosen ourselves to looking at inequalities — x - Dy^2 < A , let's say — then we start seeing optimization problems. What values of x and y will make this equation most nearly true? What values will come closest to satisfying this bunch of equations? The questions are about how to find the best possible fit to whatever our complicated sets of needs are. We can't always answer. We keep searching.

Theorem Thursday: What Is Cramer’s Rule?


KnotTheorist asked for this one during my appeal for theorems to discuss. And I’m taking an open interpretation of what a “theorem” is. I can do a rule.

Cramer’s Rule

I first learned of Cramer’s Rule in the way I expect most people do. It was an algebra course. I mean high school algebra. By high school algebra I mean you spend roughly eight hundred years learning ways to solve for x or to plot y versus x. Then take a pause for polar coordinates and matrices. Then you go back to finding both x and y.

Cramer’s Rule came up in the context of solving simultaneous equations. You have more than one variable. So x and y. Maybe z. Maybe even a w, before whoever set up the problem gives up and renames everything x1 and x2 and x62 and all that. You also have more than one equation. In fact, you have exactly as many equations as you have variables. Are there any sets of values those variables can have which make all those equations true simultaneously? Thus the imaginative name “simultaneous equations” or the search for “simultaneous solutions”.

If all the equations are linear then we can always say whether there’s simultaneous solutions. By “linear” we mean what we always mean in mathematics, which is, “something we can handle”. But more exactly it means the equations have x and y and whatever other variables only to the first power. No x-squared or square roots of y or tangents of z or anything. (The equations are also allowed to omit a variable. That is, if you have one equation with x, y, and z, and another with just x and z, and another with just y and z, that’s fine. We pretend the missing variable is there and just multiplied by zero, and proceed as before.) One way to find these solutions is with Cramer’s Rule.

Cramer’s Rule sets up some matrices based on the system of equations. If the system has two equations, it sets up three matrices. If the system has three equations, it sets up four matrices. If the system has twelve equations, it sets up thirteen matrices. You see the pattern here. And then you can take the determinant of each of these matrices. Dividing the determinant of one of these matrices by another one tells you what value of x makes all the equations true. Dividing the determinant of another matrix by the determinant of one of these matrices tells you what value of y makes all the equations true. And so on. The Rule tells you which determinants to use. It also says what it means if the determinant you want to divide by equals zero. It means there’s either no set of simultaneous solutions or there’s infinitely many solutions.

This gets dropped on us students in the vain effort to convince us knowing how to calculate determinants is worth it. It’s not that determinants aren’t worth knowing. It’s just that they don’t seem to tell us anything we care about. Not until we get into mappings and calculus and differential equations and other mathematics-major stuff. We never see it in high school.

And the hard part of determinants is that for all the cool stuff they tell us, they take forever to calculate. The determinant for a matrix with two rows and two columns isn’t bad. Three rows and three columns is getting bad. Four rows and four columns is awful. The determinant for a matrix with five rows and five columns you only ever calculate if you’ve made your teacher extremely cross with you.

So there’s the genius of, and the first problem with, Cramer’s Rule. It takes a lot of calculating. Make any errors along the way with the calculation and your work is wrong. And worse, it won’t be wrong in an obvious way. You can find the error only by going over every single step and hoping to catch the spot where you, somehow, got 36 times -7 minus 21 times -8 wrong.

The second problem is nobody in high school algebra mentions why systems of linear equations should be interesting to solve. Oh, maybe they’ll explain how this is the work you do to figure out where two straight lines intersect. But that just shifts the “and we care because … ?” problem back one step. Later on we might come to understand the lines represent cases where something we’re interested in is true, or where it changes from true to false.

This sort of simultaneous-solution problem turns up naturally in optimization problems. These are problems where you try to find a maximum subject to some constraints. Or find a minimum. Maximums and minimums are the same thing when you think about them long enough. If all the constraints can be satisfied at once and you get a maximum (or minimum, whatever), great! If they can’t … Well, you can study how close it’s possible to get, and what happens if you loosen one or more constraint. That’s worth knowing about.

The third problem with Cramer’s Rule is that, as a method, it kind of sucks. We can be convinced that simultaneous linear equations are worth solving, or at least that we have to solve them to get out of High School Algebra. And we have computers. They can grind away and work out thirteen determinants of twelve-row-by-twelve-column matrices. They might even get an answer back before the end of the term. (The amount of work needed for a determinant grows scary fast as the matrix gets bigger.) But all that work might be meaningless.

The trouble is that Cramer’s Rule is numerically unstable. Before I even explain what that is you already sense it’s a bad thing. Think of all the good things in your life you’ve heard described as unstable. Fair enough. But here’s what we mean by numerically unstable.

Is 1/3 equal to 0.3333333? No, and we know that. But is it close enough? Sure, most of the time. Suppose we need a third of sixty million. 0.3333333 times 60,000,000 equals 19,999,998. That’s a little off of the correct 20,000,000. But I bet you wouldn’t even notice the difference if nobody pointed it out to you. Even if you did notice it you might write off the difference. “If we must, make up the difference out of petty cash”, you might declare, as if that were quite sensible in the context.

And that’s so because this multiplication is numerically stable. Make a small error in either term and you get a proportional error in the result. A small mistake will — well, maybe it won’t stay small, necessarily. But it’ll not grow too fast too quickly.

So now you know intuitively what an unstable calculation is. This is one in which a small error doesn’t necessarily stay proportionally small. It might grow huge, arbitrarily huge, and in few calculations. So your answer might be computed just fine, but actually be meaningless.

This isn’t because of a flaw in the computer per se. That is, it’s working as designed. It’s just that we might need, effectively, infinitely many digits of precision for the result to be correct. You see where there may be problems achieving that.

Cramer’s Rule isn’t guaranteed to be nonsense, and that’s a relief. But it is vulnerable to this. You can set up problems that look harmless but which the computer can’t do. And that’s surely the worst of all worlds, since we wouldn’t bother calculating them numerically if it weren’t too hard to do by hand.

(Let me direct the reader who’s unintimidated by mathematical jargon, and who likes seeing a good Wikipedia Editors quarrel, to the Cramer’s Rule Talk Page. Specifically to the section “Cramer’s Rule is useless.”)

I don’t want to get too down on Cramer’s Rule. It’s not like the numerical instability hurts every problem you might use it on. And you can, at the cost of some more work, detect whether a particular set of equations will have instabilities. That requires a lot of calculation but if we have the computer to do the work fine. Let it. And a computer can limit its numerical instabilities if it can do symbolic manipulations. That is, if it can use the idea of “one-third” rather than 0.3333333. The software package Mathematica, for example, does symbolic manipulations very well. You can shed many numerical-instability problems, although you gain the problem of paying for a copy of Mathematica.

If you just care about, or just need, one of the variables then what the heck. Cramer’s Rule lets you solve for just one or just some of the variables. That seems like a niche application to me, but it is there.

And the Rule re-emerges in pure analysis, where numerical instability doesn’t matter. When we look to differential equations, for example, we often find solutions are combinations of several independent component functions. Bases, in fact. Testing whether we have found independent bases can be done through a thing called the Wronskian. That’s a way that Cramer’s Rule appears in differential equations.
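
For a taste of that: the Wronskian of two functions f and g is the two-by-two determinant f g′ − f′ g, and if it is not identically zero the functions are linearly independent. A sketch with Python’s sympy, using e^x and e^{2x} as a made-up example:

    from sympy import symbols, exp, diff, simplify

    x = symbols('x')
    f, g = exp(x), exp(2*x)

    # The Wronskian: the determinant of [[f, g], [f', g']].
    W = simplify(f*diff(g, x) - diff(f, x)*g)
    print(W)    # exp(3*x), never zero: the two functions are independent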

Wikipedia also asserts the use of Cramer’s Rule in differential geometry. I believe that’s a true statement, and that it will be reflected in many mechanics problems. In these we can use our knowledge that, say, energy and angular momentum of a system are constant values to tell us something of how positions and velocities depend on each other. But I admit I’m not well-read in differential geometry. That’s something which has indeed caused me pain in my scholarly life. I don’t know whether differential geometers thank Cramer’s Rule for this insight or whether they’re just glad to have got all that out of the way. (See the above Wikipedia Editors quarrel.)

I admit for all this talk about Cramer’s Rule I haven’t said what it is. Not in enough detail to pass your high school algebra class. That’s all right. It’s easy to find. MathWorld has the rule in pretty simple form. MathWorld does forget to define what it means by the vector d. (It’s the vector with components d1, d2, et cetera.) But that’s enough technical detail. If you need to calculate something using it, you can probably look closer at the problem and see if you can do it another way instead. Or you’re in high school algebra and just have to slog through it. It’s all right. Eventually you can put x and y aside and do geometry.
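
And if code reads easier than MathWorld’s notation, here is a minimal sketch of the Rule in Python with numpy. Column i of the coefficient matrix gets swapped out for the right-hand-side vector d, and the ratio of determinants gives variable i. All the numerical-instability warnings above still apply; this is for playing, not for real work.

    import numpy as np

    def cramer(A, d):
        """Solve the linear system A x = d by Cramer's Rule."""
        A = np.asarray(A, dtype=float)
        d = np.asarray(d, dtype=float)
        det_A = np.linalg.det(A)
        if np.isclose(det_A, 0.0):
            raise ValueError("determinant is zero: no unique solution")
        x = np.empty(len(d))
        for i in range(len(d)):
            A_i = A.copy()
            A_i[:, i] = d    # replace column i with the right-hand side
            x[i] = np.linalg.det(A_i) / det_A
        return x

    # 2x + y = 5 and x - y = 1 have the simultaneous solution x = 2, y = 1:
    print(cramer([[2, 1], [1, -1]], [5, 1]))    # [2. 1.]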

Who Discovered Boyle’s Law?


Stigler’s Law is a half-joking principle of mathematics and scientific history. It says that scientific discoveries are never named for the person who discovered them. It’s named for the statistician Stephen Stigler, who asserted that the principle was discovered by the sociologist Robert K Merton.

If you study much scientific history you start to wonder if anything is named correctly. There are reasons why. Often it’s very hard to say exactly what the discovery is, especially if it’s something fundamental. Often the earliest reports of something are unclear, at least to later eyes. People’s attention falls on a person who did very well describing or who effectively publicized the discovery. Sometimes a discovery is just in the air, and many people have important pieces of it nearly simultaneously. And sometimes history just seems perverse. Pell’s Equation, for example, is named for John Pell, who did not discover it, did not solve it, and did not particularly advance our understanding of it. We seem to name it Pell’s because Pell had translated a book which included a solution of the problem into English, and Leonhard Euler mistakenly thought Pell had solved it.

The Carnot Cycle blog for this month is about a fine example of naming confusion. In this case it’s about Boyle’s Law. That’s one of the rules describing how gases work. It says that, if a gas is held at a constant temperature, and the amount of gas doesn’t change, then the pressure of the gas times its volume stays constant. Squeeze the gas into a smaller volume and it exerts more pressure on the container it’s in. Stretch it into a larger volume and it presses more weakly on the container.

Obvious? Perhaps. But it is a thing that had to be discovered. There’s a story behind that. Peter Mander explains some of its tale.

A Summer 2015 Mathematics A To Z: xor


Xor.

Xor comes to us from logic. In this field we look at propositions, which can be either true or false. Propositions serve the same role here that variables like “x” and “y” serve in algebra. They have some value. We might know what the value is to start with. We might be hoping to deduce what the value is. We might not actually care what the value is, but need a placeholder for it while we do other work.

A variable, or a proposition, can carry some meaning. The variable “x” may represent “the longest straight board we can fit around this corner”. The proposition “A” may represent “The blue house is the one for sale”. (Logic has a couple of conventions. In one we use capital letters from the start of the alphabet for propositions. In the other we use lowercase p’s and q’s and r’s and letters from that patch of the alphabet. This is a difference in dialect, not in content.) That’s convenient, since it can help us understand the meaning of a problem we’re working on, but it’s not essential. The process of solving an equation is the same whether or not the equation represents anything in the real world. So it is with logic.

We can combine propositions to make more interesting statements. If we know whether the propositions are true or false we know whether the statements are true. If we know starting out only that the statements are true (or false) we might be able to work out whether the propositions are true or false.

Xor, the exclusive or, is one of the common combinations. Start with the propositions A and B, both of which may be true or may be false. A Xor B is a true statement when A is true while B is false, or when A is false while B is true. It’s false when A and B are simultaneously false. It’s also false when A and B are simultaneously true.

It’s the logic of whether a light bulb on a two-way switch is on. If one switch is on and the other off, the bulb is on. If both switches are on, or both switches off, the bulb is off. This is also the logic of what’s offered when the menu says you can have french fries or onion rings with your sandwich. You can get both, but it’ll cost an extra 95 cents.
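
Programming languages carry xor around as a basic operation; Python spells it ^ for true-or-false values. A sketch of the whole truth table:

    # Xor is true exactly when its two inputs disagree.
    for A in (False, True):
        for B in (False, True):
            print(A, B, A ^ B)
    # False False False
    # False True True
    # True False True
    # True True False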

Reading the Comics, April 27, 2015: Anthropomorphic Mathematics Edition


They’re not running at the frantic pace of April 21st, but there’s still been a fair clip of comic strips that mention some kind of mathematical topic. I imagine Comic Strip Master Command wants to be sure to use as many of these jokes up as possible before the (United States) summer vacation sets in.

Dan Thompson’s Brevity (April 23) is a straightforward pun strip. It also shows a correct understanding of how to draw a proper Venn Diagram. And after all why shouldn’t an anthropomorphized Venn Diagram star in movies too?

John Atkinson’s Wrong Hands (April 23) gets into more comfortable territory with plain old numbers being anthropomorphized. The 1 is fair to call this a problem. What kind of problem depends on whether you read the x as a multiplication sign or as a variable x. If it’s a multiplication sign then I can’t think of any true statement that can be made from that bundle of symbols. If it’s the variable x then there are surprisingly many problems which could be made, particularly if you’re willing to count something like “x = 718” as a problem. I think that it works out to 24 problems but would accept contrary views. This one ended up being the most interesting to me once I started working out how many problems you could make with just those symbols. There’s a fun question for your combinatorics exam in that.

Continue reading “Reading the Comics, April 27, 2015: Anthropomorphic Mathematics Edition”

When 2 plus 2 Equals 5, plus Another Unsettling Equation


I just wanted to note for folks who don’t read The Straight Dope — the first two books of which were unimaginably important to the teenage me, hundreds of pages of neat stuff to know delivered in a powerful style, that overwhelmed even The People’s Almanac 2 if you can imagine — that the Straight Dope Science Advisory board tried to take on the question of Does 2 + 2 equal 5 for very large values of 2?

Straight Dope Staffer Dex takes the question a bit more literally than I have ever interpreted the joke to be. I’ve basically read it as just justifying a nonsense result with a nonsense explanation, fitting in the spectrum of comic answers somewhere between King Lear’s understanding of why there are seven stars in the Pleiades and classic 1940s style double-talk. But Dex uses the equation to point out how rounding and estimation, essential steps in translating between the real world and the mathematical representation of the world, can produce results which are correct at every step but wrong in the whole, which is worth considering.


Also, in a bit of reading I’m doing and which I might rip off^W^W use as inspiration for some posts around here the (British) author dropped in an equation meant to be unsettling and, yeah, this unsettles me. Let me know what you think:

3 \mbox{ feet } + 2 \mbox{ tons } = 36 \mbox{ inches } + 4480 \mbox{ pounds }

I should say it’s not like I’m going to have nightmares about that, but it feels off anyway.

Denominated Mischief


I’ve finally got around to reading one of my Christmas presents, Alfred S Posamentier and Ingmar Lehmann’s Magnificent Mistakes in Mathematics, which is about ways that mathematical reasoning can be led astray. A lot, at least in the early pages, is about the ways a calculation can be fouled by a bit of carelessness, especially things like dividing by zero, which seems like such an obvious mistake that who could make it once they’ve passed Algebra II?

They got to a most neat little erroneous calculation, though, and I wanted to share it since the flaw is not immediately obvious although the absurdity of the conclusion drives you to look for it. We begin with a straightforward problem that I think of as Algebra I-grade, though I admit my memories of taking Algebra I are pretty vague these days, so maybe I missed the target grade level by a year or two.

\frac{3x - 30}{11 - x} = \frac{x + 2}{x - 7} - 4

Multiply that 4 on the right-hand side by 1 — in this case, by \frac{x - 7}{x - 7} — and combine that into the numerator:

\frac{3x - 30}{11 - x} = \frac{x + 2 - 4(x - 7)}{x - 7}

Expand that parentheses and simplify the numerator on the right-hand side:

\frac{3x - 30}{11 - x} = \frac{3x - 30}{7 - x}

Since the fractions are equal, and the numerators are equal, therefore their denominators must be equal. Thus, 11 - x = 7 - x and therefore, 11 = 7.

Did you spot where the card got palmed there?

What Is True Almost Everywhere?


I was reading a thermodynamics book (C Truesdell and S Bharatha’s The Concepts and Logic of Classical Thermodynamics as a Theory of Heat Engines, which is a fascinating read, for the field, and includes a number of entertaining, for the field, snipes at the stuff textbook writers put in because they’re just passing on stuff without rethinking it carefully), and ran across a couple proofs which mentioned equations that were true “almost everywhere”. That’s a construction it might be surprising to know even exists in mathematics, so, let me take a couple hundred words to talk about it.

The idea isn’t really exotic. You’ve seen a kind of version of it when you see an equation containing the note that there’s an exception, such as, \frac{\left(x - 1\right)^2}{\left(x - 1\right)} = x \mbox{ for } x \neq 1 . If the exceptions are tedious to list — because there are many of them to write down, or because they’re wordy to describe (the thermodynamics book mentioned the exceptions were where a particular set of conditions on several differential equations happened simultaneously, if it ever happened) — and if they’re unlikely to come up, then, we might just write whatever it is we want to say and add an “almost everywhere”, or for shorthand, put an “ae” after the line. This “almost everywhere” will, except in freak cases, propagate through the rest of the proof, but I only see people writing that when they’re students working through the concept. In publications, the “almost everywhere” gets put in where the condition first stops being true everywhere-everywhere and becomes only almost-everywhere, and taken as read after that.

I introduced this with an equation, but it can apply to any relationship: something is greater than something else, something is less than or equal to something else, even something is not equal to something else. (After all, “x \neq -x” is true almost everywhere, but there is that nagging exception.) A mathematical proof is normally about things which are true. Whether one thing is equal to another is often incidental to that.

What’s meant by “unlikely to come up” is actually rigorously defined, which is why we can get away with this. It’s otherwise a bit daft to think we can just talk about things that are true except where they aren’t and not even post warnings about where they’re not true. If we say something is true “almost everywhere” on the real number line, for example, that means that the set of exceptions has a total length of zero. So if the only exception is where x equals 1, sure enough, that’s a set with no length. Similarly if the exceptions are where x equals positive 1 or negative 1, that’s still a total length of zero. But if the set of exceptions were all values of x from 0 to 4, well, that’s a set of total length 4 and we can’t say “almost everywhere” for that.
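
For the record, the rigorous version: a property holds almost everywhere when the set of exceptions has measure zero. Writing \mu for the measure — the careful version of the “total length” above — that’s

\mu\left(\left\{ x : \mbox{ the property fails at } x \right\}\right) = 0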

This is all quite like saying that it can’t happen that if you flip a fair coin infinitely many times it will come up tails every single time. It won’t, even though properly speaking there’s no reason that it couldn’t. If something is true almost everywhere, then your chance of picking an exception out of all the possibilities is about like your chance of flipping that fair coin and getting tails infinitely many times over.

Why A Line Doesn’t Have An Equation


[ To resume, after some interruptions — it’s been quite a busy few weeks — the linear interpolations that I had been talking about, I will need equations describing a line. ]

To say something is the equation representing a line is to lie in the article. It’s a little one, of the same order as pretending there’s just one answer to the question, “Who are you?” Who you are depends on context: you’re the person with this first-middle-last name combination. You’re the person with this first name. You’re the person with this nickname. You’re the third person in the phone queue for tech support. You’re the person with this taxpayer identification number. You’re the world’s fourth-leading expert on the Marvel “New Universe” line of comic books, and sorry for that. You’re the person who ordered two large-size fries at Five Guys Burgers And Fries and will soon learn you’ll never live long enough to eat them all. You’re the person who knows how to get the sink in the break room at work to stop dripping. These may all be correct, but depending on the context some of these answers are irrelevant, and maybe one or two of them is useful, or at least convenient. So it is with equations for a line: there are many possible equations. Some of them are just more useful, or even convenient.

Continue reading “Why A Line Doesn’t Have An Equation”

Hopefully, Saying Something True


I wanted to talk about drawing graphs that represent something, and to get there have to say what kinds of things I mean to represent. The quick and expected answer is that I mean to represent some kind of equation, such as “y = 3*x – 2” or “x^2 + y^2 = 4”, and that probably does come up the most often. We might also be interested in representing an inequality, something like “x^2 – 2 y^2 ≤ 1”. On occasion we’re interested just in the region where something is not true, saying something like “y ≠ 3 – x”. (I’ve used nice small counting numbers here not out of any interest in these numbers, or because larger ones or non-whole numbers or even irrational numbers don’t work, but because there is something pleasantly reassuring about seeing a “1” or a “2” in an equation. We strongly believe we know what we mean by “1”.)

Anyway, what we’ve written down is something describing a relationship which we are willing to suppose is true. We might not know what x or y are, and we might not care, but at least for the length of the problem we will suppose that the number represented by y must be equal to three times whatever number is represented by x and minus two. There might be only a single value of x we find interesting; there might be several; there might be infinitely many such values. There’ll be a corresponding number of y’s, at least, so long as the equation is true.

Sometimes we’ll turn the description in terms of an equation into a description in terms of a graph right away. Some of these descriptions are like those of a line — the “y = 3*x – 2” equation — or a simple shape — “x^2 + y^2 = 4” is a circle — in that we can turn them into graphs right away without having to process them, at least not once we’re familiar and comfortable with the idea of graphing. Some of these descriptions are going to be in awkward forms. “x + 2 = – y^2 / x + 2 y / x” is really just an awkward way to describe a circle (more or less), but that shape is hidden in the writing.

Continue reading “Hopefully, Saying Something True”

Before Drawing a Graph


I want to talk about drawing graphs, specifically, drawing curves on graphs. We know roughly what’s meant by that: it’s about wiggly shapes with a faint rectangular grid, usually in grey or maybe drawn in dotted lines, behind them. Sometimes the wiggly shapes will be in bright colors, to clarify a complicated figure or to justify printing the textbook in color. Those graphs.

I clarify because there is a type of math called graph theory in which, yes, you might draw graphs, but there what’s meant by a graph is just any sort of group of points, called vertices, connected by lines or curves. It makes great sense as a name, but it’s not what someone who talks about drawing a graph means, up until graph theory gets into consideration. Those graphs are fun, particularly because they’re insensitive to exactly where the vertices are, so you get to exercise some artistic talent instead of figuring out whatever you were trying to prove in the problem.

The ordinary kind of graphs offer some wonderful advantages. The obvious one is that they’re pictures. People can very often understand a picture of something much faster than they can understand other sorts of descriptions. This probably doesn’t need any demonstration; if it does, try looking at a map of the boundaries of South Carolina versus reading a description of its boundaries. Some problems are much easier to work out if we can approach them as geometric problems. (And I admit feeling a particular delight when I can prove a problem geometrically; it feels cleverer.)

Continue reading “Before Drawing a Graph”

In Case Of Sudden Failure Of Planet Earth


Have you ever figured out just exactly what you would do if the Earth were to suddenly disappear from the universe, leaving just you and whatever’s around to fall towards whatever the nearest heavenly bodies are? No, me neither. Asked to improvise one, I suppose I’d suffocate within minutes and then everything else becomes not so interesting to me, although possibly my heirs might be interested, if they’re somewhere.

My mother accidentally got me thinking about this fate. She’s taking a course about the science and world view of the Ancient Egyptians. (In talking about it she asked if I knew anything about the science of the Ancient Egyptians, and I tried to say I didn’t really know more than the average lay reader might, although when I got to mentioning that I knew of their awareness of the Sothic Cycle and where we get the name “Sothic Cycle” I realized that I can’t really call myself ignorant about the science of the Ancient Egyptians.) But the class had reached the point where astrological beliefs of the ancients were under discussion, and apparently some of the students insisted that astrology really and truly works, and my mother wanted me to figure out how strong the force of gravity of the Moon is, compared to the force of gravity of another person in the same room. This would allow her to go into class armed with numbers which have never dissuaded anyone from believing in astrology, but, it’s fun for someone.

I did double-check, though, that she meant the gravitational pull of the Moon, rather than its tidal pull. The shorthand reason for this is that arguments for astrology having some physical basis tend to run along the lines of, the Moon creates the tides (the Sun does too, but smaller ones), tides are made of water (rock moves, too, although much less), human bodies are mostly water (I don’t know what the fluid properties of cytoplasm are, but I’m almost curious enough to look them up), so there must be something tide-like in human bodies too (so there). The gravitational pull of the Moon, meanwhile, doesn’t really mean much: the Moon is going to accelerate the Earth and the people standing on it by just about the same amount. The force of gravity between two objects grows with the two objects’ masses, and the Earth is more massive than any person on it. But this means the Earth feels a greater force pulling it towards the Moon, and the acceleration works out to be just the same. The force of gravity between two objects falls off as the square of the distance between them, and the people on the surface of the Earth are a little bit closer or a little bit farther away from the Moon than the center of the Earth is, but that’s not very different considering just how far away the Moon is. We spend all our lives falling into the Moon, as fast as we possibly can, and we are falling into the Moon as fast as the Earth is.

Continue reading “In Case Of Sudden Failure Of Planet Earth”
