The Summer 2017 Mathematics A To Z: Young Tableau


I never heard of today’s entry topic three months ago. Indeed, three weeks ago I was still making guesses about just what Gaurish, author of For the love of Mathematics, was asking about. It turns out to be maybe the grand union of everything that’s ever been in one of my A To Z sequences. I overstate, but barely.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Young Tableau.

The specific thing that a Young Tableau is, is beautiful in its simplicity. It could almost be a recreational mathematics puzzle, except that it isn’t challenging enough.

Start with a couple of boxes laid in a row. As many or as few as you like.

Now set another row of boxes. You can have as many as the first row did, or fewer. You just can’t have more. Set the second row of boxes — well, your choice. Either below the first row, or else above. I’m going to assume you’re going below the first row, and will write my directions accordingly. If you do things the other way you’re following a common enough convention. I’m leaving it on you to figure out what the directions should be, though.

Now add in a third row of boxes, if you like. Again, as many or as few boxes as you like. There can’t be more than there are in the second row. Set it below the second row.

And a fourth row, if you want four rows. Again, no more boxes in it than the third row had. Keep this up until you’ve got tired of adding rows of boxes.

How many boxes do you have? I don’t know. But take the numbers 1, 2, 3, 4, 5, and so on, up to whatever the count of your boxes is. Can you fill in one number for each box? So that the numbers are always increasing as you go left to right in a single row? And as you go top to bottom in a single column? Yes, of course. Go in order: ‘1’ for the first box you laid down, then ‘2’, then ‘3’, and so on, increasing up to the last box in the last row.

Can you do it in another way? Any other order?

Except for the simplest of arrangements, like a single row of four boxes or three rows of one box atop another, the answer is yes. There can be many of them, turns out. Seven boxes, arranged three in the first row, two in the second, one in the third, and one in the fourth, have 35 possible arrangements. It doesn’t take a very big diagram to get an enormous number of possibilities. Could be fun drawing an arbitrary stack of boxes and working out how many arrangements there are, if you have some time in a dull meeting to pass.
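If you’ve got a computer handy instead of a dull meeting, the counting can be automated. Here’s a minimal sketch in Python. It leans on the hook length formula, a standard counting result I haven’t described above; the function name and the shape fed into it are my own choices, just for illustration.

from math import factorial

def standard_tableaux_count(shape):
    # Number of ways to fill the boxes with 1, 2, ..., n so that rows and
    # columns always increase: n! divided by the product of every box's
    # "hook length" (the box itself plus the boxes to its right and below it).
    n = sum(shape)
    hook_product = 1
    for i, row_length in enumerate(shape):
        for j in range(row_length):
            arm = row_length - j - 1                      # boxes to the right
            leg = sum(1 for r in shape[i + 1:] if r > j)  # boxes below
            hook_product *= arm + leg + 1
    return factorial(n) // hook_product

# Three boxes in the first row, two in the second, one each in the last two rows:
print(standard_tableaux_count((3, 2, 1, 1)))   # prints 35

Feed it whatever arbitrary stack of rows you were going to doodle and it does the dull-meeting work for you.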

Let me step away from filling boxes. In one of its later, disappointing, seasons Futurama finally did a body-swap episode. The gimmick: two bodies could only swap the brains within them one time. So would it be possible to put Bender’s brain back in his original body, if he and Amy (or whoever) had already swapped once? The episode drew minor amusement in mathematics circles, and a lot of amazement in pop-culture circles. The writer, a mathematics major, found a proof that showed it was indeed always possible, even after many pairs of people had swapped bodies. The idea that a theorem was created for a TV show impressed many people who think theorems are rarer and harder to create than they necessarily are.

It was a legitimate theorem, and in a well-developed field of mathematics. It’s about permutation groups. These are the study of the ways you can swap pairs of things. I grant this doesn’t sound like much of a field. There is a surprising lot of interesting things to learn just from studying how stuff can be swapped, though. It’s even of real-world relevance. Most subatomic particles of a kind — electrons, top quarks, gluons, whatever — are identical to every other particle of the same kind. Physics wouldn’t work if they weren’t. What would happen if we swap the electron on the left for the electron on the right, and vice-versa? How would that change our physics?

A chunk of quantum mechanics studies what kinds of swaps of particles would produce an observable change, and what kind of swaps wouldn’t. When the swap doesn’t make a change we can describe this as a symmetric operation. When the swap does make a change, that’s an antisymmetric operation. And — the Young Tableau that’s a single row of two boxes? That matches up well with this symmetric operation. The Young Tableau that’s two rows of a single box each? That matches up with the antisymmetric operation.

How many ways could you set up three boxes, according to the rules of the game? A single row of three boxes, sure. One row of two boxes and a row of one box. Three rows of one box each. How many ways are there to assign the numbers 1, 2, and 3 to those boxes, and satisfy the rules? One way to do the single row of three boxes. Also one way to do the three rows of a single box. There’s two ways to do the one-row-of-two-boxes, one-row-of-one-box case.

What if we have three particles? How could they interact? Well, all three could be symmetric with each other. This matches the first case, the single row of three boxes. All three could be antisymmetric with each other. This matches the three rows of one box. Or you could have two particles that are symmetric with each other and antisymmetric with the third particle. Or two particles that are antisymmetric with each other but symmetric with the third particle. Two ways to do that. Two ways to fill in the one-row-of-two-boxes, one-row-of-one-box case.

This isn’t merely a neat, aesthetically interesting coincidence. I wouldn’t spend so much time on it if it were. There’s a matching here that’s built on something meaningful. The different ways to arrange numbers in a set of boxes like this pair up with a select, interesting set of matrices whose elements are complex-valued numbers. You might wonder who introduced complex-valued numbers, let alone matrices of them, into evidence. Well, who cares? We’ve got them. They do a lot of work for us. So much work they have a common name: they’re the representations of the symmetric group over the complex numbers. As my leading example suggests, they’re all over the place in quantum mechanics. They’re good to have around in regular physics too, at least in the right neighborhoods.

These Young Tableaus turn up over and over in group theory. They match up with polynomials, because yeah, everything is polynomials. But they turn out to describe polynomial representations of some of the superstar groups out there. Groups with names like the General Linear Group (square matrices), or the Special Linear Group (square matrices with determinant equal to 1), or the Special Unitary Group (that thing where quantum mechanics says there have to be particles whose names are obscure Greek letters with superscripts of up to five + marks). If you’d care for more, here’s a chapter by Dr Frank Porter describing, in part, how you get from Young Tableaus to the obscure baryons.

Porter’s chapter also lets me tie this back to tensors. Tensors have varied ranks, the number of different indices you can have on the things. What happens when you swap pairs of indices in a tensor? How many ways can you swap them, and what does that do to what the tensor describes? Please tell me you already suspect this is going to match something in Young Tableaus. It does, by way of the symmetries and permutations mentioned above. I won’t work through the details here. But they are there.

As I say, three months ago I had no idea these things existed. If I ever ran across them it was from seeing the name at MathWorld’s list of terms that start with ‘Y’. The article shows some nice examples (with each row set atop the previous one) but doesn’t make clear how much stuff this subject runs through. I can’t fit everything into a reasonable essay. (For example: the number of ways to arrange, say, 20 boxes into rows meeting these rules is itself a partition problem. Partition problems are probability and statistical mechanics. Statistical mechanics is the flow of heat, and the movement of the stars in a galaxy, and the chemistry of life.) I am delighted by what does fit.


The Summer 2017 Mathematics A To Z: Sárközy’s Theorem


Gaurish, of For the love of Mathematics, gives me another chance to talk number theory today. Let’s see how that turns out.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Sárközy’s Theorem.

I have two pieces to assemble for this. One is in factors. We can take any counting number, a positive whole number, and write it as the product of prime numbers. 2038 is equal to the prime 2 times the prime 1019. 4312 is equal to 2 raised to the third power times 7 raised to the second times 11. 1040 is 2 to the fourth power times 5 times 13. 455 is 5 times 7 times 13.

There are many ways to divide up numbers like this. Here’s one. Is there a square number among its factors? 2038 and 455 don’t have any. They’re each a product of prime numbers that are never repeated. 1040 has a square among its factors. 2 times 2 divides into 1040. 4312, similarly, has a square: we can write it as 2 squared times 2 times 7 squared times 11. So that is my first piece. We can divide counting numbers into squarefree and not-squarefree.

The other piece is in binomial coefficients. These are numbers, often quite big numbers, that get dumped on the high school algebra student as she tries to work with some expression like (a + b)^n . They’re also dumped on the poor student in calculus, as something about Newton’s binomial coefficient theorem. Which we hear is something really important. In my experience it wasn’t explained why this should rank up there with, like, the differential calculus. (Spoiler: it’s because of polynomials.) But it’s got some great stuff to it.

Binomial coefficients are among those utility players in mathematics. They turn up in weird places. In dealing with polynomials, of course. They also turn up in combinatorics, and through that, probability. If you run, for example, 10 experiments each of which could succeed or fail, the chance you’ll get exactly five successes is going to be proportional to one of these binomial coefficients. That they touch on polynomials and probability is a sign we’re looking at a thing woven into the whole universe of mathematics. We saw them some in talking, last A-To-Z around, about Yang Hui’s Triangle. That’s also known as Pascal’s Triangle. It has more names too, since it’s been found many times over.

The theorem under discussion is about central binomial coefficients. These are one specific coefficient in each row: the ones that appear, in the triangle, along the line of symmetry. They’re easy to describe in formulas. For a whole number ‘n’ that’s greater than or equal to zero, evaluate what we call 2n choose n:

{{2n} \choose{n}} =  \frac{(2n)!}{(n!)^2}

If ‘n’ is zero, this number is \frac{0!}{(0!)^2} or 1. If ‘n’ is 1, this number is \frac{2!}{(1!)^2} or 2. If ‘n’ is 2, this number is \frac{4!}{(2!)^2} or 6. If ‘n’ is 3, this number is (sparing the formula) 20. The numbers keep growing. 70, 252, 924, 3432, 12870, and so on.

So. 1 and 2 and 6 are squarefree numbers. Not much arguing that. But 20? That’s 2 squared times 5. 70? 2 times 5 times 7. 252? 2 squared times 3 squared times 7. 924? That’s 2 squared times 3 times 7 times 11. 3432? 2 cubed times 3 times 11 times 13; there’s a 2 squared in there. 12870? 2 times 3 squared times it doesn’t matter anymore. It’s not a squarefree number.

There’s a bunch of not-squarefree numbers in there. The question: do we ever stop seeing squarefree numbers here?
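We can at least watch the pattern by computer for a while. A minimal sketch in Python; the helper function and its name are mine, just to check each coefficient for a repeated prime factor.

from math import comb

def squarefree(n):
    # True if no prime squared divides n.
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        while n % d == 0:
            n //= d
        d += 1
    return True

for n in range(9):
    c = comb(2 * n, n)
    print(n, c, "squarefree" if squarefree(c) else "not squarefree")
# Only n = 0, 1, 2, and 4 (coefficients 1, 2, 6, and 70) come out squarefree.

Running it further just turns up more and more not-squarefree coefficients, which is suggestive but, of course, proves nothing.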

So here’s Sárközy’s Theorem. It says that this central binomial coefficient {{2n} \choose{n}} is never squarefree as long as ‘n’ is big enough. András Sárközy showed in 1985 that this was true. How big is big enough? … We have a bound, at least, for this theorem. If ‘n’ is larger than the number 2^{8000} then the corresponding coefficient can’t be squarefree. It might not surprise you that the formulas involved here feature the Riemann Zeta function. That always seems to turn up for questions about large prime numbers.

That’s a common state of affairs for number theory problems. Very often we can show that something is true for big enough numbers. I’m not sure there’s a clear reason why. When numbers get large enough it can be more convenient to deal with their logarithms, I suppose. And those look more like the real numbers than the integers. And real numbers are typically easier to prove stuff about. Maybe that’s it. This is vague, yes. But to ask ‘why’ some things are easy and some are hard to prove is a hard question. What is a satisfying ’cause’ here?

It’s tempting to say that since we know this is true for all ‘n’ above a bound, we’re done. We can just test all the numbers below that bound, and the rest is done. You can do a satisfying proof this way: show that eventually the statement is true, and show all the special little cases before it is. This particular result is kind of useless, though. 2^{8000} is a number that’s something like 2,400 digits long. For comparison, the total number of things in the universe is something like a number about 80 digits long. Certainly not more than 90. It’d take too long to test all those cases.

That’s all right. Since Sárközy’s proof in 1985 there’ve been other breakthroughs. In 1988 P Goetgheluck proved it was true for a big range of numbers: every ‘n’ that’s larger than 4 and less than 2^{42,205,184} . That’s a number something more than 12 million digits long. In 1991 I Vardi proved we had no squarefree central binomial coefficients for ‘n’ greater than 4 and less than 2^{774,840,978} , which is a number about 233 million digits long. And then in 1996 Andrew Granville and Olivier Ramare showed directly that this was so for all ‘n’ larger than 4.

So that 70 that turned up just a few lines in is the last squarefree one of these coefficients.

Is this surprising? Maybe, maybe not. I’ll bet most of you didn’t have an opinion on this topic twenty minutes ago. Let me share something that did surprise me, and continues to surprise me. In 1974 David Singmaster proved that any integer divides almost all the binomial coefficients out there. “Almost all” is here a term of art, but it means just about what you’d expect. Imagine the giant list of all the numbers that can be binomial coefficients. Then pick any positive integer you like. The number you picked will divide into so many of the giant list that the exceptions won’t be noticeable. So that square numbers like 4 and 9 and 16 and 25 should divide into most binomial coefficients? … That’s to be expected, suddenly. Into the central binomial coefficients? That’s not so obvious to me. But then so much of number theory is strange and surprising and not so obvious.

The Summer 2017 Mathematics A To Z: Morse Theory


Today’s A To Z entry is a change of pace. It dives deeper into analysis than this round has been. The term comes from Mr Wu, of the Singapore Maths Tuition blog, whom I thank for the request.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Morse Theory.

An old joke, as most of my academia-related ones are. The young scholar says to his teacher how amazing it was in the old days, when people were foolish, and thought the Sun and the Stars moved around the Earth. How fortunate we are to know better. The elder says, ah yes, but what would it look like if it were the other way around?

There are many things to ponder packed into that joke. For one, the elder scholar’s awareness that our ancestors were no less smart or perceptive or clever than we are. For another, the awareness that there is a problem. We want to know about the universe. But we can only know what we perceive now, where we are at this moment. Even a note we’ve written in the past, or a message from a trusted friend, we can’t take uncritically. What we know is that we perceive this information in this way, now. When we pay attention to our friends in the philosophy department we learn that knowledge is even harder than we imagine. But I’ll stop there. The problem is hard enough already.

We can put it in a mathematical form, one that seems immune to many of the worst problems of knowledge. In this form it looks something like this: what can we know about the universe, if all we really know is what things in that universe are doing near us? The things that we look at are functions. The universe we’re hoping to understand is the domain of the functions. One filter we use to see the universe is Morse Theory.

We don’t look at every possible function. Functions are too varied and weird for that. We look at functions whose range is the real numbers. And they must be smooth. This is a term of art. It means the function has derivatives. It has to be continuous. It can’t have sharp corners. And it has to have lots of derivatives. The first derivative of a smooth function has to also be continuous, and has to also lack corners. And the derivative of that first derivative has to be continuous, and to lack corners. And the derivative of that derivative has to be the same. A smooth function can be differentiated over and over again, infinitely many times. None of those derivatives can have corners or jumps or missing patches or anything. This is what makes it smooth.

Most functions are not smooth, in much the same way most shapes are not circles. That’s all right. There are many smooth functions anyway, and they describe things we find interesting. Or we think they’re interesting, anyway. Smooth functions are easy for us to work with, and to know things about. There’s plenty of smooth functions. If you’re interested in something else there’s probably a smooth function that’s close enough for practical use.

Morse Theory builds on the “critical points” of these smooth functions. A critical point, in this context, is one where the derivative is zero. Derivatives being zero usually signal something interesting going on. Often they show where the function changes behavior. In freshman calculus they signal where a function changes from increasing to decreasing, so the critical point is a maximum. In physics they show where a moving body no longer has an acceleration, so the critical point is an equilibrium. Or where a system changes from one kind of behavior to another. And here — well, many things can happen.

So take a smooth function. And take a critical point that it’s got. (And, erg. Technical point. The derivative of your smooth function, at that critical point, shouldn’t be having its own critical point going on at the same spot. That makes stuff more complicated.) It’s possible to approximate your smooth function near that critical point with, of course, a polynomial. It’s always polynomials. The shape of these polynomials gives you an index for these points. And that can tell you something about the shape of the domain you’re on.

At least, it tells you something about what the shape is where you are. The universal model for this — based on skimming texts and papers and popularizations of this — is of a torus standing vertically. Like a doughnut that hasn’t tipped over, or like a tire on a car that’s working as normal. I suspect this is the best shape to use for teaching, as anyone can understand it while it still shows the different behaviors. I won’t resist.

Imagine slicing this tire horizontally. Slice it close to the bottom, below the central hole, and the part that drops down is a disc. At least, it could be flattened out tolerably well to a disc.

Slice it somewhere that intersects the hole, though, and you have a different shape. You can’t squash that down to a disc. You have a noodle shape. A cylinder at least. That’s different from what you got from the first slice.

Slice the tire somewhere higher. Somewhere above the central hole, and you have … well, it’s still a tire. It’s got a hole in it, but you could imagine patching it and driving on. There’s another different shape that we’ve gotten from this.

Imagine we were confined to the surface of the tire, but did not know what surface it was. That we start at the lowest point on the tire and ascend it. From the way the smooth functions around us change we can tell how the surface we’re on has changed. We can see its change from “basically a disc” to “basically a noodle” to “basically a doughnut”. We could work out what the surface we’re on has to be, thanks to how these smooth functions around us change behavior.
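Here’s a minimal sketch of that in Python, using sympy to do the calculus. The function is just the height of each point on an upright torus; the radii 2 and 1 are arbitrary picks of mine. The index of each critical point is the count of negative eigenvalues of the matrix of second derivatives, which is the polynomial-shape business from a few paragraphs back.

import sympy as sp

u, v = sp.symbols('u v', real=True)
R, r = 2, 1   # big radius and tube radius of the upright torus (arbitrary picks)

# Height of a point on the torus: u is the angle around the central circle,
# v is the angle around the tube.
height = (R + r * sp.cos(v)) * sp.sin(u)

H = sp.hessian(height, (u, v))   # matrix of second derivatives

# The four points where both first derivatives are zero: bottom, two saddles, top.
critical_points = [(-sp.pi/2, 0), (-sp.pi/2, sp.pi), (sp.pi/2, sp.pi), (sp.pi/2, 0)]

for cu, cv in critical_points:
    eigenvalues = H.subs({u: cu, v: cv}).eigenvals()
    index = sum(mult for value, mult in eigenvalues.items() if value < 0)
    print("height", height.subs({u: cu, v: cv}), "has index", index)
# The indices come out 0, 1, 1, and 2, matching the disc, the two hole-crossing
# slices, and the full doughnut.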

Occasionally we mathematical-physics types want to act as though we’re not afraid of our friends in the philosophy department. So we deploy the second thing we know about Immanuel Kant. He observed that knowing the force of gravity falls off as the square of the distance between two things implies that the things should exist in a three-dimensional space. (Source: I dunno, I never read his paper or book or whatever and dunno I ever heard anyone say they did.) It’s a good observation. Geometry tells us what physics can happen, but what physics does happen tells us what geometry they happen in. And it tells the philosophy department that we’ve heard of Immanuel Kant. This impresses them greatly, we tell ourselves.

Morse Theory is a manifestation of how observable physics teaches us the geometry they happen on. And in an urgent way, too. Some of Edward Witten’s pioneering work in superstring theory was in bringing Morse Theory to quantum field theory. He showed a set of problems called the Morse Inequalities gave us insight into supersymmetric quantum mechanics. The link between physics and doughnut-shapes may seem vague. This is because you’re not remembering that mathematical physics sees “stuff happening” as curves drawn on shapes which represent the kind of problem you’re interested in. Learning what the shapes representing the problem look like is solving the problem.

If you’re interested in the substance of this, the universally-agreed reference is J Milnor’s 1963 text Morse Theory. I confess it’s hard going to read, because it’s a symbols-heavy textbook written before the existence of LaTeX. Each page reminds one why typesetters used to get hazard pay, and not enough of it.

The Summer 2017 Mathematics A To Z: L-function


I’m brought back to elliptic curves today thanks to another request from Gaurish, of the For The Love Of Mathematics blog. Interested in how that’s going to work out? Me too.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

So stop me if you’ve heard this one before. We’re going to make something interesting. You bring to it a complex-valued number. Anything you like. Let me call it ‘s’ for the sake of convenience. I know, it’s weird not to call it ‘z’, but that’s how this field of mathematics developed. I’m going to make a series built on this. A series is the sum of all the terms in a sequence. I know, it seems weird for a ‘series’ to be a single number, but that’s how that field of mathematics developed. The underlying sequence? I’ll make it in three steps. First, I start with all the counting numbers: 1, 2, 3, 4, 5, and so on. Second, I take each one of those terms and raise them to the power of your ‘s’. Third, I take the reciprocal of each of them. That’s the sequence. And when we add —

Yes, that’s right, it’s the Riemann-Zeta Function. The one behind the Riemann Hypothesis. That’s the mathematical conjecture that everybody loves to cite as the biggest unsolved problem in mathematics now that we know someone did something about Fermat’s Last Theorem. The conjecture is about what the zeroes of this function are. What values of ‘s’ make this sum equal to zero? Some boring ones. Zero, negative two, negative four, negative six, and so on. It has a lot of non-boring zeroes. All the ones we know of have an ‘s’ with a real part of ½. So far we know of at least 36 billion values of ‘s’ that make this add up to zero. They’re all ½ plus some imaginary number. We conjecture that this isn’t coincidence and all the non-boring zeroes are like that. We might be wrong. But it’s the way I would bet.

Anyone who’d be reading this far into a pop mathematics blog knows something of why the Riemann Hypothesis is interesting. It carries implications about prime numbers. It tells us things about a host of other theorems that are nice to have. Also they know it’s hard to prove. Really, really hard.

Ancient mathematical lore tells us there are a couple ways to solve a really, really hard problem. One is to narrow its focus. Try to find as simple a case of it as you can solve. Maybe a second simple case you can solve. Maybe a third. This could show you how, roughly, to solve the general problem. Not always. Individual cases of Fermat’s Last Theorem are easy enough to solve. You can show that a^3 + b^3 = c^3 doesn’t have any non-boring answers where a, b, and c are all positive whole numbers. Same with a^5 + b^5 = c^5 , though it takes longer. That doesn’t help you with the general a^n + b^n = c^n .

There’s another approach. It sounds like the sort of crazy thing Captain Kirk would get away with. It’s to generalize, to make a bigger, even more abstract problem. Sometimes that makes it easier.

For the Riemann-Zeta Function there’s one compelling generalization. It fits into that sequence I described making. After taking the reciprocals of integers-raised-to-the-s-power, multiply each by some number. Which number? Well, that depends on what you like. It could be the same number every time, if you like. That’s boring, though. That’s just the Riemann-Zeta Function times your number. It’s more interesting if what number you multiply by depends on which integer you started with. (Do not let it depend on ‘s’; that’s more complicated than you want.) When you do that? Then you’ve created an L-Function.

Specifically, you’ve created a Dirichlet L-Function. Dirichlet here is Peter Gustav Lejeune Dirichlet, a 19th century German mathematician who got his name on like everything. He did major work on partial differential equations, on Fourier series, on topology, in algebra, and on number theory, which is what we’d call these L-functions. There are other L-Functions, with identifying names such as Artin and Hecke and Euler, which get more directly into group theory. They look much like the Dirichlet L-Function. In building the sequence I described in the top paragraph, they do something else for the second step.

The L-Function is going to look like this:

L(s) = \sum_{n \ge 1}^{\infty} a_n \cdot \frac{1}{n^s}

The sigma there means to evaluate the thing that comes after it for each value of ‘n’ starting at 1 and increasing, by 1, up to … well, something infinitely large. The a_n are the numbers you’ve picked. They’re some value that depend on the index ‘n’, but don’t depend on the power ‘s’. This may look funny but it’s a standard way of writing the terms in a sequence.
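Here’s a quick numerical sketch of that, in Python. The coefficients are arbitrary picks of mine: taking every a_n to be 1 gives back the Riemann-Zeta Function, and taking them to cycle through 1, 0, -1, 0 gives one of the classic Dirichlet L-Functions.

def L(s, coefficient, terms=100_000):
    # Partial sum of the series: add up a_n / n^s for n from 1 to `terms`.
    return sum(coefficient(n) / n**s for n in range(1, terms + 1))

zeta_coefficients = lambda n: 1                     # every a_n equal to 1
chi4_coefficients = lambda n: (0, 1, 0, -1)[n % 4]  # a_n cycling 1, 0, -1, 0

print(L(2, zeta_coefficients))   # about 1.6449, which is pi^2 / 6
print(L(2, chi4_coefficients))   # about 0.9160, Catalan's constant

Cutting the sum off after finitely many terms is an approximation, of course; the actual L-Function is what these partial sums settle toward.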

An L-Function has to meet some particular criteria that I’m not going to worry about here. Look them up before you get too far into your research. These criteria give us ways to classify different L-Functions, though. We can describe them by degree, much as we describe polynomials. We can describe them by signature, part of those criteria I’m not getting into. We can describe them by properties of the extra numbers, the ones in that fourth step that you multiply the reciprocals by. And so on. LMFDB, an encyclopedia of L-Functions, lists eight or nine properties usable for a taxonomy of these things. (The ambiguity is in what things you consider to depend on what other things.)

What makes this interesting? For one, everything that makes the Riemann Hypothesis interesting. The Riemann-Zeta Function is a slice of the L-Functions. But there’s more. They merge into elliptic curves. Every elliptic curve corresponds to some L-Function. We can use the elliptic curve or the L-Function to prove what we wish to show. Elliptic curves are subject to group theory; so, we can bring group theory into these series.

And then it gets deeper. It always does. Go back to that formula for the L-Function like I put in mathematical symbols. I’m going to define a new function. It’s going to look a lot like a polynomial. Well, that L(s) already looked a lot like a polynomial, but this is going to look even more like one.

Pick a number τ. It’s complex-valued. Any number. All that I care is that its imaginary part be positive. In the trade we say that’s “in the upper half-plane”, because we often draw complex-valued numbers as points on a plane. The real part serves as the horizontal axis and the imaginary part serves as the vertical axis.

Now go back to your L-Function. Remember those a_n numbers you picked? Good. I’m going to define a new function based on them. It looks like this:

f(\tau) = \sum_{n \ge 1}^{\infty} a_n \left(  e^{2 \pi \imath \tau}\right)^n

You see what I mean about looking like a polynomial? If τ is a complex-valued number, then e^{2 \pi \imath \tau} is just another complex-valued number. If we gave that a new name like ‘z’, this function would look like the sum of constants times z raised to positive powers. We’d never know it was any kind of weird polynomial.

Anyway. This new function ‘f(τ)’ has some properties. It might be something called a weight-2 Hecke eigenform, a thing I am not going to explain without charging someone by the hour. But see the logic here: every elliptic curve matches with some kind of L-Function. Each L-Function matches with some ‘f(τ)’ kind of function. Those functions might or might not be these weight-2 Hecke eigenforms.

So here’s the thing. There was a big hypothesis formed in the 1950s that every rational elliptic curve matches to one of these ‘f(τ)’ functions that’s one of these eigenforms. It’s true. It took decades to prove. You may have heard of it, as the Taniyama-Shimura Conjecture. In the 1990s Wiles and Taylor proved this was true for a lot of elliptic curves, which is what proved Fermat’s Last Theorem after all that time. The rest of it was proved around 2000.

As I said, sometimes you have to make your problem bigger and harder to get something interesting out of it.

I mentioned this above. LMFDB is a fascinating site worth looking at. It’s got a lot of L-Function and Riemann-Zeta function-related materials.

The Summer 2017 Mathematics A To Z: Height Function (elliptic curves)


I am one letter closer to the end of Gaurish’s main block of requests. They’re all good ones, mind you. This gets me back into elliptic curves and Diophantine equations. I might be writing about the wrong thing.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Height Function.

My love’s father has a habit of asking us to rate our hobbies. This turned into a new running joke over a family vacation this summer. It’s a simple joke: I shuffled the comparables. “Which is better, Bon Jovi or a roller coaster?” It’s still a good question.

But as genial yet nasty as the spoof is, my love’s father asks natural questions. We always want to compare things. When we form a mathematical construct we look for ways to measure it. There’s typically something. We’ll put one together. We call this a height function.

We start with an elliptic curve. The coordinates of the points on this curve satisfy some equation. Well, there are many equations they satisfy. We pick one representation for convenience. The convenient thing is to have an easy-to-calculate height. We’ll write the equation for the curve as

y^2 = x^3 + Ax + B

Here both ‘A’ and ‘B’ are some integers. This form might be unique, depending on whether a slightly fussy condition on prime numbers holds. (Specifically, if ‘p’ is a prime number and ‘p^4’ divides into ‘A’, then ‘p^6’ must not divide into ‘B’. Yes, I know you realized that right away. But I write to a general audience, some of whom are learning how to see these things.) Then the height of this curve is whichever is the larger number, four times the cube of the absolute value of ‘A’, or 27 times the square of ‘B’. I ask you to just run with it. I don’t know the implications of the height function well enough to say why, oh, 25 times the square of ‘B’ wouldn’t do as well. The usual reason for something like that is that some obvious manipulation makes the 27 appear right away, or disappear right away.
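As a calculation this is nothing much, which is part of its charm. A minimal sketch in Python, with the example curves being arbitrary picks of mine:

def naive_height(A, B):
    # Height of the curve y^2 = x^3 + A x + B as described above:
    # whichever is larger of 4|A|^3 and 27 B^2.
    return max(4 * abs(A)**3, 27 * B**2)

print(naive_height(-1, 1))   # max(4, 27) is 27
print(naive_height(-5, 3))   # max(500, 243) is 500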

This idea of height feeds in to a measure called rank. “Rank” is a term the young mathematician encounters first while learning matrices. It’s the number of rows in a matrix that aren’t equal to some sum or multiple of other rows. That is, it’s how many different things there are among a set. You can see why we might find that interesting. So many topics have something called “rank” and it measures how many different things there are in a set of things. In elliptic curves, the rank is a measure of how complicated the curve is. We can imagine the rational points on the elliptic curve as things generated by some small set of starter points. The starter points have to be of infinite order. Starter points that don’t, don’t count for the rank. Please don’t worry about what “infinite order” means here. I only mention this infinite-order business because if I don’t then something I have to say about two paragraphs from here will sound daft. So, the rank is how many of these starter points you need to generate the elliptic curve. (WARNING: Call them “generating points” or “generators” during your thesis defense.)

There’s no known way of guessing what the rank is if you just know ‘A’ and ‘B’. There are algorithms that can calculate the rank given a particular ‘A’ and ‘B’. But it’s not something like the quadratic formula where you can just do a quick calculation and know what you’re looking for. We don’t even know if the algorithms we have will work for every elliptic curve.

We think that there’s no limit to the rank of elliptic curves. We don’t know this. We know there exist curves with ranks as high as 28. They seem to be rare [*]. I don’t know if that’s proven. But we do know there are elliptic curves with rank zero. A lot of them, in fact. (See what I meant two paragraphs back?) These are the elliptic curves that have only finitely many rational points on them.

And there’s a lot of those. There’s a well-respected conjecture that the average rank, of all the elliptic curves there are, is ½. It might be. What we have been able to prove is that the average rank is less than or equal to 1.17. Also that it should be larger than zero. So we’re maybe closing in on the ½ conjecture? At least we know something. I admit that in writing this essay I’ve started wondering what we do know of elliptic curves.

What do the height, and through it the rank, get us? I worry I’m repeating myself. By themselves they give us families of elliptic curves. Shapes that are similar in a particular and not-always-obvious way. And they feed into the Birch and Swinnerton-Dyer conjecture, which is the hipster’s Riemann Hypothesis. That is, it’s this big, unanswered, important problem that would, if answered, tell us things about a lot of questions that I’m not sure can be concisely explained. At least not why they’re interesting. We know some special cases, at least. Wikipedia tells me nothing’s proved for curves with rank greater than 1. Humanity’s ignorance on this point makes me feel slightly better pondering what I don’t know about elliptic curves.

(There are some other things within the field of elliptic curves called height functions. There’s particularly a height of individual points. I was unsure which height Gaurish found interesting so chose one. The other starts by measuring something different; it views, for example, \frac{1}{2} as having a lower height than does \frac{51}{101} , even though the numbers are quite close in value. It develops along similar lines, trying to find classes of curves with similar behavior. And it gets into different unsolved conjectures. We have our ideas about how to think of fields.)


[*] Wikipedia seems to suggest we only know of one, provided by Professor Noam Elkies in 2006, and let me quote it in full. I apologize that it isn’t in the format I suggested at top was standard. Elkies way outranks me academically so we have to do things his way:

y^2 + xy + y = x^3 - x^2 -  20,067,762,415,575,526,585,033,208,209,338,542,750,930,230,312,178,956,502 x + 34,481,611,795,030,556,467,032,985,690,390,720,374,855,944,359,319,180,361,266,008,296,291,939,448,732,243,429

I can’t figure how to get WordPress to present that larger. I sympathize. I’m tired just looking at an equation like that. This page lists records of known elliptic curve ranks. I don’t know if the lack of any records more recent than 2006 reflects the page not having been updated or nobody having found a rank-29 curve. I fully accept the field might be more difficult than even doing maintenance on a web page’s content is.

The Summer 2017 Mathematics A To Z: Diophantine Equations


I have another request from Gaurish, of the For The Love Of Mathematics blog, today. It’s another change of pace.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Diophantine Equations

A Diophantine equation is a polynomial. Well, of course it is. It’s an equation, or a set of equations, setting one polynomial equal to another. Possibly equal to a constant. What makes this different from “any old equation” is the coefficients. These are the constant numbers that you multiply the variables, your x and y and x^2 and z^8 and so on, by. To make a Diophantine equation all these coefficients have to be integers. You know one well, because it’s that x^n + y^n = z^n thing that Fermat’s Last Theorem is all about. And you’ve probably seen ax + by = 1 . It turns up a lot because that’s a line, and we do a lot of stuff with lines.

Diophantine equations are interesting. There are a couple of cases that are easy to solve. I mean, at least that we can find solutions for. ax + by = 1 , for example, that’s easy to solve. x^n + y^n = z^n it turns out we can’t solve. Well, we can if n is equal to 1 or 2. Or if x or y or z are zero. These are obvious, that is, they’re quite boring. That one took about four hundred years to solve, and the solution was “there aren’t any solutions”. This may convince you of how interesting these problems are. What, from looking at it, tells you that ax + by = 1 is simple while x^n + y^n = z^n is (most of the time) impossible?

I don’t know. Nobody really does. There are many kinds of Diophantine equation, all different-looking polynomials. Some of them are special one-off cases, like x^n + y^n = z^n . For example, there’s x^4 + y^4 + z^4 = w^4 for some integers x, y, z, and w. Leonhard Euler conjectured this equation had only boring solutions. You’ll remember Euler. He wrote the foundational work for every field of mathematics. It turns out he was wrong. It has infinitely many interesting solutions. But the first one found was 2,682,440^4 + 15,365,639^4 + 18,796,760^4 = 20,615,673^4 and that one took a computer search to find. We can forgive Euler not noticing it.

Some are groups of equations that have similar shapes. There’s the Fermat’s Last Theorem formula, for example, which is a different equation for every different integer n. Then there’s what we call Pell’s Equation. This one is x^2 - D y^2 = 1 (or equals -1), for some counting number D. It’s named for the English mathematician John Pell, who did not discover the equation (even in the Western European tradition; Indian mathematicians were familiar with it for a millennium), did not solve the equation, and did not do anything particularly noteworthy in advancing human understanding of the solution. Pell owes his fame in this regard to Leonhard Euler, who misunderstood Pell’s revising a translation of a book discussing a solution for Pell’s authoring a solution. I confess Euler isn’t looking very good on Diophantine equations.
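Pell’s Equation is at least friendly to experiment. Here’s a minimal sketch in Python that hunts for the smallest solution by brute force; the function name is mine, and the real tool for this job is continued fractions, which I’m not getting into.

from math import isqrt

def smallest_pell_solution(D):
    # Smallest positive x, y with x^2 - D y^2 = 1, found by trying y = 1, 2, 3, ...
    # D should not be a perfect square, or this never finishes.
    y = 1
    while True:
        x_squared = D * y * y + 1
        x = isqrt(x_squared)
        if x * x == x_squared:
            return x, y
        y += 1

print(smallest_pell_solution(2))    # (3, 2), since 9 - 2*4 = 1
print(smallest_pell_solution(13))   # (649, 180), already a surprisingly big answer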

But nobody looks very good on Diophantine equations. Make up a Diophantine equation of your own. Use whatever whole numbers, positive or negative, that you like for your equation. Use whatever powers of however many variables you like for your equation. So you get something that looks maybe like this:

7x^2 - 20y + 18y^2 - 38z = 9

Does it have any solutions? I don’t know. Nobody does. There isn’t a general all-around solution. You know how with a quadratic equation we have this formula where you recite some incantation about “b squared minus four a c” and get any roots that exist? Nothing like that exists for Diophantine equations in general. Specific ones, yes. But they’re all specialties, crafted to fit the equation that has just that shape.

So for each equation we have to ask: is there a solution? Is there any solution that isn’t obvious? Are there finitely many solutions? Are there infinitely many? Either way, can we find all the solutions? And we have to answer them anew. What answers these have? Whether answers are known to exist? Whether answers can exist? We have to discover these anew for each kind of equation. Knowing answers for one kind doesn’t help us for any others, except as inspiration. If some trick worked before, maybe it will work this time.

There are a couple usually reliable tricks. Can the equation be rewritten in some way that it becomes the equation for a line? If it can we probably have a good handle on any solutions. Can we apply modular arithmetic to the equation? If we can, we might be able to reduce the number of possible solutions that the equation has. In particular we might be able to reduce the number of possible solutions until we can just check every case. Can we use induction? That is, can we show there’s some parameter for the equations, and that knowing the solutions for one value of that parameter implies knowing solutions for larger values? And then find some small enough value we can test it out by hand? Or can we show that if there is a solution, then there must be a smaller solution, and smaller yet, until we can either find an answer or show there aren’t any? Sometimes. Not always. The field blends seamlessly into number theory. And number theory is all sorts of problems easy to pose and hard or impossible to solve.
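And there’s always the most naive trick of all: just try a lot of small numbers. Here’s a minimal sketch in Python aimed at the made-up equation above. Since z only appears to the first power, picking x and y forces z, provided the leftover is divisible by 38.

# Hunt for small whole-number solutions of 7x^2 - 20y + 18y^2 - 38z = 9.
solutions = []
for x in range(-20, 21):
    for y in range(-20, 21):
        leftover = 7 * x**2 - 20 * y + 18 * y**2 - 9
        if leftover % 38 == 0:
            solutions.append((x, y, leftover // 38))

print(len(solutions), "solutions with x and y between -20 and 20")
print((11, 1, 22) in solutions)   # True: 7*121 - 20 + 18 - 38*22 = 9

Finding a handful of solutions this way settles nothing about whether there are finitely many, or infinitely many, or whether we’ve found anything like all of them. That part is still the hard part.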

We name these equations after Diophantus of Alexandria, a 3rd century Greek mathematician. His writings, what we have of them, discuss how to solve equations. Not general solutions, the way we might want to solve ax^2 + bx + c = 0 , but specific ones, like 1x^2 - 5x + 6 = 0 . His books are among those whose rediscovery shaped the rebirth of mathematics. Pierre de Fermat scribbled his famous note in the too-small margins of Diophantus’s Arithmetica. (Well, a popular translation.)

But the field predates Diophantus, at least if we look at specific problems. Of course it does. In mathematics, as in life, any search for a source ends in a vast, marshy ambiguity. The field stays vital. If we loosen ourselves to looking at inequalities — x - Dy^2 < A , let's say — then we start seeing optimization problems. What values of x and y will make this equation most nearly true? What values will come closest to satisfying this bunch of equations? The questions are about how to find the best possible fit to whatever our complicated sets of needs are. We can't always answer. We keep searching.

What Second Derivatives Are And What They Can Do For You


Previous supplemental reading for Why Stuff Can Orbit:


This is another supplemental piece because it’s too much to include in the next bit of Why Stuff Can Orbit. I need some more stuff about how a mathematical physicist would look at something.

This is also a story about approximations. A lot of mathematics is really about approximations. I don’t mean numerical computing. We all know that when we compute we’re making approximations. We use 0.333333 instead of one-third and we use 3.141592 instead of π. But a lot of precise mathematics, what we call analysis, is also about approximations. We do this by a logical structure that works something like this: take something we want to prove. Now for every positive number ε we can find something — a point, a function, a curve — that’s no more than ε away from the thing we’re really interested in, and which is easier to work with. Then we prove whatever we want to with the easier-to-work-with thing. And since ε can be as tiny a positive number as we want, we can suppose ε is a tinier difference than we can hope to measure. And so the difference between the thing we’re interested in and the thing we’ve proved something interesting about is zero. (This is the part that feels like we’re pulling a scam. We’re not, but this is where it’s worth stopping and thinking about what we mean by “a difference between two things”. When you feel confident this isn’t a scam, continue.) So we proved whatever we proved about the thing we’re interested in. Take an analysis course and you will see this all the time.

When we get into mathematical physics we do a lot of approximating functions with polynomials. Why polynomials? Yes, because everything is polynomials. But also because polynomials make so much mathematical physics easy. Polynomials are easy to calculate, if you need numbers. Polynomials are easy to integrate and differentiate, if you need analysis. Here that’s the calculus that tells you about patterns of behavior. If you want to approximate a continuous function you can always do it with a polynomial. The polynomial might have to be infinitely long to approximate the entire function. That’s all right. You can chop it off after finitely many terms. This finite polynomial is still a good approximation. It’s just good for a smaller region than the infinitely long polynomial would have been.

Necessary qualifiers: pages 65 through 82 of any book on real analysis.

So. Let me get to functions. I’m going to use a function named ‘f’ because I’m not wasting my energy coming up with good names. (When we get back to the main Why Stuff Can Orbit sequence this is going to be ‘U’ for potential energy or ‘E’ for energy.) It’s got a domain that’s the real numbers, and a range that’s the real numbers. To express this in symbols I can write f: \Re \rightarrow \Re . If I have some number called ‘x’ that’s in the domain then I can tell you what number in the range is matched by the function ‘f’ to ‘x’: it’s the number ‘f(x)’. You were expecting maybe 3.5? I don’t know that about ‘f’, not yet anyway. The one thing I do know about ‘f’, because I insist on it as a condition for appearing, is that it’s continuous. It hasn’t got any jumps, any gaps, any regions where it’s not defined. You could draw a curve representing it with a single, if wriggly, stroke of the pen.

I mean to build an approximation to the function ‘f’. It’s going to be a polynomial expansion, a set of things to multiply and add together that’s easy to find. To make this polynomial expansion I need to choose some point to build the approximation around. Mathematicians call this the “point of expansion” because we froze up in panic when someone asked what we were going to name it, okay? But how are we going to make an approximation to a function if we don’t have some particular point we’re approximating around?

(One answer we find in grad school when we pick up some stuff from linear algebra we hadn’t been thinking about. We’ll skip it for now.)

I need a name for the point of expansion. I’ll use ‘a’. Many mathematicians do. Another popular name for it is ‘x0‘. Or if you’re using some other variable name for stuff in the domain then whatever that variable is with subscript zero.

So my first approximation to the original function ‘f’ is … oh, shoot, I should have some new name for this. All right. I’m going to use ‘F0‘ as the name. This is because it’s one of a set of approximations, each of them a little better than the old. ‘F1‘ will be better than ‘F0‘, but ‘F2‘ will be even better, and ‘F2038‘ will be way better yet. I’ll also say something about what I mean by “better”, although you’ve got some sense of that already.

I start off by calling the first approximation ‘F0‘ by the way because you’re going to think it’s too stupid to dignify with a number as big as ‘1’. Well, I have other reasons, but they’ll be easier to see in a bit. ‘F0‘, like all its sibling ‘Fn‘ functions, has a domain of the real numbers and a range of the real numbers. The rule defining how to go from a number ‘x’ in the domain to some real number in the range?

F^0(x) = f(a)

That is, this first approximation is simply whatever the original function’s value is at the point of expansion. Notice that’s an ‘x’ on the left side of the equals sign and an ‘a’ on the right. This seems to challenge the idea of what an “approximation” even is. But it’s legit. Supposing something to be constant is often a decent working assumption. If you failed to check what the weather for today will be like, supposing that it’ll be about like yesterday will usually serve you well enough. If you aren’t sure where your pet is, you look first wherever you last saw the animal. (Or, yes, where your pet most loves to be. A particular spot, though.)

We can make this rigorous. A mathematician thinks this is rigorous: you pick any margin of error you like. Then I can find a region near enough to the point of expansion. The value for ‘f’ for every point inside that region is ‘f(a)’ plus or minus your margin of error. It might be a small region, yes. Doesn’t matter. It exists, no matter how tiny your margin of error was.

But yeah, that expansion still seems too cheap to work. My next approximation, ‘F1‘, will be a little better. I mean that we can expect it will be closer than ‘F0‘ was to the original ‘f’. Or it’ll be as close for a bigger region around the point of expansion ‘a’. What it’ll represent is a line. Yeah, ‘F0‘ was a line too. But ‘F0‘ is a horizontal line. ‘F1‘ might be a line at some completely other angle. If that works better. The second approximation will look like this:

F^1(x) = f(a) + m\cdot\left(x - a\right)

Here ‘m’ serves its traditional yet poorly-explained role as the slope of a line. What the slope of that line should be we learn from the derivative of the original ‘f’. The derivative of a function is itself a new function, with the same domain and the same range. There’s a couple ways to denote this. Each way has its strengths and weaknesses about clarifying what we’re doing versus how much we’re writing down. And trying to write down almost anything can inspire confusion in analysis later on. There’s a part of analysis where you have to shift from thinking about particular problems to thinking about how problems work in general.

So I will define a new function, spoken of as f-prime, this way:

f'(x) = \frac{df}{dx}\left(x\right)

If you look closely you realize there’s two different meanings of ‘x’ here. One is the ‘x’ that appears in parentheses. It’s the value in the domain of f and of f’ where we want to evaluate the function. The other ‘x’ is the one in the lower side of the derivative, in that \frac{df}{dx} . That’s my sloppiness, but it’s not uniquely mine. Mathematicians keep this straight by using the symbols \frac{df}{dx} so much they don’t even see the ‘x’ down there anymore so have no idea there’s anything to find confusing. Students keep this straight by guessing helplessly about what their instructors want and clinging to anything that doesn’t get marked down. Sorry. But what this means is to “take the derivative of the function ‘f’ with respect to its variable, and then, evaluate what that expression is for the value of ‘x’ that’s in parentheses on the left-hand side”. We can do some things that avoid the confusion in symbols there. They all require adding some more variables and some more notation in, and it looks like overkill for a measly definition like this.

Anyway. We really just want the derivative evaluated at one point, the point of expansion. That is:

m = f'(a) = \frac{df}{dx}\left(a\right)

which by the way avoids that overloaded meaning of ‘x’ there. Put this together and we have what we call the tangent line approximation to the original ‘f’ at the point of expansion:

F^1(x) = f(a) + f'(a)\cdot\left(x - a\right)

This is also called the tangent line, because it’s a line that’s tangent to the original function. Plots of ‘F1‘ and the original function ‘f’ are guaranteed to touch one another only at the point of expansion. They might happen to touch again, but that’s luck. The tangent line will be close to the original function near the point of expansion. It might happen to be close again later on, but that’s luck, not design. Most stuff you might want to do with the original function you can do with the tangent line, but the tangent line will be easier to work with. It exactly matches the original function at the point of expansion, and its first derivative exactly matches the original function’s first derivative at the point of expansion.

We can do better. We can find a parabola, a second-order polynomial that approximates the original function. This will be a function ‘F2(x)’ that looks something like:

F^2(x) = f(a) + f'(a)\cdot\left(x - a\right) + \frac12 m_2 \left(x - a\right)^2

What we’re doing is adding a parabola to the approximation. This is that curve that looks kind of like a loosely-drawn U. The ‘m2‘ there measures how spread out the U is. It’s not quite the slope, but it’s kind of like that, which is why I’m using the letter ‘m’ for it. Its value we get from the second derivative of the original ‘f’:

m_2 = f''(a) = \frac{d^2f}{dx^2}\left(a\right)

We find the second derivative of a function ‘f’ by evaluating the first derivative, and then, taking the derivative of that. We can denote it with two ‘ marks after the ‘f’ as long as we aren’t stuck wrapping the function name in ‘ marks to set it out. And so we can describe the function this way:

F^2(x) = f(a) + f'(a)\cdot\left(x - a\right) + \frac12 f''(a) \left(x - a\right)^2

This will be a better approximation to the original function near the point of expansion. Or it’ll make larger the region where the approximation is good.
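Here’s a minimal numerical sketch of all this in Python. The function, the point of expansion, and the use of finite differences to stand in for the exact derivatives are all arbitrary choices of mine for illustration.

import math

f = math.sin     # the function to approximate (an arbitrary pick)
a = 1.0          # the point of expansion
h = 1e-5         # small step used to estimate the derivatives numerically

f_prime        = (f(a + h) - f(a - h)) / (2 * h)           # roughly f'(a)
f_double_prime = (f(a + h) - 2 * f(a) + f(a - h)) / h**2   # roughly f''(a)

F0 = lambda x: f(a)
F1 = lambda x: f(a) + f_prime * (x - a)
F2 = lambda x: f(a) + f_prime * (x - a) + 0.5 * f_double_prime * (x - a)**2

for x in (1.1, 1.5, 2.0):
    print(f"x = {x}: f = {f(x):.5f}  F0 = {F0(x):.5f}  F1 = {F1(x):.5f}  F2 = {F2(x):.5f}")
# The farther x wanders from the point of expansion, the more the extra terms matter.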

If the first derivative of a function at a point is zero that means the tangent line is horizontal. In physics stuff this is an equilibrium. The second derivative can tell us whether the equilibrium is stable or not. If the second derivative at the equilibrium is positive it’s a stable equilibrium. The function looks like a bowl open at the top. If the second derivative at the equilibrium is negative then it’s an unstable equilibrium.

We can make better approximations yet, by using even more derivatives of the original function ‘f’ at the point of expansion:

F^3(x) = f(a) + f'(a)\cdot\left(x - a\right) + \frac12 f''(a) \left(x - a\right)^2 + \frac{1}{3\cdot 2} f'''(a) \left(x - a\right)^3

There’s better approximations yet. You can probably guess what the next, fourth-degree, polynomial would be. Or you can after I tell you the fraction in front of the new term will be \frac{1}{4\cdot 3\cdot 2} . The only big difference is that after about the third derivative we give up on adding ‘ marks after the function name ‘f’. It’s just too many little dots. We start writing, like, ‘f(iv)‘ instead. Or if the Roman numerals are too much then ‘f(2038)‘ instead. Or if we don’t want to pin things down to a specific value ‘f(j)‘ with the understanding that ‘j’ is some whole number.

We don’t need all of them. In physics problems we get equilibriums from the first derivative. We get stability from the second derivative. And we get springs in the second derivative too. And that’s what I hope to pick up on in the next installment of the main series.

The End 2016 Mathematics A To Z: Yang Hui’s Triangle


Today’s is another request from gaurish and another I’m glad to have as it let me learn things too. That’s a particularly fun kind of essay to have here.

Yang Hui’s Triangle.

It’s a triangle. Not because we’re interested in triangles, but because it’s a particularly good way to organize what we’re doing and show why we do that. We’re making an arrangement of numbers. First we need cells to put the numbers in.

Start with a single cell in what’ll be the top middle of the triangle. It spreads out in rows beneath that. The rows are staggered. The second row has two cells, each one-half width to the side of the starting one. The third row has three cells, each one-half width to the sides of the row above, so that its center cell is directly under the original one. The fourth row has four cells, two of which are exactly underneath the cells of the second row. The fifth row has five cells, three of them directly underneath the third row’s cells. And so on. You know the pattern. It’s the one that the pins in a plinko board take. Just trimmed down to a triangle. Make as many rows as you find interesting. You can always add more later.

In the top cell goes the number ‘1’. There’s also a ‘1’ in the leftmost cell of each row, and a ‘1’ in the rightmost cell of each row.

What of interior cells? The number for those we work out by looking to the row above. Take the cells to the immediate left and right of it. Add the values of those together. So for example the center cell in the third row will be ‘1’ plus ‘1’, commonly regarded as ‘2’. In the fourth row the leftmost cell is ‘1’; it always is. The next cell over will be ‘1’ plus ‘2’, from the row above. That’s ‘3’. The cell next to that will be ‘2’ plus ‘1’, a subtly different ‘3’. And the last cell in the row is ‘1’ because it always is. In the fifth row we get, starting from the left, ‘1’, ‘4’, ‘6’, ‘4’, and ‘1’. And so on.
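If you’d rather let a computer lay out the rows, here’s a minimal sketch in Python of the add-the-two-cells-above rule. It’s only an illustration; the names are mine, not anything standard.

def yang_hui_rows(count):
    """Build the first `count` rows by the rule described above."""
    rows = [[1]]
    while len(rows) < count:
        above = rows[-1]
        # interior cells are the sum of the two neighboring cells in the row above
        rows.append([1] + [above[i] + above[i + 1] for i in range(len(above) - 1)] + [1])
    return rows

for row in yang_hui_rows(6):
    print(row)
# [1]
# [1, 1]
# [1, 2, 1]
# [1, 3, 3, 1]
# [1, 4, 6, 4, 1]
# [1, 5, 10, 10, 5, 1]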

It’s a neat little arithmetic project. It has useful application beyond the joy of making something neat. Many neat little arithmetic projects don’t have that. But the numbers in each row give us binomial coefficients, which we often want to know. That is, if we wanted to work out (a + b) to, say, the fourth power, we would know what it looks like from looking at the fifth row of Yang Hui’s Triangle. It will be 1\cdot a^4 + 4\cdot a^3 \cdot b^1 + 6\cdot a^2\cdot b^2 + 4\cdot a^1\cdot b^3 + 1\cdot b^4 . This turns up in polynomials all the time.

Look at diagonals. By diagonal here I mean a line parallel to the line of ‘1’s. Left side or right side; it doesn’t matter. Yang Hui’s triangle is bilaterally symmetric around its center. The first diagonal under the edges is a bit boring but familiar enough: 1-2-3-4-5-6-7-et cetera. The second diagonal is more curious: 1-3-6-10-15-21-28 and so on. You’ve seen those numbers before. They’re called the triangular numbers. They’re the number of dots you need to make a uniformly spaced, staggered-row triangle. Doodle a bit and you’ll see. Or play with coins or pool balls.

The third diagonal looks more arbitrary yet: 1-4-10-20-35-56-84 and on. But these are something too. They’re the tetrahedral numbers. They’re the number of things you need to make a tetrahedron. Try it out with a couple of balls. Oranges if you’re bored at the grocer’s. Four, ten, twenty, these make a nice stack. The fourth diagonal is a bunch of numbers I never paid attention to before. 1-5-15-35-70-126-210 and so on. This is — well. We just did tetrahedrons, the triangular arrangement of three-dimensional balls. Before that we did triangles, the triangular arrangement of two-dimensional discs. Do you want to put in a guess what these “pentatope numbers” are about? Sure, but you hardly need to. If we’ve got a bunch of four-dimensional hyperspheres and want to stack them in a neat triangular pile we need one, or five, or fifteen, or so on to make the pile come out neat. You can guess what might be in the fifth diagonal. I don’t want to think too hard about making triangular heaps of five-dimensional hyperspheres.

There’s more stuff lurking in here, waiting to be decoded. Add the numbers of, say, row four up and you get two raised to the third power. Add the numbers of row ten up and you get two raised to the ninth power. You see the pattern. Add everything in, say, the top five rows together and you get the fifth Mersenne number, two raised to the fifth power (32) minus one (31, when we’re done). Add everything in the top ten rows together and you get the tenth Mersenne number, two raised to the tenth power (1024) minus one (1023).
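A quick check of those sums, as a sketch in Python. It leans on math.comb, so it assumes a reasonably recent Python, and the row numbering matches the essay’s, with the lone ‘1’ as row one.

from math import comb

def row(n):                         # row n of the triangle, the top '1' being row 1
    return [comb(n - 1, k) for k in range(n)]

print(sum(row(4)))                               # 8, two raised to the third power
print(sum(row(10)))                              # 512, two raised to the ninth power
print(sum(sum(row(n)) for n in range(1, 6)))     # 31, the fifth Mersenne number
print(sum(sum(row(n)) for n in range(1, 11)))    # 1023, the tenth Mersenne number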

Or add together things on “shallow diagonals”. Start from a ‘1’ on the outer edge. I’m going to suppose you started on the left edge, but remember symmetry; it’ll be fine if you go from the right instead. Add to that ‘1’ the number you get by moving one cell to the right and going up-and-right. And then again, go one cell to the right and then one cell up-and-right. And again and again, until you run out of cells. You get the Fibonacci sequence, 1-1-2-3-5-8-13-21-and so on.
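And here are the shallow-diagonal sums, as a sketch. Writing the cells as binomial coefficients, the sum along the shallow diagonal that starts n rows down works out to a sum of terms comb(n - k, k); the labels are mine.

from math import comb

def shallow_diagonal(n):
    # start at the left-edge '1' and keep moving one cell right and one cell up-and-right
    return sum(comb(n - k, k) for k in range(n // 2 + 1))

print([shallow_diagonal(n) for n in range(10)])
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55] -- the Fibonacci sequence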

We can even make an astounding picture from this. Take the cells of Yang Hui’s triangle. Color them in. One shade if the cell has an odd number, another if the cell has an even number. It will create a pattern we know as the Sierpiński Triangle. (Wacław Sierpiński is proving to be the surprise special guest star in many of this A To Z sequence’s essays.) That’s the fractal of a triangle subdivided into four triangles with the center one knocked out, and the remaining triangles themselves subdivided into four triangles with the center knocked out, and on and on.
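You can see the pattern appear even in a crude text printout. A sketch, again leaning on math.comb, printing ‘#’ for odd cells:

from math import comb

rows = 16
for n in range(rows):
    line = ' '.join('#' if comb(n, k) % 2 else '.' for k in range(n + 1))
    print(line.center(2 * rows))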

By now I imagine even my most skeptical readers agree this is an interesting, useful mathematical construct. Also that they’re wondering why I haven’t said the name “Blaise Pascal”. The Western mathematical tradition knows of this from Pascal’s work, particularly his 1653 Traité du triangle arithmétique. But mathematicians like to say their work is universal, and independent of the mere human beings who find it. Constructions like this triangle give support to this. Yang lived in China, in the 13th century. I imagine it possible Pascal had heard of his work or been influenced by it, by some chain, but I know of no evidence that he did.

And even if he had, there are other apparently independent inventions. The Avanti Indian astronomer-mathematician-astrologer Varāhamihira described the addition rule which makes the triangle work in commentaries written around the year 500. Omar Khayyám, who keeps appearing in the history of science and mathematics, wrote about the triangle in his 1070 Treatise on Demonstration of Problems of Algebra. Again so far as I am aware there’s not a direct link between any of these discoveries. They are things different people in different traditions found because the tools — arithmetic and aesthetically-pleasing orders of things — were ready for them.

Yang Hui wrote about his triangle in the 1261 book Xiangjie Jiuzhang Suanfa. In it he credits the use of the triangle (for finding roots) to the mathematician Jia Xian, who had worked it out around 1100. This reminds us that it is not merely mathematical discoveries that are found by many peoples at many times and places. So is Boyer’s Law, discovered by Hubert Kennedy.

The End 2016 Mathematics A To Z: Osculating Circle


I’m happy to say it’s another request today. This one’s from HowardAt58, author of the Saving School Math blog. He’s given me some great inspiration in the past.

Osculating Circle.

It’s right there in the name. Osculating. You know what that is from that one Daffy Duck cartoon where he cries out “Greetings, Gate, let’s osculate” while wearing a moustache. Daffy’s imitating somebody there, but goodness knows who. Someday the mystery drives the young you to a dictionary web site. Osculate means kiss. This doesn’t seem to explain the scene. Daffy was imitating Jerry Colonna. That meant something in 1943. You can find him on old-time radio recordings. I think he’s funny, in that 40s style.

Make the substitution. A kissing circle. Suppose it’s not some playground antic one level up from the Kissing Bandit that plagues recess yet one or two levels down from what we imagine we’d do in high school. It suggests a circle that comes really close to something, that touches it a moment, and then goes off its own way.

But then touching. We know another word for that. It’s the root behind “tangent”. Tangent is a trigonometry term. But it appears in calculus too. The tangent line is a line that touches a curve at one specific point and is going in the same direction as the original curve is at that point. We like this because … well, we do. The tangent line is a good approximation of the original curve, at least at the tangent point and for some region local to that. The tangent touches the original curve, and maybe it does something else later on. What could kissing be?

The osculating circle is about approximating an interesting thing with a well-behaved thing. So are similar things with names like “osculating curve” or “osculating sphere”. We need that a lot. Interesting things are complicated. Well-behaved things are understood. We move from what we understand to what we would like to know, often, by an approximation. This is why we have tangent lines. This is why we build polynomials that approximate an interesting function. They share the original function’s value, and its derivative’s value. A polynomial approximation can share many derivatives. If the function is nice enough, and the polynomial big enough, it can be impossible to tell the difference between the polynomial and the original function.

The osculating circle, or sphere, isn’t so concerned with matching derivatives. I know, I’m as shocked as you are. Well, it matches the first and the second derivatives of the original curve. Anything past that, though, it matches only by luck. The osculating circle is instead about matching the curvature of the original curve. The curvature is what you think it would be: it’s how much a function curves. If you imagine looking closely at the original curve and an osculating circle they appear to be two arcs that come together. They must touch at one point. They might touch at others, but that’s incidental.
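For the record, and not something the essay spells out: if the curve is the graph of a function y = f(x), the standard formulas for the curvature and for the radius of the osculating circle are

\kappa(x) = \frac{\left|f''(x)\right|}{\left(1 + \left(f'(x)\right)^2\right)^{3/2}} \qquad\qquad R(x) = \frac{1}{\kappa(x)}

You can see the first and second derivatives doing all the work there, which is why the osculating circle matches exactly those two and nothing more.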

Osculating circles, and osculating spheres, sneak out of mathematics and into practical work. This is because we often want to work with things that are almost circles. The surface of the Earth, for example, is not a sphere. But it’s only a tiny bit off. It’s off in ways that you only notice if you are doing high-precision mapping. Or taking close measurements of things in the sky. Sometimes we do this. So we map the Earth locally as if it were a perfect sphere, with curvature exactly what its curvature is at our observation post.

Or we might be observing something moving in orbit. If the universe had only two things in it, and they were the correct two things, all orbits would be simple: they would be ellipses. The two things would have to be “point masses”, things that have mass without any volume. They never are. They’re always shapes. Spheres would be fine, but they’re never perfect spheres even. The slight difference between a perfect sphere and whatever the things really are affects the orbit. Or the other things in the universe tug on the orbiting things. Or the thing orbiting makes a course correction. All these things make little changes in the orbiting thing’s orbit. The actual orbit of the thing is a complicated curve. The orbit we could calculate is an osculating — well, an osculating ellipse, rather than an osculating circle. Similar idea, though. Call it an osculating orbit if you’d rather.

That osculating circles have practical uses doesn’t mean they aren’t respectable mathematics. I’ll concede they’re not used as much as polynomials or sine curves are. I suppose that’s because polynomials and sine curves have nicer derivatives than circles do. But osculating circles do turn up as ways to try solving nonlinear differential equations. We need the help. Linear differential equations anyone can solve. Nonlinear differential equations are pretty much impossible. They also turn up in signal processing, as ways to find the frequencies of a signal from a sampling of data. This, too, we would like to know.

We get the name “osculating circle” from Gottfried Wilhelm Leibniz. This might not surprise. Finding easy-to-understand shapes that approximate interesting shapes is why we have calculus. Isaac Newton described a way of making them in the Principia Mathematica. This also might not surprise. Of course they would on this subject come so close together without kissing.

Theorem Thursday: Liouville’s Approximation Theorem And How To Make Your Own Transcendental Number


As I get into the second month of Theorem Thursdays I have, I think, the whole roster of weeks sketched out. Today, I want to dive into some real analysis, and the study of numbers. It’s the sort of thing you normally get only if you’re willing to be a mathematics major. I’ll try to be readable by people who aren’t. If you carry through to the end and follow directions you’ll have your very own mathematical construct, too, so enjoy.

Liouville’s Approximation Theorem

It all comes back to polynomials. Of course it does. Polynomials aren’t literally everything in mathematics. They just come close. Among the things we can do with polynomials is divide up the real numbers into different sets. The tool we use is polynomials with integer coefficients. Integers are the positive and the negative whole numbers, stuff like ‘4’ and ‘5’ and ‘-12’ and ‘0’.

A polynomial is the sum of a bunch of products of coefficients multiplied by a variable raised to a power. We can use anything for the variable’s name. So we use ‘x’. Sometimes ‘t’. If we want complex-valued polynomials we use ‘z’. Some people trying to make a point will use ‘y’ or ‘s’ but they’re just showing off. Coefficients are just numbers. If we know the numbers, great. If we don’t know the numbers, or we want to write something that doesn’t commit us to any particular numbers, we use letters from the start of the alphabet. So we use ‘a’, maybe ‘b’ if we must. If we need a lot of numbers, we use subscripts: a0, a1, a2, and so on, up to some an for some big whole number n. To talk about one of these without committing ourselves to a specific example we use a subscript of i or j or k: aj, ak. It’s possible that aj and ak equal each other, but they don’t have to, unless j and k are the same whole number. They might also be zero, but they don’t have to be. They can be any numbers. Or, for this essay, they can be any integers. So we’d write a generic polynomial f(x) as:

f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_{n - 1}x^{n - 1} + a_n x^n

(Some people put the coefficients in the other order, that is, a_n + a_{n - 1}x + a_{n - 2}x^2 and so on. That’s not wrong. The name we give a number doesn’t matter. But it makes it harder to remember what coefficient matches up with, say, x14.)

A zero, or root, is a value for the variable (‘x’, or ‘t’, or what have you) which makes the polynomial equal to zero. It’s possible that ‘0’ is a zero, but don’t count on it. A polynomial of degree n — meaning the highest power to which x is raised is n — can have up to n different real-valued roots. All we’re going to care about is one.

Rational numbers are what we get by dividing one whole number by another. They’re numbers like 1/2 and 5/3 and 6. They’re numbers like -2.5 and 1.0625 and negative a billion. Almost none of the real numbers are rational numbers; they’re exceptional freaks. But they are all the numbers we actually compute with, once we start working out digits. Thus we remember that to live is to live paradoxically.

And every rational number is a root of a first-degree polynomial. That is, there’s some polynomial f(x) = a_0 + a_1 x that’s made zero by your rational number. It’s easy to tell you what it is, too. Pick your rational number. You can write that as the integer p divided by the integer q. Now look at the polynomial f(x) = p – q x. Astounded yet?

That trick will work for any rational number. It won’t work for any irrational number. There’s no first-degree polynomial with integer coefficients that has the square root of two as a root. There are polynomials that do, though. There’s f(x) = 2 – x^2. You can find the square root of two as the zero of a second-degree polynomial. You can’t find it as the zero of any lower-degree polynomials. So we say that this is an algebraic number of the second degree.

This goes on higher. Look at the cube root of 2. That’s another irrational number, so no first-degree polynomials have it as a root. And there’s no second-degree polynomials that have it as a root, not if we stick to integer coefficients. Ah, but f(x) = 2 – x^3? That’s got it. So the cube root of two is an algebraic number of degree three.

We can go on like this, although I admit examples for higher-order algebraic numbers start getting hard to justify. Most of the numbers people have heard of are either rational or are order-two algebraic numbers. I can tell you truly that the eighth root of two is an eighth-degree algebraic number. But I bet you don’t feel enlightened. At best you feel like I’m setting up for something. The number r(5), the smallest radius a disc can have so that five of them will completely cover a disc of radius 1, is eighth-degree and that’s interesting. But you never imagined the number before and don’t have any idea how big that is, other than “I guess that has to be smaller than 1”. (It’s just a touch less than 0.61.) I sound like I’m wasting your time, although you might start doing little puzzles trying to make smaller coins cover larger ones. Do have fun.

Liouville’s Approximation Theorem is about approximating algebraic numbers with rational ones. Almost everything we ever do is with rational numbers. That’s all right because we can make the difference between the number we want, even if it’s r(5), and the numbers we can compute with, rational numbers, as tiny as we need. We trust that the errors we make from this approximation will stay small. And then we discover chaos science. Nothing is perfect.

For example, suppose we need to estimate π. Everyone knows we can approximate this with the rational number 22/7. That’s about 3.142857, which is all right but nothing great. Some people know we can approximate it as 333/106. (I didn’t until I started writing this paragraph and did some research.) That’s about 3.141509, which is better. Then there’s 355/113, which is not as famous as 22/7 but is a celebrity compared to 333/106. That’s about 3.1415929. Then we get into some numbers only mathematics hipsters know: 103993/33102 and 104348/33215 and so on. Fine.
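Those particular fractions all come from the continued fraction for π. Here’s a little sketch in Python reproducing them; the first several continued-fraction terms of π are well known, and I’ve simply listed them rather than computed them.

from fractions import Fraction

terms = [3, 7, 15, 1, 292, 1]          # the start of the continued fraction for pi

for stop in range(1, len(terms) + 1):
    # fold the continued fraction [a0; a1, ..., a_{stop-1}] from the inside out
    value = Fraction(terms[stop - 1])
    for a in reversed(terms[:stop - 1]):
        value = a + 1 / value
    print(value, float(value))
# prints 3, 22/7, 333/106, 355/113, 103993/33102, 104348/33215 in turn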

The Liouville Approximation Theorem is about sequences that converge on an irrational number. So we have our first approximation x1, that’s the integer p1 divided by the integer q1. So, 22 and 7. Then there’s the next approximation x2, that’s the integer p2 divided by the integer q2. So, 333 and 106. Then there’s the next approximation yet, x3, that’s the integer p3 divided by the integer q3. As we look at more and more approximations, xj‘s, we get closer and closer to the actual irrational number we want, in this case π. Also, the denominators, the qj‘s, keep getting bigger.

The theorem speaks of having an algebraic number, call it x, of some degree n greater than 1. Then we have this limit on how good an approximation can be. The difference between the number x that we want, and our best approximation p / q, has to be larger than the number (1/q)^(n + 1). The approximation might be higher than x. It might be lower than x. But it will be off by at least the n-plus-first power of 1/q.

Polynomials let us separate the real numbers into infinitely many tiers of numbers. They also let us say how well the most accessible tier of numbers, rational numbers, can approximate these more exotic things.

One of the things we learn by looking at numbers through this polynomial screen is that there are transcendental numbers. These are numbers that can’t be the root of any polynomial with integer coefficients. π is one of them. e is another. Nearly all numbers are transcendental. But the proof that any particular number is one is hard. Joseph Liouville showed that transcendental numbers must exist by using continued fractions. But this approximation theorem tells us how to make our own transcendental numbers. This won’t be any number you or anyone else has ever heard of, unless you pick a special case. But it will be yours.

You will need:

  1. a1, an integer from 1 to 9, such as ‘1’, ‘9’, or ‘5’.
  2. a2, another integer from 1 to 9. It may be the same as a1 if you like, but it doesn’t have to be.
  3. a3, yet another integer from 1 to 9. It may be the same as a1 or a2 or, if it so happens, both.
  4. a4, one more integer from 1 to 9 and you know what? Let’s summarize things a bit.
  5. A whopping great big gob of integers aj, every one of them from 1 to 9, for every possible integer ‘j’ so technically this is infinitely many of them.
  6. Comfort with the notation n!, which is the factorial of n. For whole numbers that’s the product of every whole number from 1 to n, so, 2! is 1 times 2, or 2. 3! is 1 times 2 times 3, or 6. 4! is 1 times 2 times 3 times 4, or 24. And so on.
  7. Not to be thrown by me writing -n!. By that I mean work out n! and then multiply that by -1. So -2! is -2. -3! is -6. -4! is -24. And so on.

Now, assemble them into your very own transcendental number z, by this formula:

z = a_1 \cdot 10^{-1} + a_2 \cdot 10^{-2!} + a_3 \cdot 10^{-3!} + a_4 \cdot 10^{-4!} + a_5 \cdot 10^{-5!} + a_6 \cdot 10^{-6!} \cdots

If you’ve done it right, this will look something like:

z = 0.a_{1}a_{2}000a_{3}00000000000000000a_{4}0000000 \cdots
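If you’d like the computer to assemble your number for you, here’s a sketch using Python’s Decimal type for the extra digits. The digit choices are just example values; put in your own.

from decimal import Decimal, getcontext
from math import factorial

getcontext().prec = 150                  # enough precision to show the first five chosen digits

digits = [1, 9, 5, 3, 7]                 # your a_1 through a_5, each an integer from 1 to 9
z = sum(Decimal(a) * Decimal(10) ** -factorial(j + 1)
        for j, a in enumerate(digits))
print(z)                                 # the nonzero digits land at decimal places 1, 2, 6, 24, 120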

Ah, but, how do you know this is transcendental? We can prove it is. The proof is by contradiction, which is how a lot of great proofs are done. We show nonsense follows if the thing isn’t true, so the thing must be true. (There are mathematicians that don’t care for proof-by-contradiction. They insist on proof by charging straight ahead and showing a thing is true directly. That’s a matter of taste. I think every mathematician feels that way sometimes, to some extent or on some issues. The proof-by-contradiction is easier, at least in this case.)

Suppose that your z here is not transcendental. Then it’s got to be an algebraic number of degree n, for some finite number n. That’s what it means not to be transcendental. I don’t know what n is; I don’t care. There is some n and that’s enough.

Now, let’s let zm be a rational number approximating z. We find this approximation by taking the first m! digits after the decimal point. So, z1 would be just the number 0.a1. z2 is the number 0.a1a2. z3 is the number 0.a1a2000a3. I don’t know what m you like, but that’s all right. We’ll pick a nice big m.

So what’s the difference between z and zm? Well, it can’t be larger than 10 times 10^(-(m + 1)!). This is for the same reason that π minus 3.14 can’t be any bigger than 0.01.

Now think of zm as a rational number p/q: the denominator q is 10^(m!) and the numerator p is the whole number made from those first m! digits. By the Liouville Approximation Theorem, then, the difference between z and zm has to be at least as big as (1/10^(m!))^(n + 1).

So we know the difference between z and zm has to be larger than one number. And it has to be smaller than another. Let me write those out.

\frac{1}{10^{m! (n + 1)}} < |z - z_m | < \frac{10}{10^{(m + 1)!}}

We don’t need the z – zm anymore. That thing on the rightmost side we can rewrite in a way I’ll swear is a little easier to use. What we have left is:

\frac{1}{10^{m! (n + 1)}} < \frac{1}{10^{(m + 1)! - 1}}

And this will be true exactly when the number m! (n + 1) is greater than (m + 1)! – 1.

But here’s the thing. That stops being true once m is greater than n, and we were free to pick m as big as we liked. So the difference between your alleged transcendental number and its best-possible rational approximation has to be simultaneously bigger than a number and smaller than that same number without being equal to it. Supposing your number is anything but transcendental produces nonsense. Therefore, congratulations! You have a transcendental number.
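If you want to see where the two requirements collide, here’s a tiny numerical check; n = 4 is just an arbitrary stand-in for the supposed degree.

from math import factorial

n = 4                                    # pretend z were algebraic of degree 4
for m in range(2, 9):
    left = factorial(m) * (n + 1)        # exponent on the "bigger than" side
    right = factorial(m + 1) - 1         # exponent on the "smaller than" side
    print(m, left, right, left > right)  # the comparison fails as soon as m exceeds n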

If you chose all 1’s for your aj‘s, then you have what is sometimes called the Liouville Constant. If you didn’t, you may have a transcendental number nobody’s ever noticed before. You can name it after someone if you like. That’s as meaningful as naming a star for someone and cheaper. But you can style it as weaving someone’s name into the universal truth of mathematics. Enjoy!

I’m glad to finally give you a mathematics essay that lets you make something you can keep.

Theorem Thursday: The Intermediate Value Theorem


I am still taking requests for this Theorem Thursdays sequence. I intend to post each Thursday in June and July an essay talking about some theorem and what it means and why it’s important. I have gotten a couple of requests in, but I’m happy to take more; please just give me a little lead time. But I want to start with one that delights me.

The Intermediate Value Theorem

I own a Scion tC. It’s a pleasant car, about 2400 percent more sporty than I am in real life. I got it because it met my most important criteria: it wasn’t expensive and it had a sun roof. That it looks stylish is an unsought bonus.

But being a car, and a black one at that, it has a common problem. Leave it parked a while, then get inside. In the winter, it gets so cold that snow can fall inside it. In the summer, it gets so hot that the interior, never mind the passengers, risk melting. While pondering this slight inconvenience I wondered, isn’t there any outside temperature that leaves my car comfortable?

Scion tC covered in snow and ice from a late winter storm.
My Scion tC, here, not too warm.

Of course there is. We know this before thinking about it. The sun heats the car, yes. When the outside temperature is low enough, there’s enough heat flowing out that the car gets cold. When the outside temperature’s high enough, not enough heat flows out. The car stays warm. There must be some middle temperature where just enough heat flows out that the interior doesn’t get particularly warm or cold. Not just one middle temperature, come to that. There is a range of temperatures that are comfortable to sit in. But that just means there’s a range of outside temperatures for which the car’s interior stays comfortable. We know this range as late April, early May, here. Most years, anyway.

The reasoning that lets us know there is a comfort-producing outside temperature we can see as a use of the Intermediate Value Theorem. It addresses a function f with domain [a, b], and range of the real numbers. The domain is closed; that is, the numbers we call ‘a’ and ‘b’ are both in the set. And f has to be a continuous function. If you want to draw it, you can do so without having to lift pen from paper. (WARNING: Do not attempt to pass your Real Analysis course with that definition. But that’s what the proper definition means.)

So look at the numbers f(a) and f(b). Pick some number between them, and I’ll call that number ‘g’. There must be at least one number ‘c’, that’s between ‘a’ and ‘b’, and for which f(c) equals g.

Bernard Bolzano, an early-19th century mathematician/logician/theologian/priest, gets the credit for first proving this theorem. Bolzano’s version was a little different. It supposes that f(a) and f(b) are of opposite sign. That is, f(a) is a positive and f(b) a negative number. Or f(a) is negative and f(b) is positive. And Bolzano’s theorem says there must be some number ‘c’ for which f(c) is zero.

You can prove this by drawing any wiggly curve at all and then a horizontal line in the middle of it. Well, that doesn’t prove it to a mathematician’s satisfaction. But it will prove the matter in the sense that you’ll be convinced. It’ll also convince anyone you try explaining this to.

A generic wiggly function, with vertical lines marking off the domain limits of a and b. Horizontal lines mark off f(a) and f(b), as well as a putative value g. The wiggly function indeed has at least one point for which its value is g.
Any old real-valued function, drawn in blue. The number ‘g’ is something between the number f(a) and f(b). And somewhere there’s at least one number, between a and b, for where the function’s equal to g.

You might wonder why anyone needed this proved at all. It’s a bit like proving that as you pour water into the sink there’ll come a time the last dish gets covered with water. So it is. The need for a proof came about from the ongoing attempt to make mathematics rigorous. We have an intuitive idea of what it means for functions to be continuous; see my above comment about lifting pens from paper. Can that be put in terms that don’t depend on physical intuition? … Yes, it can. And we can divorce the Intermediate Value Theorem from our physical intuitions. We can know something that’s true even if we never see a car or a sink.

This theorem might leave you feeling a little hollow inside. Proving that there is some ‘c’ for which f(c) equals g, or even equals zero, doesn’t seem to tell us much about how to find it. It doesn’t even tell us that there’s only one ‘c’, rather than two or three or a hundred million candidates that meet our criteria. Fair enough. The Intermediate Value Theorem is more about proving the existence of solutions, rather than how to find them.

But knowing there is a solution can help us find them. The Intermediate Value Theorem as we know it grew out of finding roots for polynomials. One numerical method, easy to set up for any problem, is the bisection method. If you know that somewhere between ‘a’ and ‘b’ the function goes from positive to negative, then find the midpoint, ‘c’. The function is equal to zero either between ‘a’ and ‘c’, or between ‘c’ and ‘b’. Pick the side that it’s on, and bisect that. Pick the half of that which the zero must be in. Bisect that half. And repeat until you get close enough to the answer for your needs. (The same reasoning applies to a lot of problems in which you divide the search range in two each time until the answer appears.)
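Here’s a minimal sketch of the bisection method in Python, under the usual assumption that the function changes sign between the endpoints. The names and the tolerance are mine, not anything standardized.

def bisect(f, a, b, tolerance=1e-10):
    """Find a zero of f between a and b, assuming f(a) and f(b) have opposite signs."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f must change sign on [a, b]"
    while b - a > tolerance:
        c = (a + b) / 2
        fc = f(c)
        if fa * fc <= 0:                 # the sign change is in the left half
            b, fb = c, fc
        else:                            # otherwise it's in the right half
            a, fa = c, fc
    return (a + b) / 2

# the square root of 2 is where x^2 - 2 crosses zero
print(bisect(lambda x: x * x - 2, 1, 2))     # about 1.41421356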

We can get some pretty heady results from the Intermediate Value Theorem, too, even if we don’t know where any of them are. An example you’ll see everywhere is that there must be spots on the opposite sides of the globe with the exact same temperature. Or humidity, or daily rainfall, or any other quantity like that. I had thought everyone was ripping that example off from Richard Courant and Herbert Robbins’s masterpiece What Is Mathematics?. But I can’t find this particular example in there. I wonder what we are all ripping it off from.

Two blobby shapes, one of them larger and more complicated, the other looking kind of like the outline of a trefoil, both divided by a magenta line.
Does this magenta line bisect both the red and the greyish blobs simultaneously? … Probably not, unless I’ve been way lucky. But there is some line that does.

So here’s a neat example that is ripped off from them. Draw two blobs on the plane. Is there a straight line that bisects both of them at once? Bisecting here means there’s exactly as much of one blob on one side of the line as on the other. There certainly is. The trick is there are any number of lines that will bisect one blob, and then look at what that does to the other.

A similar ripped-off result you can do with a single blob of any shape you like. Draw any line that bisects it. There are a lot of candidates. Can you draw a line perpendicular to that so that the blob gets quartered, divided into four spots of equal area? Yes. Try it.

A generic blobby shape with two perpendicular magenta lines crossing over it.
Does this pair of magenta lines split this blue blob into four pieces of exactly the same area? … Probably not, unless I’ve been lucky. But there is some pair of perpendicular lines that will do it. Also, is it me or does that blob look kind of like a butterfly?

But surely the best use of the Intermediate Value Theorem is in the problem of wobbly tables. If the table has four legs, all the same length, and the problem is that the floor isn’t level, it’s all right. There is some way to adjust the table so it won’t wobble. (Well, the ground can’t be angled more than a bit over 35 degrees, but that’s all right. If the ground has a 35 degree angle you aren’t setting a table on it. You’re rolling down it.) Finally a mathematical proof can save us from despair!

Except that the proof doesn’t work if the table legs are uneven which, alas, they often are. But we can’t get everything.

Courant and Robbins put forth one more example that’s fantastic, although it doesn’t quite work. But it’s a train problem unlike those you’ve seen before. Let me give it to you as they set it out:

Suppose a train travels from station A to station B along a straight section of track. The journey need not be of uniform speed or acceleration. The train may act in any manner, speeding up, slowing down, coming to a halt, or even backing up for a while, before reaching B. But the exact motion of the train is supposed to be known in advance; that is, the function s = f(t) is given, where s is the distance of the train from station A, and t is the time, measured from the instant of departure.

On the floor of one of the cars a rod is pivoted so that it may move without friction either forward or backward until it touches the floor. If it does touch the floor, we assume that it remains on the floor henceforth; this will be the case if the rod does not bounce.

Is it possible to place the rod in such a position that, if it is released at the instant when the train starts and allowed to move solely under the influence of gravity and the motion of the train, it will not fall to the floor during the entire journey from A to B?

They argue it is possible, and use the Intermediate Value Theorem to show it. They admit the range of angles it’s safe to start the rod from may be too small to be useful.

But they’re not quite right. Ian Stewart, in the revision of What Is Mathematics?, includes an appendix about this. Stewart credits Tim Poston with pointing out, in 1976, the flaw. It’s possible to imagine a path which causes the rod, from one angle, to just graze tipping over, let’s say forward, and then get yanked back and fall over flat backwards. This would leave no room for any starting angles that avoid falling over entirely.

It’s a subtle flaw. You might expect so. Nobody mentioned it between the book’s original publication in 1941, after which everyone liking mathematics read it, and 1976. And it is one that touches on the complications of spaces. This little Intermediate Value Theorem problem draws us close to chaos theory. It’s one of those ideas that weaves through all mathematics.

A Leap Day 2016 Mathematics A To Z: Transcendental Number


I’m down to the last seven letters in the Leap Day 2016 A To Z. It’s also the next-to-the-last of Gaurish’s requests. This was a fun one.

Transcendental Number.

Take a huge bag and stuff all the real numbers into it. Give the bag a good solid shaking. Stir up all the numbers until they’re thoroughly mixed. Reach in and grab just the one. There you go: you’ve got a transcendental number. Enjoy!

OK, I detect some grumbling out there. The first is that you tried doing this in your head because you somehow don’t have a bag large enough to hold all the real numbers. And you imagined pulling out some number like “2” or “37” or maybe “one-half”. And you may not be exactly sure what a transcendental number is. But you’re confident the strangest number you extracted, “minus 8”, isn’t it. And you’re right. None of those are transcendental numbers.

I regret saying this, but that’s your own fault. You’re lousy at picking random numbers from your head. So am I. We all are. Don’t believe me? Think of a positive whole number. I predict you probably picked something between 1 and 10. Almost surely something between 1 and 100. Surely something less than 10,000. You didn’t even consider picking something between 10,012,002,214,473,325,937,775 and 10,012,002,214,473,325,937,785. Challenged to pick a number, people will select nice and familiar ones. The nice familiar numbers happen not to be transcendental.

I detect some secondary grumbling there. Somebody picked π. And someone else picked e. Very good. Those are transcendental numbers. They’re also nice familiar numbers, at least to people who like mathematics a lot. So they attract attention.

Still haven’t said what they are. What they are traces back, of course, to polynomials. Take a polynomial that’s got one variable, which we call ‘x’ because we don’t want to be difficult. Suppose that all the coefficients of the polynomial, the constant numbers we presumably know or could find out, are integers. What are the roots of the polynomial? That is, for what values of x is the polynomial a complicated way of writing ‘zero’?

For example, try the polynomial x^2 – 6x + 5. If x = 1, then that polynomial is equal to zero. If x = 5, the polynomial’s equal to zero. Or how about the polynomial x^2 + 4x + 4? That’s equal to zero if x is equal to -2. So a polynomial with integer coefficients can certainly have positive and negative integers as roots.

How about the polynomial 2x – 3? Yes, that is so a polynomial. This is almost easy. That’s equal to zero if x = 3/2. How about the polynomial (2x – 3)(4x + 5)(6x – 7)? It’s my polynomial and I want to write it so it’s easy to find the roots. That polynomial will be zero if x = 3/2, or if x = -5/4, or if x = 7/6. So a polynomial with integer coefficients can have positive and negative rational numbers as roots.

How about the polynomial x^2 – 2? That’s equal to zero if x is the square root of 2, about 1.414. It’s also equal to zero if x is minus the square root of 2, about -1.414. And the square root of 2 is irrational. So we can certainly have irrational numbers as roots.

So if we can have whole numbers, and rational numbers, and irrational numbers as roots, how can there be anything else? Yes, complex numbers, I see you raising your hand there. We’re not talking about complex numbers just now. Only real numbers.

It isn’t hard to work out why we can get any whole number, positive or negative, from a polynomial with integer coefficients. Or why we can get any rational number. The irrationals, though … it turns out we can only get some of them this way. We can get square roots and cube roots and fourth roots and all that. We can get combinations of those. But we can’t get everything. There are irrational numbers that are there but that even polynomials can’t reach.

It’s all right to be surprised. It’s a surprising result. Maybe even unsettling. Transcendental numbers have something peculiar about them. The 19th Century French mathematician Joseph Liouville first proved the things must exist, in 1844. (He used continued fractions to show there must be such things.) It would be seven years later that he gave an example of one in nice, easy-to-understand decimals. This is the number 0.110 001 000 000 000 000 000 001 000 000 (et cetera). This number is zero almost everywhere. But there’s a 1 in the n-th digit past the decimal if n is the factorial of some number. That is, 1! is 1, so the 1st digit past the decimal is a 1. 2! is 2, so the 2nd digit past the decimal is a 1. 3! is 6, so the 6th digit past the decimal is a 1. 4! is 24, so the 24th digit past the decimal is a 1. The next 1 will appear in spot number 5!, which is 120. After that, 6! is 720 so we wait for the 720th digit to be 1 again.

And what is this Liouville number 0.110 001 000 000 000 000 000 001 000 000 (et cetera) used for, besides showing that a transcendental number exists? Not a thing. It’s of no other interest. And this plagued the transcendental numbers until 1873. The only examples anyone had of transcendental numbers were ones built to show that they existed. In 1873 Charles Hermite showed finally that e, the base of the natural logarithm, was transcendental. e is a much more interesting number; we have reasons to care about it. Every exponential growth or decay or oscillating process has e lurking in it somewhere. In 1882 Ferdinand von Lindemann showed that π was transcendental, and that’s an even more interesting number.

That bit about π has interesting implications. One goes back to the ancient Greeks. Is it possible, using straightedge and compass, to create a square that’s exactly the same size as a given circle? This is equivalent to saying, if I give you a line segment, can you create another line segment that’s exactly the square root of π times as long? This geometric problem is equivalent to an algebraic one. That problem: can you create a polynomial, with integer coefficients, that has the square root of π as a root? (WARNING: I’m skipping some important points for the sake of clarity. DO NOT attempt to use this to pass your thesis defense without putting those points back in.) We want the square root of π because … well, what’s the area of a square whose sides are the square root of π long? That’s right. So we start with a line segment that’s equal to the radius of the circle and we can do that, surely. Once we have the radius, can’t we make a line that’s the square root of π times the radius, and from that make a square with area exactly π times the radius squared? Since π is transcendental, then, no. We can’t. Sorry. One of the great problems of ancient mathematics, and one that still has the power to attract the casual mathematician, got its final answer in 1882.

Georg Cantor is a name even non-mathematicians might recognize. He showed there have to be some infinite sets bigger than others, and that there must be more real numbers than there are rational numbers. Four years after showing that, he proved there are as many transcendental numbers as there are real numbers.

They’re everywhere. They permeate the real numbers so much that we can understand the real numbers as the transcendental numbers plus some dust. They’re almost the dark matter of mathematics. We don’t actually know all that many of them. Wolfram MathWorld has a table listing numbers proven to be transcendental, and the fact we can list that on a single web page is remarkable. Some of them are large sets of numbers, yes, like e^{\pi \sqrt{d}} for every positive whole number d. And we can infer many more from them; if π is transcendental then so is 2π, and so is 5π, and so is -20.38π, and so on. But the table of numbers proven to be transcendental is still just 25 rows long.

There are even mysteries about obvious numbers. π is transcendental. So is e. We know that at least one of π times e and π plus e is transcendental. Perhaps both are. We don’t know which one is, or if both are. We don’t know whether π^π is transcendental. We don’t know whether e^e is, either. Don’t even ask if π^e is.

How, by the way, does this fit with my claim that everything in mathematics is polynomials? — Well, we found these numbers in the first place by looking at polynomials. The set is defined, even to this day, by how a particular kind of polynomial can’t reach them. Thinking about a particular kind of polynomial makes visible this interesting set.

Some More Stuff To Read


I’ve actually got enough comics for yet another Reading The Comics post. But rather than overload my Recent Posts display with those I’ll share some pointers to other stuff I think worth looking at.

So remember how the other day I said polynomials were everything? And I tried to give some examples of things you might not expect had polynomials tied to them? Here’s one I forgot. Howard Phillips, of the HowardAt58 blog, wrote recently about discrete signal processing, the struggle to separate real patterns from random noise. It’s a hard problem. If you do very little filtering, then meaningless flutterings can look like growing trends. If you do a lot of filtering, then you miss rare yet significant events and you take a long time to detect changes. Either can be mistakes. The study of a filter’s characteristics … well, you’ll see polynomials. A lot.

For something else to read, and one that doesn’t get into polynomials, here’s a post from Stephen Cavadino of the CavMaths blog, about the areas of lunes. Lunes are … well, they’re kind of moon-shaped figures. Cavadino particularly writes about the Theorem of Not That Hippocrates. Start with a half circle. Draw a symmetric right triangle inside the circle. Draw half-circles off the two equal legs of that right triangle. The area between the original half-circle and the newly-drawn half circles is … how much? The answer may surprise you.

Cavadino doesn’t get into this, but: it’s possible to make a square that has the same area as these strange crescent shapes using only straightedge and compass. Not That Hippocrates knew this. It’s impossible to make a square with the exact same area as a circle using only straightedge and compass. But these figures, with edges that are defined by circles of just the right relative shapes, they’re fine. Isn’t that wondrous?

And this isn’t mathematics but what the heck. Have you been worried about the Chandler Wobble? Apparently there’s been a bit of a breakthrough in understanding it. Turns out water melting can change the Earth’s rotation enough to be noticed. And to have been noticed since the 1890s.

A Leap Day 2016 Mathematics A To Z: Polynomials


I have another request for today’s Leap Day Mathematics A To Z term. Gaurish asked for something exciting. This should be less challenging than Dedekind Domains. I hope.

Polynomials.

Polynomials are everything. Everything in mathematics, anyway. If humans study it, it’s a polynomial. If we know anything about a mathematical construct, it’s because we ran across it while trying to understand polynomials.

I exaggerate. A tiny bit. Maybe by three percent. But polynomials are big.

They’re easy to recognize. We can get them in pre-algebra. We make them out of a set of numbers called coefficients and one or more variables. The coefficients are usually either real numbers or complex-valued numbers. The variables we usually allow to be either real or complex-valued numbers. We take each coefficient and multiply it by some power of each variable. And we add all that up. So, polynomials are things that look like these things:

x^2 - 2x + 1
12 x^4 + 2\pi x^2 y^3 - 4x^3 y - \sqrt{6}
\ln(2) + \frac{1}{2}\left(x - 2\right) - \frac{1}{2 \cdot 2^2}\left(x - 2\right)^2 + \frac{1}{3 \cdot 2^3}\left(x - 2\right)^3 - \frac{1}{4 \cdot 2^4}\left(x - 2\right)^4  + \cdots
a_n x^n + a_{n - 1}x^{n - 1} + a_{n - 2}x^{n - 2} + \cdots + a_2 x^2 + a_1 x^1 + a_0

The first polynomial maybe looks nice and comfortable. The second may look a little threatening, what with it having two variables and a square root in it, but it’s not too weird. The third is an infinitely long polynomial; you’re supposed to keep going on in that pattern, adding even more terms. The last is a generic representation of a polynomial. Each number a0, a1, a2, et cetera is some coefficient that we in principle know. It’s a good way of representing a polynomial when we want to work with it but don’t want to tie ourselves down to a particular example. The highest power we raise a variable to we call the degree of the polynomial. A second-degree polynomial, for example, has an x2 in it, but not an x3 or x4 or x18 or anything like that. A third-degree polynomial has an x3, but not x to any higher powers. Degree is a useful way of saying roughly how long a polynomial is, so it appears all over discussions of polynomials.

But why do we like polynomials? Why like them so much that MathWorld lists 1,163 pages that mention polynomials?

It’s because they’re great. They do everything we’d ever want to do and they’re great at it. We can add them together as easily as we add regular old numbers. We can subtract them as well. We can multiply and divide them. There’s even prime polynomials, just like there are prime numbers. They take longer to work out, but they’re not harder.

And they do great stuff in advanced mathematics too. In calculus we want to take derivatives of functions. Polynomials, we always can. We get another polynomial out of that. So we can keep taking derivatives, as many as we need. (We might need a lot of them.) We can integrate too. The integration produces another polynomial. So we can keep doing that as long as we need to. (We need to do this a lot, too.) This lets us solve so many problems in calculus, which is about how functions work. It also lets us solve so many problems in differential equations, which is about systems whose change depends on the current state of things.

That’s great for analyzing polynomials, but what about things that aren’t polynomials?

Well, if a function is continuous, then it might as well be a polynomial. To be a little more exact, we can set a margin of error. And we can always find polynomials that are less than that margin of error away from the original function. The original function might be annoying to deal with. The polynomial that’s as close to it as we want, though, isn’t.

Not every function is continuous. Most of them aren’t. But most of the functions we want to do work with are, or at least are continuous in stretches. Polynomials let us understand the functions that describe most real stuff.
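One concrete way to see this, not something from the essay itself: Bernstein’s construction turns any continuous function on [0, 1] into a sequence of polynomials that settle down onto it. A sketch, with a sine curve standing in for the annoying function:

import math
from math import comb

def bernstein(f, n):
    """The degree-n Bernstein polynomial for f on [0, 1]; it converges to f as n grows."""
    return lambda x: sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
                         for k in range(n + 1))

f = lambda x: math.sin(3 * x)
for n in (10, 40, 160):
    p = bernstein(f, n)
    worst = max(abs(f(i / 200) - p(i / 200)) for i in range(201))
    print(n, worst)                      # the worst-case error shrinks as n grows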

Nice for mathematicians, all right, but how about for real uses? How about for calculations?

Oh, polynomials are just magnificent. You know why? Because you can evaluate any polynomial as soon as you can add and multiply. (Also subtract, but we think of that as addition.) Remember, x^4 just means “x times x times x times x”, four of those x’s in the product. All these polynomials are easy to evaluate.

Even better, we don’t have to evaluate them. We can automate away the evaluation. It’s easy to set a calculator doing this work, and it will do it without complaint and with few unforeseeable mistakes.

Now remember that thing where we can make a polynomial close enough to any continuous function? And we can always set a calculator to evaluate a polynomial? Guess what this means about continuous functions. We have a tool that lets us calculate stuff we would want to know. Things like arccosines and logarithms and Bessel functions and all that. And we get nice, easy-to-understand numbers out of them. For example, that third polynomial I gave you above? That’s not just infinitely long. It’s also a polynomial that approximates the natural logarithm. Pick a positive number x that’s between 0 and 4 and put it in that polynomial. Calculate terms and add them up. You’ll get closer and closer to the natural logarithm of that number. You’ll get there faster if you pick a number near 2, but you’ll eventually get there for whatever number you pick. (Calculus will tell us why x has to be between 0 and 4. Don’t worry about it for now.)
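Here’s that logarithm series in action, as a sketch. The coefficients are fixed numbers; past that, everything is just adding and multiplying, which is why a calculator is happy to do it.

import math

def log_series(x, terms=60):
    """Evaluate the polynomial approximation of ln(x) expanded about 2."""
    total = math.log(2)                  # the constant term
    power = 1.0
    for k in range(1, terms + 1):
        power *= (x - 2)                 # builds (x - 2)^k one multiplication at a time
        total += (-1) ** (k + 1) * power / (k * 2 ** k)
    return total

for x in (0.5, 1.0, 2.5, 3.5):
    print(x, log_series(x), math.log(x))     # the two columns agree closely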

So through polynomials we can understand functions, analytically and numerically.

And they keep revealing things to us. We discovered complex-valued numbers because we wanted to find roots, values of x that make a polynomial of x equal to zero. Some formulas worked well for third- and fourth-degree polynomials. (They look like the quadratic formula, which solves second-degree polynomials. The big difference is nobody remembers what they are without looking them up.) But the formulas sometimes called for things that looked like square roots of negative numbers. Absurd! But if you carried on as if these square roots of negative numbers meant something, you got meaningful answers. And correct answers.
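A tiny sketch of that carrying-on-anyway, using a quadratic for brevity rather than the cubic formula: the discriminant comes out negative, we take its square root as a complex number, and the answers check out.

import cmath

a, b, c = 1, 1, 1                        # x^2 + x + 1 = 0 has no real roots
disc = cmath.sqrt(b * b - 4 * a * c)     # the square root of -3, taken seriously
roots = ((-b + disc) / (2 * a), (-b - disc) / (2 * a))
for r in roots:
    print(r, r * r + r + 1)              # plugging back in gives (effectively) zero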

We wanted formulas to solve fifth- and higher-degree polynomials exactly. We can do this with second and third and fourth-degree polynomials, after all. It turns out we can’t. Oh, we can solve some of them exactly. The attempt to understand why, though, helped us create and shape group theory, the study of things that look like but aren’t numbers.

Polynomials go on, sneaking into everything. We can look at a square matrix and discover its characteristic polynomial. This allows us to find beautifully-named things like eigenvalues and eigenvectors. These reveal secrets of the matrix’s structure. We can find polynomials in the formulas that describe how many ways to split up a group of things into a smaller number of sets. We can find polynomials that describe how networks of things are connected. We can find polynomials that describe how a knot is tied. We can even find polynomials that distinguish between a knot and the knot’s reflection in the mirror.
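For instance, here’s the characteristic-polynomial business in a few lines, assuming NumPy is on hand; the matrix is just an example.

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

print(np.poly(A))                # characteristic polynomial coefficients: 1, -4, 3
print(np.roots(np.poly(A)))      # its roots ...
print(np.linalg.eigvals(A))      # ... are exactly the eigenvalues, 3 and 1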

Polynomials are everything.

A Leap Day 2016 Mathematics A To Z: Dedekind Domain


When I tossed this season’s A To Z open to requests I figured I’d get some surprising ones. So I did. This one’s particularly challenging. It comes from Gaurish Korpal, author of the Gaurish4Math blog.

Dedekind Domain

A major field of mathematics is Algebra. By this mathematicians don’t mean algebra. They mean studying collections of things on which you can do stuff that looks like arithmetic. There’s good reasons why this field has that confusing name. Nobody knows what they are.

We’ve seen before the creation of things that look a bit like arithmetic. Rings are a collection of things for which we can do something that works like addition and something that works like multiplication. There are a lot of different kinds of rings. When a mathematics popularizer tries to talk about rings, she’ll talk a lot about the whole numbers. We can usually count on the audience to know what they are. If that won’t do for the particular topic, she’ll try the whole numbers modulo something. If she needs another example then she talks about the ways you can rotate or reflect a triangle, or square, or hexagon and get the original shape back. Maybe she calls on the sets of polynomials you can describe. Then she has to give up on words and make do with pictures of beautifully complicated things. And after that she has to give up because the structures get too abstract to describe without losing the audience.

Dedekind Domains are a kind of ring that meets a bunch of extra criteria. There’s no point my listing them all. It would take several hundred words and you would lose motivation to continue before I was done. If you need them anyway Eric W Weisstein’s MathWorld dictionary gives the exact criteria. It also has explanations for all the words in those criteria.

Dedekind Domains, also called Dedekind Rings, are aptly named for Richard Dedekind. He was a 19th century mathematician, the last doctoral student of Gauss, and one of the people who defined what we think of as algebra. He also gave us a rigorous foundation for what irrational numbers are.

Among the problems that fascinated Dedekind was Fermat’s Last Theorem. This can’t surprise you. Every person who would be a mathematician is fascinated by it. We take our innings fiddling with cases and ways to show a^n + b^n can’t equal c^n for interesting whole numbers a, b, c, and n. We usually go about this by saying, “Suppose we have the smallest a, b, and c for which this is true and for which n is bigger than 2”. Then we do a lot of scribbling that shows this implies something contradictory, like an even number equals an odd, or that there’s some set of smaller numbers making this true. This proves the original supposition was false. Mathematicians first learn that trick as a way to show the square root of two can’t be a rational number. We stick with it because it’s nice and familiar and looks relevant. Most of us get maybe as far as proving there aren’t any solutions for n = 3 or maybe n = 4 and go on to other work. Dedekind didn’t prove the theorem. But he did find new ways to look at numbers.

One problem with proving Fermat’s Last Theorem is that it’s all about integers. Integers are hard to prove things about. Real numbers are easier. Complex-valued numbers are easier still. This is weird but it’s so. So we have this promising approach: if we could prove something like Fermat’s Last Theorem for complex-valued numbers, we’d get it for integers. Or at least we’d be a lot of the way there. The one flaw is that Fermat’s Last Theorem isn’t true for complex-valued numbers. It would be ridiculous if it were true.

But we can patch things up. We can construct something called Gaussian Integers. These are complex-valued numbers which we can match up to integers in a compelling way. We could use the tools that work on complex-valued numbers to squeeze out a result about integers.

You know that this didn’t work. If it had, we wouldn’t have had to wait for the 1990s for the proof of Fermat’s Last Theorem. And that proof hasn’t got anything to do with this stuff. One of the problems keeping this kind of proof from working is factoring. Whole numbers are either prime numbers or the product of prime numbers. Or they’re 1, ruled out of the universe of prime numbers for reasons I get to after the next paragraph. Prime numbers are those like 2, 5, 13, 37 and many others. They haven’t got any factors besides themselves and 1. The other whole numbers are the products of prime numbers. 12 is equal to 2 times 2 times 3. 35 is equal to 5 times 7. 165 is equal to 3 times 5 times 11.

If we stick to whole numbers, then, these all have unique prime factorizations. 24 is equal to 2 times 2 times 2 times 3. And there are no other combinations of prime numbers that multiply together to give us 24. We could rearrange the numbers — 2 times 3 times 2 times 2 works. But it will always be a combination of three 2’s and a single 3 that we multiply together to get 24.

(This is a reason we don’t consider 1 a prime number. If we did consider 1 a prime number, then “three 2’s and a single 3” would be a prime factorization of 24, but so would “three 2’s, a single 3, and two 1’s”. Also “three 2’s, a single 3, and fifteen 1’s”. Also “three 2’s, a single 3, and one 1”. We have a lot of theorems that depend on whole numbers having a unique prime factorization. We could add the phrase “except for the count of 1’s in the factorization” to every occurrence of the phrase “prime factorization”. Or we could say that 1 isn’t a prime number. It’s a lot less work to say 1 isn’t a prime number.)
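If you’d like to watch that unique collection of primes pop out, here’s a minimal sketch of trial-division factoring. The function name is mine, just for illustration; it isn’t anything from the essay above.

def prime_factors(n):
    """Return the prime factors of n, with repetition, smallest first."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)   # whatever is left over is itself prime
    return factors

print(prime_factors(24))    # [2, 2, 2, 3]
print(prime_factors(165))   # [3, 5, 11]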

The trouble is that when we work with number systems like the Gaussian integers we can lose that unique prime factorization. (The Gaussian integers themselves happen to keep it; many of their close cousins, built the same way from other square roots, do not.) There are still prime numbers. But it’s possible to get some numbers as a product of different sets of prime numbers. And that breaks a lot of otherwise promising attempts to prove Fermat’s Last Theorem. And there’s no getting around it, not for Fermat’s Last Theorem.

Dedekind saw a good concept lurking under this, though. The concept is called an ideal. It’s a subset of a ring that itself satisfies the rules for being a ring. And if you take something from the original ring and multiply it by something in the ideal, you get something that’s still in the ideal. You might already have one in mind. Start with the ring of integers. The even numbers are an ideal of that. Add any two even numbers together and you get an even number. Multiply any two even numbers together and you get an even number. Take any integer, even or not, and multiply it by an even number. You get an even number.

(If you were wondering: I mean the ideal would be a “ring without identity”. It’s not required to have something that acts like 1 for the purpose of multiplication. If we insisted on looking at the even numbers and the number 1, then we couldn’t be sure that adding two things from the ideal would stay in the ideal. After all, 2 is in the ideal, and if 1 also is, then 2 + 1 is a peculiar thing to consider an even number.)

It’s not just even numbers that do this. The multiples of 3 make an ideal in the integers too. Add two multiples of 3 together and you get a multiple of 3. Multiply two multiples of 3 together and you get another multiple of 3. Multiply any integer by a multiple of 3 and you get a multiple of 3.

The multiples of 4 also make an ideal, as do the multiples of 5, or the multiples of 82, or of any whole number you like.

Odd numbers don’t make an ideal, though. Add two odd numbers together and you don’t get an odd number. Multiply an integer by an odd number and you might get an odd number, you might not.
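If you’d like to poke at these conditions yourself, here’s a minimal sketch that spot-checks them over a small range of integers. The helper name is mine, and a finite check like this is evidence rather than proof.

def looks_like_an_ideal(belongs, sample=range(-50, 51)):
    """Spot-check the two ideal conditions for the subset picked out by belongs."""
    members = [n for n in sample if belongs(n)]
    closed_under_addition = all(belongs(a + b) for a in members for b in members)
    absorbs_multiplication = all(belongs(r * a) for r in sample for a in members)
    return closed_under_addition and absorbs_multiplication

print(looks_like_an_ideal(lambda n: n % 6 == 0))   # multiples of 6: True
print(looks_like_an_ideal(lambda n: n % 2 == 1))   # odd numbers: False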

And not every ring has an interesting ideal lurking within it. For example, take the integers modulo 3. In this case there are only three numbers: 0, 1, and 2. 1 + 1 is 2, uncontroversially. But 1 + 2 is 0. 2 + 2 is 1. 2 times 1 is 2, but 2 times 2 is 1 again. This is self-consistent. But it hasn’t got any ideal within it besides the whole thing and the set holding nothing but 0. There isn’t any smaller collection of its numbers that addition keeps you inside.

The multiples of 4 make an interesting ideal in the integers. They’re not just an ideal of the integers. They’re also an ideal of the even numbers. Well, the even numbers make a ring. They couldn’t be an ideal of the integers if they couldn’t be a ring in their own right. And the multiples of 4 — well, multiply any even number by a multiple of 4. You get a multiple of 4 again. This keeps on going. The multiples of 8 are an ideal for the multiples of 4, the multiples of 2, and the integers. Multiples of 16 and 32 make for even deeper nestings of ideals.

The multiples of 6, now … that’s an ideal of the integers, for all the reasons the multiples of 2 and 3 and 4 were. But it’s also an ideal of the multiples of 2. And of the multiples of 3. We can see the collection of “things that are multiples of 6” as a product of “things that are multiples of 2” and “things that are multiples of 3”. Dedekind saw this before us.

You might want to pause a moment while considering the idea of multiplying whole sets of numbers together. It’s a heady concept. Trying to do proofs with the concept feels at first like being tasked with alphabetizing a cloud. But we’re not planning to prove anything so you can move on if you like with an unalphabetized cloud.

A Dedekind Domain is a ring that has ideals like this. And the ideals come in two categories. Some are “prime ideals”, which act like prime numbers do. The non-prime ideals are the products of prime ideals. And while we might not have unique prime factorizations of numbers, we do have unique prime factorizations of ideals. That is, if an ideal is a product of some set of prime ideals, then it can’t also be the product of some other set of prime ideals. We get back something like unique factors.

This may sound abstract. But you know a Dedekind Domain. The integers are one. That wasn’t a given. Yes, we start algebra by looking for things that work like regular arithmetic do. But that doesn’t promise that regular old numbers will still satisfy us. We can, for instance, study things where the order matters in multiplication. Then multiplying one thing by a second gives us a different answer to multiplying the second thing by the first. Still, regular old integers are Dedekind domains and it’s hard to think of being more familiar than that.

Another example is the set of polynomials. You might want to pause for a moment here. Mathematics majors need a pause to start thinking of polynomials as being something kind of like regular old numbers. But you can certainly add one polynomial to another, and you get a polynomial out of it. You can multiply one polynomial by another, and you get a polynomial out of that. Try it. After that the only surprise would be that there are prime polynomials. But if you try to think of two polynomials that multiply together to give you “x + 1” you realize there have to be.

Other examples start getting more exotic. They’re things like the Gaussian integers I mentioned before. Gaussian integers are themselves an example of a structure called algebraic integers. Algebraic integers are — well, think of all the polynomials you can make out of integer coefficients, and with a leading coefficient of 1. So, polynomials that look like “x^3 - 4 x^2 + 15 x + 6” or the like. All of the roots of those, the values of x which make that expression equal to zero, are algebraic integers. Yes, almost none of them are integers. We know. But the algebraic integers are also a Dedekind Domain.
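To see what this buys, here’s the example that number theory textbooks usually reach for; it isn’t from the discussion above, but it is standard. Among numbers of the form a + b\sqrt{-5} , with a and b integers, the number 6 factors into irreducible pieces in two genuinely different ways:

6 = 2 \cdot 3 = \left(1 + \sqrt{-5}\right)\cdot\left(1 - \sqrt{-5}\right)

But the ideal of all multiples of 6 factors into prime ideals in exactly one way:

\left(6\right) = \left(2, 1 + \sqrt{-5}\right)^2 \cdot \left(3, 1 + \sqrt{-5}\right) \cdot \left(3, 1 - \sqrt{-5}\right)

Each parenthesized pair is the ideal generated by the two things inside it. The unique factorization we lost for the numbers comes back for the ideals.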

I’d like to describe some more Dedekind Domains. I am foiled. I can find some more, but explaining them outside the dialect of mathematics is hard. It would take me more words than I am confident readers will give me.

I hope you are satisfied to know a bit of what a Dedekind Domain is. It is a kind of thing which works much like integers do. But a Dedekind Domain can be just different enough that we can’t count on factoring working like we are used to. We don’t lose factoring altogether, though. We are able to keep an attenuated version. It does take quite a few words to explain exactly how to set this up, however.

Making Lots Of Change


John D Cook’s Algebra Fact of the Day points to a pair of algorithms about making change. Specifically these are about how many ways there are to provide a certain amount of change using United States coins. By that he, and the algorithms, mean 1, 5, 10, 25, and 50 cent pieces. I’m not sure if 50 cent coins really count, since they don’t circulate any more than dollar coins do. Anyway, if you want to include or rule out particular coins it’s clear enough how to adapt things.

What surprised me was a simple algorithm, taken from Ronald L Graham, Donald E Knuth, and Oren Patashnik’s Concrete Mathematics: A Foundation For Computer Science to count the number of ways to make a certain amount of change. You start with the power series that’s equivalent to this fraction:

\frac{1}{\left(1 - z\right)\cdot\left(1 - z^{5}\right)\cdot\left(1 - z^{10}\right)\cdot\left(1 - z^{25}\right)\cdot\left(1 - z^{50}\right)}

A power series is like a polynomial, except that it’s allowed to go on forever. The power series for \frac{1}{1 - z} , for example, is 1 + z + z^2 + z^3 + z^4 + \cdots and carries on forever like that. But if you choose a number between minus one and positive one, and put that in for z in either \frac{1}{1 - z} or in that series 1 + z + z^2 + z^3 + z^4 + \cdots you’ll get the same number. (If z is not between minus one and positive one, it doesn’t. Don’t worry about it. For what we’re doing we will never need to put in any particular z.)
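If you want to see that agreement numerically, here’s a tiny sketch; the variable names are mine and nothing here comes from Concrete Mathematics.

# Compare 1/(1 - z) with a long partial sum of 1 + z + z^2 + ... at z = 0.5.
z = 0.5
partial_sum = sum(z ** k for k in range(60))
print(partial_sum)      # very nearly 2.0
print(1 / (1 - z))      # exactly 2.0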

The power series for that big fraction with all the kinds of change in it is more tedious to work out. You’d need the power series for \frac{1}{1 - z} and \frac{1}{1 - z^5} and \frac{1}{1 - z^{10}} and so on, and to multiply all those series together. And yes, that’s multiplying infinitely long polynomials together, which you might reasonably expect will take some time.

You don’t need to, though. All you really want is a single term in this series. To tell how many ways there are to make n cents of change, look at the coefficient, the number, in front of the z^n term. That’s the number of ways. So while this may be a lot of work, it’s not going to be hard work, and it’s going to be finite. You only have to work out the products that give you a z^n power. That will take planning and preparation to do correctly, but that’s all.
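If you’d rather let a computer keep track of those products, here’s a minimal sketch of the same idea: multiply the series together but throw away every power higher than the one you care about. The function name and setup are mine, for illustration; they aren’t from Cook’s post or from Concrete Mathematics.

def ways_to_make_change(n, coins=(1, 5, 10, 25, 50)):
    # coeffs[k] holds the coefficient of z^k in the running product,
    # truncated at degree n.
    coeffs = [1] + [0] * n          # the series for the constant 1, to start
    for c in coins:
        # Multiplying by 1/(1 - z^c) = 1 + z^c + z^(2c) + ... means each
        # coefficient picks up the coefficient sitting c places below it.
        for k in range(c, n + 1):
            coeffs[k] += coeffs[k - c]
    return coeffs[n]

print(ways_to_make_change(100))     # the number of ways to make change for a dollar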

A Summer 2015 Mathematics A To Z: z-transform


z-transform.

The z-transform comes to us from signal processing. The signal we take to be a sequence of numbers, all representing something sampled at uniformly spaced times. The temperature at noon. The power being used, second-by-second. The number of customers in the store, once a month. Anything. The sequence of numbers we take to stretch back into the infinitely great past, and to stretch forward into the infinitely distant future. If it doesn’t, then we pad the sequence with zeroes, or some other safe number that we know means “nothing”. (That’s another classic mathematician’s trick.)

It’s convenient to have a name for this sequence. “a” is a good one. The different sampled values are denoted by an index. a_0 represents whatever value we have at the “start” of the sample. That might represent the present. That might represent where sampling began. That might represent just some convenient reference point. It’s the equivalent of mile marker zero; we have to have something be the start.

a_1, a_2, a_3, and so on are the first, second, third, and so on samples after the reference start. a_{-1}, a_{-2}, a_{-3}, and so on are the first, second, third, and so on samples from before the reference start. That might be the last couple of values before the present.

So for example, suppose the temperatures the last several days were 77, 81, 84, 82, 78. Then we would probably represent this as a_{-4} = 77, a_{-3} = 81, a_{-2} = 84, a_{-1} = 82, a_0 = 78. We’ll hope this is Fahrenheit or that we are remotely sensing a temperature.

The z-transform of a sequence of numbers is something that looks a lot like a polynomial, based on these numbers. For this five-day temperature sequence the z-transform would be the polynomial 77 z^4 + 81 z^3 + 84 z^2 + 82 z^1 + 78 z^0 . (z^1 is the same as z. z^0 is the same as the number “1”. I wrote it this way to make the pattern more clear.)

I would not be surprised if you protested that this doesn’t merely look like a polynomial but actually is one. You’re right, of course, for this set, where all our samples are from negative (and zero) indices. If we had positive indices then we’d lose the right to call the transform a polynomial. Suppose we trust our weather forecaster completely, and add in a_1 = 83 and a_2 = 76. Then the z-transform for this set of data would be 77 z^4 + 81 z^3 + 84 z^2 + 82 z^1 + 78 z^0 + 83 \left(\frac{1}{z}\right)^1 + 76 \left(\frac{1}{z}\right)^2 . You’d probably agree that’s not a polynomial, although it looks a lot like one.
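In code the construction is short: each sample a_n contributes a_n \cdot z^{-n} , so the past gets positive powers of z and the future gets powers of \frac{1}{z} . Here is a minimal sketch, with names of my own choosing; it isn’t from any signal-processing library.

def z_transform(samples, z):
    """samples maps each index n to its value a_n; z is any nonzero number."""
    return sum(a_n * z ** (-n) for n, a_n in samples.items())

temps = {-4: 77, -3: 81, -2: 84, -1: 82, 0: 78, 1: 83, 2: 76}
print(z_transform(temps, 2.0))      # evaluate the temperature transform at z = 2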

The use of z for these polynomials is basically arbitrary. The main reason to use z instead of x is that we can learn interesting things if we imagine letting z be a complex-valued number. And z carries connotations of “a possibly complex-valued number”, especially if it’s used in ways that suggest we aren’t looking at coordinates in space. It’s not that there’s anything in the symbol x that refuses the possibility of it being complex-valued. It’s just that z appears so often in the study of complex-valued numbers that it reminds a mathematician to think of them.

A sound question you might have is: why do this? And I grant there’s not much advantage in going from a list of temperatures “77, 81, 84, 82, 78, 83, 76” over to a polynomial-like expression 77 z^4 + 81 z^3 + 84 z^2 + 82 z^1 + 78 z^0 + 83 \left(\frac{1}{z}\right)^1 + 76 \left(\frac{1}{z}\right)^2 .

Where this starts to get useful is when we have an infinitely long sequence of numbers to work with. Yes, it does too. It will often turn out that an interesting sequence transforms into a polynomial that itself is equivalent to some easy-to-work-with function. My little temperature example there won’t do it, no. But consider the sequence that’s zero for all negative indices, and 1 for the zero index and all positive indices. This gives us the polynomial-like structure \cdots + 0z^2 + 0z^1 + 1 + 1\left(\frac{1}{z}\right)^1 + 1\left(\frac{1}{z}\right)^2 + 1\left(\frac{1}{z}\right)^3 + 1\left(\frac{1}{z}\right)^4 + \cdots . And that turns out to be the same as 1 \div \left(1 - \left(\frac{1}{z}\right)\right) . That’s much shorter to write down, at least.

Probably you’ll grant that, but still wonder what the point of doing that is. Remember that we started by thinking of signal processing. A processed signal is a matter of transforming your initial signal. By this we mean multiplying your original signal by something, or adding something to it. For example, suppose we want a five-day running average temperature. This we can find by taking one-fifth of today’s temperature, a_0, and adding to that one-fifth of yesterday’s temperature, a_{-1}, and one-fifth of the day before’s temperature a_{-2}, and one-fifth of a_{-3}, and one-fifth of a_{-4}.
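As a sketch of what that processing looks like, here’s the five-day average written out, along with the z-transform of the averaging rule evaluated at one point. The names are mine, this isn’t from a signal-processing library, and it’s only meant to make the idea concrete.

def five_day_average(samples):
    """Average each sample with the four before it (fewer at the very start)."""
    out = []
    for i in range(len(samples)):
        window = samples[max(0, i - 4): i + 1]
        out.append(sum(window) / len(window))
    return out

def averaging_transform(z):
    """z-transform of the averaging rule: one-fifth of (1/z)^0 through (1/z)^4."""
    return sum((1 / z) ** k for k in range(5)) / 5

print(five_day_average([77, 81, 84, 82, 78, 83, 76]))
print(abs(averaging_transform(1j)))     # how the rule treats one particular z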

The effect of processing a signal is equivalent to manipulating its z-transform. By studying properties of the z-transform, such as where its values are zero or where they are imaginary or where they are undefined, we learn things about what the processing is like. We can tell whether the processing is stable — does it keep a small error in the original signal small, or does it magnify it? Does it serve to amplify parts of the signal and not others? Does it dampen unwanted parts of the signal while keeping the main signal intact?

We can understand how data will be changed by understanding the z-transform of the way we manipulate it. That z-transform turns a signal-processing idea into a complex-valued function. And we have a lot of tools for studying complex-valued functions. So we become able to say a lot about the processing. And that is what the z-transform gets us.

Vertex of a parabola – language in math again


I don’t want folks thinking I’m claiming a monopoly on the mathematics-glossary front. HowardAt58 is happy to explain words too. Here he talks about one of the definitions of “vertex”, in this case the one that relates to parabolas and other polynomial curves. As a bonus, there’s osculating circles.

Saving school math

Here are some definitions of the vertex of a parabola.

One is complete garbage, one is correct, though put rather chattily.

The rest are not definitions, though very popular (this is just a selection). But they are true statements.

Mathwarehouse: The vertex of a parabola is the highest or lowest point, also known as the maximum or minimum of a parabola.
Mathopenref: A parabola is the shape defined by a quadratic equation. The vertex is the peak in the curve as shown on the right. The peak will be pointing either downwards or upwards depending on the sign of the x^2 term.
Virtualnerd: Each quadratic equation has either a maximum or minimum, but did you know that this point has a special name? In a quadratic equation, this point is called the vertex!
Mathwords: Vertex of a Parabola: The point at which a parabola makes its sharpest turn.
Purplemath: The…


A Summer 2015 Mathematics A To Z: knot


Knot.

It’s a common joke that mathematicians shun things that have anything to do with the real world. You can see where the impression comes from, though. Even common mathematical constructs, such as “functions”, are otherworldly abstractions once a mathematician is done defining them precisely. It can look like mathematicians find real stuff to be too dull to study.

Knot theory goes against the stereotype. A mathematician’s knot is just about what you would imagine: threads of something that get folded and twisted back around themselves. Every now and then a knot theorist will get a bit of human-interest news going for the department by announcing a new way to tie a tie, or to tie a shoelace, or maybe something about why the Christmas tree lights get so tangled up. These are really parts of the field, and applications that almost leap off the page as one studies. It’s a bit silly, admittedly. The only way anybody needs to tie a tie is go see my father and have him do it for you, and then just loosen and tighten the knot for the two or three times you’ll need it. And there’s at most two ways of tying a shoelace anybody needs. Christmas tree lights are a bigger problem but nobody can really help with getting them untangled. But studying the field encourages a lot of sketches of knots, and they almost cry out to be done out of some real material.

One amazing thing about knots is that they can be described as mathematical expressions. There are multiple ways to encode a description for how a knot looks as a polynomial. An expression like t + t^3 - t^4 contains enough information to draw one knot as opposed to all the others that might exist. (In this case it’s a very simple knot, one known as the right-hand trefoil knot. A trefoil is the simplest knot that can’t be untangled, the three-lobed pattern familiar from pretzels and Celtic designs.) Indeed, it’s possible to describe knots with polynomials that let you distinguish between a knot and its mirror-image reflection.

Biology, life, is knots. The DNA molecules that carry and transmit genes tangle up on themselves, creating knots. The molecules that DNA encodes, proteins and enzymes and all the other basic tools of cells, can be represented as knots. Since at this level the field is about how molecules interact, you probably would expect that much of chemistry can be seen as the ways knots interact. Statistical mechanics, the study of unspeakably large numbers of particles, does as well. A field you can be introduced to by studying your sneaker runs through the most useful arteries of science.

That said, mathematicians do make their knots of unreal stuff. The mathematical knot is, normally, a one-dimensional thread rather than a cylinder of stuff like a string or rope or shoelace. No matter; just imagine you’ve got a very thin string. And we assume that it’s frictionless; the knot doesn’t get stuck on itself. As a result a mathematician just learning knot theory would snootily point out that however tightly wound up your extension cord is, it’s not actually knotted. You could in principle push one of the ends of the cord all the way through the knot and so loosen it into an untangled string, if you could push the cord from one end and if the cord didn’t get stuck on itself. So, yes, real-world knots are mathematically not knots. After all, something that just falls apart with a little push hardly seems worth the name “knot”.

My point is that mathematically a knot has to be a closed loop. And it’s got to wrap around itself in some sufficiently complicated way. A simple circle of string is not a knot. If “not a knot” sounds a bit childish you might use instead the Lewis Carrollian term “unknot”.

We can fix that, though, using a surprisingly common mathematical trick. Take the shoelace or rope or extension cord you want to study. And extend it: draw lines from either end of the cord out to the edge of your paper. (This is a great field for doodlers.) And then pretend that the lines go out and loop around, touching each other somewhere off the sheet of paper, as simply as possible. What had been an unknot is now not an unknot. Study wisely.

Everything I Learned In Eighth-Grade Math


My title is an exaggeration. In eighth grade Prealgebra I learned many things, but I confess that I didn’t learn well from that particular teacher that particular year. What I most clearly remember learning I picked up from a substitute who filled in a few weeks. It’s a method for factoring quadratic expressions into binomial expressions, and I must admit, it’s not very good. It’s cumbersome and totally useless once one knows the quadratic formula. But it’s fun to do, and I liked it a lot, and I’ve never seen it described as a way to factor quadratic expressions. So let me put it on the web and do what I can to preserve its legacy, and get hundreds of people telling me what it actually is and how everybody but the people I know went through a phase of using it.

It’s a method which looks at first like it’s going to be a magic square, but it’s not, and I’m at a loss what to call it. I don’t remember the substitute teacher’s name, so I can’t use that. I do remember the regular teacher’s name, but it wasn’t, as far as I know, part of his lesson plan, and it’d not be fair to him to let his legacy be defined by one student who just didn’t get him.

Continue reading “Everything I Learned In Eighth-Grade Math”