I don’t know. I say this for anyone this has unintentionally clickbaited, or who’s looking at a search engine’s preview of the page.
I come to this question from a friend, though, and it’s got me wondering. I don’t have a good answer, either. But I’m putting the question out there in case someone reading this, sometime, does know. Even if it’s in the remote future, it’d be nice to know.
And before getting to the question I should admit that “why” questions are, to some extent, a mug’s game. Especially in mathematics. I can ask why the sum of two consecutive triangular numbers is a square number. But the answer is … well, that’s what we chose to mean by ‘triangular number’, ‘square number’, ‘sum’, and ‘consecutive’. We can show why the arithmetic of the combination makes sense. But that doesn’t seem to answer “why” the way, like, why Neil Armstrong was the first person to walk on the moon. It’s more a “why” like, “why are there Seven Sisters [ in the Pleiades ]?” [*]
But looking for “why” can, at least, give us hints to why a surprising result is reasonable. Draw dots representing a square number, slice it along the space right below a diagonal. You see dots representing two successive triangular numbers. That’s the sort of question I’m asking here.
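The arithmetic behind that picture is short. Here is a sketch, writing $T_n$ for the n-th triangular number, so that $T_n = \frac{n(n+1)}{2}$ (the label $T_n$ is my own shorthand for this note):

```latex
T_{n-1} + T_n
  = \frac{(n-1)\,n}{2} + \frac{n\,(n+1)}{2}
  = \frac{n\left[(n-1) + (n+1)\right]}{2}
  = n^2
```

This shows that the sum works out, which is not quite the same as saying why. The sliced-square picture does more of that work.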
From here, we get to some technical stuff and I apologize to readers who don’t know or care much about this kind of mathematics. It’s about the wave-mechanics formulation of quantum mechanics. In this, everything that’s observable about a system is contained within a function named $\Psi$. You find $\Psi$ by solving a differential equation. The differential equation represents problems. Like, a particle experiencing some force that depends on position. This is written as a potential energy, because that’s easier to work with. But it’s the kind of problem that gets done.
Each thing that you can possibly observe, in a quantum-mechanics context, matches an operator. For example, there’s the x-coordinate operator, which tells you where along the x-axis your particle’s likely to be found. This operator is, conveniently, just x. So evaluate $x\Psi$ and that’s your x-coordinate distribution. (This is assuming that we know $\Psi$ in Cartesian coordinates, ones with an x-axis. Please let me do that.) This looks just like multiplying your old function by x, which is nice and easy.
Or you might want to know momentum. The momentum in the x-direction has an operator, $\hat{p}_x$, which equals $-i\hbar\frac{\partial}{\partial x}$. The $\partial$ is partial derivatives. The $\hbar$ is Planck’s constant (the reduced version, divided by $2\pi$), a number which in normal systems of measurement is amazingly tiny. And you know how $i^2 = -1$. That – symbol is just the minus or the subtraction symbol. So to find the momentum distribution, evaluate $-i\hbar\frac{\partial}{\partial x}\Psi$. This means taking a derivative of the $\Psi$ you already had. And multiplying it by some numbers.
But. Why is there a $-i\hbar$ in the momentum operator rather than the position operator? Why isn’t one and the other ? From a mathematical physics perspective, position and momentum are equally good variables. We tend to think of position as fundamental, but that’s surely a result of our happening to be very good at seeing where things are. If we were primarily good at spotting the momentum of things around us, we’d surely see that as the more important variable. When we get into Hamiltonian mechanics we start treating position and momentum as equally fundamental. Even the notation emphasizes how equal they are in importance, and treatment. We stop using ‘x’ or ‘r’ as the variable representing position. We use ‘q’ instead, a mirror to the ‘p’ that’s the standard for momentum. (‘p’ we’ve always used for momentum because … … … uhm. I guess ‘m’ was already committed, for ‘mass’. What I have seen is that it was taken as the first letter in ‘impetus’ with no other work to do. I don’t know that this is true. I’m passing on what I was told explains what looks like an arbitrary choice.)
So I’m supposing that this reflects how we normally set up $\Psi$ as a function of position. That this is maybe why the position operator is so simple and bare. And then why the momentum operator has a minus, an imaginary number, and this partial derivative stuff. That if we started out with the wave function as a function of momentum, the momentum operator would be just the momentum variable. The position operator might be some mess with $i$ and derivatives or worse.
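For what it’s worth, the standard textbook momentum-space picture does look like that supposition. Here is a sketch in one dimension, assuming the usual Fourier-transform convention and writing $\Psi(x)$ for the position-space wave function and $\Phi(p)$ for the same state as a function of momentum. (The name $\Phi$ and the sign convention in the exponent are assumptions I am making for this note.)

```latex
\Phi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} \Psi(x)\, e^{-i p x / \hbar}\, dx
\qquad
\hat{p}\,\Phi(p) = p\,\Phi(p)
\qquad
\hat{x}\,\Phi(p) = i\hbar \frac{\partial}{\partial p}\,\Phi(p)
```

So in that picture the momentum operator is bare and the position operator carries the $i\hbar$ and the derivative. The sign flips compared to the position picture, and that flip comes from the sign chosen in the exponent.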
I don’t have a clear guess why one and not the other operator gets full possession of the $-i\hbar$ though. I suppose that has to reflect convenience. If position and momentum are dual quantities then I’d expect we could put a mere constant like $\hbar$ wherever we want. But this is, mostly, me writing out notes and scattered thoughts. I could be trying to explain something that might be as explainable as why the four interior angles of a rectangle are all right angles.
So I would appreciate someone pointing out the obvious reason these operators look like that. I may grumble privately at not having seen the obvious myself. But I’d like to know it anyway.
Today’s A To Z term is … well, my second choice. Goldenoj suggested Yang-Mills and I was so interested. Yang-Mills describes a class of mathematical structures. They particularly offer insight into how to do quantum mechanics. Especially particle physics. It’s of great importance. But, on thinking out what I would have to explain I realized I couldn’t write a coherent essay about it. Getting to what the theory is made of would take explaining a bunch of complicated mathematical structures. If I’d scheduled the A-to-Z differently, setting up matters like Lie algebras, maybe I could do it, but this time around? No such help. And I don’t feel comfortable enough in my knowledge of Yang-Mills to describe it without describing its technical points.
That said I hope that Jacob Siehler, who suggested the Game of ‘Y’, does not feel slighted. I hadn’t known anything of the game going in to the essay-writing. When I started research I was delighted. I have yet to actually play a for-real game of this. But I like what I see, and what I think I can write about it.
Game of ‘Y’.
This is, as the name implies, a game. It has two players. They have the same objective: to create a ‘y’. Here, they do it by laying down tokens representing their side. They take turns, each laying down one token in a turn. They do this on a shape with three edges. The ‘y’ is created when there’s a continuous path of their tokens that reaches all three edges. Yes, it counts to have just a single line running along one edge of the board. This makes a pretty sorry ‘y’ but it suggests your opponent isn’t trying.
There are details of implementation. The board is a mesh of, mostly, hexagons. I take this to be for the same reason that so many conquest-type strategy games use hexagons. They tile space well, they give every space a good number of neighbors, and the distance from the center of one space to the center of any neighbor is constant. In a square grid, the centers of diagonal neighbors are farther than the centers of left-right or up-down neighbors. Hexagons do well for this kind of game, where the goal is to fill space, or at least fill paths in space. There’s even a game named Hex, slightly older than Y, with similar rules. In that the goal is to draw a continuous path from one end of the rectangular grid to another. The commercial boards that I see are around nine hexagons on a side. This probably reflects a desire to have a big enough board that games go on a while, but not so big that they go on forever.
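The winning condition is easy to state in code. This is a minimal sketch, assuming a plain triangular board of hexes with no irregular spaces, where row `r` holds cells `(r, 0)` through `(r, r)`. That coordinate scheme and the names `neighbors` and `has_y` are my own, not anything from a published rulebook.

```python
from collections import deque

def neighbors(r, c, n):
    """Yield the cells adjacent to (r, c) on a triangular hex board with n rows."""
    for dr, dc in ((0, -1), (0, 1), (-1, -1), (-1, 0), (1, 0), (1, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < n and 0 <= cc <= rr:
            yield rr, cc

def has_y(tokens, n):
    """Return True if one connected group of `tokens` touches all three edges."""
    tokens = set(tokens)
    seen = set()
    for start in tokens:
        if start in seen:
            continue
        # Breadth-first search over one connected group of tokens.
        queue = deque([start])
        seen.add(start)
        left = right = bottom = False
        while queue:
            r, c = queue.popleft()
            left = left or c == 0          # left edge of the triangle
            right = right or c == r        # right edge of the triangle
            bottom = bottom or r == n - 1  # bottom edge of the triangle
            for cell in neighbors(r, c, n):
                if cell in tokens and cell not in seen:
                    seen.add(cell)
                    queue.append(cell)
        if left and right and bottom:
            return True
    return False

# A single line along the bottom edge of a three-row board counts as a 'y'.
print(has_y({(2, 0), (2, 1), (2, 2)}, 3))
```

Corner cells sit on two edges at once, which is why a line along one edge is enough to win.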
Mathematicians have things to say about this game. It fits nicely in game theory. It’s well-designed to show some things about game theory. It’s a game of perfect information, for example. Each player knows, at all times, the moves all the players have made. Just look at the board and see where they’ve placed their tokens. A player might have forgotten the order the tokens were placed in, but that’s the player’s problem, not the game’s. Anyway in Y, the order of token-placing doesn’t much matter.
It’s also a game of complete information. Every player knows, at every step, what the other player could do. And what objective they’re working towards. One party, thinking enough, could forecast the other’s entire game. This comes close to the joke about the prisoners telling each other jokes by shouting numbers out to one another.
It is also a game in which a draw is impossible. Play long enough and someone must win. This even if both parties are for some reason trying to lose. There are ingenious proofs of this, but we can show it by considering a really simple game. Imagine playing Y on a tiny board, one that’s just one hex on each side. Definitely want to be the first player there.
So now imagine playing a slightly bigger board. Augment this one-by-one-by-one board by one row. That is, here, add two hexes along one of the sides of the original board. So there’s two pieces here; one is the original territory, and one is this one-row augmented territory. Look first at the original territory. Suppose that one of the players has gotten a ‘Y’ for the original territory. Will that player win the full-size board? … Well, sure. The other player can put a token down on either hex in the augmented territory. But there’s two hexes, either of which would make a path that connects the three edges of the board. The first player can put a token down on the other hex in the augmented territory, and now connects all three of the new sides again. First player wins.
All right, but how about a slightly bigger board? So take that two-by-two-by-two board and augment it, adding three hexes along one of the sides. Imagine a player’s won the original territory board. Do they have to win the full-size board? … Sure. The second player can put something in the augmented territory. But there’s again two hexes that would make the path connecting all three sides of the full board. The second player can put a token in one of those hexes. But the first player can put a token in the other of those. First player wins again.
How about a slightly bigger board yet? … Same logic holds. Really the only reason that the first player doesn’t always win is that, at some point, the first player screws up. And this is an existence proof, showing that the first player can always win. It doesn’t give any guidance into how to play, though. If the first player plays perfectly, she’s compelled to win. This is something which happens in many two-player, symmetric games. A symmetric game is one where either player has the same set of available moves, and can make the same moves with the same results. This proof needs to be tightened up to really hold. But it should convince you, at least, that the first player has an advantage.
So given that, the question becomes why play this game after you’ve decided who’ll go first? The reason you might if you were playing a game is, what, you have something else to do? And maybe you think you’ll make fewer mistakes than your opponent. One approach often used in symmetric games like this is the “pie rule”. The name comes from the story about how to slice a pie so you and your sibling don’t fight over the results. One cuts the pie, the other gets first pick of the slice, and then you fight anyway. In this game, though, one player makes a tentative first move. The other decides whether they will be Player One with that first move made or whether they’ll be Player Two, responding.
There are some neat quirks in the commercial Y games. One is that they don’t actually show hexes, and you don’t put tokens in the middle of hexes. Instead you put tokens on the spots that would be the center of the hex. On the board are lines pointing to the neighbors. This makes the board actually a string of triangles. This is the dual to the hex grid. It shows a set of vertices, and their connections, instead of hexes and their neighbors. Whether you think the hex grid or this dual makes it easier to tell when you’ve connected all three edges is a matter of taste. It does make the edges less jagged all around.
Another is that there will be three vertices that don’t connect to six others. They connect to five others, instead. Their spaces would be pentagons. As I understand the literature on this, this is a concession to game balance. It makes it easier for one side to fend off a path coming from the center.
It has geometric significance, though. A pure hexagonal grid is a structure that tiles the plane. A mostly hexagonal grid, with a couple of pentagons, though? That can tile the sphere. To cover the whole sphere you need something like at least twelve irregular spots. But this? With the three pentagons? That gives you a space that’s topologically equivalent to a hemisphere, or at least a slice of the sphere. If we do imagine the board to be a hemisphere covered, then the result of the handful of pentagon spaces is to make the “pole” closer to the equator.
So as I say the game seems fun enough to play. And it shows off some of the ways that game theorists classify games. And the questions they ask about games. Is the game always won by someone? Does one party have an advantage? Can one party always force a win? It also shows the kinds of approach game theorists can use to answer these questions. This before they consider whether they’d enjoy playing it.
I came across a little geometry thing that left me unsettled, even as I have to admit it’s correct. Start with a two-dimensional space, or as the hew-mons call it, a plane. Draw a square with sides of length two and centered on the origin. So it has corners at the points with Cartesian coordinates (+1, +1), (+1, -1), (-1, +1), and (-1, -1). Around each of these corners draw a circle of radius 1.
There is some largest circle that you can draw, centered on the origin, the dead center of the square, with Cartesian coordinates (0, 0), and that just touches all of the corners’ circles. It has a radius of a little over 0.414.
Now think of the three-dimensional analog. Three-dimensional space. Draw a box with sides all of length two and centered on the origin. So it has corners at the points with Cartesian coordinates (+1, +1, +1), (+1, +1, -1), (+1, -1, +1), (+1, -1, -1), (-1, +1, +1), (-1, +1, -1), (-1, -1, +1), and (-1, -1, -1). Around each of these eight corners draw a sphere of radius 1.
There is some largest sphere that you can draw, centered on the origin, the point with Cartesian coordinates (0, 0, 0), that just touches all of the corners’ spheres. It has a radius of a little over 0.732.
Think of the four-dimensional analog. This is harder to sketch. But a four-dimensional hypercube, with each side of length 2 and centered on the origin. So it has corners at the points with Cartesian coordinates (+1, +1, +1, +1), (+1, +1, +1, -1), (+1, +1, -1, +1), (+1, +1, -1, -1), and you know what? Will you let me pretend we listed all sixteen corners? Thanks. Around each of these corners draw a hypersphere of radius 1.
There is some largest hypersphere you can draw, centered on the origin, the point with Cartesian coordinates (0, 0, 0, 0), and that just touches all of these corners’ hyperspheres. It has a radius of 1.
Keep going. Five-dimensional space, with corners like (+1, +1, +1, +1, +1). Six-dimensional space, with corners like (+1, +1, +1, +1, +1, +1). Seven-dimensional space. And so on.
Eventually, the space is vast enough that the radius of this largest-touching hypersphere is bigger than 2. That is, reaching out more than twice as far as the original box goes, this even though the corner hyperspheres line the edges of the box, and touch one another along those edges.
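The numbers come from one line of reasoning: a corner like (+1, +1, …, +1) sits at distance $\sqrt{n}$ from the origin in n dimensions, and the corner’s unit sphere reaches 1 back toward the center, so the central sphere has radius $\sqrt{n} - 1$. Here is a minimal sketch that tabulates this, with `inner_radius` being a name I made up for the purpose:

```python
import math

def inner_radius(n):
    """Radius of the origin-centered sphere touching unit spheres at (±1, ..., ±1)."""
    return math.sqrt(n) - 1.0

for n in (2, 3, 4, 9, 10):
    print(n, round(inner_radius(n), 3))
```

By this formula the radius is exactly 2 in nine dimensions and first exceeds 2 in ten.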
Non-Euclidean geometry has the reputation of holding deep, inscrutable mysteries. To say something is a non-Euclidean space, outside of a mathematical context, is to designate it as a place immune to reason and beyond human comprehension. This is not such a case. This is a completely Euclidean space; it’s just got a lot of dimensions to it. Strange things will happen.
Another weird, but to me not so unsettling matter, concerns the surface (or hypersurface) area and the volume of these spheres. The circumference of a unit circle is, famously, 2π. The area of a unit sphere is 4π. For a four-dimensional hypersphere the surface area is a bit bigger yet. And bigger again for five and six and seven dimensions. But at eight dimensions the surface area starts shrinking again, and it never grows again. Have a great enough number of dimensions and the unit hypersphere has almost zero surface area. The volume of a unit circle is π. Of a unit sphere, $\frac{4}{3}\pi$. For a four-dimensional hypersphere, $\frac{1}{2}\pi^2$. For a five-dimensional hypersphere, $\frac{8}{15}\pi^2$. It is never so large again; for six or more dimensions the volume starts to shrink again. As the number of dimensions of space grows, the volume of the unit hypersphere dwindles to zero.
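These can be checked against the standard closed form for the unit ball in n dimensions, $V_n = \pi^{n/2} / \Gamma(\frac{n}{2} + 1)$, with the bounding surface having area $n V_n$. Here is a minimal sketch that tabulates both. The function names are my own, and `n` counts the dimensions of the surrounding space.

```python
import math

def ball_volume(n):
    """Volume of the unit ball in n-dimensional space."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def sphere_area(n):
    """Surface area of the unit sphere that bounds the unit ball in n-dimensional space."""
    return n * ball_volume(n)

for n in range(2, 11):
    print(n, round(sphere_area(n), 3), round(ball_volume(n), 3))
```

In the table the surface area peaks at seven dimensions and the volume at five, matching the description above.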
You know, that’s unsettling me more now that I’m paying attention to it.
Now let me discuss the comic strips from last week with some real meat to their subject matter. There weren’t many: after Wednesday of last week there were only casual mentions of any mathematics topic. But one of the strips got me quite excited. You’ll know which soon enough.
Mac King and Bill King’s Magic in a Minute for the 10th uses everyone’s favorite topological construct to do a magic trick. This one uses a neat quirk of the Möbius strip: that if sliced along the center of its continuous loop you get not two separate shapes but one loop of greater length. (The longer loop picks up extra twists and has two sides, so strictly it is no longer a Möbius strip.) There are more astounding feats possible. If the strip were cut one-third of the way from an edge it would slice the strip into two linked shapes, one another Möbius strip and one a longer, twisted loop.
Or consider not starting with a Möbius strip. Make the strip of paper by taking one end and twisting it twice around, for a full loop, before taping it to the other end. Slice this down the center and what results are two interlinked rings. Or place three twists in the original strip of paper before taping the ends together. Then, the shape, cut down the center, unfolds into a trefoil knot. But this would take some expert hand work to conceal the loops from the audience while cutting. It’d be a neat stunt if you could stage it, though.
Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 10th uses mathematics as obfuscation. We value mathematics for being able to make precise and definitely true statements. And for being able to describe the world with precision and clarity. But this has got the danger that people hear mathematical terms and tune out, trusting that the point will be along soon after some complicated talk.
The formulas on the blackboard are nearly all legitimate, and correct, formulas for the value of π. The upper-left and the lower-right formulas are integrals, and ones that correspond to particular trigonometric formulas. The middle-left and the upper-right formulas are series, the sums of infinitely many terms. The one in the upper right, $\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}$, was roughly proven by Leonhard Euler. Euler developed a proof that’s convincing, but that assumed that infinitely-long polynomials behave just like finitely-long polynomials. In this context, he was correct, but this can’t be generally trusted to happen. We’ve got proofs that, to our eyes, seem rigorous enough now.
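The series Euler’s infinite-polynomial argument is famous for is the sum of the reciprocals of the squares, which comes to $\pi^2/6$. Taking that to be the series meant here (an assumption on my part), a minimal sketch shows how slowly the partial sums close in. The name `basel_partial_sum` is my own.

```python
import math

def basel_partial_sum(terms):
    """Sum of 1/n^2 for n = 1 .. terms."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))

approx = basel_partial_sum(100_000)
print(approx, math.pi ** 2 / 6)          # partial sum next to the limiting value
print(math.sqrt(6 * approx), math.pi)    # the estimate of pi this implies
```

The gap after N terms is roughly 1/N, so this is a lovely formula and a poor way to compute digits.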
The center-left formula doesn’t look correct to me. To my eye, this looks like a mistaken representation of the formula
The center-right formula is interesting because, in part, it looks weird. It’s written out as
That looks at first glance like something’s gone wrong with one of those infinite-product series for π. Not so; this is a notation used for continued fractions. A continued fraction has a string of denominators that are typically some whole number plus another fraction. Often the denominator of that fraction will itself be a whole number plus another fraction. This gets to be typographically challenging. So we have this notation instead. Its syntax is that
There are many attractive formulas for π. It’s tempting to say this is because π is such a lovely number it naturally has beautiful formulas. But more likely humans are so interested in π we go looking for formulas with some appealing sequence to them. There are some awful-looking formulas out there too. I don’t know your tastes, but for me I feel my heart cool when I see that π is equal to four divided by this number:
however much I might admire the ingenuity which found that relationship, and however efficiently it may calculate digits of π.
Glenn McCoy and Gary McCoy’s The Duplex for the 13th uses skill at arithmetic as shorthand for proving someone’s a teacher. There’s clearly some implicit idea that this is a school teacher, probably for elementary schools, and doesn’t have a particular specialty. But it is only three panels; they have to get the joke done, after all.
I knew by Thursday this would be a brief week. The number of mathematically-themed comic strips has been tiny. I’m not upset, as the days turned surprisingly full on me once again. At some point I would have to stop being surprised that every week is busier than I expect, right?
Anyway, the week gives me plenty of chances to look back to 1936, which is great fun for people who didn’t have to live through 1936.
Elzie Segar’s Thimble Theatre rerun for the 28th of October is part of the story introducing Eugene the Jeep. The Jeep has astounding powers which, here, are finally explained as being due to it being a fourth-dimensional creature. Or at least able to move into the fourth dimension. This is amazing for how it shows off the fourth dimension being something you could hang a comic strip plot on, back in the day. (Also back in the day, humor strips with ongoing plots that might run for months were very common. The only syndicated strips like it today are Gasoline Alley, Alley Oop, the current storyline in Safe Havens where they’ve just gone and terraformed Mars, and Popeye, rerunning old daily stories.) The Jeep has many astounding powers, including that he can’t be kept inside — or outside — anywhere against his will, and he’s able to forecast the future.
Could there be a fourth-dimensional animal? I dunno, I’m not a dimensional biologist. It seems like we need a rich chemistry for life to exist. Lots of compounds, many of them long and complicated ones. Can those exist in four dimensions? I don’t know the quantum mechanics of chemical formation well enough to say. I think there’s obvious problems. Electrical attraction and repulsion would fall off much more rapidly with distance than they do in three-dimensional space. This seems like it argues chemical bonds would be weaker things, which generically makes for weaker chemical compounds. So probably a simpler chemistry. On the other hand, what’s interesting in organic chemistry is shapes of molecules, and four dimensions of space offer plenty of room for neat shapes to form. So maybe that compensates for the chemical bonds. I don’t know.
But if we take the premise as given, that there is a four-dimensional animal? With some minor extra assumptions then yeah, the Jeep’s powers fit well enough. Not being able to be enclosed follows almost naturally. You, a three-dimensional being, can’t be held against your will by someone tracing a line on the floor around you. The Jeep — if the fourth dimension is as easy to move through as the third — has the same ability.
Forecasting the future, though? We have a long history of treating time as “the” fourth dimension. There’s ways that this makes good organizational sense. But we do have to treat time as somehow different from space, even to make, for example, general relativity work out. If the Jeep can see and move through time? Well, yeah, then if he wants he can check on something for you, at least if it’s something whose outcome he can witness. If it’s not, though? Well, maybe the flow of events from the fourth dimension is more obvious than it is from a mere three, in the way that maybe you can spot something coming down the creek easily, from above, in a way that people on the water can’t tell.
Olive Oyl and Popeye use the Jeep to tease one another, asking for definite answers about whether the other is cute or not. This seems outside the realm of things that the fourth dimension could explain. In the 1960s cartoons he even picks up the power to electrically shock offenders; I don’t remember if this was in the comic strips at all.
Elzie Segar’s Thimble Theatre rerun for the 29th of October has Wimpy doing his best to explain the fourth dimension. I think there’s a warning here for mathematics popularizers. He gets off to a fair start and then it all turns into a muddle. Explaining the fourth dimension in terms of the three dimensions we’re familiar with seems like a good start. Appealing to our intuition to understand something we have to reason about has a long and usually successful history. But then Wimpy goes into a lot of talk about the mystery of things, and it feels like it’s all an appeal to the strangeness of the fourth dimension. I don’t blame Popeye for not feeling it’s cleared anything up. Segar would come back, in this storyline, to several other attempted explanations of the Jeep’s powers, although they do come back around to, y’know, it’s a magical animal. They’re all over the place in the Popeye comic universe.
Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 28th of October is a riff on predictability and encryption. Good encryption schemes rely on randomness. Concealing the content of a message means matching it to an alternate message. Each of the alternate messages should be equally likely to be transmitted. This way, someone who hasn’t got the key would not be able to tell what’s being sent. The catch is that computers do not truly do randomness. They mostly rely on pseudorandom schemes that could, in principle, be detected and spoiled. There are ways to get randomness, mostly involving putting in something from the real world. Sensors that detect tiny fluctuations in temperature, for example, or radio detectors. I recall one company going for style and using a wall of lava lamps, so that the rise and fall of lumps were in some way encoded into unpredictable numbers.
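The predictability is easy to see for yourself. This is a minimal sketch using Python’s standard library, where the seed value 2024 is an arbitrary choice of mine. It shows two seeded generators marching in lockstep, next to the `secrets` module, which asks the operating system for its best available randomness instead.

```python
import random
import secrets

# Two generators given the same seed produce the same "random" stream.
a = random.Random(2024)
b = random.Random(2024)
stream_a = [a.randrange(256) for _ in range(8)]
stream_b = [b.randrange(256) for _ in range(8)]
print(stream_a == stream_b)

# secrets draws on the operating system's randomness source, with no seed to guess.
key = secrets.token_bytes(16)
print(len(key))
```

Anyone who learns the seed can replay the first kind of stream exactly, which is why it is no good for making keys.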
Robb Armstrong’s JumpStart for the 2nd of November is a riff on the Birthday “Paradox”, the thing where you’re surprised to find someone shares a birthday with you. (I have one small circle of friends featuring two people who share my birthday, neatly enough.) Paradox is in quotes because it defies only intuition, not logic. The logic is clear that you need only a couple dozen people before some pair will probably share a birthday. Marcie goes overboard in trying to guess how many people at her workplace would share their birthday on top of that. Birthdays are nearly uniformly spread across all days of the year. There are slight variations; September birthdays are a little more likely than, say, April ones; the 13th of any month is a less likely birthday than the 12th or the 24th are. But this is a minor correction, aptly ignored when you’re doing a rough calculation. With 615 birthdays spread out over the year you’d expect the average day to be the birthday of about 1.7 people. (To be not silly about this, a ten-day span should see about 17 birthdays.) However, there are going to be “clumps”, days where three or even four people have birthdays. There will be gaps, days nobody has a birthday, or even streaks of days where nobody has a birthday. If there weren’t a fair number of days with a lot of birthdays, and days with none, we’d have to suspect birthdays weren’t random here.
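The rough calculation is short enough to write out. This is a minimal sketch, assuming 365 equally likely birthdays and ignoring leap days and the seasonal wobble mentioned above. The function names are my own.

```python
def shared_birthday_probability(people, days=365):
    """Chance that at least two of `people` share a birthday, assuming uniform days."""
    p_distinct = 1.0
    for k in range(people):
        p_distinct *= (days - k) / days
    return 1.0 - p_distinct

def expected_empty_days(people, days=365):
    """Expected number of days on which nobody has a birthday."""
    return days * (1 - 1 / days) ** people

print(shared_birthday_probability(23))   # a couple dozen people
print(615 / 365)                         # average birthdays per day
print(expected_empty_days(615))          # days you'd expect to have no birthday at all
```

Under these assumptions 23 people are enough to push the chance of a shared birthday past one half. With 615 people you would still expect somewhere around 67 days of the year to go without a birthday.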
There were also a handful of comic strips just mentioning mathematics, that I can’t make anything in depth about. Here’s two.
I hope to have proper comment about it in the usual Sunday Reading the Comics post. But the “current” storyline in Elzie Segar’s Thimble Theatre comic strip — Popeye to normal people — is the 1936 introduction of Eugene the Jeep. If you’ve looked at my user icon here you know I like Eugene.
Anyway, Eugene the Jeep has wondrous powers. These include the power of prophecy and the power to disappear from even enclosed spaces. Segar’s explanation for this was that the Jeep can turn into the fourth dimension and so do things we can’t hope to do. Which is a fun premise, yes. More, though, it’s got to be a pretty early use of the fourth or other high dimensions in pop culture. Yes, there were some things normal people might know that talk about higher dimensions. H G Wells’s The Time Machine starts with talk about time as a dimension like space. Edwin Abbott’s Flatland is explicitly about two- and three-dimensions, although Square thinks of whether there could be four- or more-dimensional spaces.
Wikipedia helps me find a few pieces of literature mentioning the fourth dimension before Eugene the Jeep. And a few pieces of visual art as well. No mention of earlier comic strips, although there’s no mention of Eugene the Jeep in either. So, all I can say is this is an early pop cultural appearance of the fourth dimension. I can’t say it’s the first, even among major comic strips.
Do not try to use this to pass your geometry quals.
I got a good nomination for a Q topic, thanks again to goldenoj. It was for Qualitative/Quantitative. Either would be a good topic, but they make a natural pairing. They describe the things mathematicians look for when modeling things. But ultimately I couldn’t find an angle that I liked. So rather than carry on with an essay that wasn’t working I went for a topic of my own. Might come back around to it, though, especially if nothing good presents itself for the letter X, which will probably need to be a wild card topic anyway.
We like comparing sizes. I talked about that some with norms. We do the same with shapes, though. We’d like to know which one is bigger than another, and by how much. We rely on squares to do this for us. It could be any shape, but we in the western tradition chose squares. I don’t know why.
My guess, unburdened by knowledge, is the ancient Greek tradition of looking at the shapes one can make with straightedge and compass. The easiest shape these tools make is, of course, circles. But it’s hard to find a circle with the same area as, say, any old triangle. Squares are probably a next-best thing. I don’t know why not equilateral triangles or hexagons. Again I would guess that the ancient Greeks had more rectangular or square rooms than they did triangles or hexagons, and went with what they knew.
So that’s what lurks behind that word “quadrature”. It may be hard for us to judge whether this pentagon is bigger than that octagon. But if we find squares that are the same size as the pentagon and the octagon, great. We can spot which of the squares is bigger, and by how much.
Straightedge-and-compass lets you find the quadrature for many shapes. Like, take a rectangle. Let me call that ABCD. Let’s say that AB is one of the long sides and BC one of the short sides. OK. Extend AB, outwards, to another point that I’ll call E. Pick E so that the length of BE is the same as the length of BC.
Next, bisect the line segment AE. Call that point F. F is going to be the center of a new semicircle, one with radius FE. Draw that in, on the side of AE that’s opposite the point C. Because we are almost there.
Extend the line segment CB upwards, until it touches this semicircle. Call the point where it touches G. The line segment BG is the side of a square with the same area as the original rectangle ABCD. If you know enough straightedge-and-compass geometry to do that bisection, you know enough to turn BG into a square. If you’re not sure why that’s the correct length, you can get there quickly. Use a little algebra and the Pythagorean theorem.
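For anyone who wants the algebra written out, here is a sketch. Write $a$ for the length of AB and $b$ for the length of BC, which is also the length of BE. (The letters $a$ and $b$ are my own labels.) G is on the semicircle, so FG is a radius, and the triangle FBG has its right angle at B.

```latex
\begin{aligned}
FG &= FE = \frac{a+b}{2}, \qquad FB = FE - BE = \frac{a+b}{2} - b = \frac{a-b}{2} \\
BG^2 &= FG^2 - FB^2 = \left(\frac{a+b}{2}\right)^2 - \left(\frac{a-b}{2}\right)^2 = ab
\end{aligned}
```

So the square on BG has area $ab$, the same as the rectangle.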
Neat, yeah, I agree. Also neat is that you can use the same trick to find the area of a parallelogram. A parallelogram has the same area as a rectangle with the same base and the same height, you remember. So take your parallelogram, draw in some perpendiculars to square that off into a rectangle, and find the quadrature of that rectangle. Then you’ve got the quadrature of your parallelogram.
Having the quadrature of a parallelogram lets you find the quadrature of any triangle. Pick one of the sides of the triangle as the base. You have a third point not on that base. Draw in the parallel to that base that goes through that third point. Then choose one of the other two sides. Draw the parallel to that side which goes through the other point. Look at that: you’ve got a parallelogram with twice the area of your original triangle. Bisect either the base or the height of this parallelogram, as you like. Then follow the rules for the quadrature of a parallelogram, and you have the quadrature of your triangle. Yes, you’re doing a lot of steps in-between the triangle you started with and the square you ended with. Those steps don’t count, not by this measure. Getting the results right matters.
And here’s some more beauty. You can find the quadrature for any polygon. Remember how you can divide any polygon into triangles? Go ahead and do that. Find the quadrature for every one of those triangles then. And you can create a square that has an area as large as all those squares put together. I’ll refrain from saying quite how, because realizing how is such a delight, one of those moments that at least made me laugh at how of course that’s how. It’s through one of those things that even people who don’t know mathematics know about.
With that background you understand why people thought the quadrature of the circle ought to be possible. Moreso when you know that the lune, a particular crescent-moon-like shape, can be squared. It looks so close to a half-circle that it’s obvious the rest should be possible. It’s not, and it took two thousand years and a completely different idea of geometry to prove it. But it sure looks like it should be possible.
Along the way to modernity quadrature picked up a new role. This is as part of calculus. One of the legs of calculus is integration. There is an interpretation of what the (definite) integral of a function means so common that we sometimes forget it doesn’t have to be that. This is to say that the integral of a function is the area “underneath” the curve. That is, it’s the area bounded by the limits of integration, by the horizontal axis, and by the curve represented by the function. If the function is sometimes less than zero, within the limits of integration, we’ll say that the integral represents the “net area”. Then we allow that the net area might be less than zero. Then we ignore the scolding looks of the ancient Greek mathematicians.
No matter. We love being able to find “the” integral of a function. This is a new function, and evaluating it tells us what this net area bounded by the limits of integration is. Finding this is “integration by quadrature”. At least in books published back when they wrote words like “to-day” or “coördinate”. My experience is that the term’s passed out of the vernacular, at least in North American Mathematician’s English.
Anyway the real flaw is that there are, like, six functions we can find the integral for. For the rest, we have to make do with approximations. This gives us “numerical quadrature”, a phrase which still has some currency.
And with my prologue about compass-and-straightedge quadrature you can see why it’s called that. Numerical integration schemes often rely on finding a polygon with a part that looks like a graph of the function you’re interested in. The other edges look like the limits of the integration. Then the area of that polygon should be close to the area “underneath” this function. So it should be close to the integral of the function you want. And we’re old hands at the quadrature of polygons, since we talked that out like five hundred words ago.
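One common way to build that polygon is out of thin trapezoids, which is the trapezoid rule. The sketch below is only an illustration under that assumption; the function name, the choice of a hundred pieces, and the sine example are mine.

```python
import math

def trapezoid_quadrature(f, a, b, pieces=100):
    """Approximate the integral of f from a to b with `pieces` trapezoids."""
    width = (b - a) / pieces
    total = 0.0
    for k in range(pieces):
        left = a + k * width
        right = left + width
        # Area of one trapezoid: average of the two heights times the width.
        total += 0.5 * (f(left) + f(right)) * width
    return total

approx = trapezoid_quadrature(math.sin, 0.0, math.pi)
print(approx)  # near 2, the exact integral of sin from 0 to pi
```

More pieces make the polygon hug the curve more closely, at the cost of more function evaluations.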
Now, no person ever has or ever will do numerical quadrature by compass-and-straightedge on some function. So why call it “numerical quadrature” instead of just “numerical integration”? Style, for one. “Quadrature” as a word has a nice tone, clearly jargon but not threateningly alien. Also “numerical integration” often connotes solving differential equations numerically. So it can clarify whether you’re evaluating integrals or solving differential equations. If you think that’s a distinction worth making. Evaluating integrals and solving differential equations are similar problems anyway.
And there is another adjective that often attaches to quadrature. This is Gaussian Quadrature. Gaussian Quadrature is, in principle, a fantastic way to do numerical integration perfectly. For some problems. For some cases. The insight which justifies it to me is one of those boring little theorems you run across in the chapter introducing How To Integrate. It runs something like this. Suppose ‘f’ is a continuous function, with domain the real numbers and range the real numbers. Suppose a and b are the limits of integration. Then there’s at least one point c, between a and b, for which:
$$\int_a^b f(x)\,dx = f(c)\,(b - a)$$
So if you could pick the right c, any integration would be so easy. Evaluate the function for one point and multiply it by whatever b minus a is. The catch is, you don’t know what c is.
Except there’s some cases where you kinda do. Like, if f is a line, rising or falling with a constant slope from a to b? Then have c be the midpoint of a and b.
That won’t always work. Like, if f is a parabola on the region from a to b, then c is not going to be the midpoint. If f is a cubic, then the midpoint is probably not c. And so on. And if you don’t know what kind of function f is? There’s no guessing where c will be.
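A quick numerical check of the midpoint guess, as a sketch. The particular line and parabola are my own picks, and the exact integrals in the comments come from ordinary calculus.

```python
def midpoint_guess(f, a, b):
    """Evaluate f at the midpoint of [a, b] and multiply by the width."""
    c = (a + b) / 2
    return f(c) * (b - a)

# A line: the integral of 3x + 1 from 0 to 2 is exactly 8.
line_estimate = midpoint_guess(lambda x: 3 * x + 1, 0.0, 2.0)
# A parabola: the integral of x^2 from 0 to 2 is exactly 8/3.
parabola_estimate = midpoint_guess(lambda x: x * x, 0.0, 2.0)
print(line_estimate, parabola_estimate)
```

The line's estimate matches its integral. The parabola's does not, so its c has to sit somewhere other than the midpoint.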
But. If you decide you’re only trying to integrate certain kinds of functions? Then you can do all right. If you decide you only want to integrate polynomials, for example, then … well, you’re not going to find a single point c for this. But what you can find is a set of points between a and b. Evaluate the function for those points. And then find a weighted average by rules I’m not getting into here. And that weighted average will be exactly that integral.
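To make that concrete without going into where the rules come from, here is a sketch of the smallest interesting case, the two-point Gauss-Legendre rule. I'm taking its standard points and weights as given: two equally weighted samples, each a distance of half the interval's width divided by the square root of 3 from the center. That rule is exact for polynomials up to degree three. The function names and the sample cubic are my own.

```python
import math

def gauss_two_point(f, a, b):
    """Two-point Gauss-Legendre rule on [a, b]; exact for cubics and below."""
    half_width = (b - a) / 2
    center = (a + b) / 2
    offset = half_width / math.sqrt(3)  # the two points are center +/- offset
    # Both weights equal half_width, so this is an equally weighted average
    # of the two samples, scaled by the length of the interval.
    return half_width * (f(center - offset) + f(center + offset))

def sample_cubic(x):
    return x**3 - 2 * x**2 + 5

estimate = gauss_two_point(sample_cubic, 0.0, 2.0)
print(estimate)  # the exact integral is 26/3
```

Two function evaluations, no approximation error beyond floating-point rounding. Try it on a degree-four polynomial and the exactness goes away.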
Of course there’s limits. The Gaussian Quadrature of a function is only possible if you can evaluate the function at arbitrary points. If you’re trying to integrate, like, a set of sample data it’s inapplicable. The points you pick, and the weighting to use, depend on what kind of function you want to integrate. The results will be worse the less your function is like what you supposed. It’s tedious to find what these points are for a particular assumption of function. But you only have to do that once, or look it up, if you know (say) you’re going to use polynomials of degree up to six or something like that.
And there are variations on this. They have names like the Chebyshev-Gauss Quadrature, or the Hermite-Gauss Quadrature, or the Jacobi-Gauss Quadrature. There are even some that don’t have Gauss’s name in them at all.
Despite that, you can get through a lot of mathematics not talking about quadrature. The idea implicit in the name, that we’re looking to compare areas of different things by looking at squares, is obsolete. It made sense when we worked with numbers that depended on units. One would write about a shape’s area being four times another shape’s, or the length of its side some multiple of a reference length.
We’ve grown comfortable thinking of raw numbers. It makes implicit the step where we divide the polygon’s area by the area of some standard reference unit square. This has advantages. We don’t need different vocabulary to think about integrating functions of one or two or ten independent variables. We don’t need wordy descriptions like “the area of this square is to the area of that as the second power of this square’s side is to the second power of that square’s side”. But it does mean we don’t see squares as intermediaries to understanding different shapes anymore.
Today’s A To Z term is another from goldenoj. It was just the proposal “Platonic”. Most people, prompted, would follow that adjective with one of three words. There’s relationship, ideal, and solid. Relationship is a little too far off of mathematics for me to go into here. Platonic ideals run very close to mathematics. Probably the default philosophy of western mathematics is Platonic. At least a folk Platonism, where the rest of us follow what the people who’ve taken the study of mathematical philosophy seriously seem to be doing. The idea that mathematical constructs are “real things” and have some “existence” that we can understand even if we will never see a true circle or an unadulterated four. Platonic solids, though, those are nice and familiar things. Many of them we can find around the house. That’s one direction to go.
Before I get to the Platonic Solids, though, I’d like to think a little more about Platonic Ideals. What do they look like? I gather our friends in the philosophy department have debated this question a while. So I won’t pretend to speak as if I had actual knowledge. I just have an impression. That impression is … well, something simple. My reasoning is that the Platonic ideal of, say, a chair has to have all the traits that every chair ever has. And there’s not a lot that every chair has. Whatever’s in the Platonic Ideal chair has to be just the things that every chair has, and to omit things that non-chairs do not.
That’s comfortable to me, thinking like a mathematician, though. I think mathematicians train to look for stuff that’s very generally true. This will tend to be things that have few properties to satisfy. Things that look, in some way, simple.
So what is simple in a shape? There’s no avoiding aesthetic judgement here. We can maybe use two-dimensional shapes as a guide, though. Polygons seem nice. They’re made of line segments which join at vertices. Regular polygons even nicer. Each vertex in a regular polygon connects to two edges. Each edge connects to exactly two vertices. Each edge has the same length. The interior angles are all congruent. And if you get many many sides, the regular polygon looks like a circle.
So there’s some things we might look for in solids. Shapes where every edge is the same length. Shapes where every edge connects exactly two vertices. Shapes where every vertex connects to the same number of edges. Shapes where the interior angles are all constant. Shapes where each face is the same polygon as every other face. Look for that and, in three-dimensional space, we find nine shapes.
Yeah, you want that to be five also. The four extra ones are “star polyhedrons”. They look like spikey versions of normal shapes. What keeps these from being Platonic solids isn’t a lack of imagination on Plato’s part. It’s that they’re not convex shapes. There’s no pair of points in a convex shape for which the line segment connecting them goes outside the shape. For the star polyhedrons, well, look at the ends of any two spikes. If we decide that part of this beautiful simplicity is convexity, then we’re down to five shapes. They’re famous. Tetrahedron, cube, octahedron, icosahedron, and dodecahedron.
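One standard way to see where five comes from, which I haven't spelled out above, is an angle count. If q regular p-sided faces meet at each vertex of a convex solid, their interior angles have to add up to less than a full turn. The sketch below just lists the pairs (p, q) that pass that test. It shows which combinations are possible candidates, not that each one can actually be built. The function name and the search limit are my own.

```python
def convex_regular_solids(max_sides=12):
    """List (p, q): q regular p-gons around a vertex, angle sum under 360 degrees."""
    found = []
    for p in range(3, max_sides + 1):
        for q in range(3, max_sides + 1):
            # Interior angle of a regular p-gon is 180 * (p - 2) / p degrees.
            # q of them fit around a convex vertex only if q * (p - 2) < 2 * p.
            if q * (p - 2) < 2 * p:
                found.append((p, q))
    return found

print(convex_regular_solids())
# [(3, 3), (3, 4), (3, 5), (4, 3), (5, 3)]
```

In order those are the tetrahedron, octahedron, icosahedron, cube, and dodecahedron.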
I’m not sure why they’re named the Platonic Solids, though. Before you explain to me that they were named by Plato in the dialogue Timaeus, let me say something. They were named by Plato in the dialogue Timaeus. That isn’t the same thing as why they have the name Platonic Solids. I trust Plato didn’t name them “the me solids”, since if I know anything about Plato he would have called them “the Socratic solids”. It’s not that Plato was the first to group them either. At least some of the solids were known long before Plato. I don’t know of anyone who thinks Plato particularly advanced human understanding of the solids.
But he did write about them, and in things that many people remembered. It’s natural for a name to attach to the most famous person writing about them. Still, someone had the thought which we follow to group these solids together under Plato’s name. I’m curious who, and when. Naming is often a more arbitrary thing than you’d think. The Fibonacci sequence has been known at latest since Fibonacci wrote about it in 1202. But it could not have that name before 1838, when historian Guillaume Libri gave Leonardo of Pisa the name Fibonacci. I’m not saying that the name “Platonic Solid” was invented in, like, 2002. But traditions that seem age-old can be surprisingly recent.
What is an age-old tradition is looking for physical significance in the solids. Plato himself cleverly matched the solids to the ancient concept of four elements plus a quintessence. Johannes Kepler, whom we thank for noticing the star polyhedrons, tried to match them to the orbits of the planets around the sun. Wikipedia tells me of a 1980s attempt to understand the atomic nucleus using Platonic solids. The attempt even touches me. Along the way to my thesis I looked at uniform charges free to move on the surface of a sphere. It was obvious if there were four charges they’d move to the vertices of a tetrahedron on the sphere. Similarly, eight charges would go to the vertices of the cube. Twelve charges to the vertices of the icosahedron. And so on. The Platonic Solids seem not just attractive but also of some deep physical significance.
There are not the four (or five) elements of ancient Greek atomism. Attractive as it is to think that fire is a bunch of four-sided dice. The orbits of the planets have nothing to do with the Platonic solids. I know too little about the physics of the atomic nucleus to say whether that panned out. However, that it doesn’t even get its own Wikipedia entry suggests something to me. And, in fact, eight charges on the sphere will not settle at the vertices of a cube. They’ll settle on a staggered pattern, two squares turned 45 degrees relative to each other. The shape is called a “square antiprism”. I was as surprised as you to learn that. It’s possible that the Platonic Solids are, ultimately, pleasant to us but not a key to the universe.
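The eight-charge surprise is easy to check numerically. This sketch assumes unit charges, with the energy being the sum of one over the distance for every pair. It compares a cube inscribed in the unit sphere against the same two squares with the top one turned 45 degrees. It does not hunt for the true minimum, which also shifts the heights of the squares a little; it only shows that twisting already beats the cube. All the names are mine.

```python
import math
from itertools import combinations

def coulomb_energy(points):
    """Sum of 1/distance over all pairs of unit charges."""
    return sum(1.0 / math.dist(p, q) for p, q in combinations(points, 2))

def two_squares(twist_degrees):
    """Eight points on the unit sphere: two parallel squares at the cube's
    heights, with the top square rotated about the vertical axis."""
    z = 1 / math.sqrt(3)    # height of a cube's top face on the unit sphere
    rho = math.sqrt(2 / 3)  # distance of each vertex from the vertical axis
    points = []
    for k in range(4):
        bottom = math.radians(45 + 90 * k)
        top = bottom + math.radians(twist_degrees)
        points.append((rho * math.cos(bottom), rho * math.sin(bottom), -z))
        points.append((rho * math.cos(top), rho * math.sin(top), z))
    return points

cube_energy = coulomb_energy(two_squares(0))
twisted_energy = coulomb_energy(two_squares(45))
print(cube_energy, twisted_energy)  # the twisted arrangement is lower
```

Lower energy is what the charges want, so the cube can't be where they settle.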
The example of the Platonic Solids does give us the cue to look for other families of solids. There are many such. The Archimedean Solids, for example, are again convex polyhedrons. They have faces of two or more regular polygons, rather than the lone one of Platonic Solids. There are 13 of these, with names of great beauty like the snub cube or the small rhombicuboctahedron. The Archimedean Solids have duals. The dual of a polyhedron represents a face of the original shape with a vertex. Faces that meet in the original polyhedron have an edge between their dual’s vertices. The duals to the Archimedean Solids get the name Catalan Solids. This for the Belgian mathematician Eugène Catalan, who described them in 1865. These attract names like “deltoidal icositetrahedron”. (The Platonic Solids have duals too, but those are all Platonic solids too. The tetrahedron is even its own dual.) The star polyhedrons hint us to look at stellations. These are shapes we get by stretching out the edges or faces of a polyhedron until we get a new polyhedron. It becomes a dizzying taxonomy of shapes, many of them with pointed edges.
There are things that look like Platonic Solids in more than three dimensions of space. In four dimensions of space there are six of these, five of which look like versions of the Platonic Solids we all know. The sixth is this novel shape called the 24-cell, or hyperdiamond, or icositetrachoron, or some other wild names. In five dimensions of space? … it turns out there are only three things that look like Platonic Solids. There’s versions of the tetrahedron, the cube, and the octahedron. In six dimensions? … Three shapes, again versions of the tetrahedron, cube, and octahedron. And it carries on like this for seven, eight, nine, any number of dimensions of space. Which is an interesting development. If I hadn’t looked up the answer I’d have expected more dimensions of space to allow for more Platonic Solid-like shapes. Well, our experience with two and three dimensions guides us to thinking about more dimensions of space. It doesn’t mean that they’re just regular space with a note in the corner that “N = 8”. Shapes hold surprises.
Today’s A To Z term is another free choice. So I’m picking a term from the world of … mathematics. There are a lot of norms out there. Many are specialized to particular roles, such as looking at complex-valued numbers, or vectors, or matrices, or polynomials.
Still they share things in common, and that’s what this essay is for. And I’ve brushed up against the topic before.
The norm, also, has nothing particular to do with “normal”. “Normal” is an adjective which attaches to every noun in mathematics. This is security for me as while these A-To-Z sequences may run out of X and Y and W letters, I will never be short of N’s.
A “norm” is the size of whatever kind of thing you’re working with. You can see where this is something we look for. It’s easy to look at two things and wonder which is the smaller.
There are many norms, even for one set of things. Some seem compelling. For the real numbers, we usually let the absolute value do this work. By “usually” I mean “I don’t remember ever seeing a different one except from someone introducing the idea of other norms”. For a complex-valued number, it’s usually the square root of the sum of the square of the real part and the square of the imaginary coefficient. For a vector, it’s usually the square root of the vector dot-product with itself. (Dot product is this binary operation that is like multiplication, if you squint, for vectors.) Again, for these, the “usually” means “always except when someone’s trying to make a point”.
Which is why we have the convention that there is a “the norm” for a kind of operation. The norm dignified as “the” is usually the one that looks as much as possible like the way we find distances between two points on a plane. I assume this is because we bring our intuition about everyday geometry to mathematical structures. You know how it is. Given an infinity of possible choices we take the one that seems least difficult.
Every sort of thing which can have a norm, that I can think of, is a vector space. This might be my failing imagination. It may also be that it’s quite easy to have a vector space. A vector space is a collection of things with some rules. Those rules are about adding the things inside the vector space, and multiplying the things in the vector space by scalars. These rules are not difficult requirements to meet. So a lot of mathematical structures are vector spaces, and the things inside them are vectors.
A norm is a function that has these vectors as its domain, and the non-negative real numbers as its range. And there are three rules that it has to meet. So. Give me a vector ‘u’ and a vector ‘v’. I’ll also need a scalar, ‘a’. Then the function f is a norm when:
$f(u + v) \le f(u) + f(v)$. This is a famous rule, called the triangle inequality. You know how in a triangle, the sum of the lengths of any two legs is greater than the length of the third leg? That’s the rule at work here.
$f(a u) = |a| f(u)$. This doesn’t have so snappy a name. Sorry. It’s something about being homogeneous, at least.
If $f(u) = 0$ then u has to be the additive identity, the vector that works like zero does.
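Here is a spot check of those three rules, as a sketch. I'm reading them in their usual form: the triangle inequality, the scaling rule that multiplying a vector by a scalar multiplies the norm by that scalar's absolute value, and the rule that only the zero vector has norm zero. The candidate norm is the ordinary Euclidean one, and the sample vectors and scalar are made up. Passing on a few samples doesn't prove anything in general, but it shows what each rule asks for.

```python
import math

def euclidean(v):
    """Square root of the sum of the squares of the components."""
    return math.sqrt(sum(x * x for x in v))

u, v, a = (3.0, 4.0), (-1.0, 2.0), -2.5
u_plus_v = tuple(x + y for x, y in zip(u, v))
a_times_u = tuple(a * x for x in u)

# Rule 1, the triangle inequality.
triangle_ok = euclidean(u_plus_v) <= euclidean(u) + euclidean(v)
# Rule 2, scaling by a scalar scales the norm by its absolute value.
homogeneous_ok = math.isclose(euclidean(a_times_u), abs(a) * euclidean(u))
# Rule 3 for this sample: the zero vector has norm zero, and u does not.
zero_ok = euclidean((0.0, 0.0)) == 0.0 and euclidean(u) > 0.0

print(triangle_ok, homogeneous_ok, zero_ok)  # True True True
```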
Norms take on many shapes. They depend on the kind of thing we measure, and what we find interesting about those things. Some are familiar. Look at a Euclidean space, with Cartesian coordinates, so that we might write something like (3, 4) to describe a point. The “the norm” for this, called the Euclidean norm or the L2 norm, is the square root of the sum of the squares of the coordinates. So, 5. But there are other norms. The L1 norm is the sum of the absolute values of all the coefficients; here, 7. The L∞ norm is the largest single absolute value of any coefficient; here, 4.
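Those three numbers are quick to reproduce. A minimal sketch, with function names of my own choosing:

```python
import math

def l1_norm(v):
    """Sum of the absolute values of the coordinates."""
    return sum(abs(x) for x in v)

def l2_norm(v):
    """Square root of the sum of the squares: the Euclidean norm."""
    return math.sqrt(sum(x * x for x in v))

def linf_norm(v):
    """Largest absolute value among the coordinates."""
    return max(abs(x) for x in v)

point = (3, 4)
print(l1_norm(point), l2_norm(point), linf_norm(point))  # 7 5.0 4
```

The same three functions work unchanged on a list of polynomial coefficients, which is part of why the polynomial norms look so familiar.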
A polynomial, meanwhile? Write it out as $a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$. Take the absolute value of each of these coefficients. Then … you have choices. You could take those absolute values and add them up. That’s the L1 polynomial norm. Take those absolute values and square them, then add those squares, and take the square root of that sum. That’s the L2 norm. Take the largest absolute value of any of these coefficients. That’s the L∞ norm.
These don’t look so different, even though points in space and polynomials seem to be different things. We designed the tool. We want it not to be weirder than it has to be. When we try to put a norm on a new kind of thing, we look for a norm that resembles the old kind of thing. For example, when we want to define the norm of a matrix, we’ll typically rely on a norm we’ve already found for a vector. At least to set up the matrix norm; in practice, we might do a calculation that doesn’t explicitly use a vector’s norm, but gives us the same answer.
If we have a norm for some vector space, then we have an idea of distance. We can say how far apart two vectors are. It’s the norm of the difference between the vectors. This is called defining a metric on the vector space. A metric is that sense of how far apart two things are. What keeps a norm and a metric from being the same thing is that it’s possible to come up with a metric that doesn’t match any sensible norm.
It’s always possible to use a norm to define a metric, though. Doing that promotes our normed vector space to the dignified status of a “metric space”. Many of the spaces we find interesting enough to work in are such metric spaces. It’s hard to think of doing without some idea of size.
Comic Strip Master Command hoped to give me an easy week, one that would let me finally get ahead on my A-to-Z essays and avoid the last-minute rush to complete tasks. I showed them, though. I can procrastinate more than they can give me breaks. This essay alone I’m writing about ten minutes after you read it.
Eric the Circle for the 7th, by Shoy, is one of the jokes where Eric’s drawn as something besides a circle. I can work with this, though, because the cube is less far from a circle than you think. It gets to what we mean by “a circle”. Is it all the points that are exactly a particular distance from a given center? Or maybe all the points up to that particular distance from a given center? This seems too reasonable to argue with, so you know where the trick is.
The trick is asking what we mean by distance. The ordinary distance that normal people use has a couple names. The Euclidean distance, often. Or Euclidean metric. Euclidean norm. It has some fancier names that can wait. Take two points. You can find this distance easily if you have their coordinates in a Cartesian system. (There’s infinitely many Cartesian systems you could use. You can pick whatever one you like; the distance will be the same whatever they are.) That’s that thing about finding the distance between corresponding coordinates, squaring those distances, adding that up, and taking the square root. And that’s good.
That’s not our only choice, though. We can make a perfectly good distance using other rules. For example, take the difference between corresponding coordinates, take the absolute value of each, and add all those absolute values up. This distance even has real-world application. It’s how far it is to go from one place to another on a grid of city squares, where it’s considered poor form to walk directly through buildings. There’s another. Instead of adding those absolute values up? Just pick the biggest of the absolute values. This is another distance. In it, circles look like squares. Or, in three dimensions, spheres look like cubes.
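A small sketch of the three distances, to show the square turning up. The function names and the sample points are mine. The points all lie on the boundary of the square with corners at plus-or-minus 1 in each coordinate.

```python
def euclidean_distance(p, q):
    """Square the coordinate differences, add them, take the square root."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def taxicab_distance(p, q):
    """Add up the absolute coordinate differences, like walking city blocks."""
    return sum(abs(a - b) for a, b in zip(p, q))

def max_distance(p, q):
    """Keep only the biggest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

center = (0.0, 0.0)
square_boundary = [(1.0, 1.0), (1.0, 0.25), (-0.5, 1.0), (-1.0, -1.0), (0.0, -1.0)]
print([max_distance(center, p) for p in square_boundary])  # every entry is 1.0
```

Under the biggest-difference distance every one of those points is exactly 1 from the center, so the square's boundary is that distance's "circle" of radius 1. The Euclidean distance puts the corner (1, 1) farther out, at the square root of 2.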
Ryan North’s Dinosaur Comics for the 9th builds on a common science fictional premise, that contact with an alien intelligence is done through mathematics first. It’s a common supposition in science fiction circles, and among many scientists, that mathematics is a truly universal language. It’s hard to imagine a species capable of communication with us that wouldn’t understand two and two adding up to four. Or about the ratio of a circle’s circumference to its diameter being independent of that diameter. Or about how an alternating knot for which the minimum number of crossing points is odd can’t ever be amphicheiral.
All right, I guess I can imagine a species that never ran across that point. Which is one of the things we suppose in using mathematics as a universal language. Its truths are indisputable, if we allow the rules of logic and axioms and definitions that we use. And I agree I don’t know that it’s possible not to notice basic arithmetic and basic geometry, not if one lives in a sensory world much like humans’. But it does seem to me at least some of mathematics is probably idiosyncratic. In representation at least; certainly in organization. I suspect there may be trouble in using universal and generically true things to express something local and specific. I don’t know how to go from deductive logic to telling someone when my birthday is. Well, I’m sure our friends in the philosophy department have considered that problem and have some good thoughts we can use, if there were only some way to communicate with them.
Bill Whitehead’s Free Range for the 12th is your classic blackboard-full-of-symbols. I like the beauty of the symbols used. I mean, the whole expression doesn’t parse, but many of the symbols do and are used in reasonable ways. Long trailing strings of arrows to extend one line to another are common and reasonable too. In the middle of the second line is , which doesn’t make sense, but which doesn’t make sense in a way that seems authentic to working out an idea. It’s something that could be cleaned up if the reasoning needed to be made presentable.
I couldn’t find a place to fit this in the essay proper. But it’s too good to leave out. The simplex method, discussed within, traces to George Dantzig. He’d been planning methods for the US Army Air Force during the Second World War. Dantzig is a person you have heard about, if you’ve heard any mathematical urban legends. In 1939 he was late to Jerzy Neyman’s class. He took two statistics problems on the board to be homework. He found them “harder than usual”, but solved them in a couple days and turned in the late homework hoping Neyman would be understanding. They weren’t homework. They were examples of famously unsolved problems. Within weeks Neyman had written one of the solutions up for publication. When he needed a thesis topic Neyman advised him to just put what he already had in a binder. It’s the stuff every grad student dreams of. The story mutated. It picked up some glurge to become a narrative about positive thinking. And mutated further, into the movie Good Will Hunting.
Every three days one of the comic strips I read has the elderly main character talk about how they never used algebra. This is my hyperbole. But mathematics has got the reputation for being difficult and inapplicable to everyday life. We’ll concede using arithmetic, when we get angry at the fast food cashier who hands back our two pennies before giving change for our $6.77 hummus wrap. But otherwise, who knows what an elliptic integral is, and whether it’s working properly?
Linear programming does not have this problem. In part, this is because it lacks a reputation. But those who have heard of it, acknowledge it as immensely practical mathematics. It is about something a particular kind of human always finds compelling. That is how to do a thing best.
There are several kinds of “best”. There is doing a thing in as little time as possible. Or for as little effort as possible. For the greatest profit. For the highest capacity. For the best score. For the least risk. The goals have a thousand names, none of which we need to know. They all mean the same thing. They mean “the thing we wish to optimize”. To optimize has two directions, which are one. The optimum is either the maximum or the minimum. To be good at finding a maximum is to be good at finding a minimum.
It’s obvious why we call this “programming”; obviously, we leave the work of finding answers to a computer. It’s a spurious reason. The “programming” here comes from an independent sense of the word. It means more about finding a plan. Think of “programming” a night’s entertainment, so that every performer gets their turn, all scene changes have time to be done, you don’t put two comedians right after the other, and you accommodate the performer who has to leave early and the performer who’ll get in an hour late. Linear programming problems are often about finding how to do as well as possible given various priorities. All right. At least the “linear” part is obvious. A mathematics problem is “linear” when it’s something we can reasonably expect to solve. This is not the technical meaning. Technically what it means is we’re looking at a function something like:
$$ax + by + cz$$
Here, x, y, and z are the independent variables. We don’t know their values but wish to. a, b, and c are coefficients. These values are set to some constant for the problem, but they might be something else for other problems. They’re allowed to be positive or negative or even zero. If a coefficient is zero, then the matching variable doesn’t affect matters at all. The corresponding value can be anything at all, within the constraints.
I’ve written this for three variables, as an example and because ‘x’ and ‘y’ and ‘z’ are comfortable, familiar variables. There can be fewer. There can be more. There almost always are. Two- and three-variable problems will teach you how to do this kind of problem. They’re too simple to be interesting, usually. To avoid committing to a particular number of variables we can use indices: $x_j$ for values of j from 1 up to N. Or we can bundle all these values together into a vector, and write everything as $\vec{x}$. This has a particular advantage since we can write the coefficients as a vector too. Then we use the notation of linear algebra, and write that we hope to maximize the value of:
$$\vec{c}^T \vec{x}$$
(The superscript T means “transpose”. As a linear algebra problem we’d usually think of writing a vector as a tall column of things. By transposing that we write a long row of things. By transposing we can use the notation of matrix multiplication.)
This is the objective function. Objective here in the sense of goal; it’s the thing we want to find the best possible value of.
We have constraints. These represent limits on the variables. The variables are always things that come in limited supply. There’s no allocating more money than the budget allows, nor putting more people on staff than work for the company. Often these constraints interact. Perhaps not only is there only so much staff, but no one person can work more than a set number of days in a row. Something like that. That’s all right. We can write all these constraints as a matrix equation. An inequality, properly. We can bundle all the constraints into a big matrix named A, and demand:
$$A\vec{x} \le \vec{b}$$
Also, traditionally, we suppose that every component of $\vec{x}$ is non-negative. That is, positive, or at lowest, zero. This reflects the field’s core problems of figuring how to allocate resources. There’s no allocating less than zero of something.
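Here is what those pieces look like with numbers in them, as a sketch. It assumes the usual setup: an objective that multiplies each variable by its coefficient and adds the results, constraints of the form "each row of A times the variables is at most some limit", and non-negative variables. Every number, and the name b for the list of limits, is hypothetical and chosen only for illustration.

```python
def objective(c, x):
    """Multiply matching entries of c and x and add them up."""
    return sum(ci * xi for ci, xi in zip(c, x))

def is_feasible(A, b, x):
    """True when each row of A applied to x is at most its limit, and x >= 0."""
    rows_ok = all(objective(row, x) <= limit for row, limit in zip(A, b))
    return rows_ok and all(xi >= 0 for xi in x)

# Hypothetical coefficients, constraint rows, and limits.
c = [3, 2, 4]
A = [[1, 1, 1],
     [2, 0, 1]]
b = [10, 8]

plan = [2, 3, 4]
print(objective(c, plan), is_feasible(A, b, plan))  # 28 True
```

A plan like [5, 0, 0] fails the second row, since 2 times 5 is more than 8. Linear programming is about finding, among all the plans that pass, the one with the best objective.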
But we need some bounds. This is easiest to see with a two-dimensional problem. Try it yourself: draw a pair of axes on a sheet of paper. Now put in a constraint. Doesn’t matter what. The constraint’s edge is a straight line, which you can draw at any position and any angle you like. This includes horizontal and vertical. Shade in one side of the constraint. Whatever you shade in is the “feasible region”, the sets of values allowed under the constraint. Now draw in another line, another constraint. Shade in one side or the other of that. Draw in yet another line, another constraint. Shade in one side or another of that. The “feasible region” is whatever points have taken on all these shades. If you were lucky, this is a bounded region, a triangle. If you weren’t lucky, it’s not bounded. It’s maybe got some corners but goes off to the edge of the page where you stopped shading things in.
So adding that every component of $\vec{x}$ is at least as big as zero is a backstop. It means we’ll usually get a feasible region with a finite volume. What was the last project you worked on that had no upper limits for anything, just minimums you had to satisfy? Anyway if you know you need something to be allowed less than zero go ahead. We’ll work it out. The important thing is there’s finite bounds on all the variables.
I didn’t see the bounds you drew. It’s possible you have a triangle with all three shades inside. But it’s also possible you picked the other sides to shade, and you have an annulus, with no region having more than two shades in it. This can happen. It means it’s impossible to satisfy all the constraints at once. At least one of them has to give. You may be reminded of the sign taped to the wall of your mechanics’ about picking two of good-fast-cheap.
But impossibility is at least easy. What if there is a feasible region?
Well, we have reason to hope. The optimum has to be somewhere inside the region, that’s clear enough. And it even has to be on the edge of the region. If you’re not seeing why, think of a simple example, like, finding the maximum of a linear function of x and y, one that grows as either x or y grows, inside the rectangle where x is between 0 and 2 and y is between 0 and 3. Suppose you had a putative maximum on the inside, like, where x was 1 and y was 2. What happens if you increase x a tiny bit? If you increase y by twice that? No, it’s only on the edges you can get a maximum that can’t be locally bettered. And only on the corners of the edges, at that.
(This doesn’t prove the case. But it is what the proof gets at.)
So the problem sounds simple then! We just have to try out all the vertices and pick the maximum (or minimum) from them all.
OK, and here’s where we start getting into trouble. With two variables and, like, three constraints? That’s easy enough. That’s like five points to evaluate? We can do that.
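For a feel of what "try out all the vertices" means, here is a minimal sketch with two variables and three constraints. The numbers are made up for illustration; nothing here comes from a real problem. It finds where pairs of constraint edges cross, keeps the crossings that satisfy every constraint, and scores each surviving corner.

```python
from itertools import combinations

# Hypothetical toy problem: maximize 3x + 2y subject to
#   x + y <= 5,  x + 3y <= 9,  x <= 4,  and x >= 0, y >= 0.
# Each constraint is stored as (a1, a2, b), meaning a1*x + a2*y <= b.
constraints = [
    (1.0, 1.0, 5.0),
    (1.0, 3.0, 9.0),
    (1.0, 0.0, 4.0),
    (-1.0, 0.0, 0.0),   # x >= 0, written as -x <= 0
    (0.0, -1.0, 0.0),   # y >= 0, written as -y <= 0
]


def objective(point):
    x, y = point
    return 3.0 * x + 2.0 * y


def intersection(c1, c2):
    """Point where two constraint edges cross, or None if they are parallel."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None
    # Cramer's rule for the 2x2 system of the two edge equations.
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


def feasible(point, tol=1e-9):
    """True when the point satisfies every constraint (within a tolerance)."""
    x, y = point
    return all(a * x + b * y <= r + tol for a, b, r in constraints)


def feasible_vertices():
    """All corners of the feasible region, found by brute force."""
    corners = []
    for c1, c2 in combinations(constraints, 2):
        point = intersection(c1, c2)
        if point is not None and feasible(point):
            corners.append(point)
    return corners


vertices = feasible_vertices()
best = max(vertices, key=objective)
print(len(vertices), best, objective(best))
```

This checks every pair of constraints, which is fine for five edges and hopeless for five hundred. That blow-up is the trouble the next paragraphs get into.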
We never need to do that. If someone’s hiring you to test five combinations I admire your hustle and need you to start getting me consulting work. A real problem will have many variables and many constraints. The feasible region will most often look like a multifaceted gemstone. It’ll extend into more than three dimensions, usually. It’s all right if you just imagine the three, as long as the gemstone is complicated enough.
Because now we’ve got lots of vertices. Maybe more than we really want to deal with. So what’s there to do?
The basic approach, the one that’s foundational to the field, is the simplex method. A “simplex” is a triangle. In two dimensions, anyway. In three dimensions it’s a tetrahedron. In one dimension it’s a line segment. Generally, however many dimensions of space you have? The simplex is the simplest thing that fills up volume in your space.
You know how you can turn any polygon into a bunch of triangles? Just by connecting enough vertices together? You can turn a polyhedron into a bunch of tetrahedrons, by adding faces that connect trios of vertices. And for polyhedron-like shapes in more dimensions? We call those polytopes. Polytopes we can turn into a bunch of simplexes. So this is why it’s the “simplex method”. Any one simplex it’s easy to check the vertices on. And we can turn the polytope into a bunch of simplexes. And we can ignore all the interior vertices of the simplexes.
So here’s the simplex method. First, break your polytope up into simplexes. Next, pick any simplex; doesn’t matter which. Pick any outside vertex of that simplex. This is the first viable possible solution. It’s most likely wrong. That’s okay. We’ll make it better.
Because there are other vertices on this simplex. And there are other simplexes, adjacent to that first, which share this vertex. Test the vertices that share an edge with this one. Is there one that improves the objective function? Probably. Is there a best one of those in this simplex? Sure. So now that’s our second viable possible solution. If we had to give an answer right now, that would be our best guess.
But this new vertex, this new tentative solution? It shares edges with other vertices, across several simplexes. So look at these new neighbors. Are any of them an improvement? Which one of them is the best improvement? Move over there. That’s our next tentative solution.
You see where this is going. Keep at this. Eventually it’ll wind to a conclusion. Usually this works great. If you have, like, 8 constraints, you can usually expect to get your answer in from 16 to 24 iterations. If you have 20 constraints, expect an answer in from 40 to 60 iterations. This is doing pretty well.
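The hop-to-a-better-neighbor idea can be sketched in two dimensions, where each corner of the feasible region has exactly two neighbors. This is not the full simplex method, only the vertex-walking part of it. The corner list and the objective function are hypothetical, chosen for illustration.

```python
# Corners of a hypothetical convex feasible region, listed counterclockwise,
# so each corner's neighbors are the entries just before and just after it.
corners = [(0.0, 0.0), (4.0, 0.0), (4.0, 1.0), (3.0, 2.0), (0.0, 3.0)]


def objective(point):
    # Hypothetical objective function to maximize: 3x + 2y.
    return 3.0 * point[0] + 2.0 * point[1]


def walk_to_best(start=0):
    """Hop to the better-scoring neighbor until no neighbor is an improvement.

    Returns the list of corners visited, ending at the optimum.
    """
    here = start
    path = [corners[here]]
    count = len(corners)
    while True:
        neighbors = [(here - 1) % count, (here + 1) % count]
        candidate = max(neighbors, key=lambda i: objective(corners[i]))
        if objective(corners[candidate]) <= objective(corners[here]):
            return path          # no neighbor improves things: stop here
        here = candidate
        path.append(corners[here])


path = walk_to_best()
print(path)
```

Starting from the origin this visits three of the five corners and stops. For a linear objective on a convex region, a corner that no neighbor beats is the best corner overall, which is why the walk is allowed to stop there.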
But it might take a while. It’s possible for the method to “stall” a while, often because one or more of the variables is at its constraint boundary. Or the division of polytope into simplexes got unlucky, and it’s hard to get to better solutions. Or there might be a string of vertices that are all at, or near, the same value, so the simplex method can’t resolve where to “go” next. In the worst possible case, the simplex method takes a number of iterations that grows exponentially with the number of constraints. This, yes, is very bad. It doesn’t typically happen. It’s a numerical algorithm. There’s some problem to spoil any numerical algorithm.
You may have complaints. Like, the world is complicated. Why are we only looking at linear objective functions? Or, why only look at linear constraints? Well, if you really need to do that? Go ahead, but that’s not linear programming anymore. Think hard about whether you really need that, though. Linear anything is usually simpler than nonlinear anything. I mean, if your optimization function absolutely has to have $y^2$ in it? Could we just say you have a new variable that just happens to be equal to the square of y? Will that work? If you have to have the sine of z? Are you sure that z isn’t going to get outside the region where the sine of z is pretty close to just being z? Can you check?
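One way to check the sine question, as a rough sketch: sample z over the range you expect and see how far the sine of z strays from z, relative to z. The function name and the sample ranges below are made up for the illustration.

```python
import math


def worst_relative_error(z_max, samples=1000):
    """Largest relative gap between sin(z) and z for 0 < z <= z_max."""
    worst = 0.0
    for k in range(1, samples + 1):
        z = z_max * k / samples
        worst = max(worst, abs(math.sin(z) - z) / abs(z))
    return worst


small_range = worst_relative_error(0.1)   # z never leaves (0, 0.1]
large_range = worst_relative_error(1.0)   # z allowed all the way up to 1
print(small_range, large_range)
```

For z up to 0.1 the error is under a fifth of a percent; for z up to 1 it is more than fifteen percent. Whether either is "pretty close" depends on what the problem can tolerate.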
Maybe you have, and there’s just nothing for it. That’s all right. This is why optimization is a living field of study. It demands judgement and thought and all that hard work.
Today’s A To Z term is another I drew from Mr Wu, of the Singapore Math Tuition blog. It gives me more chances to discuss differential equations and mathematical physics, too.
The Hamiltonian we name for Sir William Rowan Hamilton, the 19th century Irish mathematical physicist who worked on everything. You might have encountered his name from hearing about quaternions. Or for coining the terms “scalar” and “tensor”. Or for work in graph theory. There’s more. He did work in Fourier analysis, which is what you get into when you feel at ease with Fourier series. And then wild stuff combining matrices and rings. He’s not quite one of those people where there’s a Hamilton’s Theorem for every field of mathematics you might be interested in. It’s close, though.
When you first learn about physics you learn about forces and accelerations and stuff. When you major in physics you learn to avoid dealing with forces and accelerations and stuff. It’s not explicit. But you get trained to look, so far as possible, away from vectors. Look to scalars. Look to single numbers that somehow encode your problem.
A great example of this is the Lagrangian. It’s built on “generalized coordinates”, which are not necessarily, like, position and velocity and all. They include the things that describe your system. This can be positions. It’s often angles. The Lagrangian shines in problems where it matters that something rotates. Or if you need to work with polar coordinates or spherical coordinates or anything non-rectangular. The Lagrangian is, in your generalized coordinates, equal to the kinetic energy minus the potential energy. It’ll be a function. It’ll depend on your coordinates and on the derivative-with-respect-to-time of your coordinates. You can take partial derivatives of the Lagrangian. This tells how the coordinates, and the change-in-time of your coordinates, should change over time.
The Hamiltonian is a similar way of working out mechanics problems. The Hamiltonian function isn’t anything so primitive as the kinetic energy minus the potential energy. No, the Hamiltonian is the kinetic energy plus the potential energy. Totally different in idea.
From that description you maybe guessed you can transfer from the Lagrangian to the Hamiltonian. Maybe vice-versa. Yes, you can, although we use the term “transform”. Specifically a “Legendre transform”. We can use any coordinates we like, just as with Lagrangian mechanics. And, as with the Lagrangian, we can find how coordinates change over time. The change of any coordinate depends on the partial derivative of the Hamiltonian with respect to a particular other coordinate. This other coordinate is its “conjugate”. (It may either be this derivative, or minus one times this derivative. By the time you’re doing work in the field you’ll know which.)
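For readers who want the symbols, here is the standard textbook form of what that paragraph describes. The notation is my choice, not anything fixed by the essay: $q_i$ for the coordinates, $p_i$ for their conjugates, $L$ for the Lagrangian, and a dot for the derivative with respect to time.

```latex
\begin{align}
  p_i &= \frac{\partial L}{\partial \dot{q}_i}
      && \text{(the conjugate to } q_i \text{)} \\
  H(q, p, t) &= \sum_i p_i \dot{q}_i - L(q, \dot{q}, t)
      && \text{(the Legendre transform)} \\
  \dot{q}_i &= \frac{\partial H}{\partial p_i}, \qquad
  \dot{p}_i = -\frac{\partial H}{\partial q_i}
      && \text{(how the coordinates change in time)}
\end{align}
```

The last line is where the "this derivative, or minus one times this derivative" business shows up: the coordinate gets the plus sign and its conjugate gets the minus sign.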
That conjugate coordinate is the important thing. It’s why we muck around with Hamiltonians when Lagrangians are so similar. In ordinary, common coordinate systems these conjugate coordinates form nice pairs. In Cartesian coordinates, the conjugate to a particle’s position is its momentum, and vice-versa. In polar coordinates, the conjugate to the angle is the angular momentum. These are nice-sounding pairs. But that’s our good luck. These happen to match stuff we already think is important. In general coordinates one or more of a pair can be some fusion of variables we don’t have a word for and would never care about. Sometimes it gets weird. In the problem of vortices swirling around each other on an infinitely great plane? The horizontal position is conjugate to the vertical position. Velocity doesn’t enter into it. For vortices on the sphere the longitude is conjugate to the cosine of the latitude.
What’s valuable about these pairings is that they make a “symplectic manifold”. A manifold is a patch of space where stuff works like normal Euclidean geometry does. In this case, the space is in “phase space”. This is the collection of all the possible combinations of all the variables that could ever turn up. Every particular moment of a mechanical system matches some point in phase space. Its evolution over time traces out a path in that space. Call it a trajectory or an orbit as you like.
We get good things from looking at the geometry that this symplectic manifold implies. For example, if we know that one variable doesn’t appear in the Hamiltonian, then its conjugate’s value never changes. This is almost the kindest thing you can do for a mathematical physicist. But more. A famous theorem by Emmy Noether tells us that symmetries in the Hamiltonian match with conservation laws in the physics. Time-invariance, for example — time not appearing in the Hamiltonian — gives us the conservation of energy. If only distances between things, not absolute positions, matter, then we get conservation of linear momentum. Stuff like that. To find conservation laws in physics problems is the kindest thing you can do for a mathematical physicist.
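The first two of those claims fall out of Hamilton's equations in a couple of lines. This is the standard derivation, written with the usual assumed notation: $q_i$ for a coordinate, $p_i$ for its conjugate, so that $\dot{q}_i = \partial H / \partial p_i$ and $\dot{p}_i = -\partial H / \partial q_i$.

```latex
\begin{align}
  \text{If } q_k \text{ does not appear in } H: \quad
  \dot{p}_k &= -\frac{\partial H}{\partial q_k} = 0 . \\
  \text{Along any trajectory:} \quad
  \frac{dH}{dt}
    &= \sum_i \left( \frac{\partial H}{\partial q_i}\dot{q}_i
       + \frac{\partial H}{\partial p_i}\dot{p}_i \right)
       + \frac{\partial H}{\partial t} \\
    &= \sum_i \left( \frac{\partial H}{\partial q_i}\frac{\partial H}{\partial p_i}
       - \frac{\partial H}{\partial p_i}\frac{\partial H}{\partial q_i} \right)
       + \frac{\partial H}{\partial t}
     = \frac{\partial H}{\partial t} .
\end{align}
```

So when time does not appear explicitly in the Hamiltonian, the last term is zero and the value of $H$ stays fixed along the trajectory.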
The Hamiltonian was born out of planetary physics. These are problems easy to understand and, apart from the case of one star with one planet orbiting each other, impossible to solve exactly. That’s all right. The formalism applies to all kinds of problems. Hamiltonians are very good at handling particles that interact with each other and maybe some potential energy. This is a lot of stuff.
More, the approach extends naturally to quantum mechanics. It takes some damage along the way. We can’t talk about “the” position or “the” momentum of anything quantum-mechanical. But what we get when we look at quantum mechanics looks very much like what Hamiltonians do. We can calculate things which are quantum quite well by using these tools. This even though they came from questions like why Saturn’s rings haven’t fallen apart and whether the Earth will stay around its present orbit.
It holds surprising power, too. Notice that the Hamiltonian is the kinetic energy of a system plus its potential energy. For a lot of physics problems that’s all the energy there is. That is, the value of the Hamiltonian for some set of coordinates is the total energy of the system at that time. And, if there’s no energy lost to friction or heat or whatever? Then that’s the total energy of the system for all time.
Here’s where this becomes almost practical. We often want to do a numerical simulation of a physics problem. Generically, we do this by looking up what all the values of all the coordinates are at some starting time t0. Then we calculate how fast these coordinates are changing with time. We pick a small change in time, Δ t. Then we say that at time t0 plus Δ t, the coordinates are whatever they started at plus Δ t times that rate of change. And then we repeat, figuring out how fast the coordinates are changing now, at this position and time.
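That recipe is the explicit Euler method. Here is a minimal sketch of it on a problem of my choosing, not one from the essay: a unit mass on a unit spring, with position q and momentum p, so the Hamiltonian is half of p squared plus half of q squared. The step size and step count are also arbitrary.

```python
def energy(q, p):
    # Hamiltonian of the assumed toy system: kinetic plus potential energy.
    return 0.5 * p * p + 0.5 * q * q


def euler_step(q, p, dt):
    # Rates of change at the current point: dq/dt = p and dp/dt = -q.
    dq_dt = p
    dp_dt = -q
    # New value = old value + dt times the rate of change, for both at once.
    return q + dt * dq_dt, p + dt * dp_dt


def simulate(q0, p0, dt, steps):
    """Run explicit Euler and return the energy after every step."""
    q, p = q0, p0
    energies = [energy(q, p)]
    for _ in range(steps):
        q, p = euler_step(q, p, dt)
        energies.append(energy(q, p))
    return energies


energies = simulate(q0=1.0, p0=0.0, dt=0.1, steps=1000)
print(energies[0], energies[-1])
```

The true system keeps its energy at 0.5 forever. In this sketch every step multiplies the energy by one plus the square of the time step, so after a thousand steps the simulated spring has thousands of times the energy it started with.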
The trouble is we always make some mistake, and once we’ve made a mistake, we’re going to keep on making mistakes. We can do some clever stuff to make the smallest error possible figuring out where to go, but it’ll still happen. Usually, we stick to calculations where the error won’t mess up our results.
But when we look at stuff like whether the Earth will stay around its present orbit? We can’t make each step good enough for that. Unless we get to thinking about the Hamiltonian, and our symplectic variables. The actual system traces out a path in phase space. Everywhere on that path the Hamiltonian has a particular value, the energy of the system. So use the regular methods to project most of the variables to the new time, t0 + Δ t. But the rest? Pick the values that make the Hamiltonian work out right. Also momentum and angular momentum and other stuff we know get conserved. We’ll still make an error. But it’s a different kind of error. It’ll project to a point that’s maybe in the wrong place on the trajectory. But it’s on the trajectory.
(OK, it’s near the trajectory. Suppose the real energy is, oh, the square root of 5. The computer simulation will have an energy of 2.23607. This is close but not exactly the same. That’s all right. Each step will stay close to the real energy.)
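To see energy staying close, here is a sketch of one of the simplest symplectic schemes, the semi-implicit (or symplectic) Euler method. It is a swapped-in example and not necessarily the kind of scheme the paragraphs above have in mind. The toy system is my assumption: a unit mass on a unit spring, with Hamiltonian half of p squared plus half of q squared.

```python
def energy(q, p):
    # Hamiltonian of the assumed toy system: kinetic plus potential energy.
    return 0.5 * p * p + 0.5 * q * q


def symplectic_euler_step(q, p, dt):
    p_new = p - dt * q        # update the momentum using the old position
    q_new = q + dt * p_new    # update the position using the NEW momentum
    return q_new, p_new


def energy_drift(q0, p0, dt, steps):
    """Largest distance the simulated energy ever gets from its starting value."""
    q, p = q0, p0
    start = energy(q, p)
    worst = 0.0
    for _ in range(steps):
        q, p = symplectic_euler_step(q, p, dt)
        worst = max(worst, abs(energy(q, p) - start))
    return worst


drift = energy_drift(q0=1.0, p0=0.0, dt=0.1, steps=10000)
print(drift)
```

The energy wobbles but never wanders off: for this system and this step size it stays within about 0.03 of the true value of 0.5, however many steps are taken. The simulated point may still be at the wrong spot along the orbit.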
So what we’ll get is a projection of the Earth’s orbit that maybe puts it in the wrong place in its orbit. Putting the planet on the opposite side of the sun from Venus when we ought to see Venus transiting the Sun. That’s all right, if what we’re interested in is whether Venus and Earth are still in the solar system.
There’s a special cost for this. If there weren’t we’d use it all the time. The cost is computational complexity. It’s pricey enough that you haven’t heard about these “symplectic integrators” before. That’s all right. These are the kinds of things open to us once we look long into the Hamiltonian.
One of the podcasts I regularly listen to is the BBC’s In Our Time. This is a roughly 50-minute chat, each week, about some topic of general interest. It’s broad in its subjects; they can be historical, cultural, scientific, artistic, and even sometimes mathematical.
Recently they repeated an episode about Emmy Noether. I knew, before, that she was one of the great figures in our modern understanding of physics. Noether’s Theorem tells us how the geometry of a physics problem constrains the physics we have, and in useful ways. That, for example, what we understand as the conservation of angular momentum results from a physical problem being rotationally symmetric. (That if we rotated everything about the problem by the same angle around the same axis, we’d not see any different behaviors.) Similarly, that you could start a physics scenario at any time, sooner or later, without changing the results forces the physics scenario to have a conservation of energy. This is a powerful and stunning way to connect physics and geometry.
What I had not appreciated until listening to this episode was her work in group theory, and in organizing it in the way we still learn the subject. This startled and embarrassed me. It forced me to realize I knew little about the history of group theory. Group theory has over the past two centuries been a key piece of mathematics. It’s given us results as basic as showing there are polynomials that no quadratic formula-type expression will ever solve. It’s given results as esoteric as predicting what kinds of exotic subatomic particles we should expect to exist. And her work’s led into the modern understanding of the fundamentals of mathematics. So it’s exciting to learn some more about this.
There were several more comic strips last week worth my attention. One of them, though, offered a lot for me to write about, packed into one panel featuring what comic strip fans call the Wall O’ Text.
Bea R’s In Security for the 9th is part of a storyline about defeating an evil “home assistant”. The choice of weapon is Michaela’s barrage of questions, too fast and too varied to answer. There are some mathematical questions tossed in the mix. The obvious one is “zero divided by two equals zero, but why’z two divided by zero called crazy town?” Like with most “why” mathematics questions there are a range of answers.
The obvious one, I suppose, is to appeal to intuition. Think of dividing one number by another by representing the numbers with things. Start with a pile of the first number of things. Try putting them into the second number of bins. How many times can you do this? And then you can pretty well see that you can fill two bins with zero things zero times. But you can fill zero bins with two things — well, what is filling zero bins supposed to mean? And that warns us that dividing by zero is at least suspicious.
That’s probably enough to convince a three-year-old, and probably most sensible people. If we start getting open-minded about what it means to fill no containers, we might say, well, why not have two things fill the zero containers zero times over, or once over, or whatever convenient answer would work? And here we can appeal to mathematical logic. Start with some ideas that seem straightforward. Like, that division is the inverse of multiplication. That addition and multiplication work like you’d guess from the way integers work. That distribution works. Then you can quickly enough show that if you allow division by zero, this implies that every number equals every other number. Since it would be inconvenient for, say, “six” to also equal “minus 113,847,506 and three-quarters” we say division by zero is the problem.
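One way that argument can go, written out. The symbol $z$ is my stand-in for the supposed result of dividing one by zero; the only rules used are the ordinary ones for addition and multiplication.

```latex
\begin{align}
  a \cdot 0 &= a \cdot (0 + 0) = a \cdot 0 + a \cdot 0
    \quad\Longrightarrow\quad a \cdot 0 = 0
    && \text{(distribution, for every } a \text{)} \\
  0 \cdot z &= 1
    && \text{(what ``dividing by zero'' would have to mean)} \\
  a &= a \cdot 1 = a \cdot (0 \cdot z) = (a \cdot 0) \cdot z = 0 \cdot z = 1
    && \text{(for every } a \text{)}
\end{align}
```

Every number comes out equal to 1, so every number equals every other number.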
This is compelling until you ask what’s so great about addition and multiplication as we know them. And here’s a potentially fruitful line of attack. Coming up with alternate ideas for what it means to add or to multiply is fine. We can do this easily with modular arithmetic, that thing where we say, like, 5 + 1 equals 0 all over again, and 5 + 2 is 1 and 5 + 3 is 2. This can create a ring, and it can offer us wild ideas like “3 times 2 equals 0”. This doesn’t get us to where dividing by zero means anything. But it hints that maybe there’s some exotic frontier of mathematics in which dividing by zero is good, or useful. I don’t know of one. But I know very little about topics like non-standard analysis (where mathematicians hypothesize non-negative numbers that are not zero, but are also smaller than any positive real number) or structures like surreal numbers. There may be something lurking behind a Quanta Magazine essay I haven’t read even though they tweet about it four times a week. (My twitter account is, for some reason, not loading this week.)
Michaela’s questions include a couple other mathematically-connected topics. “If infinity is forever, isn’t that crazy, too?” Crazy is a loaded word and probably best avoided. But there are infinitely large sets of things. There are processes that take infinitely many steps to complete. Please be kind to me in my declaration “are”. I spent five hundred words on “two divided by zero”. I can’t get into what it means for a mathematical thing to “exist”. I don’t know. In any event. Infinities are hard and we rely on them. They defy our intuition. Mathematicians over the 19th and 20th centuries worked out fairly good tools for handling these. They rely on several strategies. Most of these amount to: we can prove that the difference between “infinitely many steps” and “very many steps” can be made smaller than any error tolerance we like. And we can say what “very many steps” implies for a thing. Therefore we can say that “infinitely many steps” gives us some specific result. A similar process holds for “infinitely many things” instead of “infinitely many steps”. This does not involve actually dealing with infinity, not directly. It involves dealing with large numbers, which work like small numbers but longer. This has worked quite well. There’s surely some field of mathematics about to break down that happy condition.
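The modular arithmetic is easy to poke at. A small sketch, assuming the modulus is 6, which is what “5 + 1 equals 0” suggests; the function names are mine.

```python
MODULUS = 6  # assumed from "5 + 1 equals 0 all over again"


def add(a, b):
    return (a + b) % MODULUS


def multiply(a, b):
    return (a * b) % MODULUS


def inverses(a):
    """Every b with a * b = 1 in this arithmetic.

    An empty list means there is no such thing as dividing by a here.
    """
    return [b for b in range(MODULUS) if multiply(a, b) == 1]


print(add(5, 1), add(5, 2), add(5, 3))   # 0 1 2
print(multiply(3, 2))                    # 0
print(inverses(5), inverses(2), inverses(0))
```

In this arithmetic you can divide by 5, since 5 times 5 is 1. You cannot divide by 2 or by 3, and you still cannot divide by 0. New rules change which divisions fail without rescuing that one.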
And there’s one more mathematical bit. Why is a ball round? This comes around to definitions. Suppose a ball is all the points within a particular radius of a center. What shape that is depends on what you mean by “distance”. The common definition of distance, the “Euclidean norm”, we get from our physical intuition. It implies this shape should be round. But there are other measures of distance, useful for other roles. They can imply “balls” that we’d say were octahedrons, or cubes, or rounded versions of these shapes. We can pick our distance to fit what we want to do, and shapes follow.
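A sketch of how the choice of distance changes which points count as inside the “ball” of radius 1 around the origin. It uses the standard p-norms: p = 1 gives the octahedron, p = 2 the familiar round ball, and the maximum norm gives the cube. The sample points are arbitrary.

```python
def norm(point, p):
    """p-norm of a point; p = float('inf') gives the maximum norm."""
    if p == float("inf"):
        return max(abs(c) for c in point)
    return sum(abs(c) ** p for c in point) ** (1.0 / p)


def in_unit_ball(point, p):
    """True when the point is within distance 1 of the origin, measured by the p-norm."""
    return norm(point, p) <= 1.0


for point in [(0.3, 0.3, 0.3), (0.5, 0.5, 0.5), (0.6, 0.6, 0.6)]:
    print(point, [in_unit_ball(point, p) for p in (1, 2, float("inf"))])
```

The point (0.5, 0.5, 0.5) is inside the round ball and the cube but outside the octahedron. The point (0.6, 0.6, 0.6) is inside only the cube.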
I suspect but do not know that it works the other way, that if we want a “ball” to be round, it implies we’re using a distance that’s the Euclidean measure. I defer to people better at normed spaces than I am.
Mark Anderson’s Andertoons for the 10th is the Mark Anderson’s Andertoons for the week. It’s also a refreshing break from talking so much about In Security. Wavehead is doing the traditional kid-protesting-the-chalkboard-problem. This time with an electronic chalkboard, an innovation that I’ve heard about but never used myself.
Three of the strips I have for this installment feature kids around mathematics talk. That’s enough for a theme name.
Gary Delainey and Gerry Rasmussen’s Betty for the 23rd is a strip about luck. It’s easy to form the superstitious view that you have a finite amount of luck, or that you have good and bad lucks which offset each other. It feels like it. If you haven’t felt like it, then consider that time you got an unexpected $200, hours before your car’s alternator died.
If events are independent, though, that’s just not so. Whether you win $600 in the lottery this week has no effect on whether you win any next week. Similarly whether you’re struck by lightning should have no effect on whether you’re struck again.
Except that this assumes independence. Even defines independence. This is obvious when you consider that, having won $600, it’s easier to buy an extra twenty dollars in lottery tickets and that does increase your (tiny) chance of winning again. If you’re struck by lightning, perhaps it’s because you tend to be someplace that’s often struck by lightning. Probability is a subtler topic than everyone acknowledges, even when they remember that it is such a subtle topic.
Darrin Bell’s Candorville for the 23rd jokes about the uselessness of arithmetic in modern society. I’m a bit surprised at Lemont’s glee in not having to work out tips by hand. The character’s usually a bit of a science nerd. But liking science is different from enjoying doing arithmetic. And bad experiences learning mathematics can sour someone on the subject for life. (Which is true of every subject. Compare the number of people who come out of gym class enjoying physical fitness.)
If you need some Internet Old, read the comments at GoComics, which include people offering dire warnings about what you need in case your machine gives the wrong answer. Which is technically true, but for this application? Getting the wrong answer is not an immediately awful affair. Also a lot of cranky complaining about tipping having risen to 20% just because the United States continues its economic punishment of working peoples.
Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 25th is some wordplay. Mathematicians often need to find minimums of things. Or maximums of things. Being able to do one lets you do the other, as you’d expect. If you didn’t expect, think about it a moment, and then you expect it. So min and max are often grouped together.
Paul Trap’s Thatababy for the 26th is circling around wordplay, turning some common shape names into pictures. This strip might be aimed at mathematics teachers’ doors. I’d certainly accept these as jokes that help someone learn their shapes.
So a couple days ago I was chatting with a mathematician friend. He mentioned how he was struggling with the Ricci Tensor. Not the definition, not exactly, but its point. What the Ricci Tensor was for, and why it was a useful thing. He wished he knew of a pop mathematics essay about the thing. And this brought, slowly at first, to my mind that I knew of one. I wrote such a pop-mathematics essay about the Ricci Tensor, as part of my 2017 A To Z sequence. In it, I spend several paragraphs admitting that I’m not sure I understand what the Ricci tensor is for, and why it’s a useful thing.
Daniel Beyer’s Long Story Short for the 11th mentions some physics hypotheses. These are ideas about how the universe might be constructed. Like many such cosmological thoughts they blend into geometry. The no-boundary proposal, also known as the Hartle-Hawking state (for James Hartle and Stephen Hawking), is a hypothesis about the … I want to write “the start of time”. But I am not confident that this doesn’t beg the question. Well, we think we know what we mean by “the start of the universe”. A natural question in mathematical physics is, what was the starting condition? At the first moment that there was anything, what did it look like? And this becomes difficult to answer, difficult to even discuss, because part of the creation of the universe was the creation of spacetime. In this no-boundary proposal, the shape of spacetime at the creation of the universe is such that there just isn’t a “time” dimension at the “moment” of the Big Bang. The metaphor I see reprinted often about this is how there’s not a direction south of the south pole, even though south is otherwise a quite understandable concept on the rest of the Earth. (I agree with this proposal, but I feel like the analogy isn’t quite tight enough.)
Still, there are mathematical concepts which seem akin to this. What is the start of the positive numbers, for example? Any positive number you might name has some smaller number we could have picked instead, until we fall out of the positive numbers altogether and into zero. For a mathematical physics concept there’s absolute zero, the coldest temperature there is. But there is no achieving absolute zero. The thermodynamical reasons behind this are hard to argue. (I’m not sure I could put them in a two-thousand-word essay, not the way I write.) It might be that the “moment of the Big Bang” is similarly inaccessible but, at least for the correct observer, incredibly close by.
The Weyl Curvature is a creation of differential geometry. So it is important in relativity, in describing the curve of spacetime. It describes several things that we can think we understand. One is the tidal forces on something moving along a geodesic. Moving along a geodesic is the general-relativity equivalent of moving in a straight line at a constant speed. Tidal forces are those things we remember reading about. They come from the Moon, sometimes the Sun, sometimes from a black hole a theoretical starship is falling into. Another way we are supposed to understand it is that it describes how gravitational waves move through empty space, space which has no other mass in it. I am not sure that this is that understandable, but it feels accessible.
The Weyl tensor describes how the shapes of things change under tidal forces, but it tracks no information about how the volume changes. The Ricci tensor, in contrast, tracks how the volume of a shape changes, but not the shape. Between the Ricci and the Weyl tensors we have all the information about how the shape of spacetime affects the things within it.
Ted Bunn, writing to John Baez, offers a great piece of advice in understanding what the Weyl Tensor offers. Bunn compares the subject to electricity and magnetism. If one knew all the electric charges and current distributions in space, one would … not quite know what the electromagnetic fields were. This is because there are electromagnetic waves, which exist independently of electric charges and currents. We need to account for those to have a full understanding of electromagnetic fields. So, similarly, the Weyl curvature gives us this for gravity. How is a gravitational field affected by waves, which exist and move independently of some source?
I am not sure that the Weyl Curvature is truly, as the comic strip proposes, a physics hypothesis “still on the table”. It’s certainly something still researched, but that’s because it offers answers to interesting questions. But that’s also surely close enough for the comic strip’s needs.
Dave Coverly’s Speed Bump for the 11th is a wordplay joke, and I have to admit its marginality. I can’t say it’s false for people who (presumably) don’t work much with coefficients to remember them after a long while. I don’t do much with French verb tenses, so I don’t remember anything about the pluperfect except that it existed. (I have a hazy impression that I liked it, but not an idea why. I think it was something in the auxiliary verb.) Still, this mention of coefficients nearly forms a comic strip synchronicity with Mike Thompson’s Grand Avenue for the 11th, in which a Math Joke allegedly has a mistaken coefficient as its punch line.
Mike Thompson’s Grand Avenue for the 12th is the one I’m taking as representative for the week, though. The premise has been that Gabby and Michael were sent to Math Camp. They do not want to go to Math Camp. They find mathematics to be a bewildering set of arbitrary and petty rules to accomplish things of no interest to them. From their experience, it’s hard to argue. The comic has, since I started paying attention to it, consistently had mathematics be a chore dropped on them. And not merely from teachers who want them to solve boring story problems. Their grandmother dumps workbooks on them, even in the middle of summer vacation, presenting it as a chore they must do. Most comic strips present mathematics as a thing students would rather not do, and that’s both true enough and a good starting point for jokes. But I don’t remember any that make mathematics look so tedious. Anyway, I highlight this one because, of the Math Camp jokes, it and the coefficients mention above are the most direct mention of some mathematical thing. The rest are along the lines of the strip from the 9th, asserting that the “Math Camp Activity Board” spelled the last word wrong. The joke’s correct but it’s not mathematical.
So I had to put this essay to bed before I could read Saturday’s comics. Were any of them mathematically themed? I may know soon! And were there comic strips with some mention of mathematics, but too slight for me to make a paragraph about? What could be even slighter than the mathematical content of the Speed Bump and the Grand Avenue I did choose to highlight? Please check the Reading the Comics essay I intend to publish Tuesday. I’m curious myself.
A friend was playing with that cute little particle-physics simulator idea I mentioned last week. And encountered a problem. With a little bit of thought, I was able to not solve the problem. But I was able to explain why it was a subtler and more difficult problem than they had realized. These are the moments that make me feel justified calling myself a mathematician.
The proposed simulation was simple enough: imagine a bunch of particles that interact by rules that aren’t necessarily symmetric. Like, the attraction particle A exerts on particle B isn’t the same as what B exerts on A. Or there are multiple species of particles. So (say) red particles are attracted to blue but repelled by green. But green is attracted to red and repelled by blue twice as strongly as red is attracted to blue. Your choice.
Give a mathematician a perfectly good model of something. She’ll have the impulse to try tinkering with it. One reliable way to tinker with it is to change the domain on which it works. If your simulation supposes you have particles moving on the plane, then, what if they were in space instead? Or on the surface of a sphere? Or what if something was strange about the plane? My friend had this idea: what if the particles were moving on the surface of a cube?
And the problem was how to find the shortest distance between two particles on the surface of a cube. The distance matters since most any attraction rule depends on the distance. This may be as simple as “particles more than this distance apart don’t interact in any way”. The obvious approach, or if you prefer the naive approach, is to pretend the cube is a sphere and find distances that way. This doesn’t get it right, not if the two points are on different faces of the cube. If they’re on adjacent faces, ones which share an edge — think the floor and the wall of a room — it seems straightforward enough. My friend got into trouble with points on opposite faces. Think the floor and the ceiling.
Inside a rectangular room, measuring 30 feet in length and 12 feet in width and height, a spider is at a point on the middle of one of the end walls, 1 foot from the ceiling, as at A; and a fly is on the opposite wall, 1 foot from the floor in the centre, as shown at B. What is the shortest distance that the spider must crawl in order to reach the fly, which remains stationary? Of course the spider never drops or uses its web, but crawls fairly.
(Also I admire Dudeney’s efficient closing off of the snarky, problem-breaking answer someone was sure to give. It suggests experienced thought about how to pose problems.)
What makes this a puzzle, even a paradox, is that the obvious answer is wrong. At least, what seems like the obvious answer is to start at point A, move to one of the surfaces connecting the spider’s and the fly’s starting points, and from that move to the fly’s surface. But, no: you get a shorter answer by using more surfaces. Going on a path that seems like it wanders more gets you a shorter distance. The solution’s presented here, along with some follow-up problems. In this case, the spider’s shortest path uses five of the six surfaces of the room.
The approach to finding this is an ingenious one. Imagine the room as a box, and unfold it into something flat. Then find the shortest distance on that flat surface. Then fold the box back up. It’s a good trick. It turns out to be useful in many problems. Mathematical physicists often have reason to ponder paths of things on flattenable surfaces like this. Sometimes they’re boxes. Sometimes they’re toruses, the shape of a doughnut. This kind of unfolding often makes questions like “what’s the shortest distance between points” easier to solve.
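To see the unfolding at work on Dudeney’s room, here is a minimal sketch in Python. The three unfoldings, and the flat displacements I give for each, are my own reading of the puzzle’s measurements rather than anything taken from the linked solution, so treat the labels as assumptions. Each entry lays a different chain of surfaces out flat and measures the straight line from spider to fly.

```python
import math

# Room from the puzzle: 30 ft long, 12 ft wide, 12 ft high.
LENGTH, WIDTH, HEIGHT = 30, 12, 12
SPIDER_BELOW_CEILING = 1  # spider: middle of one end wall, 1 ft below the ceiling
FLY_ABOVE_FLOOR = 1       # fly: middle of the other end wall, 1 ft above the floor

# Each unfolding: (surfaces crossed, flat displacement along the room,
# flat displacement across the room). These layouts are assumptions
# made for illustration, not a complete list of possible unfoldings.
UNFOLDINGS = [
    ("end wall, ceiling, end wall",
     SPIDER_BELOW_CEILING + LENGTH + (HEIGHT - FLY_ABOVE_FLOOR),
     0),
    ("end wall, ceiling, side wall, end wall",
     SPIDER_BELOW_CEILING + LENGTH + WIDTH / 2,
     WIDTH / 2 + (HEIGHT - FLY_ABOVE_FLOOR)),
    ("end wall, ceiling, side wall, floor, end wall",
     SPIDER_BELOW_CEILING + LENGTH + FLY_ABOVE_FLOOR,
     WIDTH / 2 + HEIGHT + WIDTH / 2),
]


def crawl_lengths():
    """Return {surfaces crossed: straight-line length on that unfolding}."""
    return {name: math.hypot(along, across)
            for name, along, across in UNFOLDINGS}


if __name__ == "__main__":
    for name, length in crawl_lengths().items():
        print(f"{name}: {length:.2f} ft")
```

Under these assumptions the straight-over-the-ceiling route comes to 42 feet, the four-surface route to about 40.72 feet, and the five-surface route to 40 feet.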
There are wrinkles to the unfolding. Of course there are. How interesting would it be if there weren’t? The wrinkles amount to this. Imagine you start at the corner of the room, and walk up a wall at a 45 degree angle to the horizon. You’ll get to the far corner eventually, if the room has proportions that allow it. All right. But suppose you walked up at an angle of 30 degrees to the horizon? At an angle of 75 degrees? You’ll wind your way around the walls (and maybe floor and ceiling) some number of times, each path you start with. Probably different numbers of times. Some path will be shortest, and that’s fine. But … like, think about the path that goes along the walls and ceiling and floor three times over. The room, unfolded into a flat panel, has only one floor and one ceiling and each wall once. The straight line you might be walking goes right off the page.
And this is the wrinkle. You might need to tile the room. In a column of blocks (like in Dudeney’s solution) every fourth block might be the floor, with, between any two of them, a ceiling. This is fine, and what’s needed. It can be a bit dizzying to imagine such a state of affairs. But if you’ve ever zoomed a map of the globe out far enough that you see Australia six times over then you’ve understood how this works.
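The tiling idea is easiest to show on the torus rather than the cube, so here is a hedged sketch of that simpler case. It assumes a flat rectangle whose opposite edges are glued together; the function names and parameters are mine, made up for illustration. The shortest distance is the shortest straight line from one point to any tiled copy of the other.

```python
import math


def torus_distance(p, q, width, height):
    """Shortest distance between p and q on a flat torus.

    Along each axis, either go directly or wrap around the edge,
    whichever is shorter.
    """
    dx = abs(p[0] - q[0]) % width
    dy = abs(p[1] - q[1]) % height
    dx = min(dx, width - dx)
    dy = min(dy, height - dy)
    return math.hypot(dx, dy)


def torus_distance_by_tiling(p, q, width, height):
    """Same answer, found by tiling: compare p against the nine nearest
    copies of q. Assumes p and q both lie inside the base rectangle."""
    return min(
        math.hypot(p[0] - (q[0] + i * width), p[1] - (q[1] + j * height))
        for i in (-1, 0, 1)
        for j in (-1, 0, 1)
    )


if __name__ == "__main__":
    # Two points near opposite edges are close once the edges are glued.
    print(torus_distance((0.5, 0.5), (9.5, 0.5), 10, 10))
```

The cube needs more care than this, since which faces sit next to each other in the flat layout depends on the route taken. I offer the torus only as the gentlest version of the trick.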
I cannot attest that this has helped my friend in the slightest. I am glad that my friend wanted to think about the surface of the cube. The surface of a dodecahedron would be far, far past my ability to help with.
I hoped I’d get a Reading the Comics post in for Tuesday, and even managed it. With this I’m all caught up to the syndicated comic strips which, last week, brought up some mathematics topic. I’m open for nominations about what to publish here Thursday. Write in quick.
Hilary Price’s Rhymes With Orange for the 30th is a struggling-student joke. And set in summer school, so the comic can be run the last day of June without standing out to its United States audience. It expresses a common anxiety, about that point when mathematics starts using letters. It superficially seems strange that this change worries students. Students surely had encountered problems where some term in an equation was replaced with a blank space and they were expected to find the missing term. This is the same work as using a letter. Still, there are important differences. First is that a blank line (box, circle, whatever) has connotations of “a thing to be filled in”. A letter seems to carry meaning into the problem, even if it’s just “x marks the spot”. And a letter, as we use it in English, always stands for the same thing (or at least the same set of things). That ‘x’ may be 7 in one problem and 12 in another seems weird. I mean weird even by the standards of English orthography.
A letter might represent a number whose value we wish to know; it might represent a number whose value we don’t care about. These are different ideas. We usually fall into a convention where numbers we wish to know are more likely x, y, and z, while those we don’t care about are more likely a, b, and c. But even that’s no reliable rule. And there may be several letters in a single equation. It’s one thing to have a single unknown number to deal with. To have two? Three? I don’t blame people fearing they can’t handle that.
Mark Leiknes’s Cow and Boy for the 30th has Billy and Cow pondering the Prisoner’s Dilemma. This is one of the first examples someone encounters in game theory. Game theory sounds like the most fun part of mathematics. It’s the study of situations in which there’s multiple parties following formal rules which allow for gains or losses. This is an abstract description. It means many things fit a mathematician’s idea of a game.
The Prisoner’s Dilemma is described well enough by Billy. It’s built on two parties, each — separately and without the ability to coordinate — having to make a choice. Both would be better off, under interrogation, to keep quiet and trust that the cops can’t get anything significant on them. But both have the temptation that if they rat out the other, they’ll get off free while their former partner gets screwed. And knowing that their partner has the same temptation. So what would be best for the two of them requires them both doing the thing that maximizes their individual risk. The implication is unsettling: everyone acting in their own best interest is supposed to produce the best possible result for society. And here, for the society of these two accused, it breaks down entirely.
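Here is a small sketch of that breakdown in Python. The prison terms are numbers I made up for illustration; the comic doesn’t give any. What matters is only their ordering. The code checks that ratting is each prisoner’s best reply whatever the partner does, and that both ratting still leaves the pair worse off than both keeping quiet.

```python
# Years in prison for (my choice, partner's choice) -> (mine, partner's).
# These particular numbers are assumptions; only their ordering matters.
YEARS = {
    ("quiet", "quiet"): (1, 1),
    ("quiet", "rat"): (10, 0),
    ("rat", "quiet"): (0, 10),
    ("rat", "rat"): (5, 5),
}
CHOICES = ("quiet", "rat")


def best_reply(partner_choice):
    """The choice that minimizes my own sentence, given my partner's choice."""
    return min(CHOICES, key=lambda mine: YEARS[(mine, partner_choice)][0])


def total_years(a_choice, b_choice):
    """Combined sentence for the two-person society."""
    return sum(YEARS[(a_choice, b_choice)])


if __name__ == "__main__":
    for partner in CHOICES:
        print(f"partner plays {partner}: my best reply is {best_reply(partner)}")
    print("both quiet:", total_years("quiet", "quiet"), "years in total")
    print("both rat:", total_years("rat", "rat"), "years in total")
```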
Exponents have been written as numbers in superscript following a base for a long while now. The notation developed over the 17th century. I don’t know why mathematicians settled on superscripts, as opposed to the many other ways a base and an exponent might fit together. It’s a good mnemonic to remember, say, “z raised to the 10th” is z with a raised 10. But I don’t know the etymology of “raised” in a mathematical context well enough. It’s plausible that we say “raised” because that’s what the notation suggests.
The proof of the Pythagorean Theorem is one of the very many known to humanity. This one is among the family of proofs that are wordless. At least nearly wordless. You can get from here to a² + b² = c² with very little prompting. If you do need prompting, it’s this: there are two expressions for the area of the square with sides a-plus-b. One of these expressions uses only terms of a and b. The other expression uses terms of a, b, and c. If this doesn’t get a bit of a grin out of you, don’t worry. There’s, like, 2,037 other proofs we already know about. We might ask whether we need quite so many proofs of the Pythagorean theorem. It doesn’t seem to be under serious question most of the time.
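Spelled out, the two area counts might look like this. I’m assuming the picture is the familiar one, where four copies of a right triangle with legs a and b and hypotenuse c sit inside the big square, around a tilted square of side c. If the strip draws a different arrangement, the bookkeeping changes but the idea doesn’t.

```latex
\begin{align*}
(a + b)^2 &= a^2 + 2ab + b^2
  && \text{(terms of $a$ and $b$ only)} \\
(a + b)^2 &= c^2 + 4 \cdot \tfrac{1}{2} ab = c^2 + 2ab
  && \text{(tilted square plus four triangles)} \\
a^2 + b^2 &= c^2
  && \text{(subtract $2ab$ from both counts)}
\end{align*}
```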
And then a couple comic strips last week just mentioned mathematics. Morrie Turner’s Wee Pals for the 1st of July has the kids trying to understand their mathematics homework. Could have been anything. Mike Thompson’s Grand Avenue for the 5th started a sequence with the kids at Math Camp. The comic is trying quite hard to get me riled up. So far it’s been the kids agreeing that mathematics is the worst, and has left things at that. Hrmph.
Ernie Bushmiller’s Nancy Classics for the 27th uses arithmetic as an economical way to demonstrate intelligence. At least, the ability to do arithmetic is used as proof of intelligence. Which shouldn’t surprise. The conventional appreciation for Ernie Bushmiller is of his skill at efficiently communicating the ideas needed for a joke. That said, it’s a bit surprising Sluggo asks the dog “six times six divided by two”; if it were just showing any ability at arithmetic “one plus one” or “two plus two” would do. But “six times six divided by two” has the advantage of being a bit complicated. That is, it’s reasonable Sluggo wouldn’t know it right away, and would see it as something only the brainiest would. But it’s not so complicated that Sluggo wouldn’t plausibly know the question.
Eric the Circle for the 28th, this one by AusAGirl, uses “Non-Euclidean” as a way to express weirdness in shape. My first impulse was to say that this wouldn’t really be a non-Euclidean circle. A non-Euclidean geometry has space that’s different from what we’re approximating with sheets of paper or with boxes put in a room. There are some that are familiar, or roughly familiar, such as the geometry of the surface of a planet. But you can draw circles on the surface of a globe. They don’t look like this mooshy T-circle. They look like … circles. Their weirdness comes in other ways, like how the circumference is not π times the diameter.
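For the curious, here is a minimal sketch of that weirdness on a sphere. It uses the standard formula for a circle drawn on a sphere, with the radius and diameter measured along the surface. The function name and the default sphere radius are mine.

```python
import math


def circumference_over_diameter(radius, sphere_radius=1.0):
    """Ratio of circumference to diameter for a circle on a sphere.

    `radius` is measured along the surface, from the circle's center,
    and should be greater than 0 and at most pi * sphere_radius.
    """
    angle = radius / sphere_radius
    circumference = 2 * math.pi * sphere_radius * math.sin(angle)
    return circumference / (2 * radius)


if __name__ == "__main__":
    for r in (0.001, 0.5, 1.0, math.pi / 2):
        print(f"radius {r:.3f}: ratio {circumference_over_diameter(r):.4f}")
```

Tiny circles come out very close to π, as they should on something that looks flat up close. The equator, seen as a circle centered on a pole, comes out to exactly 2.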
On reflection, I’m being too harsh. What makes a space non-Euclidean is … well, many things. One that’s easy to understand is to imagine that the space uses some novel definition for the distance between points. Distance is a great idea. It turns out to be useful, in geometry and in analysis, to use a flexible idea of what distance is. We can define the distance between things in ways that look just like the Euclidean idea of distance. Or we can define it in other, weirder ways. We can, whatever the distance, define a “circle” as the set of points that are all exactly some distance from a chosen center point. And the appearance of those “circles” can differ.
There are literally infinitely many possible distance functions. But there is a family of them which we use all the time. And the “circles” in those look like … well, at the most extreme, they look like squares. Others will look like rounded squares, or like slightly diamond-shaped circles. I don’t know of any distance function that’s useful that would give us a circle like this picture of Eric. But there surely is one that exists and that’s enough for the joke to be certified factually correct. And that is what’s truly important in a comic strip.
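A hedged sketch of one such family, in Python. I’m assuming the family in question is the p-norm distances, which fits the description of squares and diamonds but is my guess. The function name is mine too. With p = 2 this is the ordinary Euclidean distance; with p = 1 the unit “circle” is a diamond; with p infinite it’s a square.

```python
import math


def p_distance(a, b, p):
    """Distance between points a and b under the p-norm, for p >= 1.

    Pass p = math.inf for the limiting case, where the distance is the
    largest coordinate difference.
    """
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(p):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1 / p)


if __name__ == "__main__":
    origin = (0.0, 0.0)
    corner = (1.0, 1.0)
    for p in (1, 2, 4, math.inf):
        print(f"p = {p}: distance from origin to (1, 1) is "
              f"{p_distance(origin, corner, p):.4f}")
```

The point (1, 1) is a distance 1 from the origin only in the infinite case. That is another way of saying that “circle” of radius 1 has corners.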
Sandra Bell-Lundy’s Between Friends for the 29th is the Venn Diagram joke for the week. Formally, you have to read this diagram charitably for it to parse. If we take the “what” that Maeve says, or doesn’t say, to be particular sentences, then the intersection has to be empty. You can’t both say and not-say a sentence. But it seems to me that any conversation of importance has the things which we choose to say and the things which we choose not to say. And it is so difficult to get the blend of things said and things unsaid correct. And I realize that the last time Between Friends came up here I was similarly defending the comic’s Venn Diagram use. I’m a sympathetic reader, at least to most comic strips.
And that was the conclusion of comic strips through the 29th of June which mentioned mathematics enough for me to write much about. There were a couple other comics that brought up something or other, though. Wulff and Morgenthaler’s WuMo for the 27th of June has a Rubik’s Cube joke. The traditional Rubik’s Cube has three rows, columns, and layers of cubes. But there’s no reason there can’t be more rows and columns and layers. Back in the 80s there were enough four-by-four-by-four cubes sold that I even had one. Wikipedia tells me the officially licensed cubes have gotten only up to five-by-five-by-five. But that there was a 17-by-17-by-17 cube sold, with prototypes for 22-by-22-by-22 and 33-by-33-by-33 cubes. This seems to me like a great many stickers to peel off and reattach.
I’d meant to get back into discussing continuous functions this week, and then didn’t have the time. I hope nobody was too worried.
Bill Amend’s FoxTrot for the 19th is set up as geometry or trigonometry homework. There are a couple of angles that we use all the time, and they do correspond to some common unit fractions of a circle: a quarter, a sixth, an eighth, a twelfth. These map nicely to common cuts of circular pies, at least. Well, it’s a bit of a freak move to cut a pie into twelve pieces, but it’s not totally out there. If someone cuts a pie into 24 pieces, flee.
Tom Batiuk’s vintage Funky Winkerbean for the 19th of May is a real vintage piece, showing off the days when pocket electronic calculators were new. The sales clerk describes the calculator as having “a floating decimal”. And here I must admit: I’m poorly read on early-70s consumer electronics. So I can’t say that this wasn’t a thing. But I suspect that Batiuk either misunderstood “floating-point decimal”, which would be a selling point, or shortened the phrase in order to make the dialogue less needlessly long. Which is fine, and his right as an author. The technical detail does its work, for the setup, by existing. It does not have to be an actual sales brochure. Reducing “floating point decimal” to “floating decimal” is a useful artistic shorthand. It’s the dialogue equivalent to the implausibly few, but easy to understand, buttons on the calculator in the title panel.
Floating point is one of the ways to represent numbers electronically. The storage scheme is much like scientific notation. That is, rather than think of 2,038, think of 2.038 times 10³. In the computer’s memory are stored the 2.038 and the 3, with the “times ten to the” part implicit in the storage scheme. The advantage of this is the range of numbers one can use now. There are different ways to implement this scheme; a common one will let one represent numbers as tiny as 10⁻³⁰⁸ or as large as 10³⁰⁸, which is enough for most people’s needs.
The disadvantage is that floating point numbers aren’t perfect. They have only around (commonly) sixteen digits of significance. That is, the first sixteen or so significant digits of the number you represent mean anything; everything after that is garbage. Most of the time, that trailing garbage doesn’t hurt. But most is not always. Trying to add, for example, a tiny number, like 10⁻²⁰, to a huge number, like 10²⁰, won’t get the right answer. And there are numbers that can’t be represented correctly anyway, including such exotic and novel numbers as . A lot of numerical mathematics is about finding ways to compute that avoid these problems.
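You can watch this happen in Python. This sketch assumes the usual setup, where Python’s float is a 64-bit IEEE 754 double; the helper name is mine. It shows the tiny number vanishing when added to the huge one, and a familiar decimal sum that doesn’t come out exact.

```python
import sys


def is_swallowed(big, tiny):
    """True if adding tiny to big leaves big unchanged."""
    return big + tiny == big


if __name__ == "__main__":
    print("largest float:", sys.float_info.max)
    print("smallest normal float:", sys.float_info.min)
    print("decimal digits always preserved:", sys.float_info.dig)
    print("1e20 + 1e-20 == 1e20?", is_swallowed(1e20, 1e-20))
    print("0.1 + 0.2 == 0.3?", 0.1 + 0.2 == 0.3)
```

The usual workarounds, like adding the small numbers together first before touching the big one, are the sort of thing numerical mathematics worries about.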
Back when I was a grad student I did have one casual friend who proclaimed that no real mathematician ever worked with floating point numbers, because of the limitations they impose. I could not get him to accept that no, in fact, mathematicians are fine with these limitations. Every scheme for representing numbers on a computer has limitations, and floating point numbers work quite well. At some point, you have to suspect some people would rather fight for a mistaken idea they already have than accept something new.
Mac King and Bill King’s Magic in a Minute for the 19th does a bit of stage magic supported by arithmetic: forecasting the sum of three numbers. The trick is that all eight possible choices someone would make have the same sum. There’s a nice bit of group theory hidden in the “Howdydoit?” panel, about how to do the trick a second time. Rotating the square of numbers makes what looks, casually, like a different square. It’s hard for a human to memorize a string of digits that don’t have any obvious meaning, and the longer the string the worse people are at it. If you’ve had a person, as directed, black out the rows or columns they didn’t pick, then it’s harder to notice the reused pattern.
The different directions that you could write the digits down in represent symmetries of the square. That is, geometric operations that would replace a square with something that looks like the original. This includes rotations, by 90 or 180 or 270 degrees clockwise. Mac King and Bill King don’t mention it, but reflections would also work: if the top row were 4, 9, 2, for example, and the middle 3, 5, 7, and the bottom 8, 1, 6. Combining rotations and reflections also works.
If you do the trick a second time, your mark might notice it’s odd that the sum came up 15 again. Do it a third time, even with a different rotation or reflection, and they’ll know something’s up. There are things you could do to disguise that further. Just double each number in the square, for example: a square of 4/18/8, 14/10/6, 12/2/16 will have each row or column or diagonal add up to 30. But this loses the beauty of doing this with the digits 1 through 9, and your mark might grow suspicious anyway. The same happens if, say, you add one to each number in the square, and forecast a sum of 18. Even mathematical magic tricks are best not repeated too often, not unless you have good stage patter.
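Here’s a short sketch that checks all of this. It takes the reflected square given above, generates its eight rotations and reflections, and confirms every row, column, and diagonal adds up the same. Then it tries the doubled and the add-one disguises. The function names are my own.

```python
def rotate(square):
    """Rotate a square (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*square[::-1])]


def reflect(square):
    """Mirror a square left to right."""
    return [row[::-1] for row in square]


def symmetries(square):
    """All eight rotations and reflections of a square."""
    result = []
    current = [list(row) for row in square]
    for _ in range(4):
        result.append(current)
        result.append(reflect(current))
        current = rotate(current)
    return result


def line_sums(square):
    """Sums of every row, every column, and both diagonals."""
    n = len(square)
    sums = [sum(row) for row in square]
    sums += [sum(col) for col in zip(*square)]
    sums.append(sum(square[i][i] for i in range(n)))
    sums.append(sum(square[i][n - 1 - i] for i in range(n)))
    return sums


SQUARE = [[4, 9, 2], [3, 5, 7], [8, 1, 6]]

if __name__ == "__main__":
    for variant in symmetries(SQUARE):
        print(variant, set(line_sums(variant)))
    doubled = [[2 * x for x in row] for row in SQUARE]
    plus_one = [[x + 1 for x in row] for row in SQUARE]
    print("doubled:", set(line_sums(doubled)))
    print("plus one:", set(line_sums(plus_one)))
```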
Mark Anderson’s Andertoons for the 20th is the Mark Anderson’s Andertoons for the week. Wavehead’s marveling at what seems at first like an asymmetry, about squares all being rhombuses yet rhombuses not all being squares. There are similar results with squares and rectangles. Still, it makes me notice something. Nobody would write a strip where the kid marvelled that all squares were polygons but not all polygons were squares. It seems that the rhombus connotes something different. This might just be familiarity. Polygons are … well, if not a common term, at least something anyone might feel familiar with. Rhombus is a more technical term. It maybe never quite gets familiar, not in the ways polygons do. And the defining feature of a rhombus, all four sides the same length, seems like the same thing that makes a square a square.