This is easy. The velocity is the first derivative of the position. First derivative with respect to time, if you must know. That hardly needed an extra week to write.
Yes, there’s more. There is always more. Velocity is important by itself. It’s also important for guiding us into new ideas. There are many. One idea is that it’s often the first good example of vectors. Many things can be vectors, as mathematicians see them. But the ones we think of most often are “some magnitude, in some direction”.
The position of things, in space, we describe with vectors. But somehow velocity, the changes of positions, seems more significant. I suspect we often find static things below our interest. I remember as a physics major that my Intro to Mechanics instructor skipped Statics altogether. There are many important things, like bridges and roofs and roller coaster supports, that we find interesting because they don’t move. But the real Intro to Mechanics is stuff in motion. Balls rolling down inclined planes. Pendulums. Blocks on springs. Also planets. (And bridges and roofs and roller coaster supports wouldn’t work if they didn’t move a bit. It’s not much though.)
So velocity shows us vectors. Anything could, in principle, be moving in any direction, with any speed. We can imagine a thing in motion inside a room that’s in motion, its net velocity being the sum of two vectors.
And they show us derivatives. A compelling answer to “what does differentiation mean?” is “it’s the rate at which something changes”. Properly, we can take the derivative of any quantity with respect to any variable. But there are some that make sense to do, and position with respect to time is one. Anyone who’s tried to catch a ball understands the interest in knowing.
We take derivatives with respect to time so often we have shorthands for it, by putting a ‘ mark after, or a dot above, the variable. So if x is the position (and it often is), then $\dot{x}$ is the velocity. If we want to emphasize we think of vectors, $\vec{x}$ is the position and $\dot{\vec{x}}$ the velocity.
Velocity has another common shorthand. This is $v$, or if we want to emphasize its vector nature, $\vec{v}$. Why a name besides the good enough $\dot{x}$? It helps us avoid misplacing a ‘ mark in our work, for one. And giving velocity a separate symbol encourages us to think of the velocity as independent from the position. It’s not — not exactly — independent. But knowing that a thing is in the lawn outside tells us nothing about how it’s moving. Velocity affects position, in a process so familiar we rarely consider how there’s parts we don’t understand about it. But velocity is also somehow free of the position at an instant.
Velocity also guides us into a first understanding of how to take derivatives. Thinking of the change in position over smaller and smaller time intervals gets us to the “instantaneous” velocity by doing only things we can imagine doing with a ruler and a stopwatch.
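The ruler-and-stopwatch idea can be tried out numerically. What follows is only a sketch: the `position` function is a made-up example (a dropped ball, ignoring air resistance) and `average_velocity` is a name chosen for this illustration.

```python
def position(t):
    """Hypothetical position, in metres, of a dropped ball t seconds after release."""
    return 4.9 * t * t


def average_velocity(x, t, dt):
    """Change in position over the interval from t to t + dt, divided by dt."""
    return (x(t + dt) - x(t)) / dt


if __name__ == "__main__":
    # Shrink the stopwatch interval and watch the estimates settle down.
    for dt in (1.0, 0.1, 0.01, 0.001):
        print(dt, average_velocity(position, 2.0, dt))
```

For this position function the exact derivative at t = 2 is 19.6, and the printed estimates creep toward it as the interval shrinks.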
Velocity has a velocity. $\dot{\vec{v}}$, also known as $\vec{a}$. Or, if we’re sure we won’t lose a ‘ mark, $\ddot{\vec{x}}$. Once we are comfortable thinking of how position changes in time we can think of other changes. Velocity’s change in time we call acceleration. This is also a vector, more abstract than position or velocity. Multiply the acceleration by the mass of the thing accelerating and we have a vector called the “force”. That, we at least feel we understand, and can work with.
Acceleration has a velocity too, a rate of change in time. It’s called the “jerk” by people telling you the change in acceleration in time is called the “jerk”. (I don’t see the term used in the wild, but admit my experience is limited.) And so on. We could, in principle, keep taking derivatives of the position and keep finding new changes. But most physics problems we find interesting use just a couple of derivatives of the position. We can label them, if we need, $\vec{x}^{(n)}$, where n is some big enough number like 4.
We can bundle them in interesting ways, though. Come back to that mention of treating position and velocity of something as though they were independent coordinates. It’s a useful perspective. Imagine the rules about how particles interact with one another and with their environment. These usually have explicit roles for position and velocity. (Granting this may reflect a selection bias. But these do cover enough interesting problems to fill a career.)
So we create a new vector. It’s made of the position and the velocity. We’d write it out as $\left[\vec{x}, \vec{v}\right]^T$. The superscript-T there, “transposition”, lets us use the tools of matrix algebra. This vector describes a point in phase space. Phase space is the collection of all the physically possible positions and velocities for the system.
What’s the derivative, in time, of this point in phase space? Glad to say we can do this piece by piece. The derivative of a vector is the derivative of each component of a vector. So the derivative of $\left[\vec{x}, \vec{v}\right]^T$ is $\left[\dot{\vec{x}}, \dot{\vec{v}}\right]^T$, or, $\left[\vec{v}, \vec{a}\right]^T$. This acceleration itself depends on, normally, the positions and velocities. So we can describe this as $\left[\vec{v}, f(\vec{x}, \vec{v})\right]^T$ for some function $f(\vec{x}, \vec{v})$. You are surely impressed with this symbol-shuffling. You are less sure why this bother.
The bother is a trick of ordinary differential equations. All differential equations are about how a function-to-be-determined and its derivatives relate to one another. In ordinary differential equations, the function-to-be-determined depends on a single variable. Usually it’s called x or t. There may be many derivatives of f. This symbol-shuffling rewriting takes away those higher-order derivatives. We rewrite the equation as a vector equation of just one order. There’s some point in phase space, and we know what its velocity is. That we do because in this form many problems can be written as a matrix problem: $\frac{d}{dt}\left[\vec{x}, \vec{v}\right]^T = A \left[\vec{x}, \vec{v}\right]^T$. Or approximate our problem as a matrix problem. This lets us bring in linear algebra tools, and that’s worthwhile.
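To make the rewriting concrete, here is a small sketch for one block on one spring, where the acceleration is minus (k/m) times the position. The value of k/m, the names `phase_velocity` and `evolve`, and the choice of the forward Euler method are all assumptions made for this example, not anything the essay prescribes.

```python
import math

OMEGA_SQUARED = 1.0  # assumed value of k/m for a block on a spring

# The matrix A in d/dt [x, v]^T = A [x, v]^T.
# First row says dx/dt = v; second row says dv/dt = -(k/m) x.
A = [[0.0, 1.0],
     [-OMEGA_SQUARED, 0.0]]


def phase_velocity(state):
    """Velocity of the point in phase space: the matrix A times [x, v]."""
    return [sum(A[i][j] * state[j] for j in range(2)) for i in range(2)]


def evolve(state, t_end, dt=1e-4):
    """Step the phase-space point forward in time with the forward Euler method."""
    steps = round(t_end / dt)
    for _ in range(steps):
        rate = phase_velocity(state)
        state = [state[i] + dt * rate[i] for i in range(2)]
    return state


if __name__ == "__main__":
    x, v = evolve([1.0, 0.0], 1.0)
    # The exact solution from this start is x = cos(t), v = -sin(t).
    print(x, math.cos(1.0))
    print(v, -math.sin(1.0))
```

The second-order equation never appears in the code. There is only a point [x, v] and a rule for its velocity. Forward Euler is the crudest stepping rule there is; it is here because it is short.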
It calls on a more abstract idea of what a “velocity” might be. We can explain what the thing that’s “moving” and what it’s moving through are, given time. But the instincts we develop from watching ordinary things move help us in these new territories. This is also a classic mathematician’s trick. It may seem like all mathematicians do is develop tricks to extend what they already do. I can’t say this is wrong.
I decided to let the V essay slide to Wednesday. This will make the end of the 2020 A-to-Z run a week later than I originally imagined, but that’s all right. It’ll all end in 2020 unless there’s another unexpected delay.
I have gotten several good suggestions for the letters W and X, but I’m still open to more, preferably for X. And I would like any thoughts anyone would like to share for the last letters of the alphabet. If you have an idea for a mathematical term starting with either letter, please let me know in comments. Also please let me know about any blogs or other projects you have, so that I can give them my modest boost with the essay. I’m open to revisiting topics I’ve already discussed, if I can think of something new to say or if I’ve forgotten I wrote about them already.
Topics I’ve already covered, starting with the letter ‘Y’, are:
I have accepted that this week, at least, I do not have it in me to write an A-to-Z essay. I’ll be back to it next week, I think. I don’t know whether I’ll publish my usual I-meant-this-to-be-800-words-and-it’s-three-times-that piece on Monday or on Wednesday, but it’ll be sometime next week. And, events personal and public allowing, I’ll continue weekly from there. Should still finish the essay series before 2020 finishes. I say this assuming that 2020 will in fact finish.
But now let me look back on a time when I could produce essays with an almost machine-like reliability, except for when I forgot to post them. My 2019 Mathematics A To Z: Versine is such an essay. The versine is a function that had a respectably long life in a niche of computational computing. Cheap electronic computers wiped out that niche. The reasons that niche ever existed, though, still apply, just to different problems. Knowing of past experiences can help us handle future problems.
I am not writing another duplicate essay. I intend to have an A-to-Z essay for the week. I just haven’t had the time or energy to write anything so complicated as an A-to-Z since the month began. Things are looking up, though, and I hope to have something presentable for Friday.
So let me just swap my publication slots around, and share an older essay, as I would have on Friday. My 2018 Mathematics A To Z: Volume was suggested by Ray Kassinger, of the popular web comic Housepets!, albeit as a Mystery Science Theater 3000 reference. It’s a great topic, though. It’s one of those things everyone instinctively understands. But making that instinct precise demands we accept some things that seem absurd. It’s a great example of what mathematics can do, given a chance.
In looking over past A-to-Z’s I notice a lot of my U- entries are the negation of something. Unknots, for example. Or unbounded. English makes this construction hard to avoid. Any interesting property is also interesting when it’s absent. But there are also mathematical terms that start with a U on their own terms. The Summer 2017 Mathematics A To Z: Ulam’s Spiral shows off one of them. Stanislaw Ulam’s spiral is one of those things we find as a curious graphical adjunct to prime numbers. The essay also features one of my many pieces in praise of boredom.
I assume that I disappointed Mr Wu, of the Singapore Maths Tuition blog, last week when I passed on a topic he suggested to unintentionally rewrite a good enough essay. I hope to make it up this week with a piece of linear algebra.
A Unitary Matrix — note the article; there is not a singular the Unitary Matrix — starts with a matrix. This is an ordered collection of scalars. The scalars we call elements. I can’t think of a time I ever saw a matrix represented except as a rectangular grid of elements, or as a capital letter for the name of a matrix. Or a block inside a matrix. In principle the elements can be anything. In practice, they’re almost always either real numbers or complex numbers. To speak of Unitary Matrixes invokes complex-valued numbers. If a matrix that would be Unitary has only real-valued elements, we call that an Orthogonal Matrix. It’s not wrong to call an Orthogonal matrix “Unitary”. It’s like pointing to a known square, though, and calling it a parallelogram. Your audience will grant that’s true. But it will wonder what you’re getting at, unless you’re talking about a bunch of parallelograms and some of them happen to be squares.
As with polygons, though, there are many names for particular kinds of matrices. The flurry of them settles down on the Intro to Linear Algebra student and it takes three or four courses before most of them feel like familiar names. I will try to keep the flurry clear. First, we’re talking about square matrices, ones with the same number of rows as columns.
Start with any old square matrix. Give it the name U because you see where this is going. There are a couple of new matrices we can derive from it. One of them is the complex conjugate. This is the matrix you get by taking the complex conjugate of every term. So, if one element is $a + b\imath$, in the complex conjugate, that element would be $a - b\imath$. Reverse the plus or minus sign of the imaginary component. The shorthand for “the complex conjugate to matrix U” is $U^*$. Also we’ll often just say “the conjugate”, taking the “complex” part as implied.
Start back with any old square matrix, again called U. Another thing you can do with it is take the transposition. This matrix, U-transpose, you get by keeping the order of elements but changing rows and columns. That is, the elements in the first row become the elements in the first column. The elements in the second row become the elements in the second column. Third row becomes the third column, and so on. The diagonal — first row, first column; second row, second column; third row, third column; and so on — stays where it was. The shorthand for “the transposition of U” is $U^T$.
You can chain these together. If you start with U and take both its complex-conjugate and its transposition, you get the adjoint. We write that with a little dagger: $U^\dagger$. For a wonder, as matrices go, it doesn’t matter whether you take the transpose or the conjugate first. It’s the same $U^\dagger$. You may ask how people writing this out by hand never mistake $U^T$ for $U^\dagger$. This is a good question and I hope to have an answer someday. (I would write it as in my notes.)
And the last thing you can maybe do with a square matrix is take its inverse. This is like taking the reciprocal of a number. When you multiply a matrix by its inverse, you get the Identity Matrix. Not every matrix has an inverse, though. It’s worse than real numbers, where only zero doesn’t have a reciprocal. You can have a matrix that isn’t all zeroes and that doesn’t have an inverse. This is part of why linear algebra mathematicians command the big money. But if a matrix U has an inverse, we write that inverse as $U^{-1}$.
The Identity Matrix is one of a family of square matrices. Every element in an identity matrix is zero, except on the diagonal. That is, the element at row one, column one, is the number 1. The element at row two, column two is also the number 1. Same with row three, column three: another one. And so on. This is the “identity” matrix because it works like the multiplicative identity. Pick any matrix you like, and multiply it by the identity matrix; you get the original matrix right back. We use the name $I$ for an identity matrix. If we have to be clear how many rows and columns the matrix has, we write that as a subscript: $I_2$ or $I_3$ or $I_N$ or so on.
So this, finally, lets me say what a Unitary Matrix is. It’s any square matrix U where the adjoint, $U^\dagger$, is the same matrix as the inverse, $U^{-1}$. It’s wonderful to learn you have a Unitary Matrix. Not just because, most of the time, finding the inverse of a matrix is a long and tedious procedure. Here? You have to write the elements in a different order and change the plus-or-minus sign on the imaginary numbers. The only way it would be easier is if you had only real numbers, and didn’t have to take the conjugates.
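The definition is easy to test by machine: take the adjoint, multiply it by the original, and see whether the identity matrix comes out. This is a sketch in plain Python. The helper names (`adjoint`, `multiply`, `is_unitary`) and the sample matrix are invented for this illustration.

```python
def adjoint(m):
    """Conjugate transpose: swap rows and columns, and conjugate every element."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def multiply(a, b):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_unitary(m, tol=1e-12):
    """True if the adjoint times m is the identity matrix, up to rounding error."""
    n = len(m)
    product = multiply(adjoint(m), m)
    return all(abs(product[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))


S = 2 ** -0.5  # one over the square root of two

# A sample matrix with complex elements, picked only for illustration.
SAMPLE = [[S, S * 1j],
          [S * 1j, S]]

if __name__ == "__main__":
    print(is_unitary(SAMPLE))            # the adjoint really is the inverse here
    print(is_unitary([[1, 1], [0, 1]]))  # invertible, but not unitary
```

The second matrix has an inverse, but its adjoint isn’t it, so the test fails there.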
That’s all a nice heap of terms. What makes any of them important, other than so Intro to Linear Algebra professors can test their students?
Well, you know mathematicians. If we like something like this, it’s usually because it holds out the prospect of turning hard problems into easier ones. So it is. Start out with any old matrix. Call it A. Then there exist some unitary matrixes, call them U and V. And their product does something wonderful: $UAV$ is a “diagonal” matrix. A diagonal matrix has zeroes for every element except the diagonal ones. That is, row one, column one; row two, column two; row three, column three; and so on. The elements that trace a path from the upper-left to the lower-right corner of the matrix. (The diagonal from the upper-right to the lower-left we have nothing to do with.) Everything we might do with matrices is easier on a diagonal matrix. So we process our matrix A into this diagonal matrix D. Process it by whatever the heck we’re doing. If we then multiply this by the inverses of U and V? If we calculate $U^{-1}DV^{-1}$? We get whatever our process would have given us had we done it to A. And, since U and V are unitary matrices, it’s easy to find these inverses. Wonderful!
Also this sounds like I just said Unitary Matrixes are great because they solve a problem you never heard of before.
The 20th Century’s first great use for Unitary Matrixes, and I imagine the impulse for Mr Wu’s suggestion, was quantum mechanics. (A later use would be data compression.) Unitary Matrixes help us calculate how quantum systems evolve. This should be a little easier to understand if I use a simple physics problem as demonstration.
So imagine three blocks, all the same mass. They’re connected in a row, left to right. There’s two springs, one between the left and the center mass, one between the center and the right mass. The springs have the same strength. The blocks can only move left-to-right. But, within those bounds, you can do anything you like with the blocks. Move them wherever you like and let go. Let them go with a kick moving to the left or the right. The only restraint is they can’t pass through one another; you can’t slide the center block to the right of the right block.
This is not quantum mechanics, by the way. But it’s not far, either. You can turn this into a fine toy of a molecule. For now, though, think of it as a toy. What can you do with it?
A bunch of things, but there’s two really distinct ways these blocks can move. These are the ways the blocks would move if you just hit it with some energy and let the system do what felt natural. One is to have the center block stay right where it is, and the left and right blocks swinging out and in. We know they’ll swing symmetrically, the left block going as far to the left as the right block goes to the right. But all these symmetric oscillations look about the same. They’re one mode.
The other is … not quite antisymmetric. In this mode, the center block moves in one direction and the outer blocks move in the other, just enough to keep momentum conserved. Eventually the center block switches direction and swings the other way. But the outer blocks switch direction and swing the other way too. If you’re having trouble imagining this, imagine looking at it from the outer blocks’ point of view. To them, it’s just the center block wobbling back and forth. That’s the other mode.
And it turns out? It doesn’t matter how you started these blocks moving. The movement looks like a combination of the symmetric and the not-quite-antisymmetric modes. So if you know how the symmetric mode evolves, and how the not-quite-antisymmetric mode evolves? Then you know how every possible arrangement of this system evolves.
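The two modes can be checked with a little arithmetic. With equal masses and equal springs, the accelerations of the three blocks are minus (k/m) times a matrix times their displacements. Each mode is a set of displacements that this matrix only rescales. The sketch below assumes that setup. The names are made up for the example, and it adds a third, duller motion the text leaves out: all three blocks drifting together without stretching either spring.

```python
# Spring-coupling matrix for three equal blocks joined by two equal springs.
STIFFNESS = [[1, -1, 0],
             [-1, 2, -1],
             [0, -1, 1]]

# Displacements of the (left, center, right) blocks in each kind of motion.
MODES = {
    "drift": (1, 1, 1),             # everything slides together
    "symmetric": (1, 0, -1),        # center still, outer blocks out and in
    "center_vs_outer": (1, -2, 1),  # center one way, outer blocks the other
}


def apply(matrix, vector):
    """Multiply a matrix by a vector."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def dot(a, b):
    """Ordinary dot product."""
    return sum(p * q for p, q in zip(a, b))


def decompose(displacement):
    """How much of each mode is in a displacement (the modes are orthogonal)."""
    return {name: dot(displacement, mode) / dot(mode, mode)
            for name, mode in MODES.items()}


def recombine(weights):
    """Rebuild the displacement from its mode weights."""
    return [sum(weights[name] * mode[i] for name, mode in MODES.items())
            for i in range(3)]


if __name__ == "__main__":
    for name, mode in MODES.items():
        print(name, mode, apply(STIFFNESS, mode))
    start = [0.3, 0.0, -0.1]  # an arbitrary way to pull the blocks about
    print(decompose(start), recombine(decompose(start)))
```

In the second mode the displacements add up to zero, which is the momentum bookkeeping mentioned above.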
So here’s where we get to quantum mechanics. Suppose we know the quantum mechanics description of a system at some time. This we can do as a vector. And we know the Hamiltonian, the description of all the potential and kinetic energy, for how the system evolves. The evolution in time of our quantum mechanics description we can see as a unitary matrix multiplied by this vector.
The Hamiltonian, by itself, won’t (normally) be a Unitary Matrix. It gets the boring name H. It’ll be some complicated messy thing. But perhaps we can find a Unitary Matrix U, so that $U H U^\dagger$ is a diagonal matrix. And then that’s great. The original H is hard to work with. The diagonalized version? That one we can almost always work with. And then we can go from solutions on the diagonalized version back to solutions on the original. (If the function $\Psi$ describes the evolution of $U H U^\dagger$, then $U^\dagger \Psi$ describes the evolution of $H$.) The work that U (and $U^\dagger$) does to H is basically what we did with that three-block, two-spring model. It’s picking out the modes, and letting us figure out their behavior. Then put that together to work out the behavior of what we’re interested in.
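A tiny worked case may help. The two-by-two H below is a made-up stand-in for a Hamiltonian, not one taken from any particular physical system, and the U is one unitary matrix that happens to diagonalize it. The sketch sandwiches H between U and its adjoint, then undoes the sandwich to recover H.

```python
S = 2 ** -0.5  # one over the square root of two

# Hypothetical two-state "Hamiltonian", chosen only for illustration.
H = [[0, -1j],
     [1j, 0]]

# A unitary matrix whose rows are the conjugated eigenvectors of H.
U = [[S, -S * 1j],
     [S, S * 1j]]


def adjoint(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]


def multiply(a, b):
    """Ordinary matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# The diagonalized version: U times H times the adjoint of U.
D = multiply(multiply(U, H), adjoint(U))

# Going back: the adjoint of U times D times U returns the original H.
H_AGAIN = multiply(multiply(adjoint(U), D), U)

if __name__ == "__main__":
    print(D)
    print(H_AGAIN)
```

Up to rounding, D has 1 and -1 on its diagonal and zeroes elsewhere. Finding U for a messy Hamiltonian is the hard part; this sketch just takes one as given.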
There are other uses, besides time-evolution. For instance, an important part of quantum mechanics and thermodynamics is that we can swap particles of the same type. Like, there’s no telling an electron that’s on your nose from an electron that’s in one of the reflective mirrors the Apollo astronauts left on the Moon. If they swapped positions, somehow, we wouldn’t know. It’s important for calculating things like entropy that we consider this possibility. Two particles swapping positions is a permutation. We can describe that as multiplying the vector that describes what every electron on the Earth and Moon is doing by a Unitary Matrix. Here it’s a matrix that does nothing but swap the descriptions of these two electrons. I concede this doesn’t sound thrilling. But anything that goes into calculating entropy is first-rank important.
As with time-evolution and with permutation, though, any symmetry matches a Unitary Matrix. This includes obvious things like reflecting across a plane. But it also covers, like, being displaced a set distance. And some outright obscure symmetries too, such as the phase of the state function $\Psi$. I don’t have a good way to describe what this is, physically; we can’t observe it directly. This symmetry, though, manifests as the conservation of electric charge, a thing we rather like.
This, then, is the sort of problem that draws Unitary Matrixes to our attention.
I’m still only doing short reviews of my readership figures. These are nice easy posts to make, and strangely popular, but they do take time and I’m never sure why people find them interesting. I think it’s all from other bloggers, happy to know how much better their blogs are doing.
Granted that: I had, for me, a really well-read month. According to WordPress, there were 3,043 pages viewed here in October 2020. This is way above the twelve-month running average of 2,381.5 views per month. Also this is the second-largest number of page views I’ve gotten since October 2019. That month, too, was part of an A-to-Z sequence. I wrote something that got referenced on some actually popular web site, though, last year. This year, all I can figure is spillover of people on my other blog wanting to know what’s going on with Mark Trail.
(If you read any web site that regularly talks about Mark Trail, poke around the comments. There’s people upset about the new artist. It’s not my intention to mock them; anything you like changing out from under you is upsetting. But it is soothing to see people worrying about, ultimately, a guy who punches smugglers while giant squirrels talk. On my other blog I plan to have a full plot recap of that in about two weeks.)
There were more unique visitors in October 2020 than any other month besides October 2019, also. WordPress recorded 2,161 unique visitors, well above the twelve-month running average of 1,644.2. It’s much the same for interactions as well: 79 things were liked, compared to the running average of 59.8, and 18 comments, above the 17.1 running average.
October was another month of 18 posts, and I have a running average of 17.6 posts per month now. I’m surprised by that too. I feel like any month that isn’t an A-to-Z sequence I have twelve posts, but there we go. This all means the per-post October averages were above the per-post running averages.
What were the most popular recent posts? Here recent means “from September or October”? That I’m glad to share:
All told, in October I published 12,937 words, down a bit from September. This was an average of 718.7 words per posting in October, which still brings my year-to-date average post length up to 697 words. It had been 694 at the start of October.
As of the start of November I’ve published 1,554 posts here. They’ve gathered 116,811 page views. I like how nearly but not quite palindromic that number is. It even almost but not quite stays the same under a 180 degree rotation. These pages overall have drawn 66,030 logged unique visitors.
I know, it’s strange for me to not post another piece about tiling. But My 2019 Mathematics A To Z: Taylor Series is going to be a good utility essay, useful for a long while to come. Taylor Series represent one of the standard mathematician tricks. This is to rewrite a thing we want to do as a sum of things it’s easy to do. This can make our problem into a long series of little problems. But the advantage is we know what to do with all those little problems. It’s often a worthwhile trade.
I’ve always held out the option that I would revisit a topic sometime. I thought it would most likely be taking some essay from one of my earliest A-to-Z’s where, with a half-decade’s more experience in pop mathematics writing, I could do much better. And at the request of someone who felt that, like, my piece on duals was foggy. It is, but nobody’s ever cared enough about duals to say anything.
So I went looking at what previous T topics I’d written about here. Usually I pick them the Sunday or Monday of a week, since that’s easy to do. This week, I didn’t have the time until Thursday when I looked and found I wrote up “Tiling” for the 2018 A-to-Z. In about November of that year, too. And after casting aside a suggestion from Mr Wu of the Singapore Maths Tuition blog, although that time at least I was responding to a specific topic suggestion. 2020, you know?
Well, now that the deed is done, I can see what I learned from it anyway. First is picking out the archive pieces before I write the week’s essay. Second is how my approach differed in the 2020 essay. The broad picture is similar enough. The most interesting differences are that in the 2020 essay I look at more specifics. Like, just when Robert Berger found his aperiodic tiling of the plane. And what the Wang Tiles are that he found them with. Or, a very brief sketch of how to show Penrose (rhomboid) tiling is aperiodic. This changes the shape of the essay. Also it makes the essay longer, but that might also reflect that in 2018 I was publishing two essays a week. This year I’m doing one, and somehow still putting out as many words per week.
I like the greater focus on specifics, although that might just reflect that I’m usually happiest with something I just wrote. As I get distance from it, I come to feel the whole thing’s so bad as to be humiliating. When it’s far enough in the past, usually, I come around again and feel it’s pretty good, and maybe that I don’t know how to write like that anymore. The 2018 essay is, to me, only embarrassing in stuff that I glossed over that in 2020 I made specific. Not to worry, though. I still get foggy and elliptical about important topics anyway.
Mr Wu, author of the Singapore Maths Tuition blog, had an interesting suggestion for the letter T: Talent. As in mathematical talent. It’s a fine topic but, in the end, too far beyond my skills. I could share some of the legends about mathematical talent I’ve received. But what that says about the culture of mathematicians is a deeper and more important question.
So I picked my own topic for the week. I do have topics for next week — U — and the week after — V — chosen. But the letters W and X? I’m still open to suggestions. I’m open to creative or wild-card interpretations of the letters. Especially for X and (soon) Z. Thanks for sharing any thoughts you care to.
Think of a floor. Imagine you are bored. What do you notice?
What I hope you notice is that it is covered. Perhaps by carpet, or concrete, or something homogeneous like that. Let’s ignore that. My floor is covered in small pieces, repeated. My dining room floor is slats of wood, about three and a half feet long and two inches wide. The slats are offset from the neighbors so there’s a pleasant strong line in one direction and stippled lines in the other. The kitchen is squares, one foot on each side. This is a grid we could plot high school algebra functions on. The bathroom is more elaborate. It has white rectangles about two inches long, tan rectangles about two inches long, and black squares. Each rectangle is perpendicular to ones of the other color, and arranged to bisect those. The black squares fill the gaps where no rectangle would fit.
Move from my house to pure mathematics. It’s easy to turn the floor of a room into abstract mathematics. We start with something to tile. Usually this is the infinite, two-dimensional plane. The thing you get if you have a house and forget the walls. Sometimes we look to tile the hyperbolic plane, a different geometry that we of course represent with a finite circle. (Setting particular rules about how to measure distance makes this equivalent to a funny-shaped plane.) Or the surface of a sphere, or of a torus, or something like that. But if we don’t say otherwise, it’s the plane.
What to cover it with? … Smaller shapes. We have a mathematical tiling if we have a collection of not-overlapping open sets. And if those open sets, plus their boundaries, cover the whole plane. “Cover” here means what “cover” means in English, only using more technical words. These sets — these tiles — can be any shape. We can have as many or as few of them as we like. We can even add markings to the tiles, give them colors or patterns or such, to add variety to the puzzles.
(And if we want, we can do this in other dimensions. There are good “tiling” questions to ask about how to fill a three-dimensional space, or a four-dimensional one, or more.)
Having an unlimited collection of tiles is nice. But mathematicians learn to look for how little we need to do something. Here, we look for the smallest number of distinct shapes. As with tiling an actual floor, we can get all the tiles we need. We can rotate them, too, to any angle. We can flip them over and put the “top” side “down”, something kitchen tiles won’t let us do. Can we reflect them? Use the shape we’d get looking at the mirror image of one? That’s up to whoever’s writing this paper.
What shapes will work? Well, squares, for one. We can prove that by looking at a sheet of graph paper. Rectangles would work too. We can see that by drawing boxes around the squares on our graph paper. Two-by-one blocks, three-by-two blocks, 40-by-1 blocks, these all still cover the paper and we can imagine covering the plane. If we like, we can draw two-by-two squares. Squares made up of smaller squares. Or repeat this: draw two-by-one rectangles, and then group two of these rectangles together to make two-by-two squares.
We can take it on faith that, oh, rectangles π long by e wide would cover the plane too. These can all line up in rows and columns, the way our squares would. Or we can stagger them, like bricks or my dining room’s wood slats are.
How about parallelograms? Those, it turns out, tile exactly as well as rectangles or squares do. Grids or staggered, too. Ah, but how about trapezoids? Surely they won’t tile anything. Not generally, anyway. The slanted sides will, most of the time, only fit in weird winding circle-like paths.
Unless … take two of these trapezoid tiles. We’ll set them down so the parallel sides run horizontally in front of you. Rotate one of them, though, 180 degrees. And try setting them — let’s say so the longer slanted line of both trapezoids meet, edge to edge. These two trapezoids come together. They make a parallelogram, although one with a slash through it. And we can tile parallelograms, whether or not they have a slash.
OK, but if you draw some weird quadrilateral shape, and it’s not anything that has a more specific name than “quadrilateral”? That won’t tile the plane, will it?
It will! In one of those turns that surprises and impresses me every time I run across it again, any quadrilateral can tile the plane. It opens up so many home decorating options, if you get in good with a tile maker.
That’s some good news for quadrilateral tiles. How about other shapes? Triangles, for example? Well, that’s good news too. Take two of any identical triangle you like. Turn one of them around and match sides of the same length. The two triangles, bundled together like that, are a quadrilateral. And we can use any quadrilateral to tile the plane, so we’re done.
How about pentagons? … With pentagons, the easy times stop. It turns out not every pentagon will tile the plane. The pentagon has to be of the right kind to make it fit. If the pentagon is in one of these kinds, it can tile the plane. If not, not. There are fifteen families of tiling known. The most recent family was discovered in 2015. It’s thought that there are no other convex pentagon tilings. I don’t know whether the proof of that is generally accepted in tiling circles. And we can do more tilings if the pentagon doesn’t need to be convex. For example, we can cut any parallelogram into two identical pentagons. So we can make as many pentagons as we want to cover the plane. But we can’t assume any pentagon we like will do it.
And there the good times end. There are no convex heptagons or octagons or any other shape with more sides that tile the plane.
Not by themselves, anyway. If we have more than one tile shape we can start doing fine things again. Octagons assisted by squares, for example, will tile the plane. I’ve lived places with that tiling. Or something that looks like it. It’s easier to install if you have square tiles with an octagon pattern making up the center, and triangle corners a different color. These squares come together to look like octagons and squares.
And this leads to a fun avenue of tiling. Hao Wang, in the early 60s, proposed a sort of domino-like tiling. You may have seen these in mathematics puzzles, or in toys. Each of these Wang Tiles, or Wang Dominoes, is a square. But the square is cut along the diagonals, into four quadrants. Each quadrant is a right triangle. Each quadrant, each triangle, is one of a finite set of colors. Adjacent triangles can have the same color. You can place down tiles, subject only to the rule that the tile edge has to have the same color on both sides. So a tile with a blue right-quadrant has to have on its right a tile with a blue left-quadrant. A tile with a white upper-quadrant on its top has, above it, a tile with a white lower-quadrant.
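The matching rule is simple enough to check by machine. Here is a minimal sketch, assuming each tile is recorded as four edge colors and a patch of tiling is a rectangular list of rows. The names WangTile and is_valid_tiling, and the four sample tiles, are mine for illustration. They are not from Wang's paper or anyone's library.

```python
from typing import NamedTuple


class WangTile(NamedTuple):
    """One square tile, described by the color of each of its four quadrants."""
    top: str
    right: str
    bottom: str
    left: str


def is_valid_tiling(grid):
    """Return True if every pair of adjacent tiles agrees on the shared edge.

    Assumes `grid` is a rectangular list of rows of WangTile values.
    """
    for r, row in enumerate(grid):
        for c, tile in enumerate(row):
            # A blue right-quadrant needs a blue left-quadrant beside it.
            if c + 1 < len(row) and tile.right != row[c + 1].left:
                return False
            # The lower quadrant must match the upper quadrant of the tile below.
            if r + 1 < len(grid) and tile.bottom != grid[r + 1][c].top:
                return False
    return True


# Four made-up tiles, chosen only so that one patch works and one does not.
a = WangTile(top="white", right="blue", bottom="red", left="green")
b = WangTile(top="white", right="green", bottom="red", left="blue")
c = WangTile(top="red", right="blue", bottom="white", left="green")
d = WangTile(top="red", right="green", bottom="white", left="blue")

good = [[a, b], [c, d]]   # every shared edge matches
bad = [[a, a], [c, d]]    # a's blue right edge meets a's green left edge

print(is_valid_tiling(good), is_valid_tiling(bad))
```

This only checks a finite patch. Whether a set of tiles can cover the whole plane is the hard question, and no loop like this settles it.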
In 1961 Wang conjectured that if a finite set of these tiles will tile the plane, then there must be a periodic tiling. That is, if you picked up the plane and slid it a set horizontal and vertical distance, it would all look the same again. This sort of translation is common. All my floors do that. If we ignore things like the bounds of their rooms, or the flaws in their manufacture or installation or where a tile broke in some mishap.
This is not to say you couldn’t arrange them aperiodically. You don’t even need Wang Tiles for that. Get two colors of square tile, a white and a black, and lay them down based on whether the next decimal digit of π is odd or even. No; Wang’s conjecture was that if you had tiles that you could lay down aperiodically, then you could also arrange them to set down periodically. With the black and white squares, lay down alternate colors. That’s easy.
In 1964, Robert Berger proved Wang’s conjecture was false. He found a collection of Wang Tiles that could only tile the plane aperiodically. In 1966 he published this in the Memoirs of the American Mathematical Society. The 1964 proof was for his thesis. 1966 was its general publication. I mention this because while doing research I got irritated at how different sources dated this to 1964, 1966, or sometimes 1961. I want to have this straightened out. It appears Berger had the proof in 1964 and the publication in 1966.
I would like to share details of Berger’s proof, but haven’t got access to the paper. What fascinates me about this is that Berger’s proof used a set of 20,426 different tiles. I assume he did not work this all out with shards of construction paper, but then, how to get 20,426 of anything? With computer time as expensive as it was in 1964? The mystery of how he got all these tiles is worth an essay of its own and I regret I can’t write it.
Berger conjectured that a smaller set might do. Quite so. He himself reduced the set to 104 tiles. Donald Knuth in 1968 modified the set down to 92 tiles. In 2015 Emmanuel Jeandel and Michael Rao published a set of 11 tiles, using four colors. And showed by computer search that a smaller set of tiles, or fewer colors, would not force some aperiodic tiling to exist. I do not know whether there might be other sets of 11, four-colored, tiles that work. You can see the set at the top of Wikipedia’s page on Wang Tiles.
These Wang Tiles, all squares, inspired variant questions. Could there be other shapes that only aperiodically tile the plane? What if they don’t have to be squares? Raphael Robinson, in 1971, came up with a tiling using six shapes. The shapes have patterns on them too, usually represented as colored lines. Tiles can be put down only in ways that fit and that make the lines match up.
Among my readers are people who have been waiting, for 1800 words now, for Roger Penrose. It’s now that time. In 1974 Penrose published an aperiodic tiling, one based on pentagons and using a set of six tiles. You’ve never heard of that either, because soon after he found a different set, based on a quadrilateral cut into two shapes. The shapes, as with Wang Tiles or Robinson’s tiling, have rules about what edges may be put against each other. Penrose (and, independently, Robert Ammann) also developed another set, this based on a pair of rhombuses. These have rules about what edges may touch one another, and have patterns on them which must line up.
To show that the rhombus-based Penrose tiling is aperiodic takes some arguing. But it uses tools already used in this essay. Remember drawing rectangles around several squares? And then drawing squares around several of these rectangles? We can do that with these Penrose-Ammann rhombuses. From the rhombus tiling we can draw bigger rhombuses. Ones which, it turns out, follow the same edge rules that the originals do. So that we can go again, grouping these bigger rhombuses into even-bigger rhombuses. And into even-even-bigger rhombuses. And so on.
What this gets us is this: suppose the rhombus tiling is periodic. Then there’s some finite-distance horizontal-and-vertical move that leaves the pattern unchanged. So, the same finite-distance move has to leave the bigger-rhombus pattern unchanged. And this same finite-distance move has to leave the even-bigger-rhombus pattern unchanged. Also the even-even-bigger pattern unchanged.
Keep bundling rhombuses together. You get eventually-big-enough-rhombuses. Now, think of how far you have to move the tiles to get a repeat pattern. Especially, think how many eventually-big-enough-rhombuses it is. This distance, the move you have to make, is less than one eventually-big-enough rhombus. (If it’s not you aren’t eventually-big-enough yet. Bundle them together again.) And that doesn’t work. Moving one tile over without changing the pattern makes sense. Moving one-half a tile over? That doesn’t. So the eventually-big-enough pattern can’t be periodic, and so, the original pattern can’t be either. This is explained in graphic detail in a nice Powerpoint slide set from Professor Alexander F Ritter, A Tour Of Tilings In Thirty Minutes.
It’s possible to do better. In 2010 Joshua E S Socolar and Joan M Taylor published a single tile that can force an aperiodic tiling. As with the Wang Tiles, and Robinson shapes, and the Penrose-Ammann rhombuses, markings are part of it. They have to line up so that the markings — in two colors, in the renditions I’ve seen — make sense. With the Penrose tilings, you can get away from the pattern rules for the edges by replacing them with little notches. The Socolar-Taylor shape can make a similar trade. Here the rules are complex enough that it would need to be a three-dimensional shape, one that looks like the dilithium housing of the warp core. You can see the tile — in colored, marked form, and also in three-dimensional tile shape — at the PDF here. It’s likely not coming to the flooring store soon.
It’s all wonderful, but is it useful? I could go on a few hundred words about, particularly, crystals and quasicrystals. These are important for materials science. Especially these days as we have harnessed slightly-imperfect crystals to be our computers. I don’t care. These are lovely to look at. If you see nothing appealing in a great heap of colors and polygons spread over the floor there are things we cannot communicate about. Tiling is a delight; what more do you need?
It feels to me like I did a lot of functional analysis terms in the Leap Day 2016 series. Its essay for the letter ‘S’, Surjective Map, is one of them. We have many ways of dividing up the kinds of functions we have. One of them is in how they use their range. A function has a set called the domain, and a set called the range, and they might be the same set, yes. The function pairs things in the domain with things in the range. Everything in the domain has to pair with something in the range. But we allow having things in the range that aren’t paired to anything in the domain. So we have jargon that tells us, quickly, whether there are unmatched pieces in the range.
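For a function on a small finite set, we can check for unmatched pieces directly. This is a minimal sketch, assuming the function is stored as a dictionary from domain elements to range elements. The name is_surjective and the squaring example are my own illustration.

```python
def is_surjective(mapping, target):
    """True if every element of `target` is paired with something in the domain.

    Assumes `mapping` is a dict whose keys are the domain and whose values
    all lie in `target` (the set this essay calls the range).
    """
    return set(target) <= set(mapping.values())


# Squaring, on the domain {-2, -1, 0, 1, 2}.
squares = {x: x * x for x in range(-2, 3)}

print(is_surjective(squares, {0, 1, 4}))         # nothing in the range is left out
print(is_surjective(squares, {0, 1, 2, 3, 4}))   # 2 and 3 are never hit
```

The same rule, squaring, is surjective or not depending on what set we declare the range to be. That is why the jargon is about the function and its range together.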
Sometimes I write an essay and know it’s something I’m going to refer back to a lot. Sometimes I know it’s just going to sink without trace. Often these deserve it; the subject is something particular and not well-connected to other topics. Sometimes, one sinks without a trace and for not much good reason. Smooth is one of those curiously sunk pieces. It’s about a concept important to analysis. And also a piece that shows my obsession with pointing out cultural factors in mathematics: we care about ‘smooth’ because we’ve found it a useful thing to highlight. And yet it’s gotten no comments, only an average number of likes, and I don’t seem to have linked back to it in any essays where it might be useful. I may have forgotten I wrote the thing. So here’s a referral that maybe will help me remember I have it on hand, ready for future use.
I owe Mr Wu, author of the Singapore Maths Tuition blog, thanks for another topic for this A-to-Z. Statistics is a big field of mathematics, and so I won’t try to give you a course’s worth in 1500 words. But I have to start with a question. I seem to have ended at two thousand words.
Is statistics mathematics?
The answer seems obvious at first. Look at a statistics textbook. It’s full of algebra. And graphs of great sloped mounds. There’s tables full of four-digit numbers in back. The first couple chapters are about probability. They’re full of questions about rolling dice and dealing cards and guessing whether the sibling who just entered is the younger.
Thinking of the field’s history, though, and its use, tells us more. Some of the earliest work we now recognize as statistics was Arab mathematicians deciphering messages. This cryptanalysis is the observation that (in English) a three-letter word is very likely to be ‘the’, mildly likely to be ‘one’, and not likely to be ‘pyx’. A more modern forerunner is the Republic of Venice supposedly calculating that war with Milan would not be worth the winning. Or the gatherings of mortality tables, recording how many people of what age can be expected to die any year, and what from. (Mortality tables are another of Edmond Halley’s claims to fame, though it won’t displace his comet work.) Florence Nightingale’s charts explaining how more soldiers die of disease than in fighting the Crimean War. William Sealy Gosset sharing sample-testing methods developed at the Guinness brewery.
You see a difference in kind to a mathematical question like finding a square with the same area as this trapezoid. It’s not that mathematics is not practical; it’s always been. And it’s not that statistics lacks abstraction and pure mathematics content. But statistics wears practicality in a way that number theory won’t.
Practical about what? History and etymology tip us off. The early uses of things we now see as statistics are about things of interest to the State. Decoding messages. Counting the population. Following — in the study of annuities — the flow of money between peoples. With the industrial revolution, statistics sneaks into the factory. To have an economy of scale you need a reliable product. How do you know whether the product is reliable, without testing every piece? How can you test every beer brewed without drinking it all?
One great leg of statistics — it’s tempting to call it the first leg, but the history is not so neat as to make that work — is descriptive. This gives us things like mean and median and mode and standard deviation and quartiles and quintiles. These try to let us represent more data than we can really understand in a few words. We lose information in doing so. But if we are careful to remember the difference between the descriptive statistics we have and the original population? (nb, a word of the State) We might not do ourselves much harm.
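To see how much a few descriptive numbers compress, here is a small sketch using Python's standard statistics module. The ten data values are made up for illustration. I treat them as a whole population, which is why the sketch uses pstdev rather than the sample standard deviation.

```python
import statistics

# Ten made-up observations, treated here as the entire population.
data = [2, 3, 3, 5, 7, 8, 9, 12, 13, 21]

summary = {
    "mean": statistics.mean(data),
    "median": statistics.median(data),
    "mode": statistics.mode(data),
    # Population standard deviation; use statistics.stdev for a sample.
    "stdev": statistics.pstdev(data),
    # Three cut points split the data into four groups (quartiles).
    "quartiles": statistics.quantiles(data, n=4),
    # Four cut points split the data into five groups (quintiles).
    "quintiles": statistics.quantiles(data, n=5),
}

for name, value in summary.items():
    print(name, value)
```

Ten numbers became a handful. Nothing in the summary tells you there was a 21 in there, which is the information we lose.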
Another great leg is inferential statistics. This uses tools with names like z-score and the Student t distribution. And it talks about things like p-values and confidence intervals. Terms like correlation and regression. This is about looking for causes in complex scenarios. We want to believe there is a cause to, say, a person’s lung cancer. But there is no tracking down what that is; there are too many things that could start a cancer, and too many of them will go unobserved. But we can notice that people who smoke have lung cancer more often than those who don’t. We can’t say why a person recovered from the influenza in five days. But we can say people who were vaccinated got fewer influenzas, and ones that passed quicker, than those who did not. We can get the dire warning that “correlation is not causation”, uttered by people who don’t like what the correlation suggests may be a cause.
Also by people being honest, though. In the 1980s geologists wondered if the sun might have a not-yet-noticed companion star. Its orbit would explain an apparent periodicity in meteor bombardments of the Earth. But completely random bombardments would produce apparent periodicity sometimes. It’s much the same way trees in a forest will sometimes seem to line up. Or imagine finding there is a neighborhood in your city with a high number of arrests. Is this because it has the highest rate of street crime? Or is the rate of street crime the same as any other spot and there are simply more cops here? But then why are there more cops to be found here? Perhaps they’re attracted by the neighborhood’s reputation for high crime. It is difficult to see through randomness, to untangle complex causes, and to root out biases.
The tools of statistics, as we recognize them, largely came together in the 19th and early 20th century. Adolphe Quetelet, a Flemish scientist, set out much early work, including introducing the concept of the “average man”, representative of societal matters. He studied the crime statistics of Paris for five years and noticed how regular the numbers were. The implication, to Quetelet, was that crime is a societal problem. It’s something we can control by mindfully organizing society, without infringing anyone’s autonomy. Put like that, the study of statistics seems an obvious and indisputable good, a way for governments to better serve their public.
So here is the dispute. It’s something mathematicians understate when sharing the stories of important pioneers like Francis Galton or Karl Pearson. They were eugenicists. Part of what drove their interest in studying human populations was to find out which populations were the best. And how to help them overcome their more-populous lessers.
I don’t have the space, or depth of knowledge, to fully recount the 19th century’s racial politics, popular scientific understanding, and international relations. Please accept this as a loose cartoon of the situation. Do not forget the full story is more complex and more ambiguous than I write.
One of the 19th century’s greatest scientific discoveries was evolution. That populations change in time, in size and in characteristics, even budding off new species, is breathtaking. Another of the great discoveries was entropy. This incorporated into science the nostalgic romantic notion that things used to be better. I write that figuratively, but to express the way the notion is felt.
There are implications. If the Sun itself will someday wear out, how long can the Tories last? It was easy for the aristocracy to feel that everything was quite excellent as it was now and dread the inevitable change. This is true for the aristocracy of any country, although the United Kingdom had a special position here. The United Kingdom enjoyed a privileged position among the Great Powers and the Imperial Powers through the 19th century. Note we still call it the Victorian era, when Louis Napoleon or Giuseppe Garibaldi or Otto von Bismarck are more significant European figures. (Granting Victoria had the longer presence on the world stage; “the 19th century” had a longer presence still.) But it could rarely feel secure, always aware that France or Germany or Russia was ready to displace it.
And even internally: if Darwin was right and reproductive success all that matters in the long run, what does it say that so many poor people breed so much? How long could the world hold good things? Would the eternal famines and poverty of the “overpopulated” Irish or Indian colonial populations become all that was left? During the Crimean War, the British military found a shocking number of recruits from the cities were physically unfit for service. In the 1850s this was only an inconvenience; there were plenty of strong young farm workers to recruit. But the British population was already majority-urban, and becoming more so. What would happen by 1880? 1910?
One can follow the reasoning, even if we freeze at the racist conclusions. And we have the advantage of a century-plus hindsight. We can see how the eugenic attitude leads quickly to horrors. And also that it turns out “overpopulated” Ireland and India stopped having famines once they evicted their colonizers.
Does this origin of statistics matter? The utility of a hammer does not depend on the moral standing of its maker. The Central Limit Theorem has an even stronger pretense to objectivity. Why not build as best we can with the crooked timbers of mathematics?
It is in my lifetime that a popular racist book claimed science proved that Black people were intellectual inferiors to White people. This on the basis of supposedly significant differences in the populations’ IQ scores. It proposed that racism wasn’t a thing, or at least nothing to do anything about. It would be mere “realism”. Intelligence Quotients, incidentally, are another idea we can trace to Francis Galton. But an IQ test is not objective. The best we can say is it might be standardized. This says nothing about the biases built into the test, though, or of the people evaluating the results.
So what if some publisher 25 years ago got suckered into publishing a bad book? And racist chumps bought it because they liked its conclusion?
The past is never fully past. In the modern environment of surveillance capitalism we have abundant data on any person. We have abundant computing power. We can find many correlations. This gives people wild ideas for “artificial intelligence”. Something to make predictions. Who will lose a job soon? Who will get sick, and from what? Who will commit a crime? Who will fail their A-levels? At least, who is most likely to?
Consider, for example, the body mass index. It was developed by our friend Adolphe Quetelet, as he tried to understand the kinds of bodies in the population. It is now used to judge whether someone is overweight. Weight is treated as though it were a greater threat to health than actual illnesses are. Your diagnosis for the same condition with the same symptoms will be different — and on average worse — if your number says 25.2 rather than 24.8.
We must do better. We can hope that learning how tools were used to injure people will teach us to use them better, to reduce or to avoid harm. We must fight our tendency to latch on to simple ideas as the things we can understand in the world. We must not mistake the greater understanding we have from the statistics for complete understanding. To do this we must have empathy, and we must have humility, and we must understand what we have done badly in the past. We must catch ourselves when we repeat the patterns that brought us to past evils. We must do more than only calculate.
Bernard Riemann is one of those figures you can’t be a mathematics major without learning about. His name attaches to an enormous amount of analysis. One Riemann-named thing every mathematician learns very well is the Riemann Sum. It’s the first analysis model we use to explain why integration works. And we can put together a version of this for numerical integration. Its greatest use, though, is that we can use it to justify other ways to integrate that are easier to actually use. Great little utility.
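The numerical-integration version fits in a few lines. This is a minimal sketch of a left-endpoint Riemann sum with equal-width rectangles. That is one choice among several; right endpoints or midpoints would do as well. The function name and the test integrand are my own.

```python
def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f over [a, b] using n equal-width rectangles."""
    width = (b - a) / n
    # Each rectangle's height is f at the left edge of its slice.
    return sum(f(a + i * width) for i in range(n)) * width


# The integral of x^2 from 0 to 1 is exactly 1/3.
approx = riemann_sum(lambda x: x * x, 0.0, 1.0, 1000)
print(approx, abs(approx - 1 / 3))
```

With a thousand rectangles the error is already below a thousandth. It shrinks slowly, though, which is why we use the Riemann sum to justify better methods more than we use it to compute.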
Part of why I write these essays is to save future time. If I have an essay explaining some complex idea, then in the future, I can use a link and a short recap of the central idea. There’s some essays that have been perennials. I think I’ve linked to polynomials more than anything else on this site. And then some disappear, even though they seem to be about good useful subjects. Riemann sphere, from the Leap Day 2016 sequence, is one of those disappeared topics. This is one of the ways to convert between “shapes on the plane” and “shapes on the sphere”. There’s no way to perfectly move something from the plane to the sphere, or vice-versa. The Riemann Sphere is an approach which preserves the interior angles. If two lines on the plane intersect at a 25 degree angle, their representation on the sphere will intersect at a 25 degree angle. But everything else may get strange.
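The conversion itself is a short formula. Here is a sketch under one common convention: a unit sphere centered at the origin, projected from its north pole, with the plane cutting through the equator. Other texts set the sphere on top of the plane instead, so treat the exact formula as an assumption of this sketch. The function name is mine.

```python
import math


def plane_to_sphere(x, y):
    """Send the plane point (x, y) to a point on the unit sphere.

    Assumed convention: unit sphere centered at the origin, projecting
    from the north pole (0, 0, 1) through the plane z = 0.
    """
    d = 1.0 + x * x + y * y
    return (2.0 * x / d, 2.0 * y / d, (x * x + y * y - 1.0) / d)


for point in [(0.0, 0.0), (1.0, 0.0), (3.0, -4.0)]:
    image = plane_to_sphere(*point)
    # Every image should sit at distance 1 from the sphere's center.
    print(point, image, math.hypot(*image))
```

The origin lands on the south pole and the unit circle lands on the equator. Points far from the origin crowd up toward the north pole, which is where the "everything else may get strange" comes in. The code does not check the angle-preserving property; that takes some calculus to show.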
I have again Elke Stangl, author of elkemental Force, to thank for the subject this week. Again, Stangl’s is a blog of wide-ranging theme interests. And it’s got more poetry this week again, this time haikus about the Dirac delta function.
I also have Kerson Huang, of the Massachusetts Institute of Technology and of Nanyang Technological University, to thank for much insight into the week’s subject. Huang published A Critical History of Renormalization, which gave me much to think about. It’s likely a paper that would help anyone hoping to know the history of the technique better.
There is a mathematical model, the Ising Model, for how magnets work. The model has the simplicity of a toy model given by a professor (Wilhelm Lenz) to his grad student (Ernst Ising). Suppose matter is a uniform, uniformly-spaced grid. At each point on the grid we have either a bit of magnetism pointed up (value +1) or down (value -1). It is a nearest-neighbor model. Each point interacts with its nearest neighbors and none of the other points. For a one-dimensional grid this is easy. It’s the stuff of thermodynamics homework for physics majors. They don’t understand it, because you need the hyperbolic trigonometric functions. But they could. For two dimensions … it’s hard. But doable. And interesting. It describes important things like phase changes. The way that you can take a perfectly good strong magnet and heat it up until it’s an iron goo, then cool it down to being a strong magnet again.
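Here is where the hyperbolic functions come in for the one-dimensional case. This sketch makes assumptions the paragraph above does not state: no external magnetic field, the line of spins bent into a ring so the last spin neighbors the first, and a single coupling strength J. Under those assumptions the standard transfer-matrix result for the partition function is (2 cosh βJ)^N + (2 sinh βJ)^N. The code compares that against adding up every configuration by brute force. The function names are mine.

```python
import itertools
import math


def brute_force_partition(n, beta_j):
    """Sum the Boltzmann weight over all 2**n configurations of a ring of n spins.

    Assumes zero external field and energy E = -J * (sum of s_i * s_{i+1}),
    so the weight exp(-beta * E) is exp(beta_j * bond_sum).
    """
    total = 0.0
    for spins in itertools.product((+1, -1), repeat=n):
        # Each spin interacts only with its nearest neighbor around the ring.
        bond_sum = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        total += math.exp(beta_j * bond_sum)
    return total


def closed_form_partition(n, beta_j):
    """Transfer-matrix result for the same ring, in hyperbolic functions."""
    return (2 * math.cosh(beta_j)) ** n + (2 * math.sinh(beta_j)) ** n


n, beta_j = 6, 0.4
print(brute_force_partition(n, beta_j), closed_form_partition(n, beta_j))
```

Six spins means 64 configurations, easy to enumerate. Sixty spins would mean about 10^18 of them, and then the closed form is the only one of these two you want to run.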
For such a simple model it works well. A lot of the solids we find interesting are crystals, or are almost crystals. These are molecules arranged in a grid. So that part of the model is fine. They do interact, foremost, with their nearest neighbors. But not exclusively. In principle, every molecule in a crystal interacts with every other molecule. Can we account for this? Can we make a better model?
Yes, many ways. Here’s one. It’s designed for a square grid, the kind you get by looking at the intersections on a normal piece of graph paper. Each point is in a row and a column. The rows are a distance ‘a’ apart. So are the columns.
Now draw a new grid, on top of the old. Do it by grouping together two-by-two blocks of the original. Draw new rows and columns through the centers of these new blocks. Put at the new intersections a bit of magnetism. Its value is the mean of whatever the four blocks around it are. So, could be 1, could be -1, could be 0, could be ½, could be -½. There’s more options. But look at what we have. It’s still an Ising-like model, with interactions between nearest-neighbors. There’s more choices for what value each point can have. And the grid spacing is now 2a instead of a. But it all looks pretty similar.
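One step of that regrouping can be sketched like so. It assumes a square grid with an even number of rows, stored as a list of rows. The function name block_spin and the sample grid are my own. I use exact fractions so values like ½ come out as ½ rather than as floating-point approximations.

```python
from fractions import Fraction


def block_spin(grid):
    """Replace each two-by-two block of a square grid with the mean of its four values.

    Assumes `grid` is a square list of rows with an even side length.
    """
    size = len(grid)
    if size % 2 != 0:
        raise ValueError("grid side must be even")
    return [
        [
            Fraction(
                grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1],
                4,
            )
            for c in range(0, size, 2)
        ]
        for r in range(0, size, 2)
    ]


# A made-up four-by-four patch of up (+1) and down (-1) magnetism.
spins = [
    [+1, +1, +1, -1],
    [+1, +1, -1, -1],
    [+1, -1, -1, -1],
    [-1, +1, -1, -1],
]

coarse = block_spin(spins)
print(coarse)
```

The four new values here are 1, -½, 0, and -1. Since the function accepts fractions as well as whole numbers, you can feed its output back in to group the blocks again.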
And now the great insight, that we can trace to Leo P Kadanoff in 1966. What if we relabel the distance between grid points? We called it 2a before. Call it a, now, again. What’s important that’s different from the Ising model we started with?
There’s the not-negligible point that there’s five different values a point can have, instead of two. But otherwise? In the operations we do, not much is different. How about in what it models? And there it’s interesting. Think of the original grid points. In the original scaling, they interacted only with units one original-row or one original-column away. Now? Their average interacts with the average of grid points that were as far as three original-rows or three original-columns away. It’s a small change. But it’s closer to reflecting the reality of every molecule interacting with every other molecule.
You know what happens when mathematicians get one good trick. We figure what happens if we do it again. Take the rescaled grid, the one that represents two-by-two blocks of the original. Rescale it again, making two-by-two blocks of these two-by-two blocks. Do the same rules about setting the center points as a new grid. And then re-scaling. What we have now are blocks that represent averages of four-by-four blocks of the original. And that, imperfectly, let a point interact with a point seven original-rows or original-columns away. (Or farther: seven original-rows down and three original-columns to the left, say. Have fun counting all the distances.) And again: we have eight-by-eight blocks and even more range. Again: sixteen-by-sixteen blocks and double the range again. Why not carry this on forever?
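The three and the seven follow a pattern, and it's quick to count. This sketch takes two side-by-side blocks, each some number of original columns wide, and finds the farthest apart two original points in them can be. It only looks along one row, so it ignores the diagonal distances mentioned above. The function name is mine.

```python
def max_reach(block_side):
    """Farthest column separation between original points in two adjacent blocks.

    The left block covers columns 0 .. block_side - 1 and the right block
    covers the next block_side columns.
    """
    left = range(0, block_side)
    right = range(block_side, 2 * block_side)
    return max(j - i for i in left for j in right)


# Block sides 1, 2, 4, 8, 16: the original grid and four rounds of regrouping.
reaches = [(2 ** k, max_reach(2 ** k)) for k in range(5)]
print(reaches)
```

The reach for a block of side b works out to 2b - 1. So each regrouping roughly doubles how far the nearest-neighbor rule reaches in the original grid.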
This is renormalization. It’s a specific sort, called the block-spin renormalization group. It comes from condensed matter physics, where we try to understand how molecules come together to form bulks of matter. Kenneth Wilson stretched this over to studying the Kondo Effect. This is a problem in how magnetic impurities affect electrical resistance. (It’s named for Jun Kondo.) It’s great work. It (in part) earned Wilson a Nobel Prize. But the idea is simple. We can understand complex interactions by making them simple ones. The interactions have a natural scale, cutting off at the nearest neighbor. But we redefine ‘nearest neighbor’, again and again, until it reaches infinitely far away.
This problem, and its solution, come from thermodynamics. Particularly, statistical mechanics. This is a bit ahistoric. Physicists first used renormalization in quantum mechanics. This is all right. As a general guideline, everything in statistical mechanics turns into something in quantum mechanics, and vice-versa. What quantum mechanics lacked, for a generation, was logical rigor for renormalization. This statistical mechanics approach provided that.
Renormalization in quantum mechanics we needed because of virtual particles. Quantum mechanics requires that particles can pop into existence, carrying momentum, and then pop back out again. This gives us electromagnetism, and the strong nuclear force (which holds particles together), and the weak nuclear force (which causes nuclear decay). Leave gravity over on the side. The more momentum in the virtual particle, the shorter a time it can exist. It’s actually the more energy, the shorter the particle lasts. In that guise you know it as the Uncertainty Principle. But it’s momentum that’s important here. This means short-range interactions transfer more momentum, and long-range ones transfer less. And here we had thought forces got stronger as the particles interacting got closer together.
In principle, there is no upper limit to how much momentum one of these virtual particles can have. And, worse, the original particle can interact with its virtual particle. This by exchanging another virtual particle. Which is even higher-energy and shorter-range. The virtual particle can also interact with the field that’s around the original particle. Pairs of virtual particles can exchange more virtual particles. And so on. What we get, when we add this all together, seems like it should be infinitely large. Every particle the center of an infinitely great bundle of energy.
Renormalization, the original renormalization, cuts that off. Sets an effective limit on the system. The limit is not “only particles this close will interact” exactly. It’s more “only virtual particles with less than this momentum will”. (Yes, there’s some overlap between these ideas.) This seems different to us mere dwellers in reality. But to a mathematical physicist, knowing that position and momentum are conjugate variables? Limiting one is the same work as limiting the other.
This, when developed, left physicists uneasy. It’s for good reasons. The cutoff is arbitrary. Its existence, sure, but we often deal with arbitrary cutoffs for things. When we calculate a weather satellite’s orbit we do not care that other star systems exist. We barely care that Jupiter exists. Still, where to put the cutoff? Quantum Electrodynamics, using this, could provide excellent predictions of physical properties. But shouldn’t we get different predictions with different cutoffs? How do we know we’re not picking a cutoff because it makes our test problem work right? That we’re not picking one that produces garbage for every other problem? Read the writing of a physicist of the time and — oh, why be coy? We all read Richard Feynman, his QED at least. We see him sulking about a technique he used to brilliant effect.
Wilson-style renormalization answered Feynman’s objections. (Though not to Feynman’s satisfaction, if I understand the history right.) The momentum cutoff serves as a scale. Or if you prefer, the scale of interactions we consider tells us the cutoff. Different scales give us different quantum mechanics. One scale, one cutoff, gives us the way molecules interact together, on the scale of condensed-matter physics. A different scale, with a different cutoff, describes the particles of Quantum Electrodynamics. Other scales describe something more recognizable as classical physics. Or the Yang-Mills gauge theory, as describes the Standard Model of subatomic particles, all those quarks and leptons.
Renormalization offers a capsule of much of mathematical physics, though. It started as an arbitrary trick to avoid calculation problems. In time, we found a rationale for the trick. But found it from looking at a problem that seemed unrelated. On learning the related trick well, though, we see they’re different aspects of the same problem. It’s a neat bit of work.
You know what? I should probably get as much of November done ahead of schedule as possible. So to that end, I’ll also open up the next three letters of the alphabet. If you’d like me to try explaining a mathematics term that starts with V, W, or X, please leave a comment saying so. Also please let me know what your home blog, YouTube channel, Twitter feed, or whatnot is, so I can give that some attention too. I’m also really eager to find other X words; this is a difficult part of the alphabet. And, I’m open to considering re-doing past essay topics, if I have some new angle on them. Don’t be unreasonably afraid to ask.
Topics I’ve already covered, starting with the letter ‘V’, are:
I feel like I talk group theory a lot in these A-to-Z sequences. Some of that’s deserved. Group theory underlies a lot of modern mathematics. Part of it is surely that it made the deepest impression on me, as a mathematics major, even though my work ended up not touching groups often. Quotient Groups are at that nice intersection of being important yet having a misleading name. You’re introduced to them after learning about groups, which have an operation that works like addition/subtraction; and then rings, which have addition/subtraction plus multiplication. Surely a quotient group is just a ring with division, right? No, it is not. But, lucky thing, there’s one quotient group you certainly know and feel familiar with. You’ll see.
In summer 2015 I picked all the topics for my A-to-Z; I didn’t work up the courage to ask for topics until the next time around. Some, I remember why I chose. I’m not sure why I picked Quintile, as a statistics term, rather than quartile. Both are legitimate terms, and circle around a similar idea. That is that we need to know how data is distributed: what range of numbers are common, what ones are rare. I wonder if I wasn’t saving ‘quartile’ for some later A-to-Z, for fear of running out of Q terms. Or if I felt that quartiles were familiar enough that quintiles would seem a touch strange. That is the sort of thing I’d likely do.
I continue my tradition of doing these monthly readership reviews just a little too far into the month to feel sensible. Well, I’m trying to publish more things on the weekdays and have three of those five committed, while the A-to-Z goes on.
In September I posted only 18 pieces. That’s all right. There was more to them: 15,922 words posted in total. This comes to an average of 936.6 words per posting, way up from August’s 634.3. It’s my most wordy month this year, so far. My year-to-date average post has been 694 words, around here.
Those 18, on average enormous, posts drew 2,422 page views. I like seeing that sort of number, since it’s above the twelve-month running average of 2,383.3 page views. There were 1,643 unique visitors, again above the twelve-month running average of 1,622.8. And I’m really amazed by that since the twelve-month running average includes that fluke last October where something like five thousand more people than usual came in and looked at my post about linear programming.
It was an engaged month, too. There were 80 things liked in September, above the average of 62.3. And 32 comments, beating the 17.4 average.
The per-posting figures were similarly above the twelve-month running averages. 134.6 views per posting, above the 125.3 running average. 91.3 unique visitors per posting, above the 85.0 running average. 4.4 likes per posting, compared to a 3.3 running average. 1.8 comments per posting, compared to a 1.0 running average. I’m going to be feeling good about this month until that happens again.
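The per-posting figures above are the monthly totals divided by the 18 posts. A small sketch of that arithmetic, using only the numbers quoted in this review (the variable names are mine):

```python
# Monthly totals quoted above for September, and the number of posts.
posts = 18
totals = {"views": 2422, "visitors": 1643, "likes": 80, "comments": 32}

# Per-posting figure = monthly total / number of posts, to one decimal place.
per_post = {name: round(total / posts, 1) for name, total in totals.items()}

for name, value in per_post.items():
    print(f"{name}: {value} per posting")
```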
I wanted to look at the most popular posts from August and September around here. August because, you know, there’s stuff posted the last week of the month that gets readers early in the new month. It doesn’t seem fair to rule them out as popular posts just because the kalends work against them. Turns out nothing from late August was among the most popular stuff. There was a tie for fifth place, though, as sometimes happens. So here’s the six most popular posts of September:
I’m happy to have a subject from Elke Stangl, author of elkemental Force. That’s a fun and wide-ranging blog which, among other things, just published a poem about proofs. You might enjoy.
One delight, and sometimes deadline frustration, of these essays is discovering things I had not thought about. Researching quadratic forms invited the obvious question of what is a form? And that goes undefined on, for example, Mathworld. Also in the textbooks I’ve kept. Even ones you’d think would mention, like R W R Darling’s Differential Forms and Connections, or Frigyes Riesz and Béla Sz-Nagy’s Functional Analysis. Reluctantly I started thinking about what we talk about when discussing forms.
Quadratic forms offer some hints. These take a vector in some n-dimensional space, and return a scalar. Linear forms, and cubic forms, do the same. The pattern suggests a form is a mapping from a space like to or maybe to . That looks good, but then we have to ask: isn’t that just an operator? Also: then what about differential forms? Or volume forms? These are about how to fill space. There’s nothing scalar in that. But maybe these are both called forms because they fill similar roles. They might have as little to do with one another as red pandas and giant pandas do.
Enlightenment comes after much consideration or happening on Wikipedia’s page about homogeneous polynomials. That offers “an algebraic form, or simply form, is a function defined by a homogeneous polynomial”. That satisfies. First, because it gets us back to polynomials. Second, because all the forms I could think of do have rules based in homogeneous polynomials. They might be peculiar polynomials. Volume forms, for example, have a polynomial in wedge products of differentials. But it counts.
A function’s homogeneous if it scales a particular way. Evaluate it at some set of coordinates x, y, z, (more variables if you need). That’s some number (let’s say f(x, y, z)). Take all those coordinates and multiply them by the same constant; let me call that α. Evaluate the function at α x, α y, α z, (α times more variables if you need). Then that value is αᵏ times the original value of f. k is some constant. It depends on the function, but not on what x, y, z, (more) are.
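The scaling rule is easy to test numerically. Here is a minimal sketch using a cubic form of my own choosing (it is not one from this essay): every term has total degree 3, so scaling all the coordinates by α should scale the value by α cubed.

```python
def cubic_form(x, y, z):
    # Illustrative homogeneous polynomial: every term has total degree 3.
    return x**3 + 2 * x * y * z - y * z**2


def scales_with_degree(func, point, alpha, k):
    """Check func(alpha * point) == alpha**k * func(point) at one sample point."""
    scaled = [alpha * coordinate for coordinate in point]
    return func(*scaled) == alpha**k * func(*point)


point = (1, 2, 3)
print(scales_with_degree(cubic_form, point, 2, 3))  # True: k = 3 fits
print(scales_with_degree(cubic_form, point, 2, 2))  # False: k = 2 does not
```

Integer inputs keep the comparison exact. One sample point cannot prove homogeneity, but it will catch a wrong guess for k.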
For a quadratic form, this constant k equals 2. This is because in the quadratic form, all the terms in the polynomial are of the second degree. So, for example, is a quadratic form. So is ; the x times the y brings this to a second degree. Also a quadratic form is . So is .
This can have many variables. If we have a lot, we have a couple choices. One is to start using subscripts, and to write the form something like:
This is respectable enough. People who do a lot of differential geometry get used to a shortcut, the Einstein Summation Convention. In that, we take as implicit the summation instructions. So they’d write the more compact . Those of us who don’t do a lot of differential geometry think that looks funny. And we have more familiar ways to write things down. Like, we can put the collection of variables into an ordered n-tuple. Call it the vector . If we then think to put the numbers into a square matrix we have a great way of writing things. We have to manipulate the a little to make the matrix, but it’s nothing complicated. Once that’s done we can write the quadratic form as:
This uses matrix multiplication. The vector we assume is a column vector, a bunch of rows one column across. Then we have to take its transposition, one row a bunch of columns across, to make the matrix multiplication work out. If we don’t like that notation with its annoying superscripts? We can declare the bare ‘x’ to mean the vector, and use inner products:
This is easier to type at least. But what does it get us?
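For reference, here is how these notations are usually written in textbooks. The symbol names are assumptions on my part: q for the form, a_{ij} for the coefficients, and A for the square matrix built from them. One common way to do that "little manipulation" is to symmetrize, taking the matrix entry to be (a_{ij} + a_{ji})/2.

```latex
% Explicit double sum over subscripted variables
q(x_1, \dots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j

% Einstein summation convention: repeated indices are summed implicitly
q = a_{ij} x^i x^j

% Matrix form, with \vec{x} a column vector and \vec{x}^T its transpose
q = \vec{x}^{T} A \vec{x}

% Inner-product form, with the bare x standing for the vector
q = \langle x, A x \rangle
```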
Looking at some quadratic forms may give us an idea. practically begs to be matched to an , and the name “the equation of a circle”. is less familiar, but to the crowd reading this, not much less familiar. Fill that out to and we have a hyperbola. If we have and let that then we have an ellipse, something a bit wider than it is tall. Similarly is a hyperbola still, just anamorphic.
If we expand into three variables we start to see spheres: just begs to equal . Or ellipsoids: , set equal to some (positive) , is something we might get from rolling out clay. Or hyperboloids: or , set equal to , give us nice shapes. (We can also get cylinders: equalling some positive number describes a tube.)
How about ? This also wants to be an ellipse. , to pick an easy number, is a rotated ellipse. The long axis is along the line described by . The short axis is along the line described by . How about — let me make this easy. ? The equation describes a hyperbola, but a rotated one, with the x- and y-axes as its asymptotes.
Do you want to take any guesses about three-dimensional shapes? Like, what might represent? If you’re thinking “ellipsoid, only it’s at an angle” you’re doing well. It runs really long in one direction, along the plane described by . It runs medium-size along the plane described by . It runs pretty short along the z-axis. We could run some more complicated shapes. Ellipses pointing in weird directions. Hyperboloids of different shapes. They’ll have things in common.
One is that they have obviously important axes. Axes of symmetry, particularly. There’ll be one for each dimension of space. An ellipse has a long axis and a short axis. An ellipsoid has a long, a middle, and a short. (It might be that two of these have the same length. If all three have the same length, you have a sphere, my friend.) A hyperbola, similarly, has two axes of symmetry. One of them runs midway between the two branches of the hyperbola. One of them slices through the two branches, through the points where the two legs come closest together. Hyperboloids, in three dimensions, have three axes of symmetry. One of them connects the points where the two branches of the hyperboloid come closest together. The other two run perpendicular to that.
We can go on imagining more dimensions of space. We don’t need them. The important things are already there. There are, for these shapes, some preferred directions. The ones around which these quadratic-form shapes have symmetries. These directions are perpendicular to each other. These preferred directions are important. We call them “eigenvectors”, a partly-German name.
Eigenvectors are great for a bunch of purposes. One is that if the matrix A represents a problem you’re interested in? The eigenvectors are probably a great basis to solve problems in it. This is a change of basis vectors, which is the same work as doing a rotation. And I’m happy to report this change of coordinates doesn’t mess up the problem any. We can rewrite the problem to be easier.
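To make the preferred directions concrete, here is a sketch for the two-variable case. It uses the closed-form eigenvalues and eigenvectors of a symmetric 2×2 matrix. The example form, x² + xy + y², is my own pick for illustration; its matrix is [[1, 0.5], [0.5, 1]].

```python
import math


def symmetric_2x2_eigen(a, b, c):
    """Eigenvalues and unit eigenvectors of the matrix [[a, b], [b, c]]."""
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    # Angle of the axis belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2 * b, a - c)
    v1 = (math.cos(theta), math.sin(theta))
    v2 = (-math.sin(theta), math.cos(theta))
    return (mean + radius, v1), (mean - radius, v2)


def quadratic_form(a, b, c, x, y):
    """Value of a*x^2 + 2*b*x*y + c*y^2."""
    return a * x * x + 2 * b * x * y + c * y * y


a, b, c = 1.0, 0.5, 1.0  # x^2 + x*y + y^2 (illustrative choice)
(lam1, v1), (lam2, v2) = symmetric_2x2_eigen(a, b, c)

# The two preferred directions are perpendicular.
dot = v1[0] * v2[0] + v1[1] * v2[1]

# In eigenvector coordinates (u, w) the cross term vanishes:
# the form becomes lam1*u^2 + lam2*w^2.
x, y = 0.3, -1.7
u = x * v1[0] + y * v1[1]
w = x * v2[0] + y * v2[1]
original = quadratic_form(a, b, c, x, y)
rotated = lam1 * u * u + lam2 * w * w

print(lam1, lam2)         # eigenvalues, approximately 1.5 and 0.5
print(original, rotated)  # the same value up to rounding
```

For this example the eigenvectors lie along y = x and y = −x. The smaller eigenvalue goes with the longer axis of the ellipse, since the form grows most slowly in that direction.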
And, roughly, any time we look at reflections in a Euclidean space, there’s a quadratic form lurking around. This leads us into interesting places. Looking at reflections encourages us to see abstract algebra, to see groups. That space can be rotated in infinitesimally small pieces gets us a kind of structure named a Lie (pronounced ‘lee’) Algebra. Quadratic forms give us a way of classifying those.
Quadratic forms work in number theory also. There’s a neat theorem, the 15 Theorem. If a quadratic form, with integer coefficients, can produce all the integers from 1 through 15, then it can produce all positive integers. For example, can, for sets of integers x, y, z, and w, add up to any positive integer you like. (It’s not guaranteed this will happen. can’t produce 15.) We know of at least 54 combinations which generate all the positive integers, like and and such.
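The "1 through 15" test is small enough to run by brute force. In the sketch below both forms are stand-ins I have chosen, not necessarily the ones meant above: the sum of four squares, which Lagrange's theorem says reaches every positive integer, and x² + 2y² + 5z² + 5w², a well-known form that reaches 1 through 14 but not 15.

```python
from itertools import product
from math import isqrt


def represented(coeffs, limit):
    """Integers in 1..limit of the form sum(c * v**2) with integer v."""
    bound = isqrt(limit)
    values = set()
    # Squares ignore sign, so non-negative values of each variable suffice.
    for vs in product(range(bound + 1), repeat=len(coeffs)):
        total = sum(c * v * v for c, v in zip(coeffs, vs))
        if 1 <= total <= limit:
            values.add(total)
    return values


four_squares = (1, 1, 1, 1)  # x^2 + y^2 + z^2 + w^2 (assumed example)
near_miss = (1, 2, 5, 5)     # x^2 + 2y^2 + 5z^2 + 5w^2 (assumed example)

targets = set(range(1, 16))
missing_four = sorted(targets - represented(four_squares, 15))
missing_near = sorted(targets - represented(near_miss, 15))

print(missing_four)  # []
print(missing_near)  # [15]
```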
There’s more, of course. There always is. I spent time skimming Quadratic Forms and their Applications, Proceedings of the Conference on Quadratic Forms and their Applications. It was held at University College Dublin in July of 1999. It’s some impressive work. I can think of very little that I can describe. Even Winfried Scharlau’s On the History of the Algebraic Theory of Quadratic Forms, from page 229, is tough going. Ina Kersten’s Biography of Ernst Witt, one of the major influences on quadratic forms, is accessible. I’m not sure how much of the particular work communicates.
It’s easy at least to know what things this field is about, though. The things that we calculate. That they connect to novel and abstract places shows how close together arithmetic and dynamical systems and topology and group theory and number theory are, despite appearances.
I am as surprised as anyone to be this near the end of the All 2020 A-to-Z. But, also, I am hoping to stockpile a couple of essays for the first weeks of November. I expect that to be an even more emotionally trying time and would like to have as little work, even fun work like this, as possible then.
So please, in comments, suggest mathematical terms starting with the letters S, T, or U, or that can be reasonably phrased as something with those letters. Also please list any blogs, YouTube channels, books, anything that you’ve written or would like to see publicized.
I’m probably going to put out an appeal for the letter V soon, also, since that’s also scheduled for an early-November publication.
I am open to revisiting topics I looked at in the past, if I think I can do better, or can cover a different aspect of them. So for reference, the topics I’ve already covered starting with the letter ‘S’ were:
And in last year’s A-to-Z I published one of those essays already becoming a favorite. I haven’t had much chance to link back to it. So let me fix that. My 2019 Mathematics A To Z: Platonic focuses on the Platonic Solids, and questions like why we might find them interesting. Also, what Platonic solids look like in spaces of other than three dimensions. Three-dimensional space has five Platonic solids. There are six Platonic Solids in four dimensions. How many would you expect in a five-dimensional space? Or a ten-dimensional one? The answer may surprise you!
As I did the 2015 A-to-Z I learned how to do them in a way that feels like me. In writing about the meaning of Proper, I found an important part of my voice. That’s the part which began with a corny mathematician’s joke. It also shows something I have forgotten how to do: it explains the whole thing, even with a joke to warm things up, in maybe 500 words. Well, I was publishing three A-to-Z essays a week back then; something had to go.
We learn to count permutations before we know what they are. There are good reasons to. Counting permutations gives us numbers that are big, and therefore interesting, fast. Counting is easy to motivate. Humans like counting. Counting is useful. Many probability questions are best answered by counting all the ways to arrange things, and how many of those arrangements are desirable somehow.
The count of permutations asks how many ways there are to put some things in order. If some of the things are identical, the number is smaller. Calculating the count may be a little tedious, but it’s not hard. We calculate, rather than “really” count, because — well, list all the possible ways to arrange the letters of the word ‘DEMONSTRATION’. I bet you turn that listing over to a computer too. But what is the computer counting?
If we’re trying to do this efficiently we have some system. Start with ‘DEMONSTRATION’. Then, say, swap the last two letters: ‘DEMONSTRATINO’. Then, mm, move the ‘N’ to the antepenultimate position: ‘DEMONSTRATNIO’. Then, oh, swap the last two letters again: ‘DEMONSTRATNOI’.
Then, oh, move the ‘N’ to the third-to-the-last position: ‘DEMONSTRANTIO’. What next? Oh, swap the last two letters again: ‘DEMONSTRANTOI’. Or, move what had been the last letter to the antepenultimate position: ‘DEMONSTRANOTI’. And swap the last two letters once more: ‘DEMONSTRANOIT’.
Enough of that, you and my spellchecker say. I agree. What is it that all this is doing? What does that tell us about what a permutation is?
An obvious thing. Each new variation of the order came from swapping two letters of an earlier one. We needed a sequence of swaps to get to ‘DEMONSTRANOIT’. But each swap was of only two things. It’s a good thing to observe.
Another obvious thing. There’s no letters in ‘DEMONSTRANOIT’ or any of the other variations that weren’t in ‘DEMONSTRATION’. All that’s changed is the order.
This all has listed eight permutations, counting the original ‘DEMONSTRATION’ as one. There are, calculations tell me, 778,377,592 to go.
Would the number of permutations be different if we were shuffling around different things? If instead of the letters in the word ‘DEMONSTRATION’ it were, say, the numerals in the sequence ‘1234567897045’? Or the sequence of symbols ‘!@#$%^&*(&)$%’ instead? No, and that it would not is another clue about what permutations are.
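The count comes from the usual rule for arrangements with repeats: take the factorial of the length, then divide by the factorial of each symbol's repeat count. A minimal sketch (the function name is mine) that reproduces the figure above and confirms the other two sequences give the same count:

```python
from collections import Counter
from math import factorial


def distinct_arrangements(symbols):
    """Number of distinct orderings of a sequence that may repeat symbols."""
    count = factorial(len(symbols))
    for repeats in Counter(symbols).values():
        count //= factorial(repeats)
    return count


total = distinct_arrangements("DEMONSTRATION")
print(total)      # 778377600
print(total - 8)  # 778377592 still to go after the eight listed

# Only the pattern of repeats matters, not what the symbols are.
same_as_digits = distinct_arrangements("1234567897045") == total
same_as_symbols = distinct_arrangements("!@#$%^&*(&)$%") == total
print(same_as_digits, same_as_symbols)  # True True
```

Each of the three sequences has thirteen symbols with three of them doubled, so each count is 13! divided by 2·2·2.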
Another thing, obvious in retrospect. Grant that we’ve been making new permutations by taking a sequence of letters (numerals, symbols) and swapping a pair. We got from ‘DEMONSTRATION’ to ‘DEMONSTRATINO’ by swapping the last two letters. What happens if we swap the last two letters again? We get ‘DEMONSTRATION’, a sequence of letters all right, although one already on our list of permutations.
One more thing, obvious once you’ve seen it. Imagine we had not started with ‘DEMONSTRATION’ but instead ‘DEMONSTRATNIO’. But that we followed the same sequences of swappings. Would we have come up with different permutations? … At least for the first couple permutations? Or would they be the same permutations, listed in a different order?
You’ve been kind, letting me call these things “permutations” before I say what a permutation is. It’s relied on a casual, intuitive idea of a permutation. It’s a shuffling around of some set of things. This is the casual idea that mathematicians rely on for a permutation. Sure we can make the idea precise. How hard will that be?
It’s not hard in form. The permutation is the rearranging of things into a new order. The hard part is the concept. It’s not “these symbols in this order” that’s the permutation. It’s the act of putting them in this new order that is. So it’s “swap the 12th and the 13th symbols”. Or, “move the 13th symbol to 11th place, the 11th symbol to 12th, and the 12th symbol to 13th place”.
So one permutation is “swap the 12th and the 13th elements”. Another permutation is “swap the 11th and the 12th elements”. Since the range of one function is the domain of another, we can compose them together. That is, we can “swap the 12th and the 13th elements, and then swap the 11th and the 12th elements”. This gets us another permutation. The effect of these two permutations, in this order, is “make the 13th element the 11th, make the 11th element the 12th, and make the 12th element the 13th”. The order we do these permutations in counts. “Swap the 11th and the 12th elements, and then swap the 12th and the 13th” gets us a different net effect. That one is “make the 12th element the 11th, make the 13th element the 12th, and make the 11th element the 13th”. Composition of functions does not commute.
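Treating each swap as a function makes the order-dependence easy to see. A short sketch, with helper names of my own invention:

```python
def swap(i, j):
    """Return a function that swaps the i-th and j-th items (1-indexed)."""
    def apply(seq):
        items = list(seq)
        items[i - 1], items[j - 1] = items[j - 1], items[i - 1]
        return "".join(items)
    return apply


swap_12_13 = swap(12, 13)
swap_11_12 = swap(11, 12)

word = "DEMONSTRATION"
first = swap_11_12(swap_12_13(word))   # swap 12-13 first, then 11-12
second = swap_12_13(swap_11_12(word))  # swap 11-12 first, then 12-13

print(first)   # DEMONSTRATNIO
print(second)  # DEMONSTRATONI
```

The two results differ in exactly the way described: the first puts the old 13th letter in 11th place, the second puts the old 12th letter there.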
That functions compose is normal enough. That their composition doesn’t commute is normal enough too. These functions are a bit odd in that we don’t care what the domain-and-range is. We only care that we can index the elements in it. That leads us to some new observations.
The big one is that the set of all these permutations is a group. I mean the way mathematicians mean group. That is, we have a set of items. These are the functions, the permutations. The instructions, like, “make the 12th element the 11th and the 13th element the 12th”, or “the 12th element the 13th”. We also need a group operation, a thing that works like addition does for real numbers. That’s composition. That is, doing one permutation and then the other, to get a new permutation out of it. That new permutation is itself one of the permutations we’d had. We can’t compose permutations and get something that’s not a permutation. No amount of swapping around the letters of ‘DEMONSTRATION’ will get us ‘DEMONSTRATIONERS’.
When we talk about how permutations as a group work, we want to give individual permutations names. That ends up being letters. These are often Greek letters. I don’t know why we can’t use the ordinary Latin alphabet. I suppose someone who liked Greek letters wrote a really good textbook and everyone copies that. So instead of speaking about x and y, we’ll get α and β. Sometimes σ and τ. Or, quite often π, especially if we need a bunch of permutations. Then we get π1, π2, π3, and so on. πj. All the way to πN. For the young mathematics major it might be the first time seeing π used for something not at all circle-related. It’s a weird sensation. Still, αβ is the composition of permutation α with permutation β. This means, do permutation β first, and then permutation α on whatever that result is. This is the same way that f(g(x)) means “evaluate g(x) first, and then figure out what f( that ) is”.
That’s all fine for naming them. But we would also like a good way to describe what a permutation does. There are several good forms. They all rely on indexing the elements, using the counting numbers: 1, 2, 3, 4, and so on. The notation I’ll share is called cycle notation. It’s easy to type. You write it within nice ordinary parentheses: (11 12) means “put the 11th element in slot 12, and the 12th element in slot 11”. (11, 12, 13) means “put the 11th element in slot 12, the 12th element in slot 13, and the 13th element in slot 11”. You can even chain these together: (10, 11)(12, 13) means “put the 10th element in slot 11 and the 11th element in slot 10; also, put the 12th element in slot 13, and the 13th element in slot 12”.
In that notation, writing (9), for example, means “put the 9th element in slot 9”. Or if you prefer, “leave element 9 alone”. Or we don’t mention it at all. The convention is that if something isn’t mentioned, leave it where it is.
This by the way is where we get the identity element. The permutation (1)(2)(3)(4)(etc) doesn’t actually swap anything. It counts as a permutation. Doing this is the equivalent of adding zero to a number.
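Here is a small sketch of cycle notation as described above, using the same slot convention: a cycle (a, b, c) sends the item in slot a to slot b, b to c, and c to a. The function name and the list-of-tuples encoding are my assumptions, and the cycles passed in are taken to be disjoint.

```python
def apply_cycles(cycles, seq):
    """Apply disjoint cycles to seq; slots are 1-indexed.

    Any slot not mentioned in a cycle keeps its item.
    """
    result = list(seq)
    for cycle in cycles:
        for pos, src in enumerate(cycle):
            dst = cycle[(pos + 1) % len(cycle)]
            result[dst - 1] = seq[src - 1]
    return "".join(result)


word = "DEMONSTRATION"
print(apply_cycles([(11, 12)], word))            # DEMONSTRATOIN
print(apply_cycles([(11, 12, 13)], word))        # DEMONSTRATNIO
print(apply_cycles([(10, 11), (12, 13)], word))  # DEMONSTRAITNO
print(apply_cycles([(9,)], word))                # DEMONSTRATION
print(apply_cycles([], word))                    # identity: DEMONSTRATION
```

The last two calls show the identity: mentioning a slot on its own, or mentioning nothing at all, leaves the word untouched.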
This cycle notation makes it not hard to figure out the composition of permutations. What does (1 2)(1 3) do? Well, the (1 3) swaps the first and the third items. The (1 2), next, swaps what’s become the first and the second items. The effect is the same as the permutation (1 3 2): the first item ends up in slot 3, the third item in slot 2, and the second item in slot 1. You can get pretty good at this sort of manipulation, in time.
You may also consider: if (1 2)(1 3) is the same as (1 3 2), then isn’t (1 3 2) the same as (1 2)(1 3)? Sure. But, like, can we write a longer permutation, like, (1 3 5 2 4), as the product of some smaller permutations? And we can. If it’s convenient, we can write it as a string of swaps, exchanging pairs of elements. This was the first “obvious” thing I had listed. A long enough chain of pairwise swaps will, in time, swap everything.
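One way to turn a long cycle into a string of swaps, sketched under the slot convention used in this essay: swap the cycle's first slot with each later slot in turn. This is one decomposition among many, and the function names are mine.

```python
def apply_cycle(cycle, seq):
    """Send the item in each slot of the cycle to the next slot (1-indexed)."""
    result = list(seq)
    for pos, src in enumerate(cycle):
        dst = cycle[(pos + 1) % len(cycle)]
        result[dst - 1] = seq[src - 1]
    return "".join(result)


def cycle_to_swaps(cycle):
    """Pairwise swaps that, done left to right, have the same effect."""
    first = cycle[0]
    return [(first, other) for other in cycle[1:]]


def apply_swaps(swaps, seq):
    items = list(seq)
    for i, j in swaps:
        items[i - 1], items[j - 1] = items[j - 1], items[i - 1]
    return "".join(items)


cycle = (1, 3, 5, 2, 4)
swaps = cycle_to_swaps(cycle)
print(swaps)                        # [(1, 3), (1, 5), (1, 2), (1, 4)]
print(apply_cycle(cycle, "ABCDE"))  # DEABC
print(apply_swaps(swaps, "ABCDE"))  # DEABC
```

A cycle of five slots needs four swaps here. In general a cycle of length n breaks into n − 1 swaps this way.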
We call the group made of all these permutations the Symmetric Group of the set. Since it doesn’t matter what the underlying set is, just the number of elements in it, we can abbreviate this with the number of elements. S2. S4. SN. Symmetric Groups are among the first groups you meet in abstract algebra that aren’t, like, integers modulo 12 or symmetries of a triangle. It’s novel enough to be interesting and to not be completely sure you’re doing it right.
You never leave the Symmetric Group, though, not if you stay in algebra. It has powerful consequences. It ties, for example, into the roots of polynomials. The structure of S5 tells us there must exist fifth-degree polynomials we can’t solve by ordinary arithmetic and root operations. That is, there’s no version of the quadratic formula for polynomials of degree five or higher, and never can be.
There are more groups to build from permutations. The next one that you meet in Intro to Abstract Algebra is the Alternating Group. It’s made of only the even permutations. Those are the permutations made from an even number of swaps. (There are also odd permutations, which are what you imagine. They can’t make a group, though. No identity element.) They’re great for recapturing dread and uncertainty once you think you’ve got a handle on the Symmetric Group.
They lead to other groups too, and even rings. The Levi-Civita symbol describes whether a set of indices gives an even or an odd permutation (or neither). It makes life easier when we work on determinants and tensors and Jacobians. These tie in to the geometry of space, and how that affects physics. It also gets a supporting role in cross products. There are many cryptography schemes that have permutations at their core.
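The even-or-odd question can be settled by counting swaps. A minimal sketch of a Levi-Civita-style sign, assuming the indices are meant to be a rearrangement of 1 through n (the function name is mine):

```python
def levi_civita(indices):
    """+1 for an even permutation of 1..n, -1 for an odd one, else 0.

    Returns 0 when the indices are not a permutation of 1..n,
    for instance when an index repeats.
    """
    n = len(indices)
    if sorted(indices) != list(range(1, n + 1)):
        return 0
    items = list(indices)
    sign = 1
    for i in range(n):
        # Swap the value sitting in slot i into its home slot until
        # slot i holds i + 1; each swap flips the sign.
        while items[i] != i + 1:
            j = items[i] - 1
            items[i], items[j] = items[j], items[i]
            sign = -sign
    return sign


print(levi_civita((1, 2, 3)))  # 1
print(levi_civita((2, 1, 3)))  # -1
print(levi_civita((2, 3, 1)))  # 1
print(levi_civita((1, 1, 2)))  # 0
```

The number of swaps it takes to sort a given arrangement can vary with the method, but whether that number is even or odd does not. That is what makes the sign well defined.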
So this is a bit of what permutations are, and what they can get us.
This is the 141st Playful Math Education Blog Carnival. And I will be taking this lower-key than I have past times I was able to host the carnival. I do not have higher keys available this year.
I will start by borrowing a page from Iva Sallay, kind creator and host of FindTheFactors.com, and say some things about 141. I owe Iva Sallay many things, including this comfortable lead-in to the post, and my participation in the Playful Math Education Blog Carnival. She was also kind enough to send me many interesting blogs and pages and I am grateful.
141 is a centered pentagonal number. It’s like 1 or 6 or 16 that way. That is, if I give you six pennies and ask you to do something with them, a natural thing is one coin in the center and a pentagon around that. With 16 coins, you can add a nice regular pentagon around that, one that reaches three coins from vertex to vertex. 31, 51, 76, and 106 are the next couple centered pentagonal numbers. After 141, the next ones are 181 and 226. The units digits of these follow a pattern, too, in base ten. The last digits go 1-6-6-1, 1-6-6-1, 1-6-6-1, and so on.
141’s also a hendecagonal number. That is, arrange your coins to make a regular 11-sided polygon. 1 and then 11 are hendecagonal numbers. Then 30, 58, 95, and 141. 196 and 260 are the next couple. There are many of these sorts of polygonal numbers, for any regular polygon you like.
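Both families have standard closed forms, which makes the lists easy to regenerate. A sketch, counting n from 1 (the function names are mine):

```python
def centered_pentagonal(n):
    """n-th centered pentagonal number: (5n^2 - 5n + 2) / 2."""
    return (5 * n * n - 5 * n + 2) // 2


def hendecagonal(n):
    """n-th hendecagonal (11-gonal) number: n(9n - 7) / 2."""
    return n * (9 * n - 7) // 2


centered = [centered_pentagonal(n) for n in range(1, 11)]
eleven_sided = [hendecagonal(n) for n in range(1, 9)]

print(centered)      # [1, 6, 16, 31, 51, 76, 106, 141, 181, 226]
print(eleven_sided)  # [1, 11, 30, 58, 95, 141, 196, 260]
print([value % 10 for value in centered])  # the 1-6-6-1 pattern
```

141 turns up as the eighth centered pentagonal number and the sixth hendecagonal number.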
141 is also a Hilbert Prime, a class of number I hadn’t heard of before. It’s still named for the Hilbert of Hilbert’s problems. 141 is not a prime number, which you notice from adding up the digits. But a Hilbert Prime is a different kind of beast. These come from looking at counting numbers that are one more than a whole multiple of four. So, numbers like 1, 5, 9, 13, and so on. This sequence describes a lot of classes of numbers. A Hilbert Prime, at least as some number theorists use it, is a Hilbert Number that can’t be divided by any other Hilbert Number (other than 1). So these include 5, 9, 13, 17, and 21, and some of those are already not traditional primes. There are Hilbert Numbers that are the products of different sets of Hilbert Primes, such as 441 or 693. (441 is both 21 times 21 and also 9 times 49. 693 is 9 times 77 and also 21 times 33.) So I don’t know what use Hilbert Primes are specifically. If someone knows, I’d love to hear.
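The definition translates directly into a trial-division test that only tries divisors of the form 4k + 1. A sketch, with function names of my own; it relies on the fact that if one factor of a Hilbert Number is a Hilbert Number, so is the other, which means checking divisors up to the square root is enough.

```python
def is_hilbert_number(n):
    return n >= 1 and n % 4 == 1


def is_hilbert_prime(n):
    """True if n is a Hilbert Number above 1 with no smaller Hilbert divisor but 1."""
    if n == 1 or not is_hilbert_number(n):
        return False
    d = 5
    while d * d <= n:
        if n % d == 0:
            return False
        d += 4
    return True


def hilbert_prime_pairs(n):
    """Ways to write n as a product of two Hilbert Primes, smaller factor first."""
    pairs = []
    d = 5
    while d * d <= n:
        if n % d == 0 and is_hilbert_prime(d) and is_hilbert_prime(n // d):
            pairs.append((d, n // d))
        d += 4
    return pairs


print(is_hilbert_prime(141))                               # True
print([n for n in range(1, 50) if is_hilbert_prime(n)])    # starts 5, 9, 13, 17, 21
print(hilbert_prime_pairs(441))                            # [(9, 49), (21, 21)]
print(hilbert_prime_pairs(693))                            # [(9, 77), (21, 33)]
```

The two different factorings of 441 and of 693 show that Hilbert Numbers do not factor uniquely into Hilbert Primes, the way ordinary counting numbers factor uniquely into ordinary primes.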
Also, at the risk of causing trouble, The Aperiodical also hosts a monthly Carnival of Mathematics. It’s a similar gathering of interesting mathematics content. It doesn’t look necessarily for educational or playful pieces.
The Reflective Educator posted Precision In Language. This is about one of the hardest bits of teaching. That is to say things which are true and which can’t be mis-remembered as something false. Author David Wees points out an example of this hazard, as kids apply rules outside their context.
Simon Gregg’s essay The Gardener and the Carpenter follows a connected theme. The experience students have with a thing can be different depending on how the teacher presents it. The lead example of Gregg’s essay is about the different ways students played with a toy depending on how the teacher prompted them to explore it.
Now I can come to more bundles of things to teach. Colleen Young gathered Maths at school … and at home, bundles of exercises and practice sheets. One of the geometry puzzles, about the missing lengths in the perimeter of a hexagon, brings me a smile as this is a sort of work I’ve been doing for my day job.
Starting Points Maths has a page of Radian Measure — Intro. The goal here is building comfort in the use of radians as angle measure. Mathematicians tend to think in radians. The trigonometric functions for radian measure behave well. Derivatives and integrals are easy, for example. We do a lot of derivatives and integrals. The measures look stranger, is all, especially as they almost always involve fractions times π.
Lowry also has Helping Your Child Learn Time, using both analog and digital clocks. That lets me mention a recent discussion with my love, who teaches. My love’s students were not getting the argument that analog clocks can offer a better sense of how time is elapsing. I had what I think is a compelling argument: an analog clock is like a health bar, a digital clock like the count of hit points. Logic tells me this will communicate well.
YummyMath’s Fall Equinox 2020 describes some of the geometry of the equinoxes. It also offers questions about how to calculate the time of daylight given one’s position on the Earth. This is one of the great historic and practical uses for trigonometry.
To some play! Miguel Barral wrote Much More Than a Diversion: The Mathematics of Solitaire. There are many kinds of solitaire, which is ultimately just a game that can be played alone. They’re all subject to study through game theory. And to questions like “what is the chance of winning”? That’s often a question best answered by computer simulation. Working out that challenge helped create Monte Carlo methods. These can find approximate solutions to problems too difficult to find perfect solutions for.
Conditional probability is fun. It’s full of questions easy to present and contradicting intuition to solve. Wayne Chadburn’s Big Question explores one of them. It’s based on a problem which went viral a couple years ago, called “Hannah’s Sweet”. I missed the problem when it was getting people mad. But Chadburn explores how to think through the problem.
Now to some deeper personal interests. I am an amusement park enthusiast: I’ve ridden at least 250 different roller coasters at least once each. This includes all the wooden Möbius-strip roller coasters out there. Also all three racing merry-go-rounds. The oldest roller coaster still standing. And I had hoped, this year, to get to the centennial years for the Jackrabbit roller coaster at Kennywood Amusement Park (Pittsburgh) and Jack Rabbit roller coaster at Seabreeze Park (Rochester, New York). Jackrabbit (with spelling variants) used to be a quite popular roller coaster name.
So plans went awry and it seems unlikely we’ll get to any amusement parks this year. No county fairs or carnivals. We can still go to virtual ones, though. Amusement parks and midway games inspire many mathematical questions. So let’s take some in.
Michigan State University’s Connected Mathematics Program set up a string of carnival-style games. The event’s planners figured on then turning the play money into prize raffles but you can also play games. Some are legitimate midway games, such as plinko, spinner wheels, or racing games, too.
Hooda Math’s Carnival Fun offers a series of games, many of them Flash, a fair number HTML5, and mostly for kindergarten through 8th grade. There are a lot of mathematics games here, along with some physics and word games.
Specific rides, though, are always beautiful and worth looking at. Ann-Marie Pendrill’s Rotating swings—a theme with variations looks at rotating swing rides. These have many kinds of motion and many can be turned into educational problems. Pendrill looks at some of them. There are other articles recommended by this, which seem relevant, but this was the only article I found which I had permission to read in full. Your institution might have better access.
Lin McMullin’s The Scrambler, or A Family of Vectors at the Amusement Park looks at the motion of the most popular thrill ride out there. (There are more intense rides. But they’re also ones many people feel are too much for them. Few people in a population think the Scrambler is too much for them.) McMullin uses the language of vectors to examine what path the rider traces out during a ride, and what they say about velocity and acceleration. These are all some wonderful shapes.
And Amusement Parks
Many amusement parks host science and mathematics education days. In fact I’ve never gone to the opening day of my home park, Michigan’s Adventure, as that’s a short four-hour day filled with area kids. Many of the parks do have activity pages, though, suggesting the kinds of things to think about at a park. Some of the mathematics is things one can use; some is toying with curiosity.
Here’s The State Fair of Texas’s Grade 6 STEM games. I don’t know whether there’s a more recent edition. But I also imagine that tasks like counting the traffic flow or thinking about what energies are shown at different times in a ride do not age.
Dorney Park, in northeastern Pennsylvania, was never my home park, but it was close. And I’ve had the chance to visit several times. People with Kutztown University, regional high schools, and Dorney Park prepared Coaster Quest – Geometry. These include a lot of observations and measurements all tied to specific rides at the park. (And a side fact, fun for me: Dorney Park’s carousel used to be at Lake Lansing Amusement Park, a few miles from me. Lake Lansing’s park closed in 1972, and the carousel spent several decades at Cedar Point in Ohio before moving to Pennsylvania. The old carousel building at Lake Lansing still stands, though, and I happened to be there a few weeks ago.)
A 2018 posting on Social Mathematics asks: Do height restrictions matter to safety on Roller Coasters? Of course they do, or else we’d have more roller coasters that allowed mice to ride. The question is how much the size restriction matters, and how sensitive that dependence is. So the leading question is a classic example of applying mathematics to the real world. This includes practical subtleties like if a person 39.5 inches tall could ride safely, is it fair to round that off to 40 inches? It also includes the struggle to work out how dangerous an amusement park is.
Speaking from my experience as a rider and lover of amusement parks: don’t try to plead someone’s “close enough”. You’re putting an unfair burden on the ride operator. Accept the rules as posted. Everybody who loves amusement parks has their disappointment stories; accept yours in good grace.
This leads me into planning amusement park fun. School Specialty’s blog particularly offers PLAY & PLAN: Amusement Park. This is a guide to building an amusement park activity packet for any primary school level. It includes, by the way, some mention of the historical and cultural aspects. That falls outside my focus on mathematics with a side of science here. But there is a wealth of culture in amusement parks, in their rides, their attractions, and their policies.
Let me resume the fun, by looking to imaginary amusement parks. TeachEngineering’s Amusement Park Ride: Ups and Downs in Design designs and builds model “roller coasters”. This from foam tubes, toothpicks, masking tape, and marbles. It’s easier to build a ride in Roller Coaster Tycoon but that will always lack some of the thrill of having a real thing that doesn’t quite do what you want. The builders of Son Of Beast had the same frustration.
The Brunswick (Ohio) City Schools published a nice Amusement Park Map Project. It also introduces students to coordinate systems. This by having them lay out and design their own amusement park. It includes introductions to basic shapes. I am surprised reading the requirements that merry-go-rounds aren’t included, as circles. I am delighted that the plan calls for eight to ten roller coasters and a petting zoo, though. That plan works for me.
I’m going to take one more day, I think, preparing the Playful Math Education Blog Carnival. It’s hard work. But while you wait let me please share an older piece. In 2017 I wrote about Open Sets. These are important things, born of topology and offering us many useful tools. One of the best is that they let us define “neighborhoods” and, along the way, “limits” and from that, “continuity”.
It was also a chance for me to finally think about one of those obvious nagging questions. There are open sets and there are closed sets. But it’s not the case that a set is either open or closed. A set can be not-open without being closed, and not-closed without being open. A set can even be both open and closed simultaneously. How can that turn out? And I learned that while “open” and “closed” are an obvious matched pair of words, they’re about describing very different traits of sets.
Occasionally an A-to-Z gives me the chance to naturally revisit an earlier piece. Orthonormal, from the Leap Day 2016 series, was one of those. It builds heavily on orthogonal, discussed the year before. When you know what the terms mean, of course it would. But getting to what the terms mean is part of the point of these essays.
Also, I hope to publish the 141st installment of the Playful Math Education Blog Carnival this weekend. If you’ve found a mathematics page, video, game, anything that delights or teaches or both, please mention it in the comments. I’m eager to share it with more people.
Mr Wu, author of the Singapore Maths Tuition blog, asked me to explain a technical term today. I thought that would be a fun, quick essay. I don’t learn very fast, do I?
A note on style. I make reference here to “Big-O” and “Little-O”, capitalizing and hyphenating them. This is to give them visual presence as a name. In casual discussion they’re just read, or said, as the two words or word-and-a-letter. Often the Big- or Little- gets dropped and we just talk about O. An O, without further context, in my experience means Big-O.
The part of me that wants smooth consistency in prose urges me to write “Little-o”, as the thing described is represented with a lowercase ‘o’. But Little-o sounds like a midway game or an Eyerly Aircraft Company amusement park ride. And I never achieve consistency in my prose anyway. Maybe for the book publication. Until I’m convinced another is better, though, “Little-O” it is.
Big-O and Little-O Notation.
When I first went to college I had a campus post office box. I knew my box number. I also knew the length of the sluggish line for the combination lock code. The lock was a dial, lettered A through J. Being a young STEM-class idiot I thought, boy, would it actually be quicker to pick the lock than wait for the line? A three-letter combination, of ten options? That’s 1,000 possibilities. If I could try five a minute that’s, at worst, three hours 20 minutes. The combination might be anywhere in that set; I might get lucky. On average I’d get through about half the set first, so I could expect to spend about 100 minutes picking my lock.
I decided to wait in line instead, and good that I did. I was unaware that an entry in the combination might not be a letter, like ‘A’. It could be the midway point between adjacent letters, like ‘AB’. That made twenty options for each entry, so there were eight times as many combinations as I estimated, and I could expect to spend over ten hours. Even the slow line was faster than that. It transpired that my combination had two of these midway letters.
But that’s a little demonstration of algorithmic complexity. It turns up, too, in cracking passwords by trial-and-error. Doubling the number of possible symbols at each position of a three-symbol combination octuples the number of codes, and so the time it takes to try them all. Making the combination longer would also work; each extra letter would multiply the cracking time by twenty. So you understand why your password should include “special characters” like punctuation, but most of all should be long.
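The lock arithmetic is easy to check by machine. This is a small sketch, assuming every code is equally likely and that I really could keep up five tries a minute; the function names are made up for the illustration.

```python
def combination_count(symbols, length):
    """Number of distinct codes: one of `symbols` choices at each of `length` positions."""
    return symbols ** length


def minutes_to_try(codes, tries_per_minute=5.0):
    """Return (worst-case, average) minutes to find one code by trial and error."""
    worst = codes / tries_per_minute
    # With every code equally likely, the right one turns up about halfway through.
    average = (codes + 1) / 2 / tries_per_minute
    return worst, average


letters_only = combination_count(10, 3)       # dial positions A through J
with_midpoints = combination_count(20, 3)     # A, AB, B, BC, and so on
one_entry_longer = combination_count(20, 4)   # same dial, four entries

print(letters_only, minutes_to_try(letters_only))
print(with_midpoints // letters_only)         # doubling the dial: 8 times the codes
print(one_entry_longer // with_midpoints)     # one more entry: 20 times the codes
```

The point to notice is that the length is an exponent. Widening the dial multiplies the work by a fixed factor; lengthening the code multiplies it again with every entry added.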
We’re often interested in how long to expect a task to take. Sometimes we’re interested in the typical time it takes. Often we’re interested in the longest it could ever take. If we have a deterministic algorithm, we can say. We can count how many steps it takes. Sometimes this is easy. If we want to add two two-digit numbers together we know: it will be, at most, three single-digit additions plus, maybe, writing down a carry. (To add 98 and 37 is adding 8 + 7 to get 15, to add 9 + 3 to get 12, and to take the carry from the 15, so, 1 + 12 to get 13, so we have 135.) We can get a good quarrel going about what “a single step” is. We can argue whether that carry into the hundreds column is really one more addition. But we can agree that there is some smallest bit of arithmetic work, and work from that.
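Counting steps can be made concrete. Here is a sketch of schoolbook addition that tallies single-digit additions as it goes. It assumes that folding a carry into a column counts as one more addition and that writing down a final carry does not; that is exactly the sort of thing we could quarrel over. The function name is my own invention.

```python
def add_with_step_count(a, b):
    """Schoolbook addition of two non-negative integers.

    Returns (sum, steps), where a step is one single-digit addition.
    Adding a carry into a column is counted as its own step (an assumption).
    """
    digits_a = [int(d) for d in reversed(str(a))]
    digits_b = [int(d) for d in reversed(str(b))]
    n = max(len(digits_a), len(digits_b))
    digits_a += [0] * (n - len(digits_a))
    digits_b += [0] * (n - len(digits_b))

    result, carry, steps = [], 0, 0
    for da, db in zip(digits_a, digits_b):
        column = da + db
        steps += 1
        if carry:
            column += carry
            steps += 1
        result.append(column % 10)
        carry = column // 10
    if carry:
        result.append(carry)  # writing down the last carry; not counted as an addition
    total = int("".join(str(d) for d in reversed(result)))
    return total, steps


print(add_with_step_count(98, 37))
```

Under this way of counting, two n-digit numbers never need more than 2n - 1 steps, so the work grows in proportion to the number of digits.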
For any algorithm we have something that describes how big a thing we’re working on. It’s often ‘n’. If we need more than one variable to describe how big it is, ‘m’ gets called up next. If we’re estimating how long it takes to work on a number, ‘n’ is the number of digits in the number. If we’re thinking about a square matrix, ‘n’ is the number of rows and columns. If it’s a not-square matrix, then ‘n’ is the number of rows and ‘m’ the number of columns. Or vice-versa; it’s your matrix. If we’re looking for an item in a list, ‘n’ is the number of items in the list. If we’re looking to evaluate a polynomial, ‘n’ is the order of the polynomial.
In normal circumstances we don’t work out how many steps some operation does take. It’s more useful to know that multiplying these two long numbers would take about 900 steps than that it would need only 816. And so this gives us an asymptotic estimate. We get an estimate of how much longer cracking the combination lock will take if there’s more letters to pick from. This allowing that some poor soul will get the combination A-B-C.
There are a couple ways to describe how long this will take. The more common is the Big-O. This is just the letter, like you find between N and P. Since that’s easy, many have taken to using a fancy, vaguely cursive O, one that looks like $\mathcal{O}$. I agree it looks nice. Particularly, though, we write $\mathcal{O}(f(n))$, where f is some function. In practice, we’ll see functions like $\mathcal{O}(n)$ or $\mathcal{O}(n \log n)$ or $\mathcal{O}(n^2)$. Usually something simple like that. It can be tricky. There’s a scheme for multiplying large numbers together that’s $\mathcal{O}(n \log n \log \log n)$. What you will not see is something like $\mathcal{O}(\sin(n))$, or $\mathcal{O}(n^3 - n^4)$ or such. This comes to what we mean by the Big-O.
It’ll be convenient for me to have a name for the actual number of steps the algorithm takes. Let me call the function describing that g(n). Then g(n) is $\mathcal{O}(f(n))$ if once n gets big enough, g(n) is always less than C times f(n). Here C is some constant number. Could be 1. Could be 1,000,000. Could be 0.00001. Doesn’t matter; it’s some positive number.
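There’s some neat tricks to play here. For example, the function ‘$3n + 4$’ is $\mathcal{O}(n)$. It’s also $\mathcal{O}(n^2)$ and $\mathcal{O}(n^9)$ and $\mathcal{O}(e^n)$. The function ‘$n^2 + 1$’ is also $\mathcal{O}(n^2)$ and those later terms, but it is not $\mathcal{O}(n)$. And you can see why $\mathcal{O}(\sin(n))$ is right out.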
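A computer can’t prove a Big-O bound, since that’s a claim about every large n. It can spot-check one, though. This sketch takes 3n + 4 as a made-up step count and tests whether it stays under C times f(n) over a finite range; the helper name and the cutoffs are my own choices.

```python
def bounded_from(g, f, C, n0, n_max=10_000):
    """True if g(n) <= C * f(n) for every integer n with n0 <= n <= n_max.

    A finite check like this can suggest a Big-O bound, never prove one.
    """
    return all(g(n) <= C * f(n) for n in range(n0, n_max + 1))


def g(n):
    return 3 * n + 4  # hypothetical step count, for illustration only


print(bounded_from(g, lambda n: n, C=4, n0=4))       # 3n + 4 <= 4n once n >= 4
print(bounded_from(g, lambda n: n * n, C=1, n0=4))   # n^2 is an upper bound too
print(bounded_from(lambda n: n * n, lambda n: n, C=1000, n0=1))  # n^2 escapes 1000n
```

The last check fails because n squared overtakes 1000 times n as soon as n passes 1000. Pick a bigger C and it only fails later.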
There is also a Little-O notation. It, too, is an upper bound on the function. But it is a stricter bound, setting tighter restrictions on what g(n) is like. You ask how it is the stricter bound gets the minuscule letter. That is a fine question. I think it’s a quirk of history. Both symbols come to us through number theory. Big-O was developed first, published in 1894 by Paul Bachmann. Little-O was published in 1909 by Edmund Landau. Yes, the one with the short Hilbert-like list of number theory problems. In 1914 G H Hardy and John Edensor Littlewood would work on another measure and they used Ω to express it. (If you see the letter used for Big-O and Little-O as the Greek omicron, then you see why a related concept got called omega.)
What makes the Little-O measure different is its sternness. g(n) is $o(f(n))$ if, for every positive number C, whenever n is large enough g(n) is less than or equal to C times f(n). I know that sounds almost the same. Here’s why it’s not.
If g(n) is $\mathcal{O}(f(n))$, then you can go ahead and pick a C and find that, eventually, $g(n) \le C f(n)$. If g(n) is $o(f(n))$, then I, trying to sabotage you, can go ahead and pick a C, trying my best to spoil your bounds. But I will fail. Even if I pick, like, a C of one millionth of a billionth of a trillionth, eventually f(n) will be so big that $g(n) \le C f(n)$. I can’t find a C small enough that f(n) doesn’t eventually outgrow it, and outgrow g(n).
This implies some odd-looking stuff. Like, that the function n is not $o(n)$. But the function n is at least $o(n^2)$, and $o(n^9)$ and those other fun variations. Being Little-O compels you to be Big-O. Big-O is not compelled to be Little-O, although it can happen.
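When f(n) is positive, one handy way to think of Little-O is that the ratio g(n)/f(n) shrinks toward zero as n grows. That’s the same as saying no positive C, however small, stays below the ratio forever. A rough numerical look, with sample values of n that I picked arbitrarily:

```python
def ratios(g, f, ns=(10, 1_000, 100_000, 10_000_000)):
    """g(n) / f(n) at a few growing values of n."""
    return [g(n) / f(n) for n in ns]


n_over_n_squared = ratios(lambda n: n, lambda n: n * n)
n_over_n = ratios(lambda n: n, lambda n: n)

print(n_over_n_squared)  # keeps shrinking: consistent with n being Little-O of n^2
print(n_over_n)          # stuck at 1: n is Big-O of n, but not Little-O of n
```

A ratio stuck at 1 is bounded, which is all Big-O asks. It never drops below, say, C = 0.5, which is what Little-O demands.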
These definitions, for Big-O and Little-O, I’ve laid out from algorithmic complexity. It’s implicitly about functions defined on the counting numbers. But there’s no reason I have to limit the ideas to that. I could define similar ideas for a function g(x), with domain the real numbers, and come up with an idea of being on the order of f(x).
We make some adjustments to this. The important one is that, with algorithmic complexity, we assumed g(n) had to be a positive number. What would it even mean for something to take minus four steps to complete? But a regular old function might be zero or negative or change between negative and positive. So we look at the absolute value of g(x). Is there some value of C so that, when x is big enough, the absolute value of g(x) stays less than C times f(x)? If there is, then g(x) is $\mathcal{O}(f(x))$. Is it the case that for every positive number C it’s true that the absolute value of g(x) is less than C times f(x), once x is big enough? Then g(x) is $o(f(x))$.
Fine, but why bother defining this?
A compelling answer is that it gives us a way to describe how different a function is from an approximation to that function. We are always looking for approximations to functions because most functions are hard. We have a small set of functions we like to work with. Polynomials are great numerically. Exponentials and trig functions are great analytically. That’s about all the functions that are easy to work with. Big-O notation particularly lets us estimate how bad an error we make using the approximation.
For example, the Runge-Kutta method numerically approximates solutions to ordinary differential equations. It does this by taking the information we have about the function at some point x to approximate its value at a point x + h. ‘h’ is some number. For the classic fourth-order version, the difference between the actual answer and the Runge-Kutta approximation is $\mathcal{O}(h^5)$ for a single step, and $\mathcal{O}(h^4)$ once the errors of many steps accumulate. We use this knowledge to make sure our error is tolerable. Also, we don’t usually care what the function is at x + h. It’s just what we can calculate. What we want is the function at some point a fair bit away from x, call it x + L. So we use our approximate knowledge of conditions at x + h to approximate the function at x + 2h. And use x + 2h to tell us about x + 3h, and from that x + 4h and so on, until we get to x + L. We’d like to have as few of these uninteresting intermediate points as we can, so look for as big an h as is safe.
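Here is a sketch of what that error estimate buys us. It assumes the classic fourth-order Runge-Kutta scheme, which is my choice for the illustration, and a test equation y' = y whose exact answer we know. If the accumulated error behaves like h to the fourth power, halving h should cut the error by a factor near sixteen.

```python
import math


def rk4_step(f, x, y, h):
    """One classic fourth-order Runge-Kutta step for y' = f(x, y)."""
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def rk4_solve(f, x0, y0, L, steps):
    """March from x0 to x0 + L in `steps` equal steps of size h = L / steps."""
    h = L / steps
    x, y = x0, y0
    for _ in range(steps):
        y = rk4_step(f, x, y, h)
        x += h
    return y


def growth(x, y):
    return y  # test equation y' = y, with y(0) = 1, so y(1) = e


err_coarse = abs(rk4_solve(growth, 0.0, 1.0, 1.0, 10) - math.e)  # h = 0.1
err_fine = abs(rk4_solve(growth, 0.0, 1.0, 1.0, 20) - math.e)    # h = 0.05
error_ratio = err_coarse / err_fine

print(err_coarse, err_fine, error_ratio)  # ratio should land near 2**4 = 16
```

This is the practical use of the notation. We may not know the constant C, but knowing the power of h tells us how much accuracy each halving of the step buys, and so how big an h we can get away with.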
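That context may be the more common one. We see it, particularly, in Taylor Series and other polynomial approximations. For example, the sine of a number is approximately:
$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} + \mathcal{O}(x^{11})$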
This has consequences. It tells us, for example, that if x is about 0.1, this approximation is probably pretty good. So it is: the sine of 0.1 (radians) is about 0.0998334166468282 and that’s exactly what five terms here gives us. But it also warns that if x is about 10, this approximation may be gibberish. And so it is: the sine of 10.0 is about -0.5440 and the polynomial is about 1448.27.
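These numbers are easy to reproduce. The sketch below sums the first five nonzero terms of the Taylor series for the sine about zero and compares the result with the library sine; the function name is made up for the purpose.

```python
import math


def sine_polynomial(x, terms=5):
    """Sum of the first `terms` nonzero terms of the Taylor series of sine about 0."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(terms))


for x in (0.1, 10.0):
    print(x, math.sin(x), sine_polynomial(x))
```

The first term left out is x to the eleventh power over 11 factorial. At x = 0.1 that’s far too small to notice. At x = 10 it’s in the thousands, and the polynomial has nothing to do with the sine any more.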
The connotation in using Big-O notation here is that we look for small h’s, and for the Big-O term to be a tiny number. It seems odd to use the same notation with a large independent variable and with a small one. The concept carries over, though, and helps us talk efficiently about this different problem.
For the 2018 A-to-Z I spent some time talking about a big piece of thermodynamics. Anyone taking a statistical mechanics course learns about the Nearest Neighbor Model. It’s a way of handling big systems of things that all interact. This is really hard to do. But if you make the assumption that the nearest pairs are the most important ones, and everything else is sort of a correction or meaningless noise? You get … a problem that’s easier to simulate on a computer. It’s not necessarily easier to solve. But it’s a good starting point for a lot of systems.
The restaurant I was thinking of, when I wrote this, was Woody’s Oasis, which had been kicked out of East Lansing as part of the stage in gentrification where all the good stuff gets the rent raised out from under it, and you get chain restaurants instead. They had a really good vegetarian … thing … called smead, that we guess was some kind of cracked-wheat sandwich filling. No idea what it was. There are other Woody’s Oasises in the area, somehow all different, and before the pandemic we kept figuring we’d go and see if they had smead, sometime.
I am hosting, later this month, the 141st installment of Denise Gaskins’s Playful Math Education Blog Carnival. If you’ve seen recently any mathematics piece — a blog, a YouTube video, a magazine article — that you found educational or enlightening or just fun, please, share it with me in comments so I can share it with the wider world.
The Summer 2015 A-to-Z was the first I’d done. Its essays tended to be shorter and narrower in focus than what I write these days. But another feature is that they tended to be more practical, like, something that you could use to read a mathematics paper with better understanding. N-tuple is an example of this. N-tuples are ordered bunches of numbers, and turn up in many places. They’re not quite vectors and matrices. But the ordinary use of vectors and matrices we represent with n-tuples.