## My All 2020 Mathematics A to Z: Permutation

Laura, the author of MathSux2, offered this week’s A-to-Z term. (I apologize for it being late but the Playful Math Education Blog Carnival 141 work took a lot out of me.) She writes the blog weekly, and hosts a YouTube channel of mathematics videos also. I’m glad to have the topic to discuss.

# Permutation.

We learn to count permutations before we know what they are. There are good reasons to. Counting permutations gives us numbers that are big, and therefore interesting, fast. Counting is easy to motivate. Humans like counting. Counting is useful. Many probability questions are best answered by counting all the ways to arrange things, and how many of those arrangements are desirable somehow.

The count of permutations asks how many ways there are to put some things in order. If some of the things are identical, the number is smaller. Calculating the count may be a little tedious, but it’s not hard. We calculate, rather than “really” count, because — well, list all the possible ways to arrange the letters of the word ‘DEMONSTRATION’. I bet you turn that listing over to a computer too. But what is the computer counting?

If we’re trying to do this efficiently we have some system. Start with ‘DEMONSTRATION’. Then, say, swap the last two letters: ‘DEMONSTRATINO’. Then, mm, move the ‘N’ to the antepenultimate position: ‘DEMONSTRATNIO’. Then, oh, swap the last two letters again: ‘DEMONSTRATNOI’.

Then, oh, move the ‘N’ to the fourth-to-the-last position: ‘DEMONSTRANTIO’. What next? Oh, swap the last two letters again: ‘DEMONSTRANTOI’. Or, move what had been the last letter to the antepenultimate position: ‘DEMONSTRANOTI’. And swap the last two letters once more: ‘DEMONSTRANOIT’.

Enough of that, you and my spellchecker say. I agree. What is it that all this is doing? What does that tell us about what a permutation is?

An obvious thing. Each new variation of the order came from swapping two letters of an earlier one. We needed a sequence of swaps to get to ‘DEMONSTRANOIT’. But each swap was of only two things. It’s a good thing to observe.

Another obvious thing. There’s no letters in ‘DEMONSTRANOIT’ or any of the other variations that weren’t in ‘DEMONSTRATION’. All that’s changed is the order.

This all has listed eight permutations, counting the original ‘DEMONSTRATION’ as one. There are, calculations tell me, 778,377,592 to go.

Would the number of permutations be different if we were shuffling around different things? If instead of the letters in the word ‘DEMONSTRATION’ it were, say, the numerals in the sequence ‘1234567897045’? Or the sequence of symbols ‘!@#$%^&*(&)$%’ instead? No, and that it would not is another clue about what permutations are.
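Here is a minimal sketch of that calculation in Python. It assumes the usual formula for arrangements with repeats: the factorial of the length, divided by the factorial of each symbol's repeat count. The function name `count_arrangements` is made up for illustration.

```python
from collections import Counter
from math import factorial


def count_arrangements(symbols: str) -> int:
    """Count the distinct orderings of a sequence that may repeat symbols."""
    total = factorial(len(symbols))
    # A symbol that appears k times makes k! of the orderings look identical.
    for repeats in Counter(symbols).values():
        total //= factorial(repeats)
    return total


for sequence in ("DEMONSTRATION", "1234567897045", "!@#$%^&*(&)$%"):
    print(sequence, count_arrangements(sequence))
```

All three sequences have thirteen symbols with the same pattern of repeats, so all three print 778377600. That is the eight arrangements already listed plus the 778,377,592 still to go.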

Another thing, obvious in retrospect. Grant that we’ve been making new permutations by taking a sequence of letters (numerals, symbols) and swapping a pair. We got from ‘DEMONSTRATION’ to ‘DEMONSTRATINO’ by swapping the last two letters. What happens if we swap the last two letters again? We get ‘DEMONSTRATION’, a sequence of letters all right, although one already on our list of permutations.

One more thing, obvious once you’ve seen it. Imagine we had not started with ‘DEMONSTRATION’ but instead ‘DEMONSTRATNIO’. But that we followed the same sequences of swappings. Would we have come up with different permutations? … At least for the first couple permutations? Or would they be the same permutations, listed in a different order?

You’ve been kind, letting me call these things “permutations” before I say what a permutation is. It’s relied on a casual, intuitive idea of a permutation. It’s a shuffling around of some set of things. This is the casual idea that mathematicians rely on for a permutation. Sure we can make the idea precise. How hard will that be?

It’s not hard in form. The permutation is the rearranging of things into a new order. The hard part is the concept. It’s not “these symbols in this order” that’s the permutation. It’s the act of putting them in this new order that is. So it’s “swap the 12th and the 13th symbols”. Or, “move the 13th symbol to 11th place, the 11th symbol to 12th, and the 12th symbol to 13th place”.

We can describe each permutation as a function. All the permutation functions have the same domain and the same range. And the range is the domain. The function is a bijection. Every item in the domain matches exactly one item in the range, and vice-versa. There’s some sequence for the elements in the domain. And the rule for the function describes how that sequence changes.

So one permutation is “swap the 12th and the 13th elements”. Another permutation is “swap the 11th and the 12th elements”. Since the range of one function is the domain of another, we can compose them together. That is, we can “swap the 12th and the 13th elements, and then swap the 11th and the 12th elements”. This gets us another permutation. The effect of these two permutations, in this order, is “make the 13th element the 11th, make the 11th element the 12th, and make the 12th element the 13th”. The order we do these permutations in counts. “Swap the 11th and the 12th elements, and then swap the 12th and the 13th” gets us a different net effect. That one is “make the 12th element the 11th, make the 13th element the 12th, and make the 11th element the 13th”. Composition of functions does not commute.
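To see the two orders disagree, here is a small Python sketch. The helper names `swap` and `then` are invented for this example; each permutation is modeled as a function that takes a list and returns a rearranged copy.

```python
def swap(i: int, j: int):
    """Return a function that swaps the i-th and j-th items (counting from 1)."""
    def apply(items):
        result = list(items)
        result[i - 1], result[j - 1] = result[j - 1], result[i - 1]
        return result
    return apply


def then(first, second):
    """Compose two rearrangements: do `first`, then do `second`."""
    return lambda items: second(first(items))


word = "DEMONSTRATION"
swap_12_13 = swap(12, 13)
swap_11_12 = swap(11, 12)

one_way = "".join(then(swap_12_13, swap_11_12)(word))
other_way = "".join(then(swap_11_12, swap_12_13)(word))
print(one_way)    # DEMONSTRATNIO
print(other_way)  # DEMONSTRATONI
```

Same two swaps, different order, different result.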

That functions compose is normal enough. That their composition doesn’t commute is normal enough too. These functions are a bit odd in that we don’t care what the domain-and-range is. We only care that we can index the elements in it. That leads us to some new observations.

The big one is that the set of all these permutations is a group. I mean the way mathematicians mean group. That is, we have a set of items. These are the functions, the permutations. The instructions, like, “make the 12th element the 11th and the 13th element the 12th”, or “the 12th element the 13th”. We also need a group operation, a thing that works like addition does for real numbers. That’s composition. That is, doing one permutation and then the other, to get a new permutation out of it. That new permutation is itself one of the permutations we’d had. We can’t compose permutations and get something that’s not a permutation. No amount of swapping around the letters of ‘DEMONSTRATION’ will get us ‘DEMONSTRATIONERS’.

When we talk about how permutations as a group work, we want to give individual permutations names. That ends up being letters. These are often Greek letters. I don’t know why we can’t use the ordinary Latin alphabet. I suppose someone who liked Greek letters wrote a really good textbook and everyone copies that. So instead of speaking about x and y, we’ll get α and β. Sometimes σ and τ. Or, quite often π, especially if we need a bunch of permutations. Then we get $\pi_1$, $\pi_2$, $\pi_3$, and so on. $\pi_j$. All the way to $\pi_N$. For the young mathematics major it might be the first time seeing π used for something not at all circle-related. It’s a weird sensation. Still, αβ is the composition of permutation α with permutation β. This means, do permutation β first, and then permutation α on whatever that result is. This is the same way that f(g(x)) means “evaluate g(x) first, and then figure out what f( that ) is”.

That’s all fine for naming them. But we would also like a good way to describe what a permutation does. There are several good forms. They all rely on indexing the elements, using the counting numbers: 1, 2, 3, 4, and so on. The notation I’ll share is called cycle notation. It’s easy to type. You write it within nice ordinary parentheses: (11 12) means “put the 11th element in slot 12, and the 12th element in slot 11”. (11, 12, 13) means “put the 11th element in slot 12, the 12th element in slot 13, and the 13th element in slot 11”. You can even chain these together: (10, 11)(12, 13) means “put the 10th element in slot 11 and the 11th element in slot 10; also, put the 12th element in slot 13, and the 13th element in slot 12”.

In that notation, writing (9), for example, means “put the 9th element in slot 9”. Or if you prefer, “leave element 9 alone”. Or we don’t mention it at all. The convention is that if something isn’t mentioned, leave it where it is.
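A short Python sketch of reading cycle notation this way. It assumes the cycles handed to it are disjoint (no slot appears in two cycles), and the name `apply_cycles` is made up for the example.

```python
def apply_cycles(cycles, items):
    """Apply disjoint cycles: in (a b c), item a goes to slot b, b to c, c to a."""
    result = list(items)  # slots that are never mentioned stay where they are
    for cycle in cycles:
        for position, source in enumerate(cycle):
            target = cycle[(position + 1) % len(cycle)]
            # Read from the untouched input so earlier moves don't interfere.
            result[target - 1] = items[source - 1]
    return "".join(result)


word = "DEMONSTRATION"
print(apply_cycles([(12, 13)], word))            # DEMONSTRATINO
print(apply_cycles([(11, 12, 13)], word))        # DEMONSTRATNIO
print(apply_cycles([(10, 11), (12, 13)], word))  # DEMONSTRAITNO
print(apply_cycles([(9,)], word))                # DEMONSTRATION
```

The first two lines reproduce the first two rearrangements from the top of the essay.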

This by the way is where we get the identity element. The permutation (1)(2)(3)(4)(etc) doesn’t actually swap anything. It counts as a permutation. Doing this is the equivalent of adding zero to a number.

This cycle notation makes it not hard to figure out the composition of permutations. What does (1 2)(1 3) do? Well, the (1 3) swaps the first and the third items. The (1 2), next, swaps what’s become the first and the second items. The net result puts the first item in slot 3, the third item in slot 2, and the second item in slot 1. So the effect is the same as the permutation (1 3 2). You can get pretty good at this sort of manipulation, in time.

You may also consider: if (1 2)(1 3) is the same as (1 3 2), then isn’t (1 3 2) the same as (1 2)(1 3)? Sure. But, like, can we write a longer permutation, like, (1 3 5 2 4), as the product of some smaller permutations? And we can. If it’s convenient, we can write it as a string of swaps, exchanging pairs of elements. This was the first “obvious” thing I had listed. A long enough chain of pairwise swaps will, in time, swap everything.
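One way to break a cycle into swaps, sketched in Python. This uses one standard decomposition (always swap with the first slot of the cycle); it is not the only one, and the helper names are invented here. The swaps are listed in the order they are performed, so in the right-to-left written convention above the product would read (1 4)(1 2)(1 5)(1 3).

```python
def cycle_to_swaps(cycle):
    """Split (a1 a2 ... ak) into swaps, listed in the order they are performed."""
    first = cycle[0]
    return [(first, other) for other in cycle[1:]]


def apply_swaps(swaps, items):
    """Perform each swap of two slots in turn (slots count from 1)."""
    result = list(items)
    for i, j in swaps:
        result[i - 1], result[j - 1] = result[j - 1], result[i - 1]
    return "".join(result)


def apply_cycle(cycle, items):
    """Apply one cycle directly: in (a b c), item a goes to slot b, and so on."""
    result = list(items)
    for position, source in enumerate(cycle):
        target = cycle[(position + 1) % len(cycle)]
        result[target - 1] = items[source - 1]
    return "".join(result)


cycle = (1, 3, 5, 2, 4)
swaps = cycle_to_swaps(cycle)
print(swaps)                        # [(1, 3), (1, 5), (1, 2), (1, 4)]
print(apply_swaps(swaps, "ABCDE"))  # DEABC
print(apply_cycle(cycle, "ABCDE"))  # DEABC
```

Four swaps do the work of the five-element cycle.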

We call the group made of all these permutations the Symmetric Group of the set. Since it doesn’t matter what the underlying set is, just the number of elements in it, we can abbreviate this with the number of elements. $S_2$. $S_4$. $S_N$. Symmetric Groups are among the first groups you meet in abstract algebra that aren’t, like, integers modulo 12 or symmetries of a triangle. It’s novel enough to be interesting and to not be completely sure you’re doing it right.

You never leave the Symmetric Group, though, not if you stay in algebra. It has powerful consequences. It ties, for example, into the roots of polynomials. The structure of $S_5$ tells us there must exist fifth-degree polynomials we can’t solve by ordinary arithmetic and root operations. That is, there’s no version of the quadratic formula for high-order polynomials, and never can be.

There are more groups to build from permutations. The next one that you meet in Intro to Abstract Algebra is the Alternating Group. It’s made of only the even permutations. Those are the permutations made from an even number of swaps. (There are also odd permutations, which are what you imagine. They can’t make a group, though. No identity element.) They’re great for recapturing dread and uncertainty once you think you’ve got a handle on the Symmetric Group.
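A brute-force check of those claims for four elements, sketched in Python. It leans on a standard fact: a permutation is even exactly when it has an even number of inversions (pairs that are out of order), which is easier to count than swaps. The function names are made up for illustration.

```python
from itertools import permutations


def is_even(perm) -> bool:
    """True when the arrangement has an even number of out-of-order pairs."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return inversions % 2 == 0


def compose(outer, inner):
    """Do `inner` first, then `outer`; a permutation sends index i to perm[i]."""
    return tuple(outer[i] for i in inner)


everything = set(permutations(range(4)))
evens = {p for p in everything if is_even(p)}
odds = everything - evens
identity = (0, 1, 2, 3)

print(len(everything), len(evens), len(odds))  # 24 12 12
print(identity in evens, identity in odds)     # True False
print(all(compose(a, b) in evens for a in evens for b in evens))  # True
print(all(compose(a, b) in evens for a in odds for b in odds))    # True
```

The last line shows the other reason the odd permutations fail as a group: composing two of them lands among the even ones.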

They lead to other groups too, and even rings. The Levi-Civita symbol describes whether a set of indices gives an even or an odd permutation (or neither). It makes life easier when we work on determinants and tensors and Jacobians. These tie in to the geometry of space, and how that affects physics. It also gets a supporting role in cross products. There are many cryptography schemes that have permutations at their core.

So this is a bit of what permutations are, and what they can get us.

Today’s and all the other 2020 A-to-Z essays should be at this link. Both the All-2020 and past A-to-Z essays should be at this link. Thanks for reading.

## My All 2020 Mathematics A to Z: Michael Atiyah

To start this year’s great glossary project Mr Wu, author of the MathTuition88.com blog, had a great suggestion: The Atiyah-Singer Index Theorem. It’s an important and spectacular piece of work. I’ll explain why I’m not doing that in a few sentences.

Mr Wu pointed out that a biography of Michael Atiyah, one of the authors of this theorem, might be worth doing. GoldenOj endorsed the biography idea, and the more I thought it over the more I liked it. I’m not able to do a true biography, something that goes to primary sources and finds a convincing story of a life. But I can sketch out a bit, exploring his work and why it’s of note.

# Michael Atiyah.

Theodore Frankel’s The Geometry of Physics: An Introduction is a wonderful book. It’s 686 pages, including the index. It all explores how our modern understanding of physics is our modern understanding of geometry. On page 465 it offers this:

The Atiyah-Singer index theorem must be considered a high point of geometrical analysis of the twentieth century, but is far too complicated to be considered in this book.

I know when I’m licked. Let me attempt to look at one of the people behind this theorem instead.

The Riemann Hypothesis is about where to find the roots of a particular infinite series. It’s been out there, waiting for a solution, for a century and a half. There are many interesting results which we would know to be true if the Riemann Hypothesis is true. In 2018, Michael Atiyah declared that he had a proof. And, more, an amazing proof, a short proof. Albeit one that depended on a great deal of background work and careful definitions. The mathematical community was skeptical. It still is. But it did not dismiss outright the idea that he had a solution. It was plausible that Atiyah might solve one of the greatest problems of mathematics in something that fits on a few PowerPoint slides.

So think of a person who commands such respect.

His proof of the Riemann Hypothesis, as best I understand, is not generally accepted. For example, it includes the fine structure constant. This comes from physics. It describes how strongly electrons and photons interact. The most compelling (to us) consequence of the Riemann Hypothesis is in how prime numbers are distributed among the integers. It’s hard to think how photons and prime numbers could relate. But, then, if humans had done all of mathematics without noticing geometry, we would still know there is something interesting about π. Differential equations, if nothing else, would turn up this number. We happened to discover π in the real world first too. If it were not familiar for so long, would we think there should be any commonality between differential equations and circles?

I do not mean to say Atiyah is right and his critics wrong. I’m no judge of the matter at all. What is interesting is that one could imagine a link between a pure number-theory matter like the Riemann hypothesis and a physical matter like the fine structure constant. It’s not surprising that mathematicians should be interested in physics, or vice-versa. Atiyah’s work was particularly important. Much of his work, from the late 70s through the 80s, was in gauge theory. This subject lies under much of modern quantum mechanics. It’s born of the recognition of symmetries, group operations that you can do on a field, such as the electromagnetic field.

In a sequence of papers Atiyah, with other authors, sorted out particular cases of how magnetic monopoles and instantons behave. Magnetic monopoles may sound familiar, even though no one has ever seen one. These are magnetic points, an isolated north or a south pole without its opposite partner. We can understand well how they would act without worrying about whether they exist. Instantons are more esoteric; I don’t remember encountering the term before starting my reading for this essay. I believe I did, encountering the technique as a way to describe the transitions between one quantum state and another. Perhaps the name failed to stick. I can see where there are few examples you could give an undergraduate physics major. And it turns out that monopoles appear as solutions to some problems involving instantons.

This was, for Atiyah, later work. It arose, in part, from bringing the tools of index theory to nonlinear partial differential equations. This index theory is the thing that got us the Atiyah-Singer Index Theorem too complicated to explain in 686 pages. Index theory, here, studies questions like “what can we know about a differential equation without solving it?” Solving a differential equation would tell us almost everything we’d like to know, yes. But it’s also quite hard. Index theory can tell us useful things like: is there a solution? Is there more than one? How many? And it does this through topological invariants. A topological invariant is a trait like, for example, the number of holes that go through a solid object. These things are indifferent to operations like moving the object, or rotating it, or reflecting it. In the language of group theory, they are invariant under a symmetry.

It’s startling to think a question like “is there a solution to this differential equation” has connections to what we know about shapes. This shows some of the power of recasting problems as geometry questions. From the late 50s through the mid-70s, Atiyah was a key person working in a topic that is about shapes. We know it as K-theory. The “K” from the German Klasse, here. It’s about groups, in the abstract-algebra sense; the things in the groups are themselves classes of isomorphisms. Michael Atiyah and Friedrich Hirzebruch defined this sort of group for a topological space in 1959. And this gave definition to topological K-theory. This is again abstract stuff. Frankel’s book doesn’t even mention it. It explores what we can know about shapes from the tangents to the shapes.

And it leads into cobordism, also called bordism. This is about what you can know about shapes which could be represented as cross-sections of a higher-dimension shape. The iconic, and delightfully named, shape here is the pair of pants. In three dimensions this shape is a simple cartoon of what it’s named. On the one end, it’s a circle. On the other end, it’s two circles. In between, it’s a continuous surface. Imagine the cross-sections, how on separate layers the two circles are closer together. How their shapes distort from a real circle. In one cross-section they come together. They appear as two circles joined at a point. In another, they’re a two-looped figure. In another, a smoother circle. Knowing that Atiyah came from these questions may make his future work seem more motivated.

But how does one come to think of the mathematics of imaginary pants? Many ways. Atiyah’s path came from his first research specialty, which was algebraic geometry. This was his work through much of the 1950s. Algebraic geometry is about the kinds of geometric problems you get from studying algebra problems. Algebra here means the abstract stuff, although it does touch on the algebra from high school. You might, for example, do work on the roots of a polynomial, or a comfortable enough equation like $x^2 + y^2 = 1$. Atiyah had started — as an undergraduate — working on projective geometries. This is what one curve looks like projected onto a different surface. This moved into elliptic curves and into particular kinds of transformations on surfaces. And algebraic geometry has proved important in number theory. You might remember that the Wiles-Taylor proof of Fermat’s Last Theorem depended on elliptic curves. Some work on the Riemann hypothesis is built on algebraic topology.

(I would like to trace things farther back. But the public record of Atiyah’s work doesn’t offer hints. I can find amusing notes like his father asserting he knew he’d be a mathematician. He was quite good at changing local currency into foreign currency, making a profit on the deal.)

It’s possible to imagine this clear line in Atiyah’s career, and why his last works might have been on the Riemann hypothesis. That’s too pat an assertion. The more interesting thing is that Atiyah had several recognizable phases and did iconic work in each of them. There is a cliche that mathematicians do their best work before they are 40 years old. And, it happens, Atiyah did earn a Fields Medal, given to mathematicians for the work done before they are 40 years old. But I believe this cliche represents a misreading of biographies. I suspect that first-rate work is done when a well-prepared mind looks fresh at a new problem. A mathematician is likely to have these traits line up early in the career. Grad school demands the deep focus on a particular problem. Getting out of grad school lets one bring this deep knowledge to fresh questions.

It is easy, in a career, to keep studying problems one has already had great success in, for good reason and with good results. It tends not to keep producing revolutionary results. Atiyah was able — by chance or design I can’t tell — to several times venture into a new field. The new field was one that his earlier work prepared him for, yes. But it posed new questions about novel topics. And this creative, well-trained mind focusing on new questions produced great work. And this is one way to be credible when one announces a proof of the Riemann hypothesis.

Here is something I could not find a clear way to fit into this essay. Atiyah recorded some comments about his life for the Web of Stories site. These are biographical and do not get into his mathematics at all. Much of it is about his life as the child of British and Lebanese parents and how that affected his schooling. One that stood out to me was about his peers at Manchester Grammar School, several of whom he rated as better students than he was. Being a good student is not tightly related to being a successful academic. Particularly as so much of a career depends on chance, on opportunities happening to be open when one is ready to take them. It would be remarkable if there were three people of greater talent than Atiyah who happened to be in the same school at the same time. It’s not unthinkable, though, and we may wonder what we can do to give people the chance to do what they are good in. (I admit this assumes that one finds doing what one is good in particularly satisfying or fulfilling.) In looking at any remarkable talent it’s fair to ask how much of their exceptional nature is that they had a chance to excel.

## My 2019 Mathematics A To Z: Relatively Prime

I have another subject nominated by goldenoj today. And it even lets me get into number theory, the field of mathematics questions that everybody understands and nobody can prove.

# Relatively Prime.

I was once a young grad student working as a teaching assistant and unaware of the principles of student privacy. Near the end of semesters I would e-mail students their grades. This so they could correct any mistakes and know what they’d have to get on the finals. I was learning Perl, which was an acceptable pastime in the 1990s. So I wrote scripts that would take my spreadsheet of grades and turn it into e-mails that were automatically sent. And then I got all fancy.

It seemed boring to send out completely identical form letters, even if any individual would see it once. Maybe twice if they got me for another class. So I started writing variants of the boilerplate sentences. My goal was that every student would get a mass-produced yet unique e-mail. To better the chances of this I had to make sure of something about all these variant sentences and paragraphs.

So you see the trick. I needed a set of relatively prime numbers. That way, it would be the greatest possible number of students before I had a completely repeated text. We know what prime numbers are. They’re the numbers that, in your field, have exactly two factors. In the counting numbers the primes are numbers like 2, 3, 5, 7 and so on. In the Gaussian integers, these are numbers like 3 and 7 and $3 - 2\imath$. But not 2 or 5. We can look to primes among the polynomials. Among polynomials with rational coefficients, $x^2 + x + 1$ is prime. So is $2x^2 + 14x + 1$. $x^2 - 4$ is not.

The idea of relative primes appears wherever primes appear. We can say without contradiction that 4 and 9 are relative primes, among the whole numbers. Though neither’s prime, in the whole numbers, neither has a prime factor in common. This is an obvious way to look at it. We can use that definition for any field that has a concept of primes. There are others, though. We can say two things are relatively prime if there’s a linear combination of them that adds to the identity element. You get a linear combination by multiplying each of the things by a scalar and adding these together. Multiply 4 by -2 and 9 by 1 and add them and look what you get. Or, if the least common multiple of a set of elements is equal to their product, then the elements are relatively prime. Some make sense only for the whole numbers. Imagine the first quadrant of a plane, marked in Cartesian coordinates, with a dot at every point whose coordinates are both whole numbers. Draw the line segment connecting the point at (0, 0) and the point with coordinates (m, n). If that line segment touches no dots between (0, 0) and (m, n), then the whole numbers m and n are relatively prime.
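These tests can be lined up side by side for the whole numbers. What follows is a rough Python sketch with invented function names; `bezout` is the textbook extended Euclidean algorithm, used here to hunt for the linear combination.

```python
from math import gcd


def share_no_prime_factor(m: int, n: int) -> bool:
    """No common prime factor is the same as a greatest common divisor of 1."""
    return gcd(m, n) == 1


def bezout(m: int, n: int):
    """Return (g, s, t) with s*m + t*n == g == gcd(m, n)."""
    old_r, r = m, n
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def lcm_equals_product(m: int, n: int) -> bool:
    """The least common multiple equals the product only for coprime pairs."""
    return m * n // gcd(m, n) == m * n


def segment_misses_dots(m: int, n: int) -> bool:
    """No whole-number point lies strictly between (0, 0) and (m, n)."""
    # The point above x is (x, x*n/m); it is a dot only if m divides x*n.
    return not any((x * n) % m == 0 for x in range(1, m))


for pair in ((4, 9), (4, 6)):
    g, s, t = bezout(*pair)
    print(pair, share_no_prime_factor(*pair), g == 1,
          lcm_equals_product(*pair), segment_misses_dots(*pair))
```

All four tests agree: true for 4 and 9, false for 4 and 6 (the segment to (4, 6) passes through the dot at (2, 3)).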

We start looking at relative primes as pairs of things. We can be interested in larger sets of relative primes, though. My little e-mail generator, for example, wouldn’t work so well if any pair of sentence replacements were not relatively prime. So, like, the set of numbers 2, 6, 9 is relatively prime; all three numbers share no prime factors. But neither the pair 2, 6 nor the pair 6, 9 is relatively prime. 2, 9 is, at least there’s that. I forget how many replaceable sentences were in my form e-mails. I’m sure I did the cowardly thing, coming up with a prime number of alternate ways to phrase as many sentences as possible. As an undergraduate I covered the student government for four years’ worth of meetings. I learned a lot of ways to say the same thing.
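A Python sketch of the difference between “relatively prime as a set” and “relatively prime in pairs”. The last function is a guess at how such an e-mail generator might behave, not a reconstruction of the old Perl script: it assumes each sentence steps through its variants in order, one per e-mail, so the whole text repeats after the least common multiple of the variant counts.

```python
from functools import reduce
from itertools import combinations
from math import gcd


def setwise_coprime(numbers) -> bool:
    """True when no prime factor is shared by every number at once."""
    return reduce(gcd, numbers) == 1


def pairwise_coprime(numbers) -> bool:
    """True when every pair, taken alone, is relatively prime."""
    return all(gcd(a, b) == 1 for a, b in combinations(numbers, 2))


def emails_before_repeat(variant_counts) -> int:
    """Hypothetical model: e-mail k uses variant k mod count of each sentence."""
    return reduce(lambda a, b: a * b // gcd(a, b), variant_counts)


print(setwise_coprime([2, 6, 9]), pairwise_coprime([2, 6, 9]))  # True False
print(emails_before_repeat([2, 6, 9]))  # 18, well short of 2 * 6 * 9 = 108
print(emails_before_repeat([2, 3, 5]))  # 30, the full product
```

Under that assumed model, picking pairwise relatively prime counts is what gets the full product's worth of distinct e-mails.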

Which is all right, but are relative primes important? Relative primes turn up all over the place in number theory, and in corners of group theory. There are some things that are easier to calculate in modulo arithmetic if we have relatively prime numbers to work with. I know when I see modulo arithmetic I expect encryption schemes to follow close behind. Here I admit I’m ignorant whether these imply things which make encryption schemes easier or harder.

Some of the results are neat, certainly. Suppose that the function f is a polynomial. Then, if its first derivative f’ is relatively prime to f, it turns out f has no repeated roots. And vice-versa: if f has no repeated roots, then it and its first derivative are relatively prime. You remember repeated roots. They’re factors like $(x - 2)^2$, that foiled your attempt to test a couple points and figure roughly where a polynomial crossed the x-axis.
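A small worked case may help here; the polynomials are picked only for illustration. Take $f(x) = x^2 - 1$, so that $f'(x) = 2x$. Then

$\frac{x}{2} \cdot f'(x) - 1 \cdot f(x) = x^2 - \left(x^2 - 1\right) = 1$

so f and f’ are relatively prime, and the roots 1 and -1 are indeed distinct. Compare $g(x) = (x - 2)^2$, where $g'(x) = 2(x - 2)$. Both share the factor $(x - 2)$, and 2 is the repeated root.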

I mentioned that primeness depends on the field. This is true of relative primeness. Polynomials really show this off. (Here I’m using an example explained in a 2007 Ask Dr Math essay.) Is the polynomial $3x + 6$ relatively prime to $3x^2 + 12$?

It is not, if we are interested in polynomials with integer coefficients. There’s no linear combination of $3x + 6$ and $3x^2 + 12$ which gets us to 1. Go ahead and try. Every coefficient you can build is a multiple of 3.

It is, if we are interested in polynomials with rational coefficients. Multiply $3x + 6$ by $\frac{1}{12}\left(1 - \frac{1}{2}x\right)$ and multiply $3x^2 + 12$ by $\frac{1}{24}$. Then add those up.
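Carrying out that arithmetic, as a check:

$(3x + 6) \cdot \frac{1}{12}\left(1 - \frac{1}{2}x\right) = \frac{1}{12}\left(6 - \frac{3}{2}x^2\right) = \frac{1}{2} - \frac{1}{8}x^2$

$(3x^2 + 12) \cdot \frac{1}{24} = \frac{1}{8}x^2 + \frac{1}{2}$

$\left(\frac{1}{2} - \frac{1}{8}x^2\right) + \left(\frac{1}{8}x^2 + \frac{1}{2}\right) = 1$

The multipliers need fractions, so this combination is on offer only when rational coefficients are allowed.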

Tell me what polynomials you want to deal with today and I will tell you which answer is right.

This may all seem cute if, perhaps, petty. A bunch of anonymous theorems dotting the center third of an abstract algebra text will inspire that. The most important relative-primes thing I know of is the abc conjecture, posed in the mid-80s by Joseph Oesterlé and David Masser. Start with three counting numbers, a, b, and c. Require that a + b = c, and that a, b, and c share no prime factors.

There is a product of the unique prime factors of a, b, and c. That is, let’s say a is 36. This is 2 times 2 times 3 times 3. Let’s say b is 5. This is prime. c is 41; it’s prime. Their unique prime factors are 2, 3, 5, and 41; the product of all these is 1,230.

The conjecture deals with this product of unique prime factors for this relatively prime triplet. Almost always, c is going to be smaller than this unique prime factors product. The conjecture says that there will be, for every positive real number $\epsilon$, at most finitely many cases where c is larger than this product raised to the power $1 + \epsilon$. I do not know why raising this product to this power is so important. I assume it rules out some case where this product raised to the first power would be too easy a condition.
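For anyone who wants to test cases without a party, here is a Python sketch. The names `radical` and `exceptional_triples` are chosen for this example; the factoring is plain trial division, fine for small numbers only. The search looks for the rarer triples where c beats the product even at the first power.

```python
from math import gcd


def radical(n: int) -> int:
    """Product of the distinct prime factors of n, by trial division."""
    product = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            product *= p
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:  # whatever is left over is itself prime
        product *= n
    return product


def exceptional_triples(limit: int):
    """Triples a < b, a + b = c <= limit, gcd(a, b) = 1, with c > rad(abc)."""
    hits = []
    for c in range(3, limit + 1):
        for a in range(1, (c + 1) // 2):
            b = c - a
            # For relatively prime a, b, c the radical of abc splits up.
            if gcd(a, b) == 1 and c > radical(a) * radical(b) * radical(c):
                hits.append((a, b, c))
    return hits


print(radical(36 * 5 * 41))  # 1230
print(exceptional_triples(100))
```

The example triple 5, 36, 41 is the ordinary kind: 41 is far below 1,230. The search up to 100 turns up only a handful of exceptions, such as 1 + 8 = 9, where the product is just 6.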

Apart from that $1 + \epsilon$ bit, though, this is a classic sort of number theory conjecture. Like, it involves some technical terms, but nothing too involved. You could almost explain it at a party and expect to be understood, and to get some people writing down numbers, testing out specific cases. Nobody will go away solving the problem, but they’ll have some good exercise and that’s worthwhile.

And it has consequences. We do not know whether the abc conjecture is true. We do know that if it is true, then a bunch of other things follow. The one that a non-mathematician would appreciate is that Fermat’s Last Theorem would be provable by an alternate route. The abc conjecture would only prove the cases for Fermat’s Last Theorem for powers greater than 5. But that’s all right. We can separately work out the cases for the third, fourth, and fifth powers, and then cover everything else at once. (That we know Fermat’s Last Theorem is true doesn’t let us conclude the abc conjecture is true, unfortunately.)

There are other implications. Some are about problems that seem like fun to play with. If the abc conjecture is true, then for every integer A, there are finitely many values of n for which $n! + A$ is a perfect square. Some are of specialist interest: Lang’s conjecture, about elliptic curves, would be true. This is a lower bound for the height of non-torsion rational points. I’d stick to the $n! + A$ stuff at a party. A host of conjectures about Diophantine equations — (high school) algebra problems where only integers may be solutions — become theorems. Also coming true: the Fermat-Catalan conjecture. This is a neat problem; it claims that the equation

$a^m + b^n = c^k$

where a, b, and c are relatively prime, and m, n, and k are positive integers satisfying the constraint

$\frac{1}{m} + \frac{1}{n} + \frac{1}{k} < 1$

has only finitely many solutions with distinct triplets $\left(a^m, b^n, c^k\right)$. The inequality about reciprocals of m, n, and k is needed so we don’t have boring solutions like $2^2 + 3^3 = 31^1$ clogging us up. The bit about distinct triplets is so we don’t clog things up with a or b being 1 and then technically every possible m or n giving us a “different” set. To date we know something like ten solutions, one of them having a equal to 1.
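A Python sketch that checks candidate solutions against all three conditions. The five entries listed are among the known solutions; the function name and the tuple layout `(a, m, b, n, c, k)` are just conventions picked for this example.

```python
from fractions import Fraction
from math import gcd


def is_fermat_catalan(a, m, b, n, c, k) -> bool:
    """Check a^m + b^n = c^k with coprime bases and 1/m + 1/n + 1/k < 1."""
    exponents_ok = Fraction(1, m) + Fraction(1, n) + Fraction(1, k) < 1
    coprime = gcd(a, b) == 1 and gcd(a, c) == 1 and gcd(b, c) == 1
    return exponents_ok and coprime and a**m + b**n == c**k


known = [
    (1, 7, 2, 3, 3, 2),     # 1 + 8 = 9; any exponent above 6 works on the 1
    (2, 5, 7, 2, 3, 4),     # 32 + 49 = 81
    (7, 3, 13, 2, 2, 9),    # 343 + 169 = 512
    (2, 7, 17, 3, 71, 2),   # 128 + 4913 = 5041
    (3, 5, 11, 4, 122, 2),  # 243 + 14641 = 14884
]

print(all(is_fermat_catalan(*entry) for entry in known))  # True
print(is_fermat_catalan(2, 2, 3, 3, 31, 1))               # False
```

The last line is the boring solution from the text: the arithmetic is right, but the reciprocals add up to more than 1.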

Another implication is Pillai’s Conjecture. This one asks whether every positive integer occurs only finitely many times as the difference between perfect powers. Perfect powers are numbers like 32 (two to the fifth power) or 81 (three to the fourth power) or such.
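A Python sketch for poking at this. It builds the perfect powers up to a limit and looks for pairs a given distance apart. One assumption to flag: the number 1 is left out of the list, although some definitions count it as a perfect power. The function names are invented here.

```python
def perfect_powers(limit: int):
    """Sorted perfect powers b**e with b >= 2 and e >= 2, up to limit."""
    powers = set()
    base = 2
    while base * base <= limit:
        value = base * base
        while value <= limit:
            powers.add(value)
            value *= base
        base += 1
    return sorted(powers)


def differences(k: int, limit: int):
    """Pairs of perfect powers (p, p + k) with both members at most limit."""
    powers = perfect_powers(limit)
    members = set(powers)
    return [(p, p + k) for p in powers if p + k in members]


print(perfect_powers(40))       # [4, 8, 9, 16, 25, 27, 32, 36]
print(differences(1, 10**6))    # [(8, 9)]
print(differences(2, 1000))
```

For a difference of 1 the search finds only 8 and 9, and the proof of Catalan’s conjecture says no larger limit would change that. Other differences are where Pillai’s question is still open.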

So as often happens when we stumble into a number theory thing, the idea of relative primes is easy. And there are deep implications to them. But those in turn give us things that seem like fun arithmetic puzzles.

This closes out the A to Z essays for this week. Tomorrow and Saturday I hope to bring some attention to essays from past years. And next week I figure to open for topics for the end of the alphabet, the promising letters U through Z. This and the rest of the 2019 essays should appear at this link, as should the letter S next Tuesday. And all of the A to Z essays ought to be at this link. Thank you for reading.

## The Summer 2017 Mathematics A To Z: Volume Forms

I’ve been reading Elke Stangl’s Elkemental Force blog for years now. Sometimes I even feel social-media-caught-up enough to comment, or at least to like posts. This is relevant today as I discuss one of Stangl’s suggestions for my letter-V topic.

# Volume Forms.

So sometime in pre-algebra, or early in (high school) algebra, you start drawing equations. It’s a simple trick. Lay down a coordinate system, some set of axes for ‘x’ and ‘y’ and maybe ‘z’ or whatever letters are important. Look to the equation, made up of x’s and y’s and maybe z’s and so on. Highlight all the points with coordinates whose values make the equation true. This is the logical basis for saying (eg) that the straight line “is” $y = 2x + 1$.

A short while later, you learn about polar coordinates. Instead of using ‘x’ and ‘y’, you have ‘r’ and ‘θ’. ‘r’ is the distance from the center of the universe. ‘θ’ is the angle made with respect to some reference axis. It’s as legitimate a way of describing points in space. Some classrooms even have a part of the blackboard (whiteboard, whatever) with a polar-coordinates “grid” on it. This looks like the lines of a dartboard. And you learn that some shapes are easy to describe in polar coordinates. A circle, centered on the origin, is ‘r = 2’ or something like that. A line through the origin is ‘θ = 1’ or whatever. The line that we’d called $y = 2x + 1$ before? … That’s … some mess. And now $r = 2\theta + 1$ … that’s not even a line. That’s some kind of spiral. Two spirals, really. Kind of wild.

And something to bother you a while. $y = 2x + 1$ is an equation that looks the same as $r = 2\theta + 1$. You’ve changed the names of the variables, but not how they relate to each other. But one is a straight line and the other a spiral thing. How can that be?

The answer, ultimately, is that the letters in the equations aren’t these content-neutral labels. They carry meaning. ‘x’ and ‘y’ imply looking at space a particular way. ‘r’ and ‘θ’ imply looking at space a different way. A shape has different representations in different coordinate systems. Fair enough. That seems to settle the question.

But if you get to calculus the question comes back. You can integrate over a region of space that’s defined by Cartesian coordinates, x’s and y’s. Or you can integrate over a region that’s defined by polar coordinates, r’s and θ’s. The first time you try this, you find … well, that any region easy to describe in Cartesian coordinates is painful in polar coordinates. And vice-versa. Way too hard. But if you struggle through all that symbol manipulation, you get … different answers. Eventually the calculus teacher has mercy and explains. If you’re integrating in Cartesian coordinates you need to use “dx dy”. If you’re integrating in polar coordinates you need to use “r dr dθ”. If you’ve never taken calculus, never mind what this means. What is important is that “r dr dθ” looks like three things multiplied together, while “dx dy” is two.

We get this explained as a “change of variables”. If we want to go from one set of coordinates to a different one, we have to do something fiddly. The extra ‘r’ in “r dr dθ” is what we get going from Cartesian to polar coordinates. And we get formulas to describe what we should do if we need other kinds of coordinates. It’s some work that introduces us to the Jacobian, which looks like the most tedious possible calculation ever at that time. (In Intro to Differential Equations we learn we were wrong, and the Wronskian is the most tedious possible calculation ever. This is also wrong, but it might as well be true.) We typically move on after this and count ourselves lucky it got no worse than that.
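For anyone who wants to see where that extra ‘r’ comes from, here is the standard textbook computation, assuming the usual relations $x = r\cos\theta$ and $y = r\sin\theta$. The Jacobian is the determinant of the partial derivatives of the old coordinates with respect to the new ones:

$$
\frac{\partial(x, y)}{\partial(r, \theta)}
= \begin{vmatrix} \cos\theta & -r\sin\theta \\ \sin\theta & r\cos\theta \end{vmatrix}
= r\cos^2\theta + r\sin^2\theta = r,
\qquad\text{so}\qquad dx\,dy = r\,dr\,d\theta .
$$

A quick sanity check is the area of the unit disc: $\int_0^{2\pi}\int_0^1 r\,dr\,d\theta = \pi$, as it should be. Leave the ‘r’ out and the same integral gives $2\pi$, which is the kind of “different answer” that sends students to the teacher.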

None of this is wrong, even from the perspective of more advanced mathematics. It’s not even misleading, which is a refreshing change. But we can look a little deeper, and get something good from doing so.

The deeper perspective looks at “differential forms”. These are about how to encode information about how your coordinate system represents space. They’re tensors. I don’t blame you for wondering if they would be. A differential form uses interactions between some of the directions in a space. A volume form is a differential form that uses all the directions in a space. And satisfies some other rules too. I’m skipping those because some of the symbols involved I don’t even know how to look up, much less make WordPress present.

What’s important is the volume form carries information compactly. As symbols it tells us that this represents a chunk of space that’s constant no matter what the coordinates look like. This makes it possible to do analysis on how functions work. It also tells us what we would need to do to calculate specific kinds of problem. This makes it possible to describe, for example, how something moving in space would change.

The volume form, and the tools to do anything useful with it, demand a lot of supporting work. You can dodge having to explicitly work with tensors. But you’ll need a lot of tensor-related materials, like wedge products and exterior derivatives and stuff like that. If you’ve never taken freshman calculus don’t worry: the people who have taken freshman calculus never heard of those things either. So what makes this worthwhile?

Yes, person who called out “polynomials”. Good instinct. Polynomials are usually a reason for any mathematics thing. This is one of maybe four exceptions. I have to appeal to my other standard answer: “group theory”. These volume forms match up naturally with groups. There’s not only information about how coordinates describe a space to consider. There’s ways to set up coordinates that tell us things.

That isn’t all. These volume forms can give us new invariants. Invariants are what mathematicians say instead of “conservation laws”. They’re properties whose value for a given problem is constant. This can make it easier to work out how one variable depends on another, or to work out specific values of variables.

For example, classical physics problems like how a bunch of planets orbit a sun often have a “symplectic manifold” that matches the problem. This is a description of how the positions and momentums of all the things in the problem relate. The symplectic manifold has a volume form. That volume is going to be constant as time progresses. That is, there’s this way of representing the positions and speeds of all the planets that does not change, no matter what. It’s much like the conservation of energy or the conservation of angular momentum. And this has practical value. It’s the subject that brought my and Elke Stangl’s blogs into contact, years ago. It also has broader applicability.

There’s no way to provide an exact answer for the movement of, like, the sun and nine-ish planets and a couple major moons and all that. So there’s no known way to answer the question of whether the Earth’s orbit is stable. All the planets are always tugging one another, changing their orbits a little. Could this converge in a weird way suddenly, on geologic timescales? Might the planet go flying off out of the solar system? It doesn’t seem like the solar system could be all that unstable, or it would have already. But we can’t rule out that some freaky alignment of Jupiter, Saturn, and Halley’s Comet might tweak the Earth’s orbit just far enough for catastrophe to unfold. Granted there’s nothing we could do about the Earth flying out of the solar system, but it would be nice to know if we face it, we tell ourselves.

But we can answer this numerically. We can set a computer to simulate the movement of the solar system. But there will always be numerical errors. For example, we can’t use the exact value of π in a numerical computation. 3.141592 (and more digits) might be good enough for projecting stuff out a day, a week, a thousand years. But if we’re looking at millions of years? The difference can add up. We can imagine compensating for not having the value of π exactly right. But what about compensating for something we don’t know precisely, like, where Jupiter will be in 16 million years and two months?

Symplectic forms can help us. The volume form represented by this space has to be conserved. So we can rewrite our simulation so that these forms are conserved, by design. This does not mean we avoid making errors. But it means we avoid making certain kinds of errors. We’re more likely to make what we call “phase” errors. We predict Jupiter’s location in 16 million years and two months. Our simulation puts it thirty degrees farther in its circular orbit than it actually would be. This is a less serious mistake to make than putting Jupiter, say, eight-tenths as far from the Sun as it would really be.
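Here is a toy illustration of that idea. It is only a sketch, and it is nothing like a real solar-system code. It assumes a single unit-mass, unit-frequency oscillator in place of a planet, and it compares the ordinary explicit Euler step with the “symplectic Euler” step, which preserves area in position-momentum space by construction. The function names and step sizes are my own choices for the example.

```python
def energy(q, p):
    """Energy of a unit-mass, unit-frequency oscillator."""
    return 0.5 * (p * p + q * q)


def explicit_euler(q, p, dt):
    # Both updates use the old values. Each step stretches
    # phase-space area by a factor of (1 + dt**2).
    return q + dt * p, p - dt * q


def symplectic_euler(q, p, dt):
    # Update the momentum first, then move the position with the
    # *new* momentum. This map preserves phase-space area exactly.
    p_new = p - dt * q
    q_new = q + dt * p_new
    return q_new, p_new


def final_energy(step, steps=10_000, dt=0.01):
    """Run one integrator from (q, p) = (1, 0) and report the final energy."""
    q, p = 1.0, 0.0
    for _ in range(steps):
        q, p = step(q, p, dt)
    return energy(q, p)


energy_explicit = final_energy(explicit_euler)
energy_symplectic = final_energy(symplectic_euler)
print("explicit Euler:  ", energy_explicit)
print("symplectic Euler:", energy_symplectic)
```

The true energy is 0.5 throughout. The explicit method lets it grow steadily, which is the analogue of a planet drifting outward. The symplectic method keeps it wobbling close to 0.5, while still getting the position along the orbit slightly wrong. That is the “phase” error described above.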

Volume forms seem, at first, a lot of mechanism for a small problem. And, unfortunately for students, they are. They’re more trouble than they’re worth for changing Cartesian to polar coordinates, or similar problems. You know, ones that the student already has some feel for. They pay off on more abstract problems. Tracking the movement of a dozen interacting things, say, or describing a space that’s very strangely shaped. Those make the effort to learn about forms worthwhile.

## A Leap Day 2016 Mathematics A To Z: Isomorphism

Gillian B made the request that’s today’s A To Z word. I’d said it would be challenging. Many have been, so far. But I set up some of the work with “homomorphism” last time. As with “homomorphism” it’s a word that appears in several fields and about different kinds of mathematical structure. As with homomorphism, I’ll try describing what it is for groups. They seem least challenging to the imagination.

## Isomorphism.

An isomorphism is a kind of homomorphism. And a homomorphism is a kind of thing we do with groups. A group is a mathematical construct made up of two things. One is a set of things. The other is an operation, like addition, where we take two of the things and get one of the things in the set. I think that’s as far as we need to go in this chain of defining things.

A homomorphism is a mapping, or if you like the word better, a function. The homomorphism matches everything in a group to the things in a group. It might be the same group; it might be a different group. What makes it a homomorphism is that it preserves addition.

I gave an example last time, with groups I called G and H. G had as its set the whole numbers 0 through 3 and as operation addition modulo 4. H had as its set the whole numbers 0 through 7 and as operation addition modulo 8. And I defined a homomorphism φ which took a number in G and matched it to the number in H which was twice that. Then for any a and b which were in G’s set, φ(a + b) was equal to φ(a) + φ(b).

We can have all kinds of homomorphisms. For example, imagine my new φ1. It takes whatever you start with in G and maps it to the 0 inside H. φ1(1) = 0, φ1(2) = 0, φ1(3) = 0, φ1(0) = 0. It’s a legitimate homomorphism. Seems like it’s wasting a lot of what’s in H, though.

An isomorphism doesn’t waste anything that’s in H. It’s a homomorphism in which everything in G’s set matches to exactly one thing in H’s, and vice-versa. That is, it’s both a homomorphism and a bijection, to use one of the terms from the Summer 2015 A To Z. The key to remembering this is the “iso” prefix. It comes from the Greek “isos”, meaning “equal”. You can often understand an isomorphism from group G to group H as showing how they’re the same thing. They might be represented differently, but they’re equivalent in the lights you use.

I can’t make an isomorphism between the G and the H I started with. Their sets are different sizes. There’s no matching everything in H’s set to everything in G’s set without some duplication. But we can make other examples.

For instance, let me start with a new group G. It’s got as its set the positive real numbers. And it has as its operation ordinary multiplication, the kind you always do. And I want a new group H. It’s got as its set all the real numbers, positive and negative. It has as its operation ordinary addition, the kind you always do.

For an isomorphism φ, take the number x that’s in G’s set. Match it to the number that’s the logarithm of x, found in H’s set. This is a one-to-one pairing: if the logarithm of x equals the logarithm of y, then x has to equal y. And it covers everything: every real number, positive, negative, or zero, is the logarithm of some positive real number.

And this is a homomorphism. Take any x and y that are in G’s set. Their “addition”, the group operation, is to multiply them together. So “x + y”, in G, gives us the number xy. (I know, I know. But trust me.) φ(x + y) is equal to log(xy), which equals log(x) + log(y), which is the same number as φ(x) + φ(y). There’s a way to see the positive real numbers being multiplied together as equivalent to all the real numbers being added together.
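If you’d rather see numbers than take my word for it, here’s a small spot check. It’s a sketch, not a proof: it assumes the natural logarithm (any base would do) and tries only three sample pairs that I picked arbitrarily. Floating-point arithmetic is approximate, so it compares with a tolerance.

```python
import math


def phi(x):
    """Map a positive real number in G to its natural logarithm in H."""
    return math.log(x)


# Arbitrary sample pairs of positive reals, chosen only for illustration.
pairs = [(2.0, 3.0), (0.5, 8.0), (10.0, 0.1)]

# G's operation is multiplication; H's operation is addition.
checks = [
    math.isclose(phi(x * y), phi(x) + phi(y), abs_tol=1e-12)
    for x, y in pairs
]
print(checks)
```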

You might figure that the positive real numbers and all the real numbers aren’t very different-looking things. Perhaps so. Here’s another example I like, drawn from Wikipedia’s entry on Isomorphism. It has as sets things that don’t seem to have anything to do with one another.

Let me have another brand-new group G. It has as its set the whole numbers 0, 1, 2, 3, 4, and 5. Its operation is addition modulo 6. So 2 + 2 is 4, while 2 + 3 is 5, and 2 + 4 is 0, and 2 + 5 is 1, and so on. You get the pattern, I hope.

The brand-new group H, now, that has a more complicated-looking set. Its set is ordered pairs of whole numbers, which I’ll represent as (a, b). Here ‘a’ may be either 0 or 1. ‘b’ may be 0, 1, or 2. To describe its addition rule, let me say we have the elements (a, b) and (c, d). Find their sum first by adding together a and c, modulo 2. So 0 + 0 is 0, 1 + 0 is 1, 0 + 1 is 1, and 1 + 1 is 0. That result is the first number in the pair. The second number we find by adding together b and d, modulo 3. So 1 + 0 is 1, and 1 + 1 is 2, and 1 + 2 is 0, and so on.

So, for example, (0, 1) plus (1, 1) will be (1, 2). But (0, 1) plus (1, 2) will be (1, 0). (1, 2) plus (1, 0) will be (0, 2). (1, 2) plus (1, 2) will be (0, 1). And so on.

The isomorphism matches up things in G to things in H this way:

| In G | φ(G), in H |
|---|---|
| 0 | (0, 0) |
| 1 | (1, 1) |
| 2 | (0, 2) |
| 3 | (1, 0) |
| 4 | (0, 1) |
| 5 | (1, 2) |

I recommend playing with this a while. Pick any pair of numbers x and y that you like from G. And check their matching ordered pairs φ(x) and φ(y) in H. φ(x + y) is the same thing as φ(x) + φ(y) even though the things in G’s set don’t look anything like the things in H’s.
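If playing by hand gets tedious, a few lines of code can try all 36 pairs. This is a sketch under one assumption of mine: the table above happens to match the rule “send k to (k mod 2, k mod 3)”, so that’s how I’ve written φ here. The helper names are made up for the example.

```python
from itertools import product


def phi(k):
    """Send k in G to the pair (k mod 2, k mod 3) in H."""
    return (k % 2, k % 3)


def add_G(x, y):
    """Addition modulo 6, the operation in G."""
    return (x + y) % 6


def add_H(u, v):
    """Componentwise addition in H: modulo 2, then modulo 3."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 3)


G = range(6)

# Homomorphism: phi(x + y) equals phi(x) + phi(y) for every pair.
is_homomorphism = all(
    phi(add_G(x, y)) == add_H(phi(x), phi(y))
    for x, y in product(G, repeat=2)
)

# Bijection: six elements of G land on six distinct pairs in H.
is_bijection = len({phi(k) for k in G}) == 6

print(is_homomorphism, is_bijection)
```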

Isomorphisms exist for other structures. The idea extends the way homomorphisms do. A ring, for example, has two operations which we think of as addition and multiplication. An isomorphism matches two rings in ways that preserve the addition and multiplication, and which match everything in the first ring’s set to everything in the second ring’s set, one-to-one. The idea of the isomorphism is that two different things can be paired up so that they look, and work, remarkably like one another.

One of the common uses of isomorphisms is describing the evolution of systems. We often like to look at how some physical system develops from different starting conditions. If you make a little variation in how things start, does this produce a small change in how it develops, or does it produce a big change? How big? And the description of how time changes the system is, often, an isomorphism.

Isomorphisms also appear when we study the structures of groups. They turn up naturally when we look at things called “normal subgroups”. The name alone gives you a good idea what a “subgroup” is. “Normal”, well, that’ll be another essay.

## A Leap Day 2016 Mathematics A To Z: Homomorphism

I’m not sure how, but many of my Mathematics A To Z essays seem to circle around algebra. I mean abstract algebra, not the kind that involves petty concerns like ‘x’ and ‘y’. In abstract algebra we worry about letters like ‘g’ and ‘h’. For special purposes we might even have ‘e’. Maybe it’s that the subject has a lot of familiar-looking words. For today’s term, I’m doing an algebra term, and one that wasn’t requested. But it’ll make my life a little easier when I get to a word that was requested.

## Homomorphism.

Also, I lied when I said this was an abstract algebra word. At least I was imprecise. The word appears in a fairly wide swath of mathematics. But abstract algebra is where most mathematics majors first encounter it. And the other uses hearken back to this. If you understand what an algebraist means by “homomorphism” then you understand the essence of what someone else means by it.

One of the things mathematicians study a lot is mapping. This is matching the things in one set to things in another set. Most often we want this to be done by some easy-to-understand rule. Why? Well, we often want to understand how one group of things relates to another group. So we set up maps between them. These describe how to match the things in one set to the things in another set. You may think this sounds like it’s just a function. You’re right. I suppose the name “mapping” carries connotations of transforming things into other things that a “function” might not have. And “functions”, I think, suggest we’re working with numbers. “Mappings” sound more abstract, at least to my ear. But it’s just a difference in dialect, not substance.

A homomorphism is a mapping that obeys a couple of rules. What they are depends on the kind of things the homomorphism maps between. I want a simple example, so I’m going to use groups.

A group is made up of two things. One is a set, a collection of elements. For example, take the whole numbers 0, 1, 2, and 3. That’s a good enough set. The second thing in the group is an operation, something to work like addition. For example, we might use “addition modulo 4”. In this scheme, addition (and subtraction) work like they do with ordinary whole numbers. But if the result would be more than 3, we subtract 4 from the result, until we get something that’s 0, 1, 2, or 3. Similarly if the result would be less than 0, we add 4, until we get something that’s 0, 1, 2, or 3. The result is an addition table that looks like this:

| + | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| **0** | 0 | 1 | 2 | 3 |
| **1** | 1 | 2 | 3 | 0 |
| **2** | 2 | 3 | 0 | 1 |
| **3** | 3 | 0 | 1 | 2 |

So let me call G the group that has as its elements 0, 1, 2, and 3, and that has addition be this modulo-4 addition.

Now I want another group. I’m going to name it H, because the alternative is calling it G2 and subscripts are tedious to put on web pages. H will have a set with the elements 0, 1, 2, 3, 4, 5, 6, and 7. Its addition will be modulo-8 addition, which works the way you might have guessed after looking at the above. But here’s the addition table:

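| + | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| **0** | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| **1** | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 0 |
| **2** | 2 | 3 | 4 | 5 | 6 | 7 | 0 | 1 |
| **3** | 3 | 4 | 5 | 6 | 7 | 0 | 1 | 2 |
| **4** | 4 | 5 | 6 | 7 | 0 | 1 | 2 | 3 |
| **5** | 5 | 6 | 7 | 0 | 1 | 2 | 3 | 4 |
| **6** | 6 | 7 | 0 | 1 | 2 | 3 | 4 | 5 |
| **7** | 7 | 0 | 1 | 2 | 3 | 4 | 5 | 6 |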

G and H look a fair bit like each other. Their sets are made up of familiar numbers, anyway. And the addition rules look a lot like what we’re used to.

We can imagine mapping from one to the other pretty easily. At least it’s easy to imagine mapping from G to H. Just match a number in G’s set — say, ‘1’ — to a number in H’s set — say, ‘2’. Easy enough. We’ll do something just as daring in matching ‘0’ to ‘1’, and we’ll map ‘2’ to ‘3’. And ‘3’? Let’s match that to ‘4’. Let me call that mapping f.

But f is not a homomorphism. What makes a homomorphism an interesting map is that the group’s original addition rule carries through. This is easier to show than to explain.

In the original group G, what’s 1 + 2? … 3. That’s easy to work out. But in H, what’s f(1) + f(2)? f(1) is 2, and f(2) is 3. So f(1) + f(2) is 5. But what is f(3)? We set that to be 4. So in this mapping, f(1) + f(2) is not equal to f(3). And so f is not a homomorphism.

Could anything be? After all, G and H have different sets, sets that aren’t even the same size. And they have different addition rules, even if the addition rules look like they should be related. Why should we expect it’s possible to match the things in group G to the things in group H?

Let me show you how they could be. I’m going to define a mapping φ. The letter’s often used for homomorphisms. φ matches things in G’s set to things in H’s set. φ(0) I choose to be 0. φ(1) I choose to be 2. φ(2) I choose to be 4. φ(3) I choose to be 6.

And now look at this … φ(1) + φ(2) is equal to 2 + 4, which is 6 … which is φ(3). Was I lucky? Try some more. φ(2) + φ(2) is 4 + 4, which in the group H is 0. In the group G, 2 + 2 is 0, and φ(0) is … 0. We’re all right so far.

One more. φ(3) + φ(3) is 6 + 6, which in group H is 4. In group G, 3 + 3 is 2. φ(2) is 4.

If you want to test the other thirteen possibilities go ahead. If you want to argue there’s actually only seven other possibilities do that, too. What makes φ a homomorphism is that if x and y are things from the set of G, then φ(x) + φ(y) equals φ(x + y). φ(x) + φ(y) uses the addition rule for group H. φ(x + y) uses the addition rule for group G. Some mappings keep the addition of things from breaking. We call this “preserving” addition.
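For anyone who would rather have a computer grind through all sixteen pairs, here’s a short sketch. It writes the two mappings from above, f and φ, as dictionaries and checks the rule φ(x) + φ(y) = φ(x + y) with modulo-8 addition on the left and modulo-4 addition on the right. The function name is my own invention for the example.

```python
from itertools import product

G = range(4)  # the elements 0, 1, 2, 3


def preserves_addition(mapping):
    """True if mapping(x) + mapping(y) in H equals mapping(x + y) for all x, y in G."""
    return all(
        (mapping[x] + mapping[y]) % 8 == mapping[(x + y) % 4]
        for x, y in product(G, repeat=2)
    )


f = {0: 1, 1: 2, 2: 3, 3: 4}    # the first attempt, which failed
phi = {0: 0, 1: 2, 2: 4, 3: 6}  # the doubling map

print("f:  ", preserves_addition(f))
print("phi:", preserves_addition(phi))
```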

This particular example is called a group homomorphism. That’s because it’s a homomorphism that starts with one group and ends with a group. There are other kinds of homomorphism. For example, a ring homomorphism is a homomorphism that maps a ring to a ring. A ring is like a group, but it has two operations. One works like addition and the other works like multiplication. A ring homomorphism preserves both the addition and the multiplication simultaneously.

And there are homomorphisms for other structures. What makes them homomorphisms is that they preserve whatever the important operations on the structures are. That’s typically what you might expect when you are introduced to a homomorphism, whatever the field.

## What Are Equivalence Classes?

(A couple weeks ago I published a little lemma of an essay. This is the next lemma; if my prose style holds out this’ll all lead to something neat.)

If you have the idea of equivalence — that you can pick elements of a set out and say whether they share some property, and that the sharing of that property works in some of the ways that equality works — then you can create equivalence classes. This is the dividing of your original set up into smaller parts according to the rule that everything in one of those parts is equivalent to anything else in that same part.

Since I think about mathematics so much, the most familiar equivalence classes to my mind come from the counting numbers, that familiar old group of 1, 2, 3, and so on. The equivalence relationship I’d like to use looks a little more alien; it’s “has the same remainder when divided by two as”. But every integer, divided by two, has a remainder of either zero or one, and it’s not hard to follow the divisions here: 4 divided by two has a remainder of zero; 5 divided by two has a remainder of one; 6 divided by two has a remainder of zero; 7 divided by two has a remainder of one; and so on. Using this equivalence relationship, 4 and 6 and 8 and for that matter 2 are in the same class. 3 and 5 and 7 and 9 and so on also share a class, though not the same one that 4 and 6 and 8 had. And, yeah, that’s just the even and the odd numbers, presented in a way that uses much more abstraction than you needed to learn odds and evens.

But we can do this dividing into classes for any set and for any equivalence relationship on that set: dividing a group of people up by those who have the same age; dividing clothes up by those which are the same color; dividing functions up by which ones share some interesting property; whatever you like. There aren’t necessarily just the two equivalence classes. For example, “has the same remainder when divided by four as” will split the counting numbers into four classes, and “is the same age as” will split a group of people up into from as few as one class to as many classes as there are people.
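Here’s a small sketch of that splitting, for readers who like to see it done. It assumes the equivalence relation can be phrased as “has the same value of some key”, such as the remainder on division by two or by four. That covers the examples above, though not every equivalence relation comes packaged so conveniently. The function name and the choice of the numbers 1 through 10 are mine.

```python
from collections import defaultdict


def equivalence_classes(items, key):
    """Group items so that two items share a class exactly when key() agrees on them."""
    classes = defaultdict(list)
    for item in items:
        classes[key(item)].append(item)
    return dict(classes)


numbers = range(1, 11)
by_two = equivalence_classes(numbers, lambda n: n % 2)
by_four = equivalence_classes(numbers, lambda n: n % 4)

print(by_two)
print(by_four)
```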

So, why split sets up into equivalence classes? Besides the giddy fun of doing it, the most useful reason I know is that it can often be easier to prove something about a whole set of things if you can break the problem up into smaller ones, proving something about a special case of things. If you need to test something that you know will be the same for all the elements in an equivalence class, you just have to pick one element from that class and test that; that can be a wonderful time-saver.

If you do pick one of the elements of your class, that’s called a “class representative”, which is one of mathematics’s less exotic terms. If you’ve picked your representative (let me call it a) and want to talk about the equivalence class that contains it, then that’s normally written with square brackets around it: [a]. Everything in [a] is equivalent to a, by definition. For the example of odds and evens you could use 1 and 2 (the sets [1] and [2]), although that’s just because we tend to look at nice familiar small numbers when we can. We wouldn’t be doing anything wrong if we wrote the sets as [147] and [2038] instead. We’ve entered a realm without uniquely right answers, just answers that are more attractive because we think they’re easier to work with or we think they look nicer.

But when we write down [147] and [2038] we’ve exhausted the odd-and-even partitioning of the counting numbers. We’ve also written down the quotient set, the collection of all the equivalence classes. This is a set whose elements are themselves sets, which is something a little odd to encounter at first, but not anything too exotic.

Here’s a neat little equivalence relation and quotient set I’d like to toss out for folks to consider. The original set is all the real numbers, positive and negative, rational and irrational. The equivalence relation is “is a whole number different from”, so that, for example, 0, 1, 3, and 35 are all in one equivalence class; 0.5, -3.5, and 147.5 are all in another class together; π, π + 1, π – 8, and π + 35 are in yet another class. How many equivalence classes are there for this set and this relation, and what might a quotient set for them look like?

## Tessellation Using Equilateral Triangles, Isosceles Triangles, Squares, Regular Pentagons, and Equilateral, Non-Convex Octakaitetracontagons

I’m afraid I lack the time to talk about this in more detail today, but Robert Loves Pi, a geometry-oriented blog, has a lovely tessellation that you might like to see. Tessellations are ways to cover a surface, usually a plane, with (ideally) a small set of pieces infinitely repeated. As a field of mathematics it’s more closely related to kitchen floors than the usual, but it’s also wonderfully artistic, and the study of these patterns brings one into abstract algebra.

In abstract algebra you look at things that work, in some ways, like arithmetic does — you can add and multiply things — without necessarily being arithmetic. The things that you can do to a pattern without changing it — sliding it in some direction, rotating it some angle, maybe reflecting it across some dividing line — can often be added together and multiplied in ways that look strikingly like what you do with regular old numbers, which is part of why this is a field that’s fascinating both when you first look at it and when you get deeply into its study.

In this tessellation, regular polygons have been given the brighter colors, while the two non-regular polygons have pastel colors.


## Reblog: Free Harvard Course, Abstract Algebra

Here, Gregory Reese points out a reference to another useful course that might have made my undergraduate life a bit easier were there an Internet to speak of in the early 1990s. (These were primitive days, before Google, before Alta Vista, and when we actually put up with xdvi readers and couldn’t have imagined pdf with its tendency to work and user interface that looks like any thought at all was put into it).

In this case Reese is pointing out the Free Harvard course in Abstract Algebra. Abstract Algebra (it gets called just “Algebra” later on, when we’re not worried that undergraduates will think it’s the thing they did in middle school) is kind of what you get by taking the next step in abstracting middle- and high-school algebra.

One of the things that makes algebra a subject important enough to revolutionize thought and to get into middle- and high-school curriculums is the idea that we can do work with a number — add to it, multiply it, divide by it, raise it to powers, take its logarithm, or so — without necessarily having to know what the number is.

In abstract algebra, we consider the things that we do with numbers, in arithmetic — things like adding them, multiplying them, factoring them — and ask, can we do these things with stuff that isn’t numbers? If we put some thought into what these things are, and what we mean by addition and multiplication and such, it turns out we often can. Abstract Algebra is one of the courses that starts on this trail of doing things that look like arithmetic on things which are not numbers.