This is not a proper Reading the Comics post, since there’s nothing mathematical about this. But it does reflect a project I’ve been letting linger for months and that I intend to finish before starting the abbreviated Mathematics A-to-Z for this year.
In the meanwhile. I have a person dear to me who’s learning college algebra. For no reason clear to me this put me in mind of last year’s essay about Extraneous Solutions. These are fun and infuriating friends. They’re created when you follow the rules about how you can rewrite a mathematical expression without changing its value. And yet sometimes you do these rewritings correctly and get a would-be solution that isn’t actually one. So I’d shared some thoughts about why they appear, and what tedious work keeps them from showing up.
Jacob Siehler had several suggestions for this last of the A-to-Z essays for 2020. Zorn’s Lemma was an obvious choice. It’s got an important place in set theory, it’s got some neat and weird implications. It’s got a great name. The zero divisor is one of those technical things mathematics majors have to deal with. It never gets any pop-mathematics attention. I picked the less-travelled road and found a delightful scenic spot.
3 times 4 is 12. That’s a clear, unambiguous, and easily-agreed-upon arithmetic statement. The thing to wonder is what kind of mathematics it takes to mess that up. The answer is algebra. Not the high school kind, with x’s and quadratic formulas and all. The college kind, with group theory and rings.
A ring is a mathematical construct that lets you do a bit of arithmetic. Something that looks like arithmetic, anyway. It has a set of elements. (An element is just a thing in a set. We say “element” because it feels weird to call it “thing” all the time.) The ring has an addition operation. The ring has a multiplication operation. Addition has an identity element, something you can add to any element without changing the original element. We can call that ‘0’. The integers, or to use the lingo $\mathbb{Z}$, are a ring (among other things).
Among the rings you learn, after the integers, is the integers modulo … something. This can be modulo any counting number. The integers modulo 10, for example, we write as $\mathbb{Z}_{10}$ for short. There are different ways to think of what this means. The one convenient for this essay is that it’s the integers 0, 1, 2, up through 9. And that the result of any calculation is “how much more than a whole multiple of 10 this calculation would otherwise be”. So then 3 times 4 is now 2. 3 times 5 is 5; 3 times 6 is 8. 3 times 7 is 1, and doesn’t that seem peculiar? That’s part of how modulo arithmetic warns us that groups and rings can be quite strange things.
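If you’d like to check those products yourself, here is a minimal sketch in Python. The function name `times_mod` is my own label for illustration, not standard notation.

```python
def times_mod(a, b, modulus):
    """Multiply a and b, then keep how much is left over past a whole multiple of modulus."""
    return (a * b) % modulus


# The four products mentioned above, in the integers modulo 10.
for b in (4, 5, 6, 7):
    print(f"3 times {b} modulo 10 is {times_mod(3, b, 10)}")
# 3 times 4 modulo 10 is 2
# 3 times 5 modulo 10 is 5
# 3 times 6 modulo 10 is 8
# 3 times 7 modulo 10 is 1
```

Changing the last argument lets you try any other counting number as the modulus.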
We can do modulo arithmetic with any of the counting numbers. Look, for example, at $\mathbb{Z}_5$ instead. In the integers modulo 5, 3 times 4 is … 2. This doesn’t seem to get us anything new. How about $\mathbb{Z}_8$? In this, 3 times 4 is 4. That’s interesting. It doesn’t make 3 the multiplicative identity for this ring. 3 times 3 is 1, for example. But you’d never see something like that for regular arithmetic.
How about $\mathbb{Z}_{12}$? Now we have 3 times 4 equalling 0. And that’s a dramatic break from how regular numbers work. One thing we know about regular numbers is that if a times b is 0, then either a is 0, or b is 0, or they’re both 0. We rely on this so much in high school algebra. It’s what lets us pick out roots of polynomials. Now? Now we can’t count on that.
When this does happen, when one thing times another equals zero, we have “zero divisors”. These are anything in your ring that can multiply by something else to give 0. Is zero, the additive identity, always a zero divisor? … That depends on what the textbook you first learned algebra from said. To avoid ambiguity, you can write a “nonzero zero divisor”. This clarifies your intentions and slows down your copy editing every time you read “nonzero zero”. Or call it a “nontrivial zero divisor” or “proper zero divisor” instead. My preference is to accept 0 as always being a zero divisor. We can disagree on this. What of zero divisors other than zero?
Your ring might or might not have them. It depends on the ring. The ring of integers $\mathbb{Z}$, for example, doesn’t have any zero divisors except for 0. The ring of integers modulo 12, $\mathbb{Z}_{12}$, though? Anything that isn’t relatively prime to 12 is a zero divisor. So, 2, 3, 4, 6, 8, 9, and 10 are zero divisors here. The ring of integers modulo 13, $\mathbb{Z}_{13}$? That doesn’t have any zero divisors, other than zero itself. In fact any ring of integers modulo a prime number, $\mathbb{Z}_p$, lacks zero divisors besides 0.
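A brute-force search makes the “relatively prime” rule easy to test. This is only a sketch; the function names are mine, and it tries every pair, which is fine for small moduli and hopeless for big ones.

```python
from math import gcd


def nonzero_zero_divisors(n):
    """List each nonzero a, modulo n, that times some nonzero b gives a multiple of n."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]


def shares_a_factor(n):
    """List each nonzero a, modulo n, that is not relatively prime to n."""
    return [a for a in range(1, n) if gcd(a, n) > 1]


# The two lists should agree for every modulus tried.
for n in (10, 12, 13):
    print(n, nonzero_zero_divisors(n) == shares_a_factor(n))
```

For the modulus 13 both lists come back empty, which is the prime-modulus case described above.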
Focusing too much on integers modulo something makes zero divisors sound like some curious shadow of prime numbers. There are some similarities. Whether a number is prime depends on your multiplication rule and what set of things it’s in. Being a zero divisor in one ring doesn’t directly relate to whether something’s a zero divisor in any other. Knowing what the zero divisors are tells you something about the structure of the ring.
It’s hard to resist focusing on integers-modulo-something when learning rings. They work very much like regular arithmetic does. Even the strange thing about them, that every result is from a finite set of digits, isn’t too alien. We do something quite like it when we observe that three hours after 10:00 is 1:00. But many sets of elements can create rings. Square matrixes are the obvious extension. Matrixes are grids of elements, each of which … well, they’re most often going to be numbers. Maybe integers, or real numbers, or complex numbers. They can be more abstract things, like rotations or whatnot, but they’re hard to typeset. It’s easy to find zero divisors in matrixes of numbers. Imagine, like, a matrix that’s all zeroes except for one element, somewhere. There are a lot of matrices which, multiplied by that, will be a zero matrix, one with nothing but zeroes in it. Another common kind of ring is the polynomials. For these you need some constraint like the polynomial coefficients being integers-modulo-something. You can make that work.
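For the matrix case, here is a small hedged example using plain Python lists for two-by-two matrices. The helper `mat_mul` is my own; a real project would more likely reach for a numerical library.

```python
def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [[sum(A[i][k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]


# Each matrix is all zeroes except for one element, and neither is the zero matrix.
A = [[1, 0],
     [0, 0]]
B = [[0, 0],
     [0, 1]]

print(mat_mul(A, B))  # [[0, 0], [0, 0]]
```

Neither A nor B is zero, yet their product is, so both are zero divisors in the ring of two-by-two matrices.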
In 1988 Istvan Beck tried to establish a link between graph theory and ring theory. We now have a usable standard definition of one. If $R$ is any ring, then $\Gamma(R)$ is the zero-divisor graph of $R$. (I know some of you think $R$ is the real numbers. No; that’s a bold-faced $\mathbb{R}$ instead. Unless that’s too much bother to typeset.) You make the graph by putting in a vertex for the elements in $R$. You connect two vertices a and b if the product of the corresponding elements is zero. That is, if they’re zero divisors for one another. (In Beck’s original form, this included all the elements. In modern use, we don’t bother including the elements that are not zero divisors.)
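To make the construction concrete, here is a sketch that builds the modern form of the graph for the integers modulo n. The function name and the dictionary-of-sets representation are my choices, and I leave out loops from an element to itself.

```python
def zero_divisor_graph(n):
    """Adjacency sets for the zero-divisor graph of the integers modulo n.

    Vertices are the nonzero zero divisors. Two distinct vertices are joined
    when their product is a whole multiple of n.
    """
    graph = {}
    for a in range(1, n):
        for b in range(a, n):
            if (a * b) % n == 0:
                graph.setdefault(a, set())
                graph.setdefault(b, set())
                if a != b:  # skip self-loops such as 6 * 6 modulo 12
                    graph[a].add(b)
                    graph[b].add(a)
    return graph


graph = zero_divisor_graph(12)
for vertex in sorted(graph):
    print(vertex, sorted(graph[vertex]))
```

With modulus 12 the vertex 6 comes out joined to 2, 4, 8, and 10, since 6 times any even number is a multiple of 12.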
Drawing this graph makes tools from graph theory available to study rings. We can measure things like the distance between elements, or what paths from one vertex to another exist. What cycles — paths that start and end at the same vertex — exist, and how large they are. Whether the graphs are bipartite. A bipartite graph is one where you can divide the vertices into two sets, and every edge connects one thing in the first set with one thing in the second. What the chromatic number — the minimum number of colors it takes to make sure no two adjacent vertices have the same color — is. What shape does the graph have?
And this lets me complete a cycle in this year’s A-to-Z, to my delight. There is an important question in topology which group theory could answer. It’s a generalization of the zero-divisors conjecture, a hypothesis about what fits in a ring based on certain types of groups. This hypothesis is, actually, several hypotheses. There are a bunch of similar questions about what values the invariants called the $L^2$-Betti numbers can be. These we call the Atiyah Conjecture. This because of work Michael Atiyah did in the cohomology of manifolds starting in the 1970s. It’s work, I admit, I don’t understand well enough to summarize, and hope you’ll forgive me for that. I’m still amazed that one can get to cutting-edge mathematics research from this. It seems, at its introduction, to be only a subversion of how we find x for which $(x - a)(x - b) = 0$.
Nobody had particular suggestions for the letter ‘Y’ this time around. It’s a tough letter to find mathematical terms for. It doesn’t even lend itself to typography or wordplay the way ‘X’ does. So I chose to do one more biographical piece before the series concludes. There were twists along the way in writing.
Several problems beset me in writing about this significant 13th-century Chinese mathematician. One is my ignorance of the Chinese mathematical tradition. I have little to guide me in choosing what tertiary sources to trust. Another is that the tertiary sources know little about him. The Complete Dictionary of Scientific Biography gives a dire verdict. “Nothing is known about the life of Yang Hui, except that he produced mathematical writings”. MacTutor’s biography gives his lifespan as from circa 1238 to circa 1298, on what basis I do not know. He seems to have been born in what’s now Hangzhou, near Shanghai. He seems to have worked as a civil servant. This is what I would have imagined; most scholars then were. It’s the sort of job that gives one time to write mathematics. Also he seems not to have been a prominent civil servant; he’s apparently not listed in any dynastic records. After that, we need to speculate.
E F Robertson, writing the MacTutor biography, speculates that Yang Hui was a teacher. That he was writing to explain mathematics in interesting and helpful ways. I’m not qualified to judge Robertson’s conclusions. And Robertson notes that’s not inconsistent with Yang being a civil servant. Robertson’s argument is based on Yang’s surviving writings, and what they say about the demonstrated problems. There is, for example, 1274’s Cheng Chu Tong Bian Ben Mo. Robertson translates that title as Alpha and omega of variations on multiplication and division. I try to work out my unease at having something translated from Chinese as “Alpha and Omega”. That is my issue. Relevant here is that a syllabus prefaces the first chapter. It provides a schedule and series of topics, as well as a rationale for why this plan.
Was Yang Hui a discoverer of significant new mathematics? Or did he “merely” present what was already known in a useful way? This is not to dismiss him; we have the same questions about Euclid. He is held up as among the great Chinese mathematicians of the 13th century, a particularly fruitful time and place for mathematics. How much greatness to assign to original work and how much to good exposition is unanswerable with what we know now.
Consider for example the thing I’ve featured before, Yang Hui’s Triangle. It’s the arrangement of numbers known in the west as Pascal’s Triangle. Yang provides the earliest extant description of the triangle and how to form it and use it. This in the 1261 Xiangjie jiuzhang suanfa (Detailed analysis of the mathematical rules in the Nine Chapters and their reclassifications). But in it, Yang Hui says he learned the triangle from a treatise by Jia Xian, Huangdi Jiuzhang Suanjing Xicao (The Yellow Emperor’s detailed solutions to the Nine Chapters on the Mathematical Art). Jia Xian lived in the 11th century; he’s known to have written two books, both lost. Yang Hui’s commentary gives us a fair idea what Jia Xian wrote about. But we’re limited in judging what was Jia Xian’s idea and what was Yang Hui’s inference or what.
The Nine Chapters referred to is Jiuzhang suanshu. An English title is Nine Chapters on the Mathematical Art. The book is a 246-problem handbook of mathematics that dates back to antiquity. It’s impossible to say when the Nine Chapters was first written. Liu Hui, who wrote a commentary on the Nine Chapters in 263 CE, thought it predated the Qin ruler Shih Huang Ti’s 213 BCE destruction of all books. But the book, and the many commentaries on the book, served as a centerpiece for Chinese mathematics for a long while. Jia Xian’s and Yang Hui’s work was part of this tradition.
Yang Hui’s Detailed Analysis covers the Nine Chapters. It goes on for three chapters, more about geometry and fundamentals of mathematics. Even how to classify the problems. He had further works. In 1275 Yang published Practical mathematical rules for surveying and Continuation of ancient mathematical methods for elucidating strange properties of numbers. (I’m not confident in my ability to give the Chinese titles for these.) The first title particularly echoes how in the Western tradition geometry was born of practical concerns.
The breadth of topics covers, it seems to me, a decent modern (American) high school mathematics education. The triangle, and the binomial expansions it gives us, fit that. Yang writes about more efficient ways to multiply on the abacus. He writes about finding simultaneous solutions to sets of equations. And through a technique that amounts to finding the matrix of coefficients for the equations, and its determinant. He writes about finding the roots for cubic and quartic equations. The technique is commonly known in the west as Horner’s Method, a technique of calculating divided differences. We see the calculating of areas and volumes for regular shapes.
And sequences. He found the sum of the squares of natural numbers followed a rule:
$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{6} n (n + 1)(2n + 1)$
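The closed form usually given for this sum is n(n + 1)(2n + 1)/6. Here is a quick numerical check of it, as a sketch; the function names are mine, and checking finitely many cases is evidence, not a proof.

```python
def sum_of_squares(n):
    """Add up 1^2 + 2^2 + ... + n^2 the slow way."""
    return sum(k * k for k in range(1, n + 1))


def closed_form(n):
    """The usual closed form n(n + 1)(2n + 1)/6, which is always a whole number."""
    return n * (n + 1) * (2 * n + 1) // 6


print(sum_of_squares(10), closed_form(10))  # 385 385
print(all(sum_of_squares(n) == closed_form(n) for n in range(200)))  # True
```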
And then there’s magic squares, and magic circles. He seems to have found them, as professional mathematicians today would, good ways to interest people in calculation. Not magic; he called them something like number diagrams. But he gives magic squares from three-by-three all the way to ten-by-ten. We don’t know of earlier examples of Chinese mathematicians writing about the larger magic squares. But Yang Hui doesn’t claim to be presenting new work. He also gives magic circles. The simplest is a web of seven intersecting circles, each with four numbers along the circle and one at its center. The sum of the center and the circumference numbers is 65 for all seven circles. Is this significant? No; merely fun.
Grant this breadth of work. Is he significant? I learned this year that familiar names might have been obscure until quite recently. The record is once again ambiguous. Other mathematicians wrote about Yang Hui’s work in the early 1300s. Yang Hui’s works were printed in China in 1378, says the Complete Dictionary of Scientific Biography, and reprinted in Korea in 1433. They’re listed in a 1441 catalogue of the Ming Imperial Library. Seki Takakazu, a towering figure in 17th century Japanese mathematics, copied the Korean text by hand. Yet Yang Hui’s work seems to have been lost by the 18th century. Reconstructions, from commentaries and encyclopedias, started in the 19th century. But we don’t have everything we know he wrote. We don’t even have a complete text of Detailed Analysis. This is not to say he wasn’t influential. All I could say is there seems to have been a time his influence was indirect.
We learn to count permutations before we know what they are. There are good reasons to. Counting permutations gives us numbers that are big, and therefore interesting, fast. Counting is easy to motivate. Humans like counting. Counting is useful. Many probability questions are best answered by counting all the ways to arrange things, and how many of those arrangements are desirable somehow.
The count of permutations asks how many ways there are to put some things in order. If some of the things are identical, the number is smaller. Calculating the count may be a little tedious, but it’s not hard. We calculate, rather than “really” count, because — well, list all the possible ways to arrange the letters of the word ‘DEMONSTRATION’. I bet you turn that listing over to a computer too. But what is the computer counting?
If we’re trying to do this efficiently we have some system. Start with ‘DEMONSTRATION’. Then, say, swap the last two letters: ‘DEMONSTRATINO’. Then, mm, move the ‘N’ to the antepenultimate position: ‘DEMONSTRATNIO’. Then, oh, swap the last two letters again: ‘DEMONSTRATNOI’.
Then, oh, move the ‘N’ to the third-to-the-last position: ‘DEMONSTRANTIO’. What next? Oh, swap the last two letters again: ‘DEMONSTRANTOI’. Or, move what had been the last letter to the antepenultimate position: ‘DEMONSTRANOTI’. And swap the last two letters once more: ‘DEMONSTRANOIT’.
Enough of that, you and my spellchecker say. I agree. What is it that all this is doing? What does that tell us about what a permutation is?
An obvious thing. Each new variation of the order came from swapping two letters of an earlier one. We needed a sequence of swaps to get to ‘DEMONSTRANOIT’. But each swap was of only two things. It’s a good thing to observe.
Another obvious thing. There’s no letters in ‘DEMONSTRANOIT’ or any of the other variations that weren’t in ‘DEMONSTRATION’. All that’s changed is the order.
This all has listed eight permutations, counting the original ‘DEMONSTRATION’ as one. There are, calculations tell me, 778,377,592 to go.
Would the number of permutations be different if we were shuffling around different things? If instead of the letters in the word ‘DEMONSTRATION’ it were, say, the numerals in the sequence ‘1234567897045’? Or the sequence of symbols ‘!@#$%^&*(&)$%’ instead? No, and that it would not is another clue about what permutations are.
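Here is roughly the calculation I have in mind, as a Python sketch. It takes the factorial of the number of symbols and divides by the factorial of each symbol’s repeat count. The function name is my own.

```python
from collections import Counter
from math import factorial


def distinct_arrangements(symbols):
    """Count the distinct orderings of a sequence that may contain repeated symbols."""
    count = factorial(len(symbols))
    for repeats in Counter(symbols).values():
        count //= factorial(repeats)  # identical symbols can't be told apart
    return count


total = distinct_arrangements("DEMONSTRATION")
print(total)      # 778377600
print(total - 8)  # 778377592, after the eight already listed

# The count depends only on the pattern of repeats, not on the symbols used.
print(distinct_arrangements("1234567897045") == total)  # True
print(distinct_arrangements("!@#$%^&*(&)$%") == total)  # True
```

‘DEMONSTRATION’ has thirteen letters with O, N, and T each appearing twice, so the count is 13! divided by 2 times 2 times 2.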
Another thing, obvious in retrospect. Grant that we’ve been making new permutations by taking a sequence of letters (numerals, symbols) and swapping a pair. We got from ‘DEMONSTRATION’ to ‘DEMONSTRATINO’ by swapping the last two letters. What happens if we swap the last two letters again? We get ‘DEMONSTRATION’, a sequence of letters all right, although one already on our list of permutations.
One more thing, obvious once you’ve seen it. Imagine we had not started with ‘DEMONSTRATION’ but instead ‘DEMONSTRATNIO’. But that we followed the same sequences of swappings. Would we have come up with different permutations? … At least for the first couple permutations? Or would they be the same permutations, listed in a different order?
You’ve been kind, letting me call these things “permutations” before I say what a permutation is. It’s relied on a casual, intuitive idea of a permutation. It’s a shuffling around of some set of things. This is the casual idea that mathematicians rely on for a permutation. Sure we can make the idea precise. How hard will that be?
It’s not hard in form. The permutation is the rearranging of things into a new order. The hard part is the concept. It’s not “these symbols in this order” that’s the permutation. It’s the act of putting them in this new order that is. So it’s “swap the 12th and the 13th symbols”. Or, “move the 13th symbol to 11th place, the 11th symbol to 12th, and the 12th symbol to 13th place”.
So one permutation is “swap the 12th and the 13th elements”. Another permutation is “swap the 11th and the 12th elements”. Since the range of one function is the domain of another, we can compose them together. That is, we can “swap the 12th and the 13th elements, and then swap the 11th and the 12th elements”. This gets us another permutation. The effect of these two permutations, in this order, is “make the 13th element the 11th, make the 11th element the 12th, and make the 12th element the 13th”. The order we do these permutations in counts. “Swap the 11th and the 12th elements, and then swap the 12th and the 13th” gets us a different net effect. That one is “make the 12th element the 11th, make the 13th element the 12th, and make the 11th element the 13th”. Composition of functions does not commute.
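Treating each swap as a function makes the order-dependence easy to see. This is a sketch with names of my own choosing (`swap`, `then`); positions count from 1 to match the prose.

```python
def swap(i, j):
    """Build the permutation 'swap the i-th and j-th elements' as a function on words."""
    def apply(word):
        letters = list(word)
        letters[i - 1], letters[j - 1] = letters[j - 1], letters[i - 1]
        return "".join(letters)
    return apply


def then(first, second):
    """Compose two permutations: do `first`, and then do `second` to the result."""
    return lambda word: second(first(word))


swap_12_13 = swap(12, 13)
swap_11_12 = swap(11, 12)

print(then(swap_12_13, swap_11_12)("DEMONSTRATION"))  # DEMONSTRATNIO
print(then(swap_11_12, swap_12_13)("DEMONSTRATION"))  # DEMONSTRATONI
```

The same two swaps, done in the two possible orders, leave the last three letters as NIO in one case and ONI in the other.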
That functions compose is normal enough. That their composition doesn’t commute is normal enough too. These functions are a bit odd in that we don’t care what the domain-and-range is. We only care that we can index the elements in it. That leads us to some new observations.
The big one is that the set of all these permutations is a group. I mean the way mathematicians mean group. That is, we have a set of items. These are the functions, the permutations. The instructions, like, “make the 12th element the 11th and the 13th element the 12th”, or “the 12th element the 13th”. We also need a group operation, a thing that works like addition does for real numbers. That’s composition. That is, doing one permutation and then the other, to get a new permutation out of it. That new permutation is itself one of the permutations we’d had. We can’t compose permutations and get something that’s not a permutation. No amount of swapping around the letters of ‘DEMONSTRATION’ will get us ‘DEMONSTRATIONERS’.
When we talk about how permutations as a group work, we want to give individual permutations names. That ends up being letters. These are often Greek letters. I don’t know why we can’t use the ordinary Latin alphabet. I suppose someone who liked Greek letters wrote a really good textbook and everyone copies that. So instead of speaking about x and y, we’ll get α and β. Sometimes σ and τ. Or, quite often π, especially if we need a bunch of permutations. Then we get $\pi_1$, $\pi_2$, $\pi_3$, and so on. $\pi_j$. All the way to $\pi_N$. For the young mathematics major it might be the first time seeing π used for something not at all circle-related. It’s a weird sensation. Still, αβ is the composition of permutation α with permutation β. This means, do permutation β first, and then permutation α on whatever that result is. This is the same way that f(g(x)) means “evaluate g(x) first, and then figure out what f(that) is”.
That’s all fine for naming them. But we would also like a good way to describe what a permutation does. There are several good forms. They all rely on indexing the elements, using the counting numbers: 1, 2, 3, 4, and so on. The notation I’ll share is called cycle notation. It’s easy to type. You write it within nice ordinary parentheses: (11 12) means “put the 11th element in slot 12, and the 12th element in slot 11”. (11, 12, 13) means “put the 11th element in slot 12, the 12th element in slot 13, and the 13th element in slot 11”. You can even chain these together: (10, 11)(12, 13) means “put the 10th element in slot 11 and the 11th element in slot 10; also, put the 12th element in slot 13, and the 13th element in slot 12”.
In that notation, writing (9), for example, means “put the 9th element in slot 9”. Or if you prefer, “leave element 9 alone”. Or we don’t mention it at all. The convention is that if something isn’t mentioned, leave it where it is.
This by the way is where we get the identity element. The permutation (1)(2)(3)(4)(etc) doesn’t actually swap anything. It counts as a permutation. Doing this is the equivalent of adding zero to a number.
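The reading of cycle notation given above, where each listed slot hands its item to the next listed slot, can be sketched in Python like so. The function names are mine, and I only combine cycles that share no slots, so the order of applying them can’t matter.

```python
def apply_cycle(cycle, word):
    """Move the item in each listed slot to the next listed slot (counting from 1).

    The item in the last listed slot wraps around to the first listed slot.
    """
    letters = list(word)
    result = list(letters)
    for k, source in enumerate(cycle):
        target = cycle[(k + 1) % len(cycle)]
        result[target - 1] = letters[source - 1]
    return "".join(result)


def apply_disjoint_cycles(cycles, word):
    """Apply several cycles that have no slots in common."""
    for cycle in cycles:
        word = apply_cycle(cycle, word)
    return word


word = "DEMONSTRATION"
print(apply_cycle((11, 12), word))                        # DEMONSTRATOIN
print(apply_cycle((11, 12, 13), word))                    # DEMONSTRATNIO
print(apply_disjoint_cycles([(10, 11), (12, 13)], word))  # DEMONSTRAITNO
print(apply_cycle((9,), word))                            # DEMONSTRATION
```

The last line is the “leave element 9 alone” case: a one-slot cycle changes nothing, which is why it’s fine to not mention it at all.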
This cycle notation makes it not hard to figure out the composition of permutations. What does (1 2)(1 3) do? Well, the (1 3) swaps the first and the third items. The (1 2), next, swaps what’s become the first and the second items. The net effect puts the first item in slot 3, the third item in slot 2, and the second item in slot 1. That is the same as the permutation (1 3 2). You can get pretty good at this sort of manipulation, in time.
You may also consider: if (1 2)(1 3) is the same as (1 3 2), then isn’t (1 3 2) the same as (1 2)(1 3)? Sure. But, like, can we write a longer permutation, like, (1 3 5 2 4), as the product of some smaller permutations? And we can. If it’s convenient, we can write it as a string of swaps, exchanging pairs of elements. This was the first “obvious” thing I had listed. A long enough chain of pairwise swaps will, in time, swap everything.
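Here is one way, of many, to break a cycle like (1 3 5 2 4) into pairwise swaps. It’s a sketch under the slot-moving convention used earlier: swap the first listed slot with each of the others in turn. The function names are mine.

```python
def apply_cycle(cycle, items):
    """Move the item in each listed slot to the next listed slot, wrapping around."""
    result = list(items)
    for k, source in enumerate(cycle):
        target = cycle[(k + 1) % len(cycle)]
        result[target - 1] = items[source - 1]
    return result


def cycle_to_swaps(cycle):
    """Swaps to perform, in order, that have the same net effect as the cycle."""
    first = cycle[0]
    return [(first, other) for other in cycle[1:]]


def apply_swaps(swaps, items):
    """Perform each swap of two slots, one after another."""
    result = list(items)
    for i, j in swaps:
        result[i - 1], result[j - 1] = result[j - 1], result[i - 1]
    return result


cycle = (1, 3, 5, 2, 4)
swaps = cycle_to_swaps(cycle)
print(swaps)                                      # [(1, 3), (1, 5), (1, 2), (1, 4)]
print("".join(apply_cycle(cycle, list("ABCDE"))))  # DEABC
print("".join(apply_swaps(swaps, list("ABCDE"))))  # DEABC
```

A five-slot cycle takes four swaps this way. The decomposition isn’t unique, but the count of swaps will always be even for this cycle.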
We call the group made of all these permutations the Symmetric Group of the set. Since it doesn’t matter what the underlying set is, just the number of elements in it, we can abbreviate this with the number of elements. $S_2$. $S_4$. $S_N$. Symmetric Groups are among the first groups you meet in abstract algebra that aren’t, like, integers modulo 12 or symmetries of a triangle. They’re novel enough to be interesting, and novel enough that you can’t be completely sure you’re doing it right.
You never leave the Symmetric Group, though, not if you stay in algebra. It has powerful consequences. It ties, for example, into the roots of polynomials. The structure of $S_5$ tells us there must exist fifth-degree polynomials we can’t solve by ordinary arithmetic and root operations. That is, there’s no version of the quadratic formula for polynomials of fifth degree or higher, and never can be.
There are more groups to build from permutations. The next one that you meet in Intro to Abstract Algebra is the Alternating Group. It’s made of only the even permutations. Those are the permutations made from an even number of swaps. (There are also odd permutations, which are what you imagine. They can’t make a group, though. No identity element.) They’re great for recapturing dread and uncertainty once you think you’ve got a handle on the Symmetric Group.
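One standard way to sort permutations into even and odd is to count inversions, pairs of entries that appear out of their natural order. The parity of that count matches the parity of the number of swaps. Here is a sketch for four elements; the function name is my own.

```python
from itertools import permutations


def is_even(arrangement):
    """True when the arrangement has an even number of out-of-order pairs."""
    size = len(arrangement)
    inversions = sum(1 for i in range(size) for j in range(i + 1, size)
                     if arrangement[i] > arrangement[j])
    return inversions % 2 == 0


arrangements = list(permutations(range(1, 5)))
even = [p for p in arrangements if is_even(p)]

print(len(arrangements), len(even))  # 24 12
print((1, 2, 3, 4) in even)          # True: the identity is even
print(is_even((2, 1, 3, 4)))         # False: a single swap is odd
```

Half the permutations are even, and the identity is among them. The odd half has no identity, which is why it can’t be a group.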
They lead to other groups too, and even rings. The Levi-Civita symbol describes whether a set of indices gives an even or an odd permutation (or neither). It makes life easier when we work on determinants and tensors and Jacobians. These tie in to the geometry of space, and how that affects physics. It also gets a supporting role in cross products. There are many cryptography schemes that have permutations at their core.
So this is a bit of what permutations are, and what they can get us.
I should have gone with Vayuputrii’s proposal that I talk about the Kronecker Delta. But both Jacob Siehler and Mr Wu proposed K-Theory as a topic. It’s a big and an important one. That was compelling. It’s also a challenging one. This essay will not teach you K-Theory, or even get you very far in an introduction. It may at least give some idea of what the field is about.
This is a difficult topic to discuss. It’s an important theory. It’s an abstract one. The concrete examples are either too common to look interesting or are already deep into things like “tangent bundles of $S^{n-1}$”. There are people who find tangent bundles quite familiar concepts. My blog will not be read by a thousand of them this month. Those who are familiar with the legends grown around Alexander Grothendieck will nod on hearing he was a key person in the field. Grothendieck was of great genius, and also spectacular indifference to practical mathematics. Allegedly he once, pressed to apply something to a particular prime number for an example, proposed 57, which is not prime. (One does not need to be a genius to make a mistake like that. If I proposed 447 or 449 as prime numbers, how long would you need to notice I was wrong?)
K-Theory predates Grothendieck. Now that we know it’s a coherent mathematical idea we can find elements leading to it going back to the 19th century. One important theorem has Bernhard Riemann’s name attached. Henri Poincaré contributed early work too. Grothendieck did much to give the field a particular identity. Also a name, the K coming from the German Klasse. Grothendieck pioneered what we now call Algebraic K-Theory, working on the topic as a field of abstract algebra. There is also a Topological K-Theory, early work on which we thank Michael Atiyah and Friedrich Hirzebruch for. Topology is, popularly, thought of as the mathematics of flexible shapes. It is, but we get there from thinking about relationships between sets, and these are the topologies of K-Theory. We understand these now as different ways of understanding structures.
You find at the center of K-Theory either “coherent sheaves” or “vector bundles”. Which alternative depends on whether you prefer Algebraic or Topological K-Theory. Both alternatives are ways to encode information about the space around a shape. Let me talk about vector bundles because I find that easier to describe. Take a shape, anything you like. A closed ribbon. A torus. A Möbius strip. Draw a curve on it. Every point on that curve has a tangent plane, the plane that just touches your original shape, and that’s guaranteed to touch your curve at one point. What are the directions you can go in that plane? That collection of directions is a fiber bundle — a tangent bundle — at that point. (As ever, do not use this at your thesis defense for algebraic topology.)
Now: what are all the tangent bundles for all the points along that curve? Does their relationship tell you anything about the original curve? The question is leading. If their relationship told us nothing, this would not be a subject anyone studies. If you pick a point on the curve and look at its tangent bundle, and you move that point some, how does the tangent bundle change?
Why create such a thing? The usual reasons. Often it turns out calculating something is easier on the associated ring than it is on the original space. What are we looking to calculate? Typically, we’re looking for invariants. Things that are true about the original shape whatever ways it might be rotated or stretched or twisted around. Invariants can be things as basic as “the number of holes through the solid object”. Or they can be as ethereal as “the total energy in a physics problem”. Unfortunately if we’re looking at invariants that familiar, K-Theory is probably too much overhead for the problem. I confess to feeling overwhelmed by trying to learn enough to say what it is for.
There are some big things which it seems well-suited to do. K-Theory describes, in its way, how the structure of a set of items affects the functions it can have. This links it to modern physics. The great attention-drawing topics of 20th century physics were quantum mechanics and relativity. They still are. The great discovery of 20th century physics has been learning how much of it is geometry. How the shape of space affects what physics can be. (Relativity is the accessible reflection of this.)
And so K-Theory comes to our help in string theory. String theory exists in that grand unification where mathematics and physics and philosophy merge into one. I don’t toss philosophy into this as an insult to philosophers or to string theoreticians. Right now it is very hard to think of ways to test whether a particular string theory model is true. We instead ponder what kinds of string theory could be true, and how we might someday tell whether they are. When we ask what things could possibly be true, and how to tell, we are working for the philosophy department.
My reading tells me that K-Theory has been useful in condensed matter physics. That is, when you have a lot of particles and they interact strongly. When they act like liquids or solids. I can’t speak from experience, either on the mathematics or the physics side.
I can talk about an interesting mathematical application. It’s described in detail in section 2.3 of Allen Hatcher’s text Vector Bundles and K-Theory. It comes about from consideration of the Hopf invariant, named for Heinz Hopf for what I trust are good reasons. It also comes from consideration of homomorphisms. A homomorphism is a matching between two sets of things that preserves their structure. This has a precise definition, but I can make it casual. Have you noticed that every (American, hourlong) late-night chat show is basically the same? The host at his desk, the jovial band leader, the monologue, the show rundown? Two guests and a band? (At least in normal times.) Then you have noticed the homomorphism between these shows. A mathematical homomorphism is more about preserving the products of multiplication. Or it preserves the existence of a thing called the kernel. That is, you can match up elements and how the elements interact.
What’s important is Adams’ Theorem of the Hopf Invariant. I’ll write this out (quoting Hatcher) to give some taste of K-Theory:
The following statements are true only for n = 1, 2, 4, and 8:
a. $\mathbb{R}^n$ is a division algebra.
b. $S^{n-1}$ is parallelizable, ie, there exist n – 1 tangent vector fields to $S^{n-1}$ which are linearly independent at each point, or in other words, the tangent bundle to $S^{n-1}$ is trivial.
This is, I promise, low on jargon. “Division algebra” is familiar to anyone who did well in abstract algebra. It means a ring where every element, except for zero, has a multiplicative inverse. That is, division exists. “Linearly independent” is also a familiar term, to the mathematician. Almost every subject in mathematics has a concept of “linearly independent”. The exact definition varies but it amounts to the set of things having neither redundant nor missing elements.
The proof from there sprawls out over a bunch of ideas. Many of them I don’t know. Some of them are simple. The conditions on the Hopf invariant and all that stuff eventually turn into finding values of n for which $2^n$ divides $3^n - 1$. There are only three values of ‘n’ that do that. For example.
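That divisibility condition is easy to poke at by brute force. This is a minimal sketch, assuming the condition is the one in Hatcher’s proof, that $2^n$ divides $3^n - 1$. The function name and the search bound of 200 are my own choices, and a finite search is evidence, not a proof.

```python
def two_power_divides(n):
    """True when 2**n divides 3**n - 1."""
    return (3**n - 1) % (2**n) == 0

# The bound 200 is arbitrary; the proof, not this loop, covers every n.
hits = [n for n in range(1, 201) if two_power_divides(n)]
print(hits)  # [1, 2, 4]
```

In Hatcher’s argument these three values of n go with spaces of dimension 2n, which is where the 2, 4, and 8 come from.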
What all that tells us is that if you want to do something like division on ordered sets of real numbers you have only a few choices. You can have a single real number, $\mathbb{R}^1$. Or you can have an ordered pair, $\mathbb{R}^2$. Or an ordered quadruple, $\mathbb{R}^4$. Or you can have an ordered octuple, $\mathbb{R}^8$. And that’s it. Not that other ordered sets can’t be interesting. They will all diverge far enough from the way real numbers work that you can’t do something that looks like division.
And now we come back to the running theme of this year’s A-to-Z. Real numbers are real numbers, fine. Complex numbers? We have some ways to understand them. One of them is to match each complex number with an ordered pair of real numbers. We have to define a more complicated multiplication rule than “first times first, second times second”. This rule is the rule implied if we come to $\mathbb{R}^2$ through this avenue of K-Theory. We get this matching between real numbers and the first great expansion on real numbers.
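The “more complicated multiplication rule” for ordered pairs fits in a few lines. The rule below is the standard complex-number product written out on pairs. The name `pair_multiply` is mine, and the comparison against Python’s built-in `complex` type is only a sanity check.

```python
def pair_multiply(p, q):
    """Multiply two ordered pairs of reals the way complex numbers multiply."""
    a, b = p
    c, d = q
    # Not "first times first, second times second": the parts get mixed.
    return (a * c - b * d, a * d + b * c)

p, q = (1.0, 2.0), (3.0, -1.0)
print(pair_multiply(p, q))        # (5.0, 5.0)
print(complex(*p) * complex(*q))  # (5+5j)
```

The pair (0, 1) times itself comes out as (-1, 0), which is the whole reason for the fussier rule.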
The next great expansion of complex numbers is the quaternions. We can understand them as ordered quartets of real numbers. That is, as $\mathbb{R}^4$. We need to make our multiplication rule a bit fussier yet to do this coherently. Guess what fuss we’d expect coming through K-Theory?
$\mathbb{R}^8$ seems the odd one out; who does anything with that? There is a set of numbers that neatly matches this ordered set of octuples. It’s called the octonions, sometimes called the Cayley Numbers. We don’t work with them much. We barely work with quaternions, as they’re a lot of fuss. Multiplication on them doesn’t even commute. (They’re very good for understanding rotations in three-dimensional space. You can also use them as vectors. You’ll do that if your programming language supports quaternions already.) Octonions are more challenging. Not only does their multiplication not commute, it’s not even associative. That is, if you have three octonions — call them p, q, and r — you can expect that p times the product of q-and-r would be different from the product of p-and-q times r. Real numbers don’t work like that. Complex numbers or quaternions don’t either.
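Both failures can be watched happening. The sketch below uses the Cayley–Dickson doubling construction, which this essay doesn’t otherwise discuss, to build quaternions and octonions out of pairs of smaller numbers. It uses one common sign convention, and the names `conj`, `mul`, and `unit` are mine. Take it as an illustration under those assumptions, not the only way to set these numbers up.

```python
def conj(x):
    """Conjugate of a number stored as a list of 1, 2, 4, or 8 reals."""
    if len(x) == 1:
        return list(x)
    half = len(x) // 2
    return conj(x[:half]) + [-t for t in x[half:]]

def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - conj(d) b, da + b conj(c))."""
    if len(x) == 1:
        return [x[0] * y[0]]
    half = len(x) // 2
    a, b = x[:half], x[half:]
    c, d = y[:half], y[half:]
    first = [s - t for s, t in zip(mul(a, c), mul(conj(d), b))]
    second = [s + t for s, t in zip(mul(d, a), mul(b, conj(c)))]
    return first + second

def unit(size, index):
    """Basis element: a 1 in the given slot, 0 elsewhere."""
    return [1 if k == index else 0 for k in range(size)]

# Quaternions: swapping the order flips the sign.
i, j = unit(4, 1), unit(4, 2)
print(mul(i, j), mul(j, i))

# Octonions: moving the parentheses flips the sign.
e1, e2, e4 = unit(8, 1), unit(8, 2), unit(8, 4)
print(mul(mul(e1, e2), e4), mul(e1, mul(e2, e4)))
```

With these choices the two quaternion products are negatives of each other, and so are the two ways of grouping the octonion product.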
Octonions let us have a meaningful division, so we could write out $p \div q$ and know what it meant. We won’t see that for any bigger ordered set of real numbers. And K-Theory is one of the tools which tells us we may stop looking.
This is hardly the last word in the field. It’s barely the first. It is at least an understandable one. The abstractness of the field works against me here. It does offer some compensations. Broad applicability, for example; a theorem tied to few specific properties will work in many places. And pure aesthetics too. Much work, in statements of theorems and their proofs, involves lovely diagrams. You’ll see great lattices of sets relating to one another. They’re linked by chains of homomorphisms. And, in further aesthetics, beautiful words strung into lovely sentences. You may not know what it means to say “Pontryagin classes also detect the nontorsion in outside the stable range”. I know I don’t. I do know when I hear a beautiful string of syllables and that is a joy of mathematics never appreciated enough.
These days I’ve been preparing these comics posts by making a note of every comic that seems like it might have a mathematical topic. Then at the end of the week I go back and re-read them all and think what I could write something about. This past week’s had two that seemed like nice juicy topics. And then I was busy all day Saturday so didn’t have time to put the thought into them that they needed. So instead I offer some comic strips with at least mentions of mathematical subjects. If they’re not tightly on point, well, I need to post something, don’t I?
Jeffrey Caulfield and Brian Ponshock’s Yaffle for the 24th is the anthropomorphic numerals joke for the week. It did get me thinking about the numbers which (in English) are homophones to other words. There don’t seem to be many, though: one, two, four, six, and eight seem to be about all I could really justify. There are probably dialects where “ten” and “tin” blend together. There’s probably a good Internet Argument to be had about whether “couple” should be considered the name of a number. That there aren’t more is probably because there are, in a sense, only a couple of names for numbers, with a scheme to compound names for a particular number of interest.
Scott Hilburn’s The Argyle Sweater for the 25th mentions algebra, but is mostly aimed at the Reading the Comics for some historian blogger. I kind of admire Hilburn’s willingness to go for the 70-year-old scandal for a day’s strip. But a daily strip demands a lot of content, especially when it doesn’t have recurring characters. The quiz answers as given are correct, and that’s easy to check. But it is typically easy to check whether a putative answer is correct. Finding an answer is the hard part.
Daniel Shelton’s Ben for the 25th has a four-year-old offering his fingers as a way to help his older brother with mathematics work. Counting on fingers can be a fine way to get the hang of arithmetic and at least I won’t fault someone for starting there. Eventually, do enough arithmetic, and you stop matching numbers with fingers because that adds an extra layer of work that doesn’t do anything but slow you down.
Catching my interest though is that Nicholas (the eight-year-old, and I had to look that up on the Ben comic strip web site; GoComics doesn’t have a cast list) had worked out 8 + 6, but was struggling with 7 + 8. He might at some point get experienced enough to realize that 7 + 8 has to be the same thing as 8 + 7, which has to be the same thing as 8 + 6 + 1. And if he’s already got 8 + 6 nailed down, then 7 + 8 is easy. But that takes using a couple of mathematical principles — that addition commutes, that you can substitute one quantity with something equal to it, that addition associates — and he might not see where those principles get him any advantage over some other process.
Ed Allison’s Unstrange Phenomena for the 25th builds its Dadaist nonsense for the week around repeating numbers. I learn from trying to pin down just what Allison means by “repeating numbers” that there are people who ascribe mystical significance to, say, “444”. Well, if that helps you take care of the things you need to do, all right. Repeating decimals are a common enough thing. They appear in the decimal expressions for rational numbers. These expressions either terminate — they have finitely many digits and then go to an infinitely long sequence of 0’s — or they repeat. (We rule out “repeating nothing but zeroes” because … I don’t know. I would guess it makes the proofs in some corner of number theory less bothersome.)
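The terminate-or-repeat behavior falls right out of long division. There are only finitely many possible remainders, so either one of them is zero or one of them comes back around. Here is a minimal sketch, assuming positive whole-number inputs. The function name and the three-part return value are my own choices.

```python
def decimal_expansion(numerator, denominator):
    """Return (integer part, non-repeating digits, repeating digits) as strings."""
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position of the digit it produced
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    if remainder == 0:
        # Terminating case: nothing repeats (except the zeroes we rule out).
        return str(integer_part), "".join(digits), ""
    start = seen[remainder]
    return str(integer_part), "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 8))  # ('0', '125', '')
print(decimal_expansion(1, 6))  # ('0', '1', '6')
print(decimal_expansion(1, 7))  # ('0', '', '142857')
```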
You could also find interesting properties about numbers made up of repeating strings of numerals. For example, write down any number of 9’s you like, followed by a 6. The number that creates is divisible by 6. I grant this might not be the most important theorem you’ll ever encounter, but it’s a neat one. Like, a string of 4’s followed by a 9 is not necessarily divisible by 4 or 9. There are bunches of cute little theorems like this, mostly good for making one admit that huh, there’s some neat coincidences(?) about numbers.
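Both claims are cheap to spot-check. This sketch tries the first fifty cases of 9’s-then-a-6 and a handful of 4’s-then-a-9. The helper names and the ranges are arbitrary choices of mine, and a finite check is not the theorem. (The theorem is short, though: the number is even, and its digits sum to a multiple of 3.)

```python
def nines_then_six(count):
    """The number written as `count` 9's followed by a 6."""
    return int("9" * count + "6")

def fours_then_nine(count):
    """The number written as `count` 4's followed by a 9."""
    return int("4" * count + "9")

print(all(nines_then_six(k) % 6 == 0 for k in range(50)))  # True

# These end in 9, so they're odd and never divisible by 4.
# Divisibility by 9 comes and goes with the number of 4's.
for k in (1, 2, 3, 9):
    n = fours_then_nine(k)
    print(n, n % 4 == 0, n % 9 == 0)
```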
Although … Allison’s strip does seem to get at seeing particular numbers over and over. This does happen; it’s probably a cultural thing. One of the uses we put numbers to is indexing things. So, for example, a TV channel gets a number and while the station may have a name, it makes for an easier control to set the TV to channel numbered 5 or whatnot. We also use numbers to measure things. When we do, we get to pick the size of our units. We typically pick them so our measurements don’t have to be numbers too big or too tiny. There’s no reason we couldn’t measure the distance between cities in millimeters, or the length of toes in light-years. But to try is to look like you’re telling a joke. So we get to see some ranges — 1 to 5, 1 to 10 — used a lot when we don’t need fine precision. We see, like, 1 to 100 for cases where we need more precision than that but don’t have to pin a thing down to, like, a quarter of a percent. Numbers will spill past these bounds, naturally. But we are more likely to encounter a 20 than a 15,642. We set up how we think about numbers so we are. So maybe it would look like some numbers just follow you.
Nobody had a suggested topic starting with ‘W’ for me! So I’ll take that as a free choice, and get lightly autobiographical.
Witch of Agnesi.
I know I encountered the Witch of Agnesi while in middle school. Eighth grade, if I’m not mistaken. It was a footnote in a textbook. I don’t remember much of the textbook. What I mostly remember of the course was how much I did not fit with the teacher. The only relief from boredom that year was the month we had a substitute and the occasional interesting footnote.
It was in a chapter about graphing equations. That is, finding curves whose points have coordinates that satisfy some equation. In a bit of relief from lines and parabolas the footnote offered this:
$y = \frac{8a^3}{x^2 + 4a^2}$
In a weird tantalizing moment the footnote didn’t offer a picture. Or say what an ‘a’ was doing in there. In retrospect I recognize ‘a’ as a parameter, and that different values of it give different but related shapes. No hint what the ‘8’ or the ‘4’ were doing there. Nor why ‘a’ gets raised to the third power in the numerator or the second in the denominator. I did my best with the tools I had at the time. Picked a nice easy boring ‘a’. Picked out values of ‘x’ and found the corresponding ‘y’ which made the equation true, and tried connecting the dots. The result didn’t look anything like a witch. Nor a witch’s hat.
It was one of a handful of biographical notes in the book. These were a little attempt to add some historical context to mathematics. It wasn’t much. But it was an attempt to show that mathematics came from people. Including, here, from Maria Gaëtana Agnesi. She was, I’m certain, the only woman mentioned in the textbook I’ve otherwise completely forgotten.
We have few names of ancient mathematicians. Those we have are often compilers like Euclid whose fame obliterated the people whose work they explained. Or they’re like Pythagoras, credited with discoveries by people who obliterated their own identities. In later times we have the mathematics done by, mostly, people whose social positions gave them time to write mathematics results. So we see centuries where every mathematician is doing it as their side hustle to being a priest or lawyer or physician or combination of these. Women don’t get the chance to stand out here.
Today of course we can name many women who did, and do, mathematics. We can name Emmy Noether, Ada Lovelace, and Marie-Sophie Germain. Challenged to do a bit more, we can offer Florence Nightingale and Sofia Kovalevskaya. Well, and also Grace Hopper and Margaret Hamilton if we decide computer scientists count. Katherine Johnson looks likely to make that cut. But in any case none of these people are known for work understandable in a pre-algebra textbook. This must be why Agnesi earned a place in this book. She’s among the earliest women we can specifically credit with doing noteworthy mathematics. (Also physics, but that’s off point for me.) Her curve might be a little advanced for that textbook’s intended audience. But it’s not far off, and pondering questions like “why $a^3$? Why not $a^2$?” is more pleasant, to a certain personality, than pondering what a directrix might be and why we might use one.
The equation might be a lousy way to visualize the curve described. The curve is one of that group of interesting shapes you get by constructions. That is, following some novel process. Constructions are fun. They’re almost a craft project.
For this we start with a circle. And two parallel tangent lines. Without loss of generality, suppose they’re horizontal, so, there’s lines at the top and the bottom of the circle.
Take one of the two tangent points. Again without loss of generality, let’s say the bottom one. Draw a line from that point over to the other line. Anywhere on the other line. There’s a point where the line you drew intersects the circle. There’s another point where it intersects the other parallel line. We’ll find a new point by combining pieces of these two points. The point is on the same horizontal as wherever your line intersects the circle. It’s on the same vertical as wherever your line intersects the other parallel line. This point is on the Witch of Agnesi curve.
Now draw another line. Again, starting from the lower tangent point and going up to the other parallel line. Again it intersects the circle somewhere. This gives another point on the Witch of Agnesi curve. Draw another line. Another intersection with the circle, another intersection with the opposite parallel line. Another point on the Witch of Agnesi curve. And so on. Keep doing this. When you’ve drawn all the lines that reach from the tangent point to the other line, you’ll have generated the full Witch of Agnesi curve. This takes more work than writing out $y = \frac{8a^3}{x^2 + 4a^2}$, yes. But it’s more fun. It makes for neat animations. And I think it prepares us to expect the shape of the curve.
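The construction can be checked against the equation $y = \frac{8a^3}{x^2 + 4a^2}$. This is a minimal sketch under my own choice of coordinates: the circle has radius a and center (0, a), the tangent lines are y = 0 and y = 2a, and the lower tangent point is the origin. The function names and the parameter t, which says where the drawn line meets the top line, are mine.

```python
import math

def circle_hit(a, t):
    """Second point where the line from (0, 0) to (t, 2a) meets the circle."""
    # Points on the line are (s*t, s*2a). Putting that into
    # x**2 + (y - a)**2 == a**2 gives s = 0 or the value below.
    s = 4 * a**2 / (t**2 + 4 * a**2)
    return (s * t, s * 2 * a)

def witch_point_by_construction(a, t):
    """Same vertical as the top-line hit, same horizontal as the circle hit."""
    _, circle_y = circle_hit(a, t)
    return (t, circle_y)

def witch_by_equation(a, x):
    return 8 * a**3 / (x**2 + 4 * a**2)

a = 1.5
for t in (-4.0, -1.0, 0.0, 0.5, 3.0):
    x, y = witch_point_by_construction(a, t)
    print(x, y, math.isclose(y, witch_by_equation(a, x)))
```

Every line should end in `True`: the craft project and the formula describe the same curve.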
It’s a neat curve. Between it and the lower parallel line is an area four times that of the circle that generated it. The shape is one we would get from looking at the derivative of the arctangent. So there’s some reasons someone working in calculus might find it interesting. And people did. Pierre de Fermat studied it, and found this area. Isaac Newton and Luigi Guido Grandi studied the shape, using this circle-and-parallel-lines construction. Maria Agnesi’s name attached to it after she published a calculus textbook which examined this curve. She showed, according to people who present themselves as having read her book, the curve and how to find it. And she showed its equation and found the vertex and asymptote line and the inflection points. The inflection points, here, are where the curve changes from being cupped upward to cupping downward, or vice-versa.
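The four-times-the-circle claim can be tested numerically. The sketch below assumes the curve is $y = \frac{8a^3}{x^2 + 4a^2}$, generated by a circle of radius a, and adds up thin rectangles with the midpoint rule. The cutoff at |x| = 2000 and the step count are arbitrary choices of mine. Since the tails never quite reach zero, the ratio comes out slightly under 4.

```python
import math

def witch(a, x):
    return 8 * a**3 / (x**2 + 4 * a**2)

def area_under_witch(a, half_width, steps):
    """Midpoint-rule area between the curve and y = 0 for |x| <= half_width."""
    dx = 2 * half_width / steps
    total = sum(witch(a, -half_width + (k + 0.5) * dx) for k in range(steps))
    return total * dx

a = 1.0
circle_area = math.pi * a**2
ratio = area_under_witch(a, 2000.0, 400_000) / circle_area
print(ratio)  # a bit below 4; the missing sliver is in the cut-off tails
```

The exact version uses that arctangent connection: an antiderivative is $4a^2 \arctan(x / 2a)$, which runs from $-2\pi a^2$ to $2\pi a^2$, for a total of $4\pi a^2$.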
It’s a neat function. It’s got some uses. It’s a natural smooth-hill shape, for example. So this makes a good generic landscape feature if you’re modeling the flow over a surface. I read that solitary waves can have this curve’s shape, too.
And the curve turns up as a probability distribution. Take a fixed point. Pick lines at random that pass through this point. See where those lines reach a separate, straight line. Some regions are more likely to be intersected than are others. Chart how often any particular point is the new intersection point. That chart will (given some assumptions I ask you to pretend you agree with) be a Witch of Agnesi curve. This might not surprise you. It seems inevitable from the circle-and-intersecting-line construction process. And that’s nice enough. As a distribution it looks like the usual Gaussian bell curve.
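Here is a simulation of that random-lines experiment. The assumption I am asking you to pretend you agree with is that the line’s angle is uniformly random. The fixed point sits a distance 1 from the target line, and the names and seed are mine. The resulting distribution is usually called the Cauchy distribution. Its density is a Witch of Agnesi curve scaled so the total area is 1.

```python
import math
import random

def random_line_hits(count, distance=1.0, seed=0):
    """Where uniformly random lines through a point cross a line `distance` away."""
    rng = random.Random(seed)
    return [distance * math.tan(rng.uniform(-math.pi / 2, math.pi / 2))
            for _ in range(count)]

def witch_density(x, distance=1.0):
    """The Witch, rescaled to have total area 1 (the Cauchy density)."""
    return distance / (math.pi * (x**2 + distance**2))

hits = random_line_hits(100_000)

# Compare how often hits land in a narrow strip with the density there.
low, high = 0.0, 0.5
observed = sum(low <= x < high for x in hits) / len(hits)
predicted = (math.atan(high) - math.atan(low)) / math.pi
print(observed, predicted, witch_density(0.25) * (high - low))
```

The first two numbers should agree closely. The third is the one-rectangle estimate straight from the curve, so it is only roughly the same.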
It’s different, though. And it’s different in strange ways. Like, for a probability distribution we can find an expected value. That’s … well, what it sounds like. But this is a strange probability distribution for which the law of large numbers does not work. Imagine an experiment that produces real numbers, with the frequency of each number given by this distribution. Run the experiment zillions of times. What’s the mean value of all the zillions of generated numbers? And it … doesn’t … have one. I mean, we know it ought to, it should be the center of that hill. But the calculations for that don’t work right. Taking a bigger sample doesn’t make the sample mean settle down, the way it would for a better-behaved distribution. It’s a weird idea.
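You can watch the sample mean refuse to behave. This sketch draws many samples of each size and measures how widely the sample means scatter, using the gap between the 25th and 75th percentiles. The names, seed, and trial counts are my own choices, and the Gaussian is there only for contrast.

```python
import math
import random

rng = random.Random(1)

def draw_witch():
    """One draw from the Witch-shaped (Cauchy) distribution."""
    return math.tan(math.pi * (rng.random() - 0.5))

def draw_gaussian():
    return rng.gauss(0.0, 1.0)

def spread_of_means(draw, size, trials):
    """Interquartile range of the sample mean over many repeated samples."""
    means = sorted(sum(draw() for _ in range(size)) / size
                   for _ in range(trials))
    return means[(3 * trials) // 4] - means[trials // 4]

for size in (10, 100, 1000):
    print(size,
          spread_of_means(draw_witch, size, 200),
          spread_of_means(draw_gaussian, size, 200))
```

The Gaussian column should shrink as the sample size grows. The Witch column should hover around the same value no matter how many numbers go into each mean.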
Imagine carving a block of wood in the shape of this curve, with a horizontal lower bound and the Witch of Agnesi curve as the upper bound. Where would it balance? … The normal mathematical tools don’t say, even though the shape has an obvious line of symmetry. And a finite area. You don’t get this kind of weirdness with parabolas.
(Yes, you’ll get a balancing point if you actually carve a real one. This is because you work with finitely-long blocks of wood. Imagine you had a block of wood infinite in length. Then you would see some strange behavior.)
It teaches us more strange things, though. Consider interpolations, that is, taking a couple data points and fitting a curve to them. We usually start out looking for polynomials when we interpolate data points. This is because everything is polynomials. Toss in more data points. We need a higher-order polynomial, but we can usually fit all the given points. But sometimes polynomials won’t work. A problem called Runge’s Phenomenon can happen, where the more data points you have the worse your polynomial interpolation is. The Witch of Agnesi curve is one of those. Carl Runge used points on this curve, and trying to fit polynomials to those points, to discover the problem. More data and higher-order polynomials make for worse interpolations. You get curves that look less and less like the original Witch. Runge is himself famous to mathematicians, known for “Runge-Kutta”. That’s a family of techniques to solve differential equations numerically. I don’t know whether Runge came to the weirdness of the Witch of Agnesi curve from considering how errors build in numerical integration. I can imagine it, though. The topics feel related to me.
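Runge’s Phenomenon is easy to reproduce. The sketch below uses the function $\frac{1}{1 + 25x^2}$ on the interval from -1 to 1, with equally spaced data points. That is the usual textbook form of the example and is a Witch of Agnesi up to scaling. I am assuming that setup; the essay doesn’t say which points Runge used. The function names are mine.

```python
def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

def lagrange_value(nodes, values, x):
    """Evaluate the interpolating polynomial through (nodes, values) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(nodes, values)):
        term = yi
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def worst_error(degree, samples=1001):
    """Largest gap between the curve and its degree-`degree` interpolant."""
    nodes = [-1.0 + 2.0 * k / degree for k in range(degree + 1)]
    values = [runge(x) for x in nodes]
    grid = [-1.0 + 2.0 * k / (samples - 1) for k in range(samples)]
    return max(abs(lagrange_value(nodes, values, x) - runge(x)) for x in grid)

errors = {degree: worst_error(degree) for degree in (5, 10, 20)}
for degree, error in errors.items():
    print(degree, error)
```

More data points, worse fit. The trouble piles up near the ends of the interval, where the polynomial swings wildly between the points it is forced through.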
I understand how none of this could fit that textbook’s slender footnote. I’m not sure any of the really good parts of the Witch of Agnesi could even fit thematically in that textbook. At least beyond the fact of its interesting name, which any good blog about the curve will explain. That there was no picture, and that the equation was beyond what the textbook had been describing, made it a challenge. Maybe not seeing what the shape was teased the mathematician out of this bored student.
And next is ‘X’. Will I take Mr Wu’s suggestion and use that to describe something “extreme”? Or will I take another topic or suggestion? We’ll see on Friday, barring unpleasant surprises. Thanks for reading.
There were nearly a dozen mathematically-themed comic strips among what I’d read, and they almost but not quite split mid-week. Better, they include one of my favorite ever mathematics strips from Charles Schulz’s Peanuts.
Jimmy Hatlo’s Little Iodine for the 4th of December, 1956 was rerun the 2nd of February. Little Iodine seeks out help with what seems to be story problems. The rate problem — “if it takes one man two hours to plow seven acres, how long will it take five men and a horse to … ” — is a kind I remember being particularly baffling. I think it’s the presence of three numbers at once. It seems easy to go from, say, “if you go two miles in ten minutes, how long will it take to go six miles?” to an answer. To go from “if one person working two hours plows seven acres then how long will five men take to clear fourteen acres” to an answer seems like a different kind of problem altogether. It’s a kind of problem for which it’s even wiser than usual to carefully list everything you need.
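Listing everything out makes my five-men-and-fourteen-acres version tractable. This sketch assumes every worker plows at the same steady rate and that workers don’t get in each other’s way. The horse in the strip’s version is left out, since the strip never finishes the question. The variable names are mine.

```python
from fractions import Fraction

# One worker, two hours, seven acres: the rate per worker per hour.
acres_per_worker_hour = Fraction(7, 2)

workers = 5
acres_to_plow = 14

# Total rate is workers times the per-worker rate (an idealization).
acres_per_hour = workers * acres_per_worker_hour
hours_needed = acres_to_plow / acres_per_hour

print(hours_needed, "hours, or", hours_needed * 60, "minutes")
# 4/5 hours, or 48 minutes
```

Three numbers at once, but each line only ever juggles two of them.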
Kieran Meehan’s Pros and Cons for the 5th uses a bit of arithmetic. It looks as if it’s meant to be a reminder about following the conclusions of one’s deductive logic. It’s more common to use 1 + 1 equalling 2, or 2 + 2 equalling 4. Maybe 2 times 2 being 4. But then it takes a little turn into numerology, trying to read more meaning into numbers than is wise. (I understand why people would use numerological reasoning, especially given how much mathematicians like to talk up mathematics as descriptions of reality and how older numeral systems used letters to represent words. And that’s before you consider how many numbers have connotations.)
Mort Walker and Dik Browne’s Hi and Lois for the 10th of August, 1960 was rerun the 6th of February. It’s a counting joke. Babies do have some number sense. At least babies as old as Trixie do, I believe, in that they’re able to detect that something weird is going on when they’re shown, eg, two balls put into a box and four balls coming out. (Also it turns out that stage magicians get called in to help psychologists study just how infants and toddlers understand the world, which is neat.)
The end of the (US) semester snuck up on me but, in my defense, I’m not teaching this semester. If you know someone who needs me to teach, please leave me a note. But as a service for people who are just trying to figure out exactly how much studying they need to do for their finals, knock it off. You’re not playing a video game. It’s not like you can figure out how much effort it takes to get an 83.5 on the final and then put the rest of your energy into your major’s classes.
For those not interested in grade-grubbing, here’s some old-time radio. Vic and Sade was a longrunning 15-minute morning radio program written with exquisite care by Paul Rhymer. It’s not going to be to everyone’s taste. But if it is yours, it’s going to be really yours: a tiny cast of people talking not quite past one another while respecting the classic Greek unities. Part of the Overnightscape Underground is the Vic and Sadecast, which curates episodes of the show, particularly trying to explain the context of things gone by since 1940. This episode, from October 1941, is aptly titled “It’s Algebra, Uncle Fletcher”. Neither Vic nor Sade are in the episode, but their son Rush and Uncle Fletcher are. And they try to work through high school algebra problems. I’m tickled to hear Uncle Fletcher explaining mathematics homework. I hope you are too.
Was there an uptick in mathematics-themed comic strips in the syndicated comics this past week? It depends how tight a definition of “theme” you use. I have enough to write about that I’m splitting the week’s load. And I’ve got a follow-up to that Wronski post the other day, so I’m feeling nice and full of content right now. So here goes.
Zach Weinersmith’s Saturday Morning Breakfast Cereal posted the 5th gets my week off to an annoying start. Science and mathematics and engineering people have a tendency to be smug about their subjects. And to see aptitude or interest in their subjects as virtue, or at least intelligence. (If they see a distinction between virtue and intelligence.) To presume that an interest in the field I like is a demonstration of intelligence is a pretty nasty and arrogant move.
And yes, I also dislike the attitude that school should be about training people. Teaching should be about letting people be literate with the great thoughts people have had. Mathematics has a privileged spot here. The field, as we’ve developed it, seems to build on human aptitudes for number and space. It’s easy to find useful sides to it. Doesn’t mean it’s vocational training.
Lincoln Peirce’s Big Nate on the 6th discovered mathematics puzzles. And this gave him the desire to create a new mathematical puzzle that he would use to get rich. Good luck with that. Coming up with interesting enough recreational mathematics puzzles is hard. Presenting it in a way that people will buy is another, possibly greater, challenge. It takes luck and timing and presentation, just as getting a hit song does. Sudoku, for example, spent five years in the Dell Magazine puzzle books before getting a foothold in Japanese newspapers. And then twenty years there before being noticed in the English-speaking puzzle world. Big Nate’s teacher tries to encourage him, although that doesn’t go as Mr Staples might have hoped. (The storyline continues to the 11th. Spoiler: Nate does not invent the next great recreational mathematics puzzle.)
Jef Mallett’s Frazz for the 7th starts out in a mathematics class, at least. I suppose the mathematical content doesn’t matter, though. Mallett’s making a point about questions that, I confess, I’m not sure I get. I’ll leave it for wiser heads to understand.
Mike Thompson’s Grand Avenue for the 8th is a subverted word-problem joke. And I suppose a reminder about the need for word problems to parse as things people would do, or might be interested in. I can’t go along with characterizing buying twelve candy bars “gluttonous” though. Not if they’re in a pack of twelve or something like that. I may be unfair to Grand Avenue. Mind, until a few years ago I was large enough my main method of getting around was “being rolled by Oompa-Loompas”, so I could be a poor judge.
We come now almost to the end of the Summer 2017 A To Z. Possibly also the end of all these A To Z sequences. Gaurish, of For the love of Mathematics, proposed that I talk about the obvious logical choice. The last promising thing I hadn’t talked about. I have no idea what to do for future A To Z’s, if they’re even possible anymore. But that’s a problem for some later time.
Some good advice that I don’t always take. When starting a new problem, make a list of all the things that seem likely to be relevant. Problems that are worth doing are usually about things. They’ll be quantities like the radius or volume of some interesting surface. The amount of a quantity under consideration. The speed at which something is moving. The rate at which that speed is changing. The length something has to travel. The number of nodes something must go across. Whatever. This all sounds like stuff from story problems. But most interesting mathematics is from a story problem; we want to know what this property is like. Even if we stick to a purely mathematical problem, there’s usually a couple of things that we’re interested in and that we describe. If we’re attacking the four-color map theorem, we have the number of territories to color. We have, for each territory, the number of territories that touch it.
Next, select a name for each of these quantities. Write it down, in the table, next to the term. The volume of the tank is ‘V’. The radius of the tank is ‘r’. The height of the tank is ‘h’. The fluid is flowing in at a rate ‘r’. The fluid is flowing out at a rate, oh, let’s say ‘s’. And so on. You might take a moment to go through and think out which of these variables are connected to which other ones, and how. Volume, for example, is surely something to do with the radius times something to do with the height. It’s nice to have that stuff written down. You may not know the thing you set out to solve, but you at least know you’ve got this under control.
I recommend this. It’s a good way to organize your thoughts. It establishes what things you expect you could know, or could want to know, about the problem. It gives you some hint how these things relate to each other. It sets you up to think about what kinds of relationships you figure to study when you solve the problem. It gives you a lifeline, when you’re lost in a sea of calculation. It’s reassurance that these symbols do mean something. Better, it shows what those things are.
I don’t always do it. I have my excuses. If I’m doing a problem that’s very like one I’ve already recently done, the things affecting it are probably the same. The names to give these variables are probably going to be about the same. Maybe I’ll make a quick sketch to show how the parts of the problem relate. If it seems like less work to recreate my thoughts than to write them down, I skip writing them down. Not always good practice. I tell myself I can always go back and do things the fully right way if I do get lost. So far that’s been true.
So, the names. Suppose I am interested in, say, the length of the longest rod that will fit around this hallway corridor. Then I am in a freshman calculus book, yes. Fine. Suppose I am interested in whether this pinball machine can be angled up the flight of stairs that has a turn in it. Then I will measure things like the width of the pinball machine. And the width of the stairs, and of the landing. I will measure this carefully. Pinball machines are heavy and there are many hilarious sad stories of people wedging them into hallways and stairwells four and a half stories up from the street. But: once I have identified, say, ‘width of pinball machine’ as a quantity of interest, why would I ever refer to it as anything but?
This is no dumb question. It is always dangerous to lose the link between the thing we calculate and the thing we are interested in. Without that link we are less able to notice mistakes in either our calculations or the thing we mean to calculate. Without that link we can’t do a sanity check, that reassurance that it’s not plausible we just might fit something 96 feet long around the corner. Or that we estimated that we could fit something of six square feet around the corner. It is common advice in programming computers to always give variables meaningful names. Don’t write ‘T’ when ‘Total’ or, better, ‘Total_Value_Of_Purchase’ is available. Why do we disregard this in mathematics, and switch to ‘T’ instead?
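The freshman-calculus rod problem makes a tidy example of long names and a sanity check together. This sketch assumes the idealized textbook version: a flat floor, a rod of no thickness, and two corridors meeting at a right angle. The function names are mine. The closed formula is the standard calculus result, and the brute-force search is there only to check it.

```python
import math

def longest_rod_around_corner(first_corridor_width, second_corridor_width):
    """Textbook answer: (w1**(2/3) + w2**(2/3)) ** (3/2)."""
    return (first_corridor_width ** (2 / 3)
            + second_corridor_width ** (2 / 3)) ** 1.5

def longest_rod_by_search(first_corridor_width, second_corridor_width,
                          steps=100_000):
    """The rod must fit at every angle, so take the tightest angle."""
    tightest = math.inf
    for k in range(1, steps):
        angle = (math.pi / 2) * k / steps
        room = (first_corridor_width / math.sin(angle)
                + second_corridor_width / math.cos(angle))
        tightest = min(tightest, room)
    return tightest

rod_length_in_feet = longest_rod_around_corner(4.0, 4.0)
print(rod_length_in_feet)  # about 11.3 feet

# Sanity check: longer than either width, nowhere near 96 feet.
print(4.0 < rod_length_in_feet < 96.0)
```

With the units in the name, an answer of six square feet would look wrong before anyone had to think about it.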
First reason is, well, try writing this stuff out. Your hand (h) will fall off (foff) in about fifteen minutes, twenty seconds. (15′ 20”). If you’re writing a program, the programming environment you have will auto-complete the variable after one or two letters in. Or you can copy and paste the whole name. It’s still good practice to leave a comment about what the variable should represent, if the name leaves any reasonable ambiguity.
Another reason is that sure, we do specific problems for specific cases. But a mathematician is naturally drawn to thinking of general problems, in abstract cases. We see something in common between the problem “a length and a quarter of the length is fifteen feet; what is the length?” and the problem “a volume plus a quarter of the volume is fifteen gallons; what is the volume?”. That one is about lengths and the other about volumes doesn’t concern us. We see a saving in effort by separating the quantity of a thing from the kind of the thing. This restores danger. We must think, after we are done calculating, about whether the answer could make sense. But we can minimize that, we hope. At the least we can check once we’re done to see if our answer makes sense. Maybe even whether it’s right.
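The saving in effort shows up as soon as you write the two problems down. Both are the equation x + x/4 = 15, and one solver handles either. This is a minimal sketch, and the function name is my own. The units live only in the variable names, which is exactly the restored danger.

```python
from fractions import Fraction

def solve_whole_plus_fraction(fraction_added, total):
    """Solve x + fraction_added * x == total for x."""
    return Fraction(total) / (1 + Fraction(fraction_added))

length_in_feet = solve_whole_plus_fraction(Fraction(1, 4), 15)
volume_in_gallons = solve_whole_plus_fraction(Fraction(1, 4), 15)

print(length_in_feet, volume_in_gallons)  # 12 12

# The after-the-fact check: does the answer make the original sentence true?
print(length_in_feet + length_in_feet / 4 == 15)  # True
```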
For centuries, as the things we now recognize as algebra developed, we would use words. We would talk about the “thing” or the “quantity” or “it”. Some impersonal name, or convenient pronoun. This would often get shortened because anything you write often you write shorter. “Re”, perhaps. In the late 16th century we start to see the “New Algebra”. Here mathematics starts looking like … you know … mathematics. We start to see stuff like “addition” represented with the + symbol instead of an abbreviation for “addition” or a p with a squiggle over it or some other shorthand. We get equals signs. You start to see decimals and exponents. And we start to see letters used in place of numbers whose value we don’t know.
There are a couple kinds of “numbers whose value we don’t know”. One is the number whose value we don’t know, but hope to learn. This is the classic variable we want to solve for. Another kind is the number whose value we don’t know because we don’t care. I mean, it has some value, and presumably it doesn’t change over the course of our problem. But it’s not like our work will be so different if, say, the tank is two feet high rather than four.
Is there a problem? If we pick our letters to fit a specific problem, no. Presumably all the things we want to describe have some clear name, and some letter that best represents the name. It’s annoying when we have to consider, say, the pinball machine width and the corridor width. But we can work something out.
But what about general problems?
Is an easy problem to solve?
If we want to figure what ‘m’ is, yes. Similarly ‘y’. If we want to know what ‘b’ is, it’s tedious, but we can do that. If we want to know what ‘e’ is? Run and hide, that stuff is crazy. If you have to, do it numerically and accept an estimate. Don’t try figuring what that is.
And so we’ve developed conventions. There are some letters that, except in weird circumstances, are coefficients. They’re numbers whose value we don’t know, but either don’t care about or could look up. And there are some that, by default, are variables. They’re the ones whose value we want to know.
These conventions started forming, as mentioned, in the late 16th century. François Viète here made a name that lasts to mathematics historians at least. His texts described how to do algebra problems in the sort of procedural methods that we would recognize as algebra today. And he had a great idea for these letters. Use the whole alphabet, if needed. Use the consonants to represent the coefficients, the numbers we know but don’t care what they are. Use the vowels to represent the variables, whose values we want to learn. So he would look at that equation and see right away: it’s a terrible mess. (I exaggerate. He doesn’t seem to have known the = sign, and I don’t know offhand when ‘log’ and ‘cos’ became common. But suppose the rest of the equation were translated into his terminology.)
It’s not a bad approach. Besides the mnemonic value of consonant-coefficient, vowel-variable, it’s true that we usually have fewer variables than anything else. The more variables in a problem the harder it is. If someone expects you to solve an equation with ten variables in it, you’re excused for refusing. So five or maybe six or possibly seven choices for variables is plenty.
But it’s not what we settled on. René Descartes had a better idea. He had a lot of them, but here’s one. Use the letters at the end of the alphabet for the unknowns. Use the letters at the start of the alphabet for coefficients. And that is, roughly, what we’ve settled on. In my example nightmare equation, we’d suppose ‘y’ to probably be the variable we want to solve for.
And so, and finally, x. It is almost the variable. It says “mathematics” in only two strokes. Even π takes more writing. Descartes used it. We follow him. It’s way off at the end of the alphabet. It starts few words, very few things, almost nothing we would want to measure. (Xylem … mass? Flow? What thing is the xylem anyway?) Even mathematical dictionaries don’t have much to say about it. The letter transports almost no connotations, no messy specific problems to it. If it suggests anything, it suggests the horizontal coordinate in a Cartesian system. It almost is mathematics. It signifies nothing in itself, but long use has given it an identity as the thing we hope to learn by study.
And pirate treasure maps. I don’t know when ‘X’ became the symbol of where to look for buried treasure. My casual reading suggests “never”. Treasure maps don’t really exist. Maps in general don’t work that way. Or at least didn’t before cartoons. X marking the spot seems to be the work of Robert Louis Stevenson, renowned for creating a fanciful map and then putting together a book to justify publishing it. (I jest. But according to Simon Garfield’s On The Map: A Mind-Expanding Exploration of the Way The World Looks, his map did get lost on the way to the publisher, and he had to re-create it from studying the text of Treasure Island. This delights me to no end.) It makes me wonder if Stevenson was thinking of x’s service in mathematics. But the advantages of x as a symbol are hard to ignore. It highlights a point clearly. It’s fast to write. Its use might be coincidence.
But it is a letter that does a needed job really well.
Something about ‘5’ that you only notice when you’re a kid first learning about numbers. You know that it’s a prime number because it’s equal to 1 times 5 and nothing else. You also know that once you introduce fractions, it’s equal to all kinds of things. It’s 10 times one-half and it’s 15 times one-third and it’s 2.5 times 2 and many other things. Why, you might ask the teacher, is it a prime number if it’s got a million billion trillion different factors? And when every other whole number has as many factors? If you get to the real numbers it’s even worse yet, although when you’re a kid you probably don’t realize that. If you ask, the teacher probably answers that it’s only the whole numbers that count for saying whether something is prime or not. And, like, 2.5 can’t be considered anything, prime or composite. This satisfies the immediate question. It doesn’t quite get at the underlying one, though. Why do integers have prime numbers while real numbers don’t?
To maybe have a prime number we need a ring. This is a creature of group theory, or what we call “algebra” once we get to college. A ring consists of a set of elements, and a rule for adding them together, and a rule for multiplying them together. And I want this ring to have a multiplicative identity. That’s some number which works like ‘1’: take something, multiply it by that, and you get that something back again. Also, I want this multiplication rule to commute. That is, the order of multiplication doesn’t affect what the result is. (If the order matters then everything gets too complicated to deal with.) Let me say the things in the set are numbers. It turns out (spoiler!) they don’t have to be. But that’s how we start out.
Whether the numbers in a ring are prime or not depends on the multiplication rule. Let’s take a candidate number that I’ll call ‘a’ to make my writing easier. If the only numbers whose product is ‘a’ are the pair of ‘a’ and the multiplicative identity, then ‘a’ is prime. If there’s some other pair of numbers that give you ‘a’, then ‘a’ is not prime.
The integers — the positive and negative whole numbers, including zero — are a ring. And they have prime numbers just like you’d expect, if we figure out some rule about how to deal with the number ‘-1’. There are many other rings. There’s a whole family of rings, in fact, so commonly used that they have shorthand. Mathematicians write them as “Zn”, where ‘n’ is some whole number. They’re the integers, modulo ‘n’. That is, they’re the whole numbers from ‘0’ up to the number ‘n-1’, whatever that is. Addition and multiplication work as they do with normal arithmetic, except that if the result is less than ‘0’ we add ‘n’ to it. If the result is more than ‘n-1’ we subtract ‘n’ from it. We repeat that until the result is something from ‘0’ to ‘n-1’, inclusive.
(We use the letter ‘Z’ because it’s from the German word for numbers, and a lot of foundational work was done by German-speaking mathematicians. Alternatively, we might write this set as “In”, where “I” stands for integers. If that doesn’t satisfy, we might write this set as “Jn”, where “J” stands for integers. This is because it’s only very recently that we’ve come to see “I” and “J” as different letters rather than different ways to write the same letter.)
These modulo arithmetics are legitimate ones, good reliable rings. They make us realize how strange prime numbers are, though. Consider the set Z4, where the only numbers are 0, 1, 2, and 3. 0 times anything is 0. 1 times anything is whatever you started with. 2 times 1 is 2. Obvious. 2 times 2 is … 0. All right. 2 times 3 is 2 again. 3 times 1 is 3. 3 times 2 is 2. 3 times 3 is 1. … So that’s a little weird. The only product that gives us 3 is 3 times 1. So 3’s a prime number here. 2 isn’t a prime number: 2 times 3 is 2. For that matter even 1 is a composite number, an unsettling consequence.
Or then Z5, where the only numbers are 0, 1, 2, 3, and 4. Here, there are no prime numbers. Each number is the product of at least one pair of other numbers. In Z6 we start to have prime numbers again. But Z7? Z8? I recommend these questions to a night when your mind is too busy to let you fall asleep.
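If you would rather not wait for a sleepless night, the test described above is easy to sketch in Python. This is only a sketch of this essay's informal notion of "prime": the function name `naive_primes` is my own, and leaving zero out of consideration is an assumption rather than anything stated above.

```python
def naive_primes(n):
    """Elements of Z_n that are 'prime' in this essay's informal sense.

    An element a counts as prime here when no pair x, y (neither of them
    equal to 1) multiplies to a modulo n. Zero is skipped; that choice is
    an assumption made for this sketch.
    """
    others = [x for x in range(n) if x != 1]
    products = {(x * y) % n for x in others for y in others}
    return [a for a in range(1, n) if a not in products]


for n in (4, 5, 6):
    print(n, naive_primes(n))
```

For 4, 5, and 6 this agrees with the hand calculations: only 3 in Z4, nothing in Z5, and 5 in Z6. Change the tuple in the loop to try Z7 and Z8.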
Prime numbers depend on context. In the crowded universe of all the rational numbers, or all the real numbers, nothing is prime. In the more austere world of the Gaussian Integers, familiar friends like ‘3’ are prime again, although ‘5’ no longer is. We recognize that as the product of $2 + i$ and $2 - i$, themselves now prime numbers.
So given that these things do depend on context. Should we care? Or let me put it another way. Suppose we contact a wholly separate culture, one that we can’t have influenced and one not influenced by us. It’s plausible that they should have a mathematics. Would they notice prime numbers as something worth study? Or would they notice them the way we notice, say, pentagonal numbers, a thing that allows for some pretty patterns and that’s about it?
Well, anything could happen, of course. I’m inclined to think that prime numbers would be noticed, though. They seem to follow naturally from pondering arithmetic. And if one has thought of rings, then prime numbers seem to stand out. The way that Zn behaves changes in important ways if ‘n’ is a prime number. Most notably, if ‘n’ is prime (among the whole numbers), then we can define something that works like division on Zn. If ‘n’ isn’t prime (again), we can’t. This stands out. There are a host of other intriguing results that all seem to depend on whether ‘n’ is a prime number among the whole numbers. It seems hard to believe someone could think of the whole numbers and not notice the prime numbers among them.
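The claim about division can be checked by brute force for small cases. Here is a minimal sketch, taking "something that works like division" to mean that every nonzero element of Zn has a partner that multiplies with it to give 1. The function name `has_division` is made up for this example.

```python
def has_division(n):
    """True when every nonzero element of Z_n has a multiplicative inverse."""
    return all(
        any((a * b) % n == 1 for b in range(1, n))
        for a in range(1, n)
    )


for n in range(2, 13):
    print(n, has_division(n))
```

In that range the answer comes out True exactly for 2, 3, 5, 7, and 11.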
And they do stand out, as these reliably peculiar things. Many things about them (in the whole numbers) are easy to prove. That there are infinitely many, for example, you can prove to a child. And there are many things we have no idea how to prove. That there are infinitely many primes which are exactly two more than another prime, for example. Any child can understand the question. The one who can prove it will win what fame mathematicians enjoy. If it can be proved.
They turn up in strange, surprising places. Just in the whole numbers we find some patches where there are many prime numbers close together (forty percent of the numbers 1 through 10 are prime!). We can find deserts; we know of a stretch of 1,113,106 numbers in a row without a single prime among them. We know it’s possible to find prime deserts as vast as we want. Say you want a gap between primes of at least size N. Then look at the numbers (N+1)! + 2, (N+1)! + 3, (N+1)! + 4, and so on, up to (N+1)! + N+1. None of those can be prime numbers. You must have a gap at least the size N. It may be larger; how do we know whether (N+1)! + 1 is a prime number?
No telling. Well, we can check. See if any prime number divides into (N+1)! + 1. This takes a long time to do if N is all that big. There are no formulas we know that will make this easy or quick.
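The reason none of those numbers can be prime is that (N+1)! + k is divisible by k whenever k is between 2 and N+1. A small sketch makes that concrete. The function name `factorial_desert` and the choice N = 5 are just for illustration.

```python
from math import factorial


def factorial_desert(N):
    """Return the N consecutive numbers (N+1)! + 2, ..., (N+1)! + (N+1)."""
    base = factorial(N + 1)
    return [base + k for k in range(2, N + 2)]


N = 5
for k, value in zip(range(2, N + 2), factorial_desert(N)):
    # k divides (N+1)! and k divides itself, so k divides their sum
    print(value, "is divisible by", k, ":", value % k == 0)
```

With N = 5 the run is 722 through 726, five composite numbers in a row.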
We don’t call it a “prime number” if it’s in a ring that isn’t enough like the numbers. Fair enough. We shift the name to “prime element”. “Element” is a good generic name for a thing whose identity we don’t mean to pin down too closely. I’ve talked about the Gaussian Primes already, in an earlier essay and earlier in this essay. We can make a ring out of the polynomials whose coefficients are all integers. In that, is a prime. So is . If this hasn’t given you some ideas what other polynomials might be primes, then you have something else to ponder while trying to sleep. Thinking of all the prime polynomials is likely harder than you can do, though.
Prime numbers seem to stand out, obvious and important. Humans have known about prime numbers for as long as we’ve known about multiplication. And yet there is something obscure about them. If there are cultures completely independent of our own, do they have insights which make prime numbers not such occult figures? How different would the world be if we knew all the things we now wonder about primes?
Once more do I have Gaurish to thank for the day’s topic. (There’ll be two more chances this week, providing I keep my writing just enough ahead of deadline.) This one doesn’t touch category theory or topology.
I keep touching on group theory here. It’s a field that’s about what kinds of things can work like arithmetic does. A group is a set of things that you can add together. At least, you can do something that works like adding regular numbers together does. A ring is a set of things that you can add and multiply together.
There are many interesting rings. Here’s one. It’s called the Gaussian Integers. They’re made of numbers we can write as $a + bi$, where ‘a’ and ‘b’ are some integers. $i$ is what you figure, that number that multiplied by itself is -1. These aren’t the complex-valued numbers, you notice, because ‘a’ and ‘b’ are always integers. But you add them together the way you add complex-valued numbers together. That is, $a + bi$ plus $c + di$ is the number $(a + c) + (b + d)i$. And you multiply them the way you multiply complex-valued numbers together. That is, $a + bi$ times $c + di$ is the number $(ac - bd) + (ad + bc)i$.
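The two rules translate directly into code. This sketch stores a Gaussian integer as a pair of ordinary integers `(a, b)` standing for a plus b times the square root of -1. The pair representation and the names `g_add` and `g_mul` are my own choices, not anything standard.

```python
def g_add(z, w):
    """Add two Gaussian integers given as (real part, imaginary part)."""
    a, b = z
    c, d = w
    return (a + c, b + d)


def g_mul(z, w):
    """Multiply two Gaussian integers given as (real part, imaginary part)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


print(g_mul((0, 1), (0, 1)))   # (-1, 0): the square root of -1, squared
print(g_add((1, 2), (3, -1)))  # (4, 1)
print(g_mul((1, 2), (3, -1)))  # (5, 5)
```

Sticking to integer pairs, rather than Python's built-in floating-point `complex` type, keeps every result exactly inside the ring.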
We created something that has addition and multiplication. It picks up subtraction for free. It doesn’t have division. We can create rings that do, but this one won’t, any more than regular old integers have division. But we can ask what other normal-arithmetic-like stuff these Gaussian integers do have. For instance, can we factor numbers?
This isn’t an obvious one. No, we can’t expect to be able to divide one Gaussian integer by another. But we can’t expect to divide a regular old integer by another, not and get an integer out of it. That doesn’t mean we can’t factor them. It means we divide the regular old integers into a couple classes. There’s prime numbers. There’s composites. There’s the unit, the number 1. There’s zero. We know prime numbers; they’re 2, 3, 5, 7, and so on. Composite numbers are the ones you get by multiplying prime numbers together: 4, 6, 8, 9, 10, and so on. 1 and 0 are off on their own. Leave them there. We can’t divide any old integer by any old integer. But we can say an integer is equal to this string of prime numbers multiplied together. This gives us a handle by which we can prove a lot of interesting results.
We can do the same with Gaussian integers. We can divide them up into Gaussian primes, Gaussian composites, units, and zero. The words mean what they mean for regular old integers. A Gaussian composite can be factored into a product of Gaussian primes. Gaussian primes can’t be factored any further.
If we know what the prime numbers are for regular old integers we can tell whether something’s a Gaussian prime. Admittedly, knowing all the prime numbers is a challenge. But a Gaussian integer $a + bi$ will be prime whenever a couple simple-to-test conditions are true. First is if ‘a’ and ‘b’ are both not zero, but $a^2 + b^2$ is a prime number. So, for example, $4 + 5i$ is a Gaussian prime.
You might ask, hey, would $-4 - 5i$ also be a Gaussian prime? That’s also got components that are integers, and the squares of them add up to a prime number (41). Well-spotted. Gaussian primes appear in quartets. If $a + bi$ is a Gaussian prime, so is $-a - bi$. And so are $-b + ai$ and $b - ai$.
There’s another group of Gaussian primes. These are the numbers where either ‘a’ or ‘b’ is zero. Then the other one has to be an ordinary prime number, give or take a minus sign, that is, if positive, three more than a whole multiple of four. If it’s negative, then it’s three less than a whole multiple of four. So ‘3’ is a Gaussian prime, as is -3, and as is $3i$ and so is $-3i$.
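Both conditions fit in a few lines of Python. Treat this as a sketch: the helper names are mine, the ordinary primality test is plain trial division, and for numbers on an axis I am also requiring the nonzero part to be an ordinary prime up to sign, since something like 15 is three more than a multiple of four and still factors.

```python
def is_prime(n):
    """Ordinary primality test by trial division (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def is_gaussian_prime(a, b):
    """Test the Gaussian integer with real part a and imaginary part b."""
    if a != 0 and b != 0:
        # Off the axes: prime exactly when a^2 + b^2 is an ordinary prime.
        return is_prime(a * a + b * b)
    n = abs(a + b)  # size of whichever part is nonzero (0 if both are zero)
    return is_prime(n) and n % 4 == 3


print(is_gaussian_prime(4, 5))   # True: 16 + 25 = 41 is prime
print(is_gaussian_prime(0, -3))  # True
print(is_gaussian_prime(5, 0))   # False
```

Taking the absolute value first means the "three less than a multiple of four" case for negative numbers is handled by the same check as the positive one.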
This has strange effects. Like, ‘3’ is a prime number in the regular old scheme of things. It’s also a Gaussian prime. But familiar other prime numbers like ‘2’ and ‘5’? Not anymore. Two is equal to $(1 + i)(1 - i)$; both of those terms are Gaussian primes. Five is equal to $(2 + i)(2 - i)$. There are similar shocking results for 13. But, roughly, the world of composites and prime numbers translates into Gaussian composites and Gaussian primes. In this slightly exotic structure we have everything familiar about factoring numbers.
You might have some nagging thoughts. Like, sure, two is equal to $(1 + i)(1 - i)$. But isn’t it also equal to $(-1 - i)(-1 + i)$? One of the important things about prime numbers is that every composite number is the product of a unique string of prime numbers. Do we have to give that up for Gaussian integers?
Good nag. But no; the doubt is coming about because you’ve forgotten the difference between “the positive integers” and “all the integers”. If we stick to positive whole numbers then, yeah, (say) ten is equal to two times five and no other combination of prime numbers. But suppose we have all the integers, positive and negative. Then ten is equal to either two times five or it’s equal to negative two times negative five. Or, better, it’s equal to negative one times two times negative one times five. Or that times any even number of negative ones.
Remember that bit about separating ‘one’ out from the world of primes and composites? That’s because the number one screws up these unique factorizations. You can always toss in extra factors of one, to taste, without changing the product of something. If we have positive and negative integers to use, then negative one does almost the same trick. We can toss in any even number of extra negative ones without changing the product. This is why we separate “units” out of the numbers. They’re not part of the prime factorization of any numbers.
For the Gaussian integers there are four units. 1 and -1, and $i$ and $-i$. They are neither primes nor composites, and we don’t worry about how they would otherwise multiply the number of factorizations we get.
But let me close with a neat, easy-to-understand puzzle. It’s called the moat-crossing problem. In the regular old integers it’s this: imagine that the prime numbers are islands in a dangerous sea. You start on the number ‘2’. Imagine you have a board that can be set down and safely crossed, then picked up to be put down again. Could you get from the start and go off to safety, which is infinitely far away? If your board is some, fixed, finite length?
No, you can’t. The problem amounts to how big the gap between one prime number and the next larger prime number can be. It turns out there’s no limit to that. That is, you give me a number, as small or as large as you like. I can find some prime number that’s more than your number less than its successor. There are arbitrarily large gaps between prime numbers.
Gaussian primes, though? Since a Gaussian prime might have nearest neighbors in any direction? Nobody knows. We know there are arbitrarily large gaps. Pick a moat size; we can (eventually) find a Gaussian prime that’s at least that far away from its nearest neighbors. But this does not say whether it’s impossible to get from the smallest Gaussian primes ($1 + i$ and its companions $-1 + i$ and so on) infinitely far away. We know there’s a moat of width 6 separating the origin of things from infinity. We don’t know that there are bigger ones.
You’re not going to solve this problem. Unless I have more brilliant readers than I know about; if I have ones who can solve this problem then I might be too intimidated to write anything more. But there is surely a pleasant pastime, maybe a charming game, to be made from this. Try finding the biggest possible moats around some set of Gaussian prime islands.
Gaurish, of the For The Love Of Mathematics blog, gives me another subject today. It’s one that isn’t about ellipses. Sad to say it’s also not about elliptic integrals. This is sad to me because I have a cute little anecdote about a time I accidentally gave my class an impossible problem. I did apologize. No, nobody solved it anyway.
Elliptic Curves start, of course, with polynomials. Particularly, they’re polynomials with two variables. We call them ‘x’ and ‘y’ because we have no reason to be difficult. They’re of at most third degree. That is, we can have terms like $x$ and $y^2$ and $x^2 y$ and $y^3$. Something with higher powers, like $x^4$ or $x^2 y^2$ (a fourth power, all together) is right out. Doesn’t matter. Start from this and we can do some slick changes of variables so that we can rewrite it to look like this:

$y^2 = x^3 + Ax + B$
Here, ‘A’ and ‘B’ are some numbers that don’t change for this particular curve. Also, we need it to be true that $4A^3 + 27B^2$ doesn’t equal zero. It avoids problems. What we’ll be looking at are coordinates, values of ‘x’ and ‘y’ together which make this equation true. That is, it’s points on the curve. If you pick some real numbers ‘A’ and ‘B’ and draw all the values of ‘x’ and ‘y’ that make the equation true you get … well, there’s different shapes. They all look like those microscope photos of a water drop emerging and falling from a tap, only rotated clockwise ninety degrees.
So. Pick any of these curves that you like. Pick a point. I’m going to name your point ‘P’. Now pick a point once more. I’m going to name that point ‘Q’. Now draw a line from P through Q. Keep drawing it. It’ll cross the original elliptic curve again. And that point is … not actually special. What is special is the reflection of that point. That is, the same x-coordinate, but flip the plus or minus sign for the y-coordinate. (WARNING! Do not call it “the reflection” at your thesis defense! Call it the “conjugate” point. It means “reflection”.) Your elliptic curve will be symmetric around the x-axis. If, say, the point with x-coordinate 4 and y-coordinate 3 is on the curve, so is the point with x-coordinate 4 and y-coordinate -3. So that reflected point is … something special.
This lets us do something wonderful. We can think of this reflected point as the sum of your ‘P’ and ‘Q’. You can ‘add’ any two points on the curve and get a third point. This means we can do something that looks like addition for points on the elliptic curve. And this means the points on this curve are a group, and we can bring all our group-theory knowledge to studying them. It’s a commutative group, too; ‘P’ added to ‘Q’ leads to the same point as ‘Q’ added to ‘P’.
Let me head off some clever thoughts that make fair objections. What if ‘P’ and ‘Q’ are already reflections, so the line between them is vertical? That never touches the original elliptic curve again, right? Yeah, fair complaint. We patch this by saying that there’s one more point, ‘O’, that’s off “at infinity”. Where is infinity? It’s wherever your vertical lines end. Shut up, this can too be made rigorous. In any case it’s a common hack for this sort of problem. When we add that, everything’s nice. The ‘O’ serves the role in this group that zero serves in arithmetic: the sum of point ‘O’ and any point ‘P’ is going to be ‘P’ again.
Second clever thought to head off: what if ‘P’ and ‘Q’ are the same point? There’s infinitely many lines that go through a single point so how do we pick one to find an intersection with the elliptic curve? Huh? If you did that, then we pick the tangent line to the elliptic curve that touches ‘P’, and carry on as before.
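The whole recipe (draw the line, find the third crossing, reflect it, with the two patches for vertical lines and repeated points) can be sketched in code. Everything specific here is an assumption made for illustration: the curve y squared equals x cubed plus 17, so A is 0 and B is 17; the two sample points; the use of `None` to stand for the point ‘O’ at infinity; and the standard slope formulas, which this essay does not derive. Exact fractions keep rounding errors out of it.

```python
from fractions import Fraction

A, B = Fraction(0), Fraction(17)  # assumed example curve: y^2 = x^3 + 17
O = None                          # stand-in for the point at infinity


def on_curve(P):
    """Check that P satisfies y^2 = x^3 + A x + B (O always counts)."""
    if P is O:
        return True
    x, y = P
    return y * y == x ** 3 + A * x + B


def add(P, Q):
    """'Add' two points: line through them, third crossing, then reflect."""
    if P is O:
        return Q
    if Q is O:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and y1 == -y2:
        return O  # vertical line: the points are reflections of each other
    if P == Q:
        m = (3 * x1 * x1 + A) / (2 * y1)  # slope of the tangent line at P
    else:
        m = (y2 - y1) / (x2 - x1)         # slope of the line through P, Q
    x3 = m * m - x1 - x2                  # x of the third crossing
    y3 = m * (x1 - x3) - y1               # y of the crossing, sign flipped
    return (x3, y3)


P = (Fraction(-2), Fraction(3))
Q = (Fraction(-1), Fraction(4))
print(*add(P, Q))   # 4 -9
print(*add(P, P))   # 8 -23
```

Both results land back on the curve, and swapping the order of ‘P’ and ‘Q’ gives the same point, which is the commutativity mentioned above.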
There’s more. What kind of number is ‘x’? Or ‘y’? I’ll bet that you figured they were real numbers. You know, ordinary stuff. I didn’t say what they were, so left it to our instinct, and that usually runs toward real numbers. Those are what I meant, yes. But we didn’t have to. ‘x’ and ‘y’ could be in other sets of numbers too. They could be complex-valued numbers. They could be just the rational numbers. They could even be part of a finite collection of possible numbers. As long as the equation is something meaningful (and some technical points are met) we can carry on. The elliptic curves, and the points we “add” on them, might not look like the curves we started with anymore. They might not look like anything recognizable anymore. But the logic continues to hold. We still create these groups out of the points on these lines intersecting a curve.
By now you probably admit this is neat stuff. You may also think: so what? We can take this thing you never thought about, draw points and lines on it, and make it look very loosely kind of like just adding numbers together. Why is this interesting? No appreciation just for the beauty of the structure involved? Well, we live in a fallen world.
It comes back to number theory. The modern study of Diophantine equations grows out of studying elliptic curves on the rational numbers. It turns out the group of points you get for that looks like a finite collection of points with some collection of integers hanging on. How long that collection of numbers is is called the ‘rank’, and there are deep mysteries at work. We know there are elliptic equations that have a rank as big as 28. Nobody knows if the rank can be arbitrarily high, though. And I believe we don’t even know if there are any curves with rank of, like, 27, or 25.
Yeah, I’m still sensing skepticism out there. Fine. We’ll go back to the only part of number theory everybody agrees is useful. Encryption. We have roughly the same goals for every encryption scheme. We want it to be easy to encode a message. We want it to be easy to decode the message if you have the key. We want it to be hard to decode the message if you don’t have the key.
Take something inside one of these elliptic curve groups. Especially one that’s got a finite field. Let me call your thing ‘g’. It’s really easy for you, knowing what ‘g’ is and what your field is, to raise it to a power. You can pretty well impress me by sharing the value of ‘g’ raised to some whole number ‘m’. Call that ‘h’.
Why am I impressed? Because even if I know ‘h’ and ‘g’, I have a heck of a time figuring out what ‘m’ is. Especially on these finite field groups there’s no obvious connection between how big ‘h’ is and how big ‘g’ is and how big ‘m’ is. Start with a big enough finite field and you can encode messages in ways that are crazy hard to crack.
We trust. At least, if there are any ways to break the code quickly, nobody’s shared them. And there’s one of those enormous-money-prize awards waiting for someone who does know how to break such a code quickly. (I don’t know which. I’m going by what I expect from people.)
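The lopsidedness is easy to feel in a toy setting. To keep things short this sketch swaps the elliptic curve group for a simpler stand-in, the nonzero whole numbers modulo a prime under multiplication. It is the same "raise ‘g’ to a power" game, not the elliptic curve version, and the numbers 101, 2, and 57 are arbitrary small choices of mine. Going forward takes one call to Python's built-in `pow`. Going backward, in this sketch, means trying every exponent in turn.

```python
def brute_force_exponent(g, h, p):
    """Search for m with g**m % p == h by trying every exponent in turn."""
    value = 1  # g**0
    for m in range(p - 1):
        if value == h:
            return m
        value = (value * g) % p
    return None  # h is not a power of g modulo p


p, g, m = 101, 2, 57   # toy-sized values chosen for illustration
h = pow(g, m, p)       # the easy direction: fast modular exponentiation
print(brute_force_exponent(g, h, p))  # recovers 57, one step at a time
```

With a modulus this small the search is instant. The hope behind real schemes is that with an enormous group nobody has a search dramatically better than this kind of slog.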
And then there’s fame. These were used to prove Fermat’s Last Theorem. Suppose there are some non-boring numbers ‘a’, ‘b’, and ‘c’, so that for some prime number ‘p’ that’s five or larger, it’s true that $a^p + b^p = c^p$. (We can separately prove Fermat’s Last Theorem for a power that isn’t a prime number, or a power that’s 3 or 4.) Then this implies properties about the elliptic curve:

$y^2 = x(x - a^p)(x + b^p)$

This is a convenient way of writing things since it showcases the $a^p$ and $b^p$. It’s equal to:

$y^2 = x^3 + (b^p - a^p)x^2 - a^p b^p x$
(I was so tempted to leave an arithmetic error in there so I could make sure someone commented.)
If there’s a solution to Fermat’s Last Theorem, then this elliptic equation can’t be modular. I don’t have enough words to explain what ‘modular’ means here. Andrew Wiles and Richard Taylor showed that the equation was modular. So there is no solution to Fermat’s Last Theorem except the boring ones. (Like, where ‘b’ is zero and ‘a’ and ‘c’ equal each other.) And it all comes from looking close at these neat curves, none of which looks like an ellipse.
They’re named elliptic curves because we first noticed them when Carl Jacobi (yes, that Carl Jacobi) was studying the length of arcs of an ellipse. That’s interesting enough on its own. But it is hard. Maybe I could have fit in that anecdote about giving my class an impossible problem after all.
Today’s A To Z topic is another request from Gaurish, of the For The Love Of Mathematics blog. Also part of what looks like a quest to make me become a topology blogger, at least for a little while. It’s going to be exciting and I hope not to faceplant as I try this.
Also, a note about Thomas K Dye, who’s drawn the banner art for this and for the Why Stuff Can Orbit series: the publisher for collections of his comic strip is having a sale this weekend.
The word looks intimidating, and faintly of technobabble. It’s less cryptic than it appears. We see parts of it in non-mathematical contexts. In biology class we would see “homology”, the sharing of structure in body parts that look superficially very different. We also see it in art class. The instructor points out that a dog’s leg looks like that because they stand on their toes. What looks like a backward-facing knee is just the ankle, and if we stand on our toes we see that in ourselves. We might see it in chemistry, as many interesting organic compounds differ only in how long or how numerous the boring parts are. The stuff that does work is the same, or close to the same. And this is a hint to what a mathematician means by cohomology. It’s something in shapes. It’s particularly something in how different things might have similar shapes. Yes, I am using a homology in language here.
I often talk casually about the “shape” of mathematical things. Or their “structures”. This sounds weird and abstract to start and never really gets better. We can get some footing if we think about drawing the thing we’re talking about. Could we represent the thing we’re working on as a figure? Often we can. Maybe we can draw a polygon, with the vertices of the shape matching the pieces of our mathematical thing. We get the structure of our thing from thinking about what we can do to that polygon without changing the way it looks. Or without changing the way we can do whatever our original mathematical thing does.
This leads us to homologies. We get them by looking for stuff that’s true even if we moosh up the original thing. The classic homology comes from polyhedrons, three-dimensional shapes. There’s a relationship between the number of vertices, the number of edges, and the number of faces of a polyhedron. It doesn’t change even if you stretch the shape out longer, or squish it down, or for that matter slice off a corner. It only changes if you punch a new hole through the middle of it. Or if you plug one up. That would be unsporting. A homology describes something about the structure of a mathematical thing. It might even be literal. Topology, the study of what we know about shapes without bringing distance into it, has the number of holes that go through a thing as a homology. This gets feeling like a comfortable, familiar idea now.
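The relationship in question is Euler's formula: vertices minus edges plus faces. A quick tally shows it holding steady under slicing and changing once there's a hole. The counts below are ones I worked out by hand for these particular shapes, so treat them as illustrative; the "picture frame" is a square block with a square hole punched through it.

```python
def euler_characteristic(vertices, edges, faces):
    """Vertices minus edges plus faces."""
    return vertices - edges + faces


shapes = {
    "cube": (8, 12, 6),
    "tetrahedron": (4, 6, 4),
    "cube with one corner sliced off": (10, 15, 7),
    "square picture frame (one hole)": (16, 32, 16),
}

for name, counts in shapes.items():
    print(name, euler_characteristic(*counts))
```

The first three all give 2. The picture frame gives 0, which is the hole making itself known.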
But that isn’t a cohomology. That ‘co’ prefix looks dangerous. At least it looks significant. When the ‘co’ prefix has turned up before it’s meant something is shaped by how it refers to something else. Coordinates aren’t just number lines; they’re collections of number lines that we can use to say where things are. If ‘a’ is a factor of the number ‘x’, its cofactor is the number you multiply ‘a’ by in order to get ‘x’. (For real numbers that’s just x divided by a. For other stuff it might be weirder.) A codomain is a set that a function maps a domain into (and must contain the range, at least). Cosets aren’t just sets; they’re ways we can divide (for example) the counting numbers into odds and evens.
So what’s the ‘co’ part for a homology? I’m sad to say we start losing that comfortable feeling now. We have to look at something we’re used to thinking of as a process as though it were a thing. These things are morphisms: what are the ways we can match one mathematical structure to another? Sometimes the morphisms are easy. We can match the even numbers up with all the integers: match 0 with 0, match 2 with 1, match -6 with -3, and so on. Addition on the even numbers matches with addition on the integers: 4 plus 6 is 10; 2 plus 3 is 5. For that matter, we can match the integers with the multiples of three: match 1 with 3, match -1 with -3, match 5 with 15. 1 plus -2 is -1; 3 plus -6 is -3.
What happens if we look at the sets of matchings that we can do as if that were a set of things? That is, not some human concept like ‘2’ but rather ‘match a number with one-half its value’? And ‘match a number with three times its value’? These can be the population of a new set of things.
And these things can interact. Suppose we “match a number with one-half its value” and then immediately “match a number with three times its value”. Can we do that? … Sure, easily. 4 matches to 2 which goes on to 6. 8 matches to 4 which goes on to 12. Can we write that as a single matching? Again, sure. 4 matches to 6. 8 matches to 12. -2 matches to -3. We can write this as “match a number with three-halves its value”. We’ve taken “match a number with one-half its value” and combined it with “match a number with three times its value”. And it’s given us the new “match a number with three-halves its value”. These things we can do to the integers are themselves things that can interact.
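If it helps to see matchings treated as things, here is a small sketch in which each matching is a Python function and combining two of them produces a third function. The names `halve`, `triple`, and `compose` are my own, and I use exact fractions so that halving an odd number wouldn't quietly round.

```python
from fractions import Fraction


def halve(n):
    """Match a number with one-half its value."""
    return Fraction(n) / 2


def triple(n):
    """Match a number with three times its value."""
    return Fraction(n) * 3


def compose(first, then):
    """Build the single matching that does `first` and then `then`."""
    return lambda n: then(first(n))


halve_then_triple = compose(halve, triple)
for n in (4, 8, -2):
    print(n, "->", halve_then_triple(n))
```

The combined function sends 4 to 6, 8 to 12, and -2 to -3, the same as multiplying by three-halves directly.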
This is a good moment to pause and let the dizziness pass.
It isn’t just you. There is something weird thinking of “doing stuff to a set” as a thing. And we have to get a touch more abstract than even this. We should be all right, but please do not try to use this to defend your thesis in category theory. Just use it to not look forlorn when talking to your friend who’s defending her thesis in category theory.
Now, we can take this collection of all the ways we can relate one set of things to another. And we can combine this with an operation that works kind of like addition. Some way to “add” one way-to-match-things to another and get a way-to-match-things. There’s also something that works kind of like multiplication. It’s a different way to combine these ways-to-match-things. This forms a ring, which is a kind of structure that mathematicians learn about in Introduction to Not That Kind Of Algebra. There are many constructs that are rings. The integers, for example, are also a ring, with addition and multiplication the same old processes we’ve always used.
And just as we can sort the integers into odds and evens — or into other groupings, like “multiples of three” and “one plus a multiple of three” and “two plus a multiple of three” — so we can sort the ways-to-match-things into new collections. And this is our cohomology. It’s the ways we can sort and classify the different ways to manipulate whatever we started on.
I apologize that this sounds so abstract as to barely exist. I admit we’re far from a nice solid example such as “six”. But the abstractness is what gives cohomologies explanatory power. We depend very little on the specifics of what we might talk about. And therefore what we can prove is true for very many things. It takes a while to get there, is all.
And now as summer (United States edition) reaches its closing months I plunge into the fourth of my A To Z mathematics-glossary sequences. I hope I know what I’m doing! Today’s request is one of several from Gaurish, who’s got to be my top requester for mathematical terms and whom I thank for it. It’s a lot easier writing these things when I don’t have to think up topics. Gaurish hosts a fine blog, For the love of Mathematics, which you might consider reading.
Arithmetic is what people who aren’t mathematicians figure mathematicians do all day. I remember in my childhood a Berenstain Bears book about people’s jobs. Its mathematician was an adorable little bear adding up sums on the chalkboard, in an observatory, on the Moon. I liked every part of this. I wouldn’t say it’s the whole reason I became a mathematician but it did make the prospect look good early on.
People who aren’t mathematicians are right. At least, the bulk of what mathematics people do is arithmetic. If we work by volume. Arithmetic is about the calculations we do to evaluate or solve polynomials. And polynomials are everything that humans find interesting. Arithmetic is adding and subtracting, multiplying and dividing, taking powers and taking roots. Arithmetic is changing the units of a thing, and breaking something into several smaller units, or merging several smaller units into one big one. Arithmetic’s role in commerce and in finance must overwhelm the higher mathematics. Higher mathematics offers cohomologies and Ricci tensors. Arithmetic offers a budget.
This is old mathematics. There’s evidence of humans twenty thousand years ago recording their arithmetic computations. My understanding is the evidence is ambiguous and interpretations vary. This seems fair. I assume that humans did such arithmetic then, granting that I do not know how to interpret archeological evidence. The thing is that arithmetic is older than humans. Animals are able to count, to do addition and subtraction, perhaps to do harder computations. (I crib this from The Number Sense: How the Mind Creates Mathematics, by Stanislas Dehaene.) We learn it first, refining our rough instinctively developed sense to something rigorous. At least we learn it at the same time we learn geometry, the other branch of mathematics that must predate human existence.
The primality of arithmetic governs how it becomes an adjective. We will have, for example, the “arithmetic progression” of terms in a sequence. This is a sequence of numbers such as 1, 3, 5, 7, 9, and so on. Or 4, 9, 14, 19, 24, 29, and so on. The difference between one term and its successor is the same as the difference between the predecessor and this term. Or we speak of the “arithmetic mean”. This is the one found by adding together all the numbers of a sample and dividing by the number of terms in the sample. These are important concepts, useful concepts. They are among the first concepts we have when we think of a thing. Their familiarity makes them easy tools to overlook.
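Both ideas are small enough to write out. This is only a sketch in Python, with function names I made up for the occasion, using the two progressions mentioned above.

```python
def arithmetic_progression(first, step, count):
    """First `count` terms, each one `step` more than the term before."""
    return [first + step * k for k in range(count)]

def differences(terms):
    """Gaps between each term and its successor."""
    return [b - a for a, b in zip(terms, terms[1:])]

def arithmetic_mean(sample):
    """Add everything up, then divide by how many numbers there are."""
    return sum(sample) / len(sample)

odds = arithmetic_progression(1, 2, 5)    # 1, 3, 5, 7, 9
fives = arithmetic_progression(4, 5, 6)   # 4, 9, 14, 19, 24, 29

print(odds, differences(odds))
print(fives, differences(fives))
print(arithmetic_mean(odds), arithmetic_mean(fives))
```

The list of differences is the same number repeated, which is the whole definition of an arithmetic progression.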
Consider the Fundamental Theorem of Arithmetic. There are many Fundamental Theorems; that of Algebra guarantees us the number of roots of a polynomial equation. That of Calculus guarantees us that derivatives and integrals are joined concepts. The Fundamental Theorem of Arithmetic tells us that every whole number greater than one is equal to one and only one product of prime numbers. If a number is equal to (say) two times two times thirteen times nineteen, it cannot also be equal to (say) five times eleven times seventeen. This may seem uncontroversial. The budding mathematician will convince herself it’s so by trying to work out all the ways to write 60 as the product of prime numbers. It’s hard to imagine mathematics for which it isn’t true.
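The budding mathematician’s experiment with 60 can be handed to a computer. Here is a rough Python sketch by trial division; `prime_factors` is a name I chose, and this is the slow schoolbook method rather than anything a number theorist would use on large numbers.

```python
def prime_factors(n):
    """Return the prime factors of n (n > 1), smallest first, with repeats."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        # Pull out every copy of this divisor before moving on.
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        # Whatever is left over has no smaller factor, so it is prime.
        factors.append(n)
    return factors

print(prime_factors(60))
print(prime_factors(2 * 2 * 13 * 19))
print(prime_factors(5 * 11 * 17))
```

However you start breaking 60 apart, as 6 times 10 or 4 times 15 or 2 times 30, you land on the same primes: two 2’s, a 3, and a 5.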
But it needn’t be true. As we study why arithmetic works we discover many strange things. This mathematics that we know even without learning is sophisticated. To build a logical justification for it requires a theory of sets and hundreds of pages of tight reasoning. Or a theory of categories and I don’t even know how much reasoning. The thing that is obvious from putting a couple objects on a table and then a couple more is hard to prove.
As we continue studying arithmetic we start to ponder things like Goldbach’s Conjecture, about even numbers (other than two) being the sum of exactly two prime numbers. This brings us into number theory, a land of fascinating problems. Many of them are so accessible you could pose them to a person while waiting in a fast-food line. This befits a field that grows out of such simple stuff. Many of those are so hard to answer that no person knows whether they are true, or are false, or are even answerable.
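Goldbach’s Conjecture is easy to poke at even though nobody can prove it. The Python sketch below, with helper names of my own, hunts for a pair of primes adding up to each even number in a small range. Finding pairs for small numbers is evidence, not proof.

```python
import math

def is_prime(n):
    """Trial-division primality test; fine for small n."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, math.isqrt(n) + 1))

def goldbach_pair(n):
    """Return one pair of primes (p, q) with p + q == n, or None if none exists."""
    for p in range(2, n // 2 + 1):
        if is_prime(p) and is_prime(n - p):
            return (p, n - p)
    return None

for even in range(4, 31, 2):
    print(even, "=", goldbach_pair(even))
```

A `None` turning up anywhere would be a counterexample and would make you famous. None does in this range.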
And it splits off other ideas. Arithmetic starts, at least, with the counting numbers. It moves into the whole numbers and soon all the integers. With division we soon get rational numbers. With roots we soon get certain irrational numbers. A close study of this implies there are irrational numbers that must exist, at least as much as “four” exists. Yet they can’t be reached by studying polynomials. Not polynomials that don’t already use these exotic irrational numbers. These are transcendental numbers. If we were to say the transcendental numbers were the only real numbers we would be making only a very slight mistake. We learn they exist by thinking long enough and deep enough about arithmetic to realize there must be more there than we realized.
Thought compounds thought. The integers and the rational numbers and the real numbers have a structure. They interact in certain ways. We can look for things that are not numbers, but which follow rules like that for addition and for multiplication. Sometimes even for powers and for roots. Some of these can be strange: polynomials themselves, for example, follow rules like those of arithmetic. Matrices, which we can represent as grids of numbers, can have powers and even something like roots. Arithmetic is inspiration to finding mathematical structures that look little like our arithmetic. We can find things that follow mathematical operations but which don’t have a Fundamental Theorem of Arithmetic.
And there are more related ideas. These are often very useful. There’s modular arithmetic, in which we adjust the rules of addition and multiplication so that we can work with a finite set of numbers. There’s floating point arithmetic, in which we set machines to do our calculations. These calculations are no longer precise. But they are fast, and reliable, and that is often what we need.
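Both of these are easy to watch in action. In this Python sketch the modulus of 12 is just my choice, since clock arithmetic is the familiar case, and the floating point example is the standard one with 0.1 and 0.2.

```python
import math

def add_mod(a, b, modulus):
    """Addition that wraps around once it reaches `modulus`."""
    return (a + b) % modulus

def mul_mod(a, b, modulus):
    """Multiplication that wraps around once it reaches `modulus`."""
    return (a * b) % modulus

# Modular arithmetic: a finite set of numbers, 0 through 11.
print(add_mod(9, 5, 12))   # five hours after nine o'clock
print(mul_mod(3, 4, 12))   # 3 times 4 wraps all the way back around

# Floating point arithmetic: fast, but no longer precise.
total = 0.1 + 0.2
print(total == 0.3)              # exact comparison fails
print(math.isclose(total, 0.3))  # "close enough" comparison succeeds
```

The exact comparison fails because neither 0.1 nor 0.2 can be stored exactly in binary. The usual remedy is to compare within a tolerance, as `math.isclose` does.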
So arithmetic is what people who aren’t mathematicians figure mathematicians do all day. And they are mistaken, but not by much. Arithmetic gives us an idea of what mathematics we can hope to understand. So it structures the way we think about mathematics.
You know we’re getting near the end of the (United States) school year when Comic Strip Master Command orders everyone to clear out their mathematics jokes. I’m assuming that’s what happened here. Or else a lot of cartoonists had word problems on their minds eight weeks ago. Also eight weeks ago plus whenever they originally drew the comics, for those that are deep in reruns. It was busy enough to split this week’s load into two pieces and might have been worth splitting into three, if I thought I had publishing dates free for all that.
Larry Wright’s Motley Classics for the 28th of May, a rerun from 1989, is a joke about using algebra. Occasionally mathematicians try to use the ability of people to catch things in midair as evidence of the sorts of differential equation solutions that we all can do, if imperfectly, in our heads. But I’m not aware of evidence that anyone does anything that sophisticated. I would be stunned if we didn’t really work by a process of making a guess of where the thing should be and refining it as time allows, with experience helping us make better guesses. There’s good stuff to learn in modeling how to catch stuff, though.
Michael Jantze’s The Norm Classics rerun for the 28th opines about why in algebra you had to not just have an answer but explain why that was the answer. I suppose mathematicians get trained to stop thinking about individual problems and instead look to classes of problems. Is it possible to work out a scheme that works for many cases instead of one? If it isn’t, can we at least say something interesting about why it’s not? And perhaps that’s part of what makes algebra classes hard. To think about a collection of things is usually harder than to think about one, and maybe instructors aren’t always clear about how to turn the specific into the general.
Also I want to say some very good words about Jantze’s graphical design. The mock textbook cover for the title panel on the left is so spot-on for a particular era in mathematics textbooks it’s uncanny. The all-caps Helvetica, the use of two slightly different tans, the minimalist cover art … I know shelves stuffed full in the university mathematics library where every book looks like that. Plus, “[Mathematics Thing] And Their Applications” is one of the roughly four standard approved mathematics book titles. He paid good attention to his references.
Gary Wise and Lance Aldrich’s Real Life Adventures for the 28th deploys a big old whiteboard full of equations for the “secret” of the universe. This makes a neat change from finding the “meaning” of the universe, or of life. The equations themselves look mostly like gibberish to me, but Wise and Aldrich make good uses of their symbols. The symbol $\vec{B}$, a vector-valued quantity named B, turns up a lot. This symbol we often use to represent magnetic flux. The B without a little arrow above it would represent the intensity of the magnetic field. Similarly an $\vec{H}$ turns up. This we often use for magnetic field strength. While I didn’t spot an $\vec{E}$ (electric field), which would be the natural partner to all this, there are plenty of bare E symbols. Those would represent electric potential. And many of the other symbols are what would naturally turn up if you were trying to model how something is tossed around by a magnetic field. Q, for example, is often the electric charge. ω is a common symbol for how fast an electromagnetic wave oscillates. (It’s not the frequency, but it’s related to the frequency.) The use of symbols is consistent enough, in fact, I wonder if Wise and Aldrich did use a legitimate sprawl of equations and I’m missing the referenced problem.
Bill Amend’s FoxTrot Classics for the 31st, a rerun from the 7th of June, 2006, shows the conflation of “genius” and “good at mathematics” in everyday use. Amend has picked a quixotic but in-character thing for Jason Fox to try doing. Euclid’s Fifth Postulate is one of the classic obsessions of mathematicians throughout history. Euclid admitted the thing (a confusing-reading mess of propositions) as a postulate because … well, there’s interesting geometry you can’t do without it, and there doesn’t seem any way to prove it from the rest of his geometric postulates. So it must be assumed to be true.
There isn’t a way to prove it from the rest of the geometric postulates, but it took mathematicians over two thousand years of work at that to be convinced of the fact. But I know I went through a time of wanting to try finding a proof myself. It was a mercifully short-lived time that ended in my humbly understanding that as smart as I figured I was, I wasn’t that smart. We can suppose Euclid’s Fifth Postulate to be false and get interesting geometries out of that, particularly the geometries of the surface of the sphere, and the geometry of general relativity. Jason will surely sometime learn.
On reflection, that Saturday Morning Breakfast Cereal I was thinking about was not mathematically-inclined enough to be worth including here. Helping make my mind up on that was that I had enough other comic strips to discuss here that I didn’t need to pad my essay. Yes, on a slow week I let even more marginal stuff in. Here’s the comic I don’t figure to talk about. Enjoy!
Jack Pullan’s Boomerangs rerun for the 16th is another strip built around the “algebra is useless in real life” notion. I’m too busy noticing Mom in the first panel saying “what are you doing play [sic] video games?” to respond.
Ruben Bolling’s Super-Fun-Pak Comix excerpt for the 16th is marginal, yeah, but fun. Numeric coincidence and numerology can sneak into compulsions with terrible ease. I can believe easily the need to make the number of steps divisible by some favored number.
Rich Powell’s Wide Open for the 16th is a caveman science joke, and it does rely on a chalkboard full of algebra for flavor. The symbols come tantalizingly close to meaningful. The amount of kinetic energy, K or KE, of a particle of mass m moving at speed v is indeed $\frac{1}{2}mv^2$. Both 16 and 32 turn up often in the physics of falling bodies, at least if we’re using feet to measure. Another of the board’s equations turns up in physics too. It comes from the acceleration of a mass on a spring. But an equation of the same shape turns up whenever you describe things that go through tiny wobbles around the normal value. So the blackboard is gibberish, but it’s a higher grade of gibberish than usual.
Rick Detorie’s One Big Happy rerun for the 17th is a resisting-the-word-problem joke, made fresher by setting it in little Ruthie’s playing at school.
Emphasis on can. There’s no good way to solve the “general” three-body problem, the one where the star and planets can have any sizes and any starting positions and any starting speeds. We can do well for special cases, though. If you have a sun, a planet, and a satellite — each body negligible compared to the other — we can predict orbits perfectly well. If the bodies have to stay in one plane of motion, instead of moving in three-dimensional space, we can do pretty well. If we know two of the bodies orbit each other tightly and the third is way off in the middle of nowhere we can do pretty well.
But there’s still so many interesting cases for which we just can’t be sure chaos will not break out. Three interacting bodies just offer so much more chance for things to happen. (To mention something surely coincidental, it does seem to be a lot easier to write good comedy, or drama, with three important characters rather than two. Any pair of characters can gang up on the third, after all. I notice how much more energetic Over The Hedge became when Hammy the Squirrel joined RJ and Verne as the core cast.)
If there was one major theme for this week it was my confidence that there must be another source of Jumble strips out there. I haven’t found it, but I admit not making it a priority either. The official Jumble site says I can play if I activate Flash, but I don’t have enough days in the year to keep up with Flash updates. And that doesn’t help me posting mathematics-relevant puzzles here anyway.
Mark Anderson’s Andertoons for January 29th satisfies my Andertoons need for this week. And it name-drops the one bit of geometry everyone remembers. To be dour and humorless about it, though, I don’t think one could likely apply the Pythagorean Theorem. Typically the horizontal axis and the vertical axis in a graph like this measure different things. Squaring the different kinds of quantities and adding them together wouldn’t mean anything intelligible. What would even be the square root of (say) a squared-dollars-plus-squared-weeks? This is something one learns from dimensional analysis, a corner of mathematics I’ve thought about writing about some. I admit this particular insight isn’t deep, but everything starts somewhere.
Norm Feuti’s Gil rerun for the 30th is a geometry name-drop, listing it as the sort of category Jeopardy! features. Gil shouldn’t quit so soon. The responses for the category are “What is the Pythagorean Theorem?”, “What is acute?”, “What is parallel?”, “What is 180 degrees?” (or, possibly, 360 or 90 degrees), and “What is a pentagon?”.
Terri Libenson’s Pajama Diaries for the 1st of February shows off the other major theme of this past week, which was busy enough that I have to again split the comics post into two pieces. That theme is people getting basic mathematics wrong. Mostly counting. (You’ll see.) I know there’s no controlling what people feel embarrassed about. But I think it’s unfair to conclude you “can no longer” do mathematics in your head because you’re not able to make change right away. It’s normal to be slow or unreliable about something you don’t do often. Inexperience and inability are not the same thing, and it’s unfair to people to conflate them.
Gordon Bess’s Redeye for the 21st of September, 1970, got rerun the 1st of February. And it’s another in the theme of people getting basic mathematics wrong. And even more basic mathematics this time. There’s more problems-with-counting comics coming when I finish the comics from the past week.
Dave Whamond’s Reality Check for the 1st hopes that you won’t notice the label on the door is painted backwards. Just saying. It’s an easy joke to make about algebra, also, that it should put letters into perfectly good mathematics. Letters are used for good reasons, though. We’ve always wanted to work out the value of numbers we only know descriptions of. But it’s way too wordy to use the whole description of the number every time we might speak of it. Before we started using letters we could use placeholder names like “re”, meaning “thing” (as in “thing we want to calculate”). That works fine, although it crashes horribly when we want to track two or three things at once. It’s hard to find words that are decently noncommittal about their values but that we aren’t going to confuse with each other.
So the alphabet works great for this. An individual letter doesn’t suggest any particular number, as long as we pretend ‘O’ and ‘I’ and ‘l’ don’t look like they do. But we also haven’t got any problem telling ‘x’ from ‘y’ unless our handwriting is bad. They’re quick to write and to say aloud, and they don’t require learning to write any new symbols.
Later, yes, letters do start picking up connotations. And sometimes we need more letters than the Roman alphabet allows. So we import from the Greek alphabet the letters that look different from their Roman analogues. That’s a bit exotic. But at least in a Western-European-based culture they aren’t completely novel. Mathematicians aren’t really trying to make this hard because, after all, they’re the ones who have to deal with the hard parts.
Bud Fisher’s Mutt and Jeff rerun for the 2nd is another of the basic-mathematics-wrong jokes. But it does get there by throwing out a baffling set of story-problem-starter points. Particularly interesting to me is Jeff’s protest in the first panel that they couldn’t have been doing 60 miles an hour as they hadn’t been out an hour. It’s the sort of protest easy to use as introduction to the ideas of average speed and instantaneous speed and, from that, derivatives.
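Jeff’s protest falls apart once speed is seen as a ratio rather than a distance covered in a whole hour. A tiny Python sketch, with trip figures I made up (a car covering one mile every minute), shows the same 60 miles an hour over shorter and shorter stretches:

```python
def average_speed_mph(miles, minutes):
    """Average speed over a stretch of road, in miles per hour."""
    return miles * 60 / minutes

# Hypothetical trips at a steady one mile per minute.
for minutes in (60, 5, 1, 0.5):
    miles = minutes  # one mile covered for each minute driven
    print(minutes, "minutes:", average_speed_mph(miles, minutes), "mph")
```

Let the stretch of time shrink toward nothing and the average speed settles on the instantaneous speed, which is the derivative of position.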
Comic Strip Master Command sent me a slow week in mathematical comics. I suppose they knew I was somehow on a busier schedule than usual and couldn’t spend all the time I wanted just writing. I appreciate that but don’t want to see another of those weeks when nothing qualifies. Just a warning there.
John Rose’s Barney Google and Snuffy Smith for the 12th is a bit of mathematical wordplay. It does use geometry as the “hard mathematics we don’t know how to do”. That’s a change from the usual algebra. And that’s odd considering the joke depends on an idiom that is actually used by real people.
Patrick Roberts’s Todd the Dinosaur for the 12th uses mathematics as the classic impossibly hard subject a seven-year-old can’t be expected to understand. The worry about fractions seems age-appropriate. I don’t know whether it’s fashionable to give elementary school students experience thinking of ‘x’ and ‘y’ as numbers. I remember that as a time when we’d get a square or circle and try to figure what number fits in the gap. It wasn’t a 0 or a square often enough.
Jef Mallett’s Frazz for the 12th uses one of those great questions I think every child has. And it uses it to question how we can learn things from statistical study. This is circling around the “Bayesian” interpretation of probability, of what odds mean. It’s a big idea and I’m not sure I’m competent to explain it. It amounts to asking what explanations would be plausibly consistent with observations. As we get more data we may be able to rule some cases in or out. It can be unsettling. It demands we accept right up front that we may be wrong. But it lets us find reasonably clean conclusions out of the confusing and muddy world of actual data.
Sam Hepburn’s Questionable Quotebook for the 14th illustrates an old observation about the hypnotic power of decimal points. I think Hepburn’s gone overboard in this, though: six digits past the decimal in this percentage is too many. It draws attention to the fakeness of the number. One, two, maybe three digits past the decimal would have a more authentic ring to them. I had thought the John Allen Paulos tweet above was about this comic, but it’s mere coincidence. Funny how that happens.
And now I can finish off last week’s mathematically-themed comic strips. There’s a strong theme to them, for a refreshing change. It would almost be what we’d call a Comics Synchronicity, on Usenet group rec.arts.comics.strips, had they all appeared the same day. Some folks claiming to be open-minded would allow a Synchronicity for strips appearing on subsequent days or close enough in publication, but I won’t have any of that unless it suits my needs at the time.
Ernie Bushmiller’s Nancy for the 6th would fit thematically better as a Cameo Edition comic. It mentions arithmetic but only because it’s the sort of thing a student might need a cheat sheet on. I can’t fault Sluggo needing help on adding eight or multiplying by six; they’re hard. Not remembering 4 x 2 is unusual. But everybody has their own hangups. The strip originally ran the 6th of December, 1949.
Bill Holbrook’s On The Fastrack for the 7th seems like it should be the anthropomorphic numerals joke for this essay. It doesn’t seem to quite fit the definition, but, what the heck.
Brian Boychuk and Ron Boychuk’s The Chuckle Brothers on the 7th starts off the run of E = mc² jokes for this essay. This one reminds me of Gary Larson’s Far Side classic with the cleaning woman giving Einstein just that little last bit of inspiration about squaring things away. It shouldn’t surprise anyone that E equalling m times c squared isn’t a matter of what makes an attractive-looking formula. There’s good reasons when one thinks what energy and mass are to realize they’re connected like that. Einstein’s famous, deservedly, for recognizing that link and making it clear.
Mark Pett’s Lucky Cow rerun for the 7th has Claire try to use Einstein’s famous quote to look like a genius. The mathematical content is accidental. It could be anything profound yet easy to express, and it’s hard to beat the economy of “E = mc²” for both. I’d agree that it suggests Claire doesn’t know statistics well to suppose she could get a MacArthur “Genius” Grant by being overheard by a grant nominator. On the other hand, does anybody have a better idea how to get their attention?
Harley Schwadron’s 9 to 5 for the 8th completes the “E = mc²” triptych. Calling a tie with the equation on it a power tie elevates the gag for me. I don’t think of “E = mc²” as something that uses powers, even though it literally does. I suppose what gets me is that “c” is a constant number. It’s the speed of light in a vacuum. So “c²” is also a constant number. In form the equation isn’t different from “E = m times seven”, and nobody thinks of seven as a power.
Morrie Turner’s Wee Pals rerun for the 8th is a bit of mathematics wordplay. It’s also got that weird Morrie Turner thing going on where it feels unquestionably earnest and well-intentioned but prejudiced in that way smart 60s comedies would be.
Mort Walker’s Beetle Bailey for the 18th of May, 1960 was reprinted on the 9th. It mentions mathematics, algebra specifically, as the sort of thing intelligent people do. I’m going to take a leap and suppose it’s the sort of algebra done in high school about finding values of ‘x’ rather than the mathematics-major sort of algebra, done with groups and rings and fields. I wonder when holding a mop became the signifier of not just low intelligence but low ambition. It’s subverted in Jef Mallett’s Frazz, the title character of which works as a janitor to support his exercise and music habits. But it is a standard prop to signal something.
It’s hard to learn from an example. Examples are great, and I wouldn’t try teaching anything subtle without one. Might not even try teaching the obvious without one. But a single example is dangerous. The learner has trouble telling what parts of the example are the general lesson to learn and what parts are just things that happen to be true for that case. Having several examples, of different kinds of things, saves the student. The thing in common to many different examples is the thing to retain.
The mathematics major learns group theory in Introduction To Not That Kind Of Algebra, MAT 351. A group extracts the barest essence of arithmetic: a bunch of things and the ability to add them together. So what’s an example? … Well, the integers do nicely. What’s another example? … Well, the integers modulo two, where the only things are 0 and 1 and we know 1 + 1 equals 0. What’s another example? … The integers modulo three, where the only things are 0 and 1 and 2 and we know 1 + 2 equals 0. How about another? … The integers modulo four? Modulo five?
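These little groups are small enough to print in full. A minimal Python sketch, where `cyclic_addition_table` is a name I picked for illustration: row a, column b of the table holds a + b in the integers modulo n.

```python
def cyclic_addition_table(n):
    """Addition table for the integers modulo n, as a list of rows."""
    return [[(a + b) % n for b in range(n)] for a in range(n)]

for n in (2, 3, 4):
    print("integers modulo", n)
    for row in cyclic_addition_table(n):
        print("  ", row)
```

Every row is the row above it shifted over by one place. That sameness from one modulus to the next is why the examples all feel like one example.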
All true. All, also, basically the same thing. The whole set of integers, or of real numbers, is different. But as finite groups, the integers modulo anything are nice, easy-to-understand groups. They’re known as Cyclic Groups for reasons I’ll explain if asked. But all the Cyclic Groups are kind of the same.
So how about another example? And here we get some good ones. There’s the Permutation Groups. These are fun. You start off with a set of things. You can label them anything you like, but you’re daft if you don’t label them the counting numbers. So, say, the set of things 1, 2, 3, 4, 5. Start with them in that order. A permutation is a rearrangement of those things, and the simplest kind is the swapping of any pair of them. So swapping, say, the second and fifth things to get the list 1, 5, 3, 4, 2. The collection of all the rearrangements you can make is the Permutation Group on this set of things. The things in the group are not 1, 2, 3, 4, 5. The things in the permutation group are “swap the second and fifth thing” or “swap the third and first thing” or “swap the fourth and the third thing”. You maybe feel uneasy about this. That’s all right. I suggest playing with this until you feel comfortable because it is a lot of fun to play with. Playing in this case means writing out all the ways you can swap stuff, which you can always do as a string of swaps of exactly two things.
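Here is one way to play, sketched in Python. The `swap` helper is my own naming. Notice that it returns the action, “swap the i-th and j-th things”, and that action is the group element. The list it acts on is not.

```python
from itertools import permutations

def swap(i, j):
    """Return the action 'swap the i-th and j-th things' (counting from 1)."""
    def apply(items):
        items = list(items)  # work on a copy so the original order survives
        items[i - 1], items[j - 1] = items[j - 1], items[i - 1]
        return items
    return apply

start = [1, 2, 3, 4, 5]

once = swap(2, 5)(start)
twice = swap(4, 2)(once)   # one swap followed by another
print(once)
print(twice)

# Every ordering of the five things, each reachable by some string of swaps.
all_orders = list(permutations(start))
print(len(all_orders))
```

Five things can be put in 120 different orders, so this Permutation Group has 120 things in it.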
(Some people may remember an episode of Futurama that involved a brain-swapping machine. Or a body-swapping machine, if you prefer. The gimmick of the episode is that two people could only swap bodies/brains exactly one time. The problem was how to get everybody back in their correct bodies. It turns out to be possible to do, and one of the show’s writers did write a proof of it. It’s shown on-screen for a moment. Many fans were awestruck by an episode of the show inspiring a Mathematical Theorem. They’re overestimating how rare theorems are. But it is fun when real mathematics gets done as a side effect of telling a good joke. Anyway, the theorem fits well in group theory and the study of these permutation groups.)
So the student wanting examples of groups can get the Permutation Group on three elements. Or the Permutation Group on four elements. The Permutation Group on five elements. … You kind of see, this is certainly different from those Cyclic Groups. But they’re all kind of like each other.
An “Alternating Group” is one where every element can be written as an even number of swaps. So, “swap the second and fifth things” would not be in an alternating group. But “swap the second and fifth things, and swap the fourth and second things” would be. And so the student needing examples can look at the Alternating Group on two elements. Or the Alternating Group on three elements. The Alternating Group on four elements. And so on. It’s slightly different from the Permutation Group. It’s certainly different from the Cyclic Group. But still, if you’ve mastered the Alternating Group on five elements you aren’t going to see the Alternating Group on six elements as all that different.
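Whether a rearrangement takes an even or an odd number of swaps can be checked without hunting for the swaps. The Python sketch below, with my own helper names, counts how many pairs end up out of order. That count is even exactly when the number of swaps is even. Positions are numbered from 0 here, as Python likes.

```python
from itertools import permutations

def is_even(perm):
    """True if `perm`, a rearrangement of 0..n-1, takes an even number of swaps."""
    out_of_order = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return out_of_order % 2 == 0

def alternating_group(n):
    """All the even rearrangements of n things."""
    return [p for p in permutations(range(n)) if is_even(p)]

print(is_even((0, 4, 2, 3, 1)))   # second and fifth things swapped
print(is_even((0, 3, 2, 4, 1)))   # that swap, then fourth and second swapped
for n in (3, 4, 5):
    print(n, "things:", len(alternating_group(n)), "even rearrangements")
```

Exactly half of all rearrangements are even, so the Alternating Group on five elements has 60 things in it.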
Cyclic Groups and Alternating Groups have some stuff in common. Permutation Groups not so much and I’m going to leave them in the above paragraph, waving, since they got me to the Alternating Groups I wanted.
One is that they’re finite. At least they can be. I like finite groups. I imagine students like them too. It’s nice having a mathematical thing you can write out in full and know you aren’t missing anything.
The second thing is that they are, or they can be, “simple groups”. That’s … a challenge to explain. This has to do with the structure of the group and the kinds of subgroup you can extract from it. It’s very very loosely and figuratively and do not try to pass this off at your thesis defense kind of like being a prime number. In fact, Cyclic Groups for a prime number of elements are simple groups. So are Alternating Groups on five or more elements.
So we get to wondering: what are the finite simple groups? Turns out they come in four main families. One family is the Cyclic Groups for a prime number of things. One family is the Alternating Groups on five or more things. One family is this collection called the Chevalley Groups. Those are mostly things about projections: the ways to map one set of coordinates into another. We don’t talk about them much in Introduction To Not That Kind Of Algebra. They’re too deep into Geometry for people learning Algebra. The last family is this collection called the Twisted Chevalley Groups, or the Steinberg Groups. And they … uhm. Well, I never got far enough into Geometry I’m Guessing to understand what they’re for. I’m certain they’re quite useful to people working in the field of order-three automorphisms of the whatever exactly D₄ is.
And that’s it. That’s all the families there are. If it’s a finite simple group then it’s one of these. … Unless it isn’t.
Because there are a couple of stragglers. There are a few finite simple groups that don’t fit in any of the four big families. And it really is only a few. I would have expected an infinite number of weird little cases that don’t belong to a family that looks similar. Instead, there are 26. (27 if you decide a particular one of the Steinberg Groups doesn’t really belong in that family. I’m not familiar enough with the case to have an opinion.) Funny number to have turn up. It took ten thousand pages to prove there were just the 26 special cases. I haven’t read them all. (I haven’t read any of the pages. But my Algebra professors at Rutgers were proud to mention their department’s work in tracking down all these cases.)
Some of these cases have some resemblance to one another. But not enough to see them as a family the way the Cyclic Groups are. We bundle all these together in a wastebasket taxon called “the sporadic groups”. The first five of them were worked out in the 1860s. The last of them was worked out in 1980, seven years after its existence was first suspected.
The sporadic groups all have weird sizes. The smallest one, known as M11 (for “Mathieu”, who found it and four of its siblings in the 1860s) has 7,920 things in it. They get enormous soon after that.
The biggest of the sporadic groups, and the last one described, is the Monster Group. It’s known as M. It has a lot of things in it. In particular it’s got 808,017,424,794,512,875,886,459,904,961,710,757,005,754,368,000,000,000 things in it. So, you know, it’s not like we’ve written out everything that’s in it. We’ve just got descriptions of how you would write out everything in it, if you wanted to try. And you can get a good argument going about what it means for a mathematical object to “exist”, or to be “created”. There are something like 10^54 things in it. That’s something like a trillion times a trillion times the number of stars in the observable universe. Not just the stars in our galaxy, but all the stars in all the galaxies we could in principle ever see.
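A number that long is hard to size up by eye. As a quick sanity check, this small Python sketch counts its digits and prints it in scientific notation. The variable name `monster_order` is only a label chosen for this illustration.

```python
# The size of the Monster Group, copied from the paragraph above.
monster_order = int(
    "808,017,424,794,512,875,886,459,904,961,710,757,005,754,368,000,000,000"
    .replace(",", "")
)

digits = len(str(monster_order))
print(digits)                   # how many decimal digits the number has
print(f"{monster_order:.3e}")   # the same number in scientific notation
```

Python integers have no size limit, so nothing gets rounded until the last line, which converts to a floating-point number only for display.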
It’s one of the rare things for which “Brobdingnagian” is an understatement. Everything about it is mind-boggling, the sort of thing that staggers the imagination more than infinitely large things do. We don’t really think of infinitely large things; we just picture “something big”. A number like that one above is definite, and awesomely big. Just read off the digits of that number; it sounds like what we imagine infinity ought to be.
We can make a chart, called the “character table”, which describes how subsets of the group interact with one another. The character table for the Monster Group is 194 rows tall and 194 columns wide. The Monster Group can be represented as this, I am solemnly assured, logical and beautiful algebraic structure. It’s something like a polyhedron in rather more than three dimensions of space. In particular it needs 196,884 dimensions to show off its particular beauty. I am taking experts’ word for it. I can’t quite imagine more than 196,883 dimensions for a thing.
And it’s a thing full of mystery. This creature of group theory makes us think of the number 196,884. The same 196,884 turns up in number theory, the study of how integers are put together. It’s the first non-boring coefficient in a thing called the j-function. It’s not coincidence. This bit of number theory and this bit of group theory are bound together, but it took some years for anyone to quite understand why.
There are more mysteries. The character table has 194 rows and columns. Each column implies a function. Some of those functions are duplicated; there are 171 distinct ones. But some of the distinct ones it turns out you can find by adding together multiples of others. Strip those out and there are 163 independent ones. 163 appears again in number theory, in the study of algebraic integers. These are, of course, not integers at all. They’re things that look like complex-valued numbers: some real number plus some (possibly other) real number times the square root of some specified negative number. They’ve got neat properties. Or weird ones.
You know how with integers there’s just one way to factor them? Like, fifteen is equal to three times five and no other set of prime numbers? Algebraic integers don’t work like that. There’s usually multiple ways to do that. There are exceptions, algebraic integers that still have unique factorings. They happen only for a few square roots of negative numbers. The biggest of those negative numbers? Minus 163.
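A standard textbook illustration of factoring going wrong, which is not taken from this essay, uses numbers of the form a + b√-5 with a and b ordinary integers. In that collection 6 can be written as 2 times 3 and also as (1 + √-5) times (1 - √-5), and none of those four factors breaks down any further. I’m stating that last part rather than proving it. The Python sketch below only checks the multiplication. Storing a + b√-5 as the pair (a, b), and the name `multiply`, are assumptions made for this example.

```python
D = 5  # we work with numbers a + b*sqrt(-5), stored as the pair (a, b)

def multiply(x, y):
    """Multiply (a + b*sqrt(-D)) by (c + d*sqrt(-D))."""
    a, b = x
    c, d = y
    # sqrt(-D) * sqrt(-D) = -D, so the b*d term feeds back into the plain part.
    return (a * c - D * b * d, a * d + b * c)

print(multiply((2, 0), (3, 0)))    # (6, 0)
print(multiply((1, 1), (1, -1)))   # (6, 0)
```

Both products come out as the pair (6, 0), which is plain old 6.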
I don’t know if this 163 appearance means something. As I understand the matter, neither does anybody else.
There is some link to the mathematics of string theory. That’s an interesting but controversial and hard-to-experiment-upon model for how the physics of the universe may work. But I don’t know string theory well enough to say what it is or how surprising this should be.
The Monster Group creates a monster essay. I suppose it couldn’t do otherwise. I suppose I can’t adequately describe all its sublime mystery. Dr Mark Ronan has written a fine web page describing much of the Monster Group and the history of our understanding of it. He also has written a book, Symmetry and the Monster, to explain all this in greater depths. I’ve not read the book. But I do mean to, now.
I want to talk about functions again. I’ve been keeping like a proper mathematician to a nice general idea of what a function is. The sort where a function’s this rule matching stuff in a set called the domain with stuff in a set called the range. And I’ve tried not to commit myself to saying anything about what that domain and range are. They could be numbers. They could be other functions. They could be the set of DVDs you own but haven’t watched in more than two years. They could be collections of socks. Haven’t said.
But we know what functions anyone cares about. They’re stuff that have domains and ranges that are numbers. Preferably real numbers. Complex-valued numbers if we must. If we look at more exotic sets they’re ones that stick close to being numbers: vectors made up of an ordered set of numbers. Matrices of numbers. Functions that are themselves about numbers. Maybe we’ll get to something exotic like a rotation, but then what is a rotation but spinning something a certain number of degrees? There are a bunch of unavoidably common domains and ranges.
Fine, then. I’ll stick to functions with ranges that look enough like regular old numbers. By “enough” I mean they have a zero. That is, something that works like zero does. You know, add it to something else and that something else isn’t changed. That’s all I need.
A natural thing to wonder about a function — hold on. “Natural” is the wrong word. Something we learn to wonder about in functions, in pre-algebra class where they’re all polynomials, is where the zeroes are. They’re generally not at zero. Why would we say “zeroes” to mean “zero”? That could let non-mathematicians think they knew what we were on about. By the “zeroes” we mean the things in the domain that get matched to the zero in the range. It might be zero; no reason it couldn’t, until we know what the function’s rule is. Just we can’t count on that.
A polynomial we know has … well, it might have zero zeroes. Might have no zeroes. It might have one, or two, or so on. If it’s an n-th degree polynomial it can have up to n zeroes. And if it’s not a polynomial? Well, then it could have any conceivable number of zeroes and nobody is going to give you a nice little formula to say where they all are. It’s not that we’re being mean. It’s just that there isn’t a nice little formula that works for all possibilities. There aren’t even nice little formulas that work for all polynomials. You have to find zeroes by thinking about the problem. Sorry.
But! Suppose you have a collection of all the zeroes for your function. That’s all the points in the domain that match with zero in the range. Then we have a new name for the thing you have. And that’s the kernel of your function. It’s the biggest subset in the domain with an image that’s just the zero in the range.
So we have a name for the zeroes that isn’t just “the zeroes”. What does this get us?
If we don’t know anything about the kind of function we have, not much. If the function belongs to some common kinds of functions, though, it tells us stuff.
For example. Suppose the function has domain and range that are vectors. And that the function is linear, which is to say, easy to deal with. Let me call the function ‘f’. And let me pick out two things in the domain. I’ll call them ‘x’ and ‘y’ because I’m writing this after Thanksgiving dinner and can’t work up a cleverer name for anything. If f is linear then f(x + y) is the same thing as f(x) + f(y). And now something magic happens. If x and y are both in the kernel, then x + y has to be in the kernel too. Think about it. Meanwhile, if x is in the kernel but y isn’t, then f(x + y) is f(y). Again think about it.
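For anyone who would rather watch it happen than think about it, here is a minimal Python sketch. The particular linear function and the vectors are made up for illustration. I’ve used ‘w’ for the vector outside the kernel so that ‘x’ and ‘y’ can both stay inside it.

```python
def f(v):
    """A linear map from 3-dimensional vectors to 2-dimensional vectors."""
    a, b, c = v
    return (a + b, b + c)

def add(u, v):
    """Add two vectors component by component."""
    return tuple(p + q for p, q in zip(u, v))

x = (1, -1, 1)    # in the kernel: f(x) is (0, 0)
y = (2, -2, 2)    # also in the kernel
w = (1, 2, 3)     # not in the kernel

print(f(x), f(y))            # (0, 0) (0, 0)
print(f(add(x, y)))          # (0, 0)
print(f(w), f(add(x, w)))    # (3, 5) (3, 5)
```

Adding two kernel vectors stays in the kernel, and adding a kernel vector to ‘w’ leaves the output of f exactly where it was.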
What we can see is that the domain fractures into two directions. One of them, the direction of the kernel, is invisible to the function. You can move however much you like in that direction and f can’t see it. The other direction, perpendicular (“orthogonal”, we say in the trade) to the kernel, is visible. Everything that might change changes in that direction.
This idea threads through vector spaces, and we study a lot of things that turn out to look like vector spaces. It keeps surprising us by letting us solve problems, or find the best-possible approximate solutions. This kernel gives us room to match some fiddly conditions without breaking the real solution. The size of the kernel alone can tell us whether some problems are solvable, or whether they’ll have infinitely large sets of solutions.
In this vector-space construct the kernel often takes on another name, the “null space”. This means the same thing. But it reminds us that superhero comics writers miss out on many excellent pieces of terminology by not taking advanced courses in mathematics.
Kernels also appear in group theory, whenever we get into rings. We’re always working with rings. They’re nearly as unavoidable as vector spaces.
You know how you can divide the whole numbers into odd and even? And you can do some neat tricks with that for some problems? You can do that with every ring, using the kernel as a dividing point. This gives us information about how the ring is shaped, and what other structures might look like the ring. This often lets us turn proofs that might be hard into a collection of proofs on individual cases that are, at least, doable. Tricks about odd and even numbers become, in trained hands, subtle proofs of surprising results.
We see vector spaces and rings all over the place in mathematics. Some of that’s selection bias. Vector spaces capture a lot of what’s important about geometry. Rings capture a lot of what’s important about arithmetic. We have understandings of geometry and arithmetic that transcend even our species. Raccoons understand space. Crows understand number. When we look to do mathematics we look for patterns we understand, and these are major patterns we understand. And there are kernels that matter to each of them.
Some mathematical ideas inspire metaphors to me. Kernels are one. Kernels feel to me like the process of holding a polarized lens up to a crystal. This lets one see how the crystal is put together. I realize writing this down that my metaphor is unclear: is the kernel the lens or the structure seen in the crystal? I suppose the function has to be the lens, with the kernel the crystallization planes made clear under it. It’s curious I had enjoyed this feeling about kernels and functions for so long without making it precise. Feelings about mathematical structures can be like that.
So let me start the End 2016 Mathematics A To Z with a word everybody figures they know. As will happen, everybody’s right and everybody’s wrong about that.
Everybody knows what algebra is. It’s the point where suddenly mathematics involves spelling. Instead of long division we’re on a never-ending search for ‘x’. Years later we pass along gifs of either someone saying “stop asking us to find your ex” or someone who’s circled the letter ‘x’ and written “there it is”. And make jokes about how we got through life without using algebra. And we know it’s the thing mathematicians are always doing.
Mathematicians aren’t always doing that. I expect the average mathematician would say she almost never does that. That’s a bit of a fib. We have a lot of work where we do stuff that would be recognizable as high school algebra. It’s just we don’t really care about that. We’re doing that because it’s how we get the problem we are interested in done. The most recent few pieces in my “Why Stuff can Orbit” series include a bunch of high school algebra-style work. But that was just because it was the easiest way to answer some calculus-inspired questions.
Still, “algebra” is a much-used word. It comes back around the second or third year of a mathematics major’s career. It comes in two forms in undergraduate life. One form is “linear algebra”, which is a great subject. That field’s about how stuff moves. You get to imagine space as this stretchy material. You can stretch it out. You can squash it down. You can stretch it in some directions and squash it in others. You can rotate it. These are simple things to build on. You can spend a whole career building on that. It becomes practical in surprising ways. For example, it’s the field of study behind finding equations that best match some complicated, messy real data.
The second form is “abstract algebra”, which comes in about the same time. This one is alien and baffling for a long while. It doesn’t help that the books all call it Introduction to Algebra or just Algebra and all your friends think you’re slumming. The mathematics major stumbles through confusing definitions and theorems that ought to sound comforting. (“Fermat’s Little Theorem”? That’s a good thing, right?) But the confusion passes, in time. There’s a beautiful subject here, one of my favorites. I’ve talked about it a lot.
We start with something that looks like the loosest cartoon of arithmetic. We get a bunch of things we can add together, and an ‘addition’ operation. This lets us do a lot of stuff that looks like addition modulo numbers. Then we go on to stuff that looks like picking up floor tiles and rotating them. Add in something that we call ‘multiplication’ and we get rings. This is a bit more like normal arithmetic. Add in some other stuff and we get ‘fields’ and other structures. We can keep falling back on arithmetic and on rotating tiles to build our intuition about what we’re doing. This trains mathematicians to look for particular patterns in new, abstract constructs.
Linear algebra is not an abstract-algebra sort of algebra. Sorry about that.
And there’s another kind of algebra that mathematicians talk about. At least once they get into grad school they do. There’s a huge family of these kinds of algebras. The family trait for them is that they share a particular rule about how you can multiply their elements together. I won’t get into that here. There are many kinds of these algebras. One that I keep trying to study on my own and crash hard against is Lie Algebra. That’s named for the Norwegian mathematician Sophus Lie. Pronounce it “lee”, as in “leaning”. You can understand quantum mechanics much better if you’re comfortable with Lie Algebras and so now you know one of my weaknesses. Another kind is the Clifford Algebra. This lets us create something called a “hypercomplex number”. It isn’t much like a complex number. Sorry. Clifford Algebra does lend to a construct called spinors. These help physicists understand the behavior of bosons and fermions. Every bit of matter seems to be either a boson or a fermion. So you see why this is something people might like to understand.
Boolean Algebra is the algebra of this type that a normal person is likely to have heard of. It’s about what we can build using two values and a few operations. Those values by tradition we call True and False, or 1 and 0. The operations we call things like ‘and’ and ‘or’ and ‘not’. It doesn’t sound like much. It gives us computational logic. Isn’t that amazing stuff?
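To see how much falls out of two values and three operations, here is a small Python sketch. It prints the truth tables and then checks one classic identity, De Morgan’s law, which I’m bringing in as an example; it isn’t mentioned above. Python’s built-in `and`, `or`, and `not` stand in for the operations.

```python
from itertools import product

values = (False, True)

# Truth tables for the basic operations.
for p, q in product(values, repeat=2):
    print(p, q, "| and:", p and q, "| or:", p or q, "| not p:", not p)

# De Morgan's law: "not (p and q)" always matches "(not p) or (not q)".
de_morgan_holds = all(
    (not (p and q)) == ((not p) or (not q))
    for p, q in product(values, repeat=2)
)
print(de_morgan_holds)  # True
```

With only four combinations of inputs to try, checking every case is a complete proof. That’s one of the comforts of working with two values.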
So if someone says “algebra” she might mean any of these. A normal person in a non-academic context probably means high school algebra. A mathematician speaking without further context probably means abstract algebra. If you hear something about “matrices” it’s more likely that she’s speaking of linear algebra. But abstract algebra can’t be ruled out yet. If you hear a word like “eigenvector” or “eigenvalue” or anything else starting “eigen” (or “characteristic”) she’s more probably speaking of linear algebra. And if there’s someone’s name before the word “algebra” then she’s probably speaking of the last of these. This is not a perfect guide. But it is the sort of context mathematicians expect other mathematicians to notice.
Here in the United States schools are just lurching back into the mode where they have students come in and do stuff all day. Perhaps this is why it was a routine week. Comic Strip Master Command wants to save up a bunch of story problems for us. But here’s what the last seven days sent into my attention.
Jeff Harris’s Shortcuts educational feature for the 21st is about algebra. It’s got a fair enough blend of historical trivia and definitions and examples and jokes. I don’t remember running across the “number cruncher” joke before.
Mark Anderson’s Andertoons for the 23rd is your typical student-in-lecture joke. But I do sympathize with students not understanding when a symbol gets used for different meanings. It throws everyone. But sometimes the things important to note clearly in one section are different from the needs in another section. No amount of warning will clear things up for everybody, but we try anyway.
Tom Thaves’s Frank and Ernest for the 23rd tells a joke about collapsing wave functions, which is why you never see this comic in a newspaper but always see it on a physics teacher’s door. This is properly physics, specifically quantum mechanics. But it has mathematical import. The most practical model of quantum mechanics describes what state a system is in by something called a wave function. And we can turn this wave function into a probability distribution, which describes how likely the system is to be in each of its possible states. “Collapsing” the wave function is a somewhat mysterious and controversial practice. It comes about because if we know nothing about a system then it may have one of many possible values. If we observe, say, the position of something though, then we have one possible value. The wave functions before and after the observation are different. We call it collapsing, reflecting how a universe of possibilities collapsed into a mere fact. But it’s hard to find an explanation for what that is that’s philosophically and physically satisfying. This problem leads us to Schrödinger’s Cat, and to other challenges to our sense of how the world could make sense. So, if you want to make your mark here’s a good problem for you. It’s not going to be easy.
We’ve got into that stretch of the year when (United States) schools are out of session. Comic Strip Master Command seems to have thus ordered everyone to clean out their mathematics gags, even if they didn’t have any particularly strong ones. There were enough the past week I’m breaking this collection into two segments, though. And the first segment, I admit, is mostly the same joke repeated.
Russell Myers’s Broom Hilda for the 27th is the type case for my “Math Is Just This Hard Stuff, Right?” name here. In fairness to Broom Hilda, mathematics is a lot harder now than it was 1,500 years ago. It’s fair not being able to keep up. There was a time that finding roots of third-degree polynomials was the stuff of experts. Today it’s within the powers of any Boring Algebra student, although she’ll have to look up the formula for it.
John McPherson’s Close To Home for the 27th is a bunch of trigonometry-cheat tattoos. I’m sure some folks have gotten mathematics tattoos that include … probably not these formulas. They’re not beautiful enough. Maybe some diagrams of triangles and the like, though. The proof of the Pythagorean Theorem in Euclid’s Elements, for example, includes this intricate figure I would expect captures imaginations and could be appreciated as a beautiful drawing.
Missy Meyer’s Holiday Doodles observed that the 28th was “Tau Day”, which takes everything I find dubious about “Pi Day” and matches it to the silly idea that we would somehow make life better by replacing π with a symbol for 2π.
Hilary Price’s Rhymes With Orange for the 29th uses mathematics as the way to sort out nerds. I can’t say that’s necessarily wrong. It’s interesting to me that geometry and algebra communicate “nerdy” in a shorthand way that, say, an obsession with poetry or history or other interests wouldn’t. It wouldn’t fit the needs of this particular strip, but I imagine that a well-diagrammed sentence would be as good as a page full of equations for expressing nerdiness. The title card’s promise of doing quadratic equations would have worked on me as a kid, but I thought they sounded neat and exotic and something to discover because they sounded hard. When I took Boring High School Algebra that charm wore off.
Aaron McGruder’s The Boondocks rerun for the 29th starts a sequence of Riley doubting the use of parts of mathematics. The parts about making numbers smaller. It’s a better-than-normal treatment of the problem of getting a student motivated. The strip originally ran the 18th of April, 2001, and the story continued the several days after that.
For this week I have something I want to follow up on. We’ll see if I make it that far.
The Mean Value Theorem.
My subject line disagrees with the header just above here. I want to talk about the Mean Value Theorem. It’s one of those things that turns up in freshman calculus and then again in Analysis. It’s introduced as “the” Mean Value Theorem. But like many things in calculus it comes in several forms. So I figure to talk about one of them here, and another form in a while, when I’ve had time to make up drawings.
Calculus can split effortlessly into two kinds of things. One is differential calculus. This is the study of continuity and smoothness. It studies how a quantity changes if something affecting it changes. It tells us how to optimize things. It tells us how to approximate complicated functions with simpler ones. Usually polynomials. It leads us to differential equations, problems in which the rate at which something changes depends on what value the thing has.
The other kind is integral calculus. This is the study of shapes and areas. It studies how infinitely many things, all infinitely small, add together. It tells us what the net change in things is. It tells us how to go from information about every point in a volume to information about the whole volume.
They aren’t really separate. Each kind informs the other, and gives us tools to use in studying the other. And they are almost mirrors of one another. Differentials and integrals are not quite inverses, but they come quite close. And as a result most of the important stuff you learn in differential calculus has an echo in integral calculus. The Mean Value Theorem is among them.
The Mean Value Theorem is a rule about functions. In this case it’s functions with a domain that’s an interval of the real numbers. I’ll use ‘a’ as the name for the smallest number in the domain and ‘b’ as the largest number. People talking about the Mean Value Theorem often do. The range is also the real numbers, although it doesn’t matter which ones.
I’ll call the function ‘f’ in accord with a longrunning tradition of not working too hard to name functions. What does matter is that ‘f’ is continuous on the interval [a, b]. I’ve described what ‘continuous’ means before. It means that here too.
And we need one more thing. The function f has to be differentiable on the interval (a, b). You maybe noticed that before I wrote [a, b], and here I just wrote (a, b). There’s a difference here. We need the function to be continuous on the “closed” interval [a, b]. That is, it’s got to be continuous for ‘a’, for ‘b’, and for every point in-between.
But we only need the function to be differentiable on the “open” interval (a, b). That is, it’s got to be differentiable for all the points in-between ‘a’ and ‘b’. If it happens to be differentiable for ‘a’, or for ‘b’, or for both, that’s great. But we won’t turn away a function f for not being differentiable at those points. Only the interior. That sort of distinction between stuff true on the interior and stuff true on the boundaries is common. This is why mathematicians have words for “including the boundaries” (“closed”) and “never minding the boundaries” (“open”).
As to what “differentiable” is … A function is differentiable at a point if you can take its derivative at that point. I’m sure that clears everything up. There are many ways to describe what differentiability is. One that’s not too bad is to imagine zooming way in on the curve representing a function. If you start with a big old wobbly function it waves all around. But pick a point. Zoom in on that. Does the function stay all wobbly, or does it get more steady, more straight? Keep zooming in. Does it get even straighter still? If you zoomed in over and over again on the curve at some point, would it look almost exactly like a straight line?
If it does, then the function is differentiable at that point. It has a derivative there. The derivative’s value is whatever the slope of that line is. The slope is that thing you remember from taking Boring Algebra in high school. That rise-over-run thing. But this derivative is a great thing to know. You could approximate the original function with a straight line, with slope equal to that derivative. Close to that point, you’ll make a small enough error nobody has to worry about it.
That there will be this straight line approximation isn’t true for every function. Here’s an example. Picture a line that goes up and then takes a 90-degree turn to go back down again. Look at the corner. However close you zoom in on the corner, there’s going to be a corner. It’s never going to look like a straight line; there’s a 90-degree angle there. It can be a smaller angle if you like, but any sort of corner breaks this differentiability. This is a point where the function isn’t differentiable.
There are functions that are nothing but corners. They can be differentiable nowhere, or only at a tiny set of points that can be ignored. (A set of measure zero, as the dialect would put it.) Mathematicians discovered this over the course of the 19th century. They got into some good arguments about how that can even make sense. It can get worse. Also found in the 19th century were functions that are continuous only at a single point. This smashes just about everyone’s intuition. But we can’t find a definition of continuity that’s as useful as the one we use now and avoids that problem. So we accept that it implies some pathological conclusions and carry on as best we can.
Now I get to the Mean Value Theorem in its differential calculus pelage. It starts with the endpoints, ‘a’ and ‘b’, and the values of the function at those points, ‘f(a)’ and ‘f(b)’. And from here it’s easiest to figure what’s going on if you imagine the plot of a generic function f. I recommend drawing one. Just make sure you draw it without lifting the pen from paper, and without including any corners anywhere. Something wiggly.
Draw the line that connects the ends of the wiggly graph. Formally, we’re adding the line segment that connects the points with coordinates (a, f(a)) and (b, f(b)). That’s coordinate pairs, not intervals. That’s clear in the minds of the mathematicians who don’t see why not to use parentheses over and over like this. (We are short on good grouping symbols like parentheses and brackets and braces.)
Per the Mean Value Theorem, there is at least one point whose derivative is the same as the slope of that line segment. If you were to slide the line up or down, without changing its orientation, you’d find something wonderful. Most of the time this line intersects the curve, crossing from above to below or vice-versa. But there’ll be at least one point where the shifted line is “tangent”, where it just touches the original curve. Close to that touching point, the “tangent point”, the shifted line and the curve blend together and can’t be easily told apart. As long as the function is differentiable on the open interval (a, b), and continuous on the closed interval [a, b], this will be true. You might convince yourself of it by drawing a couple of curves and taking a straightedge to the results.
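If drawing curves isn’t convincing, here is a minimal numerical sketch in Python. The function x cubed on the interval from 0 to 2 is an example chosen for illustration. The search only works because we already know a formula for the derivative and the derivative happens to keep increasing on this interval. The theorem itself makes no promise that the point is easy to find.

```python
def f(x):
    return x ** 3

def f_prime(x):
    return 3 * x ** 2

a, b = 0.0, 2.0
secant_slope = (f(b) - f(a)) / (b - a)   # slope of the line joining the endpoints

# Bisection: f_prime(x) - secant_slope is negative at a and positive at b here,
# so it must cross zero somewhere in between.
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if f_prime(mid) < secant_slope:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2

print(secant_slope)   # 4.0
print(c)              # close to 2 / sqrt(3), about 1.1547
```

At that point c, strictly between a and b, the tangent to the curve has the same slope as the line segment joining the two ends.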
This is an existence theorem. Like the Intermediate Value Theorem, it doesn’t tell us which point, or points, make the thing we’re interested in true. It just promises us that there is some point that does it. So it gets used in other proofs. It lets us mix information about intervals and information about points.
It’s tempting to try using it numerically. It looks as if it justifies a common differential-calculus trick. Suppose we want to know the value of the derivative at a point. We could pick a little interval around that point and find the endpoints. And then find the slope of the line segment connecting the endpoints. And won’t that be close enough to the derivative at the point we care about?
Well. Um. No, we really can’t be sure about that. We don’t have any idea what interval might make the derivative of the point we care about equal to this line-segment slope. The Mean Value Theorem won’t tell us. It won’t even tell us if there exists an interval that would let that trick work. We can’t invoke the Mean Value Theorem to let us get away with that.
Often, though, we can get away with it. Differentiable functions do have to follow some rules. Among them is that if you do pick a small enough interval then approximations that look like this will work all right. If the function flutters around a lot, we need a smaller interval. But a lot of the functions we’re interested in don’t flutter around that much. So we can get away with it. And there’s some grounds to trust in getting away with it. The Mean Value Theorem isn’t any part of the grounds. It just looks so much like it ought to be.
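Here is what getting away with it looks like, as a small Python sketch. The sine function at x = 1 is an example chosen for illustration, because its exact derivative, the cosine, is known and can be compared against. The name `secant_slope` is made up for this example.

```python
import math

def secant_slope(f, x, h):
    """Slope of the line segment joining the points at x - h and x + h."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.0
exact = math.cos(x)   # the derivative of sin is cos

# Print each interval half-width, the estimate, and how far off it is.
for h in (0.5, 0.1, 0.01, 0.001):
    estimate = secant_slope(math.sin, x, h)
    print(h, estimate, abs(estimate - exact))
```

The error shrinks as the interval does, which is the behavior described above for a function that doesn’t flutter around much. Nothing in the code appeals to the Mean Value Theorem.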
I hope on a later Thursday to look at an integral-calculus form of the Mean Value Theorem.
I first learned of Cramer’s Rule in the way I expect most people do. It was an algebra course. I mean high school algebra. By high school algebra I mean you spend roughly eight hundred years learning ways to solve for x or to plot y versus x. Then take a pause for polar coordinates and matrices. Then you go back to finding both x and y.
Cramer’s Rule came up in the context of solving simultaneous equations. You have more than one variable. So x and y. Maybe z. Maybe even a w, before whoever set up the problem gives up and renames everything x1 and x2 and x62 and all that. You also have more than one equation. In fact, you have exactly as many equations as you have variables. Are there any sets of values those variables can have which make all those equations true simultaneously? Thus the imaginative name “simultaneous equations” or the search for “simultaneous solutions”.
If all the equations are linear then we can always say whether there’s simultaneous solutions. By “linear” we mean what we always mean in mathematics, which is, “something we can handle”. But more exactly it means the equations have x and y and whatever other variables only to the first power. No x-squared or square roots of y or tangents of z or anything. (The equations are also allowed to omit a variable. That is, if you have one equation with x, y, and z, and another with just x and z, and another with just y and z, that’s fine. We pretend the missing variable is there and just multiplied by zero, and proceed as before.) One way to find these solutions is with Cramer’s Rule.
Cramer’s Rule sets up some matrices based on the system of equations. If the system has two equations, it sets up three matrices. If the system has three equations, it sets up four matrices. If the system has twelve equations, it sets up thirteen matrices. You see the pattern here. And then you can take the determinant of each of these matrices. Dividing the determinant of one of these matrices by the determinant of another tells you what value of x makes all the equations true. Dividing the determinant of yet another matrix by that same determinant tells you what value of y makes all the equations true. And so on. The Rule tells you which determinants to use. It also says what it means if the determinant you want to divide by equals zero. It means there’s either no set of simultaneous solutions or there’s infinitely many solutions.
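Here is a minimal sketch of that recipe in Python, for anyone who would rather watch it run than read about it. The function names `det` and `cramer` and the sample system are my own, made up for illustration. I use exact fractions so that rounding doesn’t muddy anything.

```python
from fractions import Fraction

def det(m):
    """Determinant by cofactor expansion along the first row (fine for tiny matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for col in range(n):
        # Minor: drop row 0 and the current column.
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total

def cramer(a, b):
    """Solve the square system a x = b; return None when det(a) is zero."""
    d = det(a)
    if d == 0:
        return None  # no solutions, or infinitely many
    solution = []
    for k in range(len(a)):
        # Replace column k of a with the right-hand side b.
        a_k = [row[:k] + [b[i]] + row[k + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(a_k), d))
    return solution

# Hypothetical system: 2x + y = 5 and x - y = 1.
a = [[2, 1], [1, -1]]
b = [5, 1]
print(cramer(a, b))  # [Fraction(2, 1), Fraction(1, 1)], that is, x = 2 and y = 1
```

Two equations, three determinants: one for the original matrix and one for each variable. Every answer is divided by the same determinant, the one for the original matrix.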
This gets dropped on us students in the vain effort to convince us knowing how to calculate determinants is worth it. It’s not that determinants aren’t worth knowing. It’s just that they don’t seem to tell us anything we care about. Not until we get into mappings and calculus and differential equations and other mathematics-major stuff. We never see it in high school.
And the hard part of determinants is that for all the cool stuff they tell us, they take forever to calculate. The determinant for a matrix with two rows and two columns isn’t bad. Three rows and three columns is getting bad. Four rows and four columns is awful. The determinant for a matrix with five rows and five columns you only ever calculate if you’ve made your teacher extremely cross with you.
So there’s the genius and the first problem with Cramer’s Rule. It takes a lot of calculating. Make any errors along the way with the calculation and your work is wrong. And worse, it won’t be wrong in an obvious way. You can find the error only by going over every single step and hoping to catch the spot where you, somehow, got 36 times -7 minus 21 times -8 wrong.
The second problem is nobody in high school algebra mentions why systems of linear equations should be interesting to solve. Oh, maybe they’ll explain how this is the work you do to figure out where two straight lines intersect. But that just shifts the “and we care because … ?” problem back one step. Later on we might come to understand the lines represent cases where something we’re interested in is true, or where it changes from true to false.
This sort of simultaneous-solution problem turns up naturally in optimization problems. These are problems where you try to find a maximum subject to some constraints. Or find a minimum. Maximums and minimums are the same thing when you think about them long enough. If all the constraints can be satisfied at once and you get a maximum (or minimum, whatever), great! If they can’t … Well, you can study how close it’s possible to get, and what happens if you loosen one or more constraints. That’s worth knowing about.
The third problem with Cramer’s Rule is that, as a method, it kind of sucks. We can be convinced that simultaneous linear equations are worth solving, or at least that we have to solve them to get out of High School Algebra. And we have computers. They can grind away and work out thirteen determinants of twelve-row-by-twelve-column matrices. They might even get an answer back before the end of the term. (The amount of work needed for a determinant grows scary fast as the matrix gets bigger.) But all that work might be meaningless.
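To put a number on “scary fast”, here is a rough count. I’m assuming the determinant is worked out by cofactor expansion along a row, the way high school algebra usually teaches it. The function name is mine, and the count covers multiplications only.

```python
def cofactor_multiplications(n):
    """Multiplications used by first-row cofactor expansion of an n-by-n determinant."""
    if n == 1:
        return 0
    # n minors of size n - 1, plus one multiplication to scale each minor.
    return n * (cofactor_multiplications(n - 1) + 1)

for size in (2, 3, 4, 5, 12):
    print(size, cofactor_multiplications(size))
```

The two-by-two case needs 2 multiplications and the five-by-five needs 205. By twelve rows the count is in the hundreds of millions. Other ways of computing a determinant do far better, but this is the method that makes teachers’ crossness felt.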
The trouble is that Cramer’s Rule is numerically unstable. Before I even explain what that is you already sense it’s a bad thing. Think of all the good things in your life you’ve heard described as unstable. Fair enough. But here’s what we mean by numerically unstable.
Is 1/3 equal to 0.3333333? No, and we know that. But is it close enough? Sure, most of the time. Suppose we need a third of sixty million. 0.3333333 times 60,000,000 equals 19,999,998. That’s a little off of the correct 20,000,000. But I bet you wouldn’t even notice the difference if nobody pointed it out to you. Even if you did notice it you might write off the difference. “If we must, make up the difference out of petty cash”, you might declare, as if that were quite sensible in the context.
And that’s so because this multiplication is numerically stable. Make a small error in either term and you get a proportional error in the result. A small mistake will — well, maybe it won’t stay small, necessarily. But it’ll not grow too fast too quickly.
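Here’s a small check of that claim, as a sketch. I use Python’s exact fractions so the only error anywhere is the one we put in on purpose, writing 0.3333333 for one-third. The helper name `relative_error` is my own.

```python
from fractions import Fraction

exact_third = Fraction(1, 3)
approx_third = Fraction(3333333, 10_000_000)  # exactly 0.3333333

def relative_error(approx, exact):
    """Size of the error as a fraction of the true value."""
    return abs(approx - exact) / abs(exact)

input_error = relative_error(approx_third, exact_third)
output_error = relative_error(approx_third * 60_000_000, exact_third * 60_000_000)

print(approx_third * 60_000_000)     # 19999998
print(input_error, output_error)     # 1/10000000 1/10000000
```

The error going in is one part in ten million. The error coming out is one part in ten million. That’s the proportional behavior we like.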
So now you know intuitively what an unstable calculation is. This is one in which a small error doesn’t necessarily stay proportionally small. It might grow huge, arbitrarily huge, and in few calculations. So your answer might be computed just fine, but actually be meaningless.
This isn’t because of a flaw in the computer per se. That is, it’s working as designed. It’s just that we might need, effectively, infinitely many digits of precision for the result to be correct. You see where there may be problems achieving that.
Cramer’s Rule isn’t guaranteed to be nonsense, and that’s a relief. But it is vulnerable to this. You can set up problems that look harmless but which the computer can’t do. And that’s surely the worst of all worlds, since we wouldn’t bother calculating them numerically if it weren’t too hard to do by hand.
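Here is one such harmless-looking problem, as a sketch. The matrix is a hypothetical one I picked because its two rows are nearly, but not quite, multiples of each other. The code compares the determinant done with exact fractions against the same formula done in ordinary floating point.

```python
from fractions import Fraction

def det2(a, b, c, d):
    """Determinant of the 2-by-2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

# Hypothetical matrix: rows are almost, but not exactly, proportional.
entries = (10**8 + 1, 10**8, 10**8, 10**8 - 1)

exact_det = det2(*(Fraction(e) for e in entries))
float_det = det2(*(float(e) for e in entries))

print(exact_det)  # -1
print(float_det)  # not -1.0: the products are too large to keep their last digit
```

The true determinant is -1, so the system has one perfectly good solution. In double precision I’d expect the computed determinant to come out as zero, which Cramer’s Rule would read as “no solution, or infinitely many”. Nothing malfunctioned. The products just needed more digits than the machine carries.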
I don’t want to get too down on Cramer’s Rule. It’s not like the numerical instability hurts every problem you might use it on. And you can, at the cost of some more work, detect whether a particular set of equations will have instabilities. That requires a lot of calculation but if we have the computer to do the work fine. Let it. And a computer can limit its numerical instabilities if it can do symbolic manipulations. That is, if it can use the idea of “one-third” rather than 0.3333333. The software package Mathematica, for example, does symbolic manipulations very well. You can shed many numerical-instability problems, although you gain the problem of paying for a copy of Mathematica.
If you just care about, or just need, one of the variables then what the heck. Cramer’s Rule lets you solve for just one or just some of the variables. That seems like a niche application to me, but it is there.
And the Rule re-emerges in pure analysis, where numerical instability doesn’t matter. When we look to differential equations, for example, we often find solutions are combinations of several independent component functions. Bases, in fact. Testing whether we have found independent bases can be done through a thing called the Wronskian. That’s a way that Cramer’s Rule appears in differential equations.
Wikipedia also asserts the use of Cramer’s Rule in differential geometry. I believe that’s a true statement, and that it will be reflected in many mechanics problems. In these we can use our knowledge that, say, energy and angular momentum of a system are constant values to tell us something of how positions and velocities depend on each other. But I admit I’m not well-read in differential geometry. That’s something which has indeed caused me pain in my scholarly life. I don’t know whether differential geometers thank Cramer’s Rule for this insight or whether they’re just glad to have got all that out of the way. (See the above Wikipedia Editors quarrel.)
I admit for all this talk about Cramer’s Rule I haven’t said what it is. Not in enough detail to pass your high school algebra class. That’s all right. It’s easy to find. MathWorld has the rule in pretty simple form. MathWorld does forget to define what it means by the vector d. (It’s the vector with components d1, d2, et cetera.) But that’s enough technical detail. If you need to calculate something using it, you can probably look closer at the problem and see if you can do it another way instead. Or you’re in high school algebra and just have to slog through it. It’s all right. Eventually you can put x and y aside and do geometry.
Last week’s Reading The Comics was a bunch of Gocomics.com strips. And I don’t feel the need to post the images for those, since they’re reasonably stable links. Today’s is also a bunch of Gocomics.com strips. I know how every how-to-bring-in-readers post ever says you should include images. Maybe I will commission someone to do some icons. It couldn’t hurt.
Someone looking close at the title, with responsible eye protection, might notice it’s dated the 17th, a day this is not. There haven’t been many mathematically-themed comic strips since the 17th is all. And I’m thinking to try out, at least for a while, making the day on which a Reading the Comics post is issued regular. Maybe Monday. This might mean there are some long and some short posts, but being a bit more scheduled might help my writing.
Mark Anderson’s Andertoons for the 14th is the charting joke for this essay. Also the Mark Anderson joke for this essay. I was all ready to start explaining ways that the entropy of something can decrease. The easiest way is by expending energy, which we can see as just increasing entropy somewhere else in the universe. The one requiring the most patience is simply waiting: entropy almost always increases, or at least doesn’t decrease. But “almost always” isn’t the same as “always”. But I have to pass. I suspect Anderson drew the chart going down because of the sense of entropy being a winding-down of useful stuff. Or because of down having connotations of failure, and the increase of entropy suggesting the failing of the universe. And we can also read this as a further joke: things are falling apart so badly that even entropy isn’t working like it ought. Anderson might not have meant for a joke that sophisticated, but if he wants to say he did I won’t argue it.
Scott Adams’s Dilbert Classics for the 14th reprinted the comic of the 20th of March, 1993. I admit I do this sort of compulsive “change-simplifying” myself when paying. It’s easy to do if you have committed to memory pairs of numbers separated by five: 0 and 5, 1 and 6, 2 and 7, and so on. So if I get a bill for (say) $4.18, I would look for whether I have three cents in change. If I have, have I got 23 cents? That would give me back a nickel. 43 cents would give me back a quarter in change. And a quarter is great because I can use that for pinball.
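For anyone who wants to see the change-simplifying arithmetic laid out, here is a small sketch. It assumes United States coins and a cashier who hands back the fewest pieces, largest first. The function name `coins_for` is mine.

```python
def coins_for(cents, denominations=(100, 25, 10, 5, 1)):
    """Greedy breakdown of an amount in cents into US dollars and coins."""
    pieces = {}
    for d in denominations:
        count, cents = divmod(cents, d)
        if count:
            pieces[d] = count
    return pieces

bill = 418  # the $4.18 charge, in cents
for tendered in (500, 503, 523, 543):
    change = tendered - bill
    print(tendered, change, coins_for(change))
```

Handing over a plain $5.00 gets back six coins. Adding the three cents trims that to three quarters and a dime, the 23 cents to a dollar and a nickel, and the 43 cents to a dollar and the pinball quarter.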
Sometimes the person at the cash register doesn’t want a ridiculous bunch of change. I don’t blame them. It’s easy to suppose that someone who’s given you $5.03 for a $4.18 charge misunderstood what the bill was. Some folks will take this as a chance to complain mightily about how kids don’t learn even the basics of mathematics anymore and the world is doomed because the young will follow their job training and let machines that are vastly better at arithmetic than they are do arithmetic. This is probably what Adams was thinking, since, well, look at the clerk’s thought balloon in the final panel.
But consider this: why would Dilbert have handed over $7.14? Or, specifically, how could he give $7.14 to the clerk but not have been able to give $2.14, which would make things easier on everybody? There’s no combination of bills — in United States or, so far as I’m aware, any major world currency — in which you can give seven dollars but not two dollars. He had to be handing over five dollars he was getting right back. The clerk would be right to suspect this. It looks like the start of a change scam, begun by giving a confusing amount of money.
Had Adams written it so that the charge was $6.89, and Dilbert “helpfully” gave $12.14, then Dilbert wouldn’t be needlessly confusing things.
Dave Whamond’s Reality Check for the 15th is that pirate-based find-x joke that feels like it should be going around Facebook, even though I don’t think it has been. I can’t say the combination of jokes quite makes logical sense, but I’m amused. It might be from the Reality Check squirrel in the corner.
Nate Fakes’s Break of Day for the 16th is the anthropomorphized shapes joke for this essay. It’s not the only shapes joke, though.
Rick Detorie’s One Big Happy rerun for the 17th is another shapes joke. Ruthie has strong ideas about what distinguishes a pyramid from a triangle. In this context I can’t say she’s wrong to assert what a pyramid is.
After the heavy pace of March and April I figure to take it easy and settle to about a three-a-week schedule around here. That doesn’t mean that Comic Strip Master Command wants things to be too slow for me. And this time they gave me more comics than usual that have expiring URLs. I don’t think I’ve had this many pictures to include in a long while.
Bill Whitehead’s Free Range for the 28th presents an equation-solving nightmare. From my experience, this would be … a great pain, yes. But it wouldn’t be a career-wrecking mess. Typically a problem that’s hard to solve is hard because you have no idea what to do. Given an expression, you’re allowed to do anything that doesn’t change its truth value. And many approaches might look promising without quite resolving to something useful. The real breakthrough is working out what approach should be used. For an astrophysics problem, there are some classes of key decisions to make. One class is what to include and what to omit in the model. Another class is what to approximate — and how — versus what to treat exactly. Another class is what sorts of substitutions and transformations turn the original expression into one that reveals what you want. Those are the hard parts, and those are unlikely to have been forgotten. Applying those may be tedious, and I don’t doubt it would be anguishing to have the finished work wiped out. But it wouldn’t set one back years either. It would just hurt.
Bill Holbrook’s On The Fastrack for the 29th continues the storyline about Fi giving her STEM talk. She is right, as I see it, in attributing drama and narrative to numbers. This is most easily seen in the sorts of finance and accounting mathematics which the character does. And the inevitable answer to “numbers are boring” (or “mathematics is boring”) is surely to show how they are about people. Even abstract mathematics is about things (some) people find interesting, and that must be about the people too.
Rick Detorie’s One Big Happy for the 16th is a confused-mathematics joke. Grandpa tosses off a New Math joke that’s reasonably age-appropriate too, which is always nice to see in a comic strip. I don’t know how seriously to take Ruthie’s assertion that a 100% means she only got at least half of the questions correct. It could be a cartoonist grumbling about how kids these days never learn anything, the same way every past generation of cartoonists has complained. But Ruthie is also the sort of perpetually-confused, perpetually-confusing character who would get the implications of a 100% on a test wrong. Or would state them weirdly, since yes, a 100% does imply getting at least half the test’s questions right.
Niklas Eriksson’s Carpe Diem for the 3rd uses the traditional board full of mathematical symbols as signifier of intelligence. There’s some interesting mixes of symbols here. The c², for example, isn’t wrong for mathematics. But it does evoke Einstein and physics. There’s the curious mix of the symbol π and the approximation 3.14. But then I’m not sure how we would get from any of this to a proposition like “whether we can survive without people”.
Bud Blake’s Tiger for the 3rd is a cute little kids-learning-to-count thing. I suppose it doesn’t really need to be here. But Punkinhead looks so cute wearing his tie dangling down onto the floor, the way kids wear their ties these days.
Tony Murphy’s It’s All About You for the 3rd name-drops algebra. I think what the author really wanted here was arithmetic, if the goal is to figure out the right time based on four clocks. They seem to be trying to do a simple arithmetic mean of the time on the four clocks, which is fair if we make some assumptions about how clocks drift away from the correct time. Mostly those assumptions are that the clocks all started right and are equally likely to drift backwards or forwards, and do that drifting at the same rate. If some clocks are more reliable than others, then, their claimed time should get more weight than the others. And something like that must be at work here. The mean of 7:56, 8:02, 8:07, and 8:13, uncorrected, is 8:04 and thirty seconds. That’s not close enough to 8:03 “and five-eighths” unless someone’s been calculating wrong, or supposing that 8:02 is more probably right than 8:13 is.
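The unweighted mean is quick to check. This is a sketch that treats each reading as minutes past midnight, which is safe here only because the four clocks all sit near eight o’clock and none straddles noon or midnight. The helper name `to_minutes` is mine.

```python
def to_minutes(clock):
    """Turn a reading like '8:02' into minutes past midnight."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)

readings = ["7:56", "8:02", "8:07", "8:13"]

# Work in whole seconds so the half minute survives the division.
mean_seconds = sum(to_minutes(r) for r in readings) * 60 // len(readings)
hours, rest = divmod(mean_seconds, 3600)
minutes, seconds = divmod(rest, 60)
print(f"{hours}:{minutes:02d}:{seconds:02d}")  # 8:04:30
```

Getting 8:03 and five-eighths instead would need unequal weights, with the earlier clocks trusted more than the later ones.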
I concede this isn’t a set of mathematically-themed comics that inspires deep discussions. That’s all right. It’s got three that I can give pictures for, which is important. Also it means I can wrap up April with another essay. This gives me two months in a row of posting something every day, and I’d have bet that couldn’t happen.
Ted Shearer’s Quincy for the 1st of March, 1977, rerun the 25th of April, is not actually a “mathematics is useless in the real world” comic strip. It’s more about the uselessness of any school stuff in the face of problems like the neighborhood bully. Arithmetic just fits on the blackboard efficiently. There’s some sadness in the setting. There’s also some lovely artwork, though, and it’s worth noticing it. The lines are nice and expressive, and the greyscale wash well-placed. It’s good to look at.
For dro-mo for the 26th, I admit I’m not sure what exactly is going on. I suppose it’s a contest to describe the most interesting geometric shape. I believe the fourth panel is meant to be a representation of the tesseract, the four-dimensional analog of the cube. This causes me to realize I don’t remember any illustrations of a five-dimensional hypercube. Wikipedia has a couple, but they’re a bit disappointing. They look like the four-dimensional cube with some more lines. Maybe it has some more flattering angles somewhere.
Bill Amend’s FoxTrot for the 26th (a rerun from the 3rd of May, 2005) poses a legitimate geometry problem. Amend likes to do this. It was one of the things that first attracted me to the comic strip, actually, that his mathematics or physics or computer science jokes were correct. “Determine the sum of the interior angles for an N-sided polygon” makes sense. The commenters at Gocomics.com are quick to say what the sum is. If there are N sides, the interior angles sum up to (N – 2) times 180 degrees. I believe the commenters misread the question. “Determine”, to me, implies explaining why the sum is given by that formula. That’s a more interesting question and I think still reasonable for a freshman in high school. I would do it by way of triangles.
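The triangle argument, in brief: pick one corner of a convex polygon and draw diagonals to every corner that isn’t its neighbor. That cuts the shape into N – 2 triangles, each contributing 180 degrees. Below is a sketch that checks the formula numerically. The pentagon’s coordinates are made up, and the function assumes a convex polygon with its corners listed in order.

```python
import math

def interior_angle_sum(vertices):
    """Sum of interior angles, in degrees, of a convex polygon given in order."""
    n = len(vertices)
    total = 0.0
    for i in range(n):
        px, py = vertices[i - 1]
        cx, cy = vertices[i]
        nx, ny = vertices[(i + 1) % n]
        # Angle at this corner between the edges running to its two neighbours.
        ux, uy = px - cx, py - cy
        vx, vy = nx - cx, ny - cy
        cos_angle = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
        total += math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
    return total

# Hypothetical convex pentagon.
pentagon = [(0, 0), (4, 0), (5, 3), (2, 5), (-1, 3)]
print(interior_angle_sum(pentagon), (len(pentagon) - 2) * 180)
```

The two numbers should agree up to floating-point fuzz. A check like this only confirms the formula, of course. The triangles are what explain it.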
David L Hoyt and Jeff Knurek’s Jumble for the 27th of April gives us another arithmetic puzzle. As often happens, you can solve the surprise-answer by looking hard at the cartoon and picking up the clues from there. And it gives us an anthropomorphic-numerals gag for this collection.
Bill Holbrook’s On The Fastrack for the 28th of April has the misanthropic Fi explain some of the glories of numbers. As she says, they can be reliable, consistent partners. If you have learned something about ‘6’, then it not only is true, it must be true, at least if we are using ‘6’ to mean the same thing. This is the sort of thing that transcends ordinary knowledge and that’s so wonderful about mathematics.
Fi describes ‘x’ and ‘y’ as “shifty little goobers”, which is a bit unfair. ‘x’ and ‘y’ are names we give to numbers when we don’t yet know what values they have, or when we don’t care what they have. We’ve settled on those names mostly in imitation of René Descartes. Trying to do without names is a mess. You can do it, but it’s rather like novels in which none of the characters has a name. The most skilled writers can carry that off. The rest of us make a horrid mess. So we give placeholder names. Before ‘x’ and ‘y’ mathematicians would use names like ‘the thing’ (well, ‘re’) or ‘the heap’. Anything that the quantity we talk about might measure. It’s done better that way.
Comic Strip Master Command slowed down the pace at which the newspaper comics were to talk mathematical subjects. All right, that’s their prerogative. But it leaves me here, at Thursday, with slightly too few comics for my tastes. On the other hand, if I don’t run with what I have, I might not have anything to post for the 31st of March, and it would be a shame to go this whole month with something posted every day only to spoil it on the 31st. This is a pretty juvenile reason to do a thing, so here we are. Enjoy, please.
Tom Thaves’s Frank and Ernest for the 25th of March is a students-grumbling joke. I’m not sure what to make of the argument “arithmetic might be education, but that algebra stuff is indoctrination”. I imagine it reflects the feeling that the rules of arithmetic are all these nice straightforward things, and then algebra’s rules seem a bewildering set of near-gibberish. I can understand people looking at the quadratic formula, being told it has something to do with parabolas and an axis, throwing up their hands, and declaring it all this crazy game they’ll never play.
What people are forgetting in this is that everything sounds like this crazy gibberish game at first. The confusion you felt when first trying to factor a quadratic polynomial? It’s the same confusion you felt when first doing long division. And when you first multiplied a three-digit by a two-digit number. And when you had to subtract with borrowing. It’s also the same confusion you have when you first hear the first European settlement of Manhattan was driven by the Netherlands’ war for independence from Spain. Learning is changing the baffling confusion of life into an understandable pattern.
Which is not to deny that we could do a better job motivating stuff. You have no idea how many drafts of the Dedekind Domain essay I threw out because there were just too many words describing conditions and not why any of them mattered. I’m lazy; I don’t like scrapping that much text. And I’m still not quite happy with Normal Groups.
Jef Mallett’s Frazz for the 27th is an easier joke to explain. It’s also one whose appeal I really understand. There is a compelling beauty to the notation and the symbols of higher mathematics. I remember when, as a kid, I peered at one of my parents’ calculus textbooks. The reference page of common integrals was enchanting. It wasn’t the only thing that drove me towards mathematics. But the aesthetic beauty is there.
And it’s not just mathematicians and mathematics-based fields that see it. The arts editor for my undergraduate school’s unread leftist weekly newspaper asked me to work out a problem, any problem, to include as graphic arts. I was happy to. (I was the managing editor for the paper at the time.) I even had a great problem, from the final exam in my freshman Classical Mechanics course. The problem was to derive the equivalent of Kepler’s Laws of Motion with a different force law. Instead of the inverse-square attraction of gravity we used the exponential-decay-style interactions of the weak force. It was a brilliant exam question, frankly, and made for a page of symbols that maybe nobody understood but that I’ll bet everyone thought pretty.
John Forgetta and L A Rose’s The Meaning of Lila for the 27th is probably a rerun. The strip mostly is, although a few new or updated comics are fit into the rotation. It’s an example of a census joke, in which you classify away the whole population of the world. I remember first seeing it, as a kid, in a church bulletin. That one worked out how the entire working population of the United States was actually only two people and that’s why you’re always so tired. You could probably use the logic of this sort of joke to teach Venn diagrams. The logic that produces a funny low count relies on counting people several times, once for each of many categories they might fit in.
Mark Anderson’s Andertoons for the 30th made me giggle. I suppose there’s an essay to be written about whether we need mathematics, and what we need it for. But wouldn’t that just take away from the fun of it?
The “subgroup” part of this is easy. Remember that a “group” means a collection of things and some operation that lets us combine them. We usually call that either addition or multiplication. We usually write it out like it’s multiplication. If a and b are things from the collection, we write “ab” to mean adding or multiplying them together. (If we had a ring, we’d have something like addition and something like multiplication, and we’d be able to do “a + b” or “ab” as needed.)
So with that in mind, the first thing you’d imagine a subgroup to be? That’s what it is. It’s a collection of things, all of which are in the original group, and that uses the same operation as the original group. For example, if the original group has a set that’s the whole numbers and the operation of addition, a subgroup would be the even numbers and the same old addition.
Now things will get much clearer if I have names. Let me use G to mean some group. This is a common generic name for a group. Let me use H as the name for a subgroup of G. This is a common generic name for a subgroup of G. You see how deeply we reach to find names for things. And we’ll still want names for elements inside groups. Those are almost always lowercase letters: a and b, for example. If we want to make clear it’s something from G’s set, we might use g. If we want to make clear it’s something from H’s set, we might use h.
I need to tax your imagination again. Suppose “g” is some element in G’s set. What would you imagine the symbol “gH” means? No, imagine something simpler.
Mathematicians call this “left-multiplying H by g”. What we mean is, take every single element h that’s in the set H, and find out what gh is. Then take all these products together. That’s the set “gH”. This might be a subgroup. It might not. No telling. Not without knowing what G is, what H is, what g is, and what the operation is. And we call it left-multiplying even if the operation is called addition or something else. It’s just easier to have a standard name even if the name doesn’t make perfect sense.
That we named something left-multiplying probably inspires a question. Is there right-multiplying? Yes, there is. We’d write that as “Hg”. And that means take every single element h that’s in the set H, and find out what hg is. Then take all these products together.
You see the subtle difference between left-multiplying and right-multiplying. In the one, you multiply everything in H on the left. In the other, you multiply everything in H on the right.
So. Take anything in G. Let me call that g. If it’s always, necessarily, true that the left-product, gH, is the same set as the right-product, Hg, then H is a normal subgroup of G.
The mistake mathematics majors make in doing this: we need the set gH to be the same as the set Hg. That is, the whole collection of products has to be the same for left-multiplying as right-multiplying. Nobody cares whether, for any particular thing h inside H, gh is the same as hg. It doesn’t matter. It’s whether the whole collection of things is the same that counts. I assume every mathematics major makes this mistake. I did, anyway.
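A small group makes all this concrete. Here’s a sketch using the six ways to rearrange three things, with rearrangements written as tuples and “multiplying” meaning do one and then the other. The names `compose`, `left_coset`, `right_coset`, and `is_normal` are my own, as are the particular subgroups `H` and `A`.

```python
from itertools import permutations

def compose(p, q):
    """The rearrangement 'p after q': apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

G = set(permutations(range(3)))          # all six rearrangements of three things
H = {(0, 1, 2), (1, 0, 2)}               # the identity and one swap
A = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}    # the identity and the two three-cycles

def left_coset(g, subgroup):
    return {compose(g, h) for h in subgroup}

def right_coset(g, subgroup):
    return {compose(h, g) for h in subgroup}

def is_normal(subgroup, group):
    """True when gH and Hg are the same set for every g in the group."""
    return all(left_coset(g, subgroup) == right_coset(g, subgroup) for g in group)

print(is_normal(H, G))  # False
print(is_normal(A, G))  # True

# The sets can match even when individual products do not.
g, a = (0, 2, 1), (1, 2, 0)
print(compose(g, a) == compose(a, g))            # False
print(left_coset(g, A) == right_coset(g, A))     # True
```

The last two lines are the mathematics-major mistake in miniature. The particular g and a don’t commute, and gA is the same set as Ag anyway.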
The natural thing to wonder here: how can the set gH ever not be the same as Hg? For that matter, how can a single product gh ever not be the same as hg? Do mathematicians just forget how multiplication works?
Technically speaking no, we don’t. We just want to be able to talk about operations where maybe the order does too matter. With ordinary regular-old-number addition and multiplication the order doesn’t matter. gh always equals hg. We say this “commutes”. And if the operation for a group commutes, then every subgroup is a normal subgroup.
But sometimes we’re interested in things that don’t commute. Or that we can’t assume commute. The example every algebra book uses for this is three-dimensional rotations. Set your algebra book down on a table. If you don’t have an algebra book you may use another one instead. I recommend Christopher Miller’s American Cornball: A Laffopedic Guide To The Formerly Funny. It’s a fine guide to all sorts of jokes that used to amuse and what was supposed to be amusing about them. If you don’t have a table then I don’t know what to suggest.
Spin the book clockwise on the table and then stand it up on the edge nearer you. Then try again. Put the book back where it started. Stand it up on the edge nearer you and then spin it clockwise on the table. The book faces a different way this time around. (If it doesn’t, you spun too much. Try again until you get the answer I said.)
Three-dimensional rotations like this form a group. The different ways you can turn something are the elements of its set. The operation between two rotations is just to do one and then the other, in order. But they don’t commute, not most of the time. So they can have a subgroup that isn’t normal.
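If you don’t have a book or a table, a computer will do. This sketch writes two quarter-turns as matrices of whole numbers and composes them in both orders. The names `spin` and `tip` are mine, and which way counts as clockwise is an arbitrary choice here. The disagreement between the two orders shows up either way.

```python
def matmul(a, b):
    """Product of two 3-by-3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Quarter turns as integer rotation matrices.
spin = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # turn about the vertical axis
tip = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]    # turn about the left-right axis

spin_then_tip = matmul(tip, spin)   # the later rotation goes on the left
tip_then_spin = matmul(spin, tip)

print(spin_then_tip)
print(tip_then_spin)
print(spin_then_tip == tip_then_spin)  # False
```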
You may believe me now that such things exist. Now you can move on to wondering why we should care.
Let me start by saying every group has at least two normal subgroups. Whatever your group G is, there’s a subgroup that’s made up just of the identity element and the group’s operation. The identity element is the thing that acts like 1 does for multiplication. You can multiply stuff by it and you get the same thing you started. The identity and the operator make a subgroup. And you’ll convince yourself that it’s a normal subgroup as soon as you write down g1 = 1g.
(Wait, you might ask! What if multiplying on the left has a different identity than multiplying on the right does? Great question. Very good insight. You’ve got a knack for asking good questions. If we have that then we’re working with a more exotic group-like mathematical object, so don’t worry.)
So the identity, ‘1’, makes a normal subgroup. Here’s another normal subgroup. The whole of G qualifies. (It’s OK if you feel uneasy. Think it over.)
So ‘1’ is a normal subgroup of G. G is a normal subgroup of G. They’re boring answers. We know them before we even know anything about G. But they qualify.
Does this sound familiar any? We have a thing. ‘1’ and the original thing subdivide it. It might be possible to subdivide it more, but maybe not.
Is this all … factoring?
Please here pretend I make a bunch of awkward faces while trying not to say either yes or no. But if H is a normal subgroup of G, then we can write something G/H, just like we might write 4/2, and that means something.
That G/H we call a quotient group. It’s a group, sure. As to what it is … well, let me go back to examples.
Let’s say that G is the set of whole numbers and the operation of ordinary old addition. And H is the set of whole numbers that are multiples of 4, again with addition. So the things in H are 0, 4, 8, 12, and so on. Also -4, -8, -12, and so on.
Suppose we pick things in G. And we use the group operation on the set of things in H. How many different sets can we get out of it? So for example we might pick the number 1 out of G. The set 1 + H is … well, list all the things that are in H, and add 1 to them. So that’s 1 + 0, 1 + 4, 1 + 8, 1 + 12, and 1 + -4, 1 + -8, 1 + -12, and so on. All told, it’s a bunch of numbers one more than a whole multiple of 4.
Or we might pick the number 7 out of G. The set 7 + H is 7 + 0, 7 + 4, 7 + 8, 7 + 12, and so on. It’s also got 7 + -4, 7 + -8, 7 + -12, and all that. These are all the numbers that are three more than a whole multiple of 4.
We might pick the number 8 out of G. This happens to be in H, but so what? The set 8 + H is going to be 8 + 0, 8 + 4, 8 + 8 … you know, these are all going to be multiples of 4 again. So 8 + H is just H. Some of these are simple.
How about the number 3? 3 + H is 3 + 0, 3 + 4, 3 + 8, and so on. The thing is, the collection of numbers you get by 3 + H is the same as the collection of numbers you get by 7 + H. Both 3 and 7 do the same thing when we add them to H.
Fiddle around with this and you realize there are only four possible different sets you get out of this. You can get 0 + H, 1 + H, 2 + H, or 3 + H. Any other numbers in G give you a set that looks exactly like one of those. So we can speak of 0, 1, 2, and 3 as being a new group, the “quotient group” that you get by G/H. (This looks more like remainders to me, too, but that’s the terminology we have.)
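If you’d rather let a computer do the fiddling, here is a minimal sketch in Python. It can’t hold all of H at once, so it assumes we only look at the numbers from -12 to 12; the function name `coset_in_window` and that window are my own inventions for illustration.

```python
def coset_in_window(a, modulus=4, low=-12, high=12):
    """The part of a + H that lands between low and high, inclusive.

    H is the set of whole multiples of `modulus`. A number n belongs
    to a + H exactly when n - a is a multiple of `modulus`.
    """
    return frozenset(n for n in range(low, high + 1) if (n - a) % modulus == 0)


# Try a spread of starting numbers from G and collect the distinct sets.
distinct = {coset_in_window(a) for a in range(-20, 21)}

print(len(distinct))                              # 4
print(coset_in_window(3) == coset_in_window(7))   # True
print(coset_in_window(8) == coset_in_window(0))   # True
print(sorted(coset_in_window(1)))                 # [-11, -7, -3, 1, 5, 9]
```

However many starting numbers you feed in, only four different sets ever come out, which is the whole content of the quotient group here.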
But we can do something like this with any group and any normal subgroup of that group. The normal subgroup gives us a way of picking out a representative set of the original group. That set shows off all the different ways we can manipulate the normal subgroup. It tells us things about the way the original group is put together.
Normal subgroups are not just “factors, but for groups”. They do give us a way to see groups as things built up of other groups. We can see structures in sets of things.
Lewis Carroll didn’t like the matrix. Well, Charles Dodgson, anyway. And it isn’t that he disliked matrices particularly. He believed it was a bad use of a word. “Surely,” he wrote, “[ matrix ] means rather the mould, or form, into which algebraical quantities may be introduced, than an actual assemblage of such quantities”. He might have had etymology on his side. The word meant the place where something was developed, the source of something else. History has outvoted him, and his preferred “block”. The first mathematicians to use the word “matrix” were interested in things derived from the matrix. So for them, the matrix was the source of something else.
What we mean by a matrix is a collection of some number of rows and columns. Inside each individual row and column is some mathematical entity. We call this an element. Elements are almost always real numbers. When they’re not real numbers they’re complex-valued numbers. (I’m sure somebody, somewhere has created matrices with something else as elements. You’ll never see these freaks.)
Matrices work a lot like vectors do. We can add them together. We can multiply them by real- or complex-valued numbers, called scalars. But we can do other things with them. We can define multiplication, at least sometimes. The definition looks like a lot of work, but it represents something useful that way. And for square matrices, ones with equal numbers of rows and columns, we can find other useful stuff. We give that stuff wonderful names like traces and determinants and eigenvalues and eigenvectors and such.
One of the big uses of matrices is to represent a mapping. A matrix can describe how points in a domain map to points in a range. Properly, a matrix made up of real numbers can only describe what are called linear mappings. These are ones that turn the domain into the range by stretching or squeezing down or rotating the whole domain the same amount. A mapping might follow different rules in different regions, but that’s all right. We can write a matrix that approximates the original mapping, at least in some areas. We do this in the same way, and for pretty much the same reason, we can approximate a real and complicated curve with a bunch of straight lines. Or the way we can approximate a complicated surface with a bunch of triangular plates.
We can compound mappings. That is, we can start with a domain and a mapping, and find the image of that domain. We can then use a mapping again and find the image of the image of that domain. The matrix that describes this mapping-of-a-mapping is the one you get by multiplying the matrix of the first mapping and the matrix of the second mapping together. This is why we define matrix multiplication the odd way we do. Mappings are that useful, and matrices are that tied to them.
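Here is a small sketch of that, in plain Python with lists of lists standing in for matrices. The two example mappings, a stretch and a quarter turn, are my own picks. I’m also assuming the common convention that points are columns and the matrix goes on the left, which means the second mapping’s matrix is written first in the product.

```python
def mat_vec(m, v):
    """Apply the matrix m to the point v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def mat_mul(a, b):
    """The usual row-times-column matrix product of a and b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


stretch = [[2, 0], [0, 1]]    # first mapping: double the first coordinate
rotate = [[0, -1], [1, 0]]    # second mapping: quarter turn counterclockwise

point = [3, 1]

# Do the first mapping, then the second mapping on the result.
step_by_step = mat_vec(rotate, mat_vec(stretch, point))

# Or multiply the matrices once and apply the product.
combined = mat_mul(rotate, stretch)
all_at_once = mat_vec(combined, point)

print(step_by_step)   # [-1, 6]
print(all_at_once)    # [-1, 6]
```

Swap the order, `mat_mul(stretch, rotate)`, and you get a different matrix. Stretching and then turning is not the same as turning and then stretching.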
I wrote about some of the uses of matrices in a Set Tour essay. That was based on a use of matrices in physics. We can describe the changing of a physical system with a mapping. And we can understand equilibriums, states where a system doesn’t change, by looking at the matrix that approximates what the mapping does near but not exactly on the equilibrium.
But there are other uses of matrices. Many of them have nothing to do with mappings or physical systems or anything. For example, we have graph theory. A graph, here, means a bunch of points, “vertices”, connected by curves, “edges”. Many interesting properties of graphs depend on how many other vertices each vertex is connected to. And this is well-represented by a matrix. Index your vertices. Then create a matrix. If vertex number 1 connects to vertex number 2, put a ‘1’ in the first row, second column. If vertex number 1 connects to vertex number 3, put a ‘1’ in the first row, third column. If vertex number 2 isn’t connected to vertex number 3, put a ‘0’ in the second row, third column. And so on.
We don’t have to use ones and zeroes. A “network” is a kind of graph where there’s some cost associated with each edge. We can put that cost, that number, into the matrix. Studying the matrix of a graph or network can tell us things that aren’t obvious from looking at the drawing.
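As a sketch of the recipe, take a made-up graph of four vertices. The edge list below is purely an illustration, and I’m assuming the edges run both ways, so the matrix comes out symmetric.

```python
# Hypothetical graph: vertex 1 connects to 2 and 3, vertex 3 connects to 4.
# Vertex 2 and vertex 3 are not connected to each other.
edges = [(1, 2), (1, 3), (3, 4)]
vertex_count = 4

adjacency = [[0] * vertex_count for _ in range(vertex_count)]
for a, b in edges:
    # Vertices are numbered from 1; Python lists are indexed from 0.
    adjacency[a - 1][b - 1] = 1
    adjacency[b - 1][a - 1] = 1   # assumption: an edge works in both directions

for row in adjacency:
    print(row)
# [0, 1, 1, 0]
# [1, 0, 0, 0]
# [1, 0, 0, 1]
# [0, 0, 1, 0]

# How many other vertices each vertex connects to: add up its row.
degrees = [sum(row) for row in adjacency]
print(degrees)   # [2, 1, 2, 1]
```

For a network you would store the cost of the edge in place of each ‘1’.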
I don’t believe I got any requests for a mathematics term starting ‘J’. I’m as surprised as you. Well, maybe less surprised. I’ve looked at the alphabetical index for Wolfram MathWorld and noticed its relative poverty for ‘J’. It’s not as bad as ‘X’ or ‘Y’, though. But it gives me room to pick a word of my own.
The Jacobian is named for Carl Gustav Jacob Jacobi, who lived in the first half of the 19th century. He’s renowned for work in mechanics, the study of mathematically modeling physics. He’s also renowned for matrices, rectangular grids of numbers which represent problems. There’s more, of course, but those are the points that bring me to the Jacobian I mean to talk about. There are other things named for Jacobi, including other things named “Jacobian”. But I mean to limit the focus to two, related, things.
I discussed mappings some while describing homomorphisms and isomorphisms. A mapping’s a relationship matching things in one set, the domain, to things in a second set, the range. The domain and the range can be anything at all. They can even be the same thing, if you like.
A very common domain is … space. Like, the thing you move around in. It’s a region full of points that are all some distance and some direction from one another. There’s almost always assumed to be multiple directions possible. We often call this “Euclidean space”. It’s the space that works like we expect for normal geometry. We might start with a two- or three-dimensional space. But it’s often convenient, especially for physics problems, to work with more dimensions. Four dimensions. Six dimensions. Incredibly huge numbers of dimensions. Honest, this often helps. It’s just harder to sketch out.
So we might for a problem need, say, 12-dimensional space. We can describe a point in that with an ordered set of twelve coordinates. Each describes how far you are from some standard reference point known as The Origin. If it doesn’t matter how many dimensions we’re working with, we call it an N-dimensional space. Or we use another letter if N is committed to something or other.
This is our stage. We are going to be interested in some N-dimensional Euclidean space. Let’s pretend N is 2; then our stage looks like the screen you’re reading now. We don’t need to pretend N is larger yet.
Our player is a mapping. It matches things in our N-dimensional space back to the same N-dimensional space. For example, maybe we have a mapping that takes the point with coordinates (3, 1) to the point (-3, -1). And it takes the point with coordinates (5.5, -2) to the point (-5.5, 2). And it takes the point with coordinates (-6, -π) to the point (6, π). You get the pattern. If we start from the point with coordinates (x, y) for some real numbers x and y, then the mapping gives us the point with coordinates (-x, -y).
One more step and then the play begins. Let’s not just think about a single point. Think about a whole region. If we look at the mapping of every point in that whole region, we get out … probably, some new region. We call this the “image” of the original region. With the mapping from the paragraph above, it’s easy to say what the image of a region is. It’ll look like the reflection in a corner mirror of the original region.
What if the mapping’s more complicated? What if we had a mapping that described how something was reflected in a cylindrical mirror? Or a mapping that describes how the points would move if they represent points of water flowing around a drain? — And that last explains why Jacobians appear in mathematical physics.
Many physics problems can be understood as describing how points that describe the system move in time. The dynamics of a system can be understood by how moving in time changes a region of starting conditions. A system might keep a region pretty much unchanged. Maybe it makes the region move, but it doesn’t change size or shape much. Or a system might change the region impressively. It might keep the area about the same, but stretch it out and fold it back, the way one might knead cookie dough.
The Jacobian, the one I’m interested in here, is a way of measuring these changes. The Jacobian matrix describes, for each point in the original domain, how a tiny change in one coordinate causes a change in the mapping’s coordinates. So if we have a mapping from an N-dimensional space to an N-dimensional space, there are going to be N times N values at work. Each one represents a different piece. How much does a tiny change in the first coordinate of the original point change the first coordinate of the mapping of the point? How much does a tiny change in the first coordinate of the original point change the second coordinate of the mapping of the point? How much does a tiny change in the first coordinate of the original point change the third coordinate of the mapping of the point? … how much does a tiny change in the second coordinate of the original point change the first coordinate of the mapping of the point? And on and on and now you know why mathematics majors are trained on Jacobians with two-by-two and three-by-three matrices. We do maybe a couple four-by-four matrices to remind us that we are born to suffer. We never actually work out bigger matrices. Life is just too short.
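A computer doesn’t mind the suffering. Here is a rough sketch that estimates the Jacobian matrix by literally nudging one coordinate at a time and watching what the mapping does. The function names and the size of the nudge are my own assumptions, and this is a numerical approximation, not the exact derivative a textbook would have you work out.

```python
def jacobian_matrix(mapping, point, step=1e-6):
    """Estimate the Jacobian matrix of `mapping` at `point`.

    Entry [i][j] approximates how much output coordinate i changes
    per unit of tiny change in input coordinate j.
    """
    base = mapping(point)
    columns = []
    for j in range(len(point)):
        nudged = list(point)
        nudged[j] += step              # tiny change in coordinate j only
        moved = mapping(nudged)
        columns.append([(moved[i] - base[i]) / step for i in range(len(base))])
    # Each nudge gave us one column; rearrange so rows are output coordinates.
    return [[columns[j][i] for j in range(len(point))] for i in range(len(base))]


def flip(p):
    """The mapping from earlier: (x, y) goes to (-x, -y)."""
    x, y = p
    return [-x, -y]


J = jacobian_matrix(flip, [3.0, 1.0])
for row in J:
    print(row)
# Each row should come out very close to [-1, 0] and [0, -1].
```

For a two-dimensional mapping that is four numbers. For a twelve-dimensional one it would be 144, which is the bookkeeping nobody wants to do by hand.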
(I’ve been talking, by the way, about the mapping of an N-dimensional space to an N-dimensional space. This is because we’re about to get to something that requires it. But we can write a matrix like this for a mapping of an N-dimensional space to an M-dimensional space, a different-sized space. It has uses. Let’s not worry about that.)
If you have a square matrix, one that has as many rows as columns, then you can calculate something named the determinant. This involves a lot of work. It takes even more work the bigger the matrix is. This is why mathematics majors learn to calculate determinants on two-by-two and three-by-three matrices. We do a couple four-by-four matrices and maybe one five-by-five to again remind us about suffering.
Anyway, by calculating the determinant of a Jacobian matrix, we get the Jacobian determinant. Finally we have something simple. The Jacobian determinant says how the area of a region changes in the mapping. Suppose the Jacobian determinant at a point is 2. Then a small region containing that point has an image with twice the original area. Suppose the Jacobian determinant is 0.8. Then a small region containing that point has an image with area 0.8 times the original area. Suppose the Jacobian determinant is -1. Then —
Well, what would you imagine?
If the Jacobian determinant is -1, then a small region around that point gets mapped to something with the same area. What changes is called the handedness. The mapping doesn’t just stretch or squash the region, but it also flips it along at least one dimension. The Jacobian determinant can tell us that.
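One way to check this for yourself: take a unit square, push its corners through a mapping, and measure the area of what comes out. The sketch below assumes simple linear mappings, where the Jacobian matrix is the same at every point, and uses the “shoelace” formula for area, which comes out negative when the corners end up running clockwise instead of counterclockwise. The three example matrices are my own choices.

```python
def det2(m):
    """Determinant of a two-by-two matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def apply(m, p):
    """Apply the two-by-two matrix m to the point p."""
    return (m[0][0] * p[0] + m[0][1] * p[1],
            m[1][0] * p[0] + m[1][1] * p[1])


def signed_area(polygon):
    """Shoelace formula: positive for counterclockwise corners, negative for clockwise."""
    total = 0
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]   # counterclockwise, area 1

examples = {
    "double_x": [[2, 0], [0, 1]],      # stretch one direction
    "mirror_x": [[-1, 0], [0, 1]],     # flip one direction
    "half_turn": [[-1, 0], [0, -1]],   # (x, y) to (-x, -y)
}

for name, matrix in examples.items():
    image = [apply(matrix, corner) for corner in square]
    print(name, det2(matrix), signed_area(image))
# double_x 2 2.0
# mirror_x -1 -1.0
# half_turn 1 1.0
```

Flipping one direction gives -1: same area, opposite handedness. In two dimensions, flipping both directions at once is a half turn, and the determinant is back to +1.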
So the Jacobian matrix, and the Jacobian determinant, are ways to describe how mappings change areas. Mathematicians will often call either of them just “the Jacobian”. We trust context to make clear what we mean. Either one is a way of describing how mappings change space: how they expand or contract, how they rotate, how they reflect spaces. Some fields of mathematics, including a surprising amount of the study of physics, are about studying how space changes.