My 2019 Mathematics A To Z: Category Theory

Today’s A To Z term is category theory. It was suggested by aajohannas, on Twitter as @aajohannas. It’s a topic I have long wanted to know better, and that every year or so I make a new attempt to try learning without ever feeling like I’ve made progress.

The language of it is beautiful, though. Much of its work is attractive just to see, too, as the field’s developed notation that could be presented as visual art. Much of mathematics could be visual art, yes, but these are art you can almost create in ASCII. It’s amazing.

Cartoony banner illustration of a coati, a raccoon-like animal, flying a kite in the clear autumn sky. A skywriting plane has written 'MATHEMATIC A TO Z'; the kite, with the letter 'S' on it to make the word 'MATHEMATICS'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Category Theory.

What is the most important part of mathematics? Well, the part you wish you understood, yes. But what’s the fundamental part? The piece of mathematics that we could feel most sure an alien intelligence would agree is mathematics?

There’s idle curiosity behind this, yes. It’s a question implicit in some ideals of the Enlightenment. The notion that we should be able to find truths that all beings capable of reason would agree upon, and find themselves. Mathematics seems particularly good for this. If we have something proven by deductive logic from clearly stated axioms and definitions, then we know something true.

There’s practicality too. In the late 19th and early 20th century (western) mathematics tried to find logically rigorous foundations. You might have thought we always had that and, uh, not so much. It turns out a complete rigorous logical proof of even simple stuff takes forever. Mathematicians compose enough of an argument to convince other mathematicians that we could fill in the details. But we still trusted there was a rigorous foundation. The question is, what is it?

A great candidate for this was set theory. This was a great breakthrough. The basic modern idea of set theory builds on bunches of things, called elements. And collections of those things, called sets. And we build rigorous ideas of what it means for elements to be members of sets. This doesn’t sound like much. Powerful ideas never do.

I don’t know that everyone’s intuition is like this. But my gut wants to think a “powerful” result is, like, a great rocket. Some enormous and prominent and mighty thing that blasts through a problem like gravity or an atmosphere. This is almost the opposite of what mathematics means by “powerful”. A rocket is a fiddly, delicate thing. It has millions of components made to tight specifications. It can only launch when lots of conditions are exactly right. A theorem that gives a great result, but has a long list of prerequisites and lemmas that feed into it resembles this. A powerful mathematical result is more like the gravity that the rocket overcomes. It tends to suppose little about the situation, and so it provides results that are applicable over the whole field. Or over a wide field, or a surprising breadth of topics. And, really, mighty as a rocket might be, the gravity it fights is moreso.

So set theory is powerful. It can explain many things. Most amazing is that we can represent arithmetic with it. At least we can get to integers, and all that we do with integers, and that in not too much work. It makes sense that mathematicians latched onto this as critical. It fueled much of the thinking behind the New Math, the infamous attempted United States educational reform of the 1960s and 70s. I grew up in the tail end of this, learning unions and intersections and complements along with times tables and delighted in it.

But even before New Math became a coherent idea there was a better idea. Emmy Noether, mentioned yesterday, is not a part of it. But a part of her insight into physics, and into group theory, was an understanding of structure. That important mathematics results from considering what we can do with sets of things. And what things we can do that produce invariants, things that don’t change. Saunders Mac Lane, one of Noether’s students, and Samuel Eilenberg in the 1940s used what looks to me like this principle. They organized category theory.

Category Theory looks at first like set theory only made terrifying. I’m not very comfortable with it myself. It’s an abstract field, and I’m more at home with stuff I can write a quick Octave program to double-check. Many results in category theory are described, or even proved, with beautiful directed-graph lattices. They show how things relate to one another. This is definitely the field to study if you like drawing arrows.

Part of the statement of a theorem, which includes three squares showing relations between objects labelled A and A', D(I) and D'(I), lim D and lim D'.
I admit I don’t know what point is being proved here. But the square structure illustrated here, and the way successive squares will, say, replace lim D with A, or pI with fI, is almost a sestina. From page 145 of Dr Tom Leinster’s Basic Category Theory.

Just as set theory does, category theory starts with things, called objects. And these objects get piled together into collections. And then there’s another collection of relationships between these collections. These relationships you call maps or morphisms or arrows, based on whatever the first book you kind of understood called them. I’m partial to “maps”. And then we have rules by which these maps compose, that is, where two maps reduce to a single map. This bundle of things — the objects, the collections, and the maps — is a category.

These objects can start out looking like elements, and the collections like sets, and the maps like functions. This gives me, at least, a patch of ground where I feel like I know what I’m doing. But what we need of things to be objects and collections and maps is very little. The result is great power. We can describe set theory in the language of categories. So we can describe arithmetic in category theory. There’s a bit of a hike from the start of category theory to, like, knowing what 18 plus 7 is.

But we’re not bound to anything that concrete. We can describe, for example, groups as categories. This gives us results like when we can factor polynomials. Or whether compass and straightedge can trisect an arbitrary angle. (There’s some work behind this too.) We can describe vector spaces as categories. Heady results like the idea that one function might be orthogonal to another lurk within this field. Manifolds, spaces that work like normal space, are part of the field. So are topological spaces, which tell us about shapes.

If you aren’t yet dizzy then consider this. A category is itself an object. So we can define maps between categories. These we call functors. Which themselves have use in computer science, as a way some kinds of software can be programmed well. More, maps themselves are objects. We can define mappings between maps. These we call natural transformations. Which are the things that Eilenberg and Mac Lane were particularly interested in, to start with. Category theory grew in part out of needing a better understanding of natural transformations.

I do not know what to recommend for people who want to really learn category theory. I haven’t found the textbook or the blog that makes me feel like I am mastering the subject. Writing this essay has introduced me to Dr Tom Leinster’s Basic Category Theory, which I’ve enjoyed skimming. Exercise 3.3.1, for example, seems like exactly the sort of problem I would pose if I knew category theory well enough to write a book on it.

Is this, finally, the mathematics we could be sure an alien would recognize? I’m skeptical, but I always am. It seems to me we build mathematics on arithmetic and geometry. Category theory, seeming to offer explanations of both, is a natural foundation for that. But we are evolved to see the world in terms of number and shape. Of course we see arithmetic and geometry as mathematics. Can we count on every being capable of reason seeing the same things as important? … I admit I can’t imagine a being we might communicate with not recognizing both. But this may say more about the limits of my imagination than about the limits of what could be mathematics.

Thanks for reading. All the Fall 2019 A To Z posts should be at this link. I hope to have the second essay of the week posted Thursday. This year’s and all past A To Z sequences should be at this link. And if you have thoughts about other topics I might cover, please offer suggestions for the letters F through H.


One Way To Get Your Own Theorem

While doing some research to better grouse about Ken Keeler’s Futurama theorem I ran across an amusing site I hadn’t known about. It is Theory Mine, a site that allows you to hire — and name — a genuine, mathematically sound theorem. The spirit of the thing is akin to that scam in which you “name” a star. But this is more legitimate in that, you know, it’s got any legitimacy. For this, you’re buying naming rights from someone who has any rights to sell. By convention the discoverer of a theorem can name it whatever she wishes, and there’s one chance in ten that anyone else will use the name.

I haven’t used it. I’ve made my own theorems, thanks, and could put them on a coffee mug or t-shirt if I wished to make a particularly boring t-shirt. But I’m delighted by the scheme. They don’t have a team of freelance mathematicians whipping up stuff and hoping it isn’t already known. Not for the kinds of prices they charge. This should inspire the question: well, where do the theorems come from?

The scheme uses an automated reasoning system. I don’t know the details of how it works, but I can think of a system by which this might work. It goes back to the Crisis of Foundations, the time in the late 19th/early 20th century when logicians got very worried that we were still letting physical intuitions and unstated assumptions stay in our mathematics. One solution: turn everything into symbols, icons with no connotations. The axioms of mathematics become a couple basic symbols. The laws of logical deduction become things we can do with the symbols, converting one line of symbols into a related other line. Every line we get is a theorem. And we know it’s correct. To write out the theorem in this scheme is to write out its proof, and to feel like you’re touching some deep magic. And there’s no human frailties in the system, besides the thrill of reeling off True Names like that.

You may not be sure what this works like. It may help to compare it to a slightly-fun number coding scheme. I mean the one where you start with a number, like, ‘1’. Then you write down how many times and which digit appears. There’s a single ‘1’ in that string, so you would write down ’11’. And repeat: In ’11’ there’s a sequence of two ‘1’s, so you would write down ’21’. And repeat: there’s a single ‘2’ and a single ‘1’, so you then write down ‘1211’. And again: there’s a single ‘1’, a single ‘2’, and then a double ‘1’, so you next write ‘111221’. And so on until you get bored or die.

When we do this for mathematics we start with a couple different basic units. And we also start with several things we may do at most symbols. So there’s rarely a single line that follows from the previous. There’s an ever-expanding tree of known truths. This may stave off boredom but I make no promises about death.

The result of this is pages and pages that look like Ancient High Martian. I don’t feel the thrill of doing this. Some people do, though. And as recreational mathematics goes I suppose it’s at least as good as sudoku. Anyway, this kind of project, rewarding indefatigability and thoroughness, is perfect for automation anyway. Let the computer work out all the things we can prove are true.

If I’m reading Theory Mine’s description correctly they seem to be doing something roughly like this. If they’re not, well, you go ahead and make your own rival service using my paragraphs as your system. All I ask is one penny for every use of L’Hôpital’s Rule, a theorem named for Guillaume de l’Hôpital and discovered by Johann Bernoulli. (I have heard that Bernoulli was paid for his work, but I do not know that’s true. I have now explained why, if we suppose that to be true, my prior sentence is a very funny joke and you should at minimum chuckle.)

This should inspire the question: what do we need mathematicians for, then? It’s for the same reason we need writers, when it would be possible to automate the composing of sentences that satisfy the rules of English grammar. I mean if there were rules to English grammar. That we can identify a theorem that’s true does not mean it has even the slightest interest to anyone, ever. There’s much more that could be known than that we could ever care about.

You can see this in Theory Mine’s example of Quentin’s Theorem. Quentin’s Theorem is about an operation you can do on a set whose elements consist of the non-negative whole numbers with a separate value, which they call color, attached. You can add these colored-numbers together according to some particular rules about how the values and the colors add. The order of this addition normally matters: blue two plus green three isn’t the same as green three plus blue two. Quentin’s Theorem finds cases where, if you add enough colored-numbers together, the order doesn’t matter. I know. I am also staggered by how useful this fact promises to be.

Yeah, maybe there is some use. I don’t know what it is. If anyone’s going to find the use it’ll be a mathematician. Or a physicist who’s found some bizarre quark properties she wants to codify. Anyway, if what you’re interested in is “what can you do to make a vertical column stable?” then the automatic proof generator isn’t helping you at all. Not without a lot of work put in to guiding it. So we can skip the hard work of finding and proving theorems, if we can do the hard work of figuring out where to look for these theorems instead. Always the way.

You also may wonder how we know the computer is doing its work right. It’s possible to write software that is logically proven to be correct. That is, the software can’t produce anything but the designed behavior. We don’t usually write software this way. It’s harder to write, because you have to actually design your software’s behavior. And we can get away without doing it. Usually there’s some human overseeing the results who can say what to do if the software seems to be going wrong. Advocates of logically-proven software point out that we’re getting more software, often passing results on to other programs. This can turn a bug in one program into a bug in the whole world faster than a responsible human can say, “I dunno. Did you try turning it off and on again?” I’d like to think we could get more logically-proven software. But I also fear I couldn’t write software that sound and, you know, mathematics blogging isn’t earning me enough to eat on.

Also, yes, even proven software will malfunction if the hardware the computer’s on malfunctions. That’s rare, but does happen. Fortunately, it’s possible to automate the checking of a proof, and that’s easier to do than creating a proof in the first place. We just have to prove we have the proof-checker working. Certainty would be a nice thing if we ever got it, I suppose.

The End 2016 Mathematics A To Z: Zermelo-Fraenkel Axioms

gaurish gave me a choice for the Z-term to finish off the End 2016 A To Z. I appreciate it. I’m picking the more abstract thing because I’m not sure that I can explain zero briefly. The foundations of mathematics are a lot easier.

Zermelo-Fraenkel Axioms

I remember the look on my father’s face when I asked if he’d tell me what he knew about sets. He misheard what I was asking about. When we had that straightened out my father admitted that he didn’t know anything particular. I thanked him and went off disappointed. In hindsight, I kind of understand why everyone treated me like that in middle school.

My father’s always quick to dismiss how much mathematics he knows, or could understand. It’s a common habit. But in this case he was probably right. I knew a bit about set theory as a kid because I came to mathematics late in the “New Math” wave. Sets were seen as fundamental to why mathematics worked without being so exotic that kids couldn’t understand them. Perhaps so; both my love and I delighted in what we got of set theory as kids. But if you grew up before that stuff was popular you probably had a vague, intuitive, and imprecise idea of what sets were. Mathematicians had only a vague, intuitive, and imprecise idea of what sets were through to the late 19th century.

And then came what mathematics majors hear of as the Crisis of Foundations. (Or a similar name, like Foundational Crisis. I suspect there are dialect differences here.) It reflected mathematics taking seriously one of its ideals: that everything in it could be deduced from clearly stated axioms and definitions using logically rigorous arguments. As often happens, taking one’s ideals seriously produces great turmoil and strife.

Before about 1900 we could get away with saying that a set was a bunch of things which all satisfied some description. That’s how I would describe it to a new acquaintance if I didn’t want to be treated like I was in middle school. The definition is fine if we don’t look at it too hard. “The set of all roots of this polynomial”. “The set of all rectangles with area 2”. “The set of all animals with four-fingered front paws”. “The set of all houses in Central New Jersey that are yellow”. That’s all fine.

And then if we try to be logically rigorous we get problems. We always did, though. They’re embodied by ancient jokes like the person from Crete who declared that all Cretans always lie; is the statement true? Or the slightly less ancient joke about the barber who shaves only the men who do not shave themselves; does he shave himself? If not jokes these should at least be puzzles faced in fairy-tale quests. Logicians dressed this up some. Bertrand Russell gave us the quite respectable “The set consisting of all sets which are not members of themselves”, and asked us to stare hard into that set. To this we have only one logical response, which is to shout, “Look at that big, distracting thing!” and run away. This satisfies the problem only for a while.

The while ended in — well, that took a while too. But between 1908 and the early 1920s Ernst Zermelo, Abraham Fraenkel, and Thoralf Skolem paused from arguing whose name would also be the best indie rock band name long enough to put set theory right. Their structure is known as Zermelo-Fraenkel Set Theory, or ZF. It gives us a reliable base for set theory that avoids any contradictions or catastrophic pitfalls. Or does so far as we have found in a century of work.

It’s built on a set of axioms, of course. Most of them are uncontroversial, things like declaring two sets are equivalent if they have the same elements. Declaring that the union of sets is itself a set. Obvious, sure, but it’s the obvious things that we have to make axioms. Maybe you could start an argument about whether we should just assume there exists some infinitely large set. But if we’re aware sets probably have something to teach us about numbers, and that numbers can get infinitely large, then it seems fair to suppose that there must be some infinitely large set. The axioms that aren’t simple obvious things like that are too useful to do without. They assume stuff like that no set is an element of itself. Or that every set has a “power set”, a new set comprised of all the subsets of the original set. Good stuff to know.

There is one axiom that’s controversial. Not controversial the way Euclid’s Parallel Postulate was. That’s ugly one about lines crossing another line meeting on the same side they make angles smaller than something something or other. That axiom was controversial because it read so weird, so needlessly complicated. (It isn’t; it’s exactly as complicated as it must be. Or better, it’s as simple as it could possibly be and still be useful.) The controversial axiom of Zermelo-Fraenkel Set Theory is known as the Axiom of Choice. It says if we have a collection of mutually disjoint sets, each with at least one thing in them, then it’s possible to pick exactly one item from each of the sets.

It’s impossible to dispute this is what we have axioms for. It’s about something that feels like it should be obvious: we can always pick something from a set. How could this not be true?

If it is true, though, we get some unsavory conclusions. For example, it becomes possible to take a ball the size of an orange and slice it up. We slice using mathematical blades. They’re not halted by something as petty as the desire not to slice atoms down the middle. We can reassemble the pieces. Into two balls. And worse, it doesn’t require we do something like cut the orange into infinitely many pieces. We expect crazy things to happen when we let infinities get involved. No, though, we can do this cut-and-duplicate thing by cutting the orange into five pieces. When you hear that it’s hard to know whether to point to the big, distracting thing and run away. If we dump the Axiom of Choice we don’t have that problem. But can we do anything useful without the ability to make a choice like that?

And we’ve learned that we can. If we want to use the Zermelo-Fraenkel Set Theory with the Axiom of Choice we say we were working in “ZFC”, Zermelo-Fraenkel-with-Choice. We don’t have to. If we don’t want to make any assumption about choices we say we’re working in “ZF”. Which to use depends on what one wants to use.

Either way Zermelo and Fraenkel and Skolem established set theory on the foundation we use to this day. We’re not required to use them, no; there’s a construction called von Neumann-Bernays-Gödel Set Theory that’s supposed to be more elegant. They didn’t mention it in my logic classes that I remember, though.

And still there’s important stuff we would like to know which even ZFC can’t answer. The most famous of these is the continuum hypothesis. Everyone knows — excuse me. That’s wrong. Everyone who would be reading a pop mathematics blog knows there are different-sized infinitely-large sets. And knows that the set of integers is smaller than the set of real numbers. The question is: is there a set bigger than the integers yet smaller than the real numbers? The Continuum Hypothesis says there is not.

Zermelo-Fraenkel Set Theory, even though it’s all about the properties of sets, can’t tell us if the Continuum Hypothesis is true. But that’s all right; it can’t tell us if it’s false, either. Whether the Continuum Hypothesis is true or false stands independent of the rest of the theory. We can assume whichever state is more useful for our work.

Back to the ideals of mathematics. One question that produced the Crisis of Foundations was consistency. How do we know our axioms don’t contain a contradiction? It’s hard to say. Typically a set of axioms we can prove consistent are also a set too boring to do anything useful in. Zermelo-Fraenkel Set Theory, with or without the Axiom of Choice, has a lot of interesting results. Do we know the axioms are consistent?

No, not yet. We know some of the axioms are mutually consistent, at least. And we have some results which, if true, would prove the axioms to be consistent. We don’t know if they’re true. Mathematicians are generally confident that these axioms are consistent. Mostly on the grounds that if there were a problem something would have turned up by now. It’s withstood all the obvious faults. But the universe is vaster than we imagine. We could be wrong.

It’s hard to live up to our ideals. After a generation of valiant struggling we settle into hoping we’re doing good enough. And waiting for some brilliant mind that can get us a bit closer to what we ought to be.