From my First A-to-Z: Z-transform


Back in the day I taught in a Computational Science department, which threw me into exciting and new-to-me subjects more than once. One quite fun semester I was learning, and teaching, signal processing. This set me up for the triumphant conclusion of my first A-to-Z.

One of the things you can see in my style is mentioning the connotations implied by whether one uses x or z as a variable. Any letter will do, for the use it’s put to. But to use the name ‘z’ suggests an openness to something that ‘x’ doesn’t.

There’s a mention here about stability in algorithms, and the note that we can process data in ways that are stable or are unstable. I don’t mention why one would want or not want stability. Wanting stability hardly seems to need explaining; isn’t that the good option? And, often, yes, we want stable systems because they correct and wipe away error. But there are reasons we might want instability, or at least less stability. Too stable a system will obscure weak trends, or the starts of trends. Your weight flutters day by day in ways that don’t mean much, which is why it’s better to consider a seven-day average. If you took instead a 700-day running average, these meaningless fluctuations would be invisible. But you also would take a year or more to notice whether you were losing or gaining weight. That’s one of the things stability costs.


z-transform.

The z-transform comes to us from signal processing. The signal we take to be a sequence of numbers, all representing something sampled at uniformly spaced times. The temperature at noon. The power being used, second-by-second. The number of customers in the store, once a month. Anything. The sequence of numbers we take to stretch back into the infinitely great past, and to stretch forward into the infinitely distant future. If it doesn’t, then we pad the sequence with zeroes, or some other safe number that we know means “nothing”. (That’s another classic mathematician’s trick.)

It’s convenient to have a name for this sequence. “a” is a good one. The different sampled values are denoted by an index. a0 represents whatever value we have at the “start” of the sample. That might represent the present. That might represent where sampling began. That might represent just some convenient reference point. It’s the equivalent of mileage marker zero; we have to have something be the start.

a1, a2, a3, and so on are the first, second, third, and so on samples after the reference start. a-1, a-2, a-3, and so on are the first, second, third, and so on samples from before the reference start. That might be the last couple of values before the present.

So for example, suppose the temperatures the last several days were 77, 81, 84, 82, 78. Then we would probably represent this as a-4 = 77, a-3 = 81, a-2 = 84, a-1 = 82, a0 = 78. We’ll hope this is Fahrenheit or that we are remotely sensing a temperature.

The z-transform of a sequence of numbers is something that looks a lot like a polynomial, based on these numbers. For this five-day temperature sequence the z-transform would be the polynomial 77 z^4 + 81 z^3 + 84 z^2 + 82 z^1 + 78 z^0 . (z^1 is the same as z. z^0 is the same as the number “1”. I wrote it this way to make the pattern more clear.)

I would not be surprised if you protested that this doesn’t merely look like a polynomial but actually is one. You’re right, of course, for this set, where all our samples are from negative (and zero) indices. If we had positive indices then we’d lose the right to call the transform a polynomial. Suppose we trust our weather forecaster completely, and add in a1 = 83 and a2 = 76. Then the z-transform for this set of data would be 77 z^4 + 81 z^3 + 84 z^2 + 82 z^1 + 78 z^0 + 83 \left(\frac{1}{z}\right)^1 + 76 \left(\frac{1}{z}\right)^2 . You’d probably agree that’s not a polynomial, although it looks a lot like one.
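
If you like seeing this built up mechanically, here’s a little Python sketch, using the sympy library, that assembles the expression from the indexed samples. The sample values are the ones above; the convention is that the sample with index k contributes its value times z to the power negative k.

    import sympy as sp

    z = sp.symbols('z')
    # Samples keyed by index: negative indices are past days, positive ones the forecast.
    samples = {-4: 77, -3: 81, -2: 84, -1: 82, 0: 78, 1: 83, 2: 76}
    # Each sample a_k contributes a_k * z^(-k) to the transform.
    transform = sum(a_k * z**(-k) for k, a_k in samples.items())
    print(sp.expand(transform))   # the expression above, give or take sympy's ordering of terms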

The use of z for these polynomials is basically arbitrary. The main reason to use z instead of x is that we can learn interesting things if we imagine letting z be a complex-valued number. And z carries connotations of “a possibly complex-valued number”, especially if it’s used in ways that suggest we aren’t looking at coordinates in space. It’s not that there’s anything in the symbol x that refuses the possibility of it being complex-valued. It’s just that z appears so often in the study of complex-valued numbers that it reminds a mathematician to think of them.

A sound question you might have is: why do this? And there’s not much obvious advantage in going from a list of temperatures “77, 81, 84, 82, 78, 83, 76” over to a polynomial-like expression 77 z^4 + 81 z^3 + 84 z^2 + 82 z^1 + 78 z^0 + 83 \left(\frac{1}{z}\right)^1 + 76 \left(\frac{1}{z}\right)^2 .

Where this starts to get useful is when we have an infinitely long sequence of numbers to work with. Yes, it does too. It will often turn out that an interesting sequence transforms into a polynomial that itself is equivalent to some easy-to-work-with function. My little temperature example there won’t do it, no. But consider the sequence that’s zero for all negative indices, and 1 for the zero index and all positive indices. This gives us the polynomial-like structure \cdots + 0z^2 + 0z^1 + 1 + 1\left(\frac{1}{z}\right)^1 + 1\left(\frac{1}{z}\right)^2 + 1\left(\frac{1}{z}\right)^3 + 1\left(\frac{1}{z}\right)^4 + \cdots . And that turns out to be the same as 1 \div \left(1 - \left(\frac{1}{z}\right)\right) . That’s much shorter to write down, at least.
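
If you want to convince yourself of that identity without summing the series by hand, here’s a quick numerical check in Python. It assumes the size of z is bigger than 1, which is what makes the infinite sum settle down; the value 3 is just a sample choice.

    # Partial sums of 1 + (1/z) + (1/z)^2 + ... stand in for the infinite sum.
    z = 3.0
    partial_sum = sum((1/z)**k for k in range(200))
    print(partial_sum)        # 1.4999999999...
    print(1 / (1 - 1/z))      # 1.5, the closed form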

Probably you’ll grant that, but still wonder what the point of doing that is. Remember that we started by thinking of signal processing. A processed signal is a matter of transforming your initial signal. By this we mean multiplying your original signal by something, or adding something to it. For example, suppose we want a five-day running average temperature. This we can find by taking one-fifth today’s temperature, a0, and adding to that one-fifth of yesterday’s temperature, a-1, and one-fifth of the day before’s temperature a-2, and one-fifth a-3, and one-fifth a-4.

The effect of processing a signal is equivalent to manipulating its z-transform. By studying properties of the z-transform, such as where its values are zero or where they are imaginary or where they are undefined, we learn things about what the processing is like. We can tell whether the processing is stable — does it keep a small error in the original signal small, or does it magnify it? Does it serve to amplify parts of the signal and not others? Does it dampen unwanted parts of the signal while keeping the main signal intact?
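
Here’s a little sketch of that equivalence in Python, using numpy and the five-day running average from above. The filter is five copies of one-fifth. Applying it to the signal is a convolution, and convolving two coefficient sequences is exactly how you multiply the polynomial-like expressions built from them, so the filtered signal’s z-transform is the product of the two z-transforms.

    import numpy as np

    temps = np.array([77, 81, 84, 82, 78], dtype=float)   # the five sampled temperatures
    filt  = np.full(5, 1/5)                                # the running-average filter

    # np.convolve multiplies the two coefficient sequences as if they were polynomials.
    product = np.convolve(temps, filt)
    print(product[4])       # 80.4, the five-day average where the filter fully overlaps the data
    print(temps.mean())     # 80.4 again, computed the ordinary way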

We can understand how data will be changed by understanding the z-transform of the way we manipulate it. That z-transform turns a signal-processing idea into a complex-valued function. And we have a lot of tools for studying complex-valued functions. So we become able to say a lot about the processing. And that is what the z-transform gets us.

A Moment Which Turns Out to Be Universal


I was reading a bit farther in Charles Coulson Gillispie’s Pierre-Simon Laplace, 1749 – 1827, A Life In Exact Science and reached this paragraph, too good not to share:

Wishing to study [ Méchanique céleste ] in advance, [ Jean-Baptiste ] Biot offered to read proof. When he returned the sheets, he would often ask Laplace to explain some of the many steps that had been skipped over with the famous phrase, “it is easy to see”. Sometimes, Biot said, Laplace himself would not remember how he had worked something out and would have difficulty reconstructing it.

So, it’s not just you and your instructors.

(Gillispie wrote the book along with Robert Fox and Ivor Grattan-Guinness.)

How All Of 2021 Treated My Mathematics Blog


Oh, you know, how did 2021 treat anybody? I always do one of these surveys for the end of each month. It’s only fair to do one for the end of the year also.

2021 was my tenth full year blogging around here. I might have made more of that if the actual anniversary in late September hadn’t coincided with a lot of personal hardships. 2021 was a quiet year around these parts with only 94 things posted. That’s the fewest of any full year. (I posted only 41 things in 2011, but I only started posting at all in late September of that year.) That seems not to have done my readership any harm. There were 28,832 pages viewed in 2021, up from 24,474 in 2020 and a fair bit above the 24,662 given in my previously best-viewed year of 2019. Eleven data points (the partial year 2011, and the full years 2012 through 2021) aren’t many, so there’s no drawing real patterns here. But it does seem like I have a year of sharp increases and then a year of slight declines in page views. I suppose we’ll check in in 2023 and see if that pattern holds.

Bar chart of annual views and unique visitors from 2012 to the present. After nearly level view counts in 2019 and 2020 there was a good-size rise for 2021.
The number of unique visitors for 2012 is so tiny because they started recording that (so far as they let us know) in, like, late December so that figure is meaningless. The rest seem all right, though.

One thing not declining? The number of unique visitors. WordPress recorded 20,339 unique visitors in 2021, a comfortable bit above 2020’s 16,870 and 2019’s 16,718. So far I haven’t seen a year-over-year decline in unique visitors. That’s gratifying.

Less gratifying: the number of likes continues its decline. It hasn’t increased, around here, since 2015 when a seemingly impossible 3,273 likes were given by readers. In 2021 there were only 481 likes, the fewest since 2013. The dropping-off of likes has so resembled a Poisson distribution that I’m tempted to see whether it actually fits one.

Bar chart of the annual likes from 2013 to the present. It rose sharply from 2013 to 2015 and has declined in a not-quite-exponential pattern since then.
I know, my first thought was that it looked like an overdamped system receiving a shock, but I don’t think the decline is consistent enough to support that.

The number of comments dropped a slight bit. There were 188 given around here in 2021, but that’s only ten fewer than were given in 2020. It’s seven more than were given in 2019, so if there’s any pattern there I don’t know it.

WordPress lists 483 posts around here as having gotten four or more page views in the year. It won’t tell me everything that got even a single view, though. I’m not willing to do the work of stitching together the monthly page view data to learn everything that was of interest, however passing. I’ll settle with knowing what was most popular. And what were my most popular posts of the year mercifully ended? These posts from 2021 got more views than all the others:

Mercator-style map of the world, with the United States in dark red and most of the New World, western Europe, South and Pacific Rim Asia, Australia, and New Zealand in a more uniform pink. The Philippines and India are in an intermediately dark red.
Hey look, it’s a naturally occurring International Telecommunication Union zonal map! And at this point may I point out that besides being a lower-tier pop-mathematics writer I am also a lower-tier humor blogger?

There were 143 countries, or country-like entities, sending me any page views in 2021. I don’t know how that compares to earlier years. But here’s the roster of where page views came from:

Country Readers
United States 13,723
Philippines 3,994
India 2,507
Canada 1,393
United Kingdom 865
Australia 659
Germany 442
Brazil 347
South Africa 296
European Union 273
Sweden 230
Singapore 210
Italy 204
Austria 178
France 143
Finland 141
Malaysia 135
South Korea 135
Hong Kong SAR China 132
Ireland 131
Netherlands 117
Turkey 117
Spain 107
Pakistan 105
Thailand 102
Mexico 101
United Arab Emirates 100
Indonesia 97
Switzerland 95
Norway 87
New Zealand 86
Belgium 76
Nigeria 76
Russia 74
Japan 64
Taiwan 62
Bangladesh 58
Poland 55
Greece 54
Denmark 52
Colombia 51
Israel 49
Ghana 46
Portugal 44
Czech Republic 40
Vietnam 38
Saudi Arabia 33
Argentina 30
Lebanon 30
Ecuador 28
Nepal 28
Egypt 25
Kuwait 23
Serbia 22
Chile 21
Croatia 21
Jamaica 20
Peru 20
Tanzania 20
Costa Rica 19
Romania 17
Trinidad & Tobago 17
Sri Lanka 16
Ukraine 15
Hungary 13
Jordan 13
Bulgaria 12
China 12
Albania 11
Bahrain 11
Morocco 11
Estonia 10
Qatar 10
Slovakia 10
Cyprus 9
Kenya 9
Zimbabwe 9
Algeria 8
Oman 8
Belarus 7
Georgia 7
Honduras 7
Lithuania 7
Puerto Rico 7
Venezuela 7
Bosnia & Herzegovina 6
Ethiopia 6
Iraq 6
Belize 5
Bhutan 5
Moldova 5
Uruguay 5
Dominican Republic 4
Guam 4
Kazakhstan 4
Macedonia 4
Mauritius 4
Zambia 4
Åland Islands 3
Antigua & Barbuda 3
Bahamas 3
Cambodia 3
El Salvador 3
Gambia 3
Guatemala 3
Slovenia 3
Suriname 3
American Samoa 2
Azerbaijan 2
Bolivia 2
Cameroon 2
Guernsey 2
Malta 2
Papua New Guinea 2
Réunion 2
Rwanda 2
Sudan 2
Uganda 2
Afghanistan 1
Andorra 1
Armenia 1
Fiji 1
Grenada 1
Iceland 1
Isle of Man 1
Latvia 1
Liberia 1
Liechtenstein 1
Luxembourg 1
Maldives 1
Marshall Islands 1
Mongolia 1
Myanmar (Burma) 1
Namibia 1
Palestinian Territories 1
Panama 1
Paraguay 1
Senegal 1
St. Lucia 1
St. Vincent & Grenadines 1
Togo 1
Tunisia 1
Vatican City 1

I don’t know that I’ve gotten a reader from Vatican City before. I hope it’s not about the essay figuring what dates are most and least likely for Easter. I’d expect them to know that already.

My plan is to spend a bit more time republishing posts from old A-to-Z’s. And then I hope to finish off the Little 2021 Mathematics A-to-Z, late and battered but still carrying on. I intend to post something at least once a week after that, although I don’t have a clear idea what that will be. Perhaps I’ll finally work out the algorithm for Compute!’s New Automatic Proofreader. Perhaps I’ll fill in with A-to-Z style essays for topics I had skipped before. Or I might get back to reading the comics for their mathematics topics. I’m open to suggestions.

Some Progress on the Infinitude of Monkeys


I have been reading Pierre-Simon Laplace, 1749 – 1827, A Life In Exact Science, by Charles Coulson Gillispie with Robert Fox and Ivor Grattan-Guinness. It’s less of a biography than I expected and more a discussion of Laplace’s considerable body of work. Part of Laplace’s work was in giving probability a logically coherent, rigorous meaning. Laplace discusses the gambler’s fallacy and the tendency to assign causes to random events. That, for example, if we came across letters from a printer’s font reading out ‘INFINITESIMAL’ we would think that deliberate. We wouldn’t think that for a string of letters in no recognized language. And that brings up this neat quote from Gillispie:

The example may in all probability be adapted from the chapter in the Port-Royal La Logique (1662) on judgement of future events, where Arnauld points out that it would be stupid to bet twenty sous against ten thousand livres that a child playing with printer’s type would arrange the letters to compose the first twenty lines of Virgil’s Aeneid.

The reference here is to a book by Antoine Arnauld and Pierre Nicole that I haven’t read or heard of before. But it makes a neat forerunner to the Infinite Monkey Theorem. That’s the study of what probability means when put to infinitely great or long processes. Émile Borel’s use of monkeys at a typewriter echoes this idea of children playing beyond their understanding. I don’t know whether Borel knew of Arnauld and Nicole’s example. But I did not want my readers to miss a neat bit of infinite-monkey trivia. Or to miss today’s Bizarro, offering yet another comic on the subject.

A printer reports to William Shakespeare: 'There's no way I can deliver 37 plays and 150 sonnets. I've got no monkeys, and typewriters haven't been invented yet.'
Piraro and Wayno’s Bizarro for the 18th of January, 2022. I’m not promising a return to regular Reading the Comics posts. But essays that feature Bizarro, past and future, are at this link.

From my Seventh A-to-Z: Big-O and Little-O Notation


I toss off a mention in this essay, about its book publication. By the time it appeared I was thinking whether I could assemble these A-to-Z’s, or a set of them, into a book. I haven’t had the energy to put that together but it still seems viable.


Mr Wu, author of the Singapore Maths Tuition blog, asked me to explain a technical term today. I thought that would be a fun, quick essay. I don’t learn very fast, do I?

A note on style. I make reference here to “Big-O” and “Little-O”, capitalizing and hyphenating them. This is to give them visual presence as a name. In casual discussion they’re just read, or said, as the two words or word-and-a-letter. Often the Big- or Little- gets dropped and we just talk about O. An O, without further context, in my experience means Big-O.

The part of me that wants smooth consistency in prose urges me to write “Little-o”, as the thing described is represented with a lowercase ‘o’. But Little-o sounds like a midway game or an Eyerly Aircraft Company amusement park ride. And I never achieve consistency in my prose anyway. Maybe for the book publication. Until I’m convinced another is better, though, “Little-O” it is.

Color cartoon illustration of a coati in a beret and neckerchief, holding up a director's megaphone and looking over the Hollywood hills. The megaphone has the symbols +, ×, ÷ (the division obelus), and = on it. The Hollywood sign is, instead, the letters MATHEMATICS. In the background are spotlights, with several of them crossing so as to make the letters A and Z; one leg of the spotlights has 'TO' in it, so the art reads out, subtly, 'Mathematics A to Z'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Big-O and Little-O Notation.

When I first went to college I had a campus post office box. I knew my box number. I also knew the length of the sluggish line for the combination lock code. The lock was a dial, lettered A through J. Being a young STEM-class idiot I thought, boy, would it actually be quicker to pick the lock than wait for the line? A three-letter combination, of ten options? That’s 1,000 possibilities. If I could try five a minute that’s, at worst, three hours 20 minutes. Combination might be anywhere in that set; I might get lucky. I could expect to spend about an hour and 40 minutes picking my lock.

I decided to wait in line instead, and good that I did. I was unaware lock settings might not be a letter, like ‘A’. It could be the midway point between adjacent letters, like ‘AB’. That meant there were eight times as many combinations as I estimated, and I could expect to spend over ten hours. Even the slow line was faster than that. It transpired that my combination had two of these midway letters.
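
For the curious, here are the numbers behind that story as a quick Python sketch. The five-guesses-a-minute rate is the same assumption as above; the worst case tries every combination, and on average you’d get there about halfway through.

    rate = 5                              # guesses per minute, the assumption from the story
    for settings in (10, 20):             # letters only, versus letters plus the midway points
        total = settings ** 3             # three dial positions
        worst = total / rate / 60         # hours, worst case
        expected = total / 2 / rate / 60  # hours, on average
        print(settings, total, round(worst, 1), round(expected, 1))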

But that’s a little demonstration of algorithmic complexity. Also in cracking passwords by trial-and-error. Doubling the number of possible settings for each letter octuples the number of combinations to try. Making the combination longer would also work; each extra letter would multiply the cracking time by twenty. So you understand why your password should include “special characters” like punctuation, but most of all should be long.

We’re often interested in how long to expect a task to take. Sometimes we’re interested in the typical time it takes. Often we’re interested in the longest it could ever take. If we have a deterministic algorithm, we can say. We can count how many steps it takes. Sometimes this is easy. If we want to add two two-digit numbers together we know: it will be, at most, three single-digit additions plus, maybe, writing down a carry. (To add 98 and 37 is adding 8 + 7 to get 15, to add 9 + 3 to get 12, and to take the carry from the 15, so, 1 + 12 to get 13, so we have 135.) We can get a good quarrel going about what “a single step” is. We can argue whether that carry into the hundreds column is really one more addition. But we can agree that there is some smallest bit of arithmetic work, and proceed from that.

For any algorithm we have something that describes how big a thing we’re working on. It’s often ‘n’. If we need more than one variable to describe how big it is, ‘m’ gets called up next. If we’re estimating how long it takes to work on a number, ‘n’ is the number of digits in the number. If we’re thinking about a square matrix, ‘n’ is the number of rows and columns. If it’s a not-square matrix, then ‘n’ is the number of rows and ‘m’ the number of columns. Or vice-versa; it’s your matrix. If we’re looking for an item in a list, ‘n’ is the number of items in the list. If we’re looking to evaluate a polynomial, ‘n’ is the order of the polynomial.

In normal circumstances we don’t work out exactly how many steps some operation takes. It’s more useful to know that multiplying these two long numbers would take about 900 steps than that it would need only 816. And so this gives us an asymptotic estimate. We get an estimate of how much longer cracking the combination lock will take if there are more letters to pick from. This allows that some poor soul will get the combination A-B-C.

There are a couple ways to describe how long this will take. The more common is the Big-O. This is just the letter, like you find between N and P. Since that’s easy, many have taken to using a fancy, vaguely cursive O, one that looks like \mathcal{O} . I agree it looks nice. Particularly, though, we write \mathcal{O}(f(n)) , where f is some function. In practice, we’ll see functions like \mathcal{O}(n) or \mathcal{O}(n^2 \log(n)) or \mathcal{O}(n^3) . Usually something simple like that. It can be tricky. There’s a scheme for multiplying large numbers together that’s \mathcal{O}(n \cdot 2^{\sqrt{2 \log(n)}} \cdot \log(n)) . What you will not see is something like \mathcal{O}(\sin (n)) , or \mathcal{O}(n^3 - n^4) or such. This comes to what we mean by the Big-O.

It’ll be convenient for me to have a name for the actual number of steps the algorithm takes. Let me call the function describing that g(n). Then g(n) is \mathcal{O}(f(n)) if, once n gets big enough, g(n) is always less than C times f(n). Here C is some constant number. Could be 1. Could be 1,000,000. Could be 0.00001. Doesn’t matter; it’s some positive number.

There’s some neat tricks to play here. For example, the function ‘n’ is \mathcal{O}(n) . It’s also \mathcal{O}(n^2) and \mathcal{O}(n^9) and \mathcal{O}(e^{n}) . The function ‘n^2’ is also \mathcal{O}(n^2) and those later terms, but it is not \mathcal{O}(n) . And you can see why \mathcal{O}(\sin(n)) is right out.

There is also a Little-O notation. It, too, is an upper bound on the function. But it is a stricter bound, setting tighter restrictions on what g(n) is like. You ask how it is the stricter bound gets the minuscule letter. That is a fine question. I think it’s a quirk of history. Both symbols come to us through number theory. Big-O was developed first, published in 1894 by Paul Bachmann. Little-O was published in 1909 by Edmund Landau. Yes, the one with the short Hilbert-like list of number theory problems. In 1914 G H Hardy and John Edensor Littlewood would work on another measure and they used Ω to express it. (If you see the letter used for Big-O and Little-O as the Greek omicron, then you see why a related concept got called omega.)

What makes the Little-O measure different is its sternness. g(n) is o(f(n)) if, for every positive number C, whenever n is large enough g(n) is less than or equal to C times f(n). I know that sounds almost the same. Here’s why it’s not.

If g(n) is \mathcal{O}(f(n)) , then you can go ahead and pick a C and find that, eventually, g(n) \le C f(n) . If g(n) is o(f(n)) , then I, trying to sabotage you, can go ahead and pick a C, trying my best to spoil your bounds. But I will fail. Even if I pick, like, a C of one millionth of a billionth of a trillionth, eventually f(n) will be so big that g(n) \le C f(n) . I can’t pick a C small enough that C times f(n) doesn’t eventually outgrow g(n).

This implies some odd-looking stuff. Like, that the function n is not o(n) . But the function n is at least o(n^2) , and o(n^9) and those other fun variations. Being Little-O compels you to be Big-O. Big-O is not compelled to be Little-O, although it can happen.
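
Here’s a rough numerical illustration of the difference in Python. It isn’t a proof, just a look at the ratio g(n)/f(n) for growing n: for g(n) = n against f(n) = n^2 the ratio shrinks toward zero, which is the Little-O behavior, while for g(n) = n^2 against f(n) = n the ratio grows without bound, which is why n^2 is not Big-O of n.

    # Ratios g(n)/f(n) for a few sizes of n.
    for n in (10, 1_000, 100_000):
        print(n, n / n**2, n**2 / n)
    # The middle column heads toward zero; the last column blows up.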

These definitions, for Big-O and Little-O, I’ve laid out from algorithmic complexity. It’s implicitly about functions defined on the counting numbers. But there’s no reason I have to limit the ideas to that. I could define similar ideas for a function g(x), with domain the real numbers, and come up with an idea of being on the order of f(x).

We make some adjustments to this. The important one is that, with algorithmic complexity, we assumed g(n) had to be a positive number. What would it even mean for something to take minus four steps to complete? But a regular old function might be zero or negative or change between negative and positive. So we look at the absolute value of g(x). Is there some value of C so that, when x is big enough, the absolute value of g(x) stays less than C times f(x)? If it does, then g(x) is \mathcal{O}(f(x)) . Is it the case that for every positive number C it’s true that the absolute value of g(x) is less than C times f(x), once x is big enough? Then g(x) is o(f(x)) .

Fine, but why bother defining this?

A compelling answer is that it gives us a way to describe how different a function is from an approximation to that function. We are always looking for approximations to functions because most functions are hard. We have a small set of functions we like to work with. Polynomials are great numerically. Exponentials and trig functions are great analytically. That’s about all the functions that are easy to work with. Big-O notation particularly lets us estimate how bad an error we make using the approximation.

For example, the Runge-Kutta method numerically approximates solutions to ordinary differential equations. It does this by taking the information we have about the function at some point x to approximate its value at a point x + h. ‘h’ is some number. The difference between the actual answer and the Runge-Kutta approximation is \mathcal{O}(h^4) . We use this knowledge to make sure our error is tolerable. Also, we don’t usually care what the function is at x + h. It’s just what we can calculate. What we want is the function at some point a fair bit away from x, call it x + L. So we use our approximate knowledge of conditions at x + h to approximate the function at x + 2h. And use x + 2h to tell us about x + 3h, and from that x + 4h and so on, until we get to x + L. We’d like to have as few of these uninteresting intermediate points as we can, so look for as big an h as is safe.
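
Here’s a minimal sketch of that error behavior in Python: the classic fourth-order Runge-Kutta formula applied to the equation y' = y with y(0) = 1, whose exact answer is e^x. The test problem is my own choice. If the error really is \mathcal{O}(h^4) , halving h should shrink the error at x = 1 by a factor of roughly sixteen.

    import math

    def rk4(f, y0, x0, x_end, h):
        # Classic fourth-order Runge-Kutta for y' = f(x, y).
        x, y = x0, y0
        for _ in range(round((x_end - x0) / h)):
            k1 = f(x, y)
            k2 = f(x + h/2, y + h*k1/2)
            k3 = f(x + h/2, y + h*k2/2)
            k4 = f(x + h, y + h*k3)
            y += (h/6) * (k1 + 2*k2 + 2*k3 + k4)
            x += h
        return y

    # Error at x = 1 for a few step sizes; each halving of h cuts it by about 2^4 = 16.
    for h in (0.1, 0.05, 0.025):
        print(h, abs(rk4(lambda x, y: y, 1.0, 0.0, 1.0, h) - math.e))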

That context may be the more common one. We see it, particularly, in Taylor Series and other polynomial approximations. For example, the sine of a number is approximately:

\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \frac{x^9}{9!} + \mathcal{O}(x^{11})

This has consequences. It tells us, for example, that if x is about 0.1, this approximation is probably pretty good. So it is: the sine of 0.1 (radians) is about 0.0998334166468282 and that’s exactly what five terms here gives us. But it also warns that if x is about 10, this approximation may be gibberish. And so it is: the sine of 10.0 is about -0.5440 and the polynomial is about 1448.27.
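
You can check those numbers yourself with a few lines of Python; this just sums the five terms quoted above and compares against the library sine.

    import math

    def sine_poly(x):
        # The five-term Taylor polynomial from the formula above.
        return (x - x**3/math.factorial(3) + x**5/math.factorial(5)
                  - x**7/math.factorial(7) + x**9/math.factorial(9))

    for x in (0.1, 10.0):
        print(x, sine_poly(x), math.sin(x))
    # At 0.1 the two agree to many digits; at 10 the polynomial is off by over a thousand.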

The connotation in using Big-O notation here is that we look for small h’s, and for the \mathcal{O}(x^{11}) term to be a tiny number. It seems odd to use the same notation with a large independent variable and with a small one. The concept carries over, though, and helps us talk efficiently about this different problem.


Today’s and all the other 2020 A-to-Z essays should appear at this link. Both the 2020 and all past A-to-Z essays ought to be at this link.

Thank you for reading.

From my Sixth A-to-Z: Operator


One of the many small benefits of these essays is getting myself clearly grounded on terms that I had accepted without thinking much about. Operator, like functional (mentioned in here), is one of them. I’m sure that when these were first introduced my instructors gave them clear definitions. But when they’re first introduced it’s not clear why these are important, or that we are going to spend the rest of grad school talking about them. So this piece from 2019’s A-to-Z sequence secured my footing on a term I had a fair understanding of. You get some idea of what has to be intended from the context in which the term is used. Also from knowing how terms like this tend to be defined. But having it down to where I could certainly pass a true-false test about “is this an operator”? That was new.


Today’s A To Z term is one I’ve mentioned previously, including in this A to Z sequence. But it was specifically nominated by Goldenoj, whom I know I follow on Twitter. I’m sorry not to be able to give you an account; I haven’t been able to use my @nebusj account for several months now. Well, if I do get a Twitter, Mathstodon, or blog account I’ll refer you there.

Cartoony banner illustration of a coati, a raccoon-like animal, flying a kite in the clear autumn sky. A skywriting plane has written 'MATHEMATIC A TO Z'; the kite, with the letter 'S' on it, completes the word 'MATHEMATICS'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Operator.

An operator is a function. An operator has a domain that’s a space. Its range is also a space. It can be the same space but doesn’t have to be. It is very common for these spaces to be “function spaces”. So common that if you want to talk about an operator that isn’t dealing with function spaces it’s good form to warn your audience. Everything in a particular function space is a real-valued and continuous function. Also everything shares the same domain as everything else in that particular function space.

So here’s what I first wonder: why call this an operator instead of a function? I have hypotheses and an unwillingness to read the literature. One is that maybe mathematicians started saying “operator” a long time ago. Taking the derivative, for example, is an operator. So is taking an indefinite integral. Mathematicians have been doing those for a very long time. Longer than we’ve had the modern idea of a function, which is this rule connecting a domain and a range. So the term might be a fossil.

My other hypothesis is the one I’d bet on, though. This hypothesis is that there is a limit to how many different things we can call “the function” in one sentence before the reader rebels. I felt bad enough with that first paragraph. Imagine parsing something like “the function which the Laplacian function took the function to”. We are less likely to make dumb mistakes if we have different names for things which serve different roles. This is probably why there is another word for a function with domain of a function space and range of real or complex-valued numbers. That is a “functional”. It covers things like the norm for measuring a function’s size. It also covers things like finding the total energy in a physics problem.

I’ve mentioned two operators that anyone who’d read a pop mathematics blog has heard of, the differential and the integral. There are more. There are so many more.

Many of them we can build from the differential and the integral. Many operators that we care to deal with are linear, which is how mathematicians say “good”. And both the differential and the integral operators are linear, which lurks behind many of our favorite rules. Like, allow me to call from the vasty deep functions ‘f’ and ‘g’, and scalars ‘a’ and ‘b’. You know how the derivative of the function af + bg is a times the derivative of f plus b times the derivative of g? That’s the differential operator being all linear on us. Similarly, how the integral of af + bg is a times the integral of f plus b times the integral of g? Something mathematical with the adjective “linear” is giving us at least some solid footing.

I’ve mentioned before that a wonder of functions is that most things you can do with numbers, you can also do with functions. One of those things is the premise that if numbers can be the domain and range of functions, then functions can be the domain and range of functions. We can do more, though.

One of the conceptual leaps in high school algebra is that we start analyzing the things we do with numbers. Like, we don’t just take the number three, square it, multiply that by two and add to that the number three times four and add to that the number 1. We think about what if we take any number, call it x, and think of 2x^2 + 4x + 1 . And what if we make equations based on doing this 2x^2 + 4x + 1 ; what values of x make those equations true? Or tell us something interesting?

Operators represent a similar leap. We can think of functions as things we manipulate, and think of those manipulations as a particular thing to do. For example, let me come up with a differential expression. For some function u(x) work out the value of this:

2\frac{d^2 u(x)}{dx^2} + 4 \frac{d u(x)}{dx} + u(x)

Let me join in the convention of using ‘D’ for the differential operator. Then we can rewrite this expression like so:

2D^2 u + 4D u + u

Suddenly the differential equation looks a lot like a polynomial. Of course it does. Remember that everything in mathematics is polynomials. We get new tools to solve differential equations by rewriting them as operators. That’s nice. It also scratches that itch that I think everyone in Intro to Calculus gets, of wanting to somehow see \frac{d^2}{dx^2} as if it were a square of \frac{d}{dx} . It’s not, and D^2 is not the square of D . It’s composing D with itself. But it looks close enough to squaring to feel comfortable.

Nobody needs to do 2D^2 u + 4D u + u except to learn some stuff about operators. But you might imagine a world where we did this process all the time. If we did, then we’d develop shorthand for it. Maybe a new operator, call it T, and define it that T = 2D^2 + 4D + 1 . You see the grammar of treating functions as if they were real numbers becoming familiar. You maybe even noticed the ‘1’ sitting there, serving as the “identity operator”. You know how you’d write out Tv(x) = 3 if you needed to write it in full.
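
Here’s a small Python sketch, using the sympy library, of treating T = 2D^2 + 4D + 1 as a thing you apply to functions. The sample functions are my own choices, picked only for illustration; the last line checks the linearity property mentioned above.

    import sympy as sp

    x, a, b = sp.symbols('x a b')

    def T(expr):
        # The operator T = 2 D^2 + 4 D + 1, applied to an expression in x.
        return 2*sp.diff(expr, x, 2) + 4*sp.diff(expr, x) + expr

    print(T(sp.sin(x)))                                   # 4*cos(x) - sin(x), give or take ordering
    f, g = sp.exp(x), sp.cos(x)
    print(sp.simplify(T(a*f + b*g) - (a*T(f) + b*T(g))))  # 0, because T is linear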

But there are operators that we use all the time. These do get special names, and often shorthand. For example, there’s the gradient operator. This applies to any function with several independent variables. The gradient has a great physical interpretation if the variables represent coordinates of space. If they do, the gradient of a function at a point gives us a vector that describes the direction in which the function increases fastest. And the size of that gradient — a functional on this operator — describes how fast that increase is.

The gradient itself defines more operators. These have names you get very familiar with in Vector Calculus, with names like divergence and curl. These have compelling physical interpretations if we think of the function we operate on as describing a moving fluid. A positive divergence means fluid is coming into the system; a negative divergence, that it is leaving. The curl, in fluids, describes how nearby streams of fluid move at different rates.

Physical interpretations are common in operators. This probably reflects how much influence physics has on mathematics and vice-versa. Anyone studying quantum mechanics gets familiar with a host of operators. These have comfortable names like “position operator” or “momentum operator” or “spin operator”. These are operators that apply to the wave function for a problem. They transform the wave function into a probability distribution. That distribution describes what positions or momentums or spins are likely, how likely they are. Or how unlikely they are.

They’re not all physical, though. Or not purely physical. Many operators are useful because they are powerful mathematical tools. There is a variation of the Fourier series called the Fourier transform. We can interpret this as an operator. Suppose the original function started out with time or space as its independent variable. This often happens. The Fourier transform operator gives us a new function, one with frequencies as independent variable. This can make the function easier to work with. The Fourier transform is an integral operator, by the way, so don’t go thinking everything is a complicated set of derivatives.
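
For a taste of that in Python, here’s a little numpy sketch. The signal is a made-up example, a 5-hertz sine wave plus a quieter 12-hertz one; the discrete Fourier transform hands back a function of frequency, and reading off the two dominant frequencies recovers the ingredients.

    import numpy as np

    t = np.linspace(0, 1, 400, endpoint=False)               # one second of samples
    signal = np.sin(2*np.pi*5*t) + 0.5*np.sin(2*np.pi*12*t)  # 5 Hz plus a quieter 12 Hz

    spectrum = np.fft.rfft(signal)                            # now the independent variable is frequency
    freqs = np.fft.rfftfreq(len(t), d=t[1] - t[0])
    print(freqs[np.argsort(np.abs(spectrum))[-2:]])           # the two dominant frequencies, 12 and 5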

Another integral-based operator that’s important is the Laplace transform. This is a great operator because it turns differential equations into algebraic equations. Often, into polynomials. You saw that one coming.

This is all a lot of good press for operators. Well, they’re powerful tools. They help us to see that we can manipulate functions in the ways that functions let us manipulate numbers. It should sound good to realize there is much new that you can do, and you already know most of what’s needed to do it.


This and all the other Fall 2019 A To Z posts should be gathered here. And once I have the time to fiddle with tags I’ll have all past A to Z essays gathered at this link.

From my Fifth A-to-Z: Oriented Graph


My grad-student career took me into Monte Carlo methods and viscosity-free fluid flow. It’s a respectable path. But I could have ended up in graph theory; I got a couple courses in it in grad school and loved it. I just could not find a problem I could work on that was both solvable and interesting. But hints of that alternative path for me turn up now and then, such as in this piece from 2018.


I am surprised to have had no suggestions for an ‘O’ letter. I’m glad to take a free choice, certainly. It let me get at one of those fields I didn’t specialize in, but could easily have. And let me mention that while I’m still taking suggestions for the letters P through T, each other letter has gotten at least one nomination. I can be swayed by a neat term, though, so if you’ve thought of something hard to resist, try me. And later this month I’ll open up the letters U through Z. Might want to start thinking right away about what X, Y, and Z could be.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble tiles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Oriented Graph.

This is another term from graph theory, one of the great mathematical subjects for doodlers. A graph, here, is made of two sets of things. One is a bunch of fixed points, called ‘vertices’. The other is a bunch of curves, called ‘edges’. Every edge starts at one vertex and ends at one vertex. We don’t require that every vertex have an edge grow from it.

Already you can see why this is a fun subject. It models some stuff really well. Like, anything where you have a bunch of sources of stuff, that come together and spread out again? Chances are there’s a graph that describes this. There’s a compelling all-purpose interpretation. Have vertices represent the spots where something accumulates, or rests, or changes, or whatever. Have edges represent the paths along which something can move. This covers so much.

The next step is a “directed graph”. This comes from making the edges different. If we don’t say otherwise we suppose that stuff can move along an edge in either direction. But suppose otherwise. Suppose there are some edges that can be used in only one direction. This makes a “directed edge”. It’s easy to see graph theory in networks of stuff like city streets. Once you ponder that, one-way streets follow close behind. If every edge in a graph is directed, then you have a directed graph. Moving from a regular old undirected graph to a directed graph changes everything you’d learned about graph theory. Mostly it makes things harder. But you get some good things in trade. We become able to model sources, for example. This is where whatever might move comes from. Also sinks, which is where whatever might move disappears from our consideration.

You might fear that by switching to a directed graph there’s no way to have a two-way connection between a pair of vertices. Or that if there is you have to go through some third vertex. I understand your fear, and wish to reassure you. We can get a two-way connection even in a directed graph: just have the same two vertices be connected by two edges. One goes one way, one goes the other. I hope you feel some comfort.

What if we don’t have that, though? What if the directed graph doesn’t have any vertices with a pair of opposite-directed edges? And that, then, is an oriented graph. We get the orientation from looking at pairs of vertices. Each pair either has no edge connecting them, or has a single directed edge between them.

There’s a lot of potential oriented graphs. If you have three vertices, for example, there’s seven oriented graphs to make of that. You’re allowed to have a vertex not connected to any others. You’re also allowed to have the vertices grouped into a couple of subsets, and connect only to other vertices in their own subset. This is part of why four vertices can give you 42 different oriented graphs. Five vertices can give you 582 different oriented graphs. You can insist on a connected oriented graph.
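
Those counts are small enough to check by brute force, if you’re curious. Here’s a Python sketch, under the usual assumption that two graphs count as the same whenever some relabeling of the vertices matches them up: every unordered pair of vertices gets one of three treatments, no edge or an edge in one of the two directions, and then we count how many distinct graphs are left.

    from itertools import combinations, permutations, product

    def count_oriented_graphs(n):
        # Count non-isomorphic oriented graphs on n vertices by brute force.
        pairs = list(combinations(range(n), 2))
        seen = set()
        for choice in product((0, 1, 2), repeat=len(pairs)):
            edges = set()
            for (u, v), c in zip(pairs, choice):
                if c == 1:
                    edges.add((u, v))
                elif c == 2:
                    edges.add((v, u))
            # Canonical form: the smallest edge list over all relabelings of the vertices.
            canon = min(tuple(sorted((p[u], p[v]) for (u, v) in edges))
                        for p in permutations(range(n)))
            seen.add(canon)
        return len(seen)

    for n in (1, 2, 3, 4):                   # n = 5 works too, if you're patient: it gives 582
        print(n, count_oriented_graphs(n))   # 1, 2, 7, 42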

A connected graph is what you guess. It’s a graph where there’s no vertices off on their own, unconnected to anything. There’s no subsets of vertices connected only to each other. This doesn’t mean you can always get from any one vertex to any other vertex. The directions might not allow you to do that. But if you’re willing to break the laws, and ignore the directions of these edges, you could then get from any vertex to any other vertex. Limiting yourself to connected graphs reduces the number of oriented graphs you can get. But not by as much as you might guess, at least not to start. There’s only one connected oriented graph for two vertices, instead of two. Three vertices have five connected oriented graphs, rather than seven. Four vertices have 34, rather than 42. Five vertices, 535 rather than 582. The total number of lost graphs grows, of course. The percentage of lost graphs dwindles, though.

There’s something more. What if there are no unconnected vertices? That is, every pair of vertices has an edge? If every pair of vertices in a graph has a direct connection we call that a “complete” graph. This is true whether the graph is directed or not. If you do have a complete oriented graph — every pair of vertices has a direct connection, and only the one direction — then that’s a “tournament”. If that seems like a whimsical name, consider one interpretation of it. Imagine a sports tournament in which every team played every other team once. And that there’s no ties. Each vertex represents one team. Each edge is the match played by the two teams. The direction is, let’s say, from the losing team to the winning team. (It’s as good if the direction is from the winning team to the losing team.) Then you have a complete, oriented, directed graph. And it represents your tournament.

And that delights me. A mathematician like me might talk a good game about building models. How one can represent things with mathematical constructs. Here, it’s done. You can make little dots, for vertices, and curved lines with arrows, for edges. And draw a picture that shows how a round-robin tournament works. It can be that direct.


From my Fourth A-to-Z: Open Set


It’s quite funny to notice the first paragraph’s shame at missing my self-imposed schedule. I still have not found confirmation of my hunch that “open” and “closed”, as set properties, were named independently. I haven’t found evidence I’m wrong, though, either.


Today’s glossary entry is another request from Elke Stangl, author of the Elkemental Force blog. I’m hoping this also turns out to be a well-received entry. Half of that is up to you, the kind reader. At least I hope you’re a reader. It’s already gone wrong, as it was supposed to be Friday’s entry. I discovered I hadn’t actually scheduled it while I was too far from my laptop to do anything about that mistake. This spoils the nice Monday-Wednesday-Friday routine of these glossary entries that dates back to the first one I ever posted and just means I have to quit forever and not show my face ever again. Sorry, Ulam Spiral. Someone else will have to think of you.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Open Set.

Mathematics likes to present itself as being universal truths. And it is. At least if we allow that the rules of logic by which mathematics works are universal. Suppose them to be true and the rest follows. But we start out with intuition, with things we observe in the real world. We’re happy when we can remove the stuff that’s clearly based on idiosyncratic experience. We find something that’s got to be universal.

Sets are pretty abstract things, as mathematicians use the term. They get to be hard to talk about; we run out of simpler words that we can use. A set is … a bunch of things. The things are … stuff that could be in a set, or else that we’d rule out of a set. We can end up better understanding things by drawing a picture. We draw the universe, which is a rectangular block, sometimes with dashed lines as the edges. The set is some blotch drawn on the inside of it. Some shade it in to emphasize which stuff we want in the set. If we need to pick out a couple things in the universe we drop in dots or numerals. If we’re rigorous about the drawing we could create a Venn Diagram.

When we do this, we’re giving up on the pure mathematical abstraction of the set. We’re replacing it with a territory on a map. Several territories, if we have several sets. The territories can overlap or be completely separate. We’re subtly letting our sense of geography, our sense of the spaces in which we move, infiltrate our understanding of sets. That’s all right. It can give us useful ideas. Later on, we’ll try to separate out the ideas that are too bound to geography.

A set is open if whenever you’re in it, you can’t be on its boundary. We never quite have this in the real world, with territories. The border between, say, New Jersey and New York becomes this infinitesimally slender thing, as wide in space as midnight is in time. But we can, with some effort, imagine the state. Imagine being as tiny in every direction as the border between two states. Then we can imagine the difference between being on the border and being away from it.

And not being on the border matters. If we are not on the border we can imagine the problem of getting to the border. Pick any direction; we can move some distance while staying inside the set. It might be a lot of distance, it might be a tiny bit. But we stay inside however we might move. If we are on the border, then there’s some direction in which any movement, however small, drops us out of the set. That’s a difference in kind between a set that’s open and a set that isn’t.

I say “a set that’s open and a set that isn’t”. There are such things as closed sets. A set doesn’t have to be either open or closed. It can be neither, a set that includes some of its borders but not other parts of it. It can even be both open and closed simultaneously. The whole universe, for example, is both an open and a closed set. The empty set, with nothing in it, is both open and closed. (This looks like a semantic trick. OK, if you’re in the empty set you’re not on its boundary. But you can’t be in the empty set. So what’s going on? … The usual. It makes other work easier if we call the empty set ‘open’. And the extra work we’d have to do to rule out the empty set doesn’t seem to get us anything interesting. So we accept what might be a trick.) The definitions of ‘open’ and ‘closed’ don’t exclude one another.

I’m not sure how this confusing state of affairs developed. My hunch is that the words ‘open’ and ‘closed’ evolved independent of each other. Why do I think this? An open set has its openness from, well, not containing its boundaries; from the inside there’s always a little more to it. A closed set has its closedness from sequences. That is, you can consider a string of points inside a set. Are these points leading somewhere? Is that point inside your set? If a string of points always leads to somewhere, and that somewhere is inside the set, then you have closure. You have a closed set. I’m not sure that the terms were derived with that much thought. But it does explain, at least in terms a mathematician might respect, why a set that isn’t open isn’t necessarily closed.

Back to open sets. What does it mean to not be on the boundary of the set? How do we know if we’re on it? We can define sets by all sorts of complicated rules: complex-valued numbers of size less than five, say. Rational numbers whose denominator (in lowest form) is no more than ten. Points in space from which a satellite dropped would crash into the moon rather than into the Earth or Sun. If we have an idea of distance we could measure how far it is from a point to the nearest part of the boundary. Do we need distance, though?

No, it turns out. We can get the idea of open sets without using distance. Introduce a neighborhood of a point. A neighborhood of a point is an open set that contains that point. It doesn’t have to be small, but that’s the connotation. And we get to thinking of little N-balls, circle or sphere-like constructs centered on the target point. It doesn’t have to be N-balls. But we think of them so much that we might as well say it’s necessary. If every point in a set has a neighborhood around it that’s also inside the set, then the set’s open.

You’re going to accuse me of begging the question. Fair enough. I was using open sets to define open sets. This use is all right for an intuitive idea of what makes a set open, but it’s not rigorous. We can give in and say we have to have distance. Then we have N-balls and we can build open sets out of balls that don’t contain the edges. Or we can try to drive distance out of our idea of open sets.

We can do it this way. Start off by saying the whole universe is an open set. Also that the union of any number of open sets is also an open set. And that the intersection of any finite number of open sets is also an open set. Does this sound weak? So it sounds weak. It’s enough. We get the open sets we were thinking of all along from this.
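
For the record, here’s the standard formal phrasing of those rules. A collection \tau of subsets of a universe X counts as a topology, and its members are the open sets, whenever

X \in \tau \text{ and } \emptyset \in \tau ; \qquad \bigcup_i U_i \in \tau \text{ whenever every } U_i \in \tau ; \qquad U_1 \cap \cdots \cap U_k \in \tau \text{ whenever } U_1, \ldots, U_k \in \tau .

Most presentations list the empty set explicitly, though you can also read it off as the union of an empty family of open sets.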

This works for the sets that look like territories on a map. It also works for sets for which we have some idea of distance, however strange it is to our everyday distances. It even works if we don’t have any idea of distance. This lets us talk about topological spaces, and study what geometry looks like if we can’t tell how far apart two points are. We can, for example, at least tell that two points are different. Can we find a neighborhood of one that doesn’t contain the other? Then we know they’re some distance apart, even without knowing what distance is.

That we reached so abstract an idea of what an open set is without losing the idea’s usefulness suggests we’re doing well. So we are. It also shows why Nicolas Bourbaki, the famous nonexistent mathematician, thought set theory and its related ideas were the core of mathematics. Today category theory is a more popular candidate for the core of mathematics. But set theory is still close to the core, and much of analysis is about what we can know from the fact of sets being open. Open sets let us explain a lot.

How December 2021, The Month I Crashed, Treated My Mathematics Blog


On my humor blog I joked I was holding off on my monthly statistics recaps waiting for December 2021 to get better. What held me back here is more attention- and energy-draining nonsense going on last week. It’s passed without lasting harm, that I know about, though. So I can get back to looking at how things looked here in December.

December was, technically, my most prolific month in the sorry year of 2021. I had twelve articles posted, in a year that mostly saw around five to seven posts a month. But more than half of them were repeats, copying the text of old A-to-Z’s, with a small introduction added. I’ve observed how much my readership seems to depend on the number of posts made, more than anything else. How did this sudden surge affect my statistics? … Here’s how.

Bar chart showing two and a half year's worth of monthly readership totals. The last several months have shown a slow but steady decline.
I can’t wait for the number of followers to roll over to 1,000, so that it’s easy to consider how many people hit ‘follow’ and then never read a word of my writing ever again.

This was another declining month, with the fewest number of page views — 1,946 — and unique visitors — 1,351 — since July 2021. As you’d expect, this was also below the twelve-month running means, of 2,437.7 views from 1,727.8 unique visitors. It’s also below the twelve-month running medians, of 2,436.5 views from 1,742 unique visitors.

I notice, looking at the years going back to 2018, that I’ve seen a readership drop in December each of the last several years. In 2019 my December readership was barely three-fifths the November readership, for example. In 2018 and 2020 readership fell by one-tenth to one-fifth. But those are also years where my A-to-Z was going regularly, and filling whole weeks with publication, in November, with only a few pieces in December. Having December be busier than November is novel.

So I’m curious whether other blogs see a similar November-to-December dropoff. I’m also curious if they have a publishing schedule that makes it easier to find actual patterns through the chaos.

There were 46 things liked in December, which is above the running mean of 40.5 and median of 38.5. There were nine comments given, below that mean of 15.3 and median of 11.5. On the other hand, what much was there to say? (And I appreciate each comment, particularly those of moral support.)

The per-posting numbers, of views and visitors and such, collapsed. I had expected that, since the laconic publishing schedule I settled on drove the per-posting averages way up. The twelve-month running mean of views per posting was 323.4, and median 307.4, for example. December saw 162.2 views per posting. There were a running mean of 228.4 visitors per posting, and median of 219.2 per posting, for the twelve months ending with November 2021. December 2021 saw 112.6 visitors per posting. So those numbers are way down. But they aren’t far off the figures I had in, say, the end of 2020, when I was doing 18 or 19 posts per month.


Might as well list all twelve posts of December, in their descending order of popularity. I’m not surprised the original A-to-Z stuff was most popular. Besides being least familiar, it also came first in the month, so had time to attract page views. Here’s the roster of how the month’s postings ranked.


WordPress credits me with publishing 16,789 words in December, an average of 1,399.1 words per post. That’s not only my most talkative month for 2021; that’s two of my most talkative months. There’s a whole third of the year I didn’t publish that much. This is all inflated by my reposting old articles in their entirety, of course. In past years I would include a pointer to an old A-to-Z essay, but not the whole thing.

This all brings my blog to a total 67,218 words posted for the year. It’s not the second-least-talkative year after all, although I’ll keep its comparisons to other years for a separate post.

At the closing of the year, WordPress figures I’ve posted 1,675 things here. They drew a total 150,883 page views from 90,187 visitors. This isn’t much compared to the first-tier pop-mathematics blogs. But it’s still more people than I could expect to meet in my life. So that’s nice to know about.

And now let’s look ahead to what 2022 is going to bring on all of this. I still intend to finish the Little 2021 Mathematics A-to-Z. Those essays should be at this link when I post them. I may get back to my Reading the Comics posts, as well. We’ll see.

From my Third A-to-Z: Osculating Circle


With the third A-to-Z choice for the letter O, I finally set ortho-ness down. I had thought the letter might become a reference for everything described as ortho-. It has to be acknowledged that two or three examples get you the general idea of what’s meant when something is named ortho-, though.

Must admit, I haven’t, that I remember, ever solved a differential equation using osculating circles instead of, you know, polynomials or sine functions (Fourier series). But references I trust say that would be a way to go.


I’m happy to say it’s another request today. This one’s from HowardAt58, author of the Saving School Math blog. He’s given me some great inspiration in the past.

Osculating Circle.

It’s right there in the name. Osculating. You know what that is from that one Daffy Duck cartoon where he cries out “Greetings, Gate, let’s osculate” while wearing a moustache. Daffy’s imitating somebody there, but goodness knows who. Someday the mystery drives the young you to a dictionary web site. Osculate means kiss. This doesn’t seem to explain the scene. Daffy was imitating Jerry Colonna. That meant something in 1943. You can find him on old-time radio recordings. I think he’s funny, in that 40s style.

Make the substitution. A kissing circle. Suppose it’s not some playground antic one level up from the Kissing Bandit that plagues recess yet one or two levels down from what we imagine we’d do in high school. It suggests a circle that comes really close to something, that touches it a moment, and then goes off its own way.

But then touching. We know another word for that. It’s the root behind “tangent”. Tangent is a trigonometry term. But it appears in calculus too. The tangent line is a line that touches a curve at one specific point and is going in the same direction as the original curve is at that point. We like this because … well, we do. The tangent line is a good approximation of the original curve, at least at the tangent point and for some region local to that. The tangent touches the original curve, and maybe it does something else later on. What could kissing be?

The osculating circle is about approximating an interesting thing with a well-behaved thing. So are similar things with names like “osculating curve” or “osculating sphere”. We need that a lot. Interesting things are complicated. Well-behaved things are understood. We move from what we understand to what we would like to know, often, by an approximation. This is why we have tangent lines. This is why we build polynomials that approximate an interesting function. They share the original function’s value, and its derivative’s value. A polynomial approximation can share many derivatives. If the function is nice enough, and the polynomial big enough, it can be impossible to tell the difference between the polynomial and the original function.

The osculating circle, or sphere, isn’t so concerned with matching derivatives. I know, I’m as shocked as you are. Well, it matches the first and the second derivatives of the original curve. Anything past that, though, it matches only by luck. The osculating circle is instead about matching the curvature of the original curve. The curvature is what you think it would be: it’s how much a function curves. If you imagine looking closely at the original curve and an osculating circle they appear to be two arcs that come together. They must touch at one point. They might touch at others, but that’s incidental.
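If you’d like to see the arithmetic, here is a minimal sketch in Python of finding an osculating circle. The function names and the parabola example are my own, not anything from the original essay; it only uses the standard facts that the radius is the reciprocal of the curvature and the center sits along the normal direction.

def osculating_circle(fp, fpp, x0, y0):
    # Center and radius of the osculating circle of y = f(x) at the point (x0, y0),
    # given the first derivative fp and second derivative fpp there.
    # Assumes fpp is not zero; if it is, the curvature is zero and the "circle"
    # flattens out into the tangent line.
    kappa = fpp / (1.0 + fp**2) ** 1.5      # curvature of a graph y = f(x)
    radius = 1.0 / abs(kappa)
    cx = x0 - fp * (1.0 + fp**2) / fpp      # the center lies along the normal,
    cy = y0 + (1.0 + fp**2) / fpp           # a distance `radius` from the point
    return (cx, cy), radius

# The parabola y = x^2 at the origin: f' = 0 and f'' = 2 there, so the kissing
# circle is centered at (0, 1/2) with radius 1/2.
print(osculating_circle(fp=0.0, fpp=2.0, x0=0.0, y0=0.0))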

Osculating circles, and osculating spheres, sneak out of mathematics and into practical work. This is because we often want to work with things that are almost circles. The surface of the Earth, for example, is not a sphere. But it’s only a tiny bit off. It’s off in ways that you only notice if you are doing high-precision mapping. Or taking close measurements of things in the sky. Sometimes we do this. So we map the Earth locally as if it were a perfect sphere, with curvature exactly what its curvature is at our observation post.

Or we might be observing something moving in orbit. If the universe had only two things in it, and they were the correct two things, all orbits would be simple: they would be ellipses. They would have to be “point masses”, things that have mass without any volume. They never are. They’re always shapes. Spheres would be fine, but they’re never perfect spheres even. The slight difference between a perfect sphere and whatever the things really are affects the orbit. Or the other things in the universe tug on the orbiting things. Or the thing orbiting makes a course correction. All these things make little changes in the orbiting thing’s orbit. The actual orbit of the thing is a complicated curve. The orbit we could calculate is an osculating — well, an osculating ellipse, rather than an osculating circle. Similar idea, though. Call it an osculating orbit if you’d rather.

That osculating circles have practical uses doesn’t mean they aren’t respectable mathematics. I’ll concede they’re not used as much as polynomials or sine curves are. I suppose that’s because polynomials and sine curves have nicer derivatives than circles do. But osculating circles do turn up as ways to try solving nonlinear differential equations. We need the help. Linear differential equations anyone can solve. Nonlinear differential equations are pretty much impossible. They also turn up in signal processing, as ways to find the frequencies of a signal from a sampling of data. This, too, we would like to know.

We get the name “osculating circle” from Gottfried Wilhelm Leibniz. This might not surprise. Finding easy-to-understand shapes that approximate interesting shapes is why we have calculus. Isaac Newton described a way of making them in the Principia Mathematica. This also might not surprise. Of course they would on this subject come so close together without kissing.

From my Second A-to-Z: Orthonormal


For early 2016 — dubbed “Leap Day 2016” as that’s when it started — I got a request to explain orthogonal. I went in a different direction, although not completely different. This essay does get a bit more into specifics of how mathematicians use the idea, like, showing some calculations and such. I put in a casual description of vectors here. For book publication I’d want to rewrite that to be clearer that, like, ordered sets of numbers are just one (very common) way to represent vectors.


Jacob Kanev had requested “orthogonal” for this glossary. I’d be happy to oblige. But I used the word in last summer’s Mathematics A To Z. And I admit I’m tempted to just reprint that essay, since it would save some needed time. But I can do something more.

Orthonormal.

“Orthogonal” is another word for “perpendicular”. Mathematicians use it for reasons I’m not precisely sure of. My belief is that it’s because “perpendicular” sounds like we’re talking about directions. And we want to extend the idea to things that aren’t necessarily directions. As majors, mathematicians learn orthogonality for vectors, things pointing in different directions. Then we extend it to other ideas. To functions, particularly, but we can also define it for spaces and for other stuff.

I was vague, last summer, about how we do that. We do it by creating a function called the “inner product”. That takes in two of whatever things we’re measuring and gives us a real number. If the inner product of two things is zero, then the two things are orthogonal.

The first example mathematics majors learn of this, before they even hear the words “inner product”, is the dot product. Dot products are for vectors, ordered sets of numbers. The dot product we find by matching up numbers in the corresponding slots for the two vectors, multiplying them together, and then adding up the products. For example. Give me the vector with values (1, 2, 3), and the other vector with values (-6, 5, -4). The inner product will be 1 times -6 (which is -6) plus 2 times 5 (which is 10) plus 3 times -4 (which is -12). So that’s -6 + 10 – 12 or -8.

So those vectors aren’t orthogonal. But how about the vectors (1, -1, 0) and (0, 0, 1)? Their dot product is 1 times 0 (which is 0) plus -1 times 0 (which is 0) plus 0 times 1 (which is 0). The vectors are perpendicular. And if you tried drawing this you’d see, yeah, they are. The first vector we’d draw as being inside a flat plane, and the second vector as pointing up, through that plane, like a thumbtack.
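If you would rather let the computer do the multiplying and adding, here is a minimal sketch in Python. The function name is my own choice, not anything standard.

def dot(u, v):
    # Multiply the numbers in matching slots, then add the products up.
    return sum(a * b for a, b in zip(u, v))

print(dot((1, 2, 3), (-6, 5, -4)))   # -6 + 10 - 12, which is -8: not orthogonal
print(dot((1, -1, 0), (0, 0, 1)))    # 0 + 0 + 0, which is 0: orthogonal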

So that’s orthogonal. What about this orthonormal stuff?

Well … the inner product can tell us something besides orthogonality. What happens if we take the inner product of a vector with itself? Say, (1, 2, 3) with itself? That’s going to be 1 times 1 (which is 1) plus 2 times 2 (4, according to rumor) plus 3 times 3 (which is 9). That’s 14, a tidy sum, although, so what?

The inner product of (-6, 5, -4) with itself? Oh, that’s some ugly numbers. Let’s skip it. How about the inner product of (1, -1, 0) with itself? That’ll be 1 times 1 (which is 1) plus -1 times -1 (which is positive 1) plus 0 times 0 (which is 0). That adds up to 2. And now, wait a minute. This might be something.

Start from somewhere. Move 1 unit to the east. (Don’t care what the unit is. Inches, kilometers, astronomical units, anything.) Then move -1 units to the north, or like normal people would say, 1 unit to the south. How far are you from the starting point? … Well, you’re the square root of 2 units away.

Now imagine starting from somewhere and moving 1 unit east, and then 2 units north, and then 3 units straight up, because you found a convenient elevator. How far are you from the starting point? This may take a moment of fiddling around with the Pythagorean theorem. But you’re the square root of 14 units away.

And what the heck, (0, 0, 1). The inner product of that with itself is 0 times 0 (which is zero) plus 0 times 0 (still zero) plus 1 times 1 (which is 1). That adds up to 1. And, yeah, if we go one unit straight up, we’re one unit away from where we started.

The inner product of a vector with itself gives us the square of the vector’s length. At least if we aren’t using some freak definition of inner products and lengths and vectors. And this is great! It means we can talk about the length — maybe better to say the size — of things that maybe don’t have obvious sizes.

Some stuff will have convenient sizes. For example, they’ll have size 1. The vector (0, 0, 1) was one such. So is (1, 0, 0). And you can think of another example easily. Yes, it’s \left(\frac{1}{\sqrt{2}}, -\frac{1}{2}, \frac{1}{2}\right) . (Go ahead, check!)

So by “orthonormal” we mean a collection of things that are orthogonal to each other, and that themselves are all of size 1. It’s a description of both what things are by themselves and how they relate to one another. A thing can’t be orthonormal by itself, for the same reason a line can’t be perpendicular to nothing in particular. But a pair of things might be orthogonal, and they might be the right length to be orthonormal too.
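Here is a minimal sketch, again in Python, of checking whether a small collection of vectors is orthonormal. The example set of three vectors is my own, chosen because it happens to work; the function names aren’t standard vocabulary.

import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def is_orthonormal(vectors, tol=1e-12):
    # Every vector must have size 1, and every distinct pair must be orthogonal.
    for i, u in enumerate(vectors):
        if abs(dot(u, u) - 1.0) > tol:
            return False
        for v in vectors[i + 1:]:
            if abs(dot(u, v)) > tol:
                return False
    return True

s = 1.0 / math.sqrt(2.0)
print(is_orthonormal([(s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)]))  # True
print(is_orthonormal([(1.0, 2.0, 3.0), (-6.0, 5.0, -4.0)]))          # False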

Why do this? Well, the same reasons we always do this. We can impose something like direction onto a problem. We might be able to break up a problem into simpler problems, one in each direction. We might at least be able to simplify the ways different directions are entangled. We might be able to write a problem’s solution as the sum of solutions to a standard set of representative simple problems. This one turns up all the time. And an orthogonal set of something is often a really good choice of a standard set of representative problems.

This sort of thing turns up a lot when solving differential equations. And those often turn up when we want to describe things that happen in the real world. So a good number of mathematicians develop a habit of looking for orthonormal sets.

From my First A-to-Z: Orthogonal


I haven’t had the space yet to finish my Little 2021 A-to-Z, so let me resume playing the hits of past ones. For my first, Summer 2015, one, I picked all the topics myself. This one, Orthogonal, I remember as one of the challenging ones. The challenge was the question put in the first paragraph: why do we have this term, which is so nearly a synonym for “perpendicular”? I didn’t find an answer, then, or since. But I was able to think about how we use “orthogonal” and what it might do that “perpendicular” doesn’t.


Orthogonal.

Orthogonal is another word for perpendicular. So why do we need another word for that?

It helps to think about why “perpendicular” is a useful way to organize things. For example, we can describe the directions to a place in terms of how far it is north-south and how far it is east-west, and talk about how fast it’s travelling in terms of its speed heading north or south and its speed heading east or west. We can separate the north-south motion from the east-west motion. If we’re lucky these motions separate entirely, and we turn a complicated two- or three-dimensional problem into two or three simpler problems. If they can’t be fully separated, they can often be largely separated. We turn a complicated problem into a set of simpler problems with a nice and easy part plus an annoying yet small hard part.

And this is why we like perpendicular directions. We can often turn a problem into several simpler ones describing each direction separately, or nearly so.

And now the amazing thing. We can separate these motions because the north-south and the east-west directions are at right angles to one another. But we can describe something that works like an angle between things that aren’t necessarily directions. For example, we can describe an angle between things like functions that have the same domain. And once we can describe the angle between two functions, we can describe functions that make right angles between each other.

This means we can describe functions as being perpendicular to one another. An example. On the domain of real numbers from -1 to 1, the function f(x) = x is perpendicular to the function g(x) = x^2 . And when we want to study a more complicated function we can separate the part that’s in the “direction” of f(x) from the part that’s in the “direction” of g(x). We can treat functions, even functions we don’t know, as if they were locations in space. And we can study and even solve for the different parts of the function as if we were pinning down the north-south and the east-west movements of a thing.
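In case you want to see the “angle between functions” idea in action: the usual inner product for functions on a domain is the integral of their product over that domain. Here is a crude numerical sketch in Python; the midpoint-rule estimate and all the names are my own choices.

def inner_product(f, g, a=-1.0, b=1.0, n=10_000):
    # A midpoint-rule estimate of the integral of f(x) * g(x) over [a, b].
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

f = lambda x: x
g = lambda x: x**2
print(inner_product(f, g))  # essentially zero: x and x^2 are "perpendicular" on [-1, 1]
print(inner_product(f, f))  # about 2/3, the squared "length" of f on that domain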

So if we want to study, say, how heat flows through a body, we can work out a series of “directions” for functions, and work out the flow in each of those “directions”. These don’t have anything to do with left-right or up-down directions, but the concepts and the convenience are similar.

I’ve spoken about this in terms of functions. But we can define the “angle” between things for many kinds of mathematical structures. Once we can do that, we can have “perpendicular” pairs of things. I’ve spoken only about functions, but that’s because functions are more familiar than many of the mathematical structures that have orthogonality.

Ah, but why call it “orthogonal” rather than “perpendicular”? And I don’t know. The best I can work out is that it feels weird to speak of, say, the cosine function being “perpendicular” to the sine function when you can’t really say either is in any particular direction. “Orthogonal” seems to appeal less directly to physical intuition while still meaning something. But that’s my guess, rather than the verdict of a skilled etymologist.

78 Pages and More of Arithmetic Trivia About 2022


2022 is a new and, we all hope, less brutal year. It is also a number, though, an integer. And every integer has some interesting things about it. Iva Sallay, of the Find The Factors recreational mathematics blog, assembled an awesome list of trivia about the number. This includes a bunch of tweets about the number’s interesting mathematical properties. At least some of them are sure to surprise you.

If that is not enough, then, please consider something which Christian Lawson-Perfect noted on Mathstodon. It is 78 pages titled Mathematical Beauty of 2022, by Dr Inder J Taneja. If the name sounds faintly familiar it might be that I’ve mentioned Taneja’s work before, in recreational arithmetic projects.

None of this trivia may matter. But there is some value in finding cute and silly things. Verifying, or discovering, cute trivia about a number helps you learn how to spot patterns and learn to look for new ones. And it’s good to play some.

From my Seventh A-to-Z: Tiling (the accidental remake)


For the 2020 A-to-Z I took the suggestion to write about tiling. It’s a fun field with many interesting wrinkles. And I realized after publishing that I had already written about Tiling, just two years before. There was no scrambling together a replacement essay, so I had to let it stand as is.

The accidental remake allows for some interesting studies, though. The two essays have very similar structures, which probably reflects that I came to both essays with similar rough ideas what to write, and went to similar sources to fill in details. The second essay turned out longer. Also, I think, better. I did a bit more tracking down specifics, such as trying to find Hao Wang’s paper and see just what it says. And rewriting is often key to good writing. This offers lessons in preparing these essays for book publication.


Mr Wu, author of the Singapore Maths Tuition blog, had an interesting suggestion for the letter T: Talent. As in mathematical talent. It’s a fine topic but, in the end, too far beyond my skills. I could share some of the legends about mathematical talent I’ve received. But what that says about the culture of mathematicians is a deeper and more important question.

So I picked my own topic for the week. I do have topics for next week — U — and the week after — V — chosen. But the letters W and X? I’m still open to suggestions. I’m open to creative or wild-card interpretations of the letters. Especially for X and (soon) Z. Thanks for sharing any thoughts you care to.

Color cartoon illustration of a coati in a beret and neckerchief, holding up a director's megaphone and looking over the Hollywood hills. The megaphone has the symbols + x (division obelus) and = on it. The Hollywood sign is, instead, the letters MATHEMATICS. In the background are spotlights, with several of them crossing so as to make the letters A and Z; one leg of the spotlights has 'TO' in it, so the art reads out, subtly, 'Mathematics A to Z'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Tiling.

Think of a floor. Imagine you are bored. What do you notice?

What I hope you notice is that it is covered. Perhaps by carpet, or concrete, or something homogeneous like that. Let’s ignore that. My floor is covered in small pieces, repeated. My dining room floor is slats of wood, about three and a half feet long and two inches wide. The slats are offset from the neighbors so there’s a pleasant strong line in one direction and stippled lines in the other. The kitchen is squares, one foot on each side. This is a grid we could plot high school algebra functions on. The bathroom is more elaborate. It has white rectangles about two inches long, tan rectangles about two inches long, and black squares. Each rectangle is perpendicular to ones of the other color, and arranged to bisect those. The black squares fill the gaps where no rectangle would fit.

Move from my house to pure mathematics. It’s easy to turn the floor of a room into abstract mathematics. We start with something to tile. Usually this is the infinite, two-dimensional plane. The thing you get if you have a house and forget the walls. Sometimes we look to tile the hyperbolic plane, a different geometry that we of course represent with a finite circle. (Setting particular rules about how to measure distance makes this equivalent to a funny-shaped plane.) Or the surface of a sphere, or of a torus, or something like that. But if we don’t say otherwise, it’s the plane.

What to cover it with? … Smaller shapes. We have a mathematical tiling if we have a collection of not-overlapping open sets. And if those open sets, plus their boundaries, cover the whole plane. “Cover” here means what “cover” means in English, only using more technical words. These sets — these tiles — can be any shape. We can have as many or as few of them as we like. We can even add markings to the tiles, give them colors or patterns or such, to add variety to the puzzles.

(And if we want, we can do this in other dimensions. There are good “tiling” questions to ask about how to fill a three-dimensional space, or a four-dimensional one, or more.)

Having an unlimited collection of tiles is nice. But mathematicians learn to look for how little we need to do something. Here, we look for the smallest number of distinct shapes. As with tiling an actual floor, we can get all the tiles we need. We can rotate them, too, to any angle. We can flip them over and put the “top” side “down”, something kitchen tiles won’t let us do. Can we reflect them? Use the shape we’d get looking at the mirror image of one? That’s up to whoever’s writing this paper.

What shapes will work? Well, squares, for one. We can prove that by looking at a sheet of graph paper. Rectangles would work too. We can see that by drawing boxes around the squares on our graph paper. Two-by-one blocks, three-by-two blocks, 40-by-1 blocks, these all still cover the paper and we can imagine covering the plane. If we like, we can draw two-by-two squares. Squares made up of smaller squares. Or repeat this: draw two-by-one rectangles, and then group two of these rectangles together to make two-by-two squares.

We can take it on faith that, oh, rectangles π long by e wide would cover the plane too. These can all line up in rows and columns, the way our squares would. Or we can stagger them, like bricks or my dining room’s wood slats are.

How about parallelograms? Those, it turns out, tile exactly as well as rectangles or squares do. Grids or staggered, too. Ah, but how about trapezoids? Surely they won’t tile anything. Not generally, anyway. The slanted sides will, most of the time, only fit in weird winding circle-like paths.

Unless … take two of these trapezoid tiles. We’ll set them down so the parallel sides run horizontally in front of you. Rotate one of them, though, 180 degrees. And try setting them — let’s say so the longer slanted line of both trapezoids meet, edge to edge. These two trapezoids come together. They make a parallelogram, although one with a slash through it. And we can tile parallelograms, whether or not they have a slash.

OK, but if you draw some weird quadrilateral shape, and it’s not anything that has a more specific name than “quadrilateral”? That won’t tile the plane, will it?

It will! In one of those turns that surprises and impresses me every time I run across it again, any quadrilateral can tile the plane. It opens up so many home decorating options, if you get in good with a tile maker.

That’s some good news for quadrilateral tiles. How about other shapes? Triangles, for example? Well, that’s good news too. Take two of any identical triangle you like. Turn one of them around and match sides of the same length. The two triangles, bundled together like that, are a quadrilateral. And we can use any quadrilateral to tile the plane, so we’re done.

How about pentagons? … With pentagons, the easy times stop. It turns out not every pentagon will tile the plane. The pentagon has to be of the right kind to make it fit. If the pentagon is in one of these kinds, it can tile the plane. If not, not. There are fifteen families of convex pentagons known to tile the plane. The most recent family was discovered in 2015. It’s thought that there are no other convex pentagon tilings. I don’t know whether the proof of that is generally accepted in tiling circles. And we can do more tilings if the pentagon doesn’t need to be convex. For example, we can cut any parallelogram into two identical pentagons. So we can make as many pentagons as we want to cover the plane. But we can’t assume any pentagon we like will do it.

Hexagons look promising. First, a regular hexagon tiles the plane, as strategy games know. There are also at least three families of irregular hexagons that we know can tile the plane.

And there the good times end. There are no convex heptagons or octagons or any other shape with more sides that tile the plane.

Not by themselves, anyway. If we have more than one tile shape we can start doing fine things again. Octagons assisted by squares, for example, will tile the plane. I’ve lived places with that tiling. Or something that looks like it. It’s easier to install if you have square tiles with an octagon pattern making up the center, and triangle corners a different color. These squares come together to look like octagons and squares.

And this leads to a fun avenue of tiling. Hao Wang, in the early 60s, proposed a sort of domino-like tiling. You may have seen these in mathematics puzzles, or in toys. Each of these Wang Tiles, or Wang Dominoes, is a square. But the square is cut along the diagonals, into four quadrants. Each quadrant is a right triangle. Each quadrant, each triangle, is one of a finite set of colors. Adjacent triangles can have the same color. You can place down tiles, subject only to the rule that the tile edge has to have the same color on both sides. So a tile with a blue right-quadrant has to have on its right a tile with a blue left-quadrant. A tile with a white upper-quadrant on its top has, above it, a tile with a white lower-quadrant.
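Here is a minimal sketch, in Python, of that matching rule. Representing a tile as a tuple of its four quadrant colors, read (top, right, bottom, left), is my own ad-hoc choice; the point is only to show what “the tile edge has to have the same color on both sides” means in practice.

def ok_horizontally(left_tile, right_tile):
    # The right quadrant of the left tile must match the left quadrant of its neighbor.
    return left_tile[1] == right_tile[3]

def ok_vertically(upper_tile, lower_tile):
    # The bottom quadrant of the upper tile must match the top quadrant of the tile below.
    return upper_tile[2] == lower_tile[0]

def valid_placement(grid):
    # Check every adjacent pair in a rectangular grid of tiles.
    for r, row in enumerate(grid):
        for c, tile in enumerate(row):
            if c + 1 < len(row) and not ok_horizontally(tile, row[c + 1]):
                return False
            if r + 1 < len(grid) and not ok_vertically(tile, grid[r + 1][c]):
                return False
    return True

tile = ('white', 'blue', 'white', 'blue')
print(valid_placement([[tile, tile]]))  # True: a blue right edge meets a blue left edge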

In 1961 Wang conjectured that if a finite set of these tiles will tile the plane, then there must be a periodic tiling. That is, if you picked up the plane and slid it a set horizontal and vertical distance, it would all look the same again. This sort of translation is common. All my floors do that. If we ignore things like the bounds of their rooms, or the flaws in their manufacture or installation or where a tile broke in some mishap.

This is not to say you couldn’t arrange them aperiodically. You don’t even need Wang Tiles for that. Get two colors of square tile, a white and a black, and lay them down based on whether the next decimal digit of π is odd or even. No; Wang’s conjecture was that if you had tiles that you could lay down aperiodically, then you could also arrange them to set down periodically. With the black and white squares, lay down alternate colors. That’s easy.

In 1964, Robert Berger proved Wang’s conjecture was false. He found a collection of Wang Tiles that could only tile the plane aperiodically. In 1966 he published this in the Memoirs of the American Mathematical Society. The 1964 proof was for his thesis. 1966 was its general publication. I mention this because while doing research I got irritated at how different sources dated this to 1964, 1966, or sometimes 1961. I want to have this straightened out. It appears Berger had the proof in 1964 and the publication in 1966.

I would like to share details of Berger’s proof, but haven’t got access to the paper. What fascinates me about this is that Berger’s proof used a set of 20,426 different tiles. I assume he did not work this all out with shards of construction paper, but then, how to get 20,426 of anything? With computer time as expensive as it was in 1964? The mystery of how he got all these tiles is worth an essay of its own, and I regret I can’t write it.

Berger conjectured that a smaller set might do. Quite so. He himself reduced the set to 104 tiles. Donald Knuth in 1968 modified the set down to 92 tiles. In 2015 Emmanuel Jeandel and Michael Rao published a set of 11 tiles, using four colors. And showed by computer search that a smaller set of tiles, or fewer colors, would not force some aperiodic tiling to exist. I do not know whether there might be other sets of 11, four-colored, tiles that work. You can see the set at the top of Wikipedia’s page on Wang Tiles.

These Wang Tiles, all squares, inspired variant questions. Could there be other shapes that only aperiodically tile the plane? What if they don’t have to be squares? Raphael Robinson, in 1971, came up with a tiling using six shapes. The shapes have patterns on them too, usually represented as colored lines. Tiles can be put down only in ways that fit and that make the lines match up.

Among my readers are people who have been waiting, for 1800 words now, for Roger Penrose. It’s now that time. In 1974 Penrose published an aperiodic tiling, one based on pentagons and using a set of six tiles. You’ve never heard of that either, because soon after he found a different set, based on a quadrilateral cut into two shapes. The shapes, as with Wang Tiles or Robinson’s tiling, have rules about what edges may be put against each other. Penrose — and independently Robert Ammann — also developed another set, this based on a pair of rhombuses. These have rules about what edges may touch one another, and have patterns on them which must line up.

The Penrose tiling became, and stayed, famous. (Ammann, an amateur, never had much to do with the mathematics community. He died in 1994.) Martin Gardner publicized it, and it leapt out of mathematicians’ hands into the popular culture. At least a bit. That it could give you nice-looking floors must have helped.

To show that the rhombus-based Penrose tiling is aperiodic takes some arguing. But it uses tools already used in this essay. Remember drawing rectangles around several squares? And then drawing squares around several of these rectangles? We can do that with these Penrose-Ammann rhombuses. From the rhombus tiling we can draw bigger rhombuses. Ones which, it turns out, follow the same edge rules that the originals do. So that we can go again, grouping these bigger rhombuses into even-bigger rhombuses. And into even-even-bigger rhombuses. And so on.

What this gets us is this: suppose the rhombus tiling is periodic. Then there’s some finite-distance horizontal-and-vertical move that leaves the pattern unchanged. So, the same finite-distance move has to leave the bigger-rhombus pattern unchanged. And this same finite-distance move has to leave the even-bigger-rhombus pattern unchanged. Also the even-even-bigger pattern unchanged.

Keep bundling rhombuses together. You get eventually-big-enough-rhombuses. Now, think of how far you have to move the tiles to get a repeat pattern. Especially, think how many eventually-big-enough-rhombuses it is. This distance, the move you have to make, is less than one eventually-big-enough rhombus. (If it’s not you aren’t eventually-big-enough yet. Bundle them together again.) And that doesn’t work. Moving one tile over without changing the pattern makes sense. Moving one-half a tile over? That doesn’t. So the eventually-big-enough pattern can’t be periodic, and so, the original pattern can’t be either. This is explained in graphic detail in a nice PowerPoint slide set from Professor Alexander F Ritter, A Tour Of Tilings In Thirty Minutes.

It’s possible to do better. In 2010 Joshua E S Socolar and Joan M Taylor published a single tile that can force an aperiodic tiling. As with the Wang Tiles, and Robinson shapes, and the Penrose-Ammann rhombuses, markings are part of it. They have to line up so that the markings — in two colors, in the renditions I’ve seen — make sense. With the Penrose tilings, you can get away from the pattern rules for the edges by replacing them with little notches. The Socolar-Taylor shape can make a similar trade. Here the rules are complex enough that it would need to be a three-dimensional shape, one that looks like the dilithium housing of the warp core. You can see the tile — in colored, marked form, and also in three-dimensional tile shape — at the PDF here. It’s likely not coming to the flooring store soon.

It’s all wonderful, but is it useful? I could go on a few hundred words about, particularly, crystals and quasicrystals. These are important for materials science. Especially these days as we have harnessed slightly-imperfect crystals to be our computers. I don’t care. These are lovely to look at. If you see nothing appealing in a great heap of colors and polygons spread over the floor there are things we cannot communicate about. Tiling is a delight; what more do you need?


Thanks for your attention. This and all of my 2020 A-to-Z essays should be at this link. All the essays from every A-to-Z series should be at this link. See you next week, I hope.

From my Sixth A-to-Z: Taylor Series


By the time of 2019 and my sixth A-to-Z series, I had some standard narrative tricks I could deploy. My insistence that everything is polynomials, for example. Anecdotes from my slight academic career. A prose style that emphasizes what we do with the idea of something rather than instructions. That last comes from the idea that if you wanted to know how to compute a Taylor series you’d just look it up on Mathworld or Wikipedia or whatnot. The thing a pop mathematics blog can do is give some reason that you’d want to know how to compute a Taylor series. I regret talking about functions that break Taylor series, though. I have to treat these essays as introducing the idea of a Taylor series to someone who doesn’t know anything about them. And it’s bad form to teach how stuff doesn’t work too close to teaching how it does work. Readers tend to blur what works and what doesn’t together. Still, f(x) = \exp(-\frac{1}{x^2}) is a really neat weird function and it’d be a shame to let it go completely unmentioned.


Today’s A To Z term was nominated by APMA, author of the Everybody Makes DATA blog. It was a topic that delighted me to realize I could explain. Then it started to torment me as I realized there is a lot to explain here, and I had to pick something. So here’s where things ended up.

Cartoony banner illustration of a coati, a raccoon-like animal, flying a kite in the clear autumn sky. A skywriting plane has written 'MATHEMATIC A TO Z'; the kite, with the letter 'S' on it to make the word 'MATHEMATICS'.
Art by Thomas K Dye, creator of the web comics Projection Edge, Newshounds, Infinity Refugees, and Something Happens. He’s on Twitter as @projectionedge. You can get to read Projection Edge six months early by subscribing to his Patreon.

Taylor Series.

In the mid-2000s I was teaching at a department being closed down. In its last semester I had to teach Computational Quantum Mechanics. The person who’d normally taught it had transferred to another department. But a few last majors wanted the old department’s version of the course, and this pressed me into the role. Teaching a course you don’t really know is a rush. It’s a semester of learning, and trying to think deeply enough that you can convey something to students. This while all the regular demands of the semester eat your time and working energy. And this in the leap of faith that the syllabus you made up, before you truly knew the subject, will be nearly enough right. And that you have not committed to teaching something you do not understand.

So around mid-course I realized I needed to explain finding the wave function for a hydrogen atom with two electrons. The wave function is this probability distribution. You use it to find things like the probability a particle is in a certain area, or has a certain momentum. Things like that. A proton with one electron is as much as I’d ever done, as a physics major. We treat the proton as the center of the universe, immobile, and the electron hovers around that somewhere. Two electrons, though? A thing repelling your electron, and repelled by your electron, and neither of those having fixed positions? What the mathematics of that must look like terrified me. When I couldn’t procrastinate it farther I accepted my doom and read exactly what it was I should do.

It turned out I had known what I needed for nearly twenty years already. Got it in high school.

Of course I’m discussing Taylor Series. The equations were loaded down with symbols, yes. But at its core, the important stuff, was this old and trusted friend.

The premise behind a Taylor Series is even older than that. It’s universal. If you want to do something complicated, try doing the simplest thing that looks at all like it. And then make that a little bit more like you want. And then a bit more. Keep making these little improvements until you’ve got it as right as you truly need. Put that vaguely, the idea describes Taylor series just as well as it describes making a video game or painting a state portrait. We can make it more specific, though.

A series, in this context, means the sum of a sequence of things. This can be finitely many things. It can be infinitely many things. If the sum makes sense, we say the series converges. If the sum doesn’t, we say the series diverges. When we first learn about series, the sequences are all numbers. 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots , for example, which diverges. (It adds to a number bigger than any finite number.) Or 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots , which converges. (It adds to \frac{1}{6}\pi^2 .)

In a Taylor Series, the terms are all polynomials. They’re simple polynomials. Let me call the independent variable ‘x’. Sometimes it’s ‘z’, for the reasons you would expect. (‘x’ usually implies we’re looking at real-valued functions. ‘z’ usually implies we’re looking at complex-valued functions. ‘t’ implies it’s a real-valued function with an independent variable that represents time.) Each of these terms is simple. Each term is the distance between x and a reference point, raised to a whole power, and multiplied by some coefficient. The reference point is the same for every term. What makes this potent is that we use, potentially, many terms. Infinitely many terms, if need be.

Call the reference point ‘a’. Or if you prefer, x0. z0 if you want to work with z’s. You see the pattern. This ‘a’ is the “point of expansion”. The coefficients of each term depend on the original function at the point of expansion. The coefficient for the term that has (x - a) is the first derivative of f, evaluated at a. The coefficient for the term that has (x - a)^2 is the second derivative of f, evaluated at a (times a number that’s the same for the squared-term for every Taylor Series). The coefficient for the term that has (x - a)^3 is the third derivative of f, evaluated at a (times a different number that’s the same for the cubed-term for every Taylor Series).
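Written out in symbols, the pattern of the last paragraph is the standard Taylor series formula; the factorials in the denominators are those “numbers that are the same for every Taylor Series”:

f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f'''(a)}{3!}(x - a)^3 + \cdots = \sum_{n = 0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n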

You’ll never guess what the coefficient for the term with (x - a)^{122,743} is. Nor will you ever care. The only reason you would wish to is to answer an exam question. The instructor will, in that case, have a function that’s either the sine or the cosine of x. The point of expansion will be 0, \frac{\pi}{2} , \pi , or \frac{3\pi}{2} .

Otherwise you will trust that this is one of the terms of (x - a)^n , ‘n’ representing some counting number too great to be interesting. All the interesting work will be done with the Taylor series either truncated to a couple terms, or continued on to infinitely many.

What a Taylor series offers is the chance to approximate a function we’re genuinely interested in with a polynomial. This is worth doing, usually, because polynomials are easier to work with. They have nice analytic properties. We can automate taking their derivatives and integrals. We can set a computer to calculate their value at some point, if we need that. We might have no idea how to start calculating the logarithm of 1.3. We certainly have an idea how to start calculating 0.3 - \frac{1}{2}(0.3^2) + \frac{1}{3}(0.3^3) . (Yes, it’s 0.3. I’m using a Taylor series with a = 1 as the point of expansion.)
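If you want to check how good those three terms are, here is a quick Python sketch; the comparison against the math library’s logarithm is my own addition.

import math

x = 0.3  # expanding the logarithm around a = 1, so x is the distance from 1 to 1.3
approx = x - x**2 / 2 + x**3 / 3
print(approx)          # 0.264
print(math.log(1.3))   # about 0.26236
# Three terms already land within about 0.002 of the true value.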

The first couple terms tell us interesting things. Especially if we’re looking at a function that represents something physical. The first two terms tell us where an equilibrium might be. The next term tells us whether an equilibrium is stable or not. If it is stable, it tells us how perturbations, points near the equilibrium, behave.

The first couple terms will describe a line, or a quadratic, or a cubic, some simple function like that. Usually adding more terms will make this Taylor series approximation a better fit to the original. There might be a larger region where the polynomial and the original function are close enough. Or the difference between the polynomial and the original function will be closer together on the same old region.

We would really like that region to eventually grow to the whole domain of the original function. We can’t count on that, though. Roughly, the interval of convergence will stretch from ‘a’ to wherever the first weird thing happens. Weird things are, like, discontinuities. Vertical asymptotes. Anything you don’t like dealing with in the original function, the Taylor series will refuse to deal with. Outside that interval, the Taylor series diverges and we just can’t use it for anything meaningful. Which is almost supernaturally weird of them. The Taylor series uses information about the original function, but it’s all derivatives at a single point. Somehow the derivatives of, say, the logarithm of x around x = 1 give a hint that the logarithm of 0 is undefinable. And so they won’t help us calculate the logarithm of 3.

Things can be weirder. There are functions that just break Taylor series altogether. Some are obvious. A function needs lots of derivatives at a point to have a good Taylor series approximation. So, many fractal curves won’t have a Taylor series approximation. These curves are all corners, points where they aren’t continuous or where derivatives don’t exist. Some are obviously designed to break Taylor series approximations. We can make a function that follows different rules if x is rational than if x is irrational. There’s no approximating that, and you’d blame the person who made such a function, not the Taylor series. It can be subtle. The function defined by the rule f(x) = \exp\left(-\frac{1}{x^2}\right) , with the note that if x is zero then f(x) is 0, seems to satisfy everything we’d look for. It’s a function that’s mostly near 1, that drops down to being near zero around where x = 0. But its Taylor series expansion around a = 0 is a horizontal line always at 0. The interval of convergence can be a single point, challenging our idea of what an interval is.
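You can get a feel for how flat this function is near zero with a few evaluations. This is a quick sketch in Python, nothing more; the sample points are arbitrary.

import math

def f(x):
    # The function from the essay: exp(-1/x^2), patched to equal 0 at x = 0.
    return 0.0 if x == 0 else math.exp(-1.0 / x**2)

for x in (0.5, 0.2, 0.1, 0.05):
    print(x, f(x))
# f(0.1) is about 3.7e-44 and f(0.05) about 1.9e-174: so close to zero that every
# derivative at 0 works out to zero, which is why the Taylor series there is the zero
# polynomial, even though f(2) is about 0.78 and nowhere near zero.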

That’s all right. If we can trust that we’re avoiding weird parts, Taylor series give us an outstanding new tool. Grant that the Taylor series describes a function with the same rule as our original function. The Taylor series is often easier to work with, especially if we’re working on differential equations. We can automate, or at least find formulas for, taking the derivative of a polynomial. Or adding together derivatives of polynomials. Often we can attack a differential equation too hard to solve otherwise by supposing the answer is a polynomial. This is essentially what that quantum mechanics problem used, and why the tool was so familiar when I was in a strange land.

Roughly. What I was actually doing was treating the function I wanted as a power series. This is, like the Taylor series, the sum of a sequence of terms, all of which are (x - a)^n times some coefficient. What makes it not a Taylor series is that the coefficients weren’t the derivatives of any function I knew to start. But the experience of Taylor series trained me to look at functions as things which could be approximated by polynomials.

This gives us the hint to look at other series that approximate interesting functions. We get a host of these, with names like Laurent series and Fourier series and Chebyshev series and such. Laurent series look like Taylor series but we allow powers to be negative integers as well as positive ones. Fourier series do away with polynomials. They instead use trigonometric functions, sines and cosines. Chebyshev series build on polynomials, but not on pure powers. They’ll use orthogonal polynomials. These behave like perpendicular directions do. That orthogonality makes many numerical techniques behave better.

The Taylor series is a great introduction to these tools. Its first several terms have good physical interpretations. Its calculation requires tools we learn early on in calculus. The habits of thought it teaches guide us even in unfamiliar territory.


And I feel very relieved to be done with this. I often have a few false starts to an essay, but those are mostly before I commit words to text editor. This one had about four branches that now sit in my scrap file. I’m glad to have a deadline forcing me to just publish already.

Thank you, though. This and the essays for the Fall 2019 A to Z should be at this link. Next week: the letters U and V. And all past A to Z essays ought to be at this link.

From my Fifth A-to-Z: Tiling (the first time)


I keep saying in picking A-to-Z topics that just because I don’t take a suggestion now doesn’t mean I won’t in the future. 2018’s A-to-Z I notice includes Mr Wu’s suggestion of “torus”. I didn’t take it then, but did get to it in this year’s little project. I’m glad to have the proof my word is good. I have thought sometime I might fill a gap in my inspiration by taking topics I hadn’t used in A-to-Z’s (I’ve kept lists) and doing them. I’d just need a catchy name for the set of essays.


For today’s A to Z topic I again picked one nominated by aajohannas. This after I realized I was falling into a never-ending research spiral on Mr Wu of Mathtuition’s suggested “torus”. I do have an older essay describing the torus, as a set. But that does leave out a lot of why a torus is interesting. Well, we’ll carry on.

Cartoon of a thinking coati (it's a raccoon-like animal from Latin America); beside him are spelled out on Scrabble titles, 'MATHEMATICS A TO Z', on a starry background. Various arithmetic symbols are constellations in the background.
Art by Thomas K Dye, creator of the web comics Newshounds, Something Happens, and Infinity Refugees. His current project is Projection Edge. And you can get Projection Edge six months ahead of public publication by subscribing to his Patreon. And he’s on Twitter as @Newshoundscomic.

Tiling.

Here is a surprising thought for the next time you consider remodeling the kitchen. It’s common to tile the floor. Perhaps some of the walls behind the counter. What patterns could you use? And there are infinitely many possibilities. You might leap ahead of me and say, yes, but they’re all boring. A tile that’s eight inches square is different from one that’s twelve inches square and different from one that’s 12.01 inches square. Fine. Let’s allow that all square tiles are “really” the same pattern. The only difference between a square two feet on a side and a square half an inch on a side is how much grout you have to deal with. There are still infinitely many possibilities.

You might still suspect me of being boring. Sure, there’s a rectangular tile that’s, say, six inches by eight inches. And one that’s six inches by nine inches. Six inches by ten inches. Six inches by one millimeter. Yes, I’m technically right. But I’m not interested in that. Let’s allow that all rectangular tiles are “really” the same pattern. So we have “squares” and “rectangles”. There are still infinitely many tile possibilities.

Let me shorten the discussion here. Draw a quadrilateral. One that doesn’t intersect itself. That is, there’s four corners, four lines, and there’s no X crossings. If you have that, then you have a tiling. Get enough of these tiles and arrange them correctly and you can cover the plane. Or the kitchen floor, if you have a level floor. It might not be obvious how to do it. You might have to rotate alternating tiles, or set them in what seem like weird offsets. But you can do it. You’ll need someone to make the tiles for you, if you pick some weird pattern. I hope I live long enough to see it become part of the dubious kitchen package on junk home-renovation shows.

Let me broaden the discussion here. What do I mean by a tiling if I’m allowing any four-sided figure to be a tile? We start with a surface. Usually the plane, a flat surface stretching out infinitely far in two dimensions. The kitchen floor, or any other mere mortal surface, approximates this. But the floor stops at some point. That’s all right. The ideas we develop for the plane work all right for the kitchen. There’s some weird effects for the tiles that get too near the edges of the room. We don’t need to worry about them here. The tiles are some collection of open sets. No two tiles overlap. The tiles, plus their boundaries, cover the whole plane. That is, every point on the plane is either inside exactly one of the open sets, or it’s on the boundary between one (or more) sets.

There isn’t a requirement that all these sets have the same shape. We usually do, and will limit our tiles to one or two shapes endlessly repeated. It seems to appeal to our aesthetics and our installation budget. Using a single pattern allows us to cover the plane with triangles. Any triangle will do. Similarly any quadrilateral will do. For convex pentagonal tiles — here things get weird. There are fourteen known families of pentagons that tile the plane. Each member of the family looks about the same, but there’s some room for variation in the sides. Plus there’s one more special case that can tile the plane, but only that one shape, with no variation allowed. We don’t know if there’s a sixteenth pattern. But then until 2015 we didn’t know there was a 15th, and that was the first pattern found in thirty years. Might be an opening for someone with a good eye for doodling.

There are also exciting opportunities in convex hexagons. Anyone who plays strategy games knows a regular hexagon will tile the plane. (Regular hexagonal tilings fit a certain kind of strategy game well. Particularly they imply an equal distance between the centers of any adjacent tiles. Square and triangular tiles don’t guarantee that. This can imply better balance for territory-based games.) Irregular hexagons will, too. There are three known families of irregular hexagons that tile the plane. You can treat the regular hexagon as a special case of any of these three families. No one knows if there’s a fourth family. Ready your notepad at the next overlong, agenda-less meeting.

There aren’t tilings for identical convex heptagons, figures with seven sides. Nor eight, nor nine, nor any higher figure. You can tile the plane with such shapes if you allow non-convex figures. See any Tetris game where you keep getting the ‘s’ or ‘t’ shapes. And you can tile the plane if you use several shapes together.

There’s some guidance if you want to create your own periodic tilings. I see it called the Conway Criterion. I don’t know the field well enough to say whether that is a common term. It could be something one mathematics popularizer thought of and that other popularizers imitated. (I don’t find “Conway Criterion” on the Mathworld glossary, but that isn’t definitive.) Suppose your polygon satisfies a couple of rules about the shapes of the edges. The rules are given in that link earlier in this paragraph. If your shape does, then it’ll be able to tile the plane. If you don’t satisfy the rules, don’t despair! It might yet. The Conway Criterion tells you when some shape will tile the plane. It won’t tell you that something won’t.

(The name “Conway” may nag at you as familiar from somewhere. This criterion is named for John H Conway, who’s famous for a bunch of work in knot theory, group theory, and coding theory. And in popular mathematics for the “Game of Life”. This is a set of rules on a grid of numbers. The rules say how to calculate a new grid, based on this first one. Iterating them, creating grid after grid, can make patterns that seem far too complicated to be implicit in the simple rules. Conway also developed an algorithm to calculate the day of the week, in the Gregorian calendar. It is difficult to explain to the non-calendar fan how great this sort of thing is.)
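If you have never seen the “grid of numbers” rules written down, here is a minimal sketch in Python of one update step, using the standard Game of Life rules (a live cell with two or three live neighbors survives; a dead cell with exactly three live neighbors comes alive). The representation by a set of coordinates is my own choice.

from collections import Counter

def life_step(live):
    # One Game of Life update; `live` is a set of (row, column) cells that are alive.
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    return {cell for cell, n in neighbor_counts.items()
            if n == 3 or (n == 2 and cell in live)}

# A "blinker": three cells in a row flip between horizontal and vertical forever.
print(life_step({(0, 0), (0, 1), (0, 2)}))  # the cells (-1, 1), (0, 1), (1, 1)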

This has all gotten to periodic tilings. That is, these patterns might be complicated. But if need be, we could get them printed on a nice square tile and cover the floor with that. Almost as beautiful and much easier to install. Are there tilings that aren’t periodic? Aperiodic tilings?

Well, sure. Easily. Take a bunch of tiles with a right angle, and two 45-degree angles. Put any two together and you have a square. So you’re “really” tiling squares that happen to be made up of a pair of triangles. Each pair, toss a coin to decide whether you put the diagonal as a forward or backward slash. Done. That’s not a periodic tiling. Not unless you had a weird run of luck on your coin tosses.
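Here is a throwaway Python sketch of that coin-toss construction, drawing each square as a forward or backward slash; the text-art representation is my own shortcut.

import random

def random_slash_tiling(rows, cols):
    # Each square tile is a pair of right triangles; the coin toss picks the diagonal.
    # Almost surely, the resulting pattern never repeats periodically.
    return [''.join(random.choice('/\\') for _ in range(cols)) for _ in range(rows)]

for row in random_slash_tiling(4, 16):
    print(row)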

All right, but is that just a technicality? We could have easily installed this periodically and we just added some chaos to make it “not work”. Can we use a finite number of different kinds of tiles, and have it be aperiodic however much we try to make it periodic? And through about 1966 mathematicians would have mostly guessed that no, you couldn’t. If you had a set of tiles that would cover the plane aperiodically, there was also some way to do it periodically.

And then in 1966 came a surprising result. No, not Penrose tiles. I know you want me there. I’ll get there. Not there yet though. In 1966 Robert Berger — who also attended Rensselaer Polytechnic Institute, thank you — discovered such a tiling. It’s aperiodic, and it can’t be made periodic. Why do we know Penrose Tiles rather than Berger Tiles? Couple reasons, including that Berger has to use 20,426 distinct tile shapes. In 1971 Raphael M Robinson simplified matters a bit and got that down to six shapes. Roger Penrose in 1974 squeezed the set down to two, although by adding some rules about what edges may and may not touch one another. (You can turn this into a pure edges thing by putting notches into the shapes.) That really caught the public imagination. It’s got simplicity and accessibility to combine with beauty. Aperiodic tiles seem to relate to “quasicrystals”, which are what the name suggests and do happen in some materials. And they’ve got beauty. Aperiodic tiling embraces our need to have not too much order in our order.

I’ve discussed, in all this, tiling the plane. It’s an easy surface to think about and a popular one. But we can form tiling questions about other shapes. Cylinders, spheres, and toruses seem like they should have good tiling questions available. And we can imagine “tiling” stuff in more dimensions too. If we can fill a volume with cubes, or rectangles, it’s natural to wonder what other shapes we can fill it with. My impression is that fewer definite answers are known about the tiling of three- and four- and higher-dimensional space. Possibly because it’s harder to sketch out ideas and test them. Possibly because the spaces are that much stranger. I would be glad to hear more.


I’m hoping now to have a nice relaxing weekend. I won’t. I need to think of what to say for the letter ‘U’. On Tuesday I hope that it will join the rest of my A to Z essays at this link.

From my Fourth A-to-Z: Topology


In 2017 I reverted to just one A-to-Z per year. And I got banner art for the first time. It’s a small bit of polish that raised my apparent professionalism a whole order of magnitude. And for the letter T, I did something no pop mathematics blog had ever done before. I wrote about topology without starting from stretchy rubber doughnuts and coffee cups. Let me prove that to you now.


Today’s glossary entry comes from Elke Stangl, author of the Elkemental Force blog. I’ll do my best, although it would have made my essay a bit easier if I’d had the chance to do another topic first. We’ll get there.

Summer 2017 Mathematics A to Z, featuring a coati (it's kind of the Latin American raccoon) looking over alphabet blocks, with a lot of equations in the background.
Art courtesy of Thomas K Dye, creator of the web comic Newshounds. He has a Patreon for those able to support his work. He’s also open for commissions, starting from US$10.

Topology.

Start with a universe. Nice thing to have around. Call it ‘M’. I’ll get to why that name.

I’ve talked a fair bit about weird mathematical objects that need some bundle of traits to be interesting. So this will change the pace some. Here, I request only that the universe have a concept of “sets”. OK, that carries a little baggage along with it. We have to have intersections and unions. Those come about from having pairs of sets. The intersection of two sets is all the things that are in both sets simultaneously. The union of two sets is all the things that are in one set, or the other, or both simultaneously. But it’s hard to think of something that could have sets that couldn’t have intersections and unions.

So from your universe ‘M’ create a new collection of things. Call it ‘T’. I’ll get to why that name. But if you’ve formed a guess about why, then you know. So I suppose I don’t need to say why, now. ‘T’ is a collection of subsets of ‘M’. Now let’s suppose these four things are true.

First. ‘M’ is one of the sets in ‘T’.

Second. The empty set ∅ (which has nothing at all in it) is one of the sets in ‘T’.

Third. Whenever two sets are in ‘T’, their intersection is also in ‘T’.

Fourth. Whenever two (or more) sets are in ‘T’, their union is also in ‘T’.

Got all that? I imagine a lot of shrugging and head-nodding out there. So let’s take that. Your universe ‘M’ and your collection of sets ‘T’ are a topology. And that’s that.

Yeah, that’s never that. Let me put in some more text. Suppose we have a universe that consists of two symbols, say, ‘a’ and ‘b’. There’s four distinct topologies you can make of that. Take the universe plus the collection of sets ∅, {a}, {b}, and {a, b}. That’s a topology. Try it out. That’s the first collection you would probably think of.

Here’s another collection. Take this two-thing universe and the collection of sets ∅, {a}, and {a, b}. That’s another topology and you might want to double-check that. Or there’s this one: the universe and the collection of sets ∅, {b}, and {a, b}. Last one: the universe and the collection of sets ∅ and {a, b} and nothing else. That one barely looks legitimate, but it is. Not a topology: the universe and the collection of sets ∅, {a}, and {b}.

The number of topologies grows surprisingly with the number of things in the universe. Like, if we had three symbols, ‘a’, ‘b’, and ‘c’, there would be 29 possible topologies. The universe of the three symbols and the collection of sets ∅, {a}, {b, c}, and {a, b, c}, for example, would be a topology. But the universe and the collection of sets ∅, {a}, {b}, {c}, and {a, b, c} would not. It’s a good thing to ponder if you need something to occupy your mind while awake in bed.

With four symbols, there’s 355 possibilities. Good luck working those all out before you fall asleep. Five symbols have 6,942 possibilities. You realize this doesn’t look like any expected sequence. After ‘4’ the count of topologies isn’t anything obvious like “two to the number of symbols” or “the number of symbols factorial” or something.
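
If you’d rather not take my word for those counts, here’s a little Python sketch that brute-forces them. It tests every possible collection of subsets against the four requirements above; for a finite universe, checking unions and intersections two sets at a time is enough. All the function names here are my own, nothing standard.

```python
from itertools import combinations

def all_subsets(universe):
    """Every subset of the universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_topology(universe, collection):
    """Check the four requirements for a topology on a finite universe."""
    sets = set(collection)
    if frozenset(universe) not in sets or frozenset() not in sets:
        return False
    for a in sets:
        for b in sets:
            if a & b not in sets or a | b not in sets:
                return False
    return True

def count_topologies(universe):
    """Brute force: test every collection of subsets of the universe."""
    subsets = all_subsets(universe)
    count = 0
    for mask in range(2 ** len(subsets)):
        collection = {s for i, s in enumerate(subsets) if mask >> i & 1}
        if is_topology(universe, collection):
            count += 1
    return count

print(count_topologies('ab'))    # 4
print(count_topologies('abc'))   # 29
print(count_topologies('abcd'))  # 355, and this one takes a while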

Are you getting ready to call me on being inconsistent? In the past I’ve talked about topology as studying what we can know about geometry without involving the idea of distance. How’s that got anything to do with this fiddling about with sets and intersections and stuff?

So now we come to that name ‘M’, and what it’s finally mnemonic for. I have to touch on something Elke Stangl hoped I’d write about, but which someone else had bid on first, for its own letter. That would be a manifold. I come from an applied-mathematics background so I’m not sure I ever got a proper introduction to manifolds. They appeared one day in the background of some talk about physics problems. I think they were introduced as “it’s a space that works like normal space”, and that was it. We were supposed to pretend we had always known about them. (I’m translating. What we were actually told would be that it “works like R^3”. That’s how mathematicians say “like normal space”.) That was all we needed.

Properly, a manifold is … eh. It’s something that works kind of like normal space. That is, it’s a set, something that can be a universe. And it has to be something we can define “open sets” on. The open sets for the manifold follow the rules I gave for a topology above. You can make a collection of these open sets. And the empty set has to be in that collection. So does the whole universe. The intersection of two open sets in that collection is itself in that collection. The union of open sets in that collection is in that collection. If all that’s true, then we have a manifold.

And now the piece that makes every pop mathematics article about topology talk about doughnuts and coffee cups. It’s possible that two topologies might be homeomorphic to each other. “Homeomorphic” is a term of art. But you understand it if you remember that “morph” means shape, and suspect that “homeo” is probably close to “homogeneous”. Two things being homeomorphic means you can match their parts up. In the matching there’s nothing left over in the first thing or the second. And the relations between the parts of the first thing are the same as the relations between the parts of the second thing.

So. Imagine the snippet of the number line for the numbers larger than -π and smaller than π. Think of all the open sets you can use to cover that. It will have a set like “the numbers bigger than 0 and less than 1”. A set like “the numbers bigger than -π and smaller than 2.1”. A set like “the numbers bigger than 0.01 and smaller than 0.011”. And so on.

Now imagine the points that exist on a circle, if you’ve omitted one point. Let’s say it’s the unit circle, centered on the origin, and that what we’re leaving out is the point that’s exactly to the left of the origin. The open sets for this are the arcs that cover some part of this punctured circle. There’s the arc that corresponds to the angles from 0 to 1 radian measure. There’s the arc that corresponds to the angles from -π to 2.1 radians. There’s the arc that corresponds to the angles from 0.01 to 0.011 radians. You see where this is going. You see why I say we can match those sets on the number line to the arcs of this punctured circle. There’s some details to fill in here. But you probably believe me this could be done if I had to.
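
If you’d like to see that matching spelled out, here’s a minimal Python sketch. The function names are my own; the matching is just the familiar one between an angle and a point on the circle.

```python
import numpy as np

# Matching the open interval (-pi, pi) with the unit circle minus
# the point (-1, 0).

def to_circle(t):
    """Send a number in (-pi, pi) to a point on the punctured circle."""
    return np.array([np.cos(t), np.sin(t)])

def to_interval(point):
    """Send a point on the punctured circle back to its number."""
    x, y = point
    return np.arctan2(y, x)   # lands in (-pi, pi]; only (-1, 0) would give pi

# Round-tripping shows the matching is one-to-one with nothing left over.
for t in (-3.0, -0.5, 0.0, 0.01, 2.1):
    assert abs(to_interval(to_circle(t)) - t) < 1e-12

# An open set of numbers, like those between 0.01 and 0.011, matches the
# arc of angles between 0.01 and 0.011 radians, and vice versa.
```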

There’s two (or three) great branches of topology. One is called “algebraic topology”. It’s the one that makes for fun pop mathematics articles about imaginary rubber sheets. It’s called “algebraic” because this field makes it natural to study the holes in a sheet. And those holes tend to form groups and rings, basic pieces of Not That Algebra. The field (I’m told) can be interpreted as looking at functors on groups and rings. This makes for some neat tying-together of subjects this A To Z round.

The other branch is called “differential topology”, which is a great field to study because it sounds like what Mister Spock is thinking about. It inspires awestruck looks where saying you study, like, Bayesian probability gets blank stares. Differential topology is about differentiable functions on manifolds. This gets deep into mathematical physics.

As you study mathematical physics, you stop worrying about ever solving specific physics problems. Specific problems are petty stuff. What you like is solving whole classes of problems. A steady trick for this is to try to find some properties that are true about the problem regardless of what exactly it’s doing at the time. This amounts to finding a manifold that relates to the problem. Consider a central-force problem, for example, with planets orbiting a sun. A planet can’t move just anywhere. It can only be in places and moving in directions that give the system the same total energy that it had to start. And the same linear momentum. And the same angular momentum. We can match these constraints to manifolds. Whatever the planet does, it does it without ever leaving these manifolds. To know the shapes of these manifolds — how they are connected — and what kinds of functions are defined on them tells us something of how the planets move.
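
Here’s a toy version of that, if you want something concrete to poke at. It’s a made-up planet-around-a-sun problem integrated with scipy; the starting position and velocity are just numbers I picked. Whatever path the planet traces out, the energy and angular momentum it computes barely budge, which is the “it never leaves the manifold” idea in numbers.

```python
import numpy as np
from scipy.integrate import solve_ivp

# A toy central-force (Kepler) problem, with the sun's GM set to 1.

def gravity(t, state):
    x, y, vx, vy = state
    r3 = (x * x + y * y) ** 1.5
    return [vx, vy, -x / r3, -y / r3]

def energy(state):
    x, y, vx, vy = state
    return 0.5 * (vx ** 2 + vy ** 2) - 1.0 / np.hypot(x, y)

def angular_momentum(state):
    x, y, vx, vy = state
    return x * vy - y * vx

start = [1.0, 0.0, 0.0, 0.9]          # position and velocity at time zero
orbit = solve_ivp(gravity, (0, 50), start, rtol=1e-10, atol=1e-10,
                  dense_output=True)

# Wherever the planet is along the orbit, the conserved quantities hold steady.
for t in np.linspace(0, 50, 6):
    s = orbit.sol(t)
    print(f"t={t:5.1f}  E={energy(s):+.6f}  L={angular_momentum(s):+.6f}")
```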

The maybe-third branch is “low-dimensional topology”. This is what differential topology is for two- or three- or four-dimensional spaces. You know, shapes we can imagine with ease in the real world. Maybe imagine with some effort, for four dimensions. This kind of branches out of differential topology because having so few dimensions to work in makes a lot of problems harder. We need specialized theoretical tools that only work for these cases. Is that enough to count as a separate branch? It depends what topologists you want to pick a fight with. (I don’t want a fight with any of them. I’m over here in numerical mathematics when I’m not merely blogging. I’m happy to provide space for anyone wishing to defend her branch of topology.)

But each grows out of this quite general, quite abstract idea, also known as “point-set topology”, that’s all about sets and collections of sets. There is much that we can learn from thinking about how to collect the things that are possible.

From my Third A-to-Z: Tree


It’s difficult to remember but there was a time I didn’t just post three A-to-Z essays in a week, but I did two such sequences in a year. It’s hard to imagine having that much energy now. The End 2016 A-to-Z got that name, rather than “End Of 2016”, because — hard as this may be to believe now — 2016 seemed like a particularly brutal year that we could not wait to finish. Unfortunately it turned out to be one of those years that will get pop-histories with subtitles like “Twelve Months That Changed The World” or “The Crisis Of Our Times”. Still, this piece shows off some of what I think characteristic of my writing: an interest in the legends that accrue around mathematical fields, and my reasons to be skeptical of the legends.


Graph theory begins with a beautiful legend. I have no reason to suppose it’s false, except my natural suspicion of beautiful legends as origin stories. Its organization as a field is traced to 18th-century Königsberg, where seven bridges connected the two banks of a river and two islands in it. Whether it was possible to cross each bridge exactly once and get back where one started was, they say, a pleasant idle thought to ponder and a path to try walking. Then Leonhard Euler solved the problem. It’s impossible.

Tree.

Graph theory arises whenever we have a bunch of things that can be connected. We call the things “vertices”, because that’s a good corner-type word. The connections we call “edges”, because that’s a good connection-type word. It’s easy to create graphs that look like the edges of a crystal, especially if you draw edges as straight as much as possible. You don’t have to. You can draw them curved. Then they look like the scary tangles of wire around your wireless router complex.

Graph theory really got organized in the 19th century, and went crazy in the 20th. It turns out there’s lots of things that connect to other things. Networks, whether computers or social or thematically linked concepts. Anything that has to be delivered from one place to another. All the interesting chemicals. Anything that could be put in a pipe or taken on a road has some graph theory thing applicable to it.

A lot of graph theory ponders loops. The original problem was about how to use every bridge, every edge, exactly one time. Look at a tangled mass of a graph and it’s hard not to start looking for loops. They’re often interesting. It’s not easy to tell if there’s a loop that lets you get to every vertex exactly once.

What if there aren’t loops? What if there aren’t any vertices you can step away from and get back to by another route? Well, then you have a tree.

A tree’s a graph where all the vertices are connected so that there aren’t any closed loops. We normally draw them with straight lines, the better to look like actual trees. We then stop trying to make them look like actual trees by doing stuff like drawing them as a long horizontal spine with a couple branches sticking off above and below, or as * type stars, or H shapes. They still correspond to real-world things. If you’re not sure how, consider the layout of one of those long, single-corridor hallways as in a hotel or dormitory. The rooms connect to one another as a tree once again, as long as no room opens to anything but its own closet or bathroom or the central hallway.

We can talk about the radius of a graph. That’s the greatest number of edges any vertex can be from the center of the tree. And every tree has a center. Or two centers. If it has two centers, they’re adjacent, joined by an edge. And that’s one of the quietly amazing things about trees to me. However complicated and messy the tree might be, we can find its center. How many things allow us that?

A tree might have some special vertex. That’s called the ‘root’. It’s what the vertices and the connections represent that make a root; it’s not something inherent in the way trees look. We pick one for some special reason and then we highlight it. Maybe put it at the bottom of the drawing, making ‘root’ for once a sensible name for a mathematics thing. Often we put it at the top of the drawing, because I guess we’re just being difficult. Well, we do that because we were modelling stuff where a thing’s properties depend on what it comes from. And that puts us into thoughts of inheritance and of family trees. And weird as it is to put the root of a tree at the top, it’s also weird to put the eldest ancestors at the bottom of a family tree. People do it, but in those illuminated drawings that make a literal tree out of things. You don’t see it in family trees used for actual work, like filling up a couple pages at the start of a king or a queen’s biography.

Trees give us neat new questions to ponder, like, how many are there? I mean, if you have a certain number of vertices then how many ways are there to arrange them? One or two or three vertices all have just the one way to arrange them. Four vertices can be hooked up a whole two ways. Five vertices offer a whole three different ways to connect them. Six vertices offer six ways to connect and now we’re finally getting something interesting. There’s eleven ways to connect seven vertices, and 23 ways to connect eight vertices. The number keeps on rising, but it doesn’t follow the obvious patterns for growth of this sort of thing.
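
If you’d like to check those counts, here’s a sketch. It leans on the networkx library’s nonisomorphic_trees generator, which counts trees without counting mere redrawings twice; if your version of the library lacks it, treat this as pseudocode for the idea.

```python
import networkx as nx

# Count the distinct trees on a given number of vertices, treating two
# trees as the same if one is just a redrawing of the other.
for vertices in range(2, 9):
    count = sum(1 for _ in nx.nonisomorphic_trees(vertices))
    print(vertices, count)
# Expect 1, 1, 2, 3, 6, 11, 23; one vertex trivially gives one tree too.
```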

And if that’s not enough to idly ponder then think of destroying trees. Draw a tree, any shape you like. Pick one of the vertices. Imagine you obliterate that. How many separate pieces has the tree been broken into? It might be as few as one, if you picked a vertex at the tip of a branch. It might be as many as the number of remaining vertices. If graph theory took away the pastime of wandering around Königsberg’s bridges, it has given us this pastime we can create anytime we have pen, paper, and a long meeting.
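
Here’s the lazy, computerized version of that pastime, again leaning on networkx. The balanced tree I build is arbitrary, just something to destroy. The thing the code checks is that the number of pieces always equals the number of edges that met the vertex you obliterated.

```python
import networkx as nx

# Build some tree; a balanced binary tree of height 3 has 15 vertices.
tree = nx.balanced_tree(2, 3)

for vertex in tree.nodes:
    pieces = tree.copy()
    pieces.remove_node(vertex)
    # The count of leftover pieces equals the destroyed vertex's degree.
    assert nx.number_connected_components(pieces) == tree.degree(vertex)
```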

From my Second A-To-Z: Transcendental Number


The second time I did one of these A-to-Z’s, I hit on the idea of asking people for suggestions. It was a good move as it opened up subjects I had not come close to considering. I didn’t think to include the instructions for making your own transcendental number, though. You never get craft projects in mathematics, not after you get past the stage of making construction-paper rhombuses or something. I am glad to see my schtick of including a warning about using this stuff at your thesis defense was established by then.


I’m down to the last seven letters in the Leap Day 2016 A To Z. It’s also the next-to-the-last of Gaurish’s requests. This was a fun one.

Transcendental Number.

Take a huge bag and stuff all the real numbers into it. Give the bag a good solid shaking. Stir up all the numbers until they’re thoroughly mixed. Reach in and grab just the one. There you go: you’ve got a transcendental number. Enjoy!

OK, I detect some grumbling out there. The first is that you tried doing this in your head because you somehow don’t have a bag large enough to hold all the real numbers. And you imagined pulling out some number like “2” or “37” or maybe “one-half”. And you may not be exactly sure what a transcendental number is. But you’re confident the strangest number you extracted, “minus 8”, isn’t it. And you’re right. None of those are transcendental numbers.

I regret saying this, but that’s your own fault. You’re lousy at picking random numbers from your head. So am I. We all are. Don’t believe me? Think of a positive whole number. I predict you probably picked something between 1 and 10. Almost surely something between 1 and 100. Surely something less than 10,000. You didn’t even consider picking something between 10,012,002,214,473,325,937,775 and 10,012,002,214,473,325,937,785. Challenged to pick a number, people will select nice and familiar ones. The nice familiar numbers happen not to be transcendental.

I detect some secondary grumbling there. Somebody picked π. And someone else picked e. Very good. Those are transcendental numbers. They’re also nice familiar numbers, at least to people who like mathematics a lot. So they attract attention.

Still haven’t said what they are. What they are traces back, of course, to polynomials. Take a polynomial that’s got one variable, which we call ‘x’ because we don’t want to be difficult. Suppose that all the coefficients of the polynomial, the constant numbers we presumably know or could find out, are integers. What are the roots of the polynomial? That is, for what values of x is the polynomial a complicated way of writing ‘zero’?

For example, try the polynomial x^2 – 6x + 5. If x = 1, then that polynomial is equal to zero. If x = 5, the polynomial’s equal to zero. Or how about the polynomial x^2 + 4x + 4? That’s equal to zero if x is equal to -2. So a polynomial with integer coefficients can certainly have positive and negative integers as roots.

How about the polynomial 2x – 3? Yes, that is so a polynomial. This is almost easy. That’s equal to zero if x = 3/2. How about the polynomial (2x – 3)(4x + 5)(6x – 7)? It’s my polynomial and I want to write it so it’s easy to find the roots. That polynomial will be zero if x = 3/2, or if x = -5/4, or if x = 7/6. So a polynomial with integer coefficients can have positive and negative rational numbers as roots.

How about the polynomial x^2 – 2? That’s equal to zero if x is the square root of 2, about 1.414. It’s also equal to zero if x is minus the square root of 2, about -1.414. And the square root of 2 is irrational. So we can certainly have irrational numbers as roots.

So if we can have whole numbers, and rational numbers, and irrational numbers as roots, how can there be anything else? Yes, complex numbers, I see you raising your hand there. We’re not talking about complex numbers just now. Only real numbers.

It isn’t hard to work out why we can get any whole number, positive or negative, from a polynomial with integer coefficients. Or why we can get any rational number. The irrationals, though … it turns out we can only get some of them this way. We can get square roots and cube roots and fourth roots and all that. We can get combinations of those. But we can’t get everything. There are irrational numbers that are there but that even polynomials can’t reach.
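
If you want to play with which irrationals a polynomial can reach, sympy will find the polynomial for you when one exists. Here’s a sketch; I’m assuming sympy’s minimal_polynomial function, which is what I understand it to be called.

```python
from sympy import Symbol, sqrt, minimal_polynomial

x = Symbol('x')

# The square root of 2 is a root of a polynomial with integer coefficients,
# so polynomials can reach it. So can messier combinations of roots.
print(minimal_polynomial(sqrt(2), x))            # x**2 - 2
print(minimal_polynomial(sqrt(2) + sqrt(3), x))  # x**4 - 10*x**2 + 1

# No polynomial with integer coefficients has pi as a root. That's what
# makes pi transcendental; there is nothing of this kind for it to find.
```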

It’s all right to be surprised. It’s a surprising result. Maybe even unsettling. Transcendental numbers have something peculiar about them. The 19th Century French mathematician Joseph Liouville first proved the things must exist, in 1844. (He used continued fractions to show there must be such things.) It would be seven years later that he gave an example of one in nice, easy-to-understand decimals. This is the number 0.110 001 000 000 000 000 000 001 000 000 (et cetera). This number is zero almost everywhere. But there’s a 1 in the n-th digit past the decimal if n is the factorial of some number. That is, 1! is 1, so the 1st digit past the decimal is a 1. 2! is 2, so the 2nd digit past the decimal is a 1. 3! is 6, so the 6th digit past the decimal is a 1. 4! is 24, so the 24th digit past the decimal is a 1. The next 1 will appear in spot number 5!, which is 120. After that, 6! is 720 so we wait for the 720th digit to be 1 again.
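
If you’d like to generate those digits yourself, it takes only a few lines of Python. Nothing here is clever; it just puts a 1 wherever the position past the decimal is a factorial.

```python
from math import factorial

# The first several hundred decimal places of Liouville's number:
# a 1 in each place whose position is a factorial, 0 everywhere else.
places = 750
ones = {factorial(n) for n in range(1, 7) if factorial(n) <= places}
digits = ''.join('1' if k in ones else '0' for k in range(1, places + 1))
print('0.' + digits[:30])   # 0.110001000000000000000001000000
```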

And what is this Liouville number 0.110 001 000 000 000 000 000 001 000 000 (et cetera) used for, besides showing that a transcendental number exists? Not a thing. It’s of no other interest. And this plagued the transcendental numbers until 1873. The only examples anyone had of transcendental numbers were ones built to show that they existed. In 1873 Charles Hermite showed finally that e, the base of the natural logarithm, was transcendental. e is a much more interesting number; we have reasons to care about it. Every exponential growth or decay or oscillating process has e lurking in it somewhere. In 1882 Ferdinand von Lindemann showed that π was transcendental, and that’s an even more interesting number.

That bit about π has interesting implications. One goes back to the ancient Greeks. Is it possible, using straightedge and compass, to create a square that’s exactly the same size as a given circle? This is equivalent to saying, if I give you a line segment, can you create another line segment that’s exactly the square root of π times as long? This geometric problem is equivalent to an algebraic one. That problem: can you create a polynomial, with integer coefficients, that has the square root of π as a root? (WARNING: I’m skipping some important points for the sake of clarity. DO NOT attempt to use this to pass your thesis defense without putting those points back in.) We want the square root of π because … well, what’s the area of a square whose sides are the square root of π long? That’s right. So we start with a line segment that’s equal to the radius of the circle and we can do that, surely. Once we have the radius, can’t we make a line that’s the square root of π times the radius, and from that make a square with area exactly π times the radius squared? Since π is transcendental, then, no. We can’t. Sorry. One of the great problems of ancient mathematics, and one that still has the power to attract the casual mathematician, got its final answer in 1882.

Georg Cantor is a name even non-mathematicians might recognize. He showed there have to be some infinite sets bigger than others, and that there must be more real numbers than there are rational numbers. Four years after showing that, he proved there are as many transcendental numbers as there are real numbers.

They’re everywhere. They permeate the real numbers so much that we can understand the real numbers as the transcendental numbers plus some dust. They’re almost the dark matter of mathematics. We don’t actually know all that many of them. Wolfram MathWorld has a table listing numbers proven to be transcendental, and the fact we can list that on a single web page is remarkable. Some of them are large sets of numbers, yes, like e^{\pi \sqrt{d}} for every positive whole number d. And we can infer many more from them; if π is transcendental then so is 2π, and so is 5π, and so is -20.38π, and so on. But the table of numbers proven to be transcendental is still just 25 rows long.

There are even mysteries about obvious numbers. π is transcendental. So is e. We know that at least one of π times e and π plus e is transcendental. We don’t know which one is, or whether both are. We don’t know whether π^π is transcendental. We don’t know whether e^e is, either. Don’t even ask about π^e.

How, by the way, does this fit with my claim that everything in mathematics is polynomials? — Well, we found these numbers in the first place by looking at polynomials. The set is defined, even to this day, by how a particular kind of polynomial can’t reach them. Thinking about a particular kind of polynomial makes visible this interesting set.

From my First A-to-Z: Tensor


Of course I can’t just take a break for the sake of having a break. I feel like I have to do something of interest. So why not make better use of my several thousand past entries and repost one? I’d just reblog it except WordPress’s system for that is kind of rubbish. So here’s what I wrote, when I was first doing A-to-Z’s, back in summer of 2015. Somehow I was able to post three of these a week. I don’t know how.

I had remembered this essay as mostly describing the boring part of tensors, that we usually represent them as grids of numbers and then symbols with subscripts and superscripts. I’m glad to rediscover that I got at why we do such things to numbers and subscripts and superscripts.


Tensor.

The true but unenlightening answer first: a tensor is a regular, rectangular grid of numbers. The most common kind is a two-dimensional grid, so that it looks like a matrix, or like the times tables. It might be square, with as many rows as columns, or it might be rectangular.

It can also be one-dimensional, looking like a row or a column of numbers. Or it could be three-dimensional, rows and columns and whole levels of numbers. We don’t try to visualize that. It can be what we call zero-dimensional, in which case it just looks like a solitary number. It might be four- or more-dimensional, although I confess I’ve never heard of anyone who actually writes out such a thing. It’s just so hard to visualize.

You can add and subtract tensors if they’re of compatible sizes. You can also do something like multiplication. And this does mean that tensors of compatible sizes will form a ring. Of course, that doesn’t say why they’re interesting.

Tensors are useful because they can describe spatial relationships efficiently. The word comes from the same Latin root as “tension”, a hint about how we can imagine it. A common use of tensors is in describing the stress in an object. Applying stress in different directions to an object often produces different effects. The classic example there is a newspaper. Rip it in one direction and you get a smooth, clean tear. Rip it perpendicularly and you get a raggedy mess. The stress tensor represents this: it gives some idea of how a force put on the paper will create a tear.
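
Here’s a toy version of that in Python, with a made-up stress tensor that’s much stronger in one direction than the other. The point is only that the same tensor, applied to cuts in different directions, gives forces of very different size, which is the newspaper experience in numbers.

```python
import numpy as np

# A made-up stress tensor for a sheet that's stiff one way, weak the other.
stress = np.array([[10.0, 0.0],
                   [ 0.0, 1.0]])

# The traction (force per area) across a cut depends on the cut's direction.
for angle in (0, 45, 90):
    normal = np.array([np.cos(np.radians(angle)), np.sin(np.radians(angle))])
    traction = stress @ normal
    print(angle, np.round(traction, 2), round(np.linalg.norm(traction), 2))
```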

Tensors show up a lot in physics, and so in mathematical physics. Technically they show up everywhere, since vectors and even plain old numbers (scalars, in the lingo) are kinds of tensors, but that’s not what I mean. Tensors can describe efficiently things whose magnitude and direction changes based on where something is and where it’s looking. So they are a great tool to use if one wants to represent stress, or how well magnetic fields pass through objects, or how electrical fields are distorted by the objects they move in. And they describe space, as well: general relativity is built on tensors. The mathematics of a tensor allow one to describe how space is shaped, based on how to measure the distance between two points in space.

My own mathematical education happened to be pretty tensor-light. I never happened to have courses that forced me to get good with them, and I confess to feeling intimidated when a mathematical argument gets deep into tensor mathematics. Joseph C Kolecki, with NASA’s Glenn (Lewis) Research Center, published in 2002 a nice little booklet “An Introduction to Tensors for Students of Physics and Engineering”. This I think nicely bridges some of the gap between mathematical structures like vectors and matrices, that mathematics and physics majors know well, and the kinds of tensors that get called tensors and that can be intimidating.

My Little 2021 Mathematics A-to-Z is taking a short break


I regret coming to this point. I’d started my Little 2021 Mathematics A-to-Z with more lead time than usual, in the hopes that I’d have a less stressful time for the whole project. And then all that lead time slipped away. And there’s an extra bit of awkwardness, caused by my once-a-week schedule and the date I happened to start publishing this year’s project. I couldn’t finish it before 2021 ended, not unless I published two things in a week. And I don’t have the energy or time to do that.

The point of a schedule is to help make it easier to accomplish things you value. If it can’t help that, then the schedule has to go. I’ve given people this advice, and now, I’ll take it. I mean still to get to the letters T, O, and Z, to finish off the sequence. But that’ll run in 2022, I hope early in the year.

My Little 2021 Mathematics A-to-Z: Atlas


I owe Elkement thanks again for a topic. They’re the author of the Theory and Practice of Trying to Combine Just Anything blog. And the subject lets me circle back around to topology.

Atlas.

Mathematics is like every field in having jargon. Some jargon is unique to the field; there is no lay meaning of a “homeomorphism”. Some jargon is words plucked from the common language, such as “smooth”. The common meaning may guide you to what mathematicians want in it. A smooth function has a graph with no gaps, no discontinuities, no sharp corners; you can see smoothness in it. Sometimes the common meaning is an ambiguous help. A “series” is the sum of a sequence of numbers, that is, it is one number. Mathematicians study the series, but by looking at properties of the sequence.

So what sort of jargon is “atlas”? In common English, an atlas is a book of maps. Each map represents something different. Perhaps a different region of space. Perhaps a different scale, or a different projection altogether. The maps may show different features, or show them at different times. The maps must be about the same sort of thing. No slipping a map of Narnia in with the map of an amusement park, unless you warn of that in the title. The maps must not contradict one another. (So far as human-made things can be consistent, anyway.) And that’s the important stuff.

Atlas is the first kind of common-word jargon. Mathematicians use it to mean a collection of things. Those collected things aren’t mathematical maps. “Map” is the second type of jargon. The collected things are coordinate charts. “Coordinate chart” is a pairing of words not likely to appear in common English. But if you did encounter them? The meaning you might guess from their common use is not far off their mathematical use.

A coordinate chart is a matching of the points in an open set to normal coordinates. Euclidean coordinates, to be precise. But, you know, latitude and longitude, if it’s two dimensional. Add in the altitude if it’s three dimensions. Your x-y-z coordinates. It still counts if this is one dimension, or four dimensions, or sixteen dimensions. You’re less likely to draw a sketch of those. (In practice, you draw a sketch of a three-dimensional blob, and put N = 16 off in the corner, maybe in a box.)

These coordinate charts are on a manifold. That’s the second type of common-language jargon. Manifold, to pick the least bad of its manifold common definitions, is a “complicated object or subject”. The mathematical manifold is a surface. The things on that surface are connected by relationships that could be complicated. But the shape can be as simple as a plane or a sphere or a torus.

Every point on a coordinate chart needs some unique set of coordinates. And if a point appears on two coordinate charts, they have to be consistent. Consistent here means the matching between the charts is a homeomorphism. A homeomorphism is a map, in the jargon sense. So it’s a function matching open sets on one chart to open sets in the other chart. There’s more to it (there always is). But the important thing is that, away from the edges of the chart, we don’t create any new gaps or punctures or missing sections.

Some manifolds are easy to spot. The surface of the Earth, for example. Many are easy to come up with charts for. Think of any map of the Earth. Each point on the surface of the Earth matches some point on the sheet of paper. The coordinate chart is … let’s say how far your point is from the upper left corner of the page. (Pretend that you can measure those points precisely enough to match them to, like, the town you’re in.) Could be how far you are from the center, or the lower right corner, or whatever. These are all as good, and even count as other coordinate charts.

It’s easy to imagine that as latitude and longitude. We see maps of the world arranged by latitude and longitude so often. And that’s fine; latitude and longitude makes a good chart. But we have a problem in giving coordinates to the north and south pole. The latitude is easy but the longitude? So we have two points that can’t be covered on the map. We can save our atlas by having a couple charts. For the Earth this can be a map of most of the world arranged by latitude and longitude, and then two insets showing a disc around the north and the south poles. Thus we have an atlas of three charts.

We can make this a little tighter, reducing this to two charts. Have one that’s your normal sort of wall map, centered on the equator. Have the other be a transverse Mercator map. Make its center the great circle going through the prime meridian and the 180-degree antimeridian. Then every point on the planet, including the poles, has a neat unambiguous coordinate in at least one chart. A good chunk of the world will be on both charts. We can throw in more charts if we like, but two is enough.
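
Here’s a sketch of a two-chart atlas in Python, using stereographic projections rather than the Mercator maps described above, only because the formulas are shorter to write. The chart and transition names are my own; the check at the end is the consistency requirement from a few paragraphs back.

```python
import numpy as np

# Two charts covering the sphere. Chart A projects from the north pole and
# covers everything but that pole; chart B projects from the south pole.

def chart_north(p):
    x, y, z = p
    return np.array([x, y]) / (1.0 - z)

def chart_south(p):
    x, y, z = p
    return np.array([x, y]) / (1.0 + z)

def transition(uv):
    """Where the charts overlap, carry chart-A coordinates to chart-B ones."""
    return uv / np.dot(uv, uv)

# Pick a point on the sphere away from the poles and check consistency.
rng = np.random.default_rng(1)
p = rng.normal(size=3)
p /= np.linalg.norm(p)

a = chart_north(p)
b = chart_south(p)
assert np.allclose(transition(a), b)
```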

The requirements to be an atlas aren’t hard to meet. So a lot of geometric structures end up being atlases. Theodore Frankel’s wonderful The Geometry of Physics introduces them on page 15. But that’s also the last appearance of “atlas”, at least in the index. The idea gets upstaged. The manifolds that the atlas charts end up being more interesting. Many problems about things in motion are easy to describe as paths traced out on manifolds. A large chunk of mathematical physics is then looking at this problem and figuring out what the space of possible behaviors looks like. What its topology is.

In a sense, the mathematical physicist might survey a problem, like a scout exploring new territory, more than solve it. This exploration brings us to directional derivatives. To tangent bundles. To other terms, jargon only partially informed by the common meanings.


And we draw to the final weeks of 2021, and of the Little 2021 Mathematics A-to-Z. All this year’s essays should be at this link. And all my glossary essays from every year should be at this link. Thank you for reading!

My Little 2021 Mathematics A-to-Z: Subtraction


Iva Sallay was once again a kind friend to my writing efforts here. Sallay, who runs the Find the Factors recreational mathematics puzzle site, offered a topic that gives a compelling theme to this year’s A-to-Z.

Subtraction.

Subtraction is the inverse of addition.

So thanks for reading along as the Little 2021 Mathematics A-to-Z enters its final stage. Next week I hope to be back with something for my third letter ‘A’ of the sequence.

All right, I can be a little more clear. By the inverse I mean subtraction is the name we give to adding the additive inverse of something. It’s what lets addition form a group. That is, we write a - b to mean we find whatever number, added to b, gives us 0. Then we add that to a. We do this pretty often, so it’s convenient to have a name for it. The word “subtraction” appears in English from about 1400. It grew from the Latin for “taking away”. By about 1425 the word has its mathematical meaning. I imagine this wasn’t too radical a linguistic evolution.

All right, so some other thoughts. What’s so interesting about subtraction that it’s worth a name? We don’t have a particular word for reversing, say, a permutation. But you don’t get very far in school without thinking about how to invert an addition. It must come down to subtraction’s practical use in finding differences between things. Often in figuring out change. Debts at least. Nobody needs the inverse of a permutation unless they’re putting a deck of cards back in order.

Subtraction has other roles, though. Not so much in mathematics, but in teaching us how to learn about mathematics. For example, subtraction gives us a good reason to notice zero. Zero, the additive identity, is implicit to addition. But if you’re learning addition, and you think of it as “put these two piles of things together into one larger pile”? What good does an empty pile do you there? It’s easy to not notice there’s a concept there. But subtraction, taking stuff away from a pile? You can imagine taking everything away, and wanting a word for that. This isn’t the only way to notice zero is worth some attention. It’s a good way, though.

There’s more, though. Learning subtraction teaches us limits of what we can do, mathematically. We can add 3 to 7 or, if it’s more convenient, 7 to 3. But we learn from the start that while we can subtract 3 from 7, there’s no subtracting 7 from 3. This is true when we’re learning arithmetic and numbers are all positive. Some time later we ask, what happens if we go ahead and do this anyway? And figure out a number that makes sense as the answer to “what do you get subtracting 7 from 3”? This introduces us to the negative numbers. It’s a richer idea of what it is to have numbers. We can start to see addition and subtraction as expressions of the same operation.

Linus: 'Lucy, how much is six from four?' Lucy: 'Six from four?! You can't subtract six from four ... you can't subtract a bigger number from a smaller number.' Linus: 'YOU CAN IF YOU'RE STUPID!'
Charles Schulz’s Peanuts for the 27th of August, 1957. The amazing thing is you can if you’re smart, too. We can ask whether it’s good teaching to start instruction with something that’s not true, and then reveal what’s not true about it. My hunch is that it is, because this provides the lesson that, even for something as “objective” as mathematics, the way we construct things is a convention. That we can change our tools as we want to do new things.

But we also notice they’re not quite the same. As mentioned, addition can be done in any order. If I need to do 7 + 4 + 3 + 6 I can decide I’d rather do 4 + 6 + 7 + 3 and make that 10 + 10 before getting to 20. This all simplifies my calculating. If I need to do 7 – 4 – 3 – 6 I get into a lot of trouble if I simplify my work by writing 4 – 6 – 7 – 3 instead. Even if I decide I’d rather take the 3 – 6 and turn that into a negative 3 first, I’ve made a mess of things.

The first property this teaches us to notice we call “commutativity”. Most mathematical operations don’t have that. But a lot of the ones we find useful do. The second property this points out is “associativity”, which more of the operations we find useful have. It’s not essential that someone learning how to calculate know this is a way to categorize mathematics operations. (I’ve read that before the New Math educational reforms of the 1960s, American elementary school mathematics textbooks never mentioned commutativity or associativity.) But I suspect it is essential that someone learning mathematics learn the things you can do come in families.

So let me mention division, the inverse of multiplication. (And that my chosen theme won’t let me get to in sequence.) Like subtraction, division refuses to be commutative or associative. Subtraction prompts us to treat the negative numbers as something useful. In parallel, division prompts us to accept fractions as numbers. (We accepted fractions as numbers long before we accepted negative numbers, mind. Anyone with a pie and three friends has an interest in “one-quarter” that they may not have with “negative four”.) When we start learning about numbers raised to powers, or exponentials, we have questions ready to ask. How do the operations behave? Do they encourage us to find other kinds of number?

And we also think of how to patch up subtraction’s problems. If we want subtraction to be a kind of addition, we have to get precise about what that little subtraction sign means. What we’ve settled on is that a - b is shorthand for a + (-b), where -b is the additive inverse of b.

Once we do that all subtraction’s problems with commutativity and associativity go away. 7 – 4 – 3 – 6 becomes 7 + (-4) + (-3) + (-6), and that we can shuffle around however convenient. Say, to 7 + (-3) + (-4) + (-6), then to 7 + (-3) + (-10), then to 4 + (-10), and so -6. Thus do we domesticate a useful, wild operation like subtraction.

Any individual subtraction has one right answer. There are many ways to get there, though. I had learned, for example, to do a problem such as 738 minus 451 by subtracting one column of numbers at a time. Right to left, so, subtracting 8 minus 1, and then 3 minus 5, and after the borrowing then 6 minus 4. I remember several elementary school textbooks explaining borrowing as unwrapping rolls of dimes. It was a model well-suited to me.

We don’t need to, though. We can go from the left to the right, doing 7 minus 4 first and 8 minus 1 last. We can go through and figure out all the possible carries before doing any work. There’s a slick method called partial differences which skips all the carrying. But it demands writing out several more intermediate terms. This uses more paper, but if there isn’t a paper shortage, so what?

There are more ways to calculate. If we turn things over to a computer, we’re likely to do subtraction using a complements technique. When I say computer you likely think electronic computer, or did right up to the adjective there. But mechanical computers were a thing too. Blaise Pascal’s computing device of the 1650s used nines’ complements to subtract on the gears that did addition. Explaining the trick would take me farther afield than I want to go now. But, you know how, like, 6 plus 3 is 9? So you can turn a subtraction of 6 into an addition of 3. Or a subtraction of 3 into an addition of 6. Plus some bookkeeping.

A digital computer is likely to use a complement technique too, representing every number as a string of 0’s and 1’s. These days it’s typically two’s complement: flip every 0 to a 1 and every 1 to a 0, which the machine can do very quickly, and then add one. Subtraction by complements is different and, to my eye, takes more steps. But they might be steps you do better.
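
Here’s a sketch of the complements trick in Python, in decimal and in binary. It’s an illustration of the idea, not a claim about how any particular machine, or Pascal’s gears, actually did it.

```python
def complement_subtract(a, b, digits=3, base=10):
    """Compute a - b (with a >= b) by adding the complement of b.

    A sketch of the trick, done with a fixed number of digits."""
    assert 0 <= b <= a < base ** digits
    complement = base ** digits - b   # e.g. the tens' complement of 451 is 549
    total = a + complement            # now addition is the only work left
    return total - base ** digits     # drop the extra leading digit

print(complement_subtract(738, 451))                         # 287
print(complement_subtract(0b111, 0b011, digits=3, base=2))   # 4, binary 100
```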

One more thought subtraction gives us, though. In a previous paragraph I wrote out 7 – 4, and also wrote 7 + (-4). We use the symbol – for two things. Do those two uses of – mean the same thing? You may think I’m being fussy here. After all, the value of -4 is the same as the value of 0 – 4. And even a fussy mathematician says whichever of “minus four” and “negative four” better fits the meter of the sentence. But our friends in the philosophy department would agree this is a fair question. Are we collapsing two related ideas together by using the same symbol for them?

My inclination is to say that the – of -4 is different from the – in 0 – 4, though. The – in -4 is a unary operation: it means “give me the inverse of the number on the right”. The – in 0 – 4 is a binary operation: it means “subtract the number on the right from the number on the left”. So I would say these are different things sharing a symbol. Unfortunately our friends in the philosophy department can’t answer the question for us. The university laid them off four years ago, part of society’s realignment away from questions like “how can we recognize when a thing is true?” and towards “how can we teach proto-laborers to use Excel macros?”. We have to use subtraction to expand our thinking on our own.

How November 2021 Treated My Mathematics Blog


As I come near the end of the Little 2021 Mathematics A-to-Z, I also come to the start of December. So that’s a good time to look at the past month and see how readers responded to my work. Over November I published seven pieces, and here’s how they sorted out, most popular to the least, as WordPress counts their page views:

There’s an obvious advantage stuff published earlier in the month has. Still, this is usually around the time in an A-to-Z sequence where I get hit by a content aggregator and one post gets 25,000 views in a three-hour period and then falls back to normal. Would be a mood lift.

After a suspiciously average October, I saw another underperforming November. I mean underperforming compared to the twelve-month running average leading up to November. The mean monthly page view count for those twelve months was 2,501.8, and the median was 2,527. In actual November, I got 2,103 page views. The mean number of unique visitors was 1,775.7, and the running median 1,752. In fact, there were 1,493 unique visitors.

Rated per posting, though, it doesn’t look so bad. There were on average 300.4 page views for each of the seven postings this past month. The twelve-month running mean was 314.3 views per posting, and the median 307.4. There were 213.3 unique visitors per posting in November. This is insignificantly below the running mean 222.1 unique visitors per posting, and running median of 217.2 visitors per posting. (And, again, this is views to anything at all on my blog, per new posting. Sometime, I’ll have to dare a month with no posts to learn how much my back catalogue gets on its own weight.)

Bar chart showing two and a half years' worth of readership figures. After several months' decline halted in October, the readership dropped again for November.
I feel like views per visitor are always the same number. At least something in the 1.4 to 1.5 range. They’re not; back in April the views per visitor were 1.31, and in February 1.38. Still, it doesn’t seem like it varies a lot.

I am at least growing less likable, confirming a fear. There were 25 likes given in November, the second month in a row it’s been less than one like a day. The running mean was 43.4 likes per month, and the median 42. It doesn’t even look good rated per posting: this came out to 3.6 likes per posting, compared to a running mean of 5.3 and running median of 5.6. Comments offer a little hope, at least, with 13 comments given over the course of November. The mean was 15.1 and median 10.1. Per posting, this gets right on average: November averaged 1.9 comments per posting, and the twelve-month running mean was 1.9. The twelve-month running median was 1.4 comments per posting, so I finally found a figure where I beat an average.

WordPress figures I published 6,106 words this past month. It’s my second-most loquacious month this year, with an average 872.3 words per November posting. It brings my total for the year to 50,429 words, averaging 623 words per posting. Unless December makes some big changes this is going to be my second-least-talkative year of the blog.

As of the start of November I’ve had 1,663 postings here. They’ve drawn a total 148,937 views, from 88,561 unique visitors.

If you’d like to follow this blog regularly, I’d be glad if you did. You can use the “Follow Nebusresearch” button at the upper right corner of this page. Or you can get essays by e-mail as soon as they’re published, using the box just below that button. I don’t use the e-mail for anything but sending these essays. I don’t know how WordPress Master Command uses them.

To get essays on your RSS reader, use the feed at https://nebusresearch.wordpress.com/feed/. You can get RSS readers from several places, including This Old Reader or Newsblur. You also can sign up for a free account at Dreamwidth or Livejournal. Use https://www.dreamwidth.org/feeds/ or https://www.livejournal.com/syn to add my essays to your Reading or Friends page. You can use that for any blog with an RSS feed.

While my Twitter account has gone feral I am on Mathstodon, the mathematics-themed instance of the Mastodon network. So you can catch me as @nebusj@mathstodon.xyz there. Thank you as ever for reading and for, I hope, the successful conclusion of this year’s little A-to-Z.

My Little 2021 Mathematics A-to-Z: Convex


Jacob Siehler, a friend from Mathstodon, and Assistant Professor at Gustavus Adolphus College, offered several good topics for the letter ‘C’. I picked the one that seemed to connect to the greatest number of other topics I’ve covered recently.

Convex.

It’s easy to say what convex is, if we’re talking about shapes in ordinary space. A convex shape is one where the line connecting any two points inside the shape always stays inside the shape. Circles are convex. Triangles and rectangles too. Star shapes are not. Is a torus? That depends. If it’s a doughnut shape sitting in some bigger space, then it’s not convex. If the doughnut shape is all the space there is to consider, then it is. There’s a parallel here to prime numbers. Whether 5 is a prime depends on whether you think 5 is an integer, a real number, or a complex number.

Still, this seems easy to the point of boring. So how does Wolfram MathWorld match 337 items for ‘convex’? For a sense of scale, it has only 112 matches for ‘quadrilateral’. Why does such a simple-seeming word turn up three times as often as a staple of elementary geometry?

The why of it is that it’s one of those terms that sneaks in everywhere. Some of it is obvious. There’s a concept called “star-convex”, where you don’t need every pair of points connected by a straight line inside the shape. You need only some one point that every other point can see along a straight line inside the shape. That’s a familiar mathematical trick, coming up with a less-demanding version of a property. There’s the “convex hull”, which is the smallest convex set that contains a given set of points. We even come up with “convex functions”, functions of real numbers. A function’s convex if the space above the graph of the function is convex. This seems like stretching the idea of convexity rather a bit.
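
Just to make the convex hull concrete, here’s a sketch using scipy’s ConvexHull. The random points are nothing special; the hull is the smallest convex shape holding all of them. (A scipy quirk worth knowing: for points in the plane, the hull’s “volume” attribute is its area.)

```python
import numpy as np
from scipy.spatial import ConvexHull

# A random scattering of points in the plane, just for illustration.
rng = np.random.default_rng(7)
points = rng.random((30, 2))

hull = ConvexHull(points)
print(len(hull.vertices), "of the 30 points are corners of the hull")
print("hull area:", round(hull.volume, 3))   # for 2D points, .volume is area
```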

Still, we wouldn’t coin such a term if we couldn’t use it. Well, if someone couldn’t use it. The saving thing here is the idea of “space”. We get our idea of what space is from looking around rooms and walking around hills and stuff. But what makes something a space, when we look at what’s essential? What we need is traits like: there are things. We can measure how far apart things are. We have some idea of paths between things. That’s not asking a lot.

So many things become spaces. And so convexity sneaks in everywhere. A convex function has nice properties if you’re looking for minimums. Or maximums; that’s as easy to do. And we look for minimums a lot. A large, practical set of mathematics is the search for optimum values, the set of values that maximize, or minimize, something. You may protest that not everything we’re interested in is a convex function. This is true. But a lot of what we are interested in is, or is approximately.

This gets into some surprising corners. Economics, for example. The mathematics of economics is often interested in how much of a thing you can make. But you have to put things in to make it. You expect, at least once the system is set up, that if you halve the components you put in you get half the thing out. Or double the components in and get double the thing out. But you can run out of the components. Or related stuff, like, floor space to store partly-complete product. Or transport available to send this stuff to the customer. Or time to get things finished. For our needs these are all “things you can run out of”.

And so we have a problem of linear programming. We have something or other we want to optimize. Call it y. It depends on a whole range of variables, which we describe as a vector \vec{x}. And we have constraints. Each of these is an inequality; we can represent that as demanding some functions of these variables be at most some numbers. We can bundle those functions together as a matrix called A. We can bundle those maximum numbers together as a vector called \vec{b}. So the problem is to make y as big, or as small, as we can while keeping A\vec{x} \le \vec{b}. Also, we demand that none of these variables be smaller than some minimum we might as well call 0. The range of all the possible values of these variables is a space. These constraints chop up that space, into a shape. Into a convex shape, of course, or this paragraph wouldn’t belong in this essay. If you need to be convinced of this, imagine taking a wedge of cheese and hacking away slices all the way through it. How do you cut a cave or a tunnel in it?

So take this convex shape, called a polytope. That’s what we call a polygon or polyhedron if we don’t want to commit to any particular number of dimensions of space. (If we’re being careful. My suspicion is ‘polyhedron’ is more often said.) This makes a shape. Some point in that shape has the best possible value of y. (Also the worst, if that’s your thing.) Where is it? There is an answer, and it gives a pretext to share a fun story. The answer is that it’s on the outside, on one of the faces of the polytope. And you can find it following along the edges of the polytope. This we know as the simplex method, or Dantzig’s Simplex Method if we must be more particular, for George Dantzig. Its success relies on looking at convex functions in convex spaces and how much this simplifies finding things.
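
If you’d like to see one of these problems solved without tracing the polytope’s edges by hand, here’s a sketch using scipy’s linprog routine. The little factory numbers are invented for illustration, and I make no promise that scipy’s default solver is Dantzig’s simplex method in particular.

```python
from scipy.optimize import linprog

# A made-up factory problem: maximize profit y = 3*x1 + 5*x2, subject to
# limits on labor hours and floor space. linprog minimizes, so negate c.
c = [-3.0, -5.0]
A = [[1.0, 2.0],    # labor hours used per unit of each product
     [3.0, 1.0]]    # floor space used per unit of each product
b = [14.0, 18.0]    # labor hours and floor space available

result = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
print(result.x, -result.fun)   # best mix is about (4.4, 4.8), profit 37.2
```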

Usually. The simplex method is one of polynomial-order complexity for normal, typical problems. That’s a measure of how much longer it takes to find an answer as you get more variables, more constraints, more work. Polynomial is okay, growing about the way it takes longer to multiply when you have more digits in the numbers. But there’s a worst case, in which the complexity grows exponentially. We shy away from exponential-complexity because … you know, exponentials grow fast, given a chance. What saves us is that that’s a worst case, not a typical case. The convexity lets us set up our problem and, rather often, solve it well enough.

Now the story, a mutation of which it’s likely you encountered. George Dantzig, as a student in Jerzy Neyman’s statistics class, arrived late one day to find a couple problems on the board. He took these to be homework, and struggled with the harder-than-usual set. But turned them in, apologizing for them being late. Neyman accepted the work, and eventually got around to looking at it. This wasn’t the homework. This was some unsolved problems in statistics. Six weeks later Neyman had prepared them for publication. A year later, Neyman explained to Dantzig that all he needed to earn his PhD was put these two papers together in a nice binder.

This cute story somehow escaped into the wild. It became an inspirational tale for more than mathematics grad students. That part’s easy to see; it has most everything inspiration needs. It mutated further, into the movie Good Will Hunting. I do not know whether the unsolved problems, work done in the late 1930s, relate to Dantzig’s simplex method, developed after World War II. It may be that they are connected only in their originator. But perhaps there is more to it than I realize now.


I hope to finish off the word ‘Mathematics’ with the letter S next week. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all the A-to-Z essays from past years, should be at this link. Thank you for reading.

I’m looking for the last topics for the Little 2021 Mathematics A-to-Z


I’m approaching the end of this year’s little Mathematics A-to-Z. The project’s been smaller, as I’d hoped, although I’m not sure I managed to make it any less hard on myself. Still, I’m glad to be doing it and glad to have the suggestions of you kind readers for topics. This quartet should wrap up the year, and the project.

So please let me know of any topics you’d like to see me try taking on. The topic should be anything mathematics-related, although I tend to take a broad view of mathematics-related. (I’m also open to biographical sketches.) To suggest something, please, say so in a comment. If you do, please also let me know about any projects you have — blogs, YouTube channels, real-world projects — that I should mention at the top of that essay.

I am happy to revisit a subject I think I have more to write about, so don’t be shy about suggesting those. Past essays for these letters include:

A.


T.


O.


Z.


And, as ever, all my A-to-Z essays should be at this link. Thanks for reading and thanks for sharing your thoughts.

My Little 2021 Mathematics A-to-Z: Inverse


I owe Iva Sallay thanks for the suggestion of today’s topic. Sallay is a longtime friend of my blog here. And runs the Find the Factors recreational mathematics puzzle site. If you haven’t been following, or haven’t visited before, this is a fun week to step in again. The puzzles this week include (American) Thanksgiving-themed pictures.

Inverse.

When we visit the museum made of a visual artist’s studio we often admire the tools. The surviving pencils and crayons, pens, brushes and such. We don’t often notice the eraser, the correction tape, the unused white-out, or the pages cut into scraps to cover up errors. To do something is to want to undo it. This is as true for the mathematics of a circle as it is for the drawing of one.

If not to undo something, we do often want to know where something comes from. A classic paper asks can one hear the shape of a drum? You hear a sound. Can you say what made that sound? Fine, dismiss the drum shape as idle curiosity. The same question applies to any sensory data. If our hand feels cooler here, where is the insulation of the building damaged? If we have this electrocardiogram reading, what can we say about the action of the heart producing that? If we see the banks of a river, what can we know about how the river floods?

And this is the point, and purpose, of inverses. We can understand them as finding the causes of what we observe.

The first inverse we meet is usually the inverse function. It’s introduced as a way to undo what a function does. That’s an odd introduction, if you’re comfortable with what a function is. A function is a mathematical construct. It’s two sets — a domain and a range — and a rule that links elements in the domain to the range. To “undo” a function is like “undoing” a rectangle. But a function has a compelling “physical” interpretation. It’s routine to introduce functions as machines that take some numbers in and give numbers out. We think of them as ways to transform the domain into the range. In functional analysis we get to thinking of domains as the most perfect putty. We expect functions to stretch and rotate and compress and slide along as though they were drawing a Betty Boop cartoon.

So we’re trained to speak of a function as a verb, acting on pieces of the domain. An element or point, or a region, or the whole domain. We think the function “maps”, or “takes”, or “transforms” this into its image in the range. And if we can turn one thing into another, surely we can turn it back.

Some things it’s obvious we can turn back. Suppose our function adds 2 to whatever we give it. We can get the original back by subtracting 2. If the function subtracts 32 and divides by 1.8, we can reverse it by multiplying by 1.8 and adding 32. If the function takes the reciprocal, we can take the reciprocal again. We have a bit of a problem if we started out taking the reciprocal of 0, but who would want to do such a thing anyway? If the function squares a number, we can undo that by taking the square root. Unless we started from a negative number. Then we have trouble.
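
If it helps to see those reversals in code, here’s a minimal sketch in Python. The function names are my own, made up for the example; the rule is the subtract-32-and-divide-by-1.8 conversion from a sentence ago.

```python
def f(temp_f):
    """The function from above: subtract 32, then divide by 1.8."""
    return (temp_f - 32) / 1.8

def f_inverse(temp_c):
    """Undo f: multiply by 1.8, then add 32."""
    return temp_c * 1.8 + 32

# Applying the function and then its inverse hands back the original number.
for reading in [-40.0, 32.0, 98.6, 212.0]:
    assert abs(f_inverse(f(reading)) - reading) < 1e-9

print(f(98.6))           # 37.0, near enough
print(f_inverse(37.0))   # 98.6, near enough
```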

The trouble is that not every function has an inverse. Which we could have realized by thinking how to undo “multiply by zero”. For the inverse to be a well-defined function, the original rule has to match each element in the range to exactly one element in the domain. A function that does this is, in the impenetrable jargon of the mathematician, a “one-to-one function”. Or, if it also covers its whole range, you can describe it with the ever-so-much-more-intuitive label of “bijective”.

But there’s no reason more than one thing in the domain can’t match to the same thing in the range. If I know the cosine of my angle is \frac{\sqrt{3}}{2}, my angle might be 30 degrees. Or -30 degrees. Or 390 degrees. Or 330 degrees. You may protest there’s no difference between a 30 degree and a 390 degree angle. I agree those angles point in the same direction. But a gear rotated 390 degrees has done something that a gear rotated 30 degrees hasn’t. If all I know is where the dot I’ve put on the gear is, how can I know how much it’s rotated?

So what we do is shift from the actual cosine into one branch of the cosine. By restricting the domain we can create a function that has the same rule as the one we want, but that’s also one-to-one and so has an inverse. What restriction to use? That depends on what you want. But mathematicians have some that come up so often they might as well be defaults. So the square root is the inverse of the square of nonnegative numbers. The inverse Cosine is the inverse of the cosine of angles from 0 to 180 degrees. The inverse Sine is the inverse of the sine of angles from -90 to 90 degrees. The capital letters are convention to say we’re doing this. If we want a different range, we write out that we’re looking for an inverse cosine from -180 to 0 degrees or whatever. (Yes, the mathematician will default to using radians, rather than degrees, for angles. That’s a different essay.) It’s an imperfect solution, but it often works well enough.
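
Calculators and programming languages make the same default choice of branch. Here’s a quick sketch, assuming Python’s standard math module: acos only ever answers with the angle between 0 and 180 degrees, whatever angle produced the cosine.

```python
import math

# All of these angles have the same cosine, sqrt(3)/2.
for angle in [30, -30, 330, 390]:
    c = math.cos(math.radians(angle))
    principal = math.degrees(math.acos(c))
    print(angle, round(principal, 6))

# Every line recovers 30: acos only ever answers with the angle
# between 0 and 180 degrees, the conventional branch.
```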

The trouble we had with cosines, and functions, continues through all inverses. There are almost always alternate causes. Many shapes of drums sound alike. Take two metal bars. Heat both with a blowtorch, one on the end and one in the center. Not to the point of melting, only to the point of being too hot to touch. Let them cool in insulated boxes for a couple weeks. There’ll be no measurement you can do on the remaining heat that tells you which one was heated on the end and which the center. That’s not because your thermometers are no good or the flow of heat is not deterministic or anything. It’s that both starting cases settle to the same end. So here there is no usable inverse.

This is not to call inverses futile. We can look for what we expect to find useful. We are inclined to find inverses of the cosine between 0 and 180 degrees, even though 4140 through 4320 degrees is as legitimate. We may not know what is wrong with a heart, but have some idea what a heart could do and still beat. And there’s a famous example in 19th-century astronomy. After the discovery of Uranus came the discovery it did not move right. For a while it moved across the sky too fast for its distance from the sun. Then it started moving too slow. The obvious supposition was that there was another, not-yet-seen, planet, affecting its orbit.

The trouble is finding it. Calculating the orbit from what data they had required solving equations with 13 unknown quantities. John Couch Adams and Urbain Le Verrier attempted this anyway, making suppositions about what they could not measure. They made great suppositions. Le Verrier made the better calculations, and persuaded an astronomer (Johann Gottfried Galle, assisted by Heinrich Louis d’Arrest) to go look. Took about an hour of looking. They also made lucky suppositions. Both, for example, supposed the trans-Uranian planet would obey “Bode’s Law”, a seeming pattern in the sizes of the planets’ orbits. The actual Neptune does not. It was near enough in the sky to where the calculated planet would be, though. The world is vaster than our imaginations.

That there are many ways to draw Betty Boop does not mean there’s nothing to learn about how this drawing was done. And so we keep having inverses as a vibrant field of mathematics.


Next week I hope to cover the letter ‘C’ and don’t think I’m not worried about what that ‘C’ will be. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all the A-to-Z essays from past years, should be at this link. Thank you for reading.

My Little 2021 Mathematics A-to-Z: Triangle


And I have another topic suggested by John Golden, author of Math Hombre. It’s one of the basic bits of mathematics, and so is hard to think about.

Triangle.

Edward Brisse assembled a list of 2,001 things to call a “center” of a triangle. I’d have run out around three. We don’t need most of them. I mention them because the list speaks of how interesting we find triangles. Nobody’s got two thousand thoughts about enneadecagons (19-sided figures).

As always with mathematics it’s hard to say whether triangles are all that interesting or whether we humans are obsessed. They’ve got great publicity. The Pythagorean Theorem may be the only bit of interesting mathematics an average person can be assumed to recognize. The kinds of triangles — acute, obtuse, right, equilateral, isosceles, scalene — are fit questions for trivia games. An ordinary mathematics education can end in trigonometry. This ends up being about circles, but we learn it through triangles. The art and science of determining where a thing is we call “triangulation”.

But triangles do seem to stand out. They’re the simplest polygon, only three vertices and three edges. So we can slice any other polygon into triangles. Any triangle can tile the plane. Even quadrilaterals may need reflections of themselves. One of the first geometry facts we learn is the interior angles of a triangle add up to two right angles. And one of the first geometry facts we learn, discovering there are non-Euclidean geometries, is that they don’t have to.

Triangles have to be convex, that is, they don’t have any divots. This property sounds boring. But it’s a good boring; it makes other work easier. It tells us that the length of any two sides of a triangle add together to something longer than the third side. And that’s a powerful idea.

There are many ways to define “distance”. Mathematicians have tried to find the most abstract version of the concept. This inequality is one of the few pieces that every definition of “distance” must respect. This idea of distance leaps out of shapes drawn on paper. Last week I mentioned a triangle inequality, in discussing functions f and g . We can define operators that describe a distance between functions. And the distances between trios of functions behave like the distances between points on the triangle. Thus does geometry sneak in to abstract concepts like “piecewise continuous functions”.

And they serve in curious blends of the abstract and the concrete. For example, numerical solutions to partial differential equations. A partial differential equation is one where we want to know a function of two or more variables, and only have information about how the function changes as those variables change. These turn up all the time in any study of things in bulk. Heat flowing through space. Waves passing through fluids. Fluids running through channels. So any classical physics problem that isn’t, like, balls bouncing against each other or planets orbiting stars. We can solve these if they’re linear. Linear here is a term of art meaning “easy”. I kid; “linear” means more like “manageable”. All the good problems are nonlinear and we can exactly solve about two of them.

So, numerical solutions. We make approximations by putting down a mesh on the differential equation’s domain. And then, using several graduate-level courses’ worth of tricks, approximating the equation we want with one that we can solve here. That mesh, though? … It can be many things. One powerful technique is “finite elements”. An element is a small piece of space. Guess what the default shape for these elements is. There are times, and reasons, to use other shapes as elements. You learn those once you have the hang of triangles. (Dividing the space of your variables up into elements lets you look for an approximate solution using tools easier to manage than you’d have without. This is a bit like looking for one’s keys over where the light is better. But we can find something that’s as close as we need to our keys.)
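
To give a taste of what the default elements look like, here’s a small sketch, assuming NumPy and SciPy are handy. Real finite-element software does far more, but the starting point, covering a scatter of points with triangles, can be this short.

```python
import numpy as np
from scipy.spatial import Delaunay

# Scatter some points over the unit square; these will be the mesh nodes.
rng = np.random.default_rng(seed=42)
nodes = rng.random((40, 2))

# The Delaunay triangulation joins the nodes into triangular elements.
mesh = Delaunay(nodes)
print("number of triangular elements:", len(mesh.simplices))
print("node indices of the first element:", mesh.simplices[0])

# Each row of mesh.simplices names three nodes. Together the triangles
# cover the region spanned by the points, with no gaps and no overlaps.
```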

If we need finite elements for, oh, three dimensions of space, or four, then triangles fail us. We can’t fill a volume with two-dimensional shapes like triangles. But the triangle has its analog. The tetrahedron, in some sense four triangles joined together, has all the virtues of the triangle for three dimensions. We can look for a similar shape in four and five and more dimensions. If we’re looking for the thing most like an equilateral triangle, we’re looking for a “simplex”.

These simplexes, or these elements, sprawl out across the domain we want to solve problems for. They look uncannily like the triangles surveyors draw across the chart of a territory, as they show us where things are.


Next week I hope to cover the letter ‘I’ as I near the end of ‘Mathematics’ and consider what to do about ‘A To Z’. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all the A-to-Z essays from past years, should be at this link. Thank you once more for reading.

My Little 2021 Mathematics A-to-Z: Analysis


I’m fortunate this week to have another topic suggested again by Mr Wu, blogger and Singaporean mathematics tutor. It’s a big field, so forgive me not explaining the entire subject.

Analysis.

Analysis is about proving why the rest of mathematics works. It’s a hard field. My experience, a typical one, included crashing against real analysis as an undergraduate and again as a graduate student. It turns out mathematics works by throwing a lot of \epsilon symbols around.

Let me give an example. If you read pop mathematics blogs you know about the number represented by 0.999999\cdots . You’ve seen proofs, some of them even convincing, that this number equals 1. Not a tiny bit less than 1, but exactly 1. Here’s a real-analysis treatment. And — I may regret this — I recommend you don’t read it. Not closely, at least. Instead, look at its shape. Look at the words and symbols as graphic design elements, and trust that what I say is not nonsense. Resume reading after the horizontal rule.

It’s convenient to have a name for the number 0.999999\cdots . I’ll call that r , for “repeating”. 1 we’ll call 1. I think you’ll grant that whatever r is, it can’t be more than 1. I hope you’ll accept that if the difference between 1 and r is zero, then r equals 1. So what is the difference between 1 and r?

Give me some number \epsilon . It has to be a positive number. The implication in the letter \epsilon is that it’s a small number. This isn’t actually required in general. We expect it. We feel surprise and offense if it’s ever not the case.

I can show that the difference between 1 and r is less than \epsilon . I know there is some smallest counting number N so that \epsilon > \frac{1}{10^{N}} . For example, say \epsilon is 0.125. Then we can let N = 1, and 0.125 > \frac{1}{10^{1}} . Or suppose \epsilon is 0.00625. Then N = 3 is the smallest that works, and 0.00625 > \frac{1}{10^{3}} . (If \epsilon is bigger than 1, let N = 1.) Now we have to ask why I want this N.

Whatever the value of r is, I know that it is more than 0.9. And that it is more than 0.99. And that it is more than 0.999. In fact, it’s more than the number you get by truncating r after any whole number N of digits. Let me call r_N the number you get by truncating r after N digits. So, r_1 = 0.9 and r_2 = 0.99 and r_5 = 0.99999 and so on.

Since r > r_N , it has to be true that 1 - r < 1 - r_N . And since we know what r_N is, we can say exactly what 1 - r_N is. It's \frac{1}{10^{N}} . And we picked N so that \frac{1}{10^{N}} < \epsilon . So 1 - r < 1 - r_N = \frac{1}{10^{N}} < \epsilon . But all we know of \epsilon is that it's a positive number. It can be any positive number. So 1 - r has to be smaller than each and every positive number. The biggest number that’s smaller than every positive number is zero. So the difference between 1 and r must be zero and so they must be equal.
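
If you’d like to watch numbers do what the argument describes, here’s a small Python sketch. The function name is mine, made up for the example; it finds the N described above for a few choices of \epsilon and checks that the truncation r_N is already within \epsilon of 1.

```python
def smallest_N(epsilon):
    """The smallest counting number N with 1/10**N less than epsilon."""
    N = 1
    while 1 / 10**N >= epsilon:
        N += 1
    return N

for epsilon in [0.125, 0.00625, 0.0000003]:
    N = smallest_N(epsilon)
    r_N = float("0." + "9" * N)    # 0.9, 0.99, 0.999, and so on, cut off after N digits
    print(epsilon, N, 1 - r_N < epsilon)

# Every case prints True: the truncation is already within epsilon of 1,
# and the full 0.999999... can only be closer still.
```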


That is a compelling argument. Granted, it compels much the way your older brother kneeling on your chest and pressing your head into the ground compels. But this argument gives the flavor of what much of analysis is like.

For one, it is fussy, leaning to technical. You see why the subject has the reputation of driving off all but the most intent mathematics majors. If you get comfortable with this sort of argument it’s hard to notice anymore.

For another, the argument shows that the difference between two things is less than every positive number. Therefore the difference is zero and so the things are equal. This is one of mathematics’ most important tricks. And another point, there’s a lot of talk about \epsilon . And about finding differences that are, it usually turns out, smaller than some \epsilon . (As an undergraduate I found something wasteful in how the differences were so often so much less than \epsilon . We can’t exhaust the small numbers, though. It still feels uneconomic.)

Something this misses is another trick, though. That’s adding zero. I couldn’t think of a good way to use that here. What we often get is the need to show that, say, function f and function g are equal. That is, that they are less than \epsilon apart. What we can often do is show that f is close to some related function, which let me call f_n .

I know what you’re suspecting: f_n must be a polynomial. Good thought! Although in my experience, it’s actually more likely to be a piecewise constant function. That is, it’s some number, e.g. “2”, for part of the domain, and then “2.5” in some other region, with no transition between them. Some other values, even values not starting with “2”, in other parts of the domain. Usually this is easier to prove stuff about than even polynomials are.

Now bring in g_n . It’s got the same deal as f_n : some approximation that’s easier to prove stuff about. We want to show that g is close to this g_n . And then show that f_n is close to g_n . So — watch this trick. Or, again, watch the shape of this trick. Read again after the horizontal rule.

The difference | f - g | is equal to | f - f_n + f_n - g | since adding zero, that is, adding the number ( -f_n + f_n ) , can’t change a quantity. And | f - f_n + f_n - g | is equal to | f - f_n + f_n -g_n + g_n - g | . Same reason: ( -g_n + g_n ) is zero. So:

| f - g | = |f - f_n + f_n - g_n + g_n - g |

Now we use the “triangle inequality”. If a, b, and c are the lengths of a triangle’s sides, the sum of any two of those numbers is larger than the third. And that tells us:

|f - f_n + f_n - g_n + g_n - g | \le |f - f_n| + |f_n - g_n| + | g_n - g |

And then if you can show that | f - f_n | is less than \frac{1}{3}\epsilon ? And that | f_n - g_n | is also less than \frac{1}{3}\epsilon ? And you see where this is going for | g_n - g | ? Then you’ve shown that | f - g | \le \epsilon . With luck, each of these little pieces is something you can prove.
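
Here’s a numerical sketch of the same shape of argument, assuming NumPy. The functions, and the piecewise constant stand-ins for them, are invented for the example; the point is only that the three little distances together bound the big one.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1001)

f = np.sin(x)
g = x - x**3 / 6    # a cubic that stays close to sin(x) on this interval

def piecewise_constant(values, pieces=20):
    """Replace a sampled function with a step function, constant on each piece."""
    edges = np.linspace(0, len(values), pieces + 1).astype(int)
    steps = np.empty_like(values)
    for lo, hi in zip(edges[:-1], edges[1:]):
        steps[lo:hi] = values[lo]
    return steps

f_n = piecewise_constant(f)
g_n = piecewise_constant(g)

big_distance = np.max(np.abs(f - g))
three_small_distances = (np.max(np.abs(f - f_n))
                         + np.max(np.abs(f_n - g_n))
                         + np.max(np.abs(g_n - g)))
print(big_distance <= three_small_distances)   # True, by the triangle inequality
```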


Don’t worry about what all this means. It’s meant to give a flavor of what you do in an analysis course. It looks hard, but most of that is because it’s a different sort of work than you’d done before. If you hadn’t seen the adding-zero and triangle-inequality tricks? I don’t know how long you’d need to imagine them.

There are other tricks too. An old reliable one is showing that one thing is bounded by the other. That is, that f \le g . You use this trick all the time because if you can also show that g \le f , then those two have to be equal.

The good thing — and there is good — is that once you get the hang of these tricks analysis starts to come together. And even get easier. The first analysis course you take as a mathematics major is real analysis, all about functions of real numbers. The next course in this track is complex analysis, about functions of complex-valued numbers. And it is easy. Compared to what comes before, yes. But also on its own. Every theorem in complex analysis is named after Augustin-Louis Cauchy. They all show that the integral of your function, calculated along a closed loop, is zero. I exaggerate by \epsilon .

In grad school, if you make it, you get to functional analysis, which examines functions on functions and other abstractions like that. This, too, is easy, possibly because by then you’ve seen all the basic approaches over several courses. Or it feels easy after all that mucking around with the real numbers.

This is not the entirety of explaining how mathematics works. Since all these proofs depend on how numbers work, we need to show how numbers work. How logic works. But those are subjects we can leave for grad school, for someone who’s survived this gauntlet.


I hope to return in a week with a fresh A-to-Z essay. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all this year’s essays, and all A-to-Z essays from past years, should be at this link. Thank you once more for reading.

How October 2021 Treated My Mathematics Blog


I’m aware this is a fair bit into October. But it’s the first publication slot I’ve had free. At least since I want Wednesdays to take the Little 2021 A-to-Z essays, and Mondays the other thing I publish. If that, since October ended up another month when I barely managed one essay a week. Let me jump right to that, in fact. The five essays published here in October ranked like this, in popularity, and it’s not just order of publication:

I don’t know what made “Embedding” so popular. I’d suspect I may have hit a much-searched-for keyword except it doesn’t seem to be popular so far in November.

So I got 2,547 page views around here in October. This is up from the last couple months. It’s quite average for the twelve months from October 2020 through September 2021, though. The twelve-month running mean was 2,543.2 page views per month, and the running median was 2,569 views per month. I told you it was average.

Bar chart showing two and a half years' worth of readership figures. After a fairly steep three-month decline both page views and unique readers rose slightly in August, dropped in September, and rose a fair bit in October.
Hey, they forgot to put in the advertisement offering to sell me some kind of further insight this month. Now how will I know to “post content that your readers respond to”?

There were 1,733 unique visitors, as WordPress makes it out. That’s almost average, though a bit below. The running mean was 1,811.3 visitors per month for the twelve months leading up to October. The running median was 1,801 unique visitors. I can make this into something good; it implies people who visited read more stuff. A mere 30 likes were given in October, below the running mean of 47.5 and median of 45. And there were only five comments, below the mean of 16.2 and median of 12.

Given that I’m barely posting anymore, though, the numbers look all right. This was 509.4 views per posting, which creams the running mean of 286.0 and running median of 295.9 views per posting. There were 346.8 unique visitors per posting, even more above the running mean of 203.2 and running median of 205.6 unique visitors per posting. Rating things per posting even makes the number of likes look good: 6.0 per posting, above the mean of 5.2 and median of 4.9. Can’t help with comments, though. Those hang out at a still-anemic 1.0 comments per posting, below the running mean of 1.9 and median of 1.4.

WordPress figures that I published 5,335 words in October, an average of 1,067.0 words per posting. That is my second-chattiest month all year, and my highest words-per-posting average of any month this year. I don’t know where all those words came from. So far for all of 2021 I’ve published 44,323 words, averaging 599 words per essay.

As of the start of November I’ve published 1,656 essays here. They’ve drawn a total 146,834 views from 87,340 logged unique visitors. And drawn 3,285 comments altogether, so far.

If you’d like to follow this blog regularly, please do. You can use the “Follow Nebusresearch” button at the upper right corner of this page. Or you can get essays by e-mail as soon as they’re published, using the box just below that button. I never use the e-mail for anything but sending these essays. I can’t say what WordPress does with them, though.

To get essays on your RSS reader, use the feed at https://nebusresearch.wordpress.com/feed/. You can get RSS readers from several places, including The Old Reader or Newsblur. You also can sign up for a free account at Dreamwidth or Livejournal, and use https://www.dreamwidth.org/feeds/ or https://www.livejournal.com/syn to add my essays to your Reading or Friends page. This works for any RSS feed, and very many blogs and web sites still offer them.

While my Twitter account is unattended — all it does is post announcements of essays; I don’t see anything from it — I am on Mathstodon, the mathematics-themed instance of the Mastodon network. So you can catch me as @nebusj@mathstodon.xyz there, and I’m not sure anyone has yet. Still, thank you for reading, and here’s hoping for a good November.

My Little 2021 Mathematics A-to-Z: Monte Carlo


This week’s topic is one of several suggested again by Mr Wu, blogger and Singaporean mathematics tutor. He’d suggested several topics, overlapping in their subject matter, and I was challenged to pick one.

Monte Carlo.

The reputation of mathematics has two aspects: difficulty and truth. Put “difficulty” to the side. “Truth” seems inarguable. We expect mathematics to produce sound, deductive arguments for everything. And that is an ideal. But we often want to know things we can’t do, or can’t do exactly. We can handle that often. If we can show that a number we want must be within some error range of a number we can calculate, we have a “numerical solution”. If we can show that a number we want must be within every error range of a number we can calculate, we have an “analytic solution”.

There are many things we’d like to calculate and can’t exactly. Many of them are integrals, which seem like they should be easy. We can represent any integral as finding the area, or volume, of a shape. The trick is that there are only a few shapes whose areas and volumes we have exact formulas for. You may remember the area of a triangle or a parallelogram. You have no idea what the area of a regular nonagon is. The trick we rely on is to approximate the shape we want with shapes we know formulas for. This usually gives us a numerical solution.

If you’re any bit devious you’ve had the impulse to think of a shape that can’t be broken up like that. There are such things, and a good swath of mathematics in the late 19th and early 20th centuries was arguments about how to handle them. I don’t mean to discuss them here. I’m more interested in the practical problems of breaking complicated shapes up into simpler ones and adding them all together.

One catch, an obvious one, is that if the shape is complicated you need a lot of simpler shapes added together to get a decent approximation. Less obvious is that you need way more shapes to do a three-dimensional volume well than you need for a two-dimensional area. That’s important because you need even way-er more to do a four-dimensional hypervolume. And more and more and more for a five-dimensional hypervolume. And so on.

That matters because many of the integrals we’d like to work out represent things like the energy of a large number of gas particles. Each of those particles carries six dimensions with it. Three dimensions describe its position and three dimensions describe its momentum. Worse, each particle has its own set of six dimensions. The position of particle 1 tells you nothing about the position of particle 2. So you end up needing ridiculously, impossibly many shapes to get even a rough approximation.

With no alternative, then, we try wisdom instead. We train ourselves to think of deductive reasoning as the only path to certainty. By the rules of deductive logic it is. But there are other unshakeable truths. One of them is randomness.

We can show — by deductive logic, so we trust the conclusion — that the purely random is predictable. Not in the way that lets us say how a ball will bounce off the floor. In the way that we can describe the shape of a great number of grains of sand dropped slowly on the floor.

The trick is one we might get if we were bad at darts. If we toss darts at a dartboard, badly, some will land on the board and some on the wall behind. How many hit the dartboard, compared to the total number we throw? If we’re as likely to hit every spot of the wall, then the fraction that hit the dartboard, times the area of the wall, should be about the area of the dartboard.
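
Here’s a minimal sketch of that dart game in Python, using nothing but the standard random module. The “wall” is a 2-by-2 square and the “dartboard” a disc of radius 1, so the area the darts should reveal is \pi , about 3.14.

```python
import random

throws = 100_000
hits = 0
for _ in range(throws):
    # Throw a dart at a random spot on a 2-by-2 wall centered on the origin.
    x = random.uniform(-1.0, 1.0)
    y = random.uniform(-1.0, 1.0)
    # Count it if it lands on the round dartboard of radius 1.
    if x * x + y * y <= 1.0:
        hits += 1

wall_area = 4.0
estimated_board_area = wall_area * hits / throws
print(estimated_board_area)   # somewhere near pi, about 3.14, give or take
```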

So we can do something equivalent to this dart-throwing to find the volumes of these complicated, hyper-dimensional shapes. It’s a kind of numerical integration. It isn’t particularly sensitive to how complicated the shape is, though. It takes more work to find the volume of a shape with more dimensions, yes. But it takes less more-work than the breaking-up-into-known-shapes method does. There are wide swaths of mathematics and mathematical physics where this is the best way to calculate the integral.

This bit that I’ve described is called “Monte Carlo integration”. The “integration” part of the name because that’s what we started out doing. To call it “Monte Carlo” implies either the method was first developed there or the person naming it was thinking of the famous casinos. The case is the latter. Monte Carlo methods as we know them come from Stanislaw Ulam, mathematical physicist working on atomic weapon design. While ill, he got to playing the game of Canfield solitaire, about which I know nothing except that Stanislaw Ulam was playing it in 1946 while ill. He wondered what the chance was that a given game was winnable. The most practical approach was sampling: set a computer to play a great many games and see what fractions of them were won. (The method comes from Ulam and John von Neumann. The name itself comes from their colleague Nicholas Metropolis.)

There are many Monte Carlo methods, with integration being only one very useful one. They hold in common that they’re built on randomness. We try calculations — often simple ones — many times over with many different possible values. And the regularity, the predictability, of randomness serves us. The results come together to an average that is close to the thing we do want to know.


I hope to return in a week with a fresh A-to-Z essay. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all A-to-Z essays from past years, should be at this link. And if you’d like to shape the next several essays, please let me know of some topics worth writing about! Thank you for reading.

I’m looking for some more topics for the Little 2021 Mathematics A-to-Z


I am happy to be near the midpoint of my Little 2021 Mathematics A-to-Z. It feels like forever since I planned to start this, but it has been a long and a hard year. I am in need of topics for the third quarter of letters, the end of the word ‘Mathematics’, and so I appeal to my kind readers for help.

What are mathematical topics which start with the letters I, C, or S, that you’d like to see me try explaining? Leave a comment, and let me know. I’ll pick the one I think I can be most interesting about. As you nominate things, please also include a mention of your own blog or YouTube channel or book. Whatever other projects you do that people might enjoy. The projects don’t need to be mathematical. The topics don’t need to be either, although I like being able to see mathematics from them.

Here are the topics I’ve covered in past years. I’m willing to consider redoing one of these, if I can find a fresh approach. So don’t be afraid to ask if you think I might do a better job about, oh, cohomology or something.

I.


C.


S.


(Please note: there’s nothing I can do with cohomology. I did my best and that’s how it came out.)

All the Little 2021 A-to-Z essays should be at this link. And if you like, all of my A-to-Z essays, for every year, should be at this link. Thanks for reading, and thanks for suggesting things.

My Little 2021 Mathematics A-to-Z: Embedding


Elkement, who’s one of my longest blog-friends here, put forth this suggestion for an ‘E’ topic. It’s a good one. They’re author of the Theory and Practice of Trying to Combine Just Anything blog. Their blog has recently been exploring complex-valued numbers and how to represent rotations.

Embedding.

Consider a book. It’s a collection. It’s easy to see the ordered setting of words, maybe pictures, possibly numbers or even equations. The important thing is the ideas those all represent.

Set the book in a library. How can this change the book?

Perhaps the comparison to other books shows us something the original book neglected. Perhaps something in the original book we now realize was a brilliantly-presented insight. The way we appreciate the book may change.

What can’t change is the content of the original book. The words stay the same, in the same order. If it’s a physical book, the number of pages stays the same, as does the size of the page. The ideas expressed remain the same.

So now you understand embedding. It’s a broad concept, something that can have meaning for any mathematical structure. A structure here is a bunch of items and some things you can do with them. A group, for example, is a good structure to use with this sort of thing. So, for example, the integers and regular addition. This original structure’s embedded in another when everything in the original structure is in the new, and everything you can do with the original structure you can do in the new and get the same results. So, for example, the group you get by taking the integers and regular addition? That’s embedded in the group you get by taking the rational numbers and regular addition. 4 + 8 is 12 whether or not you consider 6.5 a topic fit for discussion. It’s an embedding that expands the set of elements, and that modifies the things you can do to match.

The group you get from the integers and addition is embedded in other things. For example, it’s embedded in the ring you get from the integers and regular addition and regular multiplication. 4 + 8 remains 12 whether or not you can multiply 4 by 8. This embedding doesn’t add any new elements, just new things you can do with them.
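
Here’s a tiny sketch of that embedding, assuming Python’s fractions module: send each integer to the rational number with denominator 1, and check that addition gives the same answers on either side of the embedding.

```python
from fractions import Fraction

def embed(n):
    """Send an integer to its image among the rational numbers."""
    return Fraction(n, 1)

# The embedding respects the addition: adding and then embedding gives
# the same answer as embedding and then adding.
for a, b in [(4, 8), (-3, 7), (0, 12)]:
    assert embed(a + b) == embed(a) + embed(b)

print(embed(4) + embed(8))   # 12, same as ever
```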

Once you have the name, you see embedding everywhere. When we first learn arithmetic we — I, anyway — learn it as adding whole numbers together. Then we embed that into whole numbers with addition and multiplication. And then the (nonnegative) rational numbers with addition and multiplication. At some point (I forget when) the negative numbers came in. So did the whole set of real numbers. Eventually the real numbers got embedded into the complex numbers. And the complex numbers got embedded into the quaternions, although we found real and complex numbers enough for most of our work. I imagine something similar goes on these days.

There’s never only one embedding possible. Consider, for example, two-dimensional geometry, the shapes of figures on a sheet of paper. It’s easy to put that in three dimensions, by setting the paper on the floor and extending its figures by drawing in chalk on the wall. Or you can set the paper on the wall, and extend its figures by drawing in chalk on the floor. Or set the paper at an angle to the floor. What you use depends on what’s most convenient. And that can be driven by laziness. It’s easy to match, say, the point in two dimensions at coordinates (3, 4) with the point in three dimensions at coordinates (3, 4, 0), even though (0, 3, 4) or (4, 0, 3) are as valid.

Why embed something in another thing? For the same reasons we do any transformation in mathematics. One is that we figure to embed the thing we’re working on into something easier to deal with. A famous example of this is the Nash embedding theorem. It describes when certain manifolds can be embedded into something that looks like normal space. And that’s useful because it can turn nonlinear partial differential equations — the most insufferable equations — into something solvable.

Another good reason, though, is the one implicit in that early arithmetic education. We started with whole-numbers-with-addition. And then we added the new operation of multiplication. And then new elements, like fractions and negative numbers. If we follow this trail we get to some abstract, tricky structures like octonions. But by small steps in which we have great experience guiding us into new territories.


I hope to return in a week with a fresh A-to-Z essay. This week’s essay, and all the essays for the Little Mathematics A-to-Z, should be at this link. And all of this year’s essays, and all A-to-Z essays from past years, should be at this link. Thank you once more for reading.

My Little 2021 Mathematics A-to-Z: Hyperbola


John Golden, author of the Math Hombre blog, had several great ideas for the letter H in this little A-to-Z for the year. Here’s one of them.

Hyperbola.

The hyperbola is where advanced mathematics begins. It’s a family of shapes, some of the pieces you get by slicing a cone. You can make an approximate one shining a flashlight on a wall. Other conic sections are familiar, everyday things, though. Circles we see everywhere. Ellipses we see everywhere we look at a circle in perspective. Parabolas we learn, in approximation, watching something tossed, or squirting water into the air. The hyperbola should be as accessible. Hold your flashlight parallel to the wall and look at the outline of light it casts. But the difference between this and a parabola isn’t obvious. And it’s harder to see hyperbolas in nature. It’s the path a space probe swinging past a planet makes? Great guide for all of us who’ve launched space probes past Jupiter.

When we learn of hyperbolas, somewhere in high school algebra or in precalculus, they seem designed to break the rules we had inferred. We’ve learned functions like lines and quadratics (parabolas) and cubics. They’re nice, simple, connected shapes. The hyperbola comes in two pieces. We’ve learned that the graph of a function crosses any given vertical line at most once. Now, we can expect to see it twice. We learn to sketch functions by finding a few interesting points — roots, y-intercepts, things like that. Hyperbolas, we’re taught to draw by first making a little central box and then two asymptotes. And the asymptote is itself a new idea: a simpler curve that the actual curve almost, but never quite, equals.

We’re trained to see functions as having a couple of odd points where they’re not defined. Nobody expects y = 1 \div x to mean anything when x is zero. But we learn these as weird, isolated points. Now there’s this whole interval of x-values that doesn’t fit anything on the graph. Half the time, anyway, because we see two classes of hyperbolas. There are ones that open like cups, pointing up and down. Those have definitions for every value of x. There are ones that open like ears, pointing left and right. Those have a stretch in the center where no y satisfies the x’s. They seem like they’re taught just to be mean.

They’re not, of course. The only mathematical thing we teach just to be mean is integration by trigonometric substitution. The things which seem weird or new in hyperbolas are, largely, things we didn’t notice before. A vertical line put across a circle or ellipse crosses the curve twice, for most positions of the line. There are two huge intervals, to the left and to the right of the circle, where no value of y makes the equation true. Circles are familiar, though. Ellipses don’t seem intimidating. We know we can’t turn x^2 + y^2 = 4 (a typical circle) into a function without some work. We have to write either f(x) = \sqrt{4 - x^2} or f(x) = -\sqrt{4 - x^2} , breaking the circle into two halves. The same happens for hyperbolas, though, with x^2 - y^2 = 4 (a typical hyperbola) turning into f(x) = \sqrt{x^2 - 4} or f(x) = -\sqrt{x^2 - 4} .

Even the definitions seem weird. The ellipse we can draw by taking a set distance and two focus points. If the distance from the first focus to a point plus the distance from the point to the second focus is that set distance, the point’s on the ellipse. We can use two thumbtacks and a piece of string to draw the ellipse. The hyperbola has a similar rule, but weirder. You have your two focus points, yes. And a set distance. But the locus of points of the hyperbola is everything where the distance from the point to one focus minus the distance from the point to the other focus is that set distance. Good luck doing that with thumbtacks and string.

Yet hyperbolas are ready for us. Consider playing with a decent calculator, hitting the reciprocal button for different numbers. 1 turns to 1, yes. 2 turns into 0.5. -0.125 turns into -8. It’s the simplest iterative game to do on the calculator. If you sketch this, though, all the points (x, y) where one coordinate is the reciprocal of the other? It’s two curves. They approach without ever touching the x- and y-axes. Get far enough from the origin and there’s no telling this curve from the axes. It’s a hyperbola, one that obeys that vertical-line rule again. It has only the one value of x that can’t be allowed. We write it as y = \frac{1}{x} or even xy = 1 . But it’s the shape we see when we draw x^2 - y^2 = 2 , rotated. Or a rotation of one we see when we draw y^2 - x^2 = 2 . The equations of rotated shapes are annoying. We do enough of them for ellipses and parabolas and hyperbolas to meet the course requirement. But they point out how the hyperbola is a more normal construct than we fear.
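
If you’d rather not do the rotation algebra, here’s a quick sketch in Python that checks the claim: take points on the reciprocal curve, rotate them 45 degrees, and see that the rotated coordinates satisfy x^2 - y^2 = 2 .

```python
import math

theta = math.radians(-45)
cos_t, sin_t = math.cos(theta), math.sin(theta)

for x in [0.25, 0.5, 1.0, 2.0, -0.125, -8.0]:
    y = 1.0 / x                        # a point on the reciprocal curve
    # Rotate the point (x, y) by 45 degrees clockwise about the origin.
    u = x * cos_t - y * sin_t
    v = x * sin_t + y * cos_t
    print(round(u * u - v * v, 9))     # 2.0 every time

# The rotated points satisfy u^2 - v^2 = 2: the reciprocal curve is the
# hyperbola x^2 - y^2 = 2, turned 45 degrees.
```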

And let me look at that construct again. An equation describing a hyperbola that opens horizontally or vertically looks like ax^2 - by^2 = c for some constant numbers a, b, and c. (If a, b, and c are all positive, this is a hyperbola opening horizontally. If a and b are positive and c negative, this is a hyperbola opening vertically.) An equation describing an ellipse, similarly with its axes horizontal or vertical looks like ax^2 + by^2 = c . (These are shapes centered on the origin. They can have other centers, which make the equations harder but not more enlightening.) The equations have very similar shapes. Mathematics trains us to suspect things with similar shapes have similar properties. That change from a plus to a minus seems too important to ignore, and yet …

I bet you assumed x and y are real numbers. This is convention, the safe bet. If someone wants complex-valued numbers they usually say so. If they don’t want to be explicit, they use z and w as variables instead of x and y. But what if y is an imaginary number? Suppose y = \imath t , for some real number t, where \imath^2 = -1 . You haven’t missed a step; I’m summoning this from nowhere. (Let’s not think about how to draw a point with an imaginary coordinate.) Then ax^2 - by^2 = c is ax^2 - b(\imath t)^2 = c which is ax^2 + bt^2 = c . And despite the weird letters, that’s a circle. By the same supposition we could go from ax^2 + by^2 = c , which we’d taken to be a circle, and get ax^2 - bt^2 = c , a hyperbola.

Fine stuff inspiring the question “so?” I made up a case and showed how that made two dissimilar things look alike. All right. But consider trigonometry, built on the cosine and sine functions. One good way to see the cosine and sine of an angle is as the x- and y-coordinates of a point on the unit circle, where x^2 + y^2 = 1 . (The angle \theta is the one from the point (\cos(\theta), \sin(\theta)) to the origin to the point (1, 0).)

There exists, in parallel to the familiar trig functions, the “hyperbolic trigonometric functions”. These have imaginative names like the hyperbolic sine and hyperbolic cosine. (And onward. We can speak of the “inverse hyperbolic cosecant”, if we wish no one to speak to us again.) Usually these get introduced in calculus, to give the instructor a tiny break. Their derivatives, and integrals, look much like those of the normal trigonometric functions, but aren’t the exact same problems over and over. And these functions, too, have a compelling meaning. The hyperbolic cosine of an angle and hyperbolic sine of an angle have something to do with points on a unit hyperbola, x^2 - y^2 = 1 .
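
Here’s a quick sketch of that relationship, assuming Python’s standard math module: the hyperbolic cosine and sine of any number give a point on the unit hyperbola, the same way the ordinary cosine and sine give a point on the unit circle.

```python
import math

for t in [-2.0, -0.5, 0.0, 0.5, 1.0, 3.0]:
    x, y = math.cosh(t), math.sinh(t)
    on_hyperbola = x * x - y * y                    # always 1, up to rounding
    on_circle = math.cos(t)**2 + math.sin(t)**2     # the circular analogue, also 1
    print(round(on_hyperbola, 9), round(on_circle, 9))
```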

Thinking back on the flashlight. We get a circle by holding the light perpendicular to the wall. We get a hyperbola holding the light parallel. We get a circle by drawing x^2 + y^2 = 1 with x and y real numbers. We get a hyperbola by (somehow) drawing x^2 + y^2 = 1 with x real and y imaginary. We remember something about representing complex-valued numbers with a real axis and an orthogonal imaginary axis.

One almost feels the connection. I can’t promise that pondering this will make hyperbolas as familiar as circles, or at least ellipses. But often a problem that brings us to hyperbolas has an alternate phrasing that’s about ellipses, and vice-versa. And the common traits of these conic slices can guide you into a new understanding of mathematics.


Thank you for reading. I hope to have another piece next week at this time. This and all of this year’s Little Mathematics A to Z essays should be at this link. And the A-to-Z essays for every year should be at this link.

My Little 2021 Mathematics A-to-Z: Torus


Mr Wu, a mathematics tutor in Singapore and author of the blog about that, offered this week’s topic. It’s about one of the iconic mathematics shapes.

Torus

When one designs a board game, one has to decide what the edge of the board means. Some games make getting to the edge the goal, such as Candy Land or backgammon. Some games set their play so the edge is unreachable, such as Clue or Monopoly. Some make the edge an impassable limit, such as Go or Scrabble or Checkers. And sometimes the edge becomes something different.

Consider a strategy game like Risk or Civilization or their video game descendants like Europa Universalis. One has to be able to go east, or west, without limit. But there’s no making a cylindrical board. Or making a board infinite in extent, side to side. Instead, the game demands we connect borders. Moving east one space from just-at-the-Eastern-edge means we put the piece at just-at-the-Western-edge. As a video game this is seamless. As a tabletop game we just learn to remember those units in Alberta are not so far from Kamchatka as they look. We have the awkward point that the board doesn’t let us go over the poles. It doesn’t hurt game play: no one wants to invade Russia from the north. We can represent a boundless space on our table.
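
Here’s a sketch of that east-west wrapping in Python, with board dimensions made up for the example. Columns wrap around; rows stop at the poles.

```python
BOARD_WIDTH = 12    # columns, which wrap around east-to-west
BOARD_HEIGHT = 8    # rows, which stop at the "poles"

def move(row, col, d_row, d_col):
    """Move a piece, wrapping east-west but never over the poles."""
    new_col = (col + d_col) % BOARD_WIDTH                    # Alberta meets Kamchatka
    new_row = min(max(row + d_row, 0), BOARD_HEIGHT - 1)     # clamp at the poles
    return new_row, new_col

print(move(3, 11, 0, 1))    # (3, 0): one step east off the eastern edge
print(move(0, 5, -1, 0))    # (0, 5): no going over the north pole
```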

Sometimes we need more. Consider the arcade game Asteroids. The player’s spaceship hopes to survive by blasting into dust the asteroids cluttered around them. The game ‘board’ is the arcade screen, a manageable slice of space. Asteroids move in any direction, often drifting off-screen. If they were then out of the game, this would make victory so easy as to be unsatisfying. So the game takes a tip from the strategy games, and connects the right edge of the screen to the left. If we ask why an asteroid last seen moving to the right now appears on the left, well, there are answers. One is to say we’re in a very average segment of a huge asteroid field. There’s about as many asteroids that happen to be approaching from off-screen as recede from us. Why our local work destroying asteroids eliminates the off-screen asteroids is a mystery for the ages. Perhaps the rest of the fleet is also asteroid-clearing at about our pace. What matters is we still have to do something with the asteroids.

Almost. We’ve still got asteroids leaking away through the top and bottom. But we can use the same trick the right and left edges do. And now we have some wonderful things. One is a balanced game. Another is the space in which ship and asteroids move. It is no rectangle now, but a torus.

This is a neat space to explore. It’s unbounded, for example, just as the surface of the Earth is. Or (it appears) the actual universe is. Set your course right and your spaceship can go quite a long way without getting back to exactly where it started from, again much like the surface of the Earth or the universe. We can impersonate an unbounded space using a manageably small set of coordinates, a decent-size game board.

That’s a nice trick to have. Many mathematics problems are about how great blocks of things behave. And it’s usually easiest to model these things if there aren’t boundaries. We can handle boundaries, sure, but they’re hard, most of the time. So we analyze great, infinitely-extending stretches of things.

Analysis does great things. But we need sometimes to do simulations, too. Computers are, as ever, a great temptation here. Look at a spreadsheet with hundreds of rows and columns of cells. Each can represent a point in space, interacting with whatever’s nearby by whatever our rule is. And this can do very well … except these cells have to represent a finite territory. A million rows can’t span more than one million times the greatest distance between rows. We have to handle that.

There are tricks. One is to model the cells as being at ever-expanding distances, trusting that there are regions too dull to need much attention. Another is to give the boundary some values that, we figure, look as generic as possible. That “past here it carries on like that”. The trick that makes rhetorical sense to mention here is creating a torus, matching left edge to right, top edge to bottom. Front edge to back if it’s a three-dimensional model.
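
Here’s a minimal sketch of that last trick, assuming NumPy. It runs a toy smoothing step, over and over, on a grid whose edges have been glued into a torus; np.roll does the gluing, treating each edge as the neighbor of the opposite edge.

```python
import numpy as np

def torus_smoothing_step(grid):
    """Average each cell with its four neighbors, edges glued torus-style."""
    north = np.roll(grid, -1, axis=0)    # the row past the bottom is the top row
    south = np.roll(grid, 1, axis=0)
    east = np.roll(grid, -1, axis=1)     # the column past the right edge is the left
    west = np.roll(grid, 1, axis=1)
    return (grid + north + south + east + west) / 5.0

rng = np.random.default_rng(seed=7)
temperature = rng.random((6, 6))
for _ in range(50):
    temperature = torus_smoothing_step(temperature)

# With no boundary to leak through, the average value is conserved and
# the grid settles toward a uniform temperature.
print(round(temperature.mean(), 6), round(temperature.std(), 6))
```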

Making a torus works if a particular spot is mostly affected by its local neighborhood. This describes a lot of problems we find interesting. Many of them are in statistical mechanics, where we do a lot of problems about particles in grids that can do one of two things, depending on the locale. But many mechanics problems work like this too. If we’re interested in how a satellite orbits the Earth, we can ignore that Saturn exists, except maybe as something it might photograph.

And just making a grid into a torus doesn’t solve every problem. This is obvious if you imagine making a torus that’s two rows and two columns linked together. There won’t be much interesting behavior there. Even a reasonably large grid offers problems. There might be structures worth studying that are larger than the torus is across, and those will be missed. That we have a grid means that a shape is easier to represent if it’s horizontal or vertical. In a real continuous space there are no directions to be partial to.

There are topology differences too. A famous result shows that four colors are enough to color any map on the plane. On the torus we need at least seven. Putting colors on things may seem like a trivial worry. But map colorings represent information about how stuff can be connected. And here’s a huge difference in these connections.

This all is about one aspect of a torus. Likely you came in wondering when I would get to talking about doughnut shapes, and the line about topology may have readied you to hear about coffee cups. The torus, like most any mathematical concept familiar enough that ordinary people know the word, connects to many ideas. Some of them have more than one hole. Some have surfaces that intersect themselves. Some extend into four or more dimensions. Some are even constructs that appear in phase space, describing ways that complicated physical systems can behave. These are all reflections of this shape idea that we can learn from thinking about game boards.


This and all of this year’s Little Mathematics A to Z essays should be at this link. And the A-to-Z essays for every year should be at this link.
