My friend ChefMongoose pointed out this probability question. As with many probability questions, it comes from a dice game. Here, Yahtzee, based on rolling five dice to make combinations. I’m not sure whether my Twitter problems will get in the way of this embedding working; we’ll see.
Probability help please! You are playing Yahtzee against your insanely competitive spouse. You have two rolls left. You’re trying to get three of a kind. Is it better to commit and roll three dice here? Or split it and roll one die? — Christopher Yost.
Of the five dice, two are showing 1’s; two are showing 2’s; and there’s one last die that’s a 3.
As with many dice questions you can in principle work this out by listing all the possible combinations of every possible outcome. A bit of reasoning takes much less work, but you have to think through the reasons.
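For a rough sense of the numbers, here is a small Python sketch. The simplifying assumptions are mine, not part of the original question. "Split" means keeping 1, 1, 2, 2 and rerolling the lone die up to twice. "Commit" means keeping the pair of 1's and rerolling the other three dice up to twice, counting a success only when at least one more 1 turns up. That undercounts "commit", since it ignores cases such as the three rerolled dice all matching each other, so the second figure is only a lower bound.

```python
from fractions import Fraction


def p_split():
    """Keep 1,1,2,2 and reroll one die up to twice; a 1 or a 2 succeeds."""
    miss_once = Fraction(4, 6)  # the rerolled die shows 3, 4, 5, or 6
    return 1 - miss_once ** 2


def p_commit_lower_bound():
    """Keep 1,1 and reroll three dice up to twice; count only extra 1's."""
    # Failure means no 1 among three dice on the first roll,
    # and again no 1 among three dice on the second roll.
    no_one_in_six_dice = Fraction(5, 6) ** 6
    return 1 - no_one_in_six_dice


print(float(p_split()))               # about 0.556
print(float(p_commit_lower_bound()))  # about 0.665
```

Even the undercounted "commit" figure comes out ahead of "split" under these assumptions.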
I like starting the year with a look at the past year’s readership. Really what I like is sitting around waiting to see if WordPress is going to provide any automatically generated reports on this. The first few years I was here it did, this nice animated video with fireworks corresponding to posts and how they were received. That’s been gone for years and I suppose isn’t ever coming back. WordPress is run by a bunch of cowards.
But I can still do a look back the old-fashioned way, like I do with the monthly recaps. There’s just fewer years to look back on, and less reliable trends to examine.
2020 was my ninth full year of mathematics blogging. (I reach my tenth anniversary in September and no, I haven’t any idea what I’ll do for that. Most likely forget.) It was an unusual one in that I set aside what’s been my largest gimmick, the Reading the Comics essays, in favor of my second-largest gimmick, the A-to-Z. It’s the first year I’ve done an A-to-Z that didn’t have a month or two with a posting every day. Also along the way I slid from having a post every Sunday come what may to having a post every Wednesday, although usually a Monday and a Friday also. Everyone claims it helps a blog to have a regular schedule, although I don’t know whether the particular day of the week counts for much. But how did all that work out for me?
So, I had a year that nearly duplicated 2019. There were 24,474 page views in 2020, down insignificantly from 2019’s 24,662. There were 16,870 unique visitors in 2020, up but also insignificantly from the 16,718 visiting in 2019. The number of likes continued to drift downward, from 798 in 2019 to 662 in 2020. My likes peaked in 2015 (over 3200!) and have fallen off ever since in what sure looks like a Poisson distribution to my eye. But the number of comments — which also peaked in 2015 (at 822) — actually rose, from 181 in 2019 to 198 in 2020.
There’s two big factors in my own control. One is when I post and, as noted, I moved away from Sunday posts midway through the year. The other is how much I post. And that dropped: in 2019 I had 201 posts published. In 2020 I posted only 178.
I thought of 2020 as a particularly longwinded year for me. WordPress says I published only 118,941 words, though, for an average of 672 words per posting. That’s my fewest number of words since 2014, though, and my shortest words-per-posting for the year going since 2013. Apparently throwing things off is all those posts that just point to earlier posts.
And what was popular among posts this year? Rather than give even more attention to how many kinds of trapezoid I can think of, I’ll focus just on what were the most popular things posted in 2020. Those were:
I am, first, surprised that so many Reading the Comics posts were among the most-read pieces. I like them, sure, but how many of them say anything that’s relevant once you’ve forgotten whether you read today’s Scary Gary? And yes, I am going to be bothered until the end of time that I was inconsistent about including the # symbol in the Playful Math Education Blog Carnival posts.
I fell off checking what countries sent me readers, month by month. I got bored writing an image alt-text of “Mercator-style map of the world, with the United States in dark red and most of the New World, western Europe, South and Pacific Rim Asia, Australia, and New Zealand in a more uniform pink” over and over and over again. But it’s a new year, it’s worth putting some fuss into things. And then, hey, what’s this?
Yeah! I finally got a reader from Greenland! Two page views, it looks like. Here’s the whole list, for the whole world.
United Arab Emirates
Hong Kong SAR China
Macau SAR China
Trinidad & Tobago
U.S. Virgin Islands
Bosnia & Herzegovina
Northern Mariana Islands
This is 141 countries, or country-like constructs, all together. I don’t know how that compares to previous years but I’m sure it’s the first time I’ve had five different countries send me a thousand page views each. That’s all gratifying to see.
So what plans have I got for 2021? And when am I going to get back to Reading the Comics posts? Good questions and I don’t know. I suppose I will pick up that series again, although since I took no notes last week, it isn’t going to be this week. At some time this year I want to do another A-to-Z, but I am still recovering from the workload of the last. Anything else? We’ll see. I am open to suggestions of things people think I should try, though.
This is, at least, a retrocomputing-adjacent piece. I’m looking back at the logic of a common and useful tool from the early-to-mid-80s and why it’s built that way. I hope you enjoy. It has to deal with some of the fussier points about how Commodore 64 computers worked. If you find a paragraph is too much technical fussing for you, I ask you to not give up, just zip on to the next paragraph. It’s interesting to know why something was written that way, but it’s all right to accept that it was and move to the next point.
How Did You Get Computer Programs In The 80s?
When the world and I were young, in the 1980s, we still had computers. There were two ways to get software, though. One was trading cassette tapes or floppy disks with cracked programs on them. (The cracking was taking off the copy-protection.) The other was typing. You could type in your own programs, certainly, just like you can make your own web page just by typing. Or you could type in a program. We had many magazines and books that had programs ready for entry. Some were serious programs, spreadsheets and word processors and such. Some were fun, like games or fractal-generators or such. Some were in-between, programs to draw or compose music or the such. Some added graphics or sound commands that the built-in BASIC programming language lacked. All this was available for the $2.95 cover price, or ten cents a page at the library photocopier. I had a Commodore 64 for most of this era, moving to a Commodore 128 (which also ran Commodore 64 programs) in 1989 or so. So my impressions, and this article, default to the Commodore 64 experience.
These programs all had the same weakness. You had to type them in. You can expect to make errors. If the program was written in BASIC you had a hope of spotting errors. The BASIC programming language uses common English words for its commands. Their grammar is not English, but it’s also very formulaic, and not hard to pick up. One has a chance of spotting mistakes if it’s 250 PIRNT "SUM; " S one typed.
But many programs were distributed as machine language. That is, the actual specific numbers that correspond to microchip instructions. For the Commodore 64, and most of the eight-bit home computers of the era, this was the 6502 microchip. (The 64 used a variation, the 6510. The differences between the 6502 and 6510 don’t matter for this essay.) Machine language had advantages, making the programs run faster, and usually able to do more things than BASIC could. But a string of numbers is only barely human-readable. Oh, you might in time learn to recognize the valid microchip instructions. But it is much harder to spot the mistakes on entering 32 255 120. That last would be a valid command on any eight-bit Commodore computer. It would have the computer print something, if it weren’t for the transposition errors.
What Was MLX and How Did You Use It?
The magazines came up with tools to handle this. In the 398-page(!) December 1983 issue of Compute!, my favorite line of magazines introduced MLX. This was a program, written in BASIC, which let you enter machine language programs. Charles Brannon has the credit for writing the article which introduced it. I assume he also wrote the program, but could be mistaken. I’m open to better information. Other magazines had other programs to do the same work; I knew them less well. MLX formatted machine language programs to look like this:
What did all this mean, though? These were lines you would enter in while running MLX. Before the colon was a location in memory. The numbers after the colon — the entries, I’ll call them — are six machine language instructions, one number to go into each memory cell. So, the number 169 was destined to go into memory location 49152. The number 002 would go into memory location 49153. The number 141 would go into memory location 49154. And so on; 000 would go into memory location 49158, 141 into 49159, 179 into 49160. 002 would go into memory location 49164; 141 would go into memory location 49170. And so on.
MLX would prompt you with the line number, the 49152 or 49158 or 49164 or so on. Machine language programs could go into almost any memory location. You had to tell it where to start. 49152 was a popular location for Commodore 64 programs. It was the start of a nice block of memory not easily accessed except by machine language programs. Then you would type in the entries, the numbers that follow. This was a reasonably efficient way to key this stuff in. MLX automatically advanced the location in memory and would handle things like saving the program to tape or disk when you were done.
The alert reader notices, though, that there are seven entries after the colon in each line. That seventh number is the checksum. It’s the guard that Compute! and Compute!’s Gazette put against typos. MLX did a calculation based on the memory location and the first six numbers of the line. If the result was not the seventh number on the line, then there was an error somewhere. You had to re-enter the line to get it right.
The thing I’d wondered, and finally got curious enough to explore, was how it calculated this.
What Was The Checksum And How Did It Work?
Happily, Compute! and Compute!’s Gazette published MLX in almost every issue, so it’s easy to find. You can see it, for example, on page 123 of the October 1985 issue of Compute!’s Gazette. And MLX was itself a BASIC program. There are quirks of the language, and its representation in magazine print, that take time to get used to. But one can parse it without needing much expertise. One important thing is that most Commodore BASIC commands didn’t need spaces after them. For an often-used program like this they’d skip the spaces. And the : symbol denoted the end of one command and start of another. So, for example, PRINTCHR$(20):IFN=CKSUMTHEN530 one learns means PRINT CHR$(20): IF N = CKSUM THEN 530.
So how does it work? MLX is, as a program, convoluted. It’s well-described by the old term “spaghetti code”. But the actual calculation of the checksum is done in a single line of the program, albeit one with several instructions. I’ll print it, but with some spaces added in to make it easier to read.
500 CKSUM = AD - INT(AD/256)*256:
FOR I = 1 TO 6:
CKSUM = (CKSUM + A(I))AND 255:
NEXT
Most of this you have a chance of understanding even if you don’t program. CKSUM is the checksum number. AD is the memory address for the start of the line. A is an array of six numbers, the six numbers of that line of machine language. I is an index, a number that ranges from 1 to 6 here. Each A(I) happens to be a number between 0 and 255 inclusive, because that’s the range of integers you can represent with eight bits.
What Did This Code Mean?
So to decipher all this. Starting off. CKSUM = AD - INT(AD/256)*256. INT means “calculate the largest integer not greater than whatever’s inside”. So, like, INT(50/256) would be 0; INT(300/256) would be 1; INT(600/256) would be 2. What we start with, then, is the checksum is “the remainder after dividing the line’s starting address by 256”. We’re familiar with this, mathematically, as “address modulo 256”.
In any modern programming language, we’d write this as CKSUM = MOD(AD, 256) or CKSUM = AD % 256. But Commodore 64 BASIC didn’t have a modulo command. This structure was the familiar and comfortable enough workaround. But, read on.
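To see that the INT workaround really is a remainder, here is a minimal Python sketch. The function names are my own, and `math.floor` stands in for BASIC's INT; this mimics the arithmetic only, not how Commodore BASIC stores numbers.

```python
import math


def basic_int(x):
    """Mimic BASIC's INT: the largest integer not greater than x."""
    return math.floor(x)


def low_byte(ad):
    """AD - INT(AD/256)*256, the workaround used at the start of line 500."""
    return ad - basic_int(ad / 256) * 256


for address in (49152, 49158, 49164):
    print(address, low_byte(address), address % 256)
```

For every address from 0 to 65,535 the workaround and the `%` operator give the same number.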
The next bit was a for/next loop. This would do the steps inside for every integer value of I, starting at 1 and increasing to 6. CKSUM + A(I) has an obvious enough intention. What is the AND 255 part doing, though?
AND, here, is a logic operator. For the Commodore 64, it works on numbers represented as two-byte integers. These have a memory representation of 11111111 11111111 for ‘true’, and 00000000 00000000 for ‘false’. The very leftmost bit, for integers, is a plus-or-minus-sign. If that leftmost bit is a 1, the number is negative; if that leftmost bit is a 0, the number is positive. Did you notice me palming that card, there? We’ll come back to that.
Ordinary whole numbers can be represented in binary too. Like, the number 26 has a binary representation of 00000000 00011010. The number, say, 14 has a binary representation of 00000000 00001110. 26 AND 14 is the number 00000000 00001010, the binary digit being a 1 only when both the first and second numbers have a 1 in that column. This bitwise and operation is also sometimes referred to as masking, as in masking tape. The zeroes in the binary digits of one number mask out the binary digits of the other. (Which does the masking is a matter of taste; 26 AND 14 is the same number as 14 AND 26.)
The binary 00000000 00001010 is the decimal number 10. So you can see that generally these bitwise and operations give you weird results. Taking the bitwise and for 255 is more predictable, though. The number 255 has a bit representation of 00000000 11111111. So what (CKSUM + A(I)) AND 255 does is … give the remainder after dividing (CKSUM + A(I)) by 256. That is, it’s (CKSUM + A(I)) modulo 256.
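The masking arithmetic is easy to check in Python, whose `&` operator is a bitwise and. This is only a sketch of the idea; the helper name `bits` is mine, and Python's integers are not limited to two bytes the way the Commodore's were.

```python
def bits(n):
    """Show a non-negative number below 65,536 as two bytes of binary digits."""
    b = format(n, "016b")
    return b[:8] + " " + b[8:]


print(bits(26))        # 00000000 00011010
print(bits(14))        # 00000000 00001110
print(bits(26 & 14))   # 00000000 00001010
print(26 & 14)         # 10

# Masking with 255 keeps only the low eight bits, which is the remainder mod 256.
print(300 & 255, 300 % 256)
```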
The formula’s not complicated. To write it in mathematical terms, the calculation is:
$$\mathrm{CKSUM} = \left(\mathrm{AD} + \sum_{i=1}^{6} A(i)\right) \bmod 256$$
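In Python terms, a minimal sketch of the whole of line 500 might look like the following. The function name is mine, and the six entries in the example are hypothetical values chosen for illustration, not a line copied from any published program.

```python
def mlx_checksum(address, entries):
    """Checksum for one MLX line: the starting address plus six entries, mod 256."""
    if len(entries) != 6:
        raise ValueError("an MLX line carries exactly six entries")
    cksum = address % 256              # stands in for AD - INT(AD/256)*256
    for value in entries:
        cksum = (cksum + value) & 255  # stands in for (CKSUM + A(I)) AND 255
    return cksum


# Hypothetical line contents, for illustration only.
example_entries = [169, 0, 141, 32, 208, 96]
print(mlx_checksum(49152, example_entries))
print(mlx_checksum(49158, example_entries))
```

The same six entries give a different checksum when they sit at a different address, which matters for catching wrong-line errors.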
Why Write It Like That?
So we have a question. Why are we calculating a number modulo 256 by two different processes? And in the same line of the program?
We get an answer by looking at the binary representation of 49152, which is 11000000 00000000. Remember that card I just palmed? I had warned that if the leftmost digit there were a 1, the number was understood to be negative. 49152 is many things, none of them negative.
So now we know the reason behind the odd programming choice to do the same thing two different ways. As with many odd programming choices it amounts to technical details of how Commodore hardware worked. The Commodore 64’s logical operators — AND, OR, and NOT — work on variables stored as two-byte integers. Two-byte integers can represent numbers from -32,768 up to +32,767. But memory addresses on the Commodore 64 are indexed from 0 up to 65,535. We can’t use bit masking to do the modulo operation, not on memory locations.
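The range problem can be illustrated with a short Python sketch. This is ordinary two's-complement arithmetic, not an emulation of Commodore BASIC, and both function names are my own.

```python
def fits_signed_16(n):
    """True when n is in the range a two-byte signed integer can hold."""
    return -32768 <= n <= 32767


def as_signed_16(n):
    """Reinterpret the low 16 bits of n as a two's-complement integer."""
    n &= 0xFFFF
    return n - 0x10000 if n & 0x8000 else n


print(fits_signed_16(49152))  # False: too big for a two-byte signed integer
print(as_signed_16(49152))    # the same sixteen bits read as a negative number
print(fits_signed_16(1785))   # True: 255 + 6*255, the largest possible running sum
```

Addresses in the upper half of memory fall outside the range, while any running checksum total stays comfortably inside it.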
I have a second question, though. Look at the work inside the FOR loop. It takes the current value of the checksum, adds one of the entries to it, and takes the bitwise AND of that with 255. Why? The value would be the same if we waited until the loop was done to take the bitwise AND. At least, it would be unless the checksum grew to larger than 32,767. The checksum will be the sum of at most seven numbers, none of them larger than 255, though, so that can’t be the constraint. It’s usually faster to do as little inside a loop as possible, so, why this extravagance?
My first observation is that this FOR loop does the commands inside it six times. And logical operations like AND are very fast. The speed difference could not possibly be perceived. There is a point where optimizing your code is just making life harder for yourself.
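Here is a small Python check that masking inside the loop and masking once at the end agree. It is a sketch with function names of my own choosing, not a transcription of anything in MLX.

```python
def checksum_masking_each_step(address, entries):
    """Mask with 255 after every addition, as line 500 does."""
    cksum = address % 256
    for value in entries:
        cksum = (cksum + value) & 255
    return cksum


def checksum_masking_once(address, entries):
    """Add everything first, then mask a single time."""
    total = address % 256 + sum(entries)  # never more than 255 + 6*255 = 1785
    return total & 255


worst_case = [255] * 6
print(checksum_masking_each_step(65535, worst_case))
print(checksum_masking_once(65535, worst_case))
```

Even in the worst case the unmasked total is 1,785, far below 32,767, so the two approaches match.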
My second observation goes back to the quirks of the Commodore 64. You entered commands, like the lines of a BASIC program, on a “logical line” that allowed up to eighty tokens. For typing in commands this is the same as the number of characters. Can this line be rewritten so there’s no redundant code inside the for loop, and so it’s all under 80 characters long?
Yes. This line would have the same effect and it’s only 78 characters:
I don’t have a clear answer. I suspect it’s for the benefit of people typing in the MLX program. In typing that in I’d have trouble not putting in a space between FOR and I, or between CKSUM and AND. Also before and after the TO and before and after AND. This would make the line run over 80 characters and make it crash. The original line is 68 characters, short enough that anyone could add a space here and there and not mess up anything. In looking through MLX, and other programs, I find there are relatively few lines more than 70 characters long. I have found them as long as 76 characters, though. I can’t rule out there being 78- or 79-character lines. They would have to suppose anyone typing them in understands when the line is too long.
There’s an interesting bit of support for this. Compute! also published machine language programs for the Atari 400 and 800. A version of MLX came out for the Atari at the same time the Commodore 64’s came out. Atari BASIC allowed for 120 characters total. And the equivalent line in Atari MLX was:
500 CKSUM=ADDR-INT(ADDR/256)*256:FOR I=1 TO 6:CKSUM=CKSUM+A(I):CKSUM=CKSUM-256*(CKSUM>255):NEXT I
This has a longer name for the address variable. It uses a different way to ensure that CKSUM stays a number between 0 and 255. But the whole line is only 98 characters.
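The Atari approach translates to Python almost word for word. This sketch assumes, as the sign in the Atari line implies, that a true comparison counts as 1; Python's `True` happens to behave the same way in arithmetic. The function name is mine.

```python
def atari_style_checksum(addr, entries):
    """Keep the running sum in 0..255 by subtracting 256 whenever it overflows."""
    cksum = addr - (addr // 256) * 256
    for value in entries:
        cksum = cksum + value
        # (cksum > 255) is True or False; in arithmetic those act as 1 or 0.
        cksum = cksum - 256 * (cksum > 255)
    return cksum


# Hypothetical entries, for illustration only.
print(atari_style_checksum(49152, [169, 0, 141, 32, 208, 96]))
```

One subtraction per step is enough because the running sum starts at 255 or less and each entry adds at most 255, so it never reaches 512.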
We could save more spaces on the Commodore 64 version, though. Commodore BASIC “really” used only the first two characters of a variable name. To write CKSUM is for the convenience of the programmer. To the computer it would be the same if we wrote CK. We could even truncate it to CK for this one line of code. The only penalty would be confusing the reader who doesn’t remember that CK and CKSUM are the same variable.
And there’s no reason that this couldn’t have been two lines. One line could add up the checksum and a second could do the bitwise AND. Maybe this is all a matter of the programmer’s tastes.
In a modern language this is all quite zippy to code. To write it in Octave or Matlab is something like:
This is a bit verbose. I want it to be easier to see what work is being done. We could make it this compact:
function [checksOut] = oldmlx(oneline)
  % oneline holds the address, the six entries, and then the printed checksum
  checksOut = !(mod(sum(oneline(1:7))-oneline(8), 256));
end
I don’t like compressing my thinking quite that much, though.
But that’s the checksum. Now the question: did it work?
Was This Checksum Any Good?
Since Compute! and Compute!’s Gazette used it for years, the presumptive answer is that it did. The real question, then, is did it work well? “Well” means does it prevent the kinds of mistakes you’re likely to make without demanding too much extra work. We could, for example, eliminate nearly all errors by demanding every line be entered three times and accept only a number that’s entered the same at least two of three times. That’s an incredible typing load. Here? We have to enter one extra number for every six. Much lower load, but it allows more errors through. But the calculation is — effectively — simply “add together all the numbers we typed in, and see if that adds to the expected total”. If it stops the most likely errors, though, then it’s good. So let’s consider them.
The first and simplest error? Entering the wrong line. MLX advanced the memory location on its own. So if you intend to write the line for memory location 50268, and your eye slips and you start entering that for 50274 instead? Or even, reading left to right, going to line 50814 in the next column? Very easy to do. This checksum will detect that nicely, though. Entering one line too soon, or too late, will give a checksum that’s off by 6. If your eye skips two lines, the checksum will be off by 12. The only way for the checksum to miss the error is to enter a line that’s some multiple of 256 memory locations away. And since each line is six memory locations, that means you have to jump 768 memory locations away. That is 128 lines away. You are not going to make that mistake. (Going from one column in the magazine to the next is a jump of 91 lines. The pages were 8½-by-11 pages, so were a bit easier to read than the image makes them look.)
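The arithmetic behind that is short enough to check in Python. This is a sketch with names of my own choosing; it looks only at how much the address part of the checksum shifts when you type a line some number of lines away from where MLX thinks you are.

```python
ENTRIES_PER_LINE = 6


def checksum_shift(lines_skipped):
    """How far the checksum moves when the data belongs to a line that far away."""
    return (ENTRIES_PER_LINE * lines_skipped) % 256


def first_undetected_slip():
    """Smallest positive number of lines whose shift is invisible to the checksum."""
    lines = 1
    while checksum_shift(lines) != 0:
        lines += 1
    return lines


print(checksum_shift(1))        # 6
print(checksum_shift(2))        # 12
print(checksum_shift(91))       # the column-to-column jump is still caught
print(first_undetected_slip())  # 128
```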
How about other errors? You could mis-key, say, 169. But think of the plausible errors. Typing it in as 159 or 196 or 269 would be detected by the checksum. The only one that wouldn’t would be to enter a number that’s equal to 169, modulo 256. So, 425, say, or 681. There is nobody so careless as to read 169 and accidentally type 425, though. In any case, other code in MLX rejects any data that’s not between 0 and 255, so that’s caught before the checksum comes into play.
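A brute-force check in Python backs this up. It is a sketch: the function names are mine and the line contents are hypothetical. It tries every wrong value from 0 to 255 in one position and collects any that slip past the checksum.

```python
def mlx_checksum(address, entries):
    """The address plus the six entries, modulo 256."""
    return (address + sum(entries)) % 256


def undetected_single_miskeys(address, entries, position):
    """Wrong values in 0..255 at one position that leave the checksum unchanged."""
    good = mlx_checksum(address, entries)
    misses = []
    for wrong in range(256):
        if wrong == entries[position]:
            continue
        trial = list(entries)
        trial[position] = wrong
        if mlx_checksum(address, trial) == good:
            misses.append(wrong)
    return misses


# Hypothetical line, for illustration only.
line = [169, 0, 141, 32, 208, 96]
print(undetected_single_miskeys(49152, line, 0))  # an empty list: nothing slips through
print(425 % 256, 681 % 256)                       # both 169, but outside the allowed range
```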
So it’s safe against the most obvious mistake. And against mis-keying a single entry. Yes, it’s possible that you typed in the whole line right but mis-keyed the checksum. If you did that you felt dumb but re-entered the line. If you even noticed and didn’t just accept the error report and start re-entering the line.
What about mis-keying double entries? And here we have trouble. Suppose that you’re supposed to enter 169, 062 and instead enter 159, 072. They’ll add to the same quantity, and the same checksum. All that’s protecting you is that it takes a bit of luck to make two errors that exactly balance each other. But, then, slipping and hitting an adjacent number on the keyboard is an easy mistake to make.
Worse is entry transposition. If you enter 062, 169 instead you have made no checksum errors. And you won’t even be typing any number “wrong”. At least with the mis-keying you might notice that 169 is a common number and 159 a rare one in machine language. (169 was the command “Load Accumulator”. That is, copy a number into the Central Processing Unit’s accumulator. This was one of three on-chip memory slots. 159 was no meaningful command. It would only appear as data.) Swapping two numbers is another easy error to make.
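Both failures are easy to reproduce in Python. In this sketch the 169 and 062 come from the examples above; the other four entries and the function name are hypothetical, there only to fill out a six-entry line.

```python
def mlx_checksum(address, entries):
    """The address plus the six entries, modulo 256."""
    return (address + sum(entries)) % 256


# 169 and 62 are from the text; the remaining entries are made up.
intended = [169, 62, 141, 32, 208, 96]
balanced = [159, 72, 141, 32, 208, 96]  # two mis-keys that cancel out
swapped = [62, 169, 141, 32, 208, 96]   # the first two entries transposed

for line in (intended, balanced, swapped):
    print(line, mlx_checksum(49152, line))
```

All three lines print the same checksum, since a plain sum does not care which entry a number sits in.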
And they would happen. I can attest from experience. I’d had at least one program which, after typing, had one of these glitches. After all the time spent entering it, I ended up with a program that didn’t work. And I never had the heart to go back and track down the glitch or, more efficiently, retype the whole thing from scratch.
The irony is that the program with the critical typing errors was a machine language compiler. It’s something that would have let me write this sort of machine language code. Since I never reentered it, I never created anything but the most trivial of machine language programs for the 64.
So this MLX checksum was fair. It devoted one-seventh of the typing to error detection. It could catch line-swap errors, single-entry mis-keyings, and transpositions within one entry. It couldn’t catch transposing two entries. So that could have been better. I hope to address that soon.
And now, finally, the close of the All 2020 Mathematics A-to-Z. You may see this as coming in after the close of 2020. I say, well, I’ve done that before. Things that come close to the end of the year are prone to that.
The first important lesson was that I need to read exactly what topics I’ve written about before going ahead with a new week’s topic. I am not sorry, really, to have written about Tiling a second time. I’d rather it have been more than two years after the previous time. But I can make a little something out of that, too. I enjoy the second essay more. I don’t think that’s only because I like my most recent writing more. In the second version I looked at one of those fussy little specific questions. Particularly, what were the 20,426 tiles which Robert Berger found could create an aperiodic tiling? Tracking that down brought me to some fun new connections. And it let me write in a less foggy way. It’s always tempting to write the most generally true thing possible. But details and example cases are easier to understand. It’s surprising that no one in the history of knowledge has observed this difference before I did.
The second lesson was about work during a crisis. 2020 was the most stressful year of my life, a fact I hope remains true. I am aware that ritual, doing regular routine things, helps with stress. So a regular schedule of composing an essay on a mathematical topic was probably a good thing for me. Committing to the essay meant I had specific, attainable goals on clear, predictable deadlines. The catch is that I never got on top of the A-to-Z the way I hoped. My ideal for these is to have the essay written a week ahead of publication. Enough that I can sleep on it many times and amend it as needed. I never got close to this. I was running up close to deadline every week. If I were better managing all this I’d have gotten all November’s essays written before the election, and I didn’t, and that’s why I had to slip a week. I have always been a Sabbath-is-made-for-man sort, so don’t feel bad for slipping the week. But I would have liked to never have had a week when I was copy-editing a half-hour before publication.
It does all imply that I need to do what I resolve every year. Select topics sooner. Start research and drafts sooner. Let myself slip a deadline when that’s needed. But there is also the observation that apparently I can’t cut down the time I spend writing. The first several years of this, believe it or not, I wrote three essays a week for eight intense weeks. These would be six to eight hundred words each. Then I slacked off, doing two a week; these of course grew to a thousand, maybe 1200 words each. For 2020? One essay a week and more than one topped 2500 words. Yes, the traditional joke is that you write a lot because you don’t have the time to write briefly. But writing a lot takes time too.
They’re challenging. In the pandemic particularly, as I can’t rely on the university library for a quick biography to read. Or to check journals of mathematical history, although I haven’t resorted to such actual information yet. But I’m also aware that I am not a historian or a real biographer. I have to balance drawing conclusions I can feel confident are not wrong with making declarations that are interesting to read. Still, I enjoy a focus on the culture of mathematics, and how mathematics interacts with the broader culture. It’s a piece mathematicians tend not to acknowledge; our field’s reputation for objective truth is a compelling romantic legend.
I do plan to write an A-to-Z for 2021. I suspect I’ll do it as this year, one per week. I don’t know when I’ll start, although it should be earlier than June. I’ll want to give myself more possible slip dates without running off the year. I will not be writing about tiling again. I do realize that, since I have seven A-to-Z sequences of 26 essays each, I could in principle fill half a year with writing by reblogging each, one a day. I’m not sure the point of such an exercise, but it would at least fill the content hole.
There is a side of me that would like to have a blogging gimmick that doesn’t commit me to 26 essays. I’ve tried a couple; they haven’t ever caught like this has. Maybe I could do something small and focused, like, ten terms from complex analysis. I’m open to suggestions.
When will I resume covering mathematical themes in comic strips? I don’t know; it’s the obvious thing to do while I wait for the A-to-Z cycle to start anew. It’s got some of the A-to-Z thrill, of writing about topics someone else chose. But I need some time to relax and play and I don’t know when I’ll be back to regular work.
I’m very slightly sorry to bump other things. But folks who like the history of mathematics, and how it links to other things, and who also like listening to stuff, might want to know. Peter Adamson, host of the History Of Philosophy Without Any Gaps podcast, this week talked for about twenty minutes about Girolamo Cardano.
Cardano is famous in mathematics circles for early work in probability. And, more, for pioneering the use of imaginary numbers. This along the way to a fantastic controversy about credit, and discovery, and secrets, and self-promotion.
Cardano was, as Adamson notes, a polymath; his day job was as a physician and he poked around in the philosophy of mind. That’s what makes him a fit subject for Adamson’s project. So if you’d like a different perspective on a person known, if vaguely, to many mathematics folks, and have a spot of time, you might enjoy.
And a happy new year, at last, to all. I’ll take this chance first to look at my readership figures from December. Later I’ll look at the whole year, and what things I would learn from that if I were capable of learning from this self-examination.
I had 13 posts here in December, which is my lowest count since June. For the twelve months from December 2019 through November 2020, I’d posted a mean of 15.3 and a median of 15 posts. So that’s relatively quiet. My blog overall got 2,366 page views from 1,751 unique visitors. That’s a decline from October and November. But it’s still above the running averages, which had a mean of 1,957.8 and median of 1,974 page views. And a mean of 1,335.7 and median of 1,290.5 unique visitors.
There were 51 likes given to posts in December. That’s barely below the twelve-month running averages, which had a mean of 54.6 and a median of 52 likes. The number of comments collapsed to a mere 4 and while it’s been worse, it’s still dire. There were a mean of 15.3 and median of 15 comments through the twelve months before that.
If it’s disappointing to see numbers drop, and it is, there’s some evidence that it’s all my own fault. Even beyond that this is my blog and I’m the only one writing for it. That is in the per-posting statistics. There were 182.0 views per posting, which is well above the averages (132.0 mean, 132.6 median). It’s also near the averages in November (191.5) and October (169.1). Likes per posting were even better: 3.9, compared to a running average mean of 3.5 and running average median of 3.4. The per-posting likes had been 4.0 and 4.4 the previous months. Comments per posting — 0.3 — is still a dire number, though. The running-average mean was 1.1 per posting and median of 1.0 per posting.
It suggests that the best thing I can do for my statistics is post more. Most of December’s posts were little but links to even earlier posts. This feels like cheating to me, to do too often. On the other hand, I’ve had 1,580 posts over the past decade; why have that if I’m not going to reuse them? And, yes, it’s a bit staggering to imagine that I could repost one entry a day for four and a third years before I ran out. (Granting that a lot of those would be references to earlier posts. Or things like monthly statistics recaps that make not a lick of sense to repeat.)
What were popular posts from November or December 2020? It turns out the five most popular posts from that stretch were all December ones:
It feels weird that How Many Of This Weird Prime Are There? was so popular since that was posted the 30th of December. (And late, at that, as I didn’t schedule it right.) So in 30 hours it attracted more readers than posts that had all of November and December to collect readers. I guess there’s something about weird primes that people want to read about. Although not to comment on with their answers to the third prime of the form … well, maybe they’re leaving it for other people to find, unspoiled. I also always find it weird that these How-A-Month-Treated-My-Blog posts are so popular. I think other insecure bloggers like to see someone else suffering.
According to WordPress I published 7,758 words in December. This is only my fourth-most-laconic month in 2020. This put me also at an average of 596.8 words per posting in December. My average for all 2020 was 672 words per posting, so all those recaps were in theory saving me time.
Also according to WordPress, I started January 2021 with a total of 1,581 posts ever. (There’s one secret post, created to test some things out; there’s no sense revealing or deleting it.) These have drawn a total 122,051 views from 69,848 logged unique visitors. It’s not a bad record for a blog entering its tenth year of publication without ever getting a clear identity.
My Twitter account has gone feral. While it’s still posting announcements, I don’t read it, because I don’t have the energy to figure out why it sometimes won’t load. If you want to social-media thing with me try me on the Mastodon account @firstname.lastname@example.org. Mathstodon is a mathematics-themed instance of that microblogging network you remember hearing something about somewhere but not what anybody said about it.
And, yeah, I hope to have my closing thoughts about the 2020 A-To-Z later this week. Thank you all for reading.
A friend made me aware of a neat little unsolved problem in number theory. I know it seems like number theory is nothing but unsolved problems, but this is an unfair reputation. There are as many as four solved problems in number theory. It’s a tough field.
The question started with the observation that 11 is a prime number. And so is 101. But 1,001 is not; nor is 10,001. How many prime numbers are there that have the form $10^n + 1$, for whole-number values of n? Are there infinitely many? Finitely many? If there’s finitely many, how many are there?
It turns out this is an open question. We know of three prime numbers that you can write as $10^n + 1$. I’ll leave the third for you to find.
One neat bit is that if there are more prime numbers, they have to be ones where n is itself a whole power of 2. That is, where the number is $10^{2^k} + 1$ for some whole number k. They’ve been tested up to at least, so this subset of the Generalized Fermat Numbers seems to be rare. But wouldn’t it be just our luck if from onward they were nothing but primes?
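For small exponents the question is easy to poke at by brute force. Here is a minimal sketch in Python. The helper names `is_prime` and `primes_of_form` are made up for this illustration, and plain trial division is only practical because every composite in this small range happens to have a small factor.

```python
from math import isqrt

def is_prime(m: int) -> bool:
    """Trial division; adequate for the small cases tried here."""
    if m < 2:
        return False
    if m % 2 == 0:
        return m == 2
    for d in range(3, isqrt(m) + 1, 2):
        if m % d == 0:
            return False
    return True

def primes_of_form(max_n: int) -> list[int]:
    """Exponents n in 0..max_n for which 10**n + 1 is prime."""
    return [n for n in range(max_n + 1) if is_prime(10**n + 1)]

# Prints the exponents that work; run it if you want the puzzle spoiled.
print(primes_of_form(16))
```

This says nothing about the open question, of course. It only confirms that nothing new turns up among the first handful of exponents.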
Folks who’ve been with me a long while know one of my happy Christmastime traditions is watching the Aardman Animation film Arthur Christmas. The film also gave me a great mathematical-physics question. You might consider some questions it raises.
First: Could ‘Arthur Christmas’ Happen In Real Life? There’s a spot in the movie when Arthur and Grand-Santa are stranded on a Caribbean island while the reindeer and sleigh, without them, go flying off in a straight line. What does a straight line on the surface of the Earth mean?
Second: Returning To Arthur Christmas. From here spoilers creep in and I have to discuss, among other things, what kind of straight line the reindeer might move in. There is no one “right” answer.
Third: Arthur Christmas And The Least Common Multiple. If we suppose the reindeer move in a straight line the way satellites move in a straight line, we can calculate how long Arthur and Grand-Santa would need to wait before the reindeer and sled are back if they’re lucky enough to be waiting on the equator.
Fourth: Six Minutes Off. Waiting for the reindeer to get back becomes much harder if Arthur and Grand-Santa are not on the equator. This has potential dangers for saving the day.
Fifth and last: Arthur Christmas and the End of Time. We get to the thing that every mathematical physics blogger really really wants to get into. This is the paradox that conservation of energy and the fact of entropy seem to force us into some weird conclusions, if the universe can get old enough. Maybe; there’s some extra considerations, though, that can change the conclusion.
I am happy, as ever, to complete an A-to-Z. Also to take some time to recover after the project. I had thought that spreading things out to 26 weeks would make them less stressful, and instead, I just wrote even longer pieces, in compensation. I’ll try to have other good observations in an essay next week.
For now, though, a piece that I will find useful for years to come: a roster of what essays I wrote this year. In future years, I may even check them before writing a third piece about tiling.
Jacob Siehler had several suggestions for this last of the A-to-Z essays for 2020. Zorn’s Lemma was an obvious choice. It’s got an important place in set theory, it’s got some neat and weird implications. It’s got a great name. The zero divisor is one of those technical things mathematics majors have to deal with. It never gets any pop-mathematics attention. I picked the less-travelled road and found a delightful scenic spot.
3 times 4 is 12. That’s a clear, unambiguous, and easily-agreed-upon arithmetic statement. The thing to wonder is what kind of mathematics it takes to mess that up. The answer is algebra. Not the high school kind, with x’s and quadratic formulas and all. The college kind, with group theory and rings.
A ring is a mathematical construct that lets you do a bit of arithmetic. Something that looks like arithmetic, anyway. It has a set of elements. (An element is just a thing in a set. We say “element” because it feels weird to call it “thing” all the time.) The ring has an addition operation. The ring has a multiplication operation. Addition has an identity element, something you can add to any element without changing the original element. We can call that ‘0’. The integers, or to use the lingo $\mathbb{Z}$, are a ring (among other things).
Among the rings you learn, after the integers, is the integers modulo … something. This can be modulo any counting number. The integers modulo 10, for example, we write as $\mathbb{Z}_{10}$ for short. There are different ways to think of what this means. The one convenient for this essay is that it’s the integers 0, 1, 2, up through 9. And that the result of any calculation is “how much more than a whole multiple of 10 this calculation would otherwise be”. So then 3 times 4 is now 2. 3 times 5 is 5; 3 times 6 is 8. 3 times 7 is 1, and doesn’t that seem peculiar? That’s part of how modulo arithmetic warns us that groups and rings can be quite strange things.
We can do modulo arithmetic with any of the counting numbers. Look, for example, at $\mathbb{Z}_5$ instead. In the integers modulo 5, 3 times 4 is … 2. This doesn’t seem to get us anything new. How about $\mathbb{Z}_8$? In this, 3 times 4 is 4. That’s interesting. It doesn’t make 3 the multiplicative identity for this ring. 3 times 3 is 1, for example. But you’d never see something like that for regular arithmetic.
How about $\mathbb{Z}_{12}$? Now we have 3 times 4 equalling 0. And that’s a dramatic break from how regular numbers work. One thing we know about regular numbers is that if a times b is 0, then either a is 0, or b is zero, or they’re both 0. We rely on this so much in high school algebra. It’s what lets us pick out roots of polynomials. Now? Now we can’t count on that.
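The products quoted above are quick to reproduce. A minimal Python sketch follows; `times_mod` is a name invented here, and the moduli 8 and 12 are the ones consistent with the stated results (3 times 4 being 4 with 3 times 3 being 1, and then 3 times 4 being 0).

```python
def times_mod(a: int, b: int, n: int) -> int:
    """Multiply a and b, keeping only the remainder on division by n."""
    return (a * b) % n

for n in (10, 5, 8, 12):
    print(f"modulo {n}: 3 times 4 is {times_mod(3, 4, n)}")
```

The last line of output is the surprising one: two nonzero things whose product is zero.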
When this does happen, when one thing times another equals zero, we have “zero divisors”. These are anything in your ring that can multiply by something else to give 0. Is zero, the additive identity, always a zero divisor? … That depends on what the textbook you first learned algebra from said. To avoid ambiguity, you can write a “nonzero zero divisor”. This clarifies your intentions and slows down your copy editing every time you read “nonzero zero”. Or call it a “nontrivial zero divisor” or “proper zero divisor” instead. My preference is to accept 0 as always being a zero divisor. We can disagree on this. What of zero divisors other than zero?
Your ring might or might not have them. It depends on the ring. The ring of integers $\mathbb{Z}$, for example, doesn’t have any zero divisors except for 0. The ring of integers modulo 12, $\mathbb{Z}_{12}$, though? Anything that isn’t relatively prime to 12 is a zero divisor. So, 2, 3, 4, 6, 8, 9, and 10 are zero divisors here. The ring of integers modulo 13, $\mathbb{Z}_{13}$? That doesn’t have any zero divisors, other than zero itself. In fact any ring of integers modulo a prime number, $\mathbb{Z}_p$, lacks zero divisors besides 0.
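Listing the zero divisors of the integers modulo n by hand invites slips, so here is a minimal Python sketch that does it two ways: straight from the definition, and by the "not relatively prime" shortcut. The function names are invented for this illustration.

```python
from math import gcd

def zero_divisors(n: int) -> list[int]:
    """Nonzero a for which some nonzero b has a*b equal to 0 modulo n."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

def not_coprime(n: int) -> list[int]:
    """Nonzero a sharing a factor bigger than 1 with n."""
    return [a for a in range(1, n) if gcd(a, n) != 1]

print(zero_divisors(12))  # [2, 3, 4, 6, 8, 9, 10]
print(zero_divisors(13))  # []
```

Both functions agree for every modulus tried in the check below, which is the shortcut earning its keep.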
Focusing too much on integers modulo something makes zero divisors sound like some curious shadow of prime numbers. There are some similarities. Whether a number is prime depends on your multiplication rule and what set of things it’s in. Being a zero divisor in one ring doesn’t directly relate to whether something’s a zero divisor in any other. Knowing what the zero divisors are tells you something about the structure of the ring.
It’s hard to resist focusing on integers-modulo-something when learning rings. They work very much like regular arithmetic does. Even the strange thing about them, that every result is from a finite set of digits, isn’t too alien. We do something quite like it when we observe that three hours after 10:00 is 1:00. But many sets of elements can create rings. Square matrixes are the obvious extension. Matrixes are grids of elements, each of which … well, they’re most often going to be numbers. Maybe integers, or real numbers, or complex numbers. They can be more abstract things, like rotations or whatnot, but they’re hard to typeset. It’s easy to find zero divisors in matrixes of numbers. Imagine, like, a matrix that’s all zeroes except for one element, somewhere. There are a lot of matrices which, multiplied by that, will be a zero matrix, one with nothing but zeroes in it. Another common kind of ring is the polynomials. For these you need some constraint like the polynomial coefficients being integers-modulo-something. You can make that work.
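A two-by-two case makes the matrix zero divisors concrete. This is a minimal Python sketch with plain lists of rows; `matmul` and the particular matrices `A` and `B` are choices made for this illustration.

```python
def matmul(a, b):
    """Product of two square matrixes given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 0],
     [0, 0]]  # all zeroes except for one element
B = [[0, 0],
     [0, 1]]  # a nonzero matrix that A sends to zero

print(matmul(A, B))  # [[0, 0], [0, 0]]
```

Neither factor is the zero matrix, but the product is, so both are zero divisors in the ring of two-by-two matrixes.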
In 1988 Istvan Beck tried to establish a link between graph theory and ring theory. We now have a usable standard definition of one. If $R$ is any ring, then $\Gamma(R)$ is the zero-divisor graph of $R$. (I know some of you think $R$ is the real numbers. No; that’s a bold-faced $\mathbb{R}$ instead. Unless that’s too much bother to typeset.) You make the graph by putting in a vertex for the elements in $R$. You connect two vertices a and b if the product of the corresponding elements is zero. That is, if they’re zero divisors for one another. (In Beck’s original form, this included all the elements. In modern use, we don’t bother including the elements that are not zero divisors.)
Drawing this graph makes tools from graph theory available to study rings. We can measure things like the distance between elements, or what paths from one vertex to another exist. What cycles — paths that start and end at the same vertex — exist, and how large they are. Whether the graphs are bipartite. A bipartite graph is one where you can divide the vertices into two sets, and every edge connects one thing in the first set with one thing in the second. What the chromatic number — the minimum number of colors it takes to make sure no two adjacent vertices have the same color — is. What shape does the graph have?
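For the integers modulo n the zero-divisor graph is small enough to build and question directly. What follows is a minimal Python sketch under the modern convention (vertices are the nonzero zero divisors, no loops drawn). The function names are invented here, and the bipartite test is an ordinary breadth-first two-coloring, not anything specific to ring theory.

```python
from collections import deque

def zero_divisor_graph(n: int) -> dict[int, set[int]]:
    """Adjacency sets for the zero-divisor graph of the integers modulo n."""
    graph: dict[int, set[int]] = {}
    for a in range(1, n):
        for b in range(a, n):
            if (a * b) % n == 0:
                graph.setdefault(a, set())
                graph.setdefault(b, set())
                if a != b:  # a*a == 0 makes a a vertex, but we draw no loop
                    graph[a].add(b)
                    graph[b].add(a)
    return graph

def is_bipartite(graph: dict[int, set[int]]) -> bool:
    """Try to two-color the graph; fail if an edge joins equal colors."""
    color: dict[int, int] = {}
    for start in graph:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in color:
                    color[w] = 1 - color[v]
                    queue.append(w)
                elif color[w] == color[v]:
                    return False
    return True

print(is_bipartite(zero_divisor_graph(12)))  # True
print(is_bipartite(zero_divisor_graph(16)))  # False
```

Modulo 16 the elements 4, 8, and 12 multiply to zero in every pairing, which makes a triangle, and a triangle can never be two-colored.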
And this lets me complete a cycle in this year’s A-to-Z, to my delight. There is an important question in topology which group theory could answer. It’s a generalization of the zero-divisors conjecture, a hypothesis about what fits in a ring based on certain types of groups. This hypothesis — actually, these hypotheses. There are a bunch of similar questions about what invariants called the L2-Betti numbers can be. These we call the Atiyah Conjecture. This because of work Michael Atiyah did in the cohomology of manifolds starting in the 1970s. It’s work, I admit, I don’t understand well enough to summarize, and hope you’ll forgive me for that. I’m still amazed that one can get to cutting-edge mathematics research from this. It seems, at its introduction, to be only a subversion of how we find x for which .
And for the last of this year’s (planned) exhumations from my archives? It’s a piece from summer 2017: Zeta Function. As will happen in mathematics, there are many zeta functions. But there’s also one special one that people find endlessly interesting, and that’s what we mean if we say “the zeta function”. It, of course, goes back to Bernhard Riemann.
Also a cute note I saw going around. If you cut off the century years then the date today — the 16th day of the 12th month of the 20th year of the century — gives you a rare Pythagorean triplet: $12^2 + 16^2 = 20^2$. And after a moment we notice that’s the famous 3-4-5 Pythagorean triplet all over again. If you miss it, well, that’s all right. There’ll be another along in July of 2025, and one after that in October of 2026.
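Hunting for these dates is a pleasant little search problem. Here is a minimal Python sketch; it assumes the rule is that month squared plus day squared equals the two-digit year squared (year as the hypotenuse), which is the reading that matches the dates mentioned above. The function name and default year range are invented for this illustration.

```python
from datetime import date

def pythagorean_dates(first_year: int = 2001, last_year: int = 2099) -> list[date]:
    """Dates where month**2 + day**2 equals (two-digit year)**2."""
    found = []
    for year in range(first_year, last_year + 1):
        yy = year % 100
        for month in range(1, 13):
            for day in range(1, 32):
                if month**2 + day**2 != yy**2:
                    continue
                try:
                    found.append(date(year, month, day))
                except ValueError:
                    pass  # not a real calendar date, such as 30 February
    return found

upcoming = [d for d in pythagorean_dates() if d >= date(2020, 12, 16)]
print(upcoming[:3])
```

Allowing the day or the month to play hypotenuse would turn up more dates, so the count depends on which rule you like.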
To dig something out of my archives today, I offer the Zermelo-Fraenkel Axioms. This wrapped up the End 2016 A-to-Z. On the last day of 2016, I see; I didn’t realize I was cutting things that close that year. These are fundamentals of set theory, which is the study of what you can include and what you exclude from a set of things. For a while in the 20th century this looked likely to be the foundation of mathematics, from which everything else could be derived. We’ve moved on now to thinking that category theory is more likely the core. But set theory remains a really good foundation. You can understand a lot of what’s interesting about it without needing more than a child’s ability to make marks on paper and draw circles around some of them. Or, like my essays insist on doing, without even doing the drawings that would make it all easier to follow.
Nobody had particular suggestions for the letter ‘Y’ this time around. It’s a tough letter to find mathematical terms for. It doesn’t even lend itself to typography or wordplay the way ‘X’ does. So I chose to do one more biographical piece before the series concludes. There were twists along the way in writing.
Several problems beset me in writing about this significant 13th-century Chinese mathematician. One is my ignorance of the Chinese mathematical tradition. I have little to guide me in choosing what tertiary sources to trust. Another is that the tertiary sources know little about him. The Complete Dictionary of Scientific Biography gives a dire verdict. “Nothing is known about the life of Yang Hui, except that he produced mathematical writings”. MacTutor’s biography gives his lifespan as from circa 1238 to circa 1298, on what basis I do not know. He seems to have been born in what’s now Hangzhou, near Shanghai. He seems to have worked as a civil servant. This is what I would have imagined; most scholars then were. It’s the sort of job that gives one time to write mathematics. Also he seems not to have been a prominent civil servant; he’s apparently not listed in any dynastic records. After that, we need to speculate.
E F Robertson, writing the MacTutor biography, speculates that Yang Hui was a teacher. That he was writing to explain mathematics in interesting and helpful ways. I’m not qualified to judge Robertson’s conclusions. And Robertson notes that’s not inconsistent with Yang being a civil servant. Robertson’s argument is based on Yang’s surviving writings, and what they say about the demonstrated problems. There is, for example, 1274’s Cheng Chu Tong Bian Ben Mo. Robertson translates that title as Alpha and omega of variations on multiplication and division. I try to work out my unease at having something translated from Chinese as “Alpha and Omega”. That is my issue. Relevant here is that a syllabus prefaces the first chapter. It provides a schedule and series of topics, as well as a rationale for why this plan.
Was Yang Hui a discoverer of significant new mathematics? Or did he “merely” present what was already known in a useful way? This is not to dismiss him; we have the same questions about Euclid. He is held up as among the great Chinese mathematicians of the 13th century, a particularly fruitful time and place for mathematics. How much greatness to assign to original work and how much to good exposition is unanswerable with what we know now.
Consider for example the thing I’ve featured before, Yang Hui’s Triangle. It’s the arrangement of numbers known in the west as Pascal’s Triangle. Yang provides the earliest extant description of the triangle and how to form it and use it. This in the 1261 Xiangjie jiuzhang suanfa (Detailed analysis of the mathematical rules in the Nine Chapters and their reclassifications). But in it, Yang Hui says he learned the triangle from a treatise by Jia Xian, Huangdi Jiuzhang Suanjing Xicao (The Yellow Emperor’s detailed solutions to the Nine Chapters on the Mathematical Art). Jia Xian lived in the 11th century; he’s known to have written two books, both lost. Yang Hui’s commentary gives us a fair idea what Jia Xian wrote about. But we’re limited in judging what was Jia Xian’s idea and what was Yang Hui’s inference or what.
The Nine Chapters referred to is Jiuzhang suanshu. An English title is Nine Chapters on the Mathematical Art. The book is a 246-problem handbook of mathematics that dates back to antiquity. It’s impossible to say when the Nine Chapters was first written. Liu Hui, who wrote a commentary on the Nine Chapters in 263 CE, thought it predated the Qin ruler Shih Huang Ti’s 213 BCE destruction of all books. But the book — and the many commentaries on the book — served as a centerpiece for Chinese mathematics for a long while. Jia Xian’s and Yang Hui’s work was part of this tradition.
Yang Hui’s Detailed Analysis covers the Nine Chapters. It goes on for three chapters, more about geometry and fundamentals of mathematics. Even how to classify the problems. He had further works. In 1275 Yang published Practical mathematical rules for surveying and Continuation of ancient mathematical methods for elucidating strange properties of numbers. (I’m not confident in my ability to give the Chinese titles for these.) The first title particularly echoes how in the Western tradition geometry was born of practical concerns.
The breadth of topics covers, it seems to me, a decent modern (American) high school mathematics education. The triangle, and the binomial expansions it gives us, fit that. Yang writes about more efficient ways to multiply on the abacus. He writes about finding simultaneous solutions to sets of equations. And through a technique that amounts to finding the matrix of coefficients for the equations, and its determinant. He writes about finding the roots for cubic and quartic equations. The technique is commonly known in the west as Horner’s Method, a technique of calculating divided differences. We see the calculating of areas and volumes for regular shapes.
And sequences. He found the sum of the squares of natural numbers followed a rule:
$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{3} n (n + 1) \left(n + \frac{1}{2}\right)$
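The rule for the sum of squares is easy to spot-check by machine. This minimal Python sketch uses the familiar modern arrangement, n(n + 1)(2n + 1)/6; how Yang Hui himself laid the rule out is not something this code claims to reproduce, and the function names are invented here.

```python
def sum_of_squares(n: int) -> int:
    """Add up 1**2 + 2**2 + ... + n**2 the slow way."""
    return sum(k * k for k in range(1, n + 1))

def closed_form(n: int) -> int:
    """The same total from the rule n(n + 1)(2n + 1)/6."""
    return n * (n + 1) * (2 * n + 1) // 6

print(sum_of_squares(10), closed_form(10))  # 385 385
```

A spot check is not a proof, but agreement over a couple hundred values of n is at least reassuring.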
And then there’s magic squares, and magic circles. He seems to have found them, as professional mathematicians today would, good ways to interest people in calculation. Not magic; he called them something like number diagrams. But he gives magic squares from three-by-three all the way to ten-by-ten. We don’t know of earlier examples of Chinese mathematicians writing about the larger magic squares. But Yang Hui doesn’t claim to be presenting new work. He also gives magic circles. The simplest is a web of seven intersecting circles, each with four numbers along the circle and one at its center. The sum of the center and the circumference numbers are 65 for all seven circles. Is this significant? No; merely fun.
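Checking a magic square is a nice small exercise in calculation, much in the spirit described above. Here is a minimal Python sketch. The three-by-three example is the classic arrangement often called the Lo Shu square; it is used here only as a familiar test case, not as a transcription from Yang Hui’s text, and `is_magic` is a name invented for this illustration.

```python
def is_magic(square: list[list[int]]) -> bool:
    """True if every row, column, and both diagonals share one sum."""
    n = len(square)
    target = sum(square[0])
    rows = all(sum(row) == target for row in square)
    cols = all(sum(square[i][j] for i in range(n)) == target
               for j in range(n))
    diag = sum(square[i][i] for i in range(n)) == target
    anti = sum(square[i][n - 1 - i] for i in range(n)) == target
    return rows and cols and diag and anti

lo_shu = [[4, 9, 2],
          [3, 5, 7],
          [8, 1, 6]]
print(is_magic(lo_shu))  # True
```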
Grant this breadth of work. Is he significant? I learned this year that familiar names might have been obscure until quite recently. The record is once again ambiguous. Other mathematicians wrote about Yang Hui’s work in the early 1300s. Yang Hui’s works were printed in China in 1378, says the Complete Dictionary of Scientific Biography, and reprinted in Korea in 1433. They’re listed in a 1441 catalogue of the Ming Imperial Library. Seki Takakazu, a towering figure in 17th century Japanese mathematics, copied the Korean text by hand. Yet Yang Hui’s work seems to have been lost by the 18th century. Reconstructions, from commentaries and encyclopedias, started in the 19th century. But we don’t have everything we know he wrote. We don’t even have a complete text of Detailed Analysis. This is not to say he wasn’t influential. All I could say is there seems to have been a time his influence was indirect.
I will be late with this week’s A-to-Z essay. I’ve had more demands on my time and my ability to organize thoughts than I could manage and something had to yield. I’m sorry for that but figure to post on Friday something for the letter ‘Y’.
But there is some exciting news in one of my regular Reading the Comics features. It’s about the kid who shows up often in Mark Anderson’s Andertoons. At the nomination of — I want to say Ray Kassinger? — I’ve been calling him “Wavehead”. Last week, though, the strip gave his name. I don’t know if this is the first time we’ve seen it. It is the first time I’ve noticed. He turns out to be Tommy.
And what about my Reading the Comics posts, which have been on suspension since the 2020 A-to-Z started? I’m not sure. I figure to resume them after the new year. I don’t know that it’ll be quite the same, though. A lot of mathematics mentions in comic strips are about the same couple themes. It is exhausting to write about the same thing every time. But I have, I trust, a rotating readership. Someone may not know, or know how to find, a decent 200-word piece about lotteries published four months in the past. I need to better balance not repeating myself.
Also a factor is lightening my overhead. Most of my strips come from Comics Kingdom or GoComics. Both of them also cull strips from their archives occasionally, leaving me with dead links. (GoComics particularly is dropping a lot of strips by the end of 2020. I understand them dumping, say, The Sunshine Club, which has been in reruns since 2007. But Dave Kellett’s Sheldon?)
The only way to make sure a strip I write about remains visible to my readers is to include it here. But to make my including the strip fair use requires that I offer meaningful commentary. I have to write something substantial, and something that’s worsened without the strip to look at. You see how this builds to a workload spiral, especially for strips where all there is to say is it’s a funny story problem. (If any cartoonists are up for me being another, unofficial archive for their mathematics-themed strips? Drop me a comment, Bill Amend, we can work something out if it doesn’t involve me sending more money than I’m taking in.)
So I don’t know how I’ll resolve all this. Key will be remembering that I can just not do the stuff I find tedious here. I will not, in fact, remember that.
I have many flaws as a pop mathematics blogger. Less depth of knowledge than I should have, for example. A tendency to start a series before I have a clear ending, so that projects will peter out rather than resolve. A-to-Z’s are different, as they have a clear direction and ending. And a frightful cultural bias, too. I’m terribly weak on mathematics outside the western tradition. Yang Hui’s Triangle, an essay I wrote about in the End 2020 A-to-Z, is a slight correction to that. I grew up learning this under a different name, that of a Western mathematician who studied the thing centuries after Yang Hui did. But then Yang Hui credited an earlier-yet mathematician, Jia Xian, for the insight. It’s difficult to get anything in mathematics named for the “correct” person.
I am again looking at the past month’s readership figures. And I’m again doing this in what I mean to be a lower-key form. November was a relatively laconic month for me, at least by A-to-Z standards.
I had only 15 posts in November, not many more than would be in a normal month. The majority of posts were pointers to earlier posts yet. It doesn’t seem to have hurt my readership, though. WordPress says there were 2,873 pages viewed in November, for an average of 191.5 views per posting. This is a good bit above the twelve-month running average leading up to November. That average was a mere 1,912.8 views for a month and 81.6 views per posting. This is because that anomalously high October 2019 figure has passed out of the twelve-month range. There were 2,067 unique visitors logged, for 137.8 unique visitors per posting. The twelve-month running average was 1,294.1 unique visitors for the month, and 81.6 unique visitors per posting. So that’s suggestive of readership growth over the past year.
The things that signal engaged readers were more ambiguous, as they always are. There were 60 things liked in November, or an average of 4.0 likes per posting. The twelve-month running average had 57.5 likes for a month, and 3.5 likes per posting. There were 11 comments given over the month, an average of 0.7 per posting. And that is below the twelve-month running average of 17.2 for a month and 1.1 comments per posting. I did have an appeal for topics for the A-to-Z, which usually draws comments. But they were for unappealing letters like W and X and it takes some inspiration to think of good mathematics terms for that part of the alphabet.
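The per-posting figures are just the monthly totals divided by the 15 posts. A minimal Python sketch of that arithmetic, using the November totals quoted above (the variable names are invented here):

```python
posts = 15
totals = {"views": 2873, "visitors": 2067, "likes": 60, "comments": 11}

# Divide each monthly total by the number of posts, to one decimal place.
per_post = {name: round(count / posts, 1) for name, count in totals.items()}
print(per_post)
```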
I like to look over the most popular postings I’ve had but every month it’s either trapezoids or record grooves. I did start limiting my listing to the most popular things posted in the two prior months, so new stuff has a chance at appearing. I make it the two prior months so that things which appeared at the end of a month might show up. And then that got messed up. The most popular recent post was from the end of September: Playful Math Education Blog Carnival 141. It’s a collection of recreational or education-related mathematics you might like. I’m not going to ignore that just because it published three days before October started.
November’s most popular things posted in October or November were:
I have no idea why these post reviews are always popular. I think people might see there’s a list or two in the middle and figure that must be a worthwhile essay. Someday I’ll put up some test essays that are complete nonsense, one with a list and one without, and see how they compare. Of course, now you know the trick and won’t fall for it.
If WordPress’s numbers are right, in November I published 7,304 words, barely more than half of October’s total. It was my tersest month since January. Per post it was even more dramatic: a mere 486.9 words per posting in November, my lowest of the year, to date. My average words per posting, for 2020, dropped to 678.
As of the start of December I’ve had 1,568 total postings here. They’ve gathered 119,685 page views from a logged 68,097 unique visitors.
If you’d like to follow on WordPress, you can add this to your Reading page by clicking the “Follow Nebusresearch” button on the page.
My essays are announced on Twitter as @nebusj. Don’t try to talk with me there. The account’s gone feral. There’s an automated publicity thing on WordPress that posts to it, and is the only way I have to reliably post there. If you want to social-media talk with me look to the mathematics-themed Mathstodon and my account @email@example.com. Or you can leave a comment. Dad, you can also e-mail me. You know the address. The rest of you don’t know, but I bet you could guess it. Not the obvious first guess, though. Around your fourth or fifth guess would get it. I know that changes what your guesses would be.
Thank you all for reading. Have fun with that logic problem.
When developing general relativity, Albert Einstein created a convention. He’s not unique in that. All mathematicians create conventions. They use shorthand for an idea that’s complicated or common. Relatively unique is that other people adopted his convention, because it expressed an idea compactly. This was in working with tensors, which look somewhat like matrixes and have a lot of indexes. In the equations of general relativity you need to take sums over many combinations of values of these indexes. What indexes there are are the same in most every problem. The possible values of the indexes is constant, problem to problem, too.
So Einstein saved himself writing, and his publishers from typesetting, a lot of redundant writing. This by writing out the conditions which implied “take the sums over these indexes on this range”. This is good for people doing general relativity, and certain kinds of geometry. It’s a problem only when an expression escapes its context. When it’s shown to a student or someone who doesn’t know this is a differential-geometry problem. Then the problem becomes confusing, and they can’t work on it.
This is not to fault the Einstein Summation Convention. It puts common necessary scaffolding out of the way and highlights the interesting unique parts of a problem. Most conventions aim for that. We have the hazard, though, that we may not notice something breaking the convention.
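For anyone who has not met the convention, its usual modern statement is that an index repeated within a term is summed over its whole range, with no summation sign written. A minimal Python sketch of the simplest case follows; the name `contract` and the sample numbers are invented for this illustration.

```python
def contract(a, x):
    """y_i = a_ij x_j: the repeated index j is summed, the free index i is not."""
    return [sum(a[i][j] * x[j] for j in range(len(x)))
            for i in range(len(a))]

a = [[1, 2],
     [3, 4]]
x = [5, 6]
print(contract(a, x))  # [17, 39]
```

The code has to spell out the sum and its range. The written expression leaves both to context, which is exactly where the trouble starts when context goes missing.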
And this is how we create extraneous solutions. And, as a bonus, to have missing solutions. We encounter them with the start of (high school) algebra, when we get used to manipulating equations. When we solve an equation what we always want is something clear, like
But it never starts that way. It always starts with something like
or worse. We learn how to handle this. We know that we can do six things that do not alter the truth of an equation. We can regroup terms in the equation. We can add the same number to both sides of the equation. We can multiply both sides of the equation by some number besides zero. We can add zero to one side of the equation. We can multiply one side of the equation by 1. We can replace one quantity with another that has the same value. That doesn’t sound like a lot. It covers more than it seems. Multiplying by 1, for example, is the same as multiplying by $\frac{x}{x}$. If x isn’t zero, then we can multiply both sides of the equation by that x. And x can’t be zero, or else $\frac{x}{x}$ would not be 1.
So with my example there, start off by multiplying the right side by 1, in the guise $\frac{x}{x}$. Then multiply both sides by that same non-zero x. At this point the right-hand side simplifies to being 6. Add a -6 to both sides. And then with a lot of shuffling around you work out that the equation is the same as
And that can only be true when x equals 2.
It should be easy to catch spurious solutions creeping in. They must result from breaking a rule. The obvious problem is multiplying — or dividing — by zero. We expect those to be trouble. Wikipedia has a fine example:
$\frac{1}{x - 2} = \frac{3}{x + 2} - \frac{6x}{(x - 2)(x + 2)}$
The obvious step is to multiply this whole mess by $(x - 2)(x + 2)$, which turns our work into a linear equation. Very soon we find the solution must be $x = -2$. Which would make at least two of the denominators in the original equation zero. We know not to want that.
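To watch the spurious root get caught, here is a minimal Python sketch. It takes the example to be 1/(x - 2) = 3/(x + 2) - 6x/((x - 2)(x + 2)), which is an assumption on my part about which equation is meant; it does fit the description of a linear equation whose root zeroes two denominators. The function names are invented here.

```python
from fractions import Fraction

def cleared(x):
    """Both sides times (x - 2)(x + 2), everything moved to the left."""
    return (x + 2) - (3 * (x - 2) - 6 * x)

def check_original(x):
    """True or False if the original equation can be evaluated, else None."""
    try:
        left = Fraction(1) / (x - 2)
        right = Fraction(3) / (x + 2) - Fraction(6 * x) / ((x - 2) * (x + 2))
        return left == right
    except ZeroDivisionError:
        return None  # x lies outside the original equation's domain

print(cleared(-2))         # 0, so -2 solves the cleared equation
print(check_original(-2))  # None, so -2 is not even allowed originally
```

The cleared equation is perfectly happy with -2. The original one cannot even be asked the question.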
The problems can be subtler, though. Consider:
$\sqrt{x} = x - 12$
That’s not hard to solve. Multiply both sides by $\sqrt{x}$. Although, before working out $\sqrt{x} \cdot (x - 12)$, substitute that $\sqrt{x}$ with something equal to it. We know one thing is equal to it, $x - 12$. Then we have
$x = (x - 12)^2$
It’s a quadratic equation. A little bit of work shows the roots are 9 and 16. One of those answers is correct and the other spurious. At no point did we divide anything, by zero or anything else.
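Sorting the real root from the spurious one takes only a substitution back into the starting equation. Here is a minimal Python sketch, assuming the equation under discussion is sqrt(x) = x - 12, which is the one consistent with the roots 9 and 16; the names are invented for this illustration.

```python
from math import sqrt

def satisfies_original(x: int) -> bool:
    """Does sqrt(x) really equal x - 12?"""
    return sqrt(x) == x - 12

# Roots of x = (x - 12)**2, rearranged as x**2 - 25x + 144 = 0.
candidates = [x for x in range(0, 30) if x * x - 25 * x + 144 == 0]
genuine = [x for x in candidates if satisfies_original(x)]

print(candidates)  # [9, 16]
print(genuine)     # [16]
```

For 9 the right-hand side is -3, and a square root is never negative, so 9 drops out.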
So what is happening and what is the necessary rhetorical link to the Einstein Summation Convention?
There are many ways to look at equations. One that’s common is to look at them as functions. This is so common that we’ll elide between an equation and a function representation. This confuses the prealgebra student who wants to know why sometimes we look at
and sometimes we look at
and sometimes at
The advantage of looking at the function which shadows any equation is we have different tools for studying functions. Sometimes that makes solving the equation easier. In this form, we’re looking for what in the domain matches with something particular in the range.
And now we’ve reached the convention. When we write down something like we’re implicitly defining a function. A function has three pieces. It has a set called the domain, from which we draw the independent variable. It has a set called the range. It has a rule matching elements in the domain to an element in the range. We’ve only given the rule. What are the domain and what’s the range for ?
And here are the conventions. If we haven’t said otherwise, the domain and range are usually either the real numbers or the complex numbers. If we used x or y or t as the independent variable, we mean the real numbers. If we used z as the independent variable, and haven’t already put x and y in, we mean the complex numbers. Sometimes we call in s or w or another letter; never mind that. The range can be the whole set of real or complex numbers. It does us no harm to have too large a range.
The domain, though. We do insist that everything in the domain match to something in the range. And, like, ? That can’t mean anything if x equals 2.
So we take an implicit definition of the domain: it’s all the real numbers for which the function’s rule is meaningful. So, would have a domain “real numbers other than 2”. would have a domain “real numbers other than 2 and -2”.
We create extraneous solutions — or we lose some — when our convention changes the domain. An extraneous solution is one that existed outside the original problem’s domain. A missing solution is one that existed in an excised part of the domain. To go from to by dividing out x is to cut out of the space of possible solutions.
A complaint you might raise. What is the domain for $\sqrt{x} = x - 12$? Rewrite that as a function. $f(x) = \sqrt{x} - (x - 12)$ would seem to have a domain “x greater than or equal to 0”. The extraneous solution is $x = 9$, a number which rumor has it is greater than or equal to 0. What happened?
We have to take that equation-handling more slowly. We had started out with
$\sqrt{x} = x - 12$
The domain has to be “x is greater than or equal to 0” here. All right. The next step was multiplying both sides by the same quantity, $\sqrt{x}$. So:
$\sqrt{x} \cdot \sqrt{x} = \sqrt{x} \cdot (x - 12)$
The domain is still “x is greater than or equal to 0”. The next step, though, was a substitution. I wanted to replace the $\sqrt{x}$ on the right with $x - 12$. We know, from the original equation, that those are equal. At least, they’re equal wherever the original equation is true. What happens when $x = 9$, though?
We start to see the catch. 9 – 12 is -3. And while it’s true that -3 squared will be 9, it’s false that -3 is the square root of 9. The equation can only be true, for real numbers, if is nonnegative. We can make this rigorous with two supplementary functions. Let me call and .
has an implicit domain of “x greater than or equal to 0”. What’s the domain of ? If , like we said it does, then they have to agree for every x in either’s domain. So can’t have in its domain any x for which isn’t defined. So the domain of has to be “x for which x – 12 is greater than or equal to 0”. And that’s “x greater than or equal to 12”.
So the domain for the original equation is “x greater than or equal to 12”. When we keep that domain in mind, the extraneous nature of $x = 9$ is clear, and we avoid trouble.
Not all extraneous solutions come from algebraic manipulations. Sometimes there are constraints on the problem, rather than the numbers, that make a solution absurd. There is a betting strategy called the martingale. This amounts to doubling the bet every time one loses. This makes the first win balance out all the losses leading to it. This solution fails because the player has a finite wallet, and after a few losses any player hasn’t got the money to continue.
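The finite wallet is easy to put numbers on. A minimal sketch, assuming a base bet of 1 and a run of nothing but losses; the function name and the wallet sizes are made up for illustration.

```python
def bets_affordable(wallet, base_bet=1):
    """Count how many doubling bets in a row a finite wallet can cover."""
    bets = 0
    stake = base_bet
    while wallet >= stake:
        wallet -= stake   # assume this bet is lost
        stake *= 2        # martingale rule: double after every loss
        bets += 1
    return bets

print(bets_affordable(1000))
```

Each extra loss the player wants to be able to survive roughly doubles the wallet needed, which is why the strategy runs out of road so fast.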
Or consider a case that may be legend. It concerns the Apollo Guidance Computer. It was designed to take the Lunar Module to a spot at zero altitude above the moon’s surface, with zero velocity. The story is that in early test runs, the computer would not avoid trajectories that dropped to a negative altitude along the way to the surface. One imagines the scene after the first Apollo subway trip. (I have not found a date when such a test run was done, or corrections to the code ordered. If someone knows, I’d appreciate learning specifics.)
The convention, that we trust the domain is “everything which makes sense”, is not to blame here. It’s normally a good convention. Explicitly noting the domain at every step is tedious and, most of the time, unenlightening. It belongs in the background. We also must check our possible solutions, and that they represent things that make sense. We can try to concentrate our thinking on the obvious interesting parts, but must spend some time on the rest also.
As mentioned, ‘X’ is a difficult letter for a glossary project. There aren’t many mathematical terms that start with the letter, as much as it is the default variable name. Making things better is that many of the terms that do are important ones. Xor, from my 2015 A-to-Z, is an example of this. It’s one of the major pieces of propositional logic, and anyone working in logic gets familiar with it really fast.
The letter ‘X’ is a problem for this sort of glossary project. At least around the fourth time you do one, as you exhaust the good terms that start with the letter X. In 2018, I went to the Extreme Value Theorem, using the 1990s Rule that x- and ex- were pretty much the same thing. The Extreme Value Theorem is one of those little utility theorems. On a quick look it seems too obvious to tell us anything useful. It serves a role in proofs that do tell us interesting, surprising things.
Today (the 26th of November) is the Thanksgiving holiday in the United States. The holiday’s set, by law since 1941, to the fourth Thursday in November. (Before then it was customarily the last Thursday in November, but set by Presidential declaration. After Franklin Delano Roosevelt set the holiday to the third Thursday in November, to extend the 1939 and 1940 Christmas-shopping seasons — a decision Republican Alf Landon characterized as Hitlerian — the fourth Thursday was encoded in law.)
Any know-it-all will tell you, though, how the 13th of the month is very slightly more likely to be a Friday than any other day of the week. This is because the Gregorian calendar has that peculiar century-year leap day rule. It throws off the regular progression of the dates through the week. It takes 400 years for the calendar to start repeating itself. How does this affect the fourth Thursday of November? (A month which, this year, did have a Friday the 13th.)
It turns out, it changes things in subtle ways. Thanksgiving, by the current rule, can be any date between the 22nd and 28th; it’s most likely to be any of the 22nd, 24th, or 26th. (This implies that the 13th of November is equally likely to be a Tuesday, Sunday, or Friday, a result that surprises me too.) So here’s how often which date is Thanksgiving. This is if we pretend the current United States definition of Thanksgiving will be in force for 400 years unchanged:
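The tally is short work for a computer. Here is a sketch of one way to count it, using Python’s standard `datetime` module (which follows the Gregorian rules for every year it handles). The choice of 2001 through 2400 is arbitrary; any 400 consecutive years would do.

```python
from collections import Counter
from datetime import date

def thanksgiving_day(year):
    """Day of the month of the fourth Thursday of November."""
    first_weekday = date(year, 11, 1).weekday()   # Monday is 0, Thursday is 3
    first_thursday = 1 + (3 - first_weekday) % 7
    return first_thursday + 21

# Any 400 consecutive years cover one full Gregorian cycle.
tally = Counter(thanksgiving_day(y) for y in range(2001, 2401))
for day in sorted(tally):
    print(f"November {day}: {tally[day]} times in 400 years")
```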
Today’s is another topic suggested by Mr Wu, author of the Singapore Maths Tuition blog. The Wronskian is named for Józef Maria Hoëne-Wroński, a Polish mathematician, born in 1778. He served in General Tadeusz Kosciuszko’s army in the 1794 Kosciuszko Uprising. After being captured and forced to serve in the Russian army, he moved to France. He kicked around Western Europe and its mathematical and scientific circles. I’d like to say this was all creative and insightful, but, well. Wikipedia describes him trying to build a perpetual motion machine. Trying to square the circle (also impossible). Building a machine to predict the future. The St Andrews mathematical biography notes his writing a summary of “the general solution of the fifth degree [polynomial] equation”. This doesn’t exist.
Both sources, though, admit that for all that he got wrong, there were flashes of insight and brilliance in his work. The St Andrews biography particularly notes that Wronski’s tables of logarithms were well-designed. This is a hard thing to feel impressed by. But it’s hard to balance information so that it’s compact yet useful. He wrote about the Wronskian in 1812; it wouldn’t be named for him until 1882. This was 29 years after his death, but it does seem likely he’d have enjoyed having a familiar thing named for him. I suspect he wouldn’t enjoy my next paragraph, but would enjoy the fight with me about it.
The Wronskian is a thing put into Introduction to Ordinary Differential Equations courses because students must suffer in atonement for their sins. Those who fail to reform enough must go on to the Hessian, in Partial Differential Equations.
To be more precise, the Wronskian is the determinant of a matrix. The determinant you find by adding and subtracting products of the elements in a matrix together. It’s not hard, but it is tedious, and gets more tedious pretty fast as the matrix gets bigger. (In Big-O notation, it’s the order of the cube of the matrix size. This is rough, for things humans do, although not bad as algorithms go.) The matrix here is made up of a bunch of functions and their derivatives. The functions need to be ones of a single variable. The derivatives, you need first, second, third, and so on, up to one less than the number of functions you have.
If you have two functions, and , you need their first derivatives, and . If you have three functions, , , and , you need first derivatives, , , and , as well as second derivatives, , , and . If you have functions and here I’ll call them , you need derivatives, and so on through . You see right away this is a fun and exciting thing to calculate. Also why in intro to differential equations you only work this out with two or three functions. Maybe four functions if the class has been really naughty.
Go through your functions and your derivatives and make a big square matrix. And then you go through calculating the determinant. This involves a lot of multiplying strings of these derivatives together. It’s a lot of work. But at least doing all this work gets you older.
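For two functions the whole procedure fits on one line. The names $f$ and $g$ here are placeholders of mine, and the cosine-and-sine pair is only an illustration, not an example from the course material.

```latex
W(f, g)(x) = \det \begin{pmatrix} f(x) & g(x) \\ f'(x) & g'(x) \end{pmatrix}
           = f(x)\, g'(x) - f'(x)\, g(x)

% Example: f(x) = \cos x, \; g(x) = \sin x
W(\cos, \sin)(x) = \cos x \cdot \cos x - (-\sin x) \cdot \sin x = 1
```

With three functions the matrix is three-by-three and the determinant already has six products in it, which is about where the fun stops.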
So one will ask why do all this? Why fit it into every Intro to Ordinary Differential Equations textbook and why slip it in to classes that have enough stuff going on?
One answer is that if the Wronskian is not zero for some values of the independent variable, then the functions that went into it are linearly independent. Mathematicians learn to like sets of linearly independent functions. We can treat functions like directions in space. Linear independence assures us none of these functions are redundant, pointing a way we already can describe. (Real people see nothing wrong in having north, east, and northeast as directions. But mathematicians would like as few directions in our set as possible.) The Wronskian being zero for every value of the independent variable seems like it should tell us the functions are linearly dependent. It doesn’t, not without some more constraints on the functions.
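A standard illustration of that last caveat, not one taken from this essay, is the pair $x^2$ and $x|x|$. The sketch below writes their derivatives out by hand and checks the Wronskian at a few sample points; the sample points and names are arbitrary.

```python
def wronskian(f, df, g, dg, x):
    """Two-function Wronskian f*g' - f'*g evaluated at x."""
    return f(x) * dg(x) - df(x) * g(x)

f  = lambda x: x * x
df = lambda x: 2 * x
g  = lambda x: x * abs(x)
dg = lambda x: 2 * abs(x)      # derivative of x|x|, valid for every real x

samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
values = [wronskian(f, df, g, dg, x) for x in samples]

# g/f is +1 at x = 1 but -1 at x = -1, so no single constant c gives g = c*f.
ratios = (g(1.0) / f(1.0), g(-1.0) / f(-1.0))
print(values, ratios)
```

The Wronskian vanishes at every sample, yet the two functions are not multiples of each other over the whole real line. That is the sort of case the extra constraints are there to rule out.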
This is fine, but who cares? And, unfortunately, in Intro it’s hard to reach a strong reason to care. To this major, the emphasis on linearly independent functions felt misplaced. It’s the sort of thing we care about in linear algebra. Or some course where we talk about vector spaces. Differential equations do lead us into vector spaces. It’s hard to find a corner of analysis that doesn’t.
Every ordinary differential equation has a secret picture. This is a vector field. One axis in the field is the independent variable of the function. The other axes are the value of the function. And maybe its derivatives, depending on how many derivatives are used in the ordinary differential equation. To solve one particular differential equation is to find one path in this field. People who just use differential equations will want to find one path.
Mathematicians tend to be fine with finding one path. But they want to find what kinds of paths there can be. Are there paths which the differential equation picks out, by making paths near it stay near? Or by making paths that run away from it? And here is the value of the Wronskian. The Wronskian tells us about the divergence of this vector field. This gives us insight to how these paths behave. It’s in the same way that knowing where high- and low-pressure systems are describes how the weather will change. The Wronskian, by way of a thing called Liouville’s Theorem that I haven’t the strength to describe today, ties in to the Hamiltonian. And the Hamiltonian we see in almost every mechanics problem of note.
You can see where the mathematics PhD, or the physicist, would find this interesting. But what about the student, who would look at the symbols evoked by those paragraphs above with reasonable horror?
And here’s the second answer for what the Wronskian is good for. It helps us solve ordinary differential equations. Like, particular ones. An ordinary differential equation will (normally) have several linearly independent solutions. If you know all but one of those solutions, it’s possible to calculate the Wronskian and, from that, the last of the independent solutions. Since a big chunk of mathematics, particularly for science or engineering, is solving differential equations you see why this is something valuable. Allow that it’s tedious. Tedious work we can automate, or give to a research assistant to do.
One then asks what kind of differential equation would have all-but-one answer findable, and yield that last one only by long efforts of hard work. So let me show you an example ordinary differential equation:
Here , , and are some functions that depend only on the independent variable, . Don’t know what they are; don’t care. The differential equation is a lot easier if and are constants, but we don’t insist on that.
This equation has a close cousin, and one that’s easier to solve than the original. Its cousin is called a homogeneous equation:
The left-hand-side, the parts with the function that we want to find, is the same. It’s the right-hand-side that’s different, that’s a constant zero. This is what makes the new equation homogeneous. This homogeneous equation is easier and we can expect to find two functions, and , that solve it. If and are constant this is even easy. Even if they’re not, if you can find one solution, the Wronskian lets you generate the second.
That’s nice for the homogeneous equation. But if we care about the original, inhomogeneous one? The Wronskian serves us there too. Imagine that the inhomogeneous equation has any solution, which we’ll call . (The ‘p’ stands for ‘particular’, as in “the solution for this particular ”.) But also has to solve that inhomogeneous differential equation. It seems startling but if you work it out, it’s so. (The key is the derivative of the sum of functions is the same as the sum of the derivatives of functions.) also has to solve that inhomogeneous differential equation. In fact, for any constants and , it has to be that is a solution.
I’ll skip the derivation; you have Wikipedia for that. The key is that knowing these homogeneous solutions, and the Wronskian, and the original , will let you find the that you really want.
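How the second solution comes out of the first can be sketched in a few lines. The symbols here are assumptions of mine: I take the homogeneous equation to be second-order and linear, with coefficient functions called $p$ and $q$, and known solution $y_1$.

```latex
% Assumed form of the homogeneous equation; p and q are placeholder names
y'' + p(x)\, y' + q(x)\, y = 0

% Abel's identity: the Wronskian of any two solutions, up to a constant C
W(x) = y_1 y_2' - y_1' y_2 = C \exp\left( -\int p(x)\, dx \right)

% Since (y_2 / y_1)' = W / y_1^2, a second solution follows from the first
y_2(x) = y_1(x) \int \frac{W(x)}{y_1(x)^2}\, dx
```

The nice part is that Abel’s identity gives the Wronskian without knowing $y_2$ at all. Only $p$ is needed.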
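For the curious, the end result of that derivation is the method usually called variation of parameters. Again the symbols are my assumptions: $y_1$ and $y_2$ for the two homogeneous solutions, $g$ for the right-hand side, and a leading coefficient of 1 on $y''$.

```latex
% Assumed form; g is a placeholder name for the right-hand side
y'' + p(x)\, y' + q(x)\, y = g(x), \qquad W = y_1 y_2' - y_1' y_2

y_p(x) = -y_1(x) \int \frac{y_2(x)\, g(x)}{W(x)}\, dx
         + y_2(x) \int \frac{y_1(x)\, g(x)}{W(x)}\, dx
```

The Wronskian sits in the denominator, which is one more reason to want it nonzero.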
My reading is that this is more useful in proving things true about differential equations, rather than particularly solving them. It takes a lot of paper and I don’t blame anyone not wanting to do it. But it’s a wonder that it works, and so well.
Don’t make your instructor so mad you have to do the Wronskian for four functions.
And let me tease other W-words I won’t be repeating for my essay this week with the Well-Ordering Principle, discussed in the summer of 2017. This is one of those little properties that some sets of numbers, like whole numbers, have and that others, like the rationals, don’t. It doesn’t seem like anything much, which is often a warning that the concept sneaks into a lot of interesting work. On re-reading my own work, I got surprised, which I hope speaks better of the essay than it does of me.
No reason not to keep showing off old posts while I prepare new ones. A Summer 2015 Mathematics A To Z: well-posed problem shows off one of the set of things mathematicians describe as “well”. Well-posedness is one of those things mathematicians learn to look for in problems, and to recast problems so that they have it. The essay also shows off how much I haven’t been able to settle on rules about how to capitalize subject lines.
This is easy. The velocity is the first derivative of the position. First derivative with respect to time, if you must know. That hardly needed an extra week to write.
Yes, there’s more. There is always more. Velocity is important by itself. It’s also important for guiding us into new ideas. There are many. One idea is that it’s often the first good example of vectors. Many things can be vectors, as mathematicians see them. But the ones we think of most often are “some magnitude, in some direction”.
The position of things, in space, we describe with vectors. But somehow velocity, the changes of positions, seems more significant. I suspect we often find static things below our interest. I remember as a physics major that my Intro to Mechanics instructor skipped Statics altogether. There are many important things, like bridges and roofs and roller coaster supports, that we find interesting because they don’t move. But the real Intro to Mechanics is stuff in motion. Balls rolling down inclined planes. Pendulums. Blocks on springs. Also planets. (And bridges and roofs and roller coaster supports wouldn’t work if they didn’t move a bit. It’s not much though.)
So velocity shows us vectors. Anything could, in principle, be moving in any direction, with any speed. We can imagine a thing in motion inside a room that’s in motion, its net velocity being the sum of two vectors.
And they show us derivatives. A compelling answer to “what does differentiation mean?” is “it’s the rate at which something changes”. Properly, we can take the derivative of any quantity with respect to any variable. But there are some that make sense to do, and position with respect to time is one. Anyone who’s tried to catch a ball understands the interest in knowing.
We take derivatives with respect to time so often we have shorthands for it, by putting a ‘ mark after, or a dot above, the variable. So if x is the position (and it often is), then is the velocity. If we want to emphasize we think of vectors, is the position and the velocity.
Velocity has another common shorthand. This is , or if we want to emphasize its vector nature, . Why a name besides the good enough ? It helps us avoid misplacing a ‘ mark in our work, for one. And giving velocity a separate symbol encourages us to think of the velocity as independent from the position. It’s not — not exactly — independent. But knowing that a thing is in the lawn outside tells us nothing about how it’s moving. Velocity affects position, in a process so familiar we rarely consider how there’s parts we don’t understand about it. But velocity is also somehow also free of the position at an instant.
Velocity also guides us into a first understanding of how to take derivatives. Thinking of the change in position over smaller and smaller time intervals gets us to the “instantaneous” velocity by doing only things we can imagine doing with a ruler and a stopwatch.
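The ruler-and-stopwatch idea is easy to try numerically. A small sketch, with a made-up position function (a dropped ball, 4.9 t² metres after t seconds) standing in for whatever is actually moving:

```python
def position(t):
    # Hypothetical example: a dropped ball, 4.9 t^2 metres after t seconds.
    return 4.9 * t * t

def average_velocity(t, dt):
    """Change in position divided by elapsed time over [t, t + dt]."""
    return (position(t + dt) - position(t)) / dt

# Shrink the stopwatch interval and watch the estimates settle down.
estimates = [average_velocity(2.0, dt) for dt in (1.0, 0.1, 0.01, 0.001)]
print(estimates)
```

For this position function the derivative at t = 2 is 19.6, and the estimates close in on it as the interval shrinks.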
Velocity has a velocity. , also known as . Or, if we’re sure we won’t lose a ‘ mark, . Once we are comfortable thinking of how position changes in time we can think of other changes. Velocity’s change in time we call acceleration. This is also a vector, more abstract than position or velocity. Multiply the acceleration by the mass of the thing accelerating and we have a vector called the “force”. That, we at least feel we understand, and can work with.
Acceleration has a velocity too, a rate of change in time. It’s called the “jerk” by people telling you the change in acceleration in time is called the “jerk”. (I don’t see the term used in the wild, but admit my experience is limited.) And so on. We could, in principle, keep taking derivatives of the position and keep finding new changes. But most physics problems we find interesting use just a couple of derivatives of the position. We can label them, if we need, , where n is some big enough number like 4.
We can bundle them in interesting ways, though. Come back to that mention of treating position and velocity of something as though they were independent coordinates. It’s a useful perspective. Imagine the rules about how particles interact with one another and with their environment. These usually have explicit roles for position and velocity. (Granting this may reflect a selection bias. But these do cover enough interesting problems to fill a career.)
So we create a new vector. It’s made of the position and the velocity. We’d write it out as . The superscript-T there, “transposition”, lets us use the tools of matrix algebra. This vector describes a point in phase space. Phase space is the collection of all the physically possible positions and velocities for the system.
What’s the derivative, in time, of this point in phase space? Glad to say we can do this piece by piece. The derivative of a vector is the derivative of each component of a vector. So the derivative of is , or, . This acceleration itself depends on, normally, the positions and velocities. So we can describe this as for some function . You are surely impressed with this symbol-shuffling. You are less sure why this bother.
The bother is a trick of ordinary differential equations. All differential equations are about how a function-to-be-determined and its derivatives relate to one another. In ordinary differential equations, the function-to-be-determined depends on a single variable. Usually it’s called x or t. There may be many derivatives of f. This symbol-shuffling rewriting takes away those higher-order derivatives. We rewrite the equation as a vector equation of just one order. There’s some point in phase space, and we know what its velocity is. That we do because in this form many problems can be written as a matrix problem: . Or approximate our problem as a matrix problem. This lets us bring in linear algebra tools, and that’s worthwhile.
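Here is what the trick looks like on the simplest example I can think of, a block on a spring obeying x'' = -x. The example, the matrix name `A`, and the use of a fourth-order Runge-Kutta step are all my choices for illustration, not anything the general argument depends on.

```python
import math

A = [[0.0, 1.0],
     [-1.0, 0.0]]   # assumed example: x'' = -x, so d/dt (x, v) = (v, -x)

def rate(state):
    """Phase-space velocity A @ state for state = (x, v)."""
    return [sum(a * s for a, s in zip(row, state)) for row in A]

def rk4_step(state, h):
    """Advance the first-order system by one Runge-Kutta step of size h."""
    def shifted(k, scale):
        return [s + scale * ki for s, ki in zip(state, k)]
    k1 = rate(state)
    k2 = rate(shifted(k1, h / 2))
    k3 = rate(shifted(k2, h / 2))
    k4 = rate(shifted(k3, h))
    return [s + h / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

state = [1.0, 0.0]            # start stretched by 1, at rest
steps = 1000
h = 2 * math.pi / steps       # one full oscillation period in total
for _ in range(steps):
    state = rk4_step(state, h)
print(state)
```

The second-order equation never appears in the stepping code. All it sees is a point in phase space and a matrix saying how fast that point moves.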
It calls on a more abstract idea of what a “velocity” might be. We can explain what the thing that’s “moving” and what it’s moving through are, given time. But the instincts we develop from watching ordinary things move help us in these new territories. This is also a classic mathematician’s trick. It may seem like all mathematicians do is develop tricks to extend what they already do. I can’t say this is wrong.
I decided to let the V essay slide to Wednesday. This will make the end of the 2020 A-to-Z run a week later than I originally imagined, but that’s all right. It’ll all end in 2020 unless there’s another unexpected delay.
I have gotten several good suggestions for the letters W and X, but I’m still open to more, preferably for X. And I would like any thoughts anyone would like to share for the last letters of the alphabet. If you have an idea for a mathematical term starting with either letter, please let me know in comments. Also please let me know about any blogs or other projects you have, so that I can give them my modest boost with the essay. I’m open to revisiting topics I’ve already discussed, if I can think of something new to say or if I’ve forgotten I wrote about them already.
Topics I’ve already covered, starting with the letter ‘Y’, are:
I have accepted that this week, at least, I do not have it in me to write an A-to-Z essay. I’ll be back to it next week, I think. I don’t know whether I’ll publish my usual I-meant-this-to-be-800-words-and-it’s-three-times-that piece on Monday or on Wednesday, but it’ll be sometime next week. And, events personal and public allowing, I’ll continue weekly from there. Should still finish the essay series before 2020 finishes. I say this assuming that 2020 will in fact finish.
But now let me look back on a time when I could produce essays with an almost machine-like reliability, except for when I forgot to post them. My 2019 Mathematics A To Z: Versine is such an essay. The versine is a function that had a respectably long life in a niche of computational computing. Cheap electronic computers wiped out that niche. The reasons that niche ever existed, though, still apply, just to different problems. Knowing of past experiences can help us handle future problems.
I am not writing another duplicate essay. I intend to have an A-to-Z essay for the week. I just haven’t had the time or energy to write anything so complicated as an A-to-Z since the month began. Things are looking up, though, and I hope to have something presentable for Friday.
So let me just swap my publication slots around, and share an older essay, as I would have on Friday. My 2018 Mathematics A To Z: Volume was suggested by Ray Kassinger, of the popular web comic Housepets!, albeit as a Mystery Science Theater 3000 reference. It’s a great topic, though. It’s one of those things everyone instinctively understands. But making that instinct precise demands we accept some things that seem absurd. It’s a great example of what mathematics can do, given a chance.
In looking over past A-to-Z’s I notice a lot of my U- entries are the negation of something. Unknots, for example. Or unbounded. English makes this construction hard to avoid. Any interesting property is also interesting when it’s absent. But there are also mathematical terms that start with a U on their own terms. The Summer 2017 Mathematics A To Z: Ulam’s Spiral shows off one of them. Stanislaw Ulam’s spiral is one of those things we find as a curious graphical adjunct to prime numbers. The essay also features one of my many pieces in praise of boredom.
I assume that I disappointed Mr Wu, of the Singapore Maths Tuition blog, last week when I passed on a topic he suggested to unintentionally rewrite a good enough essay. I hope to make it up this week with a piece of linear algebra.
A Unitary Matrix (note the article; there is not a singular the Unitary Matrix) starts with a matrix. This is an ordered collection of scalars. The scalars we call elements. I can’t think of a time I ever saw a matrix represented except as a rectangular grid of elements, or as a capital letter for the name of a matrix. Or a block inside a matrix. In principle the elements can be anything. In practice, they’re almost always either real numbers or complex numbers. To speak of Unitary Matrixes invokes complex-valued numbers. If a matrix that would be Unitary has only real-valued elements, we call that an Orthogonal Matrix. It’s not wrong to call an Orthogonal matrix “Unitary”. It’s like pointing to a known square, though, and calling it a parallelogram. Your audience will grant that’s true. But it will wonder what you’re getting at, unless you’re talking about a bunch of parallelograms and some of them happen to be squares.
As with polygons, though, there are many names for particular kinds of matrices. The flurry of them settles down on the Intro to Linear Algebra student and it takes three or four courses before most of them feel like familiar names. I will try to keep the flurry clear. First, we’re talking about square matrices, ones with the same number of rows as columns.
Start with any old square matrix. Give it the name U because you see where this is going. There are a couple of new matrices we can derive from it. One of them is the complex conjugate. This is the matrix you get by taking the complex conjugate of every term. So, if one element is , in the complex conjugate, that element would be . Reverse the plus or minus sign of the imaginary component. The shorthand for “the complex conjugate to matrix U” is . Also we’ll often just say “the conjugate”, taking the “complex” part as implied.
Start back with any old square matrix, again called U. Another thing you can do with it is take the transposition. This matrix, U-transpose, you get by keeping the order of elements but changing rows and columns. That is, the elements in the first row become the elements in the first column. The elements in the second row become the elements in the second column. Third row becomes the third column, and so on. The diagonal — first row, first column; second row, second column; third row, third column; and so on — stays where it was. The shorthand for “the transposition of U” is $U^T$.
You can chain these together. If you start with U and take both its complex-conjugate and its transposition, you get the adjoint. We write that with a little dagger: . For a wonder, as matrices go, it doesn’t matter whether you take the transpose or the conjugate first. It’s the same . You may ask how people writing this out by hand never mistake for . This is a good question and I hope to have an answer someday. (I would write it as in my notes.)
And the last thing you can maybe do with a square matrix is take its inverse. This is like taking the reciprocal of a number. When you multiply a matrix by its inverse, you get the Identity Matrix. Not every matrix has an inverse, though. It’s worse than real numbers, where only zero doesn’t have a reciprocal. You can have a matrix that isn’t all zeroes and that doesn’t have an inverse. This is part of why linear algebra mathematicians command the big money. But if a matrix U has an inverse, we write that inverse as $U^{-1}$.
The Identity Matrix is one of a family of square matrices. Every element in an identity matrix is zero, except on the diagonal. That is, the element at row one, column one, is the number 1. The element at row two, column two is also the number 1. Same with row three, column three: another one. And so on. This is the “identity” matrix because it works like the multiplicative identity. Pick any matrix you like, and multiply it by the identity matrix; you get the original matrix right back. We use the name for an identity matrix. If we have to be clear how many rows and columns the matrix has, we write that as a subscript: or or or so on.
So this, finally, lets me say what a Unitary Matrix is. It’s any square matrix U where the adjoint, $U^{\dagger}$, is the same matrix as the inverse, $U^{-1}$. It’s wonderful to learn you have a Unitary Matrix. Not just because, most of the time, finding the inverse of a matrix is a long and tedious procedure. Here? You have to write the elements in a different order and change the plus-or-minus sign on the imaginary numbers. The only way it would be easier is if you had only real numbers, and didn’t have to take the conjugates.
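The definition is easy to test on a small case. A sketch in plain Python, using a two-by-two matrix I picked because it is a well-known unitary one; the helper names are mine, and the tolerance is there only because floating-point square roots are inexact.

```python
import math

def adjoint(m):
    """Conjugate transpose: swap rows with columns, conjugate each element."""
    return [[m[r][c].conjugate() for r in range(len(m))]
            for c in range(len(m[0]))]

def matmul(a, b):
    """Ordinary matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def is_unitary(m, tol=1e-12):
    """True when adjoint(m) times m is the identity, to within tol."""
    product = matmul(adjoint(m), m)
    n = len(m)
    return all(abs(product[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
U = [[s, s * 1j],
     [s * 1j, s]]            # illustrative example, not from the essay

print(is_unitary(U), is_unitary([[1, 1], [0, 1]]))
```

The second matrix has an inverse, so it is not hopeless, but its adjoint is not that inverse. Being invertible is not enough to be unitary.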
That’s all a nice heap of terms. What makes any of them important, other than so Intro to Linear Algebra professors can test their students?
Well, you know mathematicians. If we like something like this, it’s usually because it holds out the prospect of turning hard problems into easier ones. So it is. Start out with any old matrix. Call it A. Then there exist some unitary matrixes, call them U and V. And their product does something wonderful: is a “diagonal” matrix. A diagonal matrix has zeroes for every element except the diagonal ones. That is, row one, column one; row two, column two; row three, column three; and so on. The elements that trace a path from the upper-left to the lower-right corner of the matrix. (The diagonal from the upper-right to the lower-left we have nothing to do with.) Everything we might do with matrices is easier on a diagonal matrix. So we process our matrix A into this diagonal matrix D. Process it by whatever the heck we’re doing. If we then multiply this by the inverses of U and V? If we calculate ? We get whatever our process would have given us had we done it to A. And, since U and V are unitary matrices, it’s easy to find these inverses. Wonderful!
Also this sounds like I just said Unitary Matrixes are great because they solve a problem you never heard of before.
The 20th Century’s first great use for Unitary Matrixes, and I imagine the impulse for Mr Wu’s suggestion, was quantum mechanics. (A later use would be data compression.) Unitary Matrixes help us calculate how quantum systems evolve. This should be a little easier to understand if I use a simple physics problem as demonstration.
So imagine three blocks, all the same mass. They’re connected in a row, left to right. There’s two springs, one between the left and the center mass, one between the center and the right mass. The springs have the same strength. The blocks can only move left-to-right. But, within those bounds, you can do anything you like with the blocks. Move them wherever you like and let go. Let them go with a kick moving to the left or the right. The only restraint is they can’t pass through one another; you can’t slide the center block to the right of the right block.
This is not quantum mechanics, by the way. But it’s not far, either. You can turn this into a fine toy of a molecule. For now, though, think of it as a toy. What can you do with it?
A bunch of things, but there’s two really distinct ways these blocks can move. These are the ways the blocks would move if you just hit it with some energy and let the system do what felt natural. One is to have the center block stay right where it is, and the left and right blocks swinging out and in. We know they’ll swing symmetrically, the left block going as far to the left as the right block goes to the right. But all these symmetric oscillations look about the same. They’re one mode.
The other is … not quite antisymmetric. In this mode, the center block moves in one direction and the outer blocks move in the other, just enough to keep momentum conserved. Eventually the center block switches direction and swings the other way. But the outer blocks switch direction and swing the other way too. If you’re having trouble imagining this, imagine looking at it from the outer blocks’ point of view. To them, it’s just the center block wobbling back and forth. That’s the other mode.
And it turns out? It doesn’t matter how you started these blocks moving. The movement looks like a combination of the symmetric and the not-quite-antisymmetric modes. So if you know how the symmetric mode evolves, and how the not-quite-antisymmetric mode evolves? Then you know how every possible arrangement of this system evolves.
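If you'd like to see the modes fall out of the arithmetic, here is a minimal sketch. It assumes a model the essay never writes down: equal masses, equal springs, and units where the spring constant divided by the mass is 1. The matrix `K` and the helper names are mine. The sketch also carries a third, zero-frequency "mode", the whole system drifting together, which the description above leaves out since nothing oscillates in it.

```python
import math

# Assumed stiffness matrix for three equal masses and two equal springs,
# in units where (spring constant / mass) = 1.
K = [[1.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]

def normalize(v):
    """Scale a vector to length 1."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Displacement patterns: uniform drift, the symmetric mode (center block
# still), and the not-quite-antisymmetric mode (center against the outer two).
modes = [normalize([1, 1, 1]), normalize([1, 0, -1]), normalize([1, -2, 1])]
U = transpose(modes)  # the modes become the columns of U

# U is real and orthogonal, so its transpose is its inverse.
D = matmul(matmul(transpose(U), K), U)

# Any starting displacement splits into a combination of the modes.
start = [1.0, 0.0, 0.0]  # made-up example: nudge only the left block
weights = [sum(modes[m][i] * start[i] for i in range(3)) for m in range(3)]

for row in D:
    print([round(x, 6) for x in row])
print([round(w, 6) for w in weights])
```

Under those assumptions `D` comes out diagonal, with 0, 1, and 3 on the diagonal up to rounding. Those are the squared frequencies of the drift, the symmetric mode, and the not-quite-antisymmetric mode.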
So here’s where we get to quantum mechanics. Suppose we know the quantum mechanics description of a system at some time. This we can do as a vector. And we know the Hamiltonian, the description of all the potential and kinetic energy, for how the system evolves. The evolution in time of our quantum mechanics description we can see as a unitary matrix multiplied by this vector.
The Hamiltonian, by itself, won’t (normally) be a Unitary Matrix. It gets the boring name H. It’ll be some complicated messy thing. But perhaps we can find a Unitary Matrix U, so that $U H U^{-1}$ is a diagonal matrix. And then that’s great. The original H is hard to work with. The diagonalized version? That one we can almost always work with. And then we can go from solutions on the diagonalized version back to solutions on the original. (If the function $\Psi$ describes the evolution of $U H U^{-1}$, then $U^{-1} \Psi$ describes the evolution of $H$.) The work that U (and $U^{-1}$) does to H is basically what we did with that three-block, two-spring model. It’s picking out the modes, and letting us figure out their behavior. Then put that together to work out the behavior of what we’re interested in.
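Here is a small sketch of that round trip, for a toy two-level system. Everything specific in it is my assumption rather than anything from the essay: the particular `H`, the choice of `U`, and units where Planck's reduced constant is 1. The idea is to rotate the state into the picture where the Hamiltonian is diagonal, let each mode pick up its own phase, and rotate back.

```python
import cmath
import math

# Hypothetical two-level Hamiltonian, in units where hbar = 1.
H = [[0.0, 1.0],
     [1.0, 0.0]]

s = 1 / math.sqrt(2)
# A unitary matrix chosen by hand so that U H U^{-1} comes out diagonal.
U = [[s, s],
     [s, -s]]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose, which is the inverse of a unitary matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

U_inv = dagger(U)
D = matmul(matmul(U, H), U_inv)  # diagonal, entries near +1 and -1

def evolve(state, t):
    """Advance a state by time t by working in the diagonal picture."""
    in_modes = [sum(U[i][k] * state[k] for k in range(2)) for i in range(2)]
    phased = [cmath.exp(-1j * D[i][i] * t) * in_modes[i] for i in range(2)]
    return [sum(U_inv[i][k] * phased[k] for k in range(2)) for i in range(2)]

later = evolve([1.0, 0.0], math.pi / 2)
norm = math.sqrt(sum(abs(c) ** 2 for c in later))
print(later, norm)
```

The state's length stays at 1 however long it evolves, which is the unitary part showing up in the numbers.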
There are other uses, besides time-evolution. For instance, an important part of quantum mechanics and thermodynamics is that we can swap particles of the same type. Like, there’s no telling an electron that’s on your nose from an electron that’s in one of the reflective mirrors the Apollo astronauts left on the Moon. If they swapped positions, somehow, we wouldn’t know. It’s important for calculating things like entropy that we consider this possibility. Two particles swapping positions is a permutation. We can describe that as multiplying the vector that describes what every electron on the Earth and Moon is doing by a Unitary Matrix. Here it’s a matrix that does nothing but swap the descriptions of these two electrons. I concede this doesn’t sound thrilling. But anything that goes into calculating entropy is first-rank important.
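A swap like that is about the simplest Unitary Matrix there is. This sketch uses a made-up three-entry description and a matrix `P` (my name for it) that exchanges the first and third entries. It checks that the transpose of `P` undoes `P`, which is what being unitary comes to for a real matrix.

```python
# P swaps the first and third entries of a three-entry description.
P = [[0, 0, 1],
     [0, 1, 0],
     [1, 0, 0]]

def apply(m, v):
    """Multiply matrix m by column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

state = [0.2, 0.5, 0.3]  # made-up numbers standing in for three particles
swapped = apply(P, state)
identity_check = matmul(transpose(P), P)

print(swapped)
print(identity_check)
```

Applying `P` a second time puts everything back where it started.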
As with time-evolution and with permutation, though, any symmetry matches a Unitary Matrix. This includes obvious things like reflecting across a plane. But it also covers, like, being displaced a set distance. And some outright obscure symmetries too, such as the phase of the state function $\Psi$. I don’t have a good way to describe what this is, physically; we can’t observe it directly. This symmetry, though, manifests as the conservation of electric charge, a thing we rather like.
This, then, is the sort of problem that draws Unitary Matrixes to our attention.
I’m still only doing short reviews of my readership figures. These are nice easy posts to make, and strangely popular, but they do take time and I’m never sure why people find them interesting. I think it’s all from other bloggers, happy to know how much better their blogs are doing.
Granted that: I had, for me, a really well-read month. According to WordPress, there were 3,043 pages viewed here in October 2020. This is way above the twelve-month running average of 2,381.5 views per month. Also this is the second-largest number of page views I’ve gotten since October 2019. That month, too, was part of an A-to-Z sequence. I wrote something that got referenced on some actually popular web site, though, last year. This year, all I can figure is spillover of people on my other blog wanting to know what’s going on with Mark Trail.
(If you read any web site that regularly talks about Mark Trail, poke around the comments. There’s people upset about the new artist. It’s not my intention to mock them; anything you like changing out from under you is upsetting. But it is soothing to see people worrying about, ultimately, a guy who punches smugglers while giant squirrels talk. On my other blog I plan to have a full plot recap of that in about two weeks.)
There were more unique visitors in October 2020 than any other month besides October 2019, also. WordPress recorded 2,161 unique visitors, well above the twelve-month running average of 1,644.2. It’s much the same for interactions as well: 79 things were liked, compared to the running average of 59.8, and 18 comments, above the 17.1 running average.
October was another month of 18 posts, and I have a running average of 17.6 posts per month now. I’m surprised by that too. I feel like any month that isn’t an A-to-Z sequence I have twelve posts, but there we go. This all means the per-post October averages were above the per-post running averages.
What were the most popular recent posts? Here recent means “from September or October”? That I’m glad to share:
All told, in October I published 12,937 words, down a bit from September. This was an average of 718.7 words per posting in October, which still brings my year-to-date average post length up to 697 words. It had been 694 at the start of October.
As of the start of November I’ve published 1,554 posts here. They’ve gathered 116,811 page views. I like how nearly but not quite palindromic that number is. It even almost but not quite stays the same under a 180 degree rotation. These pages overall have drawn 66,030 logged unique visitors.
I know, it’s strange for me to not post another piece about tiling. But My 2019 Mathematics A To Z: Taylor Series is going to be a good utility essay, useful for a long while to come. Taylor Series represent one of the standard mathematician tricks. This is to rewrite a thing we want to do as a sum of things it’s easy to do. This can make our problem into a long series of little problems. But the advantage is we know what to do with all those little problems. It’s often a worthwhile trade.
I’ve always held out the option that I would revisit a topic sometime. I thought it would most likely be taking some essay from one of my earliest A-to-Z’s where, with a half-decade’s more experience in pop mathematics writing, I could do much better. And at the request of someone who felt that, like, my piece on duals was foggy. It is, but nobody’s ever cared enough about duals to say anything.
So I went looking at what previous T topics I’d written about here. Usually I pick them the Sunday or Monday of a week, since that’s easy to do. This week, I didn’t have the time until Thursday when I looked and found I wrote up “Tiling” for the 2018 A-to-Z. In about November of that year, too. And after casting aside a suggestion from Mr Wu of the Singapore Maths Tuition blog, although that time at least I was responding to a specific topic suggestion. 2020, you know?
Well, now that the deed is done, I can see what I learned from it anyway. First is picking out the archive pieces before I write the week’s essay. Second is how my approach differed in the 2020 essay. The broad picture is similar enough. The most interesting differences are that in the 2020 essay I look at more specifics. Like, just when Robert Berger found his aperiodic tiling of the plane. And what the Wang Tiles are that he found them with. Or, a very brief sketch of how to show Penrose (rhomboid) tiling is aperiodic. This changes the shape of the essay. Also it makes the essay longer, but that might also reflect that in 2018 I was publishing two essays a week. This year I’m doing one, and somehow still putting out as many words per week.
I like the greater focus on specifics, although that might just reflect that I’m usually happiest with something I just wrote. As I get distance from it, I come to feel the whole thing’s so bad as to be humiliating. When it’s far enough in the past, usually, I come around again and feel it’s pretty good, and maybe that I don’t know how to write like that anymore. The 2018 essay is, to me, only embarrassing in stuff that I glossed over that in 2020 I made specific. Not to worry, though. I still get foggy and elliptical about important topics anyway.