I hadn’t quite intended it, but February was another low-power month here. No big A-to-Z project and no resumption of Reading the Comics. The high points were sharing things that I’d seen elsewhere, and a mathematics problem that occurred to me while making tea. Very low-scale stuff. Still, I like to check on how that’s received.
I did put together seven posts for February — the same as January — and here’s a list of them in descending order of popularity:
I assume the essay setting out the tea question was more popular than the answer because it had a week more to pick up readers. That or people reading the answer checked back on what the question was. It couldn’t be that people are that uninterested in my actually explaining a mathematics thing.
That’s it for relative popularity. How about for total readership?
I had expected readership to continue declining, since I’m publishing fewer things and having my name out there seems to matter. But the decline’s less drastic than I expected. There were 2,167 page views here in February. But in the twelve months from February 2020 through January 2021? I had a mean of 2,137.4 page views, and a median of 2,044.5. That is, I’m still on the high side of my popularity.
There were 1,576 logged unique visitors in February. In the twelve months leading up to that the mean was 1,480.7 unique visitors, and the median 1,395.5.
The figures look more impressive if you rate them by number of postings. In that case in February I gathered 309.6 views per posting, way above the mean of 157.9 and median of 135.6. There were also 225.1 unique visitors per posting, again way above the running mean of 109.9 and median of 90.7.
I’ll dig unpopularity out of any set of numbers, though. There were only 47 likes granted here in February, down from the running mean of 55.8 and median of 55.5. That is still 6.7 likes per posting, above the mean of 3.9 and median of 4.0, but it’s still sparse likings. There were a hearty 39 comments given — my highest number since October 2018 — and that’s well above the mean of 17.0 and median of 18. Per posting, that’s 5.6 comments per posting, the highest I have since I started calculating this figure back in July of 2018. The mean and median comments per posting, for the twelve months leading up to this, were both 1.2.
WordPress’s insights panel tells me I published seven things in February, which matches my experience. I still can’t explain the discrepancy back in January. It says also that I published 3,440 words over February, my quietest month since I started tracking those numbers. It put my average post at 590 words for February, and 573.3 words for the whole year to date.
I start March, if WordPress is reliable, having gathered 126,829 views from 73,273 logged unique visitors. This after 1,595 posts in total.
If you have a WordPress account you can add me to your Reader by clicking the “Follow Nebusresearch” button on this page. I’ve also re-enabled the “Follow NebusResearch By E-mail” option, for people who want to see posts before I’ve fixed the typos. The typos will never be fixed. Every time an author looks at an old blog post there are three more typos, even if they’ve corrected the typos before.
The problem I’d set out last week: I have a teapot good for about three cups of tea. I want to put milk in just the once, before the first cup. How much should I drink before topping up the cup, to have the most milk at the end?
I have expectations. Some of this I know from experience, doing other problems where things get replaced at random. Here, tea or milk particles get swallowed at random, and replaced with tea particles. Yes, ‘particle’ is a strange word to apply to “a small bit of tea”. But it’s not like I can call them tea molecules. “Particle” will do and stop seeming weird someday.
Random replacement problems tend to be exponential decays. That I know from experience doing problems like this. So if I get an answer that doesn’t look like an exponential decay I’ll doubt it. I might be right, but I’ll need more convincing.
I also get some insight from extreme cases. We can call them reductios. Here “reductio” as in the word we usually follow with “ad absurdum”. Make the case ridiculous and see if that offers insight. The first reductio is to suppose I drink the entire first cup down to the last particle, then pour new tea in. By the second cup, there’s no milk left. The second reductio is to suppose I drink not a bit of the first cup of milk-with-tea. Then I have the most milk preserved. It’s not a satisfying break. But it leads me to suppose the most milk makes it through to the end if I have a lot of small sips and replacements of tea. And to look skeptically if my work suggests otherwise.
So that’s what I expect. What actually happens? Here, I do a bit of reasoning. Suppose that I have a mug. It can hold up to 1 unit of tea-and-milk. And I have the teapot, which holds up to 2 more units of tea. What units? For the mathematics, I don’t care.
I’m going to suppose that I start with some amount of milk; call it $a$. $a$ is some number between 0 and 1. I fill the cup up to full, that is, 1 unit of tea-and-milk. And I drink some amount of the mixture. Call the amount I drink $x$. It, too, is between 0 and 1. After this, I refill the mug up to full, so, putting in $x$ units of tea. And I repeat this until I empty the teapot. So I can do this $\frac{2}{x}$ times.
I know you noticed that I’m short on tea here. The teapot should hold 3 units of tea. I’m only pouring out $3 - a$. I could be more precise by refilling the mug $\frac{2 + a}{x}$ times. I’m also going to suppose that I refill the mug with amount $x$ of tea a whole number of times. This sounds necessarily true. But consider: what if I drank and re-filled three-quarters of a cup of tea each time? How much tea is poured that third time?
I make these simplifications for good reasons. They reduce the complexity of the calculations I do without, I trust, making the result misleading. I can justify it too. I don’t drink tea from a graduated cylinder. It’s a false precision to pretend I do. I drink (say) about half my cup and refill it. How much tea I get in the teapot is variable too. Also, I don’t want to do that much work for this problem.
In fact, I’m going to do most of the work of this problem with a single drawing of a square. Here it is.
So! I start out with $a$ units of milk in the mixture. After drinking $x$ units of milk-and-tea, what’s left is $a(1 - x)$ units of milk in the mixture.
How about the second refill? The process is the same as the first refill. But where, before, there had been $a$ units of milk in the tea, now there are only $a(1 - x)$ units in. So that horizontal strip is a little narrower is all. The same reasoning applies and so, after the second refill, there’s $a(1 - x)^2$ milk in the mixture.
If you nodded to that, you’d agree that after the third refill there’s $a(1 - x)^3$. And you’re pretty sure what happens at the fourth and fifth and so on. If you didn’t nod to that, it’s all right. If you’re willing to take me on faith we can continue. If you’re not, that’s good too. Try doing a couple drawings yourself and you may convince yourself. If not, I don’t know. Maybe try, like, getting six white and 24 brown beads, stir them up, take out four at random. Replace all four with brown beads and count, and do that several times over. If you’re short on beads, cut up some paper into squares and write ‘B’ and ‘W’ on each square.
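If you’re short on paper too, the bead experiment can be sketched in Python. This is only an illustration under assumptions of my own: the function names, the number of rounds, the number of trials, and the random seed are all made up for the example. It averages the white-bead count over many runs and sets that beside a plain geometric decay, where each round keeps 26/30 of the white beads on average.

```python
import random

def simulate_beads(white=6, brown=24, drawn=4, rounds=5, trials=5000, seed=2021):
    """Average number of white beads left after each round of draw-and-replace."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    totals = [0] * (rounds + 1)
    for _ in range(trials):
        beads = ["W"] * white + ["B"] * brown
        totals[0] += white
        for r in range(1, rounds + 1):
            rng.shuffle(beads)               # stir the beads
            beads[:drawn] = ["B"] * drawn    # take some out, put brown ones back
            totals[r] += beads.count("W")
    return [t / trials for t in totals]

def predicted_beads(white=6, brown=24, drawn=4, rounds=5):
    """Geometric decay: each round removes drawn/(white + brown) of the whites on average."""
    fraction_removed = drawn / (white + brown)
    return [white * (1 - fraction_removed) ** j for j in range(rounds + 1)]

if __name__ == "__main__":
    for j, (seen, expected) in enumerate(zip(simulate_beads(), predicted_beads())):
        print(f"round {j}: simulated {seen:.3f}, predicted {expected:.3f}")
```

Any single run with real beads will wander from the prediction. It’s the average over many runs that settles onto the decay.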
But anyone comfortable with algebra can see how to reduce this. The amount of milk remaining after $j$ refills is going to be
$a(1 - x)^j$
How many refills does it take to run out of tea? That we knew from above: it’s $\frac{2}{x}$ refills. So my last full mug of tea will have left in it
$a(1 - x)^{2/x}$
units of milk.
Anyone who does differential equations recognizes this. It’s the discrete approximation of the exponential decay curve. Discrete, here, because we take out some finite but nonzero amount of milk-and-tea, $x$, and replace it with the same amount of pure tea.
Now, again, I’ve seen this before so I know its conclusions. The most milk will make it to the end if $x$ is as small as possible. The best possible case would be if I drink and replace an infinitesimal bit of milk-and-tea each time. Then the last mug would end with $a e^{-2}$ of milk. That’s $e$ as in the base of the natural logarithm. Every mathematics problem has an $e$ somewhere in it and I’m not exaggerating much. All told this would be about 13 and a half percent of the original milk.
Drinking more realistic amounts, like, half the mug before refilling, makes the milk situation more dire. Replacing half the mug at a time means the last full mug has only one-sixteenth what I started with. Drinking a quarter of the mug and replacing it lets about one-tenth the original milk survive.
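Those figures are easy to check numerically. Here is a small Python sketch, taking $a$ as the starting milk, $x$ as the amount drunk and replaced each time, and 2 units of tea left in the pot for refills. The function name `milk_left` and the sample values of $x$ are my own choices for illustration.

```python
import math

def milk_left(x, a=1.0, pot=2.0):
    """Milk in the last full mug: a * (1 - x) ** (pot / x).

    x   -- amount drunk and replaced with tea each time (0 < x < 1)
    a   -- milk at the start; 1.0 makes the result a fraction of the original
    pot -- units of tea available for refills
    """
    refills = pot / x
    return a * (1 - x) ** refills

if __name__ == "__main__":
    for x in (0.5, 0.25, 0.1, 0.01):
        print(f"x = {x:<5} milk left = {milk_left(x):.4f}")
    # The limit for ever-smaller sips
    print(f"limit e^-2    = {math.exp(-2):.4f}")
```

Half-mug sips give exactly one-sixteenth, quarter-mug sips give a shade over one-tenth, and shrinking $x$ creeps up toward $e^{-2}$ without reaching it.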
But all told the lesson is clear. If I want milk in the last mug, I should put some in each refill. Putting all the milk in at the start and letting it dissolve doesn’t work.
A post on Mathstodon made me aware there’s a bit of talk about iceberg shapes. Particularly that one of the iconic photographs of an iceberg, above and below water, is an imaginative work. A real iceberg wouldn’t be stable in that orientation. Which, I’ll admit, isn’t something I had thought about. I also hadn’t thought about the photography challenge of getting a clear picture of something in sunlight and in water at once. There was a lot I hadn’t thought about. In my defense, I spend a lot of time noticing comic strips had a character complain about the New Math.
I’ve been taking milk in my tea lately. I have a teapot good for about three cups of tea. So that’s got me thinking about how to keep the most milk in the last of my tea. You may ask why I don’t just get some more milk when I refill the cup. I answer that if I were willing to work that hard I wouldn’t be a mathematician.
It’s easy to spot the lowest amount of milk I could have. If I drank the whole of the first cup, there’d be only whatever milk was stuck by surface tension to the cup for the second. And so even less than that for the third. But if I drank half a cup, poured more tea in, drank half again, poured more in … without doing the calculation, that’s surely more milk for the last full cup.
So what’s the strategy for the most milk I could get in the final cup? And how much is in there?
Rosenbluth was a PhD in physics (and an Olympics-qualified fencer). Her postdoctoral work was with the Atomic Energy Commission, bringing her to a position at Los Alamos National Laboratory in the early 1950s. And a moment in computer science that touches very many people’s work, my own included. This is in what we call Metropolis-Hastings Markov Chain Monte Carlo.
Monte Carlo methods are numerical techniques that rely on randomness. The name references the casinos. Markov Chain refers to techniques that create a sequence of things. Each thing exists in some set of possibilities. If we’re talking about Markov Chain Monte Carlo this is usually an enormous set of possibilities, too many to deal with by hand, except for little tutorial problems. The trick is that what the next item in the sequence is depends on what the current item is, and nothing more. This may sound implausible (when does anything in the real world not depend on its history?) but the technique works well regardless. Metropolis-Hastings is a way of finding states that meet some condition well. Usually this is a maximum, or minimum, of some interesting property. Under the Metropolis-Hastings rule, the chance of going to an improved state, one with more of whatever property we like, is 1, a certainty. The chance of going to a worsened state, with less of the property, is not zero. The worse the new state is, the less likely it is, but it’s never zero. The result is a sequence of states which, most of the time, improve whatever it is you’re looking for. It sometimes tries out some worse fits, in the hopes that this leads us to a better fit, for the same reason sometimes you have to go downhill to reach a larger hill. The technique works quite well at finding approximately-optimum states when it’s hard to find the best state, but it’s easy to judge which of two states is better. Also when you can have a computer do a lot of calculations, because it needs a lot of calculations.
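The acceptance rule just described can be sketched in a few lines of Python. This is a toy under assumptions of my own, not the 1953 program: the score function `bumpy`, the one-step proposal, the `temperature` parameter, and the seed are all invented for the example. The one piece taken from the description above is the rule itself: always accept an improvement, and accept a worsening with a chance that shrinks as the worsening grows but never reaches zero.

```python
import math
import random

def acceptance_probability(change, temperature=1.0):
    """Chance of moving to a candidate whose score differs by `change`."""
    if change >= 0:
        return 1.0                           # improvements are a certainty
    return math.exp(change / temperature)    # worse moves: unlikely, never impossible

def metropolis_search(score, start, propose, steps=5000, temperature=1.0, seed=0):
    """Walk from state to state, remembering the best-scoring state visited."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        candidate = propose(current, rng)    # depends only on the current state
        change = score(candidate) - score(current)
        if rng.random() < acceptance_probability(change, temperature):
            current = candidate
            if score(current) > score(best):
                best = current
    return best

def bumpy(x):
    """Hypothetical score: a broad hill near x = 30 with ripples on top."""
    return -((x - 30) ** 2) / 100 + 2 * math.cos(x)

def step(x, rng):
    """Hypothetical proposal: move one unit left or right."""
    return x + rng.choice((-1, 1))

if __name__ == "__main__":
    found = metropolis_search(bumpy, start=0, propose=step)
    print(found, bumpy(found))
```

Notice that the proposal looks only at the current state, which is the Markov Chain part, and that the search never needs to know the best possible score, only which of two states scores higher.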
So here we come to Rosenbluth. She and her then-husband, according to an interview he gave in 2003, were the primary workers behind the 1953 paper that set out the technique. And, particularly, she wrote the MANIAC computer program which ran the algorithm. It’s important work and an uncounted number of mathematicians, physicists, chemists, biologists, economists, and other planners have followed. She would go on to study statistical mechanics problems, in particular simulations of molecules. It’s still a rich field of study.
Both the Klein bottle and the Möbius strip have many possible appearances, for about the same reason there are many kinds of trapezoids or octagons or whatnot. Möbius strips are easy enough to make in real life. Klein bottles, not so; the shape needs four dimensions of space and we just don’t have them. We’ll represent it with a shape that loops back through itself, but a real Klein bottle wouldn’t do that, for the same reason a wireframe cube’s edges don’t intersect the way the lines of its photograph do.
It makes a good wireframe shape, though. I’m surprised not to see more playground equipment using it.
This is not the whole of her work, though my understanding is she’d be worth noticing even if it were. Part of the greatness of the translation was putting Newton’s mathematics — which he had done as geometric demonstrations — into the calculus of the day. The experts on In Our Time’s podcast argue that she did a good bit of work advancing the state of calculus in doing this. She’d also done a good bit of work on the problem of colliding bodies.
A major controversy was, in modern terms, whether momentum and kinetic energy are different things and, if they are different, which one collisions preserve. Châtelet worked on experiments (inspired by ideas of Gottfried Wilhelm Leibniz) to show kinetic energy was its own thing and was the important part of collisions. We today understand both momentum and energy are conserved, but we have the advantage of her work and the people influenced by her work to draw on.
She’s also renowned for a paper about the nature and propagation of fire, submitted anonymously for the Académie des Sciences’s 1737 Grand Prix. It didn’t win — Leonhard Euler’s did — but her paper and her lover Voltaire’s papers were published.
Châtelet was also surprisingly connected to the nascent mathematics and physics scene of the time. She had ongoing mathematical discussions with Pierre-Louis Maupertuis, of the principle of least action; Alexis Clairaut, who calculated the return of Halley’s Comet; Samuel König, author of a theorem relating systems of particles to their center of mass; and Bernard de Fontenelle, perpetual secretary of the Académie des Sciences.
So for those interested in the history of mathematics and physics, and of women who are able to break through social restrictions to do good work, the podcast is worth a listen.
I spent much of the time waiting for a mention of Chatelier’s principle which never came. This because Chatelier’s principle, about the tendency of a system in equilibrium to resist changes, is named for Henry Louis Le Chatelier, a late 19th/early 20th century chemist with, so far as I know, no relation to Émilie du Châtelet. I hope this spares you the confusion I felt.
I did not abandon my mathematics blog in January. I felt like I did, yes. But I posted seven essays, by my count. Six, by the WordPress statistics “Insight” panel. I have no idea what post it thinks doesn’t count, but this does shake my faith in whatever Insights it’s supposed to give me. On my humor blog, which had a post a day, it correctly logs 31. I haven’t noticed other discrepancies either. And it’s not like any of my seven January posts was a reblog which might count differently. One quoted a tweet, but that’s nothing unusual.
I’ve observed that my views-per-post tend to be pretty uniform. The implication then is that the more I write, the more I’m read, which seems reasonable. So what would I expect from the most short-winded month I’ve had in at least two and a half years?
So, this might encourage some bad habits in me. There were 2,611 page views here in January 2021. That’s above December’s total, and comfortably above the twelve-month running mean of 2,039.5. It’s also above the twelve-month running median of 2,014.5. This came from 1,849 unique visitors. That’s also above the twelve-month running mean of 1,405.8 unique visitors, and the running median of 1,349 unique visitors.
Where things fell off a bit are in likes and comments. There were 41 likes given in January 2021, below the running mean of 55.2 and running median of 55.5. There were 13 comments received, below the running mean of 16.5 and running median of 18.
Looked at per-post, though, these are fantastic numbers. 373.0 views per posting, crushing the running mean of 138.8 and running median of 135.6 views per posting. (And I know these were not all views of January 2021-dated posts.) There were 264.1 unique visitors per posting, similarly crushing the running mean of 95.8 and running median of 90.7 unique visitors per posting.
Even the likes and comments look good, rated that way. There were 5.9 likes per posting in January, above the running mean and median of 3.7 likes per posting. There were 1.9 comments per posting, above the running mean of 1.1 and median of 1.0 per posting. The implication is clear: people like it when I write less.
It seems absurd to list the five most popular posts from January when there were seven total, and two of them were statistics reviews. So I’ll list them all, in descending order of popularity.
WordPress claims that I published 4,231 words in January. Since the Insights panel thinks I published six things, that’s an average of 705 words per post. Since I know I published seven things, that’s an average of 604.4 words per post. I don’t know how to reconcile all this. WordPress put my 2020 average at 672 words per posting, for what that’s worth.
If I can trust anything WordPress tells me, I started February 2021 with 1,588 posts written since I started this in 2011. They’d drawn a total of 124,662 views from 71,697 logged unique visitors.
My love read a thread about the < and > signs, and mnemonics people had learned to tell which was which. And my love wondered, is a mnemonic needed? The symbol is wider on the side with the larger quantity; that’s what it means, right? Why imagine an alligator that’s already swallowed the smaller and is ready to eat the larger? In my elementary school it was goldfish, not alligators. Much easier to draw them in.
All right, but just because an interpretation seems obvious doesn’t mean it is. The questions are, who introduced the < and > symbols to mathematics, and what were they thinking?
And here we get complications. The symbols first appear, meaning what they do today, in Artis Analyticae Praxis ad Aequationes Algebraicas Resolvendas (“The Analytical Art by which Algebraic Equations can be Resolved”). This is a book, by Thomas Harriot, published in 1631. Thomas Harriot was one of the great English mathematicians of the late 16th and early 17th centuries. He worked on the longitude problem, on optics, on astronomy. Harriot’s observations are our first record of sunspots. He almost observed what we now call Halley’s Comet, with records used to work out its orbit. And he worked on how to solve equations, in ways that look at least recognizably close to what we do today.
There is a tradition that holds Harriot drew these symbols from the arm markings on a Native American. Harriot did sail to the New World at least once. He was on Walter Raleigh’s 1585-86 expedition to Virginia and observed the solar eclipse of April 1585. This was a rare chance to calculate the longitude of a ship at sea. So that’s possible. But there is also an argument that Harriot (or editor) drew from the example of the equals sign.
The = sign we first see in the mid-16th century, written by Robert Recorde, another of the great English mathematicians. Recorde did write, in The Whetstone of Witte (1557) that he used parallel lines of a common length because no two things could be more equal. Good mnemonic there. It seems Harriot (or editor) interpreted the common distance between the lines in the equals sign as the thing kept equal. So, on the side of the symbol with the greater number, make the distance between lines greater. On the lower-number’s side, make the distance between lines smaller. Which is another useful mnemonic for the symbol, if you need one.
It’s not an inevitable scheme. William Oughtred also had symbols for less-than and greater-than. Oughtred’s another vaguely familiar name in mathematics symbols. He gave us the $\times$ symbol for multiplication, and $\sin$ and $\cos$ for the trig functions. He also pioneered slide rules. Oughtred’s symbols look like a block-letter U set on its side, with the upper leg longer than the lower. The vertical stroke and the shorter horizontal stroke would be on the left, to represent the left being greater than the right. The vertical stroke and shorter horizontal stroke would be on the right, for the left being less than the right. That is, the “open” side would face the smaller of the numbers, opposite to what we do with < and >.
And that seems to be as much as can be definitely said. If I’m reading right, we don’t have Harriot’s (or editor’s) statement of what inspired these symbols. We have guesses that seem reasonable, but that might only seem reasonable because we’ve brought our own interpretations to it. I’d love to know if there’s better information available.
My friend ChefMongoose pointed out this probability question. As with many probability questions, it comes from a dice game. Here, Yahtzee, based on rolling five dice to make combinations. I’m not sure whether my Twitter problems will get in the way of this embedding working; we’ll see.
Probability help please! You are playing Yahtzee against your insanely competitive spouse. You have two rolls left. You’re trying to get three of a kind. Is it better to commit and roll three dice here? Or split it and roll one die? — Christopher Yost.
Of the five dice, two are showing 1’s; two are showing 2’s; and there’s one last die that’s a 3.
As with many dice questions you can in principle work this out by listing all the possible combinations of every possible outcome. A bit of reasoning takes much less work, but you have to think through the reasons.
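The listing-everything approach is tedious by hand but quick for a computer. Here is a Python sketch of it, under assumptions that are mine rather than the tweet’s: “three of a kind” means at least three dice showing the same face, the player stops once that happens, and on the final roll the player may keep whichever dice give the best chance. The function names are made up for the example.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations, product

FACES = range(1, 7)

def has_three_of_a_kind(dice):
    """True if any face shows on at least three of the dice."""
    return any(dice.count(face) >= 3 for face in FACES)

def chance_after_keeping(kept, rolls_left):
    """Chance of success if we keep `kept`, roll the rest, then play on optimally."""
    n_rolled = 5 - len(kept)
    total = sum(
        best_chance(tuple(sorted(kept + rolled)), rolls_left - 1)
        for rolled in product(FACES, repeat=n_rolled)   # every equally likely roll
    )
    return total / 6 ** n_rolled

@lru_cache(maxsize=None)
def best_chance(dice, rolls_left):
    """Best achievable chance of three of a kind from a sorted five-dice hand."""
    if has_three_of_a_kind(dice):
        return Fraction(1)
    if rolls_left == 0:
        return Fraction(0)
    # Every distinct choice of dice to keep, from none of them to all five
    keeps = {
        tuple(dice[i] for i in chosen)
        for size in range(len(dice) + 1)
        for chosen in combinations(range(len(dice)), size)
    }
    return max(chance_after_keeping(keep, rolls_left) for keep in keeps)

if __name__ == "__main__":
    commit = chance_after_keeping((1, 1), 2)        # keep the 1's, roll three dice
    split = chance_after_keeping((1, 1, 2, 2), 2)   # keep both pairs, roll one die
    print("roll three dice:", commit, float(commit))
    print("roll one die:   ", split, float(split))
```

Under those assumptions the enumeration comes out to 56/81 for committing and 17/27 for splitting, so committing does a little better. Change the assumptions, say by forcing the same choice on both rolls, and the numbers move, though the reasoning has the same shape.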
I like starting the year with a look at the past year’s readership. Really what I like is sitting around waiting to see if WordPress is going to provide any automatically generated reports on this. The first few years I was here it did, this nice animated video with fireworks corresponding to posts and how they were received. That’s been gone for years and I suppose isn’t ever coming back. WordPress is run by a bunch of cowards.
But I can still do a look back the old-fashioned way, like I do with the monthly recaps. There’s just fewer years to look back on, and less reliable trends to examine.
2020 was my ninth full year of mathematics blogging. (I reach my tenth anniversary in September and no, I haven’t any idea what I’ll do for that. Most likely forget.) It was an unusual one in that I set aside what’s been my largest gimmick, the Reading the Comics essays, in favor of my second-largest gimmick, the A-to-Z. It’s the first year I’ve done an A-to-Z that didn’t have a month or two with a posting every day. Also along the way I slid from having a post every Sunday come what may to having a post every Wednesday, although usually a Monday and a Friday as well. Everyone claims it helps a blog to have a regular schedule, although I don’t know whether the particular day of the week counts for much. But how did all that work out for me?
So, I had a year that nearly duplicated 2019. There were 24,474 page views in 2020, down insignificantly from 2019’s 24,662. There were 16,870 unique visitors in 2020, up but also insignificantly from the 16,718 visiting in 2019. The number of likes continued to drift downward, from 798 in 2019 to 662 in 2020. My likes peaked in 2015 (over 3200!) and have fallen off ever since in what sure looks like a Poisson distribution to my eye. But the number of comments — which also peaked in 2015 (at 822) — actually rose, from 181 in 2019 to 198 in 2020.
There’s two big factors in my own control. One is when I post and, as noted, I moved away from Sunday posts midway through the year. The other is how much I post. And that dropped: in 2019 I had 201 posts published. In 2020 I posted only 178.
I thought of 2020 as a particularly longwinded year for me. WordPress says I published only 118,941 words, though, for an average of 672 words per posting. That’s my fewest number of words since 2014, though, and my shortest words-per-posting for the year going since 2013. Apparently throwing things off is all those posts that just point to earlier posts.
And what was popular among posts this year? Rather than give even more attention to how many kinds of trapezoid I can think of, I’ll focus just on what were the most popular things posted in 2020. Those were:
I am, first, surprised that so many Reading the Comics posts were among the most-read pieces. I like them, sure, but how many of them say anything that’s relevant once you’ve forgotten whether you read today’s Scary Gary? And yes, I am going to be bothered until the end of time that I was inconsistent about including the # symbol in the Playful Math Education Blog Carnival posts.
I fell off checking what countries sent me readers, month by month. I got bored writing an image alt-text of “Mercator-style map of the world, with the United States in dark red and most of the New World, western Europe, South and Pacific Rim Asia, Australia, and New Zealand in a more uniform pink” over and over and over again. But it’s a new year, it’s worth putting some fuss into things. And then, hey, what’s this?
Yeah! I finally got a reader from Greenland! Two page views, it looks like. Here’s the whole list, for the whole world.
United Arab Emirates
Hong Kong SAR China
Macau SAR China
Trinidad & Tobago
U.S. Virgin Islands
Bosnia & Herzegovina
Northern Mariana Islands
This is 141 countries, or country-like constructs, all together. I don’t know how that compares to previous years but I’m sure it’s the first time I’ve had five different countries send me a thousand page views each. That’s all gratifying to see.
So what plans have I got for 2021? And when am I going to get back to Reading the Comics posts? Good questions and I don’t know. I suppose I will pick up that series again, although since I took no notes last week, it isn’t going to be this week. At some time this year I want to do another A-to-Z, but I am still recovering from the workload of the last. Anything else? We’ll see. I am open to suggestions of things people think I should try, though.
This is, at least, a retrocomputing-adjacent piece. I’m looking back at the logic of a common and useful tool from the early-to-mid-80s and why it’s built that way. I hope you enjoy. It has to deal with some of the fussier points about how Commodore 64 computers worked. If you find a paragraph is too much technical fussing for you, I ask you to not give up, just zip on to the next paragraph. It’s interesting to know why something was written that way, but it’s all right to accept that it was and move to the next point.
How Did You Get Computer Programs In The 80s?
When the world and I were young, in the 1980s, we still had computers. There were two ways to get software, though. One was trading cassette tapes or floppy disks with cracked programs on them. (The cracking was taking off the copy-protection.) The other was typing. You could type in your own programs, certainly, just like you can make your own web page just by typing. Or you could type in a program. We had many magazines and books that had programs ready for entry. Some were serious programs, spreadsheets and word processors and such. Some were fun, like games or fractal-generators or such. Some were in-between, programs to draw or compose music or the such. Some added graphics or sound commands that the built-in BASIC programming language lacked. All this was available for the $2.95 cover price, or ten cents a page at the library photocopier. I had a Commodore 64 for most of this era, moving to a Commodore 128 (which also ran Commodore 64 programs) in 1989 or so. So my impressions, and this article, default to the Commodore 64 experience.
These programs all had the same weakness. You had to type them in. You can expect to make errors. If the program was written in BASIC you had a hope of spotting errors. The BASIC programming language uses common English words for its commands. Their grammar is not English, but it’s also very formulaic, and not hard to pick up. One has a chance of spotting mistakes if it’s 250 PIRNT "SUM; " S one typed.
But many programs were distributed as machine language. That is, the actual specific numbers that correspond to microchip instructions. For the Commodore 64, and most of the eight-bit home computers of the era, this was the 6502 microchip. (The 64 used a variation, the 6510. The differences between the 6502 and 6510 don’t matter for this essay.) Machine language had advantages, making the programs run faster, and usually able to do more things than BASIC could. But a string of numbers is only barely human-readable. Oh, you might in time learn to recognize the valid microchip instructions. But it is much harder to spot the mistakes on entering 32 255 120. That last would be a valid command on any eight-bit Commodore computer. It would have the computer print something, if it weren’t for the transposition errors.
What Was MLX and How Did You Use It?
The magazines came up with tools to handle this. In the 398-page(!) December 1983 issue of Compute!, my favorite line of magazines introduced MLX. This was a program, written in BASIC, which let you enter machine language programs. Charles Brannon has the credit for writing the article which introduced it. I assume he also wrote the program, but could be mistaken. I’m open to better information. Other magazines had other programs to do the same work; I knew them less well. MLX formatted machine language programs to look like this:
What did all this mean, though? These were lines you would enter in while running MLX. Before the colon was a location in memory. The numbers after the colon — the entries, I’ll call them — are six machine language instructions, one number to go into each memory cell. So, the number 169 was destined to go into memory location 49152. The number 002 would go into memory location 49153. The number 141 would go into memory location 49154. And so on; 000 would go into memory location 49158, 141 into 49159, 179 into 49160. 002 would go into memory location 49164; 141 would go into memory location 49170. And so on.
MLX would prompt you with the line number, the 49152 or 49158 or 49164 or so on. Machine language programs could go into almost any memory location. You had to tell it where to start. 49152 was a popular location for Commodore 64 programs. It was the start of a nice block of memory not easily accessed except by machine language programs. Then you would type in the entries, the numbers that follow. This was a reasonably efficient way to key this stuff in. MLX automatically advanced the location in memory and would handle things like saving the program to tape or disk when you were done.
The alert reader notices, though, that there are seven entries after the colon in each line. That seventh number is the checksum. It’s the guard that Compute! and Compute!’s Gazette put against typos. MLX did a calculation based on the memory location and the first six numbers of the line. If the result was not the seventh number on the line, then there was an error somewhere. You had to re-enter the line to get it right.
The thing I’d wondered, and finally got curious enough to explore, was how it calculated this.
What Was The Checksum And How Did It Work?
Happily, Compute! and Compute!’s Gazette published MLX in almost every issue, so it’s easy to find. You can see it, for example, on page 123 of the October 1985 issue of Compute!’s Gazette. And MLX was itself a BASIC program. There are quirks of the language, and its representation in magazine print, that take time to get used to. But one can parse it without needing much expertise. One important thing is that most Commodore BASIC commands didn’t need spaces after them. For an often-used program like this they’d skip the spaces. And the : symbol denoted the end of one command and start of another. So, for example, PRINTCHR$(20):IFN=CKSUMTHEN530 one learns means PRINT CHR$(20): IF N = CKSUM THEN 530.
So how does it work? MLX is, as a program, convoluted. It’s well-described by the old term “spaghetti code”. But the actual calculation of the checksum is done in a single line of the program, albeit one with several instructions. I’ll print it, but with some spaces added in to make it easier to read.
500 CKSUM = AD - INT(AD/256)*256:
FOR I = 1 TO 6:
CKSUM = (CKSUM + A(I))AND 255:
NEXT
Most of this you have a chance of understanding even if you don’t program. CKSUM is the checksum number. AD is the memory address for the start of the line. A is an array of six numbers, the six numbers of that line of machine language. I is an index, a number that ranges from 1 to 6 here. Each A(I) happens to be a number between 0 and 255 inclusive, because that’s the range of integers you can represent with eight bits.
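For readers who would rather see this in a modern language, here is a minimal Python sketch of what line 500 computes. The function name `mlx_checksum` and the sample entries are mine, made up for illustration; they are not from the magazine listing.

```python
def mlx_checksum(address, entries):
    """Mirror MLX line 500: start from the address, then fold in six entries."""
    if len(entries) != 6:
        raise ValueError("an MLX line carries exactly six entries")
    # CKSUM = AD - INT(AD/256)*256: the address's remainder after dividing by 256
    cksum = address - (address // 256) * 256
    for value in entries:
        # CKSUM = (CKSUM + A(I)) AND 255
        cksum = (cksum + value) & 255
    return cksum

# Hypothetical line: these six entries are invented for the example.
print(mlx_checksum(49152, [169, 2, 141, 0, 4, 96]))
```

MLX would compare this result against the seventh number typed on the line.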
What Did This Code Mean?
So to decipher all this. Starting off. CKSUM = AD - INT(AD/256)*256. INT means “calculate the largest integer not greater than whatever’s inside”. So, like, INT(50/256) would be 0; INT(300/256) would be 1; INT(600/256) would be 2. What we start with, then, is the checksum is “the remainder after dividing the line’s starting address by 256”. We’re familiar with this, mathematically, as “address modulo 256”.
In any modern programming language, we’d write this as CKSUM = MOD(AD, 256) or CKSUM = AD % 256. But Commodore 64 BASIC didn’t have a modulo command. This structure was the familiar and comfortable enough workaround. But, read on.
The next bit was a for/next loop. This would do the steps inside for every integer value of I, starting at 1 and increasing to 6. CKSUM + A(I) has an obvious enough intention. What is the AND 255 part doing, though?
AND, here, is a logic operator. For the Commodore 64, it works on numbers represented as two-byte integers. These have a memory representation of 11111111 11111111 for ‘true’, and 00000000 00000000 for ‘false’. The very leftmost bit, for integers, is a plus-or-minus-sign. If that leftmost bit is a 1, the number is negative; if that leftmost bit is a 0, the number is positive. Did you notice me palming that card, there? We’ll come back to that.
Ordinary whole numbers can be represented in binary too. Like, the number 26 has a binary representation of 00000000 00011010. The number, say, 14 has a binary representation of 00000000 00001110. 26 AND 14 is the number 00000000 00001010, the binary digit being a 1 only when both the first and second numbers have a 1 in that column. This bitwise and operation is also sometimes referred to as masking, as in masking tape. The zeroes in the binary digits of one number mask out the binary digits of the other. (Which does the masking is a matter of taste; 26 AND 14 is the same number as 14 AND 26.)
The binary 00000000 00001010 is the decimal number 10. So you can see that generally these bitwise and operations give you weird results. Taking the bitwise and for 255 is more predictable, though. The number 255 has a bit representation of 00000000 11111111. So what (CKSUM + A(I)) AND 255 does is … give the remainder after dividing (CKSUM + A(I)) by 256. That is, it’s (CKSUM + A(I)) modulo 256.
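The masking is easy to poke at in Python, whose `&` operator is a bitwise and. This is only a sketch; the helper name `low_byte` is my own. It shows 26 AND 14 in sixteen-bit binary, then checks that masking with 255 matches the remainder modulo 256 for every non-negative number a two-byte integer can hold.

```python
# 26 AND 14, shown as 16-bit binary strings
a, b = 26, 14
print(format(a, "016b"))
print(format(b, "016b"))
print(format(a & b, "016b"))

def low_byte(n):
    """Keep only the lowest eight bits, as AND 255 does."""
    return n & 255

# For non-negative integers, masking with 255 agrees with the remainder mod 256.
assert all(low_byte(n) == n % 256 for n in range(0, 32768))
```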
The formula’s not complicated. To write it in mathematical terms, the calculation is:
$\text{checksum} = \left(\text{address} + \sum_{i=1}^{6} \text{entry}_i\right) \bmod 256$
Why Write It Like That?
So we have a question. Why are we calculating a number modulo 256 by two different processes? And in the same line of the program?
We get an answer by looking at the binary representation of 49152, which is 11000000 00000000. Remember that card I just palmed? I had warned that if the leftmost digit there were a 1, the number was understood to be negative. 49152 is many things, none of them negative.
So now we know the reason behind the odd programming choice to do the same thing two different ways. As with many odd programming choices it amounts to technical details of how Commodore hardware worked. The Commodore 64’s logical operators — AND, OR, and NOT — work on variables stored as two-byte integers. Two-byte integers can represent numbers from -32,768 up to +32,767. But memory addresses on the Commodore 64 are indexed from 0 up to 65,535. We can’t use bit masking to do the modulo operation, not on memory locations.
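A small Python sketch of the constraint, with helper names of my own choosing. Python integers have no such limit, so the two-byte range is checked by hand here. The point is that the address does not fit in the range AND can handle, while the running checksum always does.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # range of a two-byte signed integer

def fits_in_two_byte_integer(n):
    """True if n is a value the logical operators could accept."""
    return INT16_MIN <= n <= INT16_MAX

def address_low_byte(address):
    """Remainder mod 256 using only division and INT-style flooring."""
    return address - (address // 256) * 256

print(fits_in_two_byte_integer(49152))          # the address itself
print(fits_in_two_byte_integer(255 + 6 * 255))  # the largest possible running sum
print(address_low_byte(49158))
```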
I have a second question, though. Look at the work inside the FOR loop. It takes the current value of the checksum, adds one of the entries to it, and takes the bitwise AND of that with 255. Why? The value would be the same if we waited until the loop was done to take the bitwise AND. At least, it would be unless the checksum grew to larger than 32,767. The checksum will be the sum of at most seven numbers, none of them larger than 255, though, so that can’t be the constraint. It’s usually faster to do as little inside a loop as possible, so, why this extravagance?
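The claim that the two orderings agree can be spot-checked. This Python sketch uses function names of my own and randomly generated lines rather than anything from a real listing. It compares masking after every addition with masking once at the end.

```python
import random

def masked_each_step(start, entries):
    """Apply AND 255 after every addition, as the magazine's line does."""
    total = start
    for value in entries:
        total = (total + value) & 255
    return total

def masked_once(start, entries):
    """Add everything first, then apply AND 255 a single time."""
    return (start + sum(entries)) & 255

rng = random.Random(64)  # fixed seed so the run is repeatable
for _ in range(10000):
    start = rng.randrange(256)
    entries = [rng.randrange(256) for _ in range(6)]
    assert masked_each_step(start, entries) == masked_once(start, entries)

# The largest sum the mask-once version ever has to hold.
print(255 + 6 * 255)
```

That largest sum is far below 32,767, which is why overflow can't be the reason.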
My first observation is that this FOR loop does the commands inside it six times. And logical operations like AND are very fast. The speed difference could not possibly be perceived. There is a point where optimizing your code is just making life harder for yourself.
My second observation goes back to the quirks of the Commodore 64. You entered commands, like the lines of a BASIC program, on a “logical line” that allowed up to eighty tokens. For typing in commands this is the same as the number of characters. Can this line be rewritten so there’s no redundant code inside the for loop, and so it’s all under 80 characters long?
Yes. This line would have the same effect and it’s only 78 characters:
500 CKSUM=AD-INT(AD/256)*256:FORI=1TO6:CKSUM=CKSUM+A(I):NEXT:CKSUM=CKSUMAND255
So why not write it that way? I don’t have a clear answer. I suspect it’s for the benefit of people typing in the MLX program. In typing that in I’d have trouble not putting in a space between FOR and I, or between CKSUM and AND. Also before and after the TO and before and after AND. This would make the line run over 80 characters and make it crash. The original line is 68 characters, short enough that anyone could add a space here and there and not mess up anything. In looking through MLX, and other programs, I find there are relatively few lines more than 70 characters long. I have found them as long as 76 characters, though. I can’t rule out there being 78- or 79-character lines. They would have to suppose anyone typing them in understands when the line is too long.
There’s an interesting bit of support for this. Compute! also published machine language programs for the Atari 400 and 800. A version of MLX came out for the Atari at the same time the Commodore 64’s came out. Atari BASIC allowed for 120 characters total. And the equivalent line in Atari MLX was:
500 CKSUM=ADDR-INT(ADDR/256)*256:FOR I=1 TO 6:CKSUM=CKSUM+A(I):CKSUM=CKSUM-256*(CKSUM>255):NEXT I
This has a longer name for the address variable. It uses a different way to ensure that CKSUM stays a number between 0 and 255. But the whole line is only 98 characters.
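Here is a Python sketch comparing the two ways of keeping the checksum in range. It assumes, as I understand Atari BASIC to work, that a true comparison such as (CKSUM>255) evaluates to 1 and a false one to 0. The function names are mine.

```python
def atari_step(cksum, value):
    """One pass of the Atari loop: add, then subtract 256 if we went past 255."""
    cksum = cksum + value
    # (CKSUM>255) is taken to be 1 when true, 0 when false (an assumption).
    over = 1 if cksum > 255 else 0
    return cksum - 256 * over

def commodore_step(cksum, value):
    """One pass of the Commodore loop: add, then AND with 255."""
    return (cksum + value) & 255

# Both inputs stay in 0..255, so the sum is at most 510 and one subtraction suffices.
agree = all(
    atari_step(c, v) == commodore_step(c, v)
    for c in range(256)
    for v in range(256)
)
print(agree)
```

The subtraction trick only works because the sum can overshoot by less than 256 on any one pass. The AND version does not need that guarantee.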
We could save more spaces on the Commodore 64 version, though. Commodore BASIC “really” used only the first two characters of a variable name. To write CKSUM is for the convenience of the programmer. To the computer it would be the same if we wrote CK. We could even truncate it to CK for this one line of code. The only penalty would be confusing the reader who doesn’t remember that CK and CKSUM are the same variable.
And there’s no reason that this couldn’t have been two lines. One line could add up the checksum and a second could do the bitwise AND. Maybe this is all a matter of the programmer’s tastes.
In a modern language this is all quite zippy to code. To write it in Octave or Matlab is something like:
This is a bit verbose. I want it to be easier to see what work is being done. We could make it this compact:
function [checksOut] = oldmlx(oneline)
  % oneline holds the address, the six entries, and then the printed checksum
  checksOut = ~(mod(sum(oneline(1:7))-oneline(8), 256));
end
I don’t like compressing my thinking quite that much, though.
But that’s the checksum. Now the question: did it work?
Was This Checksum Any Good?
Since Compute! and Compute!’s Gazette used it for years, the presumptive answer is that it did. The real question, then, is did it work well? “Well” means does it prevent the kinds of mistakes you’re likely to make without demanding too much extra work. We could, for example, eliminate nearly all errors by demanding every line be entered three times and accept only a number that’s entered the same at least two of three times. That’s an incredible typing load. Here? We have to enter one extra number for every six. Much lower load, but it allows more errors through. But the calculation is — effectively — simply “add together all the numbers we typed in, and see if that adds to the expected total”. If it stops the most likely errors, though, then it’s good. So let’s consider them.
The first and simplest error? Entering the wrong line. MLX advanced the memory location on its own. So if you intend to write the line for memory location 50268, and your eye slips and you start entering that for 50274 instead? Or even, reading left to right, going to line 50814 in the next column? Very easy to do. This checksum will detect that nicely, though. Entering one line too soon, or too late, will give a checksum that’s off by 6. If your eye skips two lines, the checksum will be off by 12. The only way to have the checksum miss this is to enter a line that’s some multiple of 256 memory locations away. And since each line is six memory locations, that means you have to jump 768 memory locations away. That is 128 lines away. You are not going to make that mistake. (Going from one column in the magazine to the next is a jump of 91 lines. The pages were 8½-by-11 pages, so were a bit easier to read than the image makes them look.)
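The 128-line figure can be confirmed with a few lines of Python. This is a sketch with my own function names. It looks only at how much the address part of the checksum shifts when you skip whole lines, holding the typed entries fixed.

```python
def checksum_shift(lines_skipped):
    """How far the address part of the checksum moves after skipping whole lines."""
    return (6 * lines_skipped) % 256

def first_undetected_skip():
    """Smallest positive number of skipped lines that leaves the checksum unchanged."""
    skipped = 1
    while checksum_shift(skipped) != 0:
        skipped += 1
    return skipped

print(checksum_shift(1), checksum_shift(2))  # one line off, two lines off
print(checksum_shift(91))                    # the column-to-column jump
print(first_undetected_skip())
```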
How about other errors? You could mis-key, say, 169. But think of the plausible errors. Typing it in as 159 or 196 or 269 would be detected by the checksum. The only one that wouldn’t would be to enter a number that’s equal to 169, modulo 256. So, 425, say, or 681. There is nobody so careless as to read 169 and accidentally type 425, though. In any case, other code in MLX rejects any data that’s not between 0 and 255, so that’s caught before the checksum comes into play.
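A brute-force check of that claim, as a Python sketch. The line used is hypothetical and the function names are mine. For each of the six positions it tries every wrong value from 0 to 255 and collects the ones the checksum would fail to notice.

```python
def line_checksum(address, entries):
    """Address plus the six entries, reduced modulo 256."""
    return (address + sum(entries)) % 256

def undetected_single_miskeys(address, entries, position):
    """Wrong values in 0..255 at one position that leave the checksum unchanged."""
    good = line_checksum(address, entries)
    misses = []
    for wrong in range(256):
        if wrong == entries[position]:
            continue
        trial = list(entries)
        trial[position] = wrong
        if line_checksum(address, trial) == good:
            misses.append(wrong)
    return misses

# Hypothetical line, invented for the example.
address, entries = 49152, [169, 2, 141, 0, 4, 96]
for position in range(6):
    print(position, undetected_single_miskeys(address, entries, position))
```

Every list comes back empty. Two different numbers in the range 0 to 255 can never differ by a multiple of 256.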
So it’s safe against the most obvious mistake. And against mis-keying a single entry. Yes, it’s possible that you typed in the whole line right but mis-keyed the checksum. If you did that you felt dumb but re-entered the line. If you even noticed and didn’t just accept the error report and start re-entering the line.
What about mis-keying double entries? And here we have trouble. Suppose that you’re supposed to enter 169, 062 and instead enter 159, 072. They’ll add to the same quantity, and the same checksum. All that’s protecting you is that it takes a bit of luck to make two errors that exactly balance each other. But, then, slipping and hitting an adjacent number on the keyboard is an easy mistake to make.
Worse is entry transposition. If you enter 062, 169 instead you have made no checksum errors. And you won’t even be typing any number “wrong”. At least with the mis-keying you might notice that 169 is a common number and 159 a rare one in machine language. (169 was the command “Load Accumulator”. That is, copy a number into the Central Processing Unit’s accumulator. This was one of three on-chip memory slots. 159 was no meaningful command. It would only appear as data.) Swapping two numbers is another easy error to make.
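Both failure modes are easy to demonstrate. In this Python sketch only the 169 and 062 come from the example above. The other four entries, the address, and the function name are made up.

```python
def line_checksum(address, entries):
    """Address plus the six entries, reduced modulo 256."""
    return (address + sum(entries)) % 256

# Hypothetical line; only 169 and 62 come from the example in the text.
address = 49152
correct = [169, 62, 141, 32, 208, 96]
balanced = [159, 72, 141, 32, 208, 96]  # two slips that cancel out
swapped = [62, 169, 141, 32, 208, 96]   # first two entries transposed

for label, typed in [("balanced slips", balanced), ("swapped entries", swapped)]:
    same = line_checksum(address, typed) == line_checksum(address, correct)
    print(label, "slip past the checksum:", same)
```

A plain sum does not care about the order of its terms, so no rearrangement of the six entries can ever change it.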
And they would happen. I can attest from experience. I’d had at least one program which, after typing, had one of these glitches. After all the time spent entering it, I ended up with a program that didn’t work. And I never had the heart to go back and track down the glitch or, more efficiently, retype the whole thing from scratch.
The irony is that the program with the critical typing errors was a machine language compiler. It’s something that would have let me write this sort of machine language code. Since I never reentered it, I never created anything but the most trivial of machine language programs for the 64.
So this MLX checksum was fair. It devoted one-seventh of the typing to error detection. It could catch line-swap errors, single-entry mis-keyings, and transpositions within one entry. It couldn’t catch transposing two entries. So that could have been better. I hope to address that soon.
And now, finally, the close of the All 2020 Mathematics A-to-Z. You may see this as coming in after the close of 2020. I say, well, I’ve done that before. Things that come close to the end of the year are prone to that.
The first important lesson was that I need to read exactly what topics I’ve written about before going ahead with a new week’s topic. I am not sorry, really, to have written about Tiling a second time. I’d rather it have been more than two years after the previous time. But I can make a little something out of that, too. I enjoy the second essay more. I don’t think that’s only because I like my most recent writing more. In the second version I looked at one of those fussy little specific questions. Particularly, what were the 20,426 tiles which Robert Berger found could create an aperiodic tiling? Tracking that down brought me to some fun new connections. And it let me write in a less foggy way. It’s always tempting to write the most generally true thing possible. But details and example cases are easier to understand. It’s surprising that no one in the history of knowledge has observed this difference before I did.
The second lesson was about work during a crisis. 2020 was the most stressful year of my life, a fact I hope remains true. I am aware that ritual, doing regular routine things, helps with stress. So a regular schedule of composing an essay on a mathematical topic was probably a good thing for me. Committing to the essay meant I had specific, attainable goals on clear, predictable deadlines. The catch is that I never got on top of the A-to-Z the way I hoped. My ideal for these is to have the essay written a week ahead of publication. Enough that I can sleep on it many times and amend it as needed. I never got close to this. I was running up close to deadline every week. If I were better managing all this I’d have gotten all November’s essays written before the election, and I didn’t, and that’s why I had to slip a week. I have always been a Sabbath-is-made-for-man sort, so don’t feel bad for slipping the week. But I would have liked to never have had a week when I was copy-editing a half-hour before publication.
It does all imply that I need to do what I resolve every year. Select topics sooner. Start research and drafts sooner. Let myself slip a deadline when that’s needed. But there is also the observation that apparently I can’t cut down the time I spend writing. The first several years of this, believe it or not, I wrote three essays a week for eight intense weeks. These would be six to eight hundred words each. Then I slacked off, doing two a week; these of course grew to a thousand, maybe 1200 words each. For 2020? One essay a week and more than one topped 2500 words. Yes, the traditional joke is that you write a lot because you don’t have the time to write briefly. But writing a lot takes time too.
They’re challenging. In the pandemic particularly, as I can’t rely on the university library for a quick biography to read. Or to check journals of mathematical history, although I haven’t resorted to such actual information yet. But I’m also aware that I am not a historian or a real biographer. I have to balance drawing conclusions I can feel confident are not wrong with making declarations that are interesting to read. Still, I enjoy a focus on the culture of mathematics, and how mathematics interacts with the broader culture. It’s a piece mathematicians tend not to acknowledge; our field’s reputation for objective truth is a compelling romantic legend.
I do plan to write an A-to-Z for 2021. I suspect I’ll do it as this year, one per week. I don’t know when I’ll start, although it should be earlier than June. I’ll want to give myself more possible slip dates without running off the year. I will not be writing about tiling again. I do realize that, since I have seven A-to-Z sequences of 26 essays each, I could in principle fill half a year with writing by reblogging each, one a day. I’m not sure the point of such an exercise, but it would at least fill the content hole.
There is a side of me that would like to have a blogging gimmick that doesn’t commit me to 26 essays. I’ve tried a couple; they haven’t ever caught like this has. Maybe I could do something small and focused, like, ten terms from complex analysis. I’m open to suggestions.
When will I resume covering mathematical themes in comic strips? I don’t know; it’s the obvious thing to do while I wait for the A-to-Z cycle to start anew. It’s got some of the A-to-Z thrill, of writing about topics someone else chose. But I need some time to relax and play and I don’t know when I’ll be back to regular work.
I’m very slightly sorry to bump other things. But folks who like the history of mathematics, and how it links to other things, and who also like listening to stuff, might want to know. Peter Adamson, host of the History Of Philosophy Without Any Gaps podcast, this week talked for about twenty minutes about Girolamo Cardano.
Cardano is famous in mathematics circles for early work in probability. And, more, for pioneering the use of imaginary numbers. This along the way to a fantastic controversy about credit, and discovery, and secrets, and self-promotion.
Cardano was, as Adamson notes, a polymath; his day job was as a physician and he poked around in the philosophy of mind. That’s what makes him a fit subject for Adamson’s project. So if you’d like a different perspective on a person known, if vaguely, to many mathematics folks, and have a spot of time, you might enjoy.
And a happy new year, at last, to all. I’ll take this chance first to look at my readership figures from December. Later I’ll look at the whole year, and what things I would learn from that if I were capable of learning from this self-examination.
I had 13 posts here in December, which is my lowest count since June. For the twelve months from December 2019 through November 2020, I’d posted a mean of 15.3 and a median of 15 posts. So that’s relatively quiet. My blog overall got 2,366 page views from 1,751 unique visitors. That’s a decline from October and November. But it’s still above the running averages, which had a mean of 1,957.8 and median of 1,974 page views. And a mean of 1,335.7 and median of 1,290.5 unique visitors.
There were 51 likes given to posts in December. That’s barely below the twelve-month running averages, which had a mean of 54.6 and a median of 52 likes. The number of comments collapsed to a mere 4 and while it’s been worse, it’s still dire. There were a mean of 15.3 and median of 15 comments through the twelve months before that.
If it’s disappointing to see numbers drop, and it is, there’s some evidence that it’s all my own fault. Even beyond that this is my blog and I’m the only one writing for it. That is in the per-posting statistics. There were 182.0 views per posting, which is well above the averages (132.0 mean, 132.6 median). It’s also near the averages in November (191.5) and October (169.1). Likes per posting were even better: 3.9, compared to a running average mean of 3.5 and running average median of 3.4. The per-posting likes had been 4.0 and 4.4 the previous months. Comments per posting — 0.3 — is still a dire number, though. The running-average mean was 1.1 per posting and median of 1.0 per posting.
It suggests that the best thing I can do for my statistics is post more. Most of December’s posts were little but links to even earlier posts. This feels like cheating to me, to do too often. On the other hand, I’ve had 1,580 posts over the past decade; why have that if I’m not going to reuse them? And, yes, it’s a bit staggering to imagine that I could repost one entry a day for four and a third years before I ran out. (Granting that a lot of those would be references to earlier posts. Or things like monthly statistics recaps that make not a lick of sense to repeat.)
What were popular posts from November or December 2020? It turns out the five most popular posts from that stretch were all December ones:
It feels weird that How Many Of This Weird Prime Are There? was so popular since that was posted the 30th of December. (And late, at that, as I didn’t schedule it right.) So in 30 hours it attracted more readers than posts that had all of November and December to collect readers. I guess there’s something about weird primes that people want to read about. Although not to comment on with their answers to the third prime of the form … well, maybe they’re leaving it for other people to find, unspoiled. I also always find it weird that these How-A-Month-Treated-My-Blog posts are so popular. I think other insecure bloggers like to see someone else suffering.
According to WordPress I published 7,758 words in December. This is only my fourth-most-laconic month in 2020. This put me also at an average of 596.8 words per posting in December. My average for all 2020 was 672 words per posting, so all those recaps were in theory saving me time.
Also according to WordPress, I started January 2021 with a total of 1,581 posts ever. (There’s one secret post, created to test some things out; there’s no sense revealing or deleting it.) These have drawn a total 122,051 views from 69,848 logged unique visitors. It’s not a bad record for a blog entering its tenth year of publication without ever getting a clear identity.
My Twitter account has gone feral. While it’s still posting announcements, I don’t read it, because I don’t have the energy to figure out why it sometimes won’t load. If you want to social-media thing with me try me on the Mastodon account @firstname.lastname@example.org. Mathstodon is a mathematics-themed instance of that microblogging network you remember hearing something about somewhere but not what anybody said about it.
And, yeah, I hope to have my closing thoughts about the 2020 A-To-Z later this week. Thank you all for reading.
A friend made me aware of a neat little unsolved problem in number theory. I know it seems like number theory is nothing but unsolved problems, but this is an unfair reputation. There are as many as four solved problems in number theory. It’s a tough field.
The question started with the observation that 11 is a prime number. And so is 101. But 1,001 is not; nor is 10,001. How many prime numbers are there that have the form $10^n + 1$, for whole-number values of n? Are there infinitely many? Finitely many? If there’s finitely many, how many are there?
It turns out this is an open question. We know of three prime numbers that you can write as $10^n + 1$. I’ll leave the third for you to find.
One neat bit is that if there are more prime numbers, they have to be ones where n is itself a whole power of 2. That is, where the number is $10^{2^k} + 1$ for some whole number k. They’ve been tested up to at least, so this subset of the Generalized Fermat Numbers seems to be rare. But wouldn’t it be just our luck if from onward they were nothing but primes?
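The reason n has to be a power of 2 can be sketched in Python. I am taking the numbers in question to be 10^n + 1, the pattern behind 11, 101, 1,001, and 10,001. If n has an odd factor q bigger than 1, write n = m·q. Then x^q + 1 is always divisible by x + 1 when q is odd, and with x = 10^m that hands us a factor. The function names are my own.

```python
def odd_part(n):
    """Largest odd divisor of a positive whole number n."""
    while n % 2 == 0:
        n //= 2
    return n

def known_factor(n):
    """A proper divisor of 10**n + 1 when n has an odd factor above 1, else None."""
    q = odd_part(n)
    if q == 1:
        return None  # n is a power of 2; this argument says nothing either way
    m = n // q
    return 10**m + 1

for n in range(1, 13):
    factor = known_factor(n)
    if factor is not None:
        # Confirm the divisor really does divide 10**n + 1.
        assert (10**n + 1) % factor == 0
    print(n, factor)
```

This only rules exponents out. It says nothing about whether a given power of 2 gives a prime.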
Folks who’ve been with me a long while know one of my happy Christmastime traditions is watching the Aardman Animation film Arthur Christmas. The film also gave me a great mathematical-physics question. You might consider some questions it raises.
First: Could `Arthur Christmas’ Happen In Real Life? There’s a spot in the movie when Arthur and Grand-Santa are stranded on a Caribbean island while the reindeer and sleigh, without them, go flying off in a straight line. What does a straight line on the surface of the Earth mean?
Second: Returning To Arthur Christmas. From here spoilers creep in and I have to discuss, among other things, what kind of straight line the reindeer might move in. There is no one “right” answer.
Third: Arthur Christmas And The Least Common Multiple. If we suppose the reindeer move in a straight line the way satellites move in a straight line, we can calculate how long Arthur and Grand-Santa would need to wait before the reindeer and sled are back if they’re lucky enough to be waiting on the equator.
Fourth: Six Minutes Off. Waiting for the reindeer to get back becomes much harder if Arthur and Grand-Santa are not on the equator. This has potential dangers for saving the day.
Fifth and last: Arthur Christmas and the End of Time. We get to the thing that every mathematical physics blogger really really wants to get into. This is the paradox that conservation of energy and the fact of entropy seem to force us into some weird conclusions, if the universe can get old enough. Maybe; there’s some extra considerations, though, that can change the conclusion.
I am happy, as ever, to complete an A-to-Z. Also to take some time to recover after the project. I had thought that spreading things out to 26 weeks would make them less stressful, and instead, I just wrote even longer pieces, in compensation. I’ll try to have other good observations in an essay next week.
For now, though, a piece that I will find useful for years to come: a roster of what essays I wrote this year. In future years, I may even check them before writing a third piece about tiling.
Jacob Siehler had several suggestions for this last of the A-to-Z essays for 2020. Zorn’s Lemma was an obvious choice. It’s got an important place in set theory, it’s got some neat and weird implications. It’s got a great name. The zero divisor is one of those technical things mathematics majors have to deal with. It never gets any pop-mathematics attention. I picked the less-travelled road and found a delightful scenic spot.
3 times 4 is 12. That’s a clear, unambiguous, and easily-agreed-upon arithmetic statement. The thing to wonder is what kind of mathematics it takes to mess that up. The answer is algebra. Not the high school kind, with x’s and quadratic formulas and all. The college kind, with group theory and rings.
A ring is a mathematical construct that lets you do a bit of arithmetic. Something that looks like arithmetic, anyway. It has a set of elements. (An element is just a thing in a set. We say “element” because it feels weird to call it “thing” all the time.) The ring has an addition operation. The ring has a multiplication operation. Addition has an identity element, something you can add to any element without changing the original element. We can call that ‘0’. The integers, or to use the lingo $\mathbb{Z}$, are a ring (among other things).
Among the rings you learn, after the integers, is the integers modulo … something. This can be modulo any counting number. The integers modulo 10, for example, we write as $\mathbb{Z}_{10}$ for short. There are different ways to think of what this means. The one convenient for this essay is that it’s the integers 0, 1, 2, up through 9. And that the result of any calculation is “how much more than a whole multiple of 10 this calculation would otherwise be”. So then 3 times 4 is now 2. 3 times 5 is 5; 3 times 6 is 8. 3 times 7 is 1, and doesn’t that seem peculiar? That’s part of how modulo arithmetic warns us that groups and rings can be quite strange things.
We can do modulo arithmetic with any of the counting numbers. Look, for example, at $\mathbb{Z}_5$ instead. In the integers modulo 5, 3 times 4 is … 2. This doesn’t seem to get us anything new. How about $\mathbb{Z}_8$? In this, 3 times 4 is 4. That’s interesting. It doesn’t make 3 the multiplicative identity for this ring. 3 times 3 is 1, for example. But you’d never see something like that for regular arithmetic.
How about $\mathbb{Z}_{12}$? Now we have 3 times 4 equalling 0. And that’s a dramatic break from how regular numbers work. One thing we know about regular numbers is that if a times b is 0, then either a is 0, or b is zero, or they’re both 0. We rely on this so much in high school algebra. It’s what lets us pick out roots of polynomials. Now? Now we can’t count on that.
When this does happen, when one thing times another equals zero, we have “zero divisors”. These are anything in your ring that can multiply by something else to give 0. Is zero, the additive identity, always a zero divisor? … That depends on what the textbook you first learned algebra from said. To avoid ambiguity, you can write a “nonzero zero divisor”. This clarifies your intentions and slows down your copy editing every time you read “nonzero zero”. Or call it a “nontrivial zero divisor” or “proper zero divisor” instead. My preference is to accept 0 as always being a zero divisor. We can disagree on this. What of zero divisors other than zero?
Your ring might or might not have them. It depends on the ring. The ring of integers $\mathbb{Z}$, for example, doesn’t have any zero divisors except for 0. The ring of integers modulo 12, $\mathbb{Z}_{12}$, though? Anything that isn’t relatively prime to 12 is a zero divisor. So, 2, 3, 4, 6, 8, 9, and 10 are zero divisors here. The ring of integers modulo 13, $\mathbb{Z}_{13}$? That doesn’t have any zero divisors, other than zero itself. In fact any ring of integers modulo a prime number, $\mathbb{Z}_p$, lacks zero divisors besides 0.
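For the integers modulo n the zero divisors can be found by plain search. This Python sketch uses a function name of my own. It lists the nonzero ones and checks the result against the “not relatively prime to n” rule.

```python
from math import gcd

def nonzero_zero_divisors(n):
    """Nonzero a in the integers mod n with a*b = 0 (mod n) for some nonzero b."""
    return [
        a
        for a in range(1, n)
        if any((a * b) % n == 0 for b in range(1, n))
    ]

for n in (12, 13):
    found = nonzero_zero_divisors(n)
    # Cross-check: these should be exactly the nonzero elements sharing a factor with n.
    assert found == [a for a in range(1, n) if gcd(a, n) > 1]
    print(n, found)
```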
Focusing too much on integers modulo something makes zero divisors sound like some curious shadow of prime numbers. There are some similarities. Whether a number is prime depends on your multiplication rule and what set of things it’s in. Being a zero divisor in one ring doesn’t directly relate to whether something’s a zero divisor in any other. Knowing what the zero divisors are tells you something about the structure of the ring.
It’s hard to resist focusing on integers-modulo-something when learning rings. They work very much like regular arithmetic does. Even the strange thing about them, that every result is from a finite set of digits, isn’t too alien. We do something quite like it when we observe that three hours after 10:00 is 1:00. But many sets of elements can create rings. Square matrixes are the obvious extension. Matrixes are grids of elements, each of which … well, they’re most often going to be numbers. Maybe integers, or real numbers, or complex numbers. They can be more abstract things, like rotations or whatnot, but they’re hard to typeset. It’s easy to find zero divisors in matrixes of numbers. Imagine, like, a matrix that’s all zeroes except for one element, somewhere. There are a lot of matrices which, multiplied by that, will be a zero matrix, one with nothing but zeroes in it. Another common kind of ring is the polynomials. For these you need some constraint like the polynomial coefficients being integers-modulo-something. You can make that work.
In 1988 Istvan Beck tried to establish a link between graph theory and ring theory. We now have a usable standard definition of one. If $R$ is any ring, then $\Gamma(R)$ is the zero-divisor graph of $R$. (I know some of you think $R$ is the real numbers. No; that’s a bold-faced $\mathbb{R}$ instead. Unless that’s too much bother to typeset.) You make the graph by putting in a vertex for the elements in $R$. You connect two vertices a and b if the product of the corresponding elements is zero. That is, if they’re zero divisors for one another. (In Beck’s original form, this included all the elements. In modern use, we don’t bother including the elements that are not zero divisors.)
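To make the construction concrete, here is a Python sketch that builds the zero-divisor graph for the integers modulo n as a dictionary of neighbors. The function name is mine. I am also assuming the modern convention leaves out 0 along with the non-zero-divisors, and that we do not draw an edge from a vertex to itself.

```python
def zero_divisor_graph(n):
    """Adjacency sets for the zero-divisor graph of the integers mod n.

    Assumptions: vertices are the nonzero zero divisors only, and an element
    is never joined to itself even if its square is zero.
    """
    vertices = [
        a for a in range(1, n)
        if any((a * b) % n == 0 for b in range(1, n))
    ]
    graph = {a: set() for a in vertices}
    for a in vertices:
        for b in vertices:
            if a != b and (a * b) % n == 0:
                graph[a].add(b)
    return graph

for vertex, neighbors in zero_divisor_graph(12).items():
    print(vertex, sorted(neighbors))
```

From a structure like this the graph-theory questions, such as distances, cycles, or colorings, become ordinary computations.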
Drawing this graph makes tools from graph theory available to study rings. We can measure things like the distance between elements, or what paths from one vertex to another exist. What cycles — paths that start and end at the same vertex — exist, and how large they are. Whether the graphs are bipartite. A bipartite graph is one where you can divide the vertices into two sets, and every edge connects one thing in the first set with one thing in the second. What the chromatic number — the minimum number of colors it takes to make sure no two adjacent vertices have the same color — is. What shape does the graph have?
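To make that concrete, here is a rough Python sketch that builds the zero-divisor graph for the integers modulo n and asks two of those graph-theory questions of it. I’m assuming the modern form, with only the nonzero zero divisors as vertices and no loops joining an element to itself. The function names are made up for this example.

```python
from collections import deque


def zero_divisor_graph(n):
    """Adjacency sets for the zero-divisor graph of the integers modulo n.

    Vertices are the nonzero zero divisors. Distinct a and b are joined
    when a*b = 0 (mod n). Loops (a*a = 0) are left out.
    """
    verts = [a for a in range(1, n)
             if any((a * b) % n == 0 for b in range(1, n))]
    return {a: {b for b in verts if b != a and (a * b) % n == 0}
            for a in verts}


def distances_from(graph, start):
    """Breadth-first search: number of edges from start to each reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def is_bipartite(graph):
    """Try to two-color the graph; fail if an edge joins same-colored vertices."""
    color = {}
    for s in graph:
        if s in color:
            continue
        color[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in color:
                    color[w] = 1 - color[v]
                    queue.append(w)
                elif color[w] == color[v]:
                    return False
    return True


g6 = zero_divisor_graph(6)
print(g6)                                  # vertices 2, 3, 4 in a path
print(distances_from(g6, 2))               # how far 2 is from the others
print(is_bipartite(g6))
print(is_bipartite(zero_divisor_graph(16)))  # 4, 8, 12 form a triangle
```

Modulo 16 the elements 4, 8, and 12 each multiply with the others to give zero, which makes a cycle of length three, and a graph with an odd cycle can’t be bipartite.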
And this lets me complete a cycle in this year’s A-to-Z, to my delight. There is an important question in topology which group theory could answer. It’s a generalization of the zero-divisors conjecture, a hypothesis about what fits in a ring based on certain types of groups. This hypothesis is, actually, these hypotheses. There are a bunch of similar questions about what the invariants called the $L^2$-Betti numbers can be. These we call the Atiyah Conjecture. This because of work Michael Atiyah did in the cohomology of manifolds starting in the 1970s. It’s work, I admit, I don’t understand well enough to summarize, and hope you’ll forgive me for that. I’m still amazed that one can get to cutting-edge mathematics research from this. It seems, at its introduction, to be only a subversion of how we find x for which a product of factors is zero.
And for the last of this year’s (planned) exhumations from my archives? It’s a piece from summer 2017: Zeta Function. As will happen in mathematics, there are many zeta functions. But there’s also one special one that people find endlessly interesting, and that’s what we mean if we say “the zeta function”. It, of course, goes back to Bernhard Riemann.
Also a cute note I saw going around. If you cut off the century years then today’s date (the 16th day of the 12th month of the 20th year of the century) gives you a rare Pythagorean triplet: $16^2 + 12^2 = 20^2$. And after a moment we notice that’s the famous 3-4-5 Pythagorean triplet all over again. If you miss it, well, that’s all right. There’ll be another along in July of 2025, and one after that in October of 2026.
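If you’d like to hunt for these dates yourself, here’s a small Python sketch. It assumes the rule is that the day squared plus the month squared equals the two-digit year squared, which is how the 16-12-20 example works. The function name is my own.

```python
from datetime import date, timedelta


def pythagorean_dates(start, end):
    """Yield dates where day^2 + month^2 equals (two-digit year)^2."""
    d = start
    while d <= end:
        yy = d.year % 100
        if d.day ** 2 + d.month ** 2 == yy ** 2:
            yield d
        d += timedelta(days=1)


hits = list(pythagorean_dates(date(2020, 12, 16), date(2026, 12, 31)))
for d in hits:
    print(d.isoformat())
```

If you let the three numbers go in any order, so that the year needn’t be the hypotenuse, a few more dates qualify.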
To dig something out of my archives today, I offer the Zermelo-Fraenkel Axioms. This wrapped up the End 2016 A-to-Z. On the last day of 2016, I see; I didn’t realize I was cutting things that close that year. These are fundamentals of set theory, which is the study of what you can include and what you exclude from a set of things. For a while in the 20th century this looked likely to be the foundation of mathematics, from which everything else could be derived. We’ve moved on now to thinking that category theory is more likely the core. But set theory remains a really good foundation. You can understand a lot of what’s interesting about it without needing more than a child’s ability to make marks on paper and draw circles around some of them. Or, like my essays insist on doing, without even doing the drawings that would make it all easier to follow.
Nobody had particular suggestions for the letter ‘Y’ this time around. It’s a tough letter to find mathematical terms for. It doesn’t even lend itself to typography or wordplay the way ‘X’ does. So I chose to do one more biographical piece before the series concludes. There were twists along the way in writing.
Several problems beset me in writing about this significant 13th-century Chinese mathematician. One is my ignorance of the Chinese mathematical tradition. I have little to guide me in choosing what tertiary sources to trust. Another is that the tertiary sources know little about him. The Complete Dictionary of Scientific Biography gives a dire verdict. “Nothing is known about the life of Yang Hui, except that he produced mathematical writings”. MacTutor’s biography gives his lifespan as from circa 1238 to circa 1298, on what basis I do not know. He seems to have been born in what’s now Hangzhou, near Shanghai. He seems to have worked as a civil servant. This is what I would have imagined; most scholars then were. It’s the sort of job that gives one time to write mathematics. Also he seems not to have been a prominent civil servant; he’s apparently not listed in any dynastic records. After that, we need to speculate.
E F Robertson, writing the MacTutor biography, speculates that Yang Hui was a teacher. That he was writing to explain mathematics in interesting and helpful ways. I’m not qualified to judge Robertson’s conclusions. And Robertson notes that’s not inconsistent with Yang being a civil servant. Robertson’s argument is based on Yang’s surviving writings, and what they say about the demonstrated problems. There is, for example, 1274’s Cheng Chu Tong Bian Ben Mo. Robertson translates that title as Alpha and omega of variations on multiplication and division. I try to work out my unease at having something translated from Chinese as “Alpha and Omega”. That is my issue. Relevant here is that a syllabus prefaces the first chapter. It provides a schedule and series of topics, as well as a rationale for why this plan.
Was Yang Hui a discoverer of significant new mathematics? Or did he “merely” present what was already known in a useful way? This is not to dismiss him; we have the same questions about Euclid. He is held up as among the great Chinese mathematicians of the 13th century, a particularly fruitful time and place for mathematics. How much greatness to assign to original work and how much to good exposition is unanswerable with what we know now.
Consider for example the thing I’ve featured before, Yang Hui’s Triangle. It’s the arrangement of numbers known in the west as Pascal’s Triangle. Yang provides the earliest extant description of the triangle and how to form it and use it. This in the 1261 Xiangjie jiuzhang suanfa (Detailed analysis of the mathematical rules in the Nine Chapters and their reclassifications). But in it, Yang Hui says he learned the triangle from a treatise by Jia Xian, Huangdi Jiuzhang Suanjing Xicao (The Yellow Emperor’s detailed solutions to the Nine Chapters on the Mathematical Art). Jia Xian lived in the 11th century; he’s known to have written two books, both lost. Yang Hui’s commentary gives us a fair idea what Jia Xian wrote about. But we’re limited in judging what was Jia Xian’s idea and what was Yang Hui’s inference or what.
The Nine Chapters referred to is Jiuzhang suanshu. An English title is Nine Chapters on the Mathematical Art. The book is a 246-problem handbook of mathematics that dates back to antiquity. It’s impossible to say when the Nine Chapters was first written. Liu Hui, who wrote a commentary on the Nine Chapters in 263 CE, thought it predated the Qin ruler Shih Huang Ti’s 213 BCE destruction of all books. But the book, and the many commentaries on the book, served as a centerpiece for Chinese mathematics for a long while. Jia Xian’s and Yang Hui’s work was part of this tradition.
Yang Hui’s Detailed Analysis covers the Nine Chapters. It goes on for three chapters, more about geometry and fundamentals of mathematics. Even how to classify the problems. He had further works. In 1275 Yang published Practical mathematical rules for surveying and Continuation of ancient mathematical methods for elucidating strange properties of numbers. (I’m not confident in my ability to give the Chinese titles for these.) The first title particularly echoes how in the Western tradition geometry was born of practical concerns.
The breadth of topics covers, it seems to me, a decent modern (American) high school mathematics education. The triangle, and the binomial expansions it gives us, fit that. Yang writes about more efficient ways to multiply on the abacus. He writes about finding simultaneous solutions to sets of equations. And through a technique that amounts to finding the matrix of coefficients for the equations, and its determinant. He writes about finding the roots for cubic and quartic equations. The technique is commonly known in the west as Horner’s Method, a technique of calculating divided differences. We see the calculating of areas and volumes for regular shapes.
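And sequences. He found the sum of the squares of natural numbers followed a rule:
$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{1}{3} \cdot n \cdot (n + 1) \cdot \left(n + \frac{1}{2}\right)$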
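A quick check of the sum-of-squares rule, as a Python sketch. I’m assuming the rule in the form one-third of n times (n + 1) times (n + 1/2), which is the same thing as the familiar n(n + 1)(2n + 1)/6. The function names are only for illustration.

```python
from fractions import Fraction


def sum_of_squares_direct(n):
    """Add up 1^2 + 2^2 + ... + n^2 the slow way."""
    return sum(k * k for k in range(1, n + 1))


def sum_of_squares_rule(n):
    """(1/3) * n * (n + 1) * (n + 1/2), in exact rational arithmetic."""
    return Fraction(1, 3) * n * (n + 1) * (n + Fraction(1, 2))


# Compare the two for the first couple hundred natural numbers.
agree = all(sum_of_squares_direct(n) == sum_of_squares_rule(n)
            for n in range(1, 201))
print(agree)
print(sum_of_squares_rule(10))
```

Using `Fraction` keeps the half and the third exact, so nothing here depends on floating-point rounding.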
And then there’s magic squares, and magic circles. He seems to have found them, as professional mathematicians today would, good ways to interest people in calculation. Not magic; he called them something like number diagrams. But he gives magic squares from three-by-three all the way to ten-by-ten. We don’t know of earlier examples of Chinese mathematicians writing about the larger magic squares. But Yang Hui doesn’t claim to be presenting new work. He also gives magic circles. The simplest is a web of seven intersecting circles, each with four numbers along the circle and one at its center. The sum of the center and the circumference numbers is 65 for all seven circles. Is this significant? No; merely fun.
Grant this breadth of work. Is he significant? I learned this year that familiar names might have been obscure until quite recently. The record is once again ambiguous. Other mathematicians wrote about Yang Hui’s work in the early 1300s. Yang Hui’s works were printed in China in 1378, says the Complete Dictionary of Scientific Biography, and reprinted in Korea in 1433. They’re listed in a 1441 catalogue of the Ming Imperial Library. Seki Takakazu, a towering figure in 17th century Japanese mathematics, copied the Korean text by hand. Yet Yang Hui’s work seems to have been lost by the 18th century. Reconstructions, from commentaries and encyclopedias, started in the 19th century. But we don’t have everything we know he wrote. We don’t even have a complete text of Detailed Analysis. This is not to say he wasn’t influential. All I could say is there seems to have been a time his influence was indirect.
I will be late with this week’s A-to-Z essay. I’ve had more demands on my time and my ability to organize thoughts than I could manage and something had to yield. I’m sorry for that but figure to post on Friday something for the letter ‘Y’.
But there is some exciting news in one of my regular Reading the Comics features. It’s about the kid who shows up often in Mark Anderson’s Andertoons. At the nomination of — I want to say Ray Kassinger? — I’ve been calling him “Wavehead”. Last week, though, the strip gave his name. I don’t know if this is the first time we’ve seen it. It is the first time I’ve noticed. He turns out to be Tommy.
And what about my Reading the Comics posts, which have been on suspension since the 2020 A-to-Z started? I’m not sure. I figure to resume them after the new year. I don’t know that it’ll be quite the same, though. A lot of mathematics mentions in comic strips are about the same couple themes. It is exhausting to write about the same thing every time. But I have, I trust, a rotating readership. Someone may not know, or know how to find, a decent 200-word piece about lotteries published four months in the past. I need to better balance not repeating myself.
Also a factor is lightening my overhead. Most of my strips come from Comics Kingdom or GoComics. Both of them also cull strips from their archives occasionally, leaving me with dead links. (GoComics particularly is dropping a lot of strips by the end of 2020. I understand them dumping, say, The Sunshine Club, which has been in reruns since 2007. But Dave Kellett’s Sheldon?)
The only way to make sure a strip I write about remains visible to my readers is to include it here. But to make my including the strip fair use requires that I offer meaningful commentary. I have to write something substantial, and something that’s worsened without the strip to look at. You see how this builds to a workload spiral, especially for strips where all there is to say is it’s a funny story problem. (If any cartoonists are up for me being another, unofficial archive for their mathematics-themed strips? Drop me a comment, Bill Amend, we can work something out if it doesn’t involve me sending more money than I’m taking in.)
So I don’t know how I’ll resolve all this. Key will be remembering that I can just not do the stuff I find tedious here. I will not, in fact, remember that.
I have many flaws as a pop mathematics blogger. Less depth of knowledge than I should have, for example. A tendency to start a series before I have a clear ending, so that projects will peter out rather than resolve. A-to-Z’s are different, as they have a clear direction and ending. And a frightful cultural bias, too. I’m terribly weak on mathematics outside the western tradition. Yang Hui’s Triangle, an essay I wrote about in the End 2020 A-to-Z, is a slight correction to that. I grew up learning this under a different name, that of a Western mathematician who studied the thing centuries after Yang Hui did. But then Yang Hui credited an earlier-yet mathematician, Jia Xian, for the insight. It’s difficult to get anything in mathematics named for the “correct” person.
I am again looking at the past month’s readership figures. And I’m again doing this in what I mean to be a lower-key form. November was a relatively laconic month for me, at least by A-to-Z standards.
I had only 15 posts in November, not many more than would be in a normal month. The majority of posts were pointers to earlier posts yet. It doesn’t seem to have hurt my readership, though. WordPress says there were 2,873 pages viewed in November, for an average of 191.5 views per posting. This is a good bit above the twelve-month running average leading up to November. That average was a mere 1,912.8 views for a month and 81.6 views per posting. This is because that anomalously high October 2019 figure has passed out of the twelve-month range. There were 2,067 unique visitors logged, for 137.8 unique visitors per posting. The twelve-month running average was 1,294.1 unique visitors for the month, and 81.6 unique visitors per posting. So that’s suggestive of readership growth over the past year.
The things that signal engaged readers were more ambiguous, as they always are. There were 60 things liked in November, or an average of 4.0 likes per posting. The twelve-month running average had 57.5 likes for a month, and 3.5 likes per posting. There were 11 comments given over the month, an average of 0.7 per posting. And that is below the twelve-month running average of 17.2 for a month and 1.1 comments per posting. I did have an appeal for topics for the A-to-Z, which usually draws comments. But they were for unappealing letters like W and X and it takes some inspiration to think of good mathematics terms for that part of the alphabet.
I like to look over the most popular postings I’ve had but every month it’s either trapezoids or record grooves. I did start limiting my listing to the most popular things posted in the two prior months, so new stuff has a chance at appearing. I make it the two prior months so that things which appeared at the end of a month might show up. And then that got messed up. The most popular recent post was from the end of September: Playful Math Education Blog Carnival 141. It’s a collection of recreational or education-related mathematics you might like. I’m not going to ignore that just because it published three days before October started.
November’s most popular things posted in October or November were:
I have no idea why these post reviews are always popular. I think people might see there’s a list or two in the middle and figure that must be a worthwhile essay. Someday I’ll put up some test essays that are complete nonsense, one with a list and one without, and see how they compare. Of course, now you know the trick and won’t fall for it.
If WordPress’s numbers are right, in November I published 7,304 words, barely more than half of October’s total. It was my tersest month since January. Per post it was even more dramatic: a mere 486.9 words per posting in November, my lowest of the year, to date. My average words per posting, for 2020, dropped to 678.
As of the start of December I’ve had 1,568 total postings here. They’ve gathered 119,685 page views from a logged 68,097 unique visitors.
If you’d like to follow on WordPress, you can add this to your Reading page by clicking the “Follow Nebusresearch” button on the page.
My essays are announced on Twitter as @nebusj. Don’t try to talk with me there. The account’s gone feral. There’s an automated publicity thing on WordPress that posts to it, and is the only way I have to reliably post there. If you want to social-media talk with me look to the mathematics-themed Mathstodon and my account @email@example.com. Or you can leave a comment. Dad, you can also e-mail me. You know the address. The rest of you don’t know, but I bet you could guess it. Not the obvious first guess, though. Around your fourth or fifth guess would get it. I know that changes what your guesses would be.
Thank you all for reading. Have fun with that logic problem.
When developing general relativity, Albert Einstein created a convention. He’s not unique in that. All mathematicians create conventions. They use shorthand for an idea that’s complicated or common. Relatively unique is that other people adopted his convention, because it expressed an idea compactly. This was in working with tensors, which look somewhat like matrixes and have a lot of indexes. In the equations of general relativity you need to take sums over many combinations of values of these indexes. What indexes there are are the same in most every problem. The possible values of the indexes are constant, problem to problem, too.
So Einstein saved himself writing, and his publishers typesetting, a lot of redundant notation. This by leaving unwritten the conditions which said “take the sums over these indexes on this range”, and letting the pattern of indexes imply them. This is good for people doing general relativity, and certain kinds of geometry. It’s a problem only when an expression escapes its context. When it’s shown to a student or someone who doesn’t know this is a differential-geometry problem. Then the problem becomes confusing, and they can’t work on it.
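For a sense of what gets saved, here’s a sketch in LaTeX. The line element is my choice of example, and I’m assuming indexes that run from 0 to 3, as they usually do in general relativity.

```latex
% Written out in full, with the sums made explicit:
ds^2 = \sum_{\mu = 0}^{3} \sum_{\nu = 0}^{3} g_{\mu\nu} \, dx^{\mu} \, dx^{\nu}

% Under the summation convention, an index repeated once up and once
% down is understood to be summed over its whole range:
ds^2 = g_{\mu\nu} \, dx^{\mu} \, dx^{\nu}
```

Both lines mean the same sixteen-term sum. Someone who doesn’t know the convention would read the second as a single product.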
This is not to fault the Einstein Summation Convention. It puts common necessary scaffolding out of the way and highlights the interesting unique parts of a problem. Most conventions aim for that. We have the hazard, though, that we may not notice something breaking the convention.
And this is how we create extraneous solutions. And, as a bonus, how we come to have missing solutions. We encounter them with the start of (high school) algebra, when we get used to manipulating equations. When we solve an equation what we always want is something clear, like
$x = 2$
But it never starts that way. It always starts with something like
or worse. We learn how to handle this. We know that we can do six things that do not alter the truth of an equation. We can regroup terms in the equation. We can add the same number to both sides of the equation. We can multiply both sides of the equation by some number besides zero. We can add zero to one side of the equation. We can multiply one side of the equation by 1. We can replace one quantity with another that has the same value. That doesn’t sound like a lot. It covers more than it seems. Multiplying by 1, for example, is the same as multiplying by $\frac{x}{x}$. If x isn’t zero, then we can multiply both sides of the equation by that x. And x can’t be zero, or else $\frac{x}{x}$ would not be 1.
So with my example there, start off by multiplying the right side by 1, in the guise $\frac{x}{x}$. Then multiply both sides by that same non-zero x. At this point the right-hand side simplifies to being 6. Add a -6 to both sides. And then with a lot of shuffling around you work out that the equation is the same as
And that can only be true when x equals 2.
It should be easy to catch spurious solutions creeping in. They must result from breaking a rule. The obvious problem is multiplying, or dividing, by zero. We expect those to be trouble. Wikipedia has a fine example:
$\frac{1}{x - 2} = \frac{3}{x + 2} - \frac{6x}{(x - 2)(x + 2)}$
The obvious step is to multiply this whole mess by $(x - 2)(x + 2)$, which turns our work into a linear equation. Very soon we find the solution must be $x = -2$. Which would make at least two of the denominators in the original equation zero. We know not to want that.
The problems can be subtler, though. Consider:
$\sqrt{x} = x - 12$
That’s not hard to solve. Multiply both sides by $\sqrt{x}$. Although, before working out $(x - 12) \cdot \sqrt{x}$, substitute that $\sqrt{x}$ with something equal to it. We know one thing is equal to it, $x - 12$. Then we have
$x = (x - 12)^2$
It’s a quadratic equation. A little bit of work shows the roots are 9 and 16. One of those answers is correct and the other spurious. At no point did we divide anything, by zero or anything else.
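Checking which root is the real one takes only a few lines. This Python sketch assumes the equation under discussion is that the square root of x equals x minus 12, which is what the roots 9 and 16 and the arithmetic further on point to. The function name is mine.

```python
import math


def satisfies_original(x):
    """True when sqrt(x) = x - 12 holds for this real number x."""
    return x >= 0 and math.isclose(math.sqrt(x), x - 12)


# Squaring gives x = (x - 12)^2, which is x^2 - 25x + 144 = 0.
a, b, c = 1, -25, 144
disc = math.sqrt(b * b - 4 * a * c)
candidates = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])

for x in candidates:
    print(x, satisfies_original(x))
```

Substituting each candidate back into the equation we started with, rather than the one we ended with, is what exposes the impostor.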
So what is happening and what is the necessary rhetorical link to the Einstein Summation Convention?
There are many ways to look at equations. One that’s common is to look at them as functions. This is so common that we’ll elide between an equation and a function representation. This confuses the prealgebra student who wants to know why sometimes we look at
and sometimes we look at
and sometimes at
The advantage of looking at the function which shadows any equation is we have different tools for studying functions. Sometimes that makes solving the equation easier. In this form, we’re looking for what in the domain matches with something particular in the range.
And now we’ve reached the convention. When we write down something like we’re implicitly defining a function. A function has three pieces. It has a set called the domain, from which we draw the independent variable. It has a set called the range. It has a rule matching elements in the domain to an element in the range. We’ve only given the rule. What are the domain and what’s the range for ?
And here are the conventions. If we haven’t said otherwise, the domain and range are usually either the real numbers or the complex numbers. If we used x or y or t as the independent variable, we mean the real numbers. If we used z as the independent variable, and haven’t already put x and y in, we mean the complex numbers. Sometimes we call in s or w or another letter; never mind that. The range can be the whole set of real or complex numbers. It does us no harm to have too large a range.
The domain, though. We do insist that everything in the domain match to something in the range. And, like, $f(x) = \frac{1}{x - 2}$? That can’t mean anything if x equals 2.
So we take an implicit definition of the domain: it’s all the real numbers for which the function’s rule is meaningful. So, $f(x) = \frac{1}{x - 2}$ would have a domain “real numbers other than 2”. $g(x) = \frac{1}{(x - 2)(x + 2)}$ would have a domain “real numbers other than 2 and -2”.
We create extraneous solutions — or we lose some — when our convention changes the domain. An extraneous solution is one that existed outside the original problem’s domain. A missing solution is one that existed in an excised part of the domain. To go from to by dividing out x is to cut out of the space of possible solutions.
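A missing solution is easy to manufacture. This Python sketch uses a made-up example of my own, x squared equals 3x, and compares it with what’s left after dividing both sides by x. It searches a small range of integers for brevity.

```python
def roots_in(candidates, equation):
    """The candidates for which the equation holds."""
    return [x for x in candidates if equation(x)]


def original(x):
    """The hypothetical starting equation, x^2 = 3x."""
    return x * x == 3 * x


def after_dividing(x):
    """What is left after dividing both sides by x: x = 3."""
    return x == 3


candidates = range(-10, 11)
print(roots_in(candidates, original))
print(roots_in(candidates, after_dividing))
```

Dividing by x quietly assumes x isn’t zero. That removes zero from the domain, and zero was one of the solutions.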
A complaint you might raise. What is the domain for $\sqrt{x} = x - 12$? Rewrite that as a function. $\sqrt{x} - (x - 12)$ would seem to have a domain “x greater than or equal to 0”. The extraneous solution is $x = 9$, a number which rumor has it is greater than or equal to 0. What happened?
We have to take that equation-handling more slowly. We had started out with
$\sqrt{x} = x - 12$
The domain has to be “x is greater than or equal to 0” here. All right. The next step was multiplying both sides by the same quantity, $\sqrt{x}$. So:
$\sqrt{x} \cdot \sqrt{x} = (x - 12) \cdot \sqrt{x}$
The domain is still “x is greater than or equal to 0”. The next step, though, was a substitution. I wanted to replace the $\sqrt{x}$ on the right with $x - 12$. We know, from the original equation, that those are equal. At least, they’re equal wherever the original equation is true. What happens when $x = 9$, though?
We start to see the catch. 9 – 12 is -3. And while it’s true that -3 squared will be 9, it’s false that -3 is the square root of 9. The equation can only be true, for real numbers, if $x - 12$ is nonnegative. We can make this rigorous with two supplementary functions. Let me call $f(x) = \sqrt{x}$ and $g(x) = x - 12$.
$f$ has an implicit domain of “x greater than or equal to 0”. What’s the domain of $g$? If $f(x) = g(x)$, like we said it does, then they have to agree for every x in either’s domain. So $g$ can’t have in its domain any x for which $f$ isn’t defined. Nor can it have any x for which $g(x)$ is a value that $f$ never takes, and $f$ is never negative. So the domain of $g$ has to be “x for which x – 12 is greater than or equal to 0”. And that’s “x greater than or equal to 12”.
So the domain for the original equation is “x greater than or equal to 12”. When we keep that domain in mind, the extraneous nature of $x = 9$ is clear, and we avoid trouble.
Not all extraneous solutions come from algebraic manipulations. Sometimes there are constraints on the problem, rather than the numbers, that make a solution absurd. There is a betting strategy called the martingale. This amounts to doubling the bet every time one loses. This makes the first win balance out all the losses leading to it. This solution fails because the player has a finite wallet, and after a few losses any player hasn’t got the money to continue.
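Here’s a rough Python sketch of why the wallet matters. It assumes an even-money bet and a starting stake that doubles after each loss. The function names and the sample numbers are mine.

```python
def martingale_stakes(base_bet, rounds):
    """Bet sizes for a run of rounds, doubling after every loss."""
    return [base_bet * 2 ** k for k in range(rounds)]


def bankroll_needed(base_bet, losses):
    """Money needed to lose `losses` times in a row and still place the next bet."""
    return sum(martingale_stakes(base_bet, losses + 1))


def net_after_win(base_bet, losses):
    """Net gain when the first win comes right after `losses` straight losses."""
    stakes = martingale_stakes(base_bet, losses + 1)
    return stakes[-1] - sum(stakes[:-1])


def max_losses_affordable(base_bet, wallet):
    """Longest losing streak after which the next doubled bet can still be placed."""
    losses = 0
    while bankroll_needed(base_bet, losses + 1) <= wallet:
        losses += 1
    return losses


print(net_after_win(5, 6))             # the win only ever nets the base bet
print(bankroll_needed(1, 10))          # cost of surviving ten straight losses
print(max_losses_affordable(1, 1000))  # streak a wallet of 1000 can absorb
```

The money needed grows like a power of two while the reward stays at one base bet, which is the constraint the algebra of the strategy leaves out.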
Or consider a case that may be legend. It concerns the Apollo Guidance Computer. It was designed to take the Lunar Module to a spot at zero altitude above the moon’s surface, with zero velocity. The story is that in early test runs, the computer would not avoid trajectories that dropped to a negative altitude along the way to the surface. One imagines the scene after the first Apollo subway trip. (I have not found a date when such a test run was done, or corrections to the code ordered. If someone knows, I’d appreciate learning specifics.)
The convention, that we trust the domain is “everything which makes sense”, is not to blame here. It’s normally a good convention. Explicitly noting the domain at every step is tedious and, most of the time, unenlightening. It belongs in the background. We also must check our possible solutions, and that they represent things that make sense. We can try to concentrate our thinking on the obvious interesting parts, but must spend some time on the rest also.
As mentioned, ‘X’ is a difficult letter for a glossary project. There aren’t many mathematical terms that start with the letter, as much as it is the default variable name. Making things better is that many of the terms that do are important ones. Xor, from my 2015 A-to-Z, is an example of this. It’s one of the major pieces of propositional logic, and anyone working in logic gets familiar with it really fast.
The letter ‘X’ is a problem for this sort of glossary project. At least around the fourth time you do one, as you exhaust the good terms that start with the letter X. In 2018, I went to the Extreme Value Theorem, using the 1990s Rule that x- and ex- were pretty much the same thing. The Extreme Value Theorem is one of those little utility theorems. On a quick look it seems too obvious to tell us anything useful. It serves a role in proofs that do tell us interesting, surprising things.
Today (the 26th of November) is the Thanksgiving holiday in the United States. The holiday’s set, by law since 1941, to the fourth Thursday in November. (Before then it was customarily the last Thursday in November, but set by Presidential declaration. After Franklin Delano Roosevelt set the holiday to the third Thursday in November, to extend the 1939 and 1940 Christmas-shopping seasons — a decision Republican Alf Landon characterized as Hitlerian — the fourth Thursday was encoded in law.)
Any know-it-all will tell you, though, how the 13th of the month is very slightly more likely to be a Friday than any other day of the week. This is because the Gregorian calendar has that peculiar century-year leap day rule. It throws off the regular progression of the dates through the week. It takes 400 years for the calendar to start repeating itself. How does this affect the fourth Thursday of November? (A month which, this year, did have a Friday the 13th.)
It turns out, it changes things in subtle ways. Thanksgiving, by the current rule, can be any date between the 22nd and 28th; it’s most likely to be any of the 22nd, 24th, or 26th. (This implies that the 13th of November is equally likely to be a Friday, Sunday, or Tuesday, a result that surprises me too.) So here’s how often which date is Thanksgiving. This if we pretend the current United States definition of Thanksgiving will be in force for 400 years unchanged:
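The tally is easy to recompute. This Python sketch counts the fourth Thursday of November across one full 400-year Gregorian cycle. The choice of the years 2001 through 2400 is arbitrary, since any 400 consecutive years give the same counts, and the function name is my own.

```python
from collections import Counter
from datetime import date


def thanksgiving_day(year):
    """Day of the month of the fourth Thursday of November."""
    first_weekday = date(year, 11, 1).weekday()   # Monday is 0, Thursday is 3
    first_thursday = 1 + (3 - first_weekday) % 7
    return first_thursday + 21


# Any 400 consecutive years cover one full Gregorian cycle.
counts = Counter(thanksgiving_day(y) for y in range(2001, 2401))
for day in sorted(counts):
    print(f"November {day}: {counts[day]} times")
```

The 13th of November always falls one day of the week before the 7th, and the fourth Thursday is fixed by what day the 1st is, so the same counts also give the distribution of days of the week for the 13th.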
Today’s is another topic suggested by Mr Wu, author of the Singapore Maths Tuition blog. The Wronskian is named for Józef Maria Hoëne-Wroński, a Polish mathematician, born in 1778. He served in General Tadeusz Kosciuszko’s army in the 1794 Kosciuszko Uprising. After being captured and forced to serve in the Russian army, he moved to France. He kicked around Western Europe and its mathematical and scientific circles. I’d like to say this was all creative and insightful, but, well. Wikipedia describes him trying to build a perpetual motion machine. Trying to square the circle (also impossible). Building a machine to predict the future. The St Andrews mathematical biography notes his writing a summary of “the general solution of the fifth degree [polynomial] equation”. This doesn’t exist.
Both sources, though, admit that for all that he got wrong, there were flashes of insight and brilliance in his work. The St Andrews biography particularly notes that Wronski’s tables of logarithms were well-designed. This is a hard thing to feel impressed by. But it’s hard to balance information so that it’s compact yet useful. He wrote about the Wronskian in 1812; it wouldn’t be named for him until 1882. This was 29 years after his death, but it does seem likely he’d have enjoyed having a familiar thing named for him. I suspect he wouldn’t enjoy my next paragraph, but would enjoy the fight with me about it.
The Wronskian is a thing put into Introduction to Ordinary Differential Equations courses because students must suffer in atonement for their sins. Those who fail to reform enough must go on to the Hessian, in Partial Differential Equations.
To be more precise, the Wronskian is the determinant of a matrix. The determinant you find by adding and subtracting products of the elements in a matrix together. It’s not hard, but it is tedious, and gets more tedious pretty fast as the matrix gets bigger. (In Big-O notation, it’s the order of the cube of the matrix size. This is rough, for things humans do, although not bad as algorithms go.) The matrix here is made up of a bunch of functions and their derivatives. The functions need to be ones of a single variable. The derivatives, you need first, second, third, and so on, up to one less than the number of functions you have.
If you have two functions, $f$ and $g$, you need their first derivatives, $f'$ and $g'$. If you have three functions, $f$, $g$, and $h$, you need first derivatives, $f'$, $g'$, and $h'$, as well as second derivatives, $f''$, $g''$, and $h''$. If you have $N$ functions and here I’ll call them $f_1, f_2, f_3, \ldots, f_N$, you need $N - 1$ derivatives of each, $f_1'$, $f_1''$, $f_1'''$, and so on through $f_1^{(N-1)}$. You see right away this is a fun and exciting thing to calculate. Also why in intro to differential equations you only work this out with two or three functions. Maybe four functions if the class has been really naughty.
Go through your functions and your derivatives and make a big square matrix. And then you go through calculating the determinant. This involves a lot of multiplying strings of these derivatives together. It’s a lot of work. But at least doing all this work gets you older.
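Written out, the recipe looks like this. The general form is the usual textbook definition; the two exponential functions in the worked case are an arbitrary example I picked because the derivatives are painless.

```latex
W(f_1, \ldots, f_N)(x) = \det \begin{pmatrix}
f_1(x) & f_2(x) & \cdots & f_N(x) \\
f_1'(x) & f_2'(x) & \cdots & f_N'(x) \\
\vdots & \vdots & \ddots & \vdots \\
f_1^{(N-1)}(x) & f_2^{(N-1)}(x) & \cdots & f_N^{(N-1)}(x)
\end{pmatrix}

% Two-function case, with f(x) = e^{x} and g(x) = e^{2x}:
W(f, g)(x) = f(x) \, g'(x) - f'(x) \, g(x)
           = e^{x} \cdot 2 e^{2x} - e^{x} \cdot e^{2x}
           = e^{3x}
```

That last expression is never zero, so these two functions are linearly independent.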
So one will ask why do all this? Why fit it into every Intro to Ordinary Differential Equations textbook and why slip it in to classes that have enough stuff going on?
One answer is that if the Wronskian is not zero for some values of the independent variable, then the functions that went into it are linearly independent. Mathematicians learn to like sets of linearly independent functions. We can treat functions like directions in space. Linear independence assures us none of these functions are redundant, pointing a way we already can describe. (Real people see nothing wrong in having north, east, and northeast as directions. But mathematicians would like as few directions in our set as possible.) The Wronskian being zero for every value of the independent variable seems like it should tell us the functions are linearly dependent. It doesn’t, not without some more constraints on the functions.
This is fine, but who cares? And, unfortunately, in Intro it’s hard to reach a strong reason to care. To this major, the emphasis on linearly independent functions felt misplaced. It’s the sort of thing we care about in linear algebra. Or some course where we talk about vector spaces. Differential equations do lead us into vector spaces. It’s hard to find a corner of analysis that doesn’t.
Every ordinary differential equation has a secret picture. This is a vector field. One axis in the field is the independent variable of the function. The other axes are the value of the function. And maybe its derivatives, depending on how many derivatives are used in the ordinary differential equation. To solve one particular differential equation is to find one path in this field. People who just use differential equations will want to find one path.
Mathematicians tend to be fine with finding one path. But they want to find what kinds of paths there can be. Are there paths which the differential equation picks out, by making paths near it stay near? Or by making paths that run away from it? And here is the value of the Wronskian. The Wronskian tells us about the divergence of this vector field. This gives us insight to how these paths behave. It’s in the same way that knowing where high- and low-pressure systems are describes how the weather will change. The Wronskian, by way of a thing called Liouville’s Theorem that I haven’t the strength to describe today, ties in to the Hamiltonian. And the Hamiltonian we see in almost every mechanics problem of note.
You can see where the mathematics PhD, or the physicist, would find this interesting. But what about the student, who would look at the symbols evoked by those paragraphs above with reasonable horror?
And here’s the second answer for what the Wronskian is good for. It helps us solve ordinary differential equations. Like, particular ones. An ordinary differential equation will (normally) have several linearly independent solutions. If you know all but one of those solutions, it’s possible to calculate the Wronskian and, from that, the last of the independent solutions. Since a big chunk of mathematics, particularly for science or engineering, is solving differential equations, you see why this is something valuable. Allow that it’s tedious. Tedious work we can automate, or give to a research assistant to do.
One then asks what kind of differential equation would have all-but-one answer findable, and yield that last one only by long efforts of hard work. So let me show you an example ordinary differential equation:
$y'' + a(x) \cdot y' + b(x) \cdot y = g(x)$
Here $a(x)$, $b(x)$, and $g(x)$ are some functions that depend only on the independent variable, $x$. Don’t know what they are; don’t care. The differential equation is a lot easier if $a(x)$ and $b(x)$ are constants, but we don’t insist on that.
This equation has a close cousin, and one that’s easier to solve than the original. Its cousin is called a homogeneous equation:
$y'' + a(x) \cdot y' + b(x) \cdot y = 0$
The left-hand side, the parts with the function $y$ that we want to find, is the same. It’s the right-hand side that’s different, that’s a constant zero. This is what makes the new equation homogeneous. This homogeneous equation is easier and we can expect to find two functions, $y_1(x)$ and $y_2(x)$, that solve it. If $a(x)$ and $b(x)$ are constant this is even easy. Even if they’re not, if you can find one solution, the Wronskian lets you generate the second.
That’s nice for the homogeneous equation. But if we care about the original, inhomogeneous one? The Wronskian serves us there too. Imagine that the inhomogeneous equation has any solution, which we’ll call $y_p(x)$. (The ‘p’ stands for ‘particular’, as in “the solution for this particular $g(x)$”.) But $y_p(x) + y_1(x)$ also has to solve that inhomogeneous differential equation. It seems startling but if you work it out, it’s so. (The key is the derivative of the sum of functions is the same as the sum of the derivatives of the functions.) $y_p(x) + y_2(x)$ also has to solve that inhomogeneous differential equation. In fact, for any constants $C_1$ and $C_2$, it has to be that $y_p(x) + C_1 y_1(x) + C_2 y_2(x)$ is a solution.
I’ll skip the derivation; you have Wikipedia for that. The key is that knowing these homogeneous solutions, and the Wronskian, and the original $g(x)$, will let you find the $y_p(x)$ that you really want.
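For reference, here are the results in question, with no derivation. I'm assuming the equation is written as $y'' + a(x) y' + b(x) y = g(x)$, with $y_1$ and $y_2$ solving the homogeneous version; those symbol names are my labels for this sketch. The worked case at the end, $y'' + y = 1$, is a toy I picked because every integral in it is easy.

```latex
% The Wronskian of the two homogeneous solutions, and the equation it obeys
W(x) = y_1(x) y_2'(x) - y_1'(x) y_2(x), \qquad
W'(x) = -a(x) W(x) \;\Rightarrow\; W(x) = C \exp\left( -\int a(x) \, dx \right)

% Getting the second homogeneous solution from the first
y_2(x) = y_1(x) \int \frac{W(x)}{y_1(x)^2} \, dx

% Getting a particular solution (variation of parameters)
y_p(x) = -y_1(x) \int \frac{y_2(x) g(x)}{W(x)} \, dx + y_2(x) \int \frac{y_1(x) g(x)}{W(x)} \, dx

% Toy example: y'' + y = 1, with y_1 = \cos x, y_2 = \sin x, so W = \cos^2 x + \sin^2 x = 1
y_p(x) = -\cos(x) \int \sin(x) \, dx + \sin(x) \int \cos(x) \, dx
       = \cos^2 x + \sin^2 x
       = 1
```

And $y_p = 1$ does solve $y'' + y = 1$, as you could have guessed without any of this. The formula earns its keep when $g(x)$ is something you can't guess.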
My reading is that this is more useful in proving things true about differential equations, rather than particularly solving them. It takes a lot of paper and I don’t blame anyone not wanting to do it. But it’s a wonder that it works, and so well.
Don’t make your instructor so mad you have to do the Wronskian for four functions.
And let me tease other W-words I won’t be repeating for my essay this week with the Well-Ordering Principle, discussed in the summer of 2017. This is one of those little properties that some sets of numbers, like whole numbers, have and that others, like the rationals, don’t. It doesn’t seem like anything much, which is often a warning that the concept sneaks into a lot of interesting work. On re-reading my own work, I got surprised, which I hope speaks better of the essay than it does of me.
No reason not to keep showing off old posts while I prepare new ones. A Summer 2015 Mathematics A To Z: well-posed problem shows off one of the set of things mathematicians describe as “well”. Well-posedness is one of those things mathematicians learn to look for in problems, and to recast problems so that they have it. The essay also shows off how much I haven’t been able to settle on rules about how to capitalize subject lines.
This is easy. The velocity is the first derivative of the position. First derivative with respect to time, if you must know. That hardly needed an extra week to write.
Yes, there’s more. There is always more. Velocity is important by itself. It’s also important for guiding us into new ideas. There are many. One idea is that it’s often the first good example of vectors. Many things can be vectors, as mathematicians see them. But the ones we think of most often are “some magnitude, in some direction”.
The position of things, in space, we describe with vectors. But somehow velocity, the changes of positions, seems more significant. I suspect we often find static things below our interest. I remember as a physics major that my Intro to Mechanics instructor skipped Statics altogether. There are many important things, like bridges and roofs and roller coaster supports, that we find interesting because they don’t move. But the real Intro to Mechanics is stuff in motion. Balls rolling down inclined planes. Pendulums. Blocks on springs. Also planets. (And bridges and roofs and roller coaster supports wouldn’t work if they didn’t move a bit. It’s not much though.)
So velocity shows us vectors. Anything could, in principle, be moving in any direction, with any speed. We can imagine a thing in motion inside a room that’s in motion, its net velocity being the sum of two vectors.
And they show us derivatives. A compelling answer to “what does differentiation mean?” is “it’s the rate at which something changes”. Properly, we can take the derivative of any quantity with respect to any variable. But there are some that make sense to do, and position with respect to time is one. Anyone who’s tried to catch a ball understands the interest in knowing.
We take derivatives with respect to time so often we have shorthands for it, by putting a ‘ mark after, or a dot above, the variable. So if $x$ is the position (and it often is), then $\dot{x}$ is the velocity. If we want to emphasize we think of vectors, $\vec{x}$ is the position and $\dot{\vec{x}}$ the velocity.
Velocity has another common shorthand. This is $v$, or if we want to emphasize its vector nature, $\vec{v}$. Why a name besides the good enough $\dot{x}$? It helps us avoid misplacing a ‘ mark in our work, for one. And giving velocity a separate symbol encourages us to think of the velocity as independent from the position. It’s not, not exactly, independent. But knowing that a thing is in the lawn outside tells us nothing about how it’s moving. Velocity affects position, in a process so familiar we rarely consider how there’s parts we don’t understand about it. But velocity is somehow also free of the position at an instant.
Velocity also guides us into a first understanding of how to take derivatives. Thinking of the change in position over smaller and smaller time intervals gets us to the “instantaneous” velocity by doing only things we can imagine doing with a ruler and a stopwatch.
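The ruler-and-stopwatch procedure is easy to act out numerically. This is a sketch with a made-up position function, a ball dropped from rest with $x(t) = 4.9 t^2$ in meters; the function and the choice of $t = 2$ seconds are assumptions for illustration only.

```python
def position(t):
    """Hypothetical distance fallen, in meters, after t seconds: x(t) = 4.9 t^2."""
    return 4.9 * t * t

def average_velocity(x, t, dt):
    """Change in position divided by the time interval dt, starting at time t."""
    return (x(t + dt) - x(t)) / dt

# Shrink the stopwatch interval and watch the average velocity settle down.
for dt in (1.0, 0.1, 0.01, 0.001):
    print(dt, average_velocity(position, 2.0, dt))
```

The printed values close in on 19.6 meters per second, which is what the derivative $9.8 t$ gives at $t = 2$.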
Velocity has a velocity. $\dot{\vec{v}}$, also known as $\ddot{\vec{x}}$. Or, if we’re sure we won’t lose a ‘ mark, $\vec{x}''$. Once we are comfortable thinking of how position changes in time we can think of other changes. Velocity’s change in time we call acceleration. This is also a vector, more abstract than position or velocity. Multiply the acceleration by the mass of the thing accelerating and we have a vector called the “force”. That, we at least feel we understand, and can work with.
Acceleration has a velocity too, a rate of change in time. It’s called the “jerk” by people telling you the change in acceleration in time is called the “jerk”. (I don’t see the term used in the wild, but admit my experience is limited.) And so on. We could, in principle, keep taking derivatives of the position and keep finding new changes. But most physics problems we find interesting use just a couple of derivatives of the position. We can label them, if we need, $\vec{x}^{(n)}$, where $n$ is some big enough number like 4.
We can bundle them in interesting ways, though. Come back to that mention of treating position and velocity of something as though they were independent coordinates. It’s a useful perspective. Imagine the rules about how particles interact with one another and with their environment. These usually have explicit roles for position and velocity. (Granting this may reflect a selection bias. But these do cover enough interesting problems to fill a career.)
So we create a new vector. It’s made of the position and the velocity. We’d write it out as $\left[ \vec{x}, \vec{v} \right]^T$. The superscript-T there, “transposition”, lets us use the tools of matrix algebra. This vector describes a point in phase space. Phase space is the collection of all the physically possible positions and velocities for the system.
What’s the derivative, in time, of this point in phase space? Glad to say we can do this piece by piece. The derivative of a vector is the derivative of each component of a vector. So the derivative of $\left[ \vec{x}, \vec{v} \right]^T$ is $\left[ \dot{\vec{x}}, \dot{\vec{v}} \right]^T$, or, $\left[ \vec{v}, \vec{a} \right]^T$. This acceleration itself depends on, normally, the positions and velocities. So we can describe this as $\left[ \vec{v}, \vec{f}(\vec{x}, \vec{v}) \right]^T$ for some function $\vec{f}$. You are surely impressed with this symbol-shuffling. You are less sure why we bother.
The bother is a trick of ordinary differential equations. All differential equations are about how a function-to-be-determined and its derivatives relate to one another. In ordinary differential equations, the function-to-be-determined depends on a single variable. Usually that variable is called x or t. There may be many derivatives of the function-to-be-determined. This symbol-shuffling rewriting takes away those higher-order derivatives. We rewrite the equation as a vector equation of just one order. There’s some point in phase space, and we know what its velocity is. That we do because in this form many problems can be written as a matrix problem: $\dot{\vec{x}} = A \vec{x}$, where $\vec{x}$ now stands for the whole point in phase space and $A$ is a matrix. Or we can approximate our problem as a matrix problem. This lets us bring in linear algebra tools, and that’s worthwhile.
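Here is a sketch of the rewriting at work, on an example I've picked for being small: a block on a spring with unit mass and unit spring constant, so the second-order equation is $x'' = -x$. As a first-order system the phase-space point is $[x, v]$ and its velocity is $[v, -x]$, which is a matrix times the point. The Runge-Kutta stepping is my choice of integrator for the demonstration, not anything the rewriting requires.

```python
import math

# Hypothetical system: x'' = -x, rewritten as d/dt [x, v] = A [x, v].
A = [[0.0, 1.0],
     [-1.0, 0.0]]

def phase_velocity(state):
    """Velocity of the phase-space point: the matrix A times [x, v], giving [v, -x]."""
    return [sum(a * s for a, s in zip(row, state)) for row in A]

def rk4_step(state, dt):
    """Advance the phase-space point by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return [b + h * k for b, k in zip(base, slope)]
    k1 = phase_velocity(state)
    k2 = phase_velocity(shifted(state, k1, dt / 2))
    k3 = phase_velocity(shifted(state, k2, dt / 2))
    k4 = phase_velocity(shifted(state, k3, dt))
    return [s + dt / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def integrate(state, t_end, steps):
    """Follow one path through phase space from time 0 to t_end."""
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

# Start at x = 1 with no velocity; half an oscillation later we expect x near -1, v near 0.
final = integrate([1.0, 0.0], math.pi, 1000)
print(final)
```

The integrator never sees a second derivative. It only ever asks where the point is and what the point's velocity is there, which is the whole benefit of the rewriting.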
It calls on a more abstract idea of what a “velocity” might be. We can explain what the thing that’s “moving” and what it’s moving through are, given time. But the instincts we develop from watching ordinary things move help us in these new territories. This is also a classic mathematician’s trick. It may seem like all mathematicians do is develop tricks to extend what they already do. I can’t say this is wrong.