How did Compute!’s and Compute!’s Gazette’s New MLX Work?

A couple months ago I worked out a bit of personal curiosity. This was about how MLX worked. MLX was a program used in Compute! and Compute!’s Gazette magazine in the 1980s, so that people entering machine-language programs could avoid errors. There were a lot of fine programs, some of them quite powerful, free for the typing-in. The catch is this involved typing in a long string of numbers, and if any were wrong, the program wouldn’t work.

So MLX, introduced in late 1983, was a program to make typing in programs better. You would enter in a string of six numbers — six computer instructions or data — and a seventh, checksum, number. Back in January I worked out finally what the checksum was. It turned out to be simple. Take the memory location of the first of your set of six instructions, modulo 256. Add to it each of the six instructions, modulo 256. That’s the checksum. If it doesn’t match the typed-in checksum, there’s an error.

There’s weaknesses to this, though. It’s vulnerable to transposition errors: if you were supposed to type in 169 002 and put in 002 169 instead, it wouldn’t be caught. It’s also vulnerable to casual typos: 141 178 gives the same checksum as 142 177.
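
Both weaknesses are easy to reproduce. Here is a minimal Python sketch of the original checksum as I've described it. Python is only my choice for illustration, and the function name, the address, and the six sample bytes are all made up for the example.

```python
def mlx1_checksum(address, data):
    """Original-style MLX checksum as described in the text:
    the line's address plus its six data bytes, all modulo 256."""
    return (address + sum(data)) % 256

address = 49152                          # hypothetical start of a line
intended = [169, 2, 141, 178, 2, 96]     # made-up sample data
transposed = [2, 169, 141, 178, 2, 96]   # first two numbers swapped
offsetting = [169, 2, 142, 177, 2, 96]   # 141 178 mistyped as 142 177

for entry in (intended, transposed, offsetting):
    print(entry, mlx1_checksum(address, entry))
```

All three entries come out with the same checksum, so neither mistake would be flagged.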

Which is all why the original MLX lasted only two years.

What Was The New MLX?

The New MLX, also called MLX 2.0, appeared first in the June 1985 Compute!. This in a version for the Apple II. Six months later a version for the Commodore 64 got published, again in Compute!, though it ran in Compute!’s Gazette too. Compute! was for all the home computers of the era; Compute!’s Gazette specialized in the Commodore computers. I would have sworn that MLX got adapted for the Atari eight-bit home computers too, but can’t find evidence it ever was. By 1986 Compute! was phasing out its type-in programs and didn’t run much for Atari anymore.

Cover of the December 1986 Compute!'s Gazette, which includes small pictures to represent several features. One is a neat watercolor picture for 'Q Bird', showing a cheerful little blue bird resting on the head of a nervous-looking snake.
Programming challenge: a video game with the aesthetics of 1980s video-game-art, such as Q Bird’s look there.

The new MLX made a bunch of changes. Some were internal, about how to store a program being entered. One was dramatic in appearance. In the original MLX people typed in decimal numbers, like 32 or 169. In the new, they would enter hexadecimal digits, like 20 or A9. And a string of eight numbers on a line, rather than six. This promised to save our poor fingers. Where before we needed to type in 21 digits to enter six instructions, now we needed 18 digits to enter eight instructions. So the same program would take about two-thirds the number of keystrokes. A plausible line of code would look something like:

0801:0B 08 00 00 9E 32 30 36 EC
0809:31 00 00 00 A9 00 8D 20 3A
0811:D0 20 CF 14 20 1B 08 4C 96
0819:C7 0B A9 93 20 D2 FF A9 34

(This from the first lines for “Q-Bird”, a game published in the December 1986 Compute!’s Gazette.)

And, most important, there was a new checksum.

What was the checksum formula?

I had a Commodore 64, so I always knew MLX from its Commodore version. The key parts of the checksum code appear in it in lines 350 through 390. Let me copy out the key code, spaced a bit out for easier reading:

360 A = INT(AD/Z6):
    GOSUB 350:
    A = AD - A*Z6:
    GOSUB 350:
370 CK = INT(AD/Z6):
    CK = AD - Z4*CK + Z5*(CK>Z7):
    GOTO 390
380 CK = CK*Z2 + Z5*(CK>Z7) + A
390 CK = CK + Z5*(CK>Z5):

Z2, Z4, Z5, Z6, and Z7 are constants, defined at the start of the program. Z4 equals 254, Z5 equals 255, Z6 equals 256, and Z7, as you’d expect, is 127. Z2, meanwhile, was a simple 2.

About a dozen lines of Commodore 64 BASIC, including the lines that represent the checksum calculations for MLX 2.0.
The bits at the end of each line, :rem 240 and the like, are not part of the working code. They’re instead the Automatic Proofreader checksum. Automatic Proofreader was a different program, one written in machine language that you used to make sure you typed in BASIC programs correctly. After entering a line of BASIC, the computed checksum appeared in the corner of the window, and if it was the :rem number, you had typed the line in correctly. Now you might wonder how you knew you typed in the machine language code for the Automatic Proofreader correctly, if you need the Automatic Proofreader to enter MLX correctly. To this I offer LOOK A BIG DISTRACTING THING! (Runs away.)

A bit of Commodore BASIC here. INT means to take the largest whole number not larger than whatever’s inside. AD is the address of the start of the line being entered. CK is the checksum. A is one number, one machine language instruction, being put in. GOSUB, “go to subroutine”, means to jump to another line and execute commands from there until reaching the command RETURN. The program then continues from the next instruction after the GOSUB. In this code, line 350 converts a number from decimal to hexadecimal and prints out the hexadecimal version. This bit about adding Z5 * (CK>Z7) looks peculiar.

Commodore BASIC evaluates logical expressions like CK > Z7 into a bit pattern. That pattern looks like a number. We can use it like an integer. Many programming languages do something like that and it can allow for clever but cryptic programming tricks. An expression that’s false evaluates as 0; an expression that’s true evaluates as -1. So, CK + Z5*(CK>Z5) is an efficient little filter. If CK is no larger than Z5, it’s left untouched. If CK is larger than Z5, then subtract Z5 from CK. This keeps CK from being more than 255, exactly as we’d wanted.
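
If the minus-one trick is hard to picture, this small Python sketch mimics it. The helper names basic_gt and filter_255 are mine, not anything in MLX; they only imitate how the BASIC expression behaves.

```python
Z5 = 255

def basic_gt(a, b):
    """Mimic a Commodore BASIC comparison: -1 when true, 0 when false."""
    return -1 if a > b else 0

def filter_255(ck):
    """Python rendering of CK = CK + Z5*(CK>Z5)."""
    return ck + Z5 * basic_gt(ck, Z5)

for ck in (200, 255, 256, 300, 510):
    print(ck, "->", filter_255(ck))
```

Note that 255 passes through unchanged while 256 becomes 1, which is why this is not quite the same thing as a modulo operation.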

But you also notice: this code makes no sense.

Like, starting the checksum with something derived from the address makes sense. Adding to that numbers based on the instructions makes sense. But the last instruction of line 370 is a jump straight to line 390. Line 380, where any of the actual instructions are put into the checksum, never gets called. Also, there’s eight instructions per line. Why is only one ever called?

And this was a bear to work out. One friend insisted I consider the possibility that MLX was buggy and nobody had found the defect. I could not accept that, not for a program that was so central to so much programming for so long. Also, not considering that it worked. Make almost any entry error and the checksum would not match.

Where’s the rest of the checksum formula?

This is what took time! I had to go through the code and find what other lines call lines 360 through 390. There’s a hundred lines of code in the Commodore version of MLX, which isn’t that much. They jump around a lot, though. By my tally 68 of these 100 lines jump to, or can jump to, something besides the next line of code. I don’t know how that compares to modern programming languages, but it’s still dizzying. For a while I thought it might be a net saving in time to write something that would draw a directed graph of the program’s execution flow. It might still be worth doing that.
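
A first step toward that graph could be as small as the following Python sketch. It is a toy under loud assumptions: the regular expression only knows GOTO, GOSUB, and THEN followed by a line number, it expects a space after each line number, and the sample listing is my own transcription of lines 360 through 390 rather than the magazine's exact text.

```python
import re

# Matches GOTO n, GOSUB n, and THEN n (an implied GOTO). An ON ... GOTO list
# with several targets would need more work; only its first target is seen.
JUMP = re.compile(r"\b(GOTO|GOSUB|THEN)\s*(\d+)")

def jump_edges(listing):
    """Return (from_line, to_line, keyword) for each jump in a BASIC listing."""
    edges = []
    for raw in listing.strip().splitlines():
        number, _, body = raw.strip().partition(" ")
        for keyword, target in JUMP.findall(body):
            edges.append((int(number), int(target), keyword))
    return edges

SAMPLE = """
360 A=INT(AD/Z6): GOSUB 350: A=AD-A*Z6: GOSUB 350
370 CK=INT(AD/Z6): CK=AD-Z4*CK+Z5*(CK>Z7): GOTO 390
380 CK=CK*Z2+Z5*(CK>Z7)+A
390 CK=CK+Z5*(CK>Z5)
"""

for source, target, keyword in jump_edges(SAMPLE):
    print(f"{source} -> {target} ({keyword})")
```

The resulting list of edges could be handed to any graph-drawing tool; I have not gone that far here.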

The checksum formula gets called by two pieces of code. One of them is the code when the program gets entered. MLX calculates a checksum and verifies whether it matches the ninth number entered. The other role is in printing out already-entered data. There, the checksum doesn’t have a role, apart from making the on-screen report look like the magazine listing.

Here’s the code that calls the checksum when you’re entering code:

440 POKE 198,0:
    GOSUB 360:
    [ many lines about entering your data here ]
560 FOR I=1 TO 25 STEP 3:
    B$ = MID$(IN$, I):
    GOSUB 320:
    IF I<25 THEN GOSUB 380: A(I/3)=A
570 NEXT:
    F = 1:
    GOTO 440
580 GOSUB 1080:
    [ several more lines setting up a new line of data to enter ]

Line 320 started the routine that turned a hexadecimal number, such as 7F, into decimal, such as 127. It returns this number as the variable named A. IN$ was the input text, part of the program you enter. This should be 27 characters long. A(I/3) was an element in an array, the string of eight instructions for that entry. Yes, you could use the same name for an array and for a single, unrelated, number. Yes, this was confusing.

But here’s the logic. Line 440 starts work on your entry. It calculates the part of the checksum that comes from the location in memory that data’s entered in. Line 560 does several bits of work. It takes the entered instructions and converts the strings into numbers. Then it takes each of those instruction numbers and adds its contribution to the checksum. Line 570 compares whether the entered checksum matches the computed checksum. If it does match, good. If it doesn’t match, then go back and re-do the entry.

The code for displaying a line of your machine language program is shorter:

630 GOSUB 360:
    B = BS + AD - SA:
    FOR I = B TO B+7:
       A = PEEK(I):
       GOSUB 350:
       GOSUB 380:
       PRINT S$;
640 NEXT:
    PRINT "";       
    A = CK:
    GOSUB 350:

The bit about PEEK is looking into the buffer, which holds the entered instructions, and reading what’s there. The GOSUB 350 takes the number ‘A’ and prints out its hexadecimal representation. GOSUB 360 calculates the part of the checksum that’s based on the memory location. The GOSUB 380 contributes the part based on every instruction. S$ is a space. It’s used to keep all the numbers from running up against each other.

So what is the checksum formula?

The checksum takes in two parts. The first part is based on the address at the start of the line. Let me call that the number AD. The second part is based on the entry, the eight instructions following the address. Let me call them d_1 through d_8. So this is easiest described in two parts.

The base of the checksum, which I’ll call ck_{0}, is:

ck_{0} = AD - 254 \cdot \left(floor(AD \div 256)\right) \\  \mbox { [ subtract 255, twice if need be, until this is 255 or less ] }

For example, suppose the address is 49152 (in hexadecimal, C000), which was popular for Commodore 64 programming. Then ck_{0} would be 129. If the address is 2049 (in hexadecimal, 0801), another popular location, ck_{0} would be 17.

Generally, the initial ck_{0} increases by 1 as the memory address for the start of a line increases. If you entered a line that started at memory address 49153 (hexadecimal C001) for some reason, that ck_{0} would be 130. A line which started at address 49154 (hexadecimal C002) would have ck_{0} start at 131. This progression continues until ck_{0} would reach 256. Then that greater-than filter at the end of the expression intrudes. A line starting at memory address 49278 (C07E) has ck_{0} of 255, and one starting at memory address 49279 (C07F) has ck_{0} of 1. I see reason behind this choice.
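
Those values can be reproduced with a short Python sketch of lines 370 and 390. The function name base_checksum is my own; the two subtractions mirror the Z5*(CK>Z7) and Z5*(CK>Z5) terms in the BASIC.

```python
def base_checksum(address):
    """Base checksum ck_0 for a line starting at `address`."""
    high, low = divmod(address, 256)
    ck = 2 * high              # same as AD - 254*INT(AD/256), less the low byte
    if high > 127:             # the Z5*(CK>Z7) term in line 370
        ck -= 255
    ck += low
    if ck > 255:               # the Z5*(CK>Z5) term in line 390
        ck -= 255
    return ck

for address in (0x0801, 0xC000, 0xC001, 0xC002, 0xC07E, 0xC07F):
    print(f"{address:04X} -> {base_checksum(address)}")
```

This prints 17, 129, 130, 131, 255, and 1, matching the examples above.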

That’s the starting point. Now to use the actual data, the eight pieces d_1 through d_8 that are the actual instructions. The easiest way for me to describe this is to do it as a loop, using ck_{0} to calculate ck_{1}, and ck_{1} to define ck_{2}, and so on.

ck_{j} = 2 \cdot ck_{j - 1} \cdots \\  \mbox { [ subtract 255 if this is 256 or greater ] }  	\\   \cdots + d_{j} \\  \mbox { [ subtract 255 if this is 256 or greater ] }  	\mbox{for j = 1 ... 8}

That is, for each piece of data in turn, double the existing checksum and add the next data to it. If this sum is 256 or larger, subtract 255 from it. The working sum never gets larger than 512, thanks to that subtract-255-rule after the doubling. And then again that subtract-255-rule after adding d_j. Repeat through the eighth piece of data. That last calculated checksum, ck_{8} , is the checksum for the entry. If ck_{8} does match the entered checksum, go on to the next entry. If ck_{8} does not match the entered checksum, give a warning and go back and re-do the entry.
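
Put together, the whole procedure fits in a few lines of Python. This is my own sketch of the logic, not a translation of the magazine's listing; the names reduce255, mlx2_checksum, and line_is_valid are invented for the example. The test data are the four Q-Bird lines quoted earlier.

```python
def reduce255(value):
    """Subtract 255 once if value is 256 or more, like CK + Z5*(CK>Z5)."""
    return value - 255 if value > 255 else value

def mlx2_checksum(address, data):
    """MLX 2.0 checksum for eight data bytes starting at `address`."""
    high, low = divmod(address, 256)
    ck = reduce255(reduce255(2 * high) + low)   # base checksum ck_0
    for d in data:                              # ck_1 through ck_8
        ck = reduce255(reduce255(2 * ck) + d)
    return ck

def line_is_valid(text):
    """Check a magazine-style line such as '0801:0B 08 ... EC'."""
    address_text, _, rest = text.partition(":")
    *data, entered = [int(pair, 16) for pair in rest.split()]
    return mlx2_checksum(int(address_text, 16), data) == entered

QBIRD = [
    "0801:0B 08 00 00 9E 32 30 36 EC",
    "0809:31 00 00 00 A9 00 8D 20 3A",
    "0811:D0 20 CF 14 20 1B 08 4C 96",
    "0819:C7 0B A9 93 20 D2 FF A9 34",
]

for line in QBIRD:
    print(line, line_is_valid(line))
```

All four lines check out, which is some reassurance that the formula as described matches what the magazine printed.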

Why was MLX written like that?

There are mysterious bits to this checksum formula. First is where it came from. It’s not, as far as I can tell, a standard error-checking routine, or if it is it’s presented in a form I don’t recognize. But I know only small pieces of information theory, and it might be that this is equivalent to a trick everybody knows.

The formula is, at heart, “double your working sum and add the next instruction, and repeat”. At the end, take the sum modulo 255 so that the checksum is no more than two hexadecimal digits. Almost. In studying the program I spent a lot of time on a nearly-functionally-equivalent code that used modulo operations. I’m confident that if Apple II and Commodore BASIC had modulo functions, then MLX would have used them.

But those eight-bit BASICs did not. Instead the programs tested whether the working checksum had gotten larger than 255, and if it had, then subtracted 255 from it. This is a little bit different. It is possible for a checksum to be 255 (hexadecimal FF). This even happened. In the June 1985 Compute!, introducing the new MLX for the Apple II, we have this entry as part of the word processor Speedscript 3.0 that anyone could type in:

0848: 20 A9 00 8D 53 1E A0 00 FF

What we cannot have is a checksum of 0. (Unless a program began at memory location 0, and had instructions of nothing but 0. This would not happen. The Commodore 64, and the Apple II, used those low-address memory locations for system work. No program could use them.) Were the formulas written with modulo operations, we’d see 00 where we should see FF.
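
The difference shows up on exactly that SpeedScript line. In this Python sketch, mlx2_checksum follows the subtract-255 rule, while modulo_checksum is a hypothetical variant of my own that uses a true modulo 255 at every step. Neither name comes from MLX.

```python
def reduce255(value):
    """Subtract 255 once if value is 256 or more."""
    return value - 255 if value > 255 else value

def mlx2_checksum(address, data):
    """Checksum using the subtract-255 rule described in the text."""
    high, low = divmod(address, 256)
    ck = reduce255(reduce255(2 * high) + low)
    for d in data:
        ck = reduce255(reduce255(2 * ck) + d)
    return ck

def modulo_checksum(address, data):
    """Hypothetical variant using a true modulo 255 instead."""
    high, low = divmod(address, 256)
    ck = (2 * high + low) % 255
    for d in data:
        ck = (2 * ck + d) % 255
    return ck

speedscript = [0x20, 0xA9, 0x00, 0x8D, 0x53, 0x1E, 0xA0, 0x00]  # line 0848
print(f"{mlx2_checksum(0x0848, speedscript):02X}")    # FF
print(f"{modulo_checksum(0x0848, speedscript):02X}")  # 00
```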

The start of the code for Apple SpeedScript 3.0, showing a couple dozen lines of machine language code.
So this program, which was a legitimate and useful and working word processor, was about 5,699 bytes long. This article is about 31,000 characters (and the characters are longer than a byte back then was), so, that’s the kind of compact writing they were capable of back then.

Doubling the working sum and then setting it to be in a valid range — from 1 to 255 — is easy enough. I don’t know how the designer settled on doubling, but have hypotheses. It’s a good scheme for catching transposition errors, entering 20 FF D2 where one means to enter 20 D2 FF.

The initial ck_{0} seems strange. The equivalent step for the original MLX was the address on which the entry started, modulo 256. Why the change?

My hypothesis is this change was to make it harder to start typing in the wrong entry. The code someone typed in would be long columns of numbers, for many pages. The text wasn’t backed by alternating bands of color, or periodic breaks, or anything else that made it harder for the eye to skip one or more lines of machine language code.

In the original MLX, skipping one line, or even a couple lines, can’t go undetected. The original MLX entered six pieces of data at a time. If your eye skips a line, the wrong data will mismatch the checksum by 6, or by 12, or by 18 — by 6 times the number of lines you miss. To have the checksum not catch this error, you have to skip 128 lines, and that’s not going to happen. That’s about one and a quarter columns of text and the eye just doesn’t make that mistake. Skimming down a couple lines, yes. Moving to the next column, yes. Next column plus 37 lines? No.

An entire page of lines of hexadecimal code, three columns of 83 lines each with nine sets of two-hexadecimal-digit numbers to enter. Plus the four-digit hexadecimal representation of the memory address for the line. It's a lot of data to enter.
So anyway this is why every kid who was really into their Commodore 64 has a repetitive strain injury today. Page of machine language instructions for SpeedCalc, a spreadsheet program, just like every 13-year-old kid needed.

In the new MLX, one enters eight instructions of code at a time. So skipping a line moves the starting address along by 8 times the number of lines skipped. If the initial checksum were the line’s starting address modulo 256, then we’d only need to skip 32 lines to get the same initial checksum. Thirty-two lines is a bit much to skip, but it’s well under half a column. That’s not too far. And the eye could see 0968 where it means to read 0868. That’s a plausible enough error and one that such a checksum would be helpless against.

So the more complicated, and outright weird, formula that MLX 2.0 uses betters this. Skipping 32 lines (entering the line for 0968 instead of 0868) increases the base checksum by 2. Combined with the subtract-255 rule, you won’t get a duplicate of the checksum for, in most cases, 127 lines. Nobody is going to make that error.
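
The skipped-line arithmetic can be checked by brute force. This Python sketch, with function names of my own invention, counts how many lines you must skip before the starting value of the checksum repeats. It tries the original MLX's address-modulo-256 start with six and with eight bytes per line, and then the MLX 2.0 base. The starting address 0868 is taken from the example above.

```python
def base_mod256(address):
    """The original MLX's starting value: the address modulo 256."""
    return address % 256

def base_mlx2(address):
    """The MLX 2.0 base checksum, following lines 370 and 390."""
    high, low = divmod(address, 256)
    ck = 2 * high - (255 if high > 127 else 0) + low
    return ck - 255 if ck > 255 else ck

def lines_until_repeat(base, start, bytes_per_line):
    """Smallest number of lines to skip before the base value matches again."""
    target = base(start)
    skipped = 1
    while base(start + skipped * bytes_per_line) != target:
        skipped += 1
    return skipped

print(lines_until_repeat(base_mod256, 0x0868, 6))  # original MLX, six bytes a line
print(lines_until_repeat(base_mod256, 0x0868, 8))  # modulo 256, eight bytes a line
print(lines_until_repeat(base_mlx2, 0x0868, 8))    # the MLX 2.0 base
```

Starting from 0868 this prints 128, 32, and 127.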

So this explains the components. Why is the Commodore 64 version of MLX such a tangle of spaghetti code?

Here I have fewer answers. Part must be that Commodore BASIC was prone to creating messes. For example, it did not really have functions, smaller blocks of code with their own, independent, sets of variables. These would let, say, numbers convert from hexadecimal to decimal without interrupting the main flow of the program. Instead you had to jump, either by GOTO or GOSUB, to another part of the program. The Commodore or Apple II BASIC subroutine has to use the same variable names as the main part of the program, so, pick your variables wisely! Or do a bunch of reassigning values before and after the subroutine’s called.

Excerpt from two columns of the BASIC code for the Commodore 128 version of MLX. The first column includes several user-defined functions. The second column uses them as part of calculating the checksum.
And for completeness here’s excerpts from the Commodore 128 version of MLX. The checksum is calculated from lines 310 through 330. The reference to FNHB(AD) calls back to the rare user-defined function. On line 130 the DEF FN commands declare functions named HB, LB, and AD. The two-character codes before the line numbers, such as the SQ before the line 300, were for the new Automatic Proofreader, which did a better job catching common typing errors than the one using :rem (numbers) seen earlier.

To be precise, Commodore BASIC did let one define some functions. This by using the DEF FN command. It could take one number as the input, and return one number as output. The whole definition of the function couldn’t be more than 80 characters long. It couldn’t have a loop. Given these constraints, you can see why user-defined functions went all but unused.

The Commodore version jumps around a lot. Of its 100 lines of code, 68 jump or can jump to somewhere else. The Apple II version has 52 lines of code, 28 of which jump or can jump to another line. That’s just over 50 percent of the lines. I’m not sure how much of this reflects Apple II’s BASIC being better than Commodore’s. Commodore 64 BASIC we can charitably describe as underdeveloped. The Commodore 128 version of MLX is a bit shorter than the 64’s (90 lines of code). I haven’t analyzed it to see how much it jumps around. (But it does have some user-defined functions.)

Not quite a dozen lines of Apple II BASIC, including the lines that represent the checksum calculations for MLX 2.0.
The Apple II version of MLX just trusted you to type everything in right and good luck there. The checksum calculation, lines 560 and 570 here, is placed near the end of the program listing (it ends on line 610), rather than in the early-center.

The most mysterious element, to me, is the defining of some constants like Z2, which is 2, or Z5, which is 255. The Apple version of this doesn’t use these constants. It uses 2 or 255 or such in the checksum calculation. I can rationalize replacing 254 with Z4, or 255 with Z5, or 127 with Z7. The Commodore 64 allowed only 80 characters in a command line. So these values might save only a couple characters, but if they’re needed characters, good. Z2, though, only makes the line longer.

I would have guessed that this reflected experiments. That is, trying out whether one should double the existing sum and add a new number, or triple, or quadruple, or even some more complicated rule. But the Apple II version appeared first, and has the number 2 hard-coded in. This might reflect that Tim Victor, author of the Apple II version, preferred to clean up such details while Ottis R Cowper, writing the Commodore version, did not. Lacking better evidence, I have to credit that to style.

Is this checksum any good?

Whether something is “good” depends on what it is supposed to do. The New MLX, or MLX 2.0, was supposed to make it possible to type in long strings of machine-language code while avoiding errors. So it’s good if it protects against those errors without being burdensome.

It’s a light burden. The person using this types in 18 keystrokes per line. This carries eight machine-language instructions plus one checksum number. So only one-ninth of the keystrokes are overhead, things to check that other work is right. That’s not bad. And it’s better than the original version of MLX, where up to 21 keystrokes gave six instructions. And one-seventh of the keystrokes were the checksum overhead.

The checksum quite effectively guards against entering instructions on a wrong line. To get the same checksum that (say) line 0811 would have you need to jump to line 0C09. In print, that’s another column over and a third of the way down the page. It’s a hard mistake to make.

Entering a wrong number in the instructions — say, typing in 22 where one means 20 — gets caught. The difference gets multiplied by some whole power of two in the checksum. Which power depends on what number’s entered wrong. If the eighth instruction is entered wrong, the checksum is off by that error. If the seventh instruction is wrong, the checksum is off by two times that error. If the sixth instruction is wrong, the checksum is off by four times that error. And so on, so that if the first instruction is wrong, the checksum is off by 128 times that error. And these errors are taken not-quite-modulo 255.
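
That power-of-two weighting can be watched directly. In this Python sketch I take the Q-Bird line at 0811, add 2 to each number in turn (typing 22 for 20, say), and compare the change in the checksum with the predicted 2 times a power of two, reduced modulo 255. The helper names are mine.

```python
def reduce255(value):
    """Subtract 255 once if value is 256 or more."""
    return value - 255 if value > 255 else value

def mlx2_checksum(address, data):
    """MLX 2.0 checksum using the subtract-255 rule."""
    high, low = divmod(address, 256)
    ck = reduce255(reduce255(2 * high) + low)
    for d in data:
        ck = reduce255(reduce255(2 * ck) + d)
    return ck

line = [0xD0, 0x20, 0xCF, 0x14, 0x20, 0x1B, 0x08, 0x4C]  # Q-Bird line 0811
correct = mlx2_checksum(0x0811, line)
error = 2  # e.g. typing 22 where 20 was meant

for position in range(8):
    mistyped = list(line)
    mistyped[position] += error
    shift = (mlx2_checksum(0x0811, mistyped) - correct) % 255
    predicted = (error * 2 ** (7 - position)) % 255
    print(position + 1, shift, predicted)
```

The two columns agree for every position, and none of the shifts is zero, so each of these typos would be caught.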

The only way to enter a single number wrong without the checksum catching it is to type something 255 higher or lower than the correct number. And MLX confines you to entering a two-hexadecimal-digit number, that is, a number from 0 to 255. The only mistake it’s possible to make is to enter 00 where you mean FF, or FF where you mean 00.

What about transpositions? Here, the new MLX checksum shines. Doubling the sum so far and adding a new term to it makes transpositions very likely to be caught. Not all, though. A transposition of the data at position number j and at position number k will go unnoticed only when d_j and d_k happen to make true

\left(2^j - 2^k\right)\cdot\left(d_j - d_k\right) = 0 \mbox{ mod } 255

This doesn’t happen much. It needs d_j and d_k to be 255 apart. Or for \left(2^j - 2^k\right) to share a factor with 255, such as 3 or 15, while d_j - d_k is a multiple of the complementary factor, 85 or 17. I’ll discuss when that happens in the next section.

In practice, this is a great simple checksum formula. It isn’t hard to calculate, it catches most of the likely data-entry mistakes, and it doesn’t require much extra data entry to work.

What flaws did the checksum have?

The biggest flaw the MLX 2.0 checksum scheme has is that it’s helpless to distinguish FF, the number 255, from 00, the number 0. It’s so vulnerable to this that a warning got attached to the MLX listing in every issue of the magazines:

Because of the checksum formula used, MLX won’t notice if you accidentally type FF in place of 00, and vice versa. And there’s a very slim chance that you could garble a line and still end up with a combination of characters that adds up to the proper checksum. However, these mistakes should not occur if you take reasonable care while entering data.

So when can a transposition go wrong? Well, any time you swap a 00 and an FF on a line, however far apart they are. But also if you swap the elements in position j and k, if 2^j - 2^k shares a factor with 255 and d_j - d_k works with you, modulo 255.

For a transposition of adjacent instructions to go wrong — say, the third and the fourth numbers in a line — you need the third and fourth numbers to be 255 apart. That is, entering 00 FF where you mean FF 00 will go undetected. But that’s the only possible case for adjacent instructions.

A transposition past one space — say, swapping the third and the fifth numbers in a line — needs the two to be 85, 170, or 255 away. So, if you were supposed to enter (in hexadecimal) EE A9 44 and you instead entered 44 A9 EE, it would go undetected. That’s the only way a one-space transposition can happen. MLX will catch entering EE A9 45 as 45 A9 EE.

A transposition past two spaces (say, swapping the first and the fourth numbers) will always be caught unless the numbers are 255 apart, that is, a 00 and an FF. A transposition past three spaces (like, swapping the first and the fifth numbers) is vulnerable again. Then if the first and fifth numbers are off by 17 (or a multiple of 17) the swap will go unnoticed. A transposition across four spaces will always be caught unless it’s 00 for FF. A transposition across five spaces (like, swapping the second and eighth numbers) has to also have the two numbers be 85 or 170 or 255 apart to sneak through. And a transposition across six spaces (this has to be swapping the first and last elements in the line) again will be caught unless it’s 00 for FF.
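
All of these cases come from one greatest-common-divisor calculation, which this Python sketch tabulates. For two numbers some distance apart in the line, it lists the smallest difference (and so every multiple of it) that lets a swap slip through. It then tries the EE A9 44 example inside an otherwise made-up line; the filler bytes, the address 0801, and the helper names are my own assumptions.

```python
from math import gcd

def reduce255(value):
    """Subtract 255 once if value is 256 or more."""
    return value - 255 if value > 255 else value

def mlx2_checksum(address, data):
    """MLX 2.0 checksum using the subtract-255 rule."""
    high, low = divmod(address, 256)
    ck = reduce255(reduce255(2 * high) + low)
    for d in data:
        ck = reduce255(reduce255(2 * ck) + d)
    return ck

# For two numbers `gap` positions apart, a swap goes unnoticed when their
# difference is a multiple of 255 / gcd(2**gap - 1, 255).
risky_step = {}
for gap in range(1, 8):
    risky_step[gap - 1] = 255 // gcd(2 ** gap - 1, 255)
    print(f"{gap - 1} numbers in between: multiples of {risky_step[gap - 1]}")

head, tail = [0x20, 0xD2], [0x60, 0x00, 0x00]   # made-up filler bytes

def ck(three):
    return mlx2_checksum(0x0801, head + three + tail)

print(ck([0xEE, 0xA9, 0x44]) == ck([0x44, 0xA9, 0xEE]))  # True: swap missed
print(ck([0xEE, 0xA9, 0x45]) == ck([0x45, 0xA9, 0xEE]))  # False: swap caught
```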

Front cover of the June 1985 issue of Compute!, with the feature article being Apple Speedscript, a 'powerful word processor' inside. The art is a watercolor picture of a man in Apple T-shirt riding a bicycle. Behind him is a Commodore 128 floating in midair, and in front of him is a hand holding a flip-book animation.
So if you weren’t there in the 80s? This is pretty much what it was like. Well-toned men with regrettable moustaches pedaling their bikes while eight-bit computers exploded out of the void behind them and giants played with flip books in front of them.

Listing all the possible exceptions like this makes it sound dire. It’s not. The most likely transposition someone is going to make is swapping the order of two neighboring elements. That’s caught unless one of the numbers is FF and the other 00. If the transposition swaps non-neighboring numbers there’s a handful of new cases that might slip through. But you can estimate how often two numbers separated by one or three or five spaces are also different by 85 or 34 or another dangerous combination. (That estimate would suppose that every number from 0 to 255 is equally likely. They’re not, though, because popular machine language instruction codes such as A9 or 20 will be over-represented. So will references to important parts of computer memory such as, on the Commodore, FFD2.)

You will forgive me for not listing all the possible cases where competing typos in entering numbers will cancel out. I don’t want to figure them out either. I will go along with the magazines’ own assessment that there’s a “very slim chance” one could garble the line and get something that passes, though. After all, there are 18,446,744,073,709,551,616 conceivable lines of code one might type in, and only 255 possible checksums. Some garbled lines must match the correct checksum.

Could the checksum have been better?

The checksum could have been different. This is a trivial conclusion. “Better”? That demands thought. A good error-detection scheme needs to catch errors that are common or that are particularly dangerous. It should add as little overhead as possible.

The MLX checksum as it is catches many of the most common errors. A single entry mis-keyed, for example, except for the case of swapping 00 and FF. Or transposing one number for the one next to it. It even catches most transpositions with spaces between the transposed numbers. It catches almost all cases where one enters the entirely wrong line. And it does this for only two more keystrokes per eight pieces of data entered. That’s doing well.

The obvious gap is the inability to distinguish 00 from FF. There’s a cure for that, of course. Count the number of 00’s — or the number of FF’s — in a line, and include that as part of the checksum. It wouldn’t be particularly hard to enter (going back to the Q-Bird example)

0801:0B 08 00 00 9E 32 30 36 EC 2
0809:31 00 00 00 A9 00 8D 20 3A 4
0811:D0 20 CF 14 20 1B 08 4C 96 0
0819:C7 0B A9 93 20 D2 FF A9 34 0

(Or if you prefer, to have the extra checksums be 0 0 0 1.)
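
Computing that extra digit would be trivial. A Python sketch, using the four Q-Bird lines as data (the dictionary layout is just my convenience):

```python
QBIRD = {
    0x0801: [0x0B, 0x08, 0x00, 0x00, 0x9E, 0x32, 0x30, 0x36],
    0x0809: [0x31, 0x00, 0x00, 0x00, 0xA9, 0x00, 0x8D, 0x20],
    0x0811: [0xD0, 0x20, 0xCF, 0x14, 0x20, 0x1B, 0x08, 0x4C],
    0x0819: [0xC7, 0x0B, 0xA9, 0x93, 0x20, 0xD2, 0xFF, 0xA9],
}

# Print each line's address, its count of 00 bytes, and its count of FF bytes.
for address, data in QBIRD.items():
    print(f"{address:04X}", data.count(0x00), data.count(0xFF))
```

The 00 counts come out 2, 4, 0, 0 and the FF counts 0, 0, 0, 1. Either count changes when a 00 is typed for an FF, which is all this extra digit needs to do.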

This adds to the overhead, yes, one more keystroke in what is already a good bit of typing. And one may ask whether you’re likely to ever touch 00 when you mean FF. The keys aren’t near one another. Then you learn that MLX soon got a patch which made keying much easier. They did this by making the characters in the rows under 7 8 9 0 type in digits. And the mapping used (on the Commodore 64) put the key to enter F right next to the key to enter 0.

The page of boilerplate text explaining MLX after it became a part of nearly every issue. In the rightmost column a chart explains how the program translates keys so that, for example, U, I, and O are read as the numbers 4, 5, and 6, to make a hexadecimal keypad for faster entry.
The last important revision of MLX made a data-entry keypad out of, for the Commodore 64, some of the letters on the keyboard. For the Commodore 128, it made a data-entry keypad out of … the keypad, but fitting in the hexadecimal numbers A, B, C, D, E, and F took some thought. But the 64 version still managed to put F and 0 next to each other, making it possible to enter FF where you meant 00 or vice-versa.

If you get ambitious, you might attempt even cleverer schemes. Suppose you want to catch those off-by-85 or off-by-17 differences that let transpositions go undetected. Why not, say, copy the last bits of each of your eight data, and use that to assemble a new checksum number? So, for example, in line 0801 up there the last bit of each number was 1-0-0-0-0-0-0-0 which is boring, but gives us 128, hexadecimal 80, as a second checksum. Line 0809 has eighth bits 1-0-0-0-1-0-1-0, or 138 (hex 8A). And so on; so we could have:

0801:0B 08 00 00 9E 32 30 36 EC 2 80
0809:31 00 00 00 A9 00 8D 20 3A 4 8A
0811:D0 20 CF 14 20 1B 08 4C 96 0 24
0819:C7 0B A9 93 20 D2 FF A9 34 0 F3

Now, though? We’ve got five keystrokes of overhead to sixteen keystrokes of data. Getting a bit bloated. It could be cleaned up a little; the single-digit count of 00’s (or FF’s) is redundant to the two-digit number formed from the cross-section I did there.
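
For the record, here is how that sampled-bit number could be built, as a Python sketch. The function name sampled_bits is mine, and the scheme itself is only my proposal, nothing MLX ever did.

```python
def sampled_bits(data):
    """Pack the last bit of each of eight bytes into one byte, first byte leftmost."""
    value = 0
    for d in data:
        value = (value << 1) | (d & 1)
    return value

QBIRD = {
    0x0801: [0x0B, 0x08, 0x00, 0x00, 0x9E, 0x32, 0x30, 0x36],
    0x0809: [0x31, 0x00, 0x00, 0x00, 0xA9, 0x00, 0x8D, 0x20],
    0x0811: [0xD0, 0x20, 0xCF, 0x14, 0x20, 0x1B, 0x08, 0x4C],
    0x0819: [0xC7, 0x0B, 0xA9, 0x93, 0x20, 0xD2, 0xFF, 0xA9],
}

for address, data in QBIRD.items():
    print(f"{address:04X} {sampled_bits(data):02X}")
```

For these four lines the sampled values are 80, 8A, 24, and F3.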

And if we were working in a modern programming language we could reduce the MLX checksum and this sampled-digit checksum to a single number. Use the bitwise exclusive-or of the two numbers as the new, ‘mixed’ checksum. Exclusive-or the sampled-digit with the mixed checksum and you get back the classic MLX checksum. You get two checksums in the space of one. In the program you’d build the sampled-digit checksum, and exclusive-or it with the mixed checksum, and get back what should be the MLX checksum. Or take the mixed checksum and exclusive-or it with the MLX checksum, and you get the sampled-digit checksum.
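
The exclusive-or round trip looks like this in Python, using line 0801's printed MLX checksum (EC) and its sampled-bit value (80). The variable names are mine, and the mixed checksum is a hypothetical scheme, not something MLX used.

```python
mlx = 0xEC       # MLX checksum of Q-Bird line 0801, from the listing
sampled = 0x80   # sampled-bit checksum for the same line

mixed = mlx ^ sampled                 # the single number that would be printed
print(f"mixed: {mixed:02X}")

recovered_mlx = mixed ^ sampled       # what the program would compare against
recovered_sampled = mixed ^ mlx
print(f"{recovered_mlx:02X} {recovered_sampled:02X}")
```

The mixed value here is 6C, and exclusive-or with either original number returns the other.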

This almost magic move has two problems. This sampled digit checksum could catch transpositions that are off by 85 or 17. It won’t catch transpositions off by 170 or by 34, though, just as deadly. It will catch transpositions off by odd multiples of 17, at least. You would catch transpositions off by 170 or by 34 if you sampled the seventh digit, at least. Or if you build a sample based on the fifth or the third digit. But then you won’t catch transpositions off by 85 or by 17. You can add new sampled checksums. This threatens us again with putting in too many check digits for actual data entry.

The other problem is worse: Commodore 64 BASIC did not have a bitwise exclusive-or command. I was shocked, and I was more shocked to learn that Applesoft BASIC also lacked an exclusive-or. The Commodore 128 had exclusive-or, at least. But given that lack, and the inability to add an exclusive-or function that wouldn’t be infuriating? I can’t blame anyone for not trying.

So there is my verdict. There are some obvious enough ways that MLX’s checksum might have been able to catch more errors. But, given the constraints of the computers it was running on? A more sensitive error check likely would not have been available. Not without demanding much more typing. And, as another practical matter, demanding the program listings in the magazine be smaller and harder to read. The New MLX did, overall, a quite good job catching errors without requiring too much extra typing. We’ll probably never see its like again.

How Did Compute!’s Gazette’s MLX Program Work?

This is, at least, a retrocomputing-adjacent piece. I’m looking back at the logic of a common and useful tool from the early-to-mid-80s and why it’s built that way. I hope you enjoy. It has to deal with some of the fussier points about how Commodore 64 computers worked. If you find a paragraph is too much technical fussing for you, I ask you to not give up, just zip on to the next paragraph. It’s interesting to know why something was written that way, but it’s all right to accept that it was and move to the next point.

How Did You Get Computer Programs In The 80s?

When the world and I were young, in the 1980s, we still had computers. There were two ways to get software, though. One was trading cassette tapes or floppy disks with cracked programs on them. (The cracking was taking off the copy-protection.) The other was typing. You could type in your own programs, certainly, just like you can make your own web page just by typing. Or you could type in a program. We had many magazines and books that had programs ready for entry. Some were serious programs, spreadsheets and word processors and such. Some were fun, like games or fractal-generators or such. Some were in-between, programs to draw or compose music or the such. Some added graphics or sound commands that the built-in BASIC programming language lacked. All this was available for the $2.95 cover price, or ten cents a page at the library photocopier. I had a Commodore 64 for most of this era, moving to a Commodore 128 (which also ran Commodore 64 programs) in 1989 or so. So my impressions, and this article, default to the Commodore 64 experience.

These programs all had the same weakness. You had to type them in. You can expect to make errors. If the program was written in BASIC you had a hope of spotting errors. The BASIC programming language uses common English words for its commands. Their grammar is not English, but it’s also very formulaic, and not hard to pick up. One has a chance of spotting mistakes if it’s 250 PIRNT "SUM; " S one typed.

But many programs were distributed as machine language. That is, the actual specific numbers that correspond to microchip instructions. For the Commodore 64, and most of the eight-bit home computers of the era, this was the 6502 microchip. (The 64 used a variation, the 6510. The differences between the 6502 and 6510 don’t matter for this essay.) Machine language had advantages, making the programs run faster, and usually able to do more things than BASIC could. But a string of numbers is only barely human-readable. Oh, you might in time learn to recognize the valid microchip instructions. But it is much harder to spot the mistakes on entering 32 255 120. That last would be a valid command on any eight-bit Commodore computer. It would have the computer print something, if it weren’t for the transposition errors.

The cover of the December 1983 Compute! magazine. The right two-thirds is a cartoony illustration of a superhero flying out of a city telephone booth; he wears a '64' on his chest. In the background flying saucers flying the Atari and the Commodore logos approach town. In an inset bubble a man wearing a tie looks surprised at a small piece of paper.
The December 1983 issue of Compute!. On page 216, program editor Charles Brannon introduced MLX, subject of this post.

What Was MLX and How Did You Use It?

The magazines came up with tools to handle this. In the 398-page(!) December 1983 issue of Compute!, my favorite line of magazines introduced MLX. This was a program, written in BASIC, which let you enter machine language programs. Charles Brannon has the credit for writing the article which introduced it. I assume he also wrote the program, but could be mistaken. I’m open to better information. Other magazines had other programs to do the same work; I knew them less well. MLX formatted machine language programs to look like this:

49152 :169,002,141,178,002,169,149
49158 :000,141,179,002,141,180,137
49164 :002,141,181,002,169,001,252
49170 :141,183,002,169,003,141,145

These were the first few lines of code for a game called Turnabout, by Mark Tuttle and Kevin Mykytyn. It ran in issue #28 of Commodore-computers-spinoff magazine Compute!’s Gazette. You can look up the original and imagine the poor wrists of people typing all this in.

What did all this mean, though? These were lines you would enter in while running MLX. Before the colon was a location in memory. The numbers after the colon — the entries, I’ll call them — are six machine language instructions, one number to go into each memory cell. So, the number 169 was destined to go into memory location 49152. The number 002 would go into memory location 49153. The number 141 would go into memory location 49154. And so on; 000 would go into memory location 49158, 141 into 49159, 179 into 49160. 002 would go into memory location 49164; 141 would go into memory location 49170. And so on.
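To make the bookkeeping concrete, here is a small Python sketch of where each entry lands. A dictionary stands in for the computer's memory, and the function name is my own invention for this illustration.

```python
def place_line(memory, address, entries):
    """Store each entry in consecutive memory cells, starting at address."""
    for offset, value in enumerate(entries):
        memory[address + offset] = value

# A dictionary stands in for the 64's memory in this sketch.
memory = {}
place_line(memory, 49152, [169, 2, 141, 178, 2, 169])
place_line(memory, 49158, [0, 141, 179, 2, 141, 180])

print(memory[49153])  # the second entry of the first line
print(memory[49160])  # the third entry of the second line
```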

MLX would prompt you with the line number, the 49152 or 49158 or 49164 or so on. Machine language programs could go into almost any memory location. You had to tell it where to start. 49152 was a popular location for Commodore 64 programs. It was the start of a nice block of memory not easily accessed except by machine language programs. Then you would type in the entries, the numbers that follow. This was a reasonably efficient way to key this stuff in. MLX automatically advanced the location in memory and would handle things like saving the program to tape or disk when you were done.

The alert reader notices, though, that there are seven numbers after the colon in each line. That seventh number is the checksum. It’s the guard that Compute! and Compute!’s Gazette put against typos. MLX did a calculation based on the memory location and the first six numbers of the line. If the result did not match the seventh number on the line, then there was an error somewhere. You had to re-enter the line to get it right.

The thing I’d wondered, and finally got curious enough to explore, was how it calculated this.

What Was The Checksum And How Did It Work?

Happily, Compute! and Compute!’s Gazette published MLX in almost every issue, so it’s easy to find. You can see it, for example, on page 123 of the October 1985 issue of Compute!’s Gazette. And MLX was itself a BASIC program. There are quirks of the language, and its representation in magazine print, that take time to get used to. But one can parse it without needing much expertise. One important thing is that most Commodore BASIC commands didn’t need spaces after them. For an often-used program like this they’d skip the spaces. And the : symbol denoted the end of one command and start of another. So, for example, one learns that PRINTCHR$(20):IFN=CKSUMTHEN530 means PRINT CHR$(20) : IF N = CKSUM THEN 530.

So how does it work? MLX is, as a program, convoluted. It’s well-described by the old term “spaghetti code”. But the actual calculation of the checksum is done in a single line of the program, albeit one with several instructions. I’ll print it, but with some spaces added in to make it easier to read.

Several lines of the BASIC program for MLX, including the one key line, 500, on which the checksum calculation was done.
The heart of the code! Line 500 is the checksum calculation. You can see on line 520 what happens if the checksum calculation disagrees with the last number on the line you key in.
500 CKSUM = AD - INT(AD/256)*256:
    FOR I = 1 TO 6:
       CKSUM = (CKSUM + A(I)) AND 255:
    NEXT

Most of this you have a chance of understanding even if you don’t program. CKSUM is the checksum number. AD is the memory address for the start of the line. A is an array of six numbers, the six numbers of that line of machine language. I is an index, a number that ranges from 1 to 6 here. Each A(I) happens to be a number between 0 and 255 inclusive, because that’s the range of integers you can represent with eight bits.

What Did This Code Mean?

So to decipher all this. Starting off. CKSUM = AD - INT(AD/256)*256. INT means “calculate the largest integer not greater than whatever’s inside”. So, like, INT(50/256) would be 0; INT(300/256) would be 1; INT(600/256) would be 2. What we start with, then, is the checksum is “the remainder after dividing the line’s starting address by 256”. We’re familiar with this, mathematically, as “address modulo 256”.
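If you'd like to see that workaround agree with a true modulo, here is a quick Python sketch. The function name is mine; math.floor plays the role of BASIC's INT.

```python
import math

def basic_remainder(ad):
    """Mimic AD - INT(AD/256)*256 from line 500."""
    return ad - math.floor(ad / 256) * 256

# Columns: address, INT(address/256), the BASIC-style remainder, Python's own modulo.
for ad in (50, 300, 600, 49152, 49158):
    print(ad, math.floor(ad / 256), basic_remainder(ad), ad % 256)
```

The last two columns match for every address a Commodore 64 can have, 0 through 65,535.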

In any modern programming language, we’d write this as CKSUM = MOD(AD, 256) or CKSUM = AD % 256. But Commodore 64 BASIC didn’t have a modulo command. This structure was the familiar and comfortable enough workaround. But, read on.

The next bit was a for/next loop. This would do the steps inside for every integer value of I, starting at 1 and increasing to 6. CKSUM + A(I) has an obvious enough intention. What is the AND 255 part doing, though?

AND, here, is a logic operator. For the Commodore 64, it works on numbers represented as two-byte integers. These have a memory representation of 11111111 11111111 for ‘true’, and 00000000 00000000 for ‘false’. The very leftmost bit, for integers, is a plus-or-minus-sign. If that leftmost bit is a 1, the number is negative; if that leftmost bit is a 0, the number is positive. Did you notice me palming that card, there? We’ll come back to that.

Ordinary whole numbers can be represented in binary too. Like, the number 26 has a binary representation of 00000000 00011010. The number, say, 14 has a binary representation of 00000000 00001110. 26 AND 14 is the number 00000000 00001010, the binary digit being a 1 only when both the first and second numbers have a 1 in that column. This bitwise and operation is also sometimes referred to as masking, as in masking tape. The zeroes in the binary digits of one number mask out the binary digits of the other. (Which does the masking is a matter of taste; 26 AND 14 is the same number as 14 AND 26.)

The binary 00000000 00001010 is the decimal number 10. So you can see that generally these bitwise and operations give you weird results. Taking the bitwise and for 255 is more predictable, though. The number 255 has a bit representation of 00000000 11111111. So what (CKSUM + A(I)) AND 255 does is … give the remainder after dividing (CKSUM + A(I)) by 256. That is, it’s (CKSUM + A(I)) modulo 256.
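The masking is easy to watch happen in Python, where & is the bitwise and. This is only a sketch; the helper name bits is my own.

```python
def bits(n):
    """Show n as sixteen binary digits in two groups of eight."""
    s = format(n, "016b")
    return s[:8] + " " + s[8:]

print(bits(26))
print(bits(14))
print(bits(26 & 14), "=", 26 & 14)

# Masking with 255 keeps only the low eight bits, which is the remainder mod 256.
for n in (10, 255, 256, 661, 1785):
    assert n & 255 == n % 256
```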

The formula’s not complicated. To write it in mathematical terms, the calculation is:

\mathrm{ck} = \left(\mathrm{addr} + \sum_{i = 1}^{6} a_i\right) \bmod 256
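As a check on the formula, here is a short Python sketch that follows line 500 step by step and runs it against the four Turnabout lines quoted earlier. The function and variable names are mine.

```python
def mlx_checksum(address, entries):
    """Follow line 500: start from the address mod 256, then add and mask each entry."""
    cksum = address % 256
    for value in entries:
        cksum = (cksum + value) & 255
    return cksum

# (address, six entries, printed checksum) from the Turnabout listing.
listing = [
    (49152, [169, 2, 141, 178, 2, 169], 149),
    (49158, [0, 141, 179, 2, 141, 180], 137),
    (49164, [2, 141, 181, 2, 169, 1], 252),
    (49170, [141, 183, 2, 169, 3, 141], 145),
]

for address, entries, printed in listing:
    assert mlx_checksum(address, entries) == printed
```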

Why Write It Like That?

So we have a question. Why are we calculating a number modulo 256 by two different processes? And in the same line of the program?

We get an answer by looking at the binary representation of 49152, which is 11000000 00000000. Remember that card I just palmed? I had warned that if the leftmost digit there were a 1, the number was understood to be negative. 49152 is many things, none of them negative.

So now we know the reason behind the odd programming choice to do the same thing two different ways. As with many odd programming choices it amounts to technical details of how Commodore hardware worked. The Commodore 64’s logical operators — AND, OR, and NOT — work on variables stored as two-byte integers. Two-byte integers can represent numbers from -32,768 up to +32,767. But memory addresses on the Commodore 64 are indexed from 0 up to 65,535. We can’t use bit masking to do the modulo operation, not on memory locations.
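Here is a rough Python model of that restriction. It is only a sketch of the behavior described above, not a faithful emulation of the BASIC interpreter; the function names and the error text in the exception are my own stand-ins.

```python
def c64_and(x, y):
    """Rough model of Commodore BASIC's AND: both operands must fit in a signed two-byte integer."""
    for operand in (x, y):
        if not -32768 <= operand <= 32767:
            raise ValueError("illegal quantity: %d" % operand)
    return x & y

def line_start(address):
    """The arithmetic workaround, which is safe for any address from 0 to 65535."""
    return address - (address // 256) * 256

print(c64_and(661, 255))   # a running checksum total is small enough to mask
print(line_start(49152))   # the address has to go the long way around

try:
    c64_and(49152, 255)
except ValueError as err:
    print("masking the address fails:", err)
```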

I have a second question, though. Look at the work inside the FOR loop. It takes the current value of the checksum, adds one of the entries to it, and takes the bitwise AND of that with 255. Why? The value would be the same if we waited until the loop was done to take the bitwise AND. At least, it would be unless the checksum grew to larger than 32,767. The checksum will be the sum of at most seven numbers, none of them larger than 255, though, so that can’t be the constraint. It’s usually faster to do as little inside a loop as possible, so, why this extravagance?
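A quick Python sketch can confirm that masking inside the loop and masking once at the end give the same answer. The function names are mine. The largest the running total can get is 7 × 255 = 1785, nowhere near 32,767.

```python
def masked_each_step(address, entries):
    """Mask after every addition, as line 500 does."""
    cksum = address % 256
    for value in entries:
        cksum = (cksum + value) & 255
    return cksum

def masked_once(address, entries):
    """Add everything up first, then mask a single time."""
    return (address % 256 + sum(entries)) & 255

# Try the extremes and one ordinary line.
cases = [
    (0, [0] * 6),
    (255, [255] * 6),
    (65535, [255] * 6),
    (49152, [169, 2, 141, 178, 2, 169]),
]
for address, entries in cases:
    assert masked_each_step(address, entries) == masked_once(address, entries)

# The biggest possible unmasked total.
print(255 % 256 + sum([255] * 6))
```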

My first observation is that this FOR loop does the commands inside it six times. And logical operations like AND are very fast. The speed difference could not possibly be perceived. There is a point where optimizing your code is just making life harder for yourself.

My second observation goes back to the quirks of the Commodore 64. You entered commands, like the lines of a BASIC program, on a “logical line” that allowed up to eighty tokens. For typing in commands this is the same as the number of characters. Can this line be rewritten so there’s no redundant code inside the for loop, and so it’s all under 80 characters long?

Yes. This line would have the same effect and it’s only 78 characters:

500 CKSUM=AD-INT(AD/256)*256:FORI=1TO6:CKSUM=CKSUM+A(I):NEXT:CKSUM=CKSUMAND255

Why not use that, then?

I don’t have a clear answer. I suspect it’s for the benefit of people typing in the MLX program. In typing that in I’d have trouble not putting in a space between FOR and I, or between CKSUM and AND. Also before and after the TO and before and after AND. This would make the line run over 80 characters and make it crash. The original line is 68 characters, short enough that anyone could add a space here and there and not mess up anything. In looking through MLX, and other programs, I find there are relatively few lines more than 70 characters long. I have found them as long as 76 characters, though. I can’t rule out there being 78- or 79-character lines. They would have to suppose anyone typing them in understands when the line is too long.

There’s an interesting bit of support for this. Compute! also published machine language programs for the Atari 400 and 800. A version of MLX came out for the Atari at the same time the Commodore 64’s came out. Atari BASIC allowed for 120 characters total. And the equivalent line in Atari MLX was:


This has a longer name for the address variable. It uses a different way to ensure that CKSUM stays a number between 0 and 255. But the whole line is only 98 characters.

We could save more spaces on the Commodore 64 version, though. Commodore BASIC “really” used only the first two characters of a variable name. To write CKSUM is for the convenience of the programmer. To the computer it would be the same if we wrote CK. We could even truncate it to CK for this one line of code. The only penalty would be confusing the reader who doesn’t remember that CK and CKSUM are the same variable.

And there’s no reason that this couldn’t have been two lines. One line could add up the checksum and a second could do the bitwise AND. Maybe this is all a matter of the programmer’s tastes.

In a modern language this is all quite zippy to code. To write it in Octave or Matlab is something like:

function [checksOut] = oldmlx(oneline)
  % oneline = [address, six entries, keyed-in checksum]
  address = oneline(1);
  entries = oneline(2:7);
  keyedChecksum = oneline(8);
  checksum = mod(address, 256);
  checksum = checksum + sum(entries);  % "+=" works in Octave but not Matlab
  checksum = mod(checksum, 256);
  checksOut = (keyedChecksum == checksum);
end

This is a bit verbose. I want it to be easier to see what work is being done. We could make it this compact:

function [checksOut] = oldmlx(oneline)
   checksOut = ~(mod(sum(oneline(1:7))-oneline(8), 256));
end

I don’t like compressing my thinking quite that much, though.

But that’s the checksum. Now the question: did it work?

Was This Checksum Any Good?

Since Compute! and Compute!’s Gazette used it for years, the presumptive answer is that it did. The real question, then, is did it work well? “Well” means does it prevent the kinds of mistakes you’re likely to make without demanding too much extra work. We could, for example, eliminate nearly all errors by demanding every line be entered three times and accept only a number that’s entered the same at least two of three times. That’s an incredible typing load. Here? We have to enter one extra number for every six. Much lower load, but it allows more errors through. But the calculation is — effectively — simply “add together all the numbers we typed in, and see if that adds to the expected total”. If it stops the most likely errors, though, then it’s good. So let’s consider them.

The first and simplest error? Entering the wrong line. MLX advanced the memory location on its own. So if you intend to write the line for memory location 50268, and your eye slips and you start entering that for 50274 instead? Or even, reading left to right, going to line 50814 in the next column? Very easy to do. This checksum will detect that nicely, though. Entering one line too soon, or too late, will give a checksum that’s off by 6. If your eye skips two lines, the checksum will be off by 12. The only way to have the checksum miss this is to enter a line that’s some multiple of 256 memory locations away. And since each line is six memory locations, that means you have to jump 768 memory locations away. That is 128 lines away. You are not going to make that mistake. (Going from one column in the magazine to the next is a jump of 91 lines. The pages were 8½-by-11 pages, so were a bit easier to read than the image makes them look.)
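The arithmetic there is quick to confirm. This Python sketch, with function names of my own, works out how far the checksum moves when you skip some number of lines, and finds the smallest skip that moves it not at all.

```python
def checksum_shift(lines_skipped, line_length=6, modulus=256):
    """How far the address part of the checksum moves after skipping some lines."""
    return (lines_skipped * line_length) % modulus

def first_undetected_skip(line_length=6, modulus=256):
    """Smallest number of skipped lines that leaves the checksum unchanged."""
    lines = 1
    while checksum_shift(lines, line_length, modulus) != 0:
        lines += 1
    return lines

print(checksum_shift(1))    # one line off
print(checksum_shift(2))    # two lines off
print(checksum_shift(91))   # a whole magazine column off
print(first_undetected_skip())
```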

Two pages of Compute!'s Gazette, showing six columns of 91 lines each of type-in programming. Each line is a string of numbers like 49152 :169,002,141,178,002,169,149 and it's all hypnotic or dizzying to read.
This was what fun looked like on a computer in 1985. So you see why Napster took us all by storm.

How about other errors? You could mis-key, say, 169. But think of the plausible errors. Typing it in as 159 or 196 or 269 would be detected by the checksum. The only one that wouldn’t would be to enter a number that’s equal to 169, modulo 256. So, 425, say, or 681. There is nobody so careless as to read 169 and accidentally type 425, though. In any case, other code in MLX rejects any data that’s not between 0 and 255, so that’s caught before the checksum comes into play.
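A brute-force Python sketch bears this out. It tries every possible wrong value, from 0 to 255, in every position of one line and counts how many slip past the checksum. The function names are mine, and the line tested is the first one of the Turnabout listing.

```python
def mlx_checksum(address, entries):
    """Address plus the six entries, modulo 256."""
    return (address % 256 + sum(entries)) % 256

def single_miskeys_missed(address, entries):
    """Count one-entry substitutions (staying within 0..255) that keep the checksum."""
    target = mlx_checksum(address, entries)
    missed = 0
    for position in range(len(entries)):
        for wrong in range(256):
            if wrong == entries[position]:
                continue
            trial = list(entries)
            trial[position] = wrong
            if mlx_checksum(address, trial) == target:
                missed += 1
    return missed

print(single_miskeys_missed(49152, [169, 2, 141, 178, 2, 169]))
```

The count comes out zero because two different values between 0 and 255 can never differ by a multiple of 256.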

So it’s safe against the most obvious mistake. And against mis-keying a single entry. Yes, it’s possible that you typed in the whole line right but mis-keyed the checksum. If you did that you felt dumb but re-entered the line. If you even noticed and didn’t just accept the error report and start re-entering the line.

What about mis-keying double entries? And here we have trouble. Suppose that you’re supposed to enter 169, 062 and instead enter 159, 072. They’ll add to the same quantity, and the same checksum. All that’s protecting you is that it takes a bit of luck to make two errors that exactly balance each other. But, then, slipping and hitting an adjacent number on the keyboard is an easy mistake to make.

Worse is entry transposition. If you enter 062, 169 instead you have made no checksum errors. And you won’t even be typing any number “wrong”. At least with the mis-keying you might notice that 169 is a common number and 159 a rare one in machine language. (169 was the command “Load Accumulator”. That is, copy a number into the Central Processing Unit’s accumulator. This was one of three on-chip memory slots. 159 was no meaningful command. It would only appear as data.) Swapping two numbers is another easy error to make.
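Both failures are easy to reproduce. In this Python sketch the line of entries is a made-up example built around the 169, 062 pair above, not a line from any real program, and the function name is mine.

```python
def mlx_checksum(address, entries):
    """Address plus the six entries, modulo 256."""
    return (address % 256 + sum(entries)) % 256

address = 49152
intended = [169, 62, 141, 178, 2, 169]       # hypothetical line, for illustration
balanced_typos = [159, 72, 141, 178, 2, 169]  # one entry down ten, the next up ten
transposed = [62, 169, 141, 178, 2, 169]      # first two entries swapped
one_typo = [159, 62, 141, 178, 2, 169]        # a single mis-keyed entry

good = mlx_checksum(address, intended)
print(mlx_checksum(address, balanced_typos) == good)  # slips through
print(mlx_checksum(address, transposed) == good)      # slips through
print(mlx_checksum(address, one_typo) == good)        # caught
```

Addition doesn't care what order its terms come in, so no checksum built only from a plain sum can notice two entries trading places.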

And they would happen. I can attest from experience. I’d had at least one program which, after typing, had one of these glitches. After all the time spent entering it, I ended up with a program that didn’t work. And I never had the heart to go back and track down the glitch or, more efficiently, retype the whole thing from scratch.

The irony is that the program with the critical typing errors was a machine language compiler. It’s something that would have let me write this sort of machine language code. Since I never reentered it, I never created anything but the most trivial of machine language programs for the 64.

So this MLX checksum was fair. It devoted one-seventh of the typing to error detection. It could catch line-swap errors, single-entry mis-keyings, and transpositions within one entry. It couldn’t catch transposing two entries. So that could have been better. I hope to address that soon.