Is this mathematics thing ambiguous or confusing?

There is an excellent chance it is! Mathematicians sometimes assert the object of their study is a universal truth, independent of all human culture. It may be. But the expression of that interest depends on the humans expressing it. And as with all human activities it picks up quirks. Patterns that don’t seem to make sense. Or that seem to conflict with other patterns. It’s not two days ago I most recently saw someone cross that 0 times anything is 0, but 0! is 1.

Mathematicians are not all of one mind. They notice different things that seem important and want to focus on that. They use ways that make sense to their culture. When they create new notation, or new definitions, they use the old ones to guide them. When a topic’s interesting enough for many people to notice, they bring many trails of notation to describe it. Usually a consensus emerges, that there are some notations that work well to describe these concepts, and the others fall away. But it’s difficult to get complete consistency. Particularly when there are several major fields that don’t need to interact much, but do have some overlap.

Christian Lawson-Perfect has started something that might be helpful for understanding this. WhyStartAt.xyz is to be a collection of “ambiguous, inconsistent, or just plain unpleasant conventions in mathematical notation”. There’s four major categories already: inconsistencies, ambiguities, unpleasantness, and conflicting definitions. And there’s a set of references useful for anyone curious why something is a convention. (Nobody knows why we use ‘m’ for the slope in the slope-intercept or point-slope equations describing a line. Sometimes a convention is arbitrary.) It’s already great reading, though, not just for this line from our friend Thomas Hobbes.

Reading the Comics, August 26, 2017: Dragon Edition

It’s another week where everything I have to talk about comes from GoComics.com. So, no pictures. The Comics Kingdom and the Creators.com strips are harder for non-subscribers to read so I feel better including those pictures. There’s not an overarching theme that I can fit to this week’s strips either, so I’m going to name it for the one that was most visually interesting to me.

Charlie Podrebarac’s CowTown for the 22nd I just knew was a rerun. It turned up the 26th of August, 2015. Back then I described it as also “every graduate students’ thesis defense anxiety dream”. Now I wonder if I have the possessive apostrophe in the right place there. On reflection, if I have “every” there, then “graduate student” has to be singular. If I dropped the “every” then I could talk about “graduate students” in the plural and be sensible. I guess that’s all for a different blog to answer.

Mike Thompson’s Grand Avenue for the 22nd threatened to get me all cranky again, as Grandmom decided the kids needed to do arithmetic worksheets over the summer. The strip earned bad attention from me a few years ago when a week, maybe more, of the strip was focused on making sure the kids drudged their way through times tables. I grant it’s a true attitude that some people figure what kids need is to do a lot of arithmetic problems so they get better at arithmetic problems. But it’s hard enough to convince someone that arithmetic problems are worth doing, and to make them chores isn’t helping.

John Zakour and Scott Roberts’s Maria’s Day for the 22nd name-drops fractions as a worse challenge than dragon-slaying. I’m including it here for the cool partial picture of the fire-breathing dragon. Also I take a skeptical view of the value of slaying the dragons anyway. Have they given enough time for sanctions to work?

Maria’s Day pops back in the 24th. Needs more dragon-slaying.

Eric the Circle for the 24th, this one by Dennill, gets in here by throwing some casual talk about arcs around. That and π. The given formula looks like nonsense to me. $\frac{\pi}{180}\cdot 94 - \sin 94^\circ$ has parts that make sense. The first part will tell you what radian measure corresponds to 94 degrees, and that’s fine. Mathematicians will tend to look for radian measures rather than degrees for serious work. The sine of 94 degrees they might want to know. Subtracting the two? I don’t see the point. I dare to say this might be a bunch of silliness.
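For anyone who wants to see the numbers, here is a minimal Python sketch that evaluates the two parts of that formula separately and then subtracts them. The variable names are made up for illustration; nothing here comes from the strip beyond the 94 degrees.

```python
import math

angle_degrees = 94  # the angle quoted in the strip

# First part: convert degrees to radians, (pi / 180) * 94.
angle_radians = math.pi / 180 * angle_degrees

# Second part: the sine of 94 degrees (math.sin expects radians).
sine_value = math.sin(angle_radians)

difference = angle_radians - sine_value

print(f"radian measure: {angle_radians:.4f}")
print(f"sine:           {sine_value:.4f}")
print(f"difference:     {difference:.4f}")
```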

Cathy Law’s Claw for the 25th writes off another Powerball lottery loss as being bad at math and how it’s like algebra. Seeing algebra in lottery tickets is a kind of badness at mathematics, yes. It’s probability, after all. Merely playing can be defended mathematically, though, at least for the extremely large jackpots such as the Powerball had last week. If the payout is around 750 million dollars (as it was) and the chance of winning is about one in 250 million (close enough to true), then the expectation value of playing a ticket is about three dollars. If the ticket costs less than three dollars (and it does; I forget if it’s one or two dollars, but it’s certainly not three), then, on average you could expect to come out slightly ahead. Therefore it makes sense to play.
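The expectation-value arithmetic can be sketched in a few lines of Python. This uses the rough figures from the paragraph and assumes a two-dollar ticket, which is a guess. It also ignores taxes, shared jackpots, and the smaller prizes, so treat it as an illustration rather than a real appraisal.

```python
# Rough figures from the paragraph above; both are approximations.
jackpot = 750_000_000                  # payout in dollars
chance_of_winning = 1 / 250_000_000    # per-ticket chance of the jackpot
ticket_price = 2                       # assumed; the text only says one or two dollars

# Expectation value: payout times the chance of getting it.
expected_payout = jackpot * chance_of_winning
expected_net = expected_payout - ticket_price

print(f"expected payout per ticket: ${expected_payout:.2f}")
print(f"expected net per ticket:    ${expected_net:.2f}")
```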

Except that, of course, it doesn’t make sense to play. On average you’ll lose the cost of the ticket. The on-average long-run you need to expect to come out ahead is millions of tickets deep. The chance of any ticket winning is about one in 250 million. You need to play a couple hundred million times to get a good enough chance of the jackpot for it to really be worth it. Therefore it makes no sense to play.
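How deep is "millions of tickets deep"? A minimal sketch, assuming every ticket is an independent play with the same rough one-in-250-million chance: the probability of at least one jackpot in some number of plays is one minus the probability of losing every time. The function name is hypothetical.

```python
import math

chance_of_winning = 1 / 250_000_000  # rough per-ticket chance from the text

def chance_of_at_least_one_win(tickets: int, p: float = chance_of_winning) -> float:
    """Probability that at least one of `tickets` independent plays wins.

    This is 1 - (1 - p)**tickets, written with log1p and expm1 so that
    precision survives when p is tiny.
    """
    return -math.expm1(tickets * math.log1p(-p))

for tickets in (1, 1_000, 1_000_000, 250_000_000):
    print(f"{tickets:>11,} tickets: {chance_of_at_least_one_win(tickets):.6f}")
```

Under these assumptions even 250 million plays leave the chance of a jackpot short of two in three.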

Mathematical logic therefore fails us: we can justify both playing and not playing. We must study lottery tickets as a different thing. They are (for the purposes of this) entertainment, something for a bit of disposable income. Are they worth the dollar or two per ticket? Did you have other plans for the money that would be more enjoyable? That’s not my ruling to make.

Samson’s Dark Side Of The Horse for the 25th just hurts my feelings. Why the harsh word, Samson? Anyway, it’s playing on the typographic similarity between 0 and O, and how we bunch digits together.

Grouping together three decimal digits as a block is as old, in the Western tradition, as decimal digits are. Leonardo of Pisa, in Liber Abbaci, groups the thousands and millions and thousands of millions and such together. By 1228 he had the idea to note this grouping with an arc above the set of digits, like a tie between notes on a sheet of music. This got cut down, part of the struggle in notation to write as little as possible. Johannes de Sacrobosco in 1256 proposed just putting a dot every third digit. In 1636 Thomas Blundeville put a | mark after every third digit. (I take all this, as ever, from Florian Cajori’s A History Of Mathematical Notations, because it’s got like everything in it.) We eventually settled on separating these stanzas of digits with a , or . mark. But that it should be three digits goes as far back as it could.

The End 2016 Mathematics A To Z: Hat

I was hoping to pick a term that was a quick and easy one to dash off. I learned better.

Hat.

This is a simple one. It’s about notation. Notation is never simple. But it’s important. Good symbols organize our thoughts. They tell us what are the common ordinary bits of our problem, and what are the unique bits we need to pay attention to here. We like them to be easy to write. Easy to type is nice, too, but in my experience mathematicians work by hand first. Typing is tidying-up, and we accept that being sluggish. Unique would be nice, so that anyone knows what kind of work we’re doing just by looking at the symbols. I don’t think anything manages that. But at least some notation has alternate uses rare enough we don’t have to worry about it.

“Hat” has two major uses I know of. And we call it “hat”, although our friends in the languages department would point out this is a caret. The little pointy corner that goes above a letter, like so: $\hat{i}$. $\hat{x}$. $\hat{e}$. It’s not something we see on its own. It’s always above some variable.

The first use of the hat like this comes up in statistics. It’s a way of marking that something is an estimate. By “estimate” here we mean what anyone might mean by “estimate”. Statistics is full of uses for this sort of thing. For example, we often want to know what the arithmetic mean of some quantity is. The average height of people. The average temperature for the 18th of November. The average weight of a loaf of bread. We have some letter that we use to mean “the value this has for any one example”. By some letter we mean ‘x’, maybe sometimes ‘y’. We can use any and maybe the problem begs for something. But it’s ‘x’, maybe sometimes ‘y’.

For the arithmetic mean of ‘x’ for the whole population we write the letter with a horizontal bar over it. (The arithmetic mean is the thing everybody in the world except mathematicians calls the average. Also, it’s what mathematicians mean when they say the average. We just get fussy because we know if we don’t say “arithmetic mean” someone will come along and point out there are other averages.) That arithmetic mean is $\bar{x}$. Maybe $\bar{y}$ if we must. Must be some number. But what is it? If we can’t measure whatever it is for every single example of our group (the whole population) then we have to make an estimate. We do that by taking a sample, ideally one that isn’t biased in some way. (This is so hard to do, or at least be sure you’ve done.) We can find the mean for this sample, though, because that’s how we picked it. The mean of this sample is probably close to the mean of the whole population. It’s an estimate. So we can write $\hat{x}$ and understand. This is not $\bar{x}$ but it does give us a good idea what $\bar{x}$ should be.
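Here is a minimal Python sketch of that idea, with an entirely made-up population of loaf weights. In real life we would not have the whole population to check against; the sketch cheats so the estimate $\hat{x}$ can be compared with the $\bar{x}$ it is estimating.

```python
import random
import statistics

random.seed(2016)  # fixed seed so the sketch is repeatable

# A made-up "whole population": 100,000 loaf weights in grams.
population = [random.gauss(500, 15) for _ in range(100_000)]
population_mean = statistics.fmean(population)   # what we usually cannot measure

# A random sample of 200 loaves stands in for what we can actually weigh.
sample = random.sample(population, 200)
estimated_mean = statistics.fmean(sample)        # the "hat" estimate

print(f"population mean: {population_mean:.2f}")
print(f"sample estimate: {estimated_mean:.2f}")
```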

(We don’t always use the caret ^ for this. Sometimes we use a tilde ~ instead. ~ has the advantage that it’s often used for “approximately equal to”. So it will carry that suggestion over to its new context.)

The other major use of the hat comes in vectors. Mathematics types do a lot of work with vectors. It turns out a lot of mathematical structures work the way that pointing and moving in directions in ordinary space do. That’s why back when I talked about what vectors were I didn’t say “they’re like arrows pointing some length in some direction”. Arrows pointing some length in some direction are vectors, yes, but there are many more things that are vectors. Thinking of moving in particular directions gives us good intuition for how to work with vectors, and for stuff that turns out to be vectors. But they’re not everything.

If we need to highlight that something is a vector we put a little arrow over its name. $\vec{x}$. $\vec{e}$. That sort of thing. (Or if we’re typing, we might put the letter in boldface: $\mathbf{x}$. This was good back before computers let us put in mathematics without giving the typesetters hazard pay.) We don’t always do that. By the time we do a lot of stuff with vectors we don’t always need the reminder. But we will include it if we need a warning. Like if we want to have both $\vec{r}$ telling us where something is and to use a plain old $r$ to tell us how big the vector $\vec{r}$ is. That turns up a lot in physics problems.

Every vector has some length. Even vectors that don’t seem to have anything to do with distances do. We can make a perfectly good vector out of “polynomials defined for the domain of numbers between -2 and +2”. Those polynomials are vectors, and they have lengths.
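To make "polynomials have lengths" concrete, here is a minimal sketch. It assumes one common choice of length, the square root of the integral of $p(x)^2$ from -2 to 2; the text does not say which length is meant, and other choices are possible. Polynomials are stored as coefficient lists, lowest power first, and all the function names are hypothetical.

```python
from fractions import Fraction
import math

def multiply(p, q):
    """Multiply two polynomials given as coefficient lists, lowest power first."""
    product = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += Fraction(a) * Fraction(b)
    return product

def integrate(p, lower, upper):
    """Exact integral of the polynomial p from lower to upper."""
    def antiderivative_at(x):
        return sum(c * Fraction(x) ** (k + 1) / (k + 1) for k, c in enumerate(p))
    return antiderivative_at(upper) - antiderivative_at(lower)

def length(p, lower=-2, upper=2):
    """Length under the assumed rule: sqrt of the integral of p(x)^2."""
    return math.sqrt(integrate(multiply(p, p), lower, upper))

x = [0, 1]                # the polynomial p(x) = x
one_plus_x2 = [1, 0, 1]   # the polynomial p(x) = 1 + x^2

print(length(x))
print(length(one_plus_x2))
```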

There’s a special class of vectors, ones that we really like in mathematics. They’re the “unit vectors”. Those are vectors with a length of 1. And we are always glad to see them. They’re usually good choices for a basis. Basis vectors are useful things. They give us, in a way, a representative slate of cases to solve. Then we can use that representative slate to give us whatever our specific problem’s solution is. So mathematicians learn to look instinctively to them. We want basis vectors, and we really like them to have a length of 1. Even if we aren’t putting the arrow over our variables we’ll put the caret over the unit vectors.
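Turning an ordinary arrow-style vector into a unit vector is a matter of dividing it by its own length. A minimal sketch, assuming plain lists of numbers and the usual Euclidean length; the function names are made up for illustration.

```python
import math

def length(vector):
    """Ordinary Euclidean length of a vector given as a list of numbers."""
    return math.sqrt(sum(component ** 2 for component in vector))

def unit_vector(vector):
    """Scale a nonzero vector so it points the same way but has length 1."""
    size = length(vector)
    if size == 0:
        raise ValueError("the zero vector has no direction to normalize")
    return [component / size for component in vector]

r = [3.0, 4.0]
r_hat = unit_vector(r)
print(r_hat, length(r_hat))
```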

There are some unit vectors we use all the time. One is just the directions in space. That’s $\hat{e}_1$ and $\hat{e}_2$ and for that matter $\hat{e}_3$ and I bet you have an idea what the next one in the set might be. You might be right. These are basis vectors for normal, Euclidean space, which is why they’re labelled “e”. We have as many of them as we have dimensions of space. We have as many dimensions of space as we need for whatever problem we’re working on. If we need a basis vector and aren’t sure which one, we summon one of the letters used as indices all the time. $\hat{e}_i$, say, or $\hat{e}_j$. If we have an n-dimensional space, then we have unit vectors all the way up to $\hat{e}_n$.

We also use the hat a lot if we’re writing quaternions. You remember quaternions, vaguely. They’re complex-valued numbers for people who’re bored with complex-valued numbers and want some thrills again. We build them as a quartet of numbers, each added together. Three of them are multiplied by the mysterious numbers ‘i’, ‘j’, and ‘k’. Each ‘i’, ‘j’, or ‘k’ multiplied by itself is equal to -1. But ‘i’ doesn’t equal ‘j’. Nor does ‘j’ equal ‘k’. Nor does ‘k’ equal ‘i’. And ‘i’ times ‘j’ is ‘k’, while ‘j’ times ‘i’ is minus ‘k’. That sort of thing. Easy to look up. You don’t need to know all the rules just now.

But we often end up writing a quaternion as a number like $4 + 2\hat{i} - 3\hat{j} + 1 \hat{k}$. OK, that’s just the one number. But we will write numbers like $a + b\hat{i} + c\hat{j} + d\hat{k}$. Here a, b, c, and d are all real numbers. This is kind of sloppy; the pieces of a quaternion aren’t in fact vectors added together. But it is hard not to look at a quaternion and see something pointing in some direction, like the first vectors we ever learn about. And there are some problems in pointing-in-a-direction vectors that quaternions handle so well. (Mostly how to rotate one direction around another axis.) So a bit of vector notation seeps in where it isn’t appropriate.
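The multiplication rules are easy to check by machine. Here is a minimal sketch of a quaternion $a + b\hat{i} + c\hat{j} + d\hat{k}$ as a four-number tuple with the standard Hamilton product. The class and method names are hypothetical, chosen just for this example.

```python
from typing import NamedTuple

class Quaternion(NamedTuple):
    """The quaternion a + b i + c j + d k, with real a, b, c, d."""
    a: float
    b: float
    c: float
    d: float

    def times(self, other: "Quaternion") -> "Quaternion":
        """Hamilton product; the order of the factors matters."""
        a1, b1, c1, d1 = self
        a2, b2, c2, d2 = other
        return Quaternion(
            a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
        )

i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)
k = Quaternion(0, 0, 0, 1)

print(i.times(i))   # i squared
print(i.times(j))   # i times j
print(j.times(i))   # j times i
```

Running it shows i squared coming out as -1, i times j as k, and j times i as minus k.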

I suppose there’s some value in pointing out that the ‘i’ and ‘j’ and ‘k’ in a quaternion are fixed and set numbers. They’re unlike an ‘a’ or an ‘x’ we might see in the expression. I’m not sure anyone was thinking they were, though. Notation is a tricky thing. It’s as hard to get sensible and consistent and clear as it is to make words and grammar sensible. But the hat is a simple one. It’s good to have something like that to rely on.

The End 2016 Mathematics A To Z: The Fredholm Alternative

Some things are created with magnificent names. My essay today is about one of them. It’s one of my favorite terms and I get a strange little delight whenever it needs to be mentioned in a proof. It’s also the title I shall use for my 1970s Paranoid-Conspiracy Thriller.

The Fredholm Alternative.

So the Fredholm Alternative is about whether this supercomputer with the ability to monitor every commercial transaction in the country falls into the hands of the Parallax Corporation or whether — ahm. Sorry. Wrong one. OK.

The Fredholm Alternative comes from the world of functional analysis. In functional analysis we study sets of functions with tools from elsewhere in mathematics. Some you’d be surprised aren’t already in there. There’s adding functions together, multiplying them, the stuff of arithmetic. Some might be a bit surprising, like the stuff we draw from linear algebra. That’s ideas like functions having length, or being at angles to each other. Or that length and those angles changing when we take a function of those functions. This may sound baffling. But a mathematics student who’s got into functional analysis usually has a happy surprise waiting. She discovers the subject is easy. At least, it relies on a lot of stuff she’s learned already, applied to stuff that’s less difficult to work with than, like, numbers.

(This may be a personal bias. I found functional analysis a thoroughgoing delight, even though I didn’t specialize in it. But I got the impression from other grad students that functional analysis was well-liked. Maybe we just got the right instructor for it.)

I’ve mentioned in passing “operators”. These are functions that have a domain that’s a set of functions and a range that’s another set of functions. Suppose you come up to me with some function, let’s say $f(x) = x^2$. I give you back some other function — say, $F(x) = \frac{1}{3}x^3 - 4$. Then I’m acting as an operator.

Why should I do such a thing? Many operators correspond to doing interesting stuff. Taking derivatives of functions, for example. Or undoing the work of taking a derivative. Describing how changing a condition changes what sorts of outcomes a process has. We do a lot of stuff with these. Trust me.
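Programmers meet operators as functions that take functions and hand back functions. A minimal sketch, assuming polynomials stored as coefficient lists with the lowest power first, so $x^2$ is `[0, 0, 1]`. The names are made up, and the constant -4 is just the one from the example above.

```python
from fractions import Fraction

def antiderivative_operator(constant):
    """Build an operator: it takes a polynomial and returns another polynomial.

    The result is an antiderivative whose constant term is `constant`.
    """
    def operator(polynomial):
        integrated = [Fraction(c, k + 1) for k, c in enumerate(polynomial)]
        return [Fraction(constant)] + integrated
    return operator

def derivative_operator(polynomial):
    """Another operator: differentiate a polynomial term by term."""
    return [k * c for k, c in enumerate(polynomial)][1:]

T = antiderivative_operator(-4)
f = [0, 0, 1]                    # f(x) = x^2
F = T(f)                         # F(x) = (1/3) x^3 - 4
print(F)
print(derivative_operator(F))    # differentiating gets x^2 back
```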

Let me use the name ‘T’ for some operator. I’m not going to say anything about what it does. The letter’s arbitrary. We like to use capital letters for operators because it makes the operators look extra important. And we don’t want to use ‘O’ because that just looks like zero and we don’t need that confusion.

Anyway. We need two functions. One of them will be called ‘f’ because we always call functions ‘f’. The other we’ll call ‘v’. In setting up the Fredholm Alternative we have this important thing: we know what ‘f’ is. We don’t know what ‘v’ is. We’re finding out something about what ‘v’ might be. The operator doing whatever it does to a function we write down as if it were multiplication, that is, like ‘Tv’. We get this notation from linear algebra. There we multiply matrices by vectors. Matrix-times-vector multiplication works like operator-on-a-function stuff. So much so that if we didn’t use the same notation young mathematics grad students would rise in rebellion. “This is absurd,” they would say, in unison. “The connotations of these processes are too alike not to use the same notation!” And the department chair would admit they have a point. So we write ‘Tv’.

If you skipped out on mathematics after high school you might guess we’d write ‘T(v)’ and that would make sense too. And, actually, we do sometimes. But by the time we’re doing a lot of functional analysis we don’t need the parentheses so much. They don’t clarify anything we’re confused about, and they require all the work of parenthesis-making. But I do see it sometimes, mostly in older books. This makes me think mathematicians started out with ‘T(v)’ and then wrote less as people got used to what they were doing.

I admit we might not literally know what ‘f’ is. I mean we know what ‘f’ is in the same way that, for a quadratic equation, “$ax^2 + bx + c = 0$”, we “know” what ‘a’, ‘b’, and ‘c’ are. Similarly we don’t know what ‘v’ is in the same way we don’t know what ‘x’ there is. The Fredholm Alternative tells us exactly one of these two things has to be true:

For operators that meet some requirements I don’t feel like getting into, either:

1. There’s one and only one ‘v’ which makes the equation $Tv = f$ true.
2. Or else $Tv = 0$ for some ‘v’ that isn’t just zero everywhere.

That is, either there’s exactly one solution, or else there’s no solving this particular equation uniquely: it has no solution at all, or it has infinitely many (oh, it happens). What we can rule out is there being exactly two solutions (the way quadratic equations often have), or exactly ten solutions (the way some annoying problems will).
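Since matrix-times-vector multiplication works like operator-on-a-function stuff, a two-by-two matrix makes a small stand-in for the operator. This is only a finite-dimensional sketch of the alternative, not the functional-analysis theorem itself. The function names are hypothetical, and it assumes integer matrix entries so the arithmetic stays exact.

```python
from fractions import Fraction

def fredholm_case(T):
    """Classify a 2-by-2 matrix T, a finite-dimensional stand-in for an operator.

    Returns ("unique", None) when Tv = f has exactly one solution for every f,
    or ("kernel", v) with a nonzero v satisfying Tv = 0.
    """
    (a, b), (c, d) = T
    determinant = a * d - b * c
    if determinant != 0:
        return "unique", None
    # Determinant zero: build a nonzero v with Tv = 0.
    for v in ((-b, a), (-d, c)):
        if v != (0, 0):
            return "kernel", v
    return "kernel", (1, 0)   # T is the zero matrix; any nonzero v works

def solve(T, f):
    """Cramer's rule, valid in the 'unique' case only."""
    (a, b), (c, d) = T
    determinant = Fraction(a * d - b * c)
    return ((f[0] * d - b * f[1]) / determinant,
            (a * f[1] - c * f[0]) / determinant)

print(fredholm_case(((2, 1), (1, 3))), solve(((2, 1), (1, 3)), (3, 5)))
print(fredholm_case(((1, 2), (2, 4))))
```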

It turns up often in boundary value problems. Often before we try solving one we spend some time working out whether there is a solution. You can imagine why it’s worth spending a little time working that out before committing to a big equation-solving project. But it comes up elsewhere. Very often we have problems that, at their core, are “does this operator match anything at all in the domain to a particular function in the range?” When we try to answer we stumble across Fredholm’s Alternative over and over.

Fredholm here was Ivar Fredholm, a Swedish mathematician of the late 19th and early 20th centuries. He worked for Uppsala University, and for the Swedish Social Insurance Agency, and as an actuary for the Skandia insurance company. Wikipedia tells me that his mathematical work was used to calculate buyback prices. I have no idea how.

As though to reinforce how nothing was basically wrong, Comic Strip Master Command sent a normal number of mathematically themed comics around this past week. They bunched the strips up in the first half of the week, but that will happen. It was a fun set of strips in any event.

Rob Harrell’s Adam @ Home for the 11th tells of a teacher explaining division through violent means. I’m all for visualization tools and if we are going to use them, the more dramatic the better. But I suspect Mrs Clark’s students will end up confused about what exactly they’ve learned. If a doll is torn into five parts, is that communicating that one divided by five is five? If the students were supposed to identify the mass of the parts of the torn-up dolls as the result of dividing one by five, was that made clear to them? Maybe it was. But there’s always the risk in a dramatic presentation that the audience will misunderstand the point. The showier the drama the greater the risk, it seems to me. But I did only get the demonstration secondhand; who knows how well it was done?

Greg Cravens’ The Buckets for the 11th has the kid, Toby, struggling to turn a shirt backwards and inside-out without taking it off. As the commenters note this is the sort of problem we get into all the time in topology. The field is about what can we say about shapes when we don’t worry about distance? If all we know about a shape is the ways it’s connected, the number of holes it has, whether we can distinguish one side from another, what else can we conclude? I believe Gocomics.com commenter Mike is right: take one hand out the bottom of the shirt and slide it into the other sleeve from the outside end, and proceed from there. But I have not tried it myself. I haven’t yet started wearing long-sleeve shirts for the season.

Bill Amend’s FoxTrot for the 11th, a new strip, does a story problem featuring pizzas cut into some improbable numbers of slices. I don’t say it’s unrealistic someone might get this homework problem. Just that the story writer should really ask whether they’ve ever seen a pizza cut into sevenths. I have a faint memory of being served a pizza cut into tenths by some daft pizza shop, which implies fifths is at least possible. Sevenths I refuse, though.

Mark Tatulli’s Heart of the City for the 12th plays on the show-your-work directive many mathematics assignments carry. I like Heart’s showiness. But the point of showing your work is because nobody cares what (say) 224 divided by 14 is. What’s worth teaching is the ability to recognize what approaches are likely to solve what problems. What’s tested is whether someone can identify a way to solve the problem that’s likely to succeed, and whether that can be carried out successfully. This is why it’s always a good idea, if you are stumped on a problem, to write out how you think this problem should be solved. Writing out what you mean to do can clarify the steps you should take. And it can guide your instructor to whether you’re misunderstanding something fundamental, or whether you just missed something small, or whether you just had a bad day.

Norm Feuti’s Gil for the 12th, another rerun, has another fanciful depiction of showing your work. The teacher’s got a fair complaint in the note. We moved away from tally marks as a way to denote numbers for reasons. Twelve depictions of apples are harder to read than the number 12. And they’re terrible if we need to depict numbers like one-half or one-third. Might be an interesting side lesson in that.

Brian Basset’s Red and Rover for the 14th is a rerun and one I’ve mentioned in these parts before. I understand Red getting fired up to be an animator by the movie. It’s been a while since I watched Donald Duck in Mathmagic Land but my recollection is that while it was breathtaking and visually inventive it didn’t really get at mathematics. I mean, not at noticing interesting little oddities and working out whether they might be true always, or sometimes, or almost never. There is a lot of play in mathematics, especially in the exciting early stages where one looks for a thing to prove. But it’s also in seeing how an ingenious method lets you get just what you wanted to know. I don’t know that the short demonstrates enough of that.

Bud Blake’s Tiger rerun for the 15th gives Punkinhead the chance to ask a question. And it’s a great question. I’m not sure what I’d say arithmetic is, not if I’m going to be careful. Offhand I’d say arithmetic is a set of rules we apply to a set of things we call numbers. The rules are mostly about how we can take two numbers and a rule and replace them with a single number. And these turn out to correspond uncannily well with the sorts of things we do with counting, combining, separating, and doing some other stuff with real-world objects. That it’s so useful is why, I believe, arithmetic and geometry were the first mathematics humans learned. But much of geometry we can see. We can look at objects and see how they fit together. Arithmetic we have to infer from the way the stuff we like to count works. And that’s probably why it’s harder to do when we start school.

What’s not good about that as an answer is that it actually applies to a lot of mathematical constructs, including those crazy exotic ones you sometimes see in science press. You know, the ones where there’s this impossibly complicated tangle with ribbons of every color and a headline about “It’s Revolutionary. It’s 46-Dimensional. It’s Breaking The Rules Of Geometry. Is It The Shape That Finally Quantizes Gravity?” or something like that. Well, describe a thing vaguely and it’ll match a lot of other things. But also when we look to new mathematical structures, we tend to look for things that resemble arithmetic. Group theory, for example, is one of the cornerstones of modern mathematical thought. It’s built around having a set of things on which we can do something that looks like addition. So it shouldn’t be a surprise that many groups have a passing resemblance to arithmetic. Mathematics may produce universal truths. But the ones we see are also ones we are readied to see by our common experience. Arithmetic is part of that common experience.

Also Jerry Scott and Jim Borgman’s Zits for the 14th I think doesn’t really belong here. It’s just got a cameo appearance by the concept of mathematics. Dave Whamond’s Reality Check for the 17th similarly just mentions the subject. But I did want to reassure any readers worried after last week that Pierce recovered fine. Also that, you know, for not having a stomach for mathematics he’s doing well carrying on. Discipline will carry one far.

Why Stuff Can Orbit, Part 4: On The L

Less way previously:

We were chatting about central forces. In these a small object — a satellite, a planet, a weight on a spring — is attracted to the center of the universe, called the origin. We’ve been studying this by looking at potential energy, a function that in this case depends only on how far the object is from the origin. But to find circular orbits, we can’t just look at the potential energy. We have to modify this potential energy to account for angular momentum. This essay I mean to discuss that angular momentum some.

Let me talk first about the potential energy. Mathematical physicists usually write this as a function named U or V. I’m using V. That’s what my professor used teaching this, back when I was an undergraduate several hundred thousand years ago. A central force, by definition, changes only with how far you are from the center. I’ve put the center at the origin, because I am not a madman. This lets me write the potential energy as V = V(r).

V(r) could, in principle, be anything. In practice, though, I am going to want it to be r raised to a power. That is, V(r) is equal to $C r^n$. The ‘C’ here is a constant. It’s a scaling constant. The bigger a number it is the stronger the central force. The closer the number is to zero the weaker the force is. In standard units, gravity has a constant incredibly close to zero. This makes orbits very big things, which generally works out well for planets. In the mathematics of masses on springs, the constant is closer to middling little numbers like 1.

The ‘n’ here is a deceiver. It’s a constant number, yes, and it can be anything we want. But the use of ‘n’ as a symbol has connotations. Usually when a mathematician or a physicist writes ‘n’ it’s because she needs a whole number. Usually a positive whole number. Sometimes it’s negative. But we have a legitimate central force if ‘n’ is any real number: 2, -1, one-half, the square root of π, any of that is good. If you just write ‘n’ without explanation, the reader will probably think “integers”, possibly “counting numbers”. So it’s worth making explicit when this isn’t so. It’s bad form to surprise the reader with what kind of number you’re even talking about.

(Some number of essays on we’ll find out that the only values ‘n’ can have that are worth anything are -1, 2, and 7. And 7 isn’t all that good. But we aren’t supposed to know that yet.)

$C r^n$ isn’t the only kind of central force that could exist. Any function rule would do. But it’s enough. If we wanted a more complicated rule we could just add two, or three, or more potential energies together. This would give us $V(r) = C_1 r^{n_1} + C_2 r^{n_2}$, with $C_1$ and $C_2$ two possibly different numbers, and $n_1$ and $n_2$ two definitely different numbers. (If $n_1$ and $n_2$ were the same number then we should just add $C_1$ and $C_2$ together and stop using a more complicated expression than we need.) Remember that Newton’s Law of Motion about the sum of multiple forces being something vector something something direction? When we look at forces as potential energy functions, that law turns into just adding potential energies together. They’re well-behaved that way.
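Adding potential energies together is as simple in code as it sounds. A minimal sketch, with hypothetical function names and constants picked only for illustration; they do not describe any real physical system.

```python
def power_law_potential(C, n):
    """Return the function V(r) = C * r**n, meant for r > 0."""
    def V(r):
        return C * r ** n
    return V

def combined_potential(*potentials):
    """Potential energies add, so the combined V(r) is just the sum."""
    def V(r):
        return sum(potential(r) for potential in potentials)
    return V

# Illustrative constants only.
spring_like = power_law_potential(C=0.5, n=2)
gravity_like = power_law_potential(C=-1.0, n=-1)
V = combined_potential(spring_like, gravity_like)

print(V(1.0), V(2.0))
```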

And if we can add these r-to-a-power potential energies together then we’ve got everything we need. Why? Polynomials. We can approximate most any potential energy that would actually happen with a big enough polynomial. Or at least a polynomial-like function. These r-to-a-power forces are a basis set for all the potential energies we’re likely to care about. Understand how to work with one and you understand how to work with them all.

Well, one exception. The logarithmic potential, V(r) = C log(r), is really interesting. And it has real-world applicability. It describes how strongly two vortices, two whirlpools, attract each other. You can write the logarithm as a polynomial. But logarithms are pretty well-behaved functions. You might be better off just doing that as a special case.

Still, at least to start with, we’ll stick with $V(r) = C r^n$ and you know what I mean by all those letters now. So I’m free to talk about angular momentum.

You’ve probably heard of momentum. It’s got something to do with movement, only sports teams and political campaigns are always gaining or losing it somehow. When we talk of that we’re talking of linear momentum. It describes how much mass is moving how fast in what direction. So it’s a vector, in three-dimensional space. Or two-dimensional space if you’re making the calculations easier. To find what the vector is, we make a list of every object that’s moving. We take its velocity — how fast it’s moving and in what direction — and multiply that by its mass. Mass is a single number, a scalar, and we’re always allowed to multiply a vector by a scalar. This gets us another vector. Once we’ve done that for everything that’s moving, we add all those product vectors together. We can always add vectors together. And this gives us a grand total vector, the linear momentum of the system.
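That recipe translates nearly word for word into code. A minimal sketch, assuming each object is a (mass, velocity) pair and working in two dimensions to keep the calculations easier; the masses and velocities are made up.

```python
def linear_momentum(objects):
    """Total linear momentum: the sum of mass times velocity over every object.

    Each object is a (mass, velocity) pair, with velocity a tuple of components.
    """
    dimensions = len(objects[0][1])
    total = [0.0] * dimensions
    for mass, velocity in objects:
        for axis in range(dimensions):
            total[axis] += mass * velocity[axis]   # scalar times vector, then add
    return tuple(total)

# Made-up masses (kg) and velocities (m/s) in two dimensions.
objects = [
    (2.0, (3.0, 0.0)),
    (1.0, (-1.0, 4.0)),
    (0.5, (0.0, -2.0)),
]
print(linear_momentum(objects))
```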

And that’s conserved. If one part of the system starts moving slower it’s because other parts are moving faster, and vice-versa. In the real world momentum seems to evaporate. That’s because some of the stuff moving faster turns out to be air objects bumped into, or particles of the floor that get dragged along by friction, or other stuff we don’t care about. That momentum can seem to evaporate is what makes its use in talking about sports teams or political campaigns make sense. It also annoys people who want you to know they understand science words better than you. So please consider this my authorization to use “gaining” and “losing” momentum in this sense. Ignore complainers. They’re the people who complain the word “decimate” gets used to mean “destroy way more than ten percent of something”, even though that’s the least bad mutation of an English word’s meaning in three centuries.

Angular momentum is also a vector. It’s also conserved. We can calculate what that vector is by the same sort of process, that of calculating something on each object that’s spinning and adding it all up. In real applications it can seem to evaporate. But that’s also because the angular momentum is going into particles of air. Or it rubs off grease on the axle. Or it does other stuff we wish we didn’t have to deal with.

The calculation is a little harder to deal with. There’s three parts to a spinning thing. There’s the thing, and there’s how far it is from the axis it’s spinning around, and there’s how fast it’s spinning. So you need to know how fast it’s travelling in the direction perpendicular to the shortest line between the thing and the axis it’s spinning around. Its angular momentum is going to be as big as the mass times the distance from the axis times the perpendicular speed. It’s going to be pointing in whichever axis direction makes its movement counterclockwise. (Because that’s how physicists started working this out and it would be too much bother to change now.)
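
For a single object moving in a plane, with the axis passing through the origin, mass times distance times perpendicular speed can be sketched as below. The formula `x * vy - y * vx` is my assumption about how to extract "distance times perpendicular speed" from coordinates. It is the usual two-dimensional cross product, and it comes out positive for counterclockwise motion. The function name and the numbers are hypothetical.

```python
def angular_momentum_z(mass, position, velocity):
    """Signed angular momentum about the origin for motion in the x-y plane.

    Equals mass * distance * perpendicular speed, positive when the
    motion is counterclockwise and negative when it is clockwise.
    """
    x, y = position
    vx, vy = velocity
    return mass * (x * vy - y * vx)

# Hypothetical numbers: a 2 kg object 3 m out along the x-axis,
# moving at 4 m/s perpendicular to the line joining it to the origin.
L_ccw = angular_momentum_z(2.0, (3.0, 0.0), (0.0, 4.0))
# The same object going around the other way.
L_cw = angular_momentum_z(2.0, (3.0, 0.0), (0.0, -4.0))
# Moving directly outward: no perpendicular speed at all.
L_radial = angular_momentum_z(2.0, (3.0, 0.0), (5.0, 0.0))
print(L_ccw, L_cw, L_radial)
```

The counterclockwise case gives 2 times 3 times 4, the clockwise case gives the negative of that, and purely in-or-out motion gives zero.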

You might ask: wait, what about stuff like a wheel that’s spinning around its center? Or a ball being spun? That can’t be an angular momentum of zero? How do we work that out? The answer is: calculus. Also, we don’t need that. This central force problem I’ve framed so that we barely even need algebra for it.

See, we only have a single object that’s moving. That’s the planet or satellite or weight or whatever it is. It’s got some mass, the value of which we call ‘m’ because why make it any harder on ourselves. And it’s spinning around the origin. We’ve been using ‘r’ to mean the number describing how far it is from the origin. That’s the distance to the axis it’s spinning around. Its velocity — well, we don’t have any symbols to describe what that is yet. But you can imagine working that out. Or you trust that I have some clever mathematical-physics tool ready to introduce to work it out. I have, kind of. I’m going to ignore it altogether. For now.

The symbol we use for the total angular momentum in a system is $\vec{L}$. The little arrow above the symbol is one way to denote “this is a vector”. It’s a good scheme, what with arrows making people think of vectors and it being easy to write on a whiteboard. In books, sometimes, we make do just by putting the letter in boldface, L, which is easier for old-fashioned word processors to do. If we’re sure that the reader isn’t going to forget that L is this vector then we might stop highlighting the fact altogether. That’s even less work to do.

It’s going to be less work yet. Central force problems like this mean the object can move only in a two-dimensional plane. (If it didn’t, it wouldn’t conserve angular momentum: the direction of $\vec{L}$ would have to change. Sounds like magic, but trust me.) The angular momentum’s direction has to be perpendicular to that plane. If the object is spinning around on a sheet of paper, the angular momentum is pointing straight outward from the sheet of paper. It’s pointing toward you if the object is moving counterclockwise. It’s pointing away from you if the object is moving clockwise. What direction it’s pointing is locked in.

All we need to know is how big this angular momentum vector is, and whether it’s positive or negative. So we just care about this number. We can call it ‘L’, no arrow, no boldface, no nothing. It’s just a number, the same as is the mass ‘m’ or distance from the origin ‘r’ or any of our other variables.

If ‘L’ is zero, this means there’s no total angular momentum. This means the object can be moving directly out from the origin, or directly in. This is the only way that something can crash into the center. So if setting L to be zero doesn’t allow that then we know we did something wrong, somewhere. If ‘L’ isn’t zero, then the object can’t crash into the center. If it did we’d be losing angular momentum. The object’s mass times its distance from the center times its perpendicular speed would have to be some non-zero number, even when the distance was zero. We know better than to look for that.

You maybe wonder why we use ‘L’ of all letters for the angular momentum. I do. I don’t know. I haven’t found any sources that say why this letter. Linear momentum, which we represent with $\vec{p}$, I know. Or, well, I know the story every physicist says about it. p is the designated letter for linear momentum because we used to use the word “impetus”, as in “impulse”, to mean what we mean by momentum these days. And “p” is the first letter in “impetus” that isn’t needed for some more urgent purpose. (“m” is too good a fit for mass. “i” has to work both as an index and as that number which, squared, gives us -1. And for that matter, “e” we need for that exponentials stuff, and “t” is too good a fit for time.) That said, while everybody, everybody, repeats this, I don’t know the source. Perhaps it is true. I can imagine, say, Euler or Lagrange in their writing settling on “p” for momentum and everybody copying them. I just haven’t seen a primary citation showing this is so.

(I don’t mean to sound too unnecessarily suspicious. But just because everyone agrees on the impetus-thus-p story doesn’t mean it’s so. I mean, every Star Trek fan or space historian will tell you that the first space shuttle would have been named Constitution until the Trekkies wrote in and got it renamed Enterprise. But the actual primary documentation that the shuttle would have been named Constitution is weak to nonexistent. I’ve come to the conclusion NASA had no plan in mind to name space shuttles until the Trekkies wrote in and got one named. I’ve done less poking around the impetus-thus-p story, in that I’ve really done none, but I do want it on record that I would like more proof.)

Anyway, “p” for momentum is well-established. So I would guess that when mathematical physicists needed a symbol for angular momentum they looked for letters close to “p”. When you get into more advanced corners of physics “q” gets called on to be position a lot. (Momentum and position, it turns out, are nearly-identical-twins mathematically. So making their symbols p and q offers aesthetic charm. Also great danger if you make one little slip with the pen.) “r” is called on for “radius” a lot. Looking on, “t” is going to be time.

On the other side of the alphabet, well, “o” is just inviting danger. “n” we need to count stuff. “m” is mass or we’re crazy. “l” might have just been the nearest we could get to “p” without intruding on a more urgently-needed symbol. (“s” we use a lot for parameters like length of an arc that work kind of like time but aren’t time.) And then shift to the capital letter, I expect, because a lowercase l looks like a “1”, to everybody’s certain doom.

The modified potential energy, then, is going to include the angular momentum L. At least, the amount of angular momentum. It’s also going to include the mass of the object moving, and the radius r that says how far the object is from the center. It will be:

$V_{eff}(r) = V(r) + \frac{L^2}{2 m r^2}$

V(r) was the original potential, whatever that was. The modifying term, with this square of the angular momentum and all that, I kind of hope you’ll just accept on my word. The $L^2$ means that whether the angular momentum is positive or negative, the potential will grow very large as the radius gets small. If it didn’t, there might not be orbits at all. And if the angular momentum is zero, then the effective potential is the same original potential that let stuff crash into the center.

For the sort of r-to-a-power potentials I’ve been looking at, I get an effective potential of:

$V_{eff}(r) = C r^n + \frac{L^2}{2 m r^2}$

where n might be an integer. I’m going to pretend a while longer that it might not be, though. C is certainly some number, maybe positive, maybe negative.

If you pick some values for C, n, L, and m you can sketch this out. If you just want a feel for how this $V_{eff}$ looks it doesn’t much matter what values you pick. Changing values just changes the scale, that is, where a circular orbit might happen. It doesn’t change whether it happens. Picking some arbitrary numbers is a good way to get a feel for how this sort of problem works. It’s good practice.

Sketching will convince you there are energy minimums, where we can get circular orbits. It won’t say where to find them without some trial-and-error or building a model of this energy and seeing where a ball bearing dropped into it rolls to a stop. We can do this more efficiently.
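
Here is what the trial-and-error version might look like, as a rough sketch. It evaluates $V_{eff}(r) = C r^n + \frac{L^2}{2 m r^2}$ at a lot of radii and keeps the lowest one. The values C = 1, n = 2, L = 1, m = 1 are arbitrary picks of mine, and so are the search range and the function names.

```python
def v_eff(r, C, n, L, m):
    """Effective potential C r^n + L^2 / (2 m r^2)."""
    return C * r**n + L**2 / (2.0 * m * r**2)

def grid_minimum(C, n, L, m, r_min=0.05, r_max=5.0, steps=100_000):
    """Crude trial-and-error search: try many radii, keep the lowest energy."""
    best_r = r_min
    best_v = v_eff(r_min, C, n, L, m)
    for k in range(1, steps + 1):
        r = r_min + (r_max - r_min) * k / steps
        v = v_eff(r, C, n, L, m)
        if v < best_v:
            best_r, best_v = r, v
    return best_r, best_v

# Arbitrary illustrative values, not taken from any real system.
r_circ, v_circ = grid_minimum(C=1.0, n=2, L=1.0, m=1.0)
print(r_circ, v_circ)
```

For these particular values the minimum should land close to the fourth root of one-half, about 0.84. A search like this only finds a minimum that sits inside the range of radii you told it to try, which is one reason to want something better.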

A Leap Day 2016 Mathematics A To Z: Fractions (Continued)

Another request! I was asked to write about continued fractions for the Leap Day 2016 A To Z. The request came from Keilah, of the Knot Theorist blog. But I’d already had a c-word request in (conjecture). So you see my elegant workaround to talk about continued fractions anyway.

Fractions (continued).

There are fashions in mathematics. There are fashions in all human endeavors. But mathematics almost begs people to forget that it is a human endeavor. Sometimes a field of mathematics will be popular a while and then fade. Some fade almost to oblivion. Continued fractions are one of them.

A continued fraction comes from a simple enough starting point. Start with a whole number. Add a fraction to it. $1 + \frac{2}{3}$. Everyone knows what that is. But then look at the denominator. In this case, that’s the ‘3’. Why couldn’t that be a sum, instead? No reason. Imagine then the number $1 + \frac{2}{3 + 4}$. Is there a reason that we couldn’t, instead of the ‘4’ there, have a fraction instead? No reason beyond our own timidity. Let’s be courageous. Does $1 + \frac{2}{3 + \frac{4}{5}}$ even mean anything?

Well, sure. It’s getting a little hard to read, but $3 + \frac{4}{5}$ is a fine enough number. It’s 3.8. $\frac{2}{3.8}$ is a less friendly number, but it’s a number anyway. It’s a little over 0.526. (It takes a fair number of digits past the decimal before it starts repeating, but trust me, it does.) And we can add 1 to that easily. So $1 + \frac{2}{3 + \frac{4}{5}}$ means a number a slight bit more than 1.526.
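
If you would rather not trust my arithmetic, the same working-from-the-inside-out can be checked with exact fractions. This is a small sketch using Python's standard `fractions` module. The variable names are just mine.

```python
from fractions import Fraction

# Work from the innermost denominator outward.
inner = 3 + Fraction(4, 5)      # 3 + 4/5, that is, 3.8
middle = Fraction(2) / inner    # 2 divided by 3.8
value = 1 + middle              # the whole stack

print(inner, middle, value, float(value))
```

Keeping everything as a `Fraction` avoids any decimal rounding until the final conversion with `float`.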

Dare we replace the “5” in that expression with a sum? Better, with the sum of a whole number and a fraction? If we don’t fear being audacious, yes. Could we replace the denominator of that with another sum? Yes. Can we keep doing this forever, creating this never-ending stack of whole numbers plus fractions? … If we want an irrational number, anyway. If we want a rational number, this stack will eventually end. But suppose we feel like creating an infinitely long stack of continued fractions. Can we do it? Why not? Who dares, wins!

OK. Wins what, exactly?

Well … um. Continued fractions certainly had a fashionable time. John Wallis, the 17th century mathematician famous for introducing the ∞ symbol, and for an interminable quarrel with Thomas Hobbes over Hobbes’s attempts to reform mathematics, did much to establish continued fractions as a field of study. (He’s credited with inventing the field. But all claims to inventing something big are misleading. Real things are complicated and go back farther than people realize, and inventions are more ambiguous than people think.) The astronomer Christiaan Huygens showed how to use continued fractions to design better gear ratios. This may strike you as the dullest application of mathematics ever. Let it. It’s also important stuff. People who need to scale one movement to another need this.

In the 18th and 19th centuries continued fractions became interesting for higher mathematics. Continued fractions were the approach Leonhard Euler used to prove that e had to be irrational. That’s one of the superstar numbers of mathematics. Johann Heinrich Lambert used this to show that if θ is a rational number (other than zero) then the tangent of θ must be irrational. This is one path to showing that π must be irrational. Many of the astounding theorems of Srinivasa Ramanujan were about continued fractions, or ideas which built on continued fractions.

But since the early 20th century the field’s evaporated. I don’t have a good answer why. The best speculation I’ve heard is that the field seems to fit poorly into any particular topic. Continued fractions get interesting when you have an infinitely long stack of nesting denominators. You don’t want to work with infinitely long strings of things before you’ve studied calculus. You have to be comfortable with these things. But that means students don’t encounter it until college, at least. And at that point fractions seem beneath the grade level. There’s a handful of proofs best done by them. But those proofs can be shown as odd, novel approaches to these particular problems. Studying the whole field is hardly needed.

So, perhaps because it seems like an odd fit, the subject’s dried up and blown away. Even enthusiasts seem to be resigned to its oblivion. Professor Adam Van Tuyl, then at Queen’s University in Kingston, Ontario, composed a nice set of introductory pages about continued fractions. But the page is defunct. Dr Ron Knott has a more thorough page, though, and one with calculators that work well.

Will continued fractions make a comeback? Maybe. It might take the discovery of some interesting new results, or some better visualization tools, to reignite interest. Chaos theory, the study of deterministic yet unpredictable systems, first grew (we now recognize) in the 1890s. But it fell into obscurity. When we got some new theoretical papers and the ability to do computer simulations, it flowered again. For a time it looked ready to take over all mathematics, although we’ve got things under better control now. Could continued fractions do the same? I’m skeptical, but won’t rule it out.

Postscript: something you notice quickly with continued fractions is they’re a pain to typeset. We’re all right with $1 + \frac{2}{3 + \frac{4}{5}}$. But after that the LaTeX engine that WordPress uses to render mathematical symbols is doomed. A real LaTeX engine gets another couple nested denominators in before the situation is hopeless. If you’re writing this out on paper, the way people did in the 19th century, that’s all right. But there’s no typing it out that way.

But notation is made for us, not us for notation. If we want to write a continued fraction in which the numerators are all 1, we have a brackets shorthand available. In this we would write $2 + \frac{1}{3 + \frac{1}{4 + \cdots }}$ as [2; 3, 4, … ]. The numbers are the whole numbers added to the next level of fractions. Another option, and one that lends itself to having numerators which aren’t 1, is to write out a string of fractions. In this we’d write $2 + \frac{1}{3 +} \frac{1}{4 +} \frac{1}{\cdots + }$. We have to trust people notice the + sign is in the denominator there. But if people know we’re doing continued fractions then they know to look for the peculiar notation.
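
The brackets shorthand is also easy to hand to a computer, at least for stacks that end. Below is a sketch, with function names I made up, that turns a finite list like [2; 3, 4] into a single fraction and turns a rational number back into its bracket terms. It assumes every numerator is 1, as the bracket notation does.

```python
from fractions import Fraction

def from_brackets(terms):
    """Evaluate a finite [a0; a1, a2, ...] with every numerator equal to 1."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value   # build the stack from the bottom up
    return value

def to_brackets(x):
    """Expand a rational number into its [a0; a1, a2, ...] terms."""
    x = Fraction(x)
    terms = []
    while True:
        whole = x.numerator // x.denominator  # the whole-number part
        terms.append(whole)
        x -= whole
        if x == 0:          # nothing left over, so the stack ends here
            return terms
        x = 1 / x           # flip the leftover fraction and repeat

print(from_brackets([2, 3, 4]))
print(to_brackets(Fraction(415, 93)))
```

The second function always stops for a rational number, which is the claim that a rational number's stack eventually ends. Feed it a floating-point approximation of an irrational number and it will still stop, but only because the approximation is itself a fraction.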

Reading the Comics, February 17, 2016: Using Mathematics Edition

Is there a unifying theme between many of the syndicated comic strips with mathematical themes the last few days? Of course there is. It’s students giving snarky answers to their teachers’ questions. That’s the theme every week. But other stuff comes up.

Joe Martin’s Boffo for the 12th of February depicts “the early days before all the bugs were worked out” of mathematics. And the early figure got a whole string of operations which don’t actually respect the equals sign, before getting finally to the end. Were I to do this, I would use an arrow, =>, and I suspect many mathematicians would too. It’s a way of indicating the flow of one’s thoughts without trying to assert that 2+2 is actually the same number as 1 + 1 + 1 + 1 + 6.

And this comic is funny, in part, because it’s true. New mathematical discoveries tend to be somewhat complicated, sloppy messes to start. Over time, if the thing is of any use, the mathematical construct gets better. By better I mean the logic behind it gets better explained. You’d expect that, of course, just because time to reflect gives time to improve exposition. But the logic also tends to get better. We tend to find arguments that are, if not shorter, then better-constructed. We get to see how something gets used, and how to relate it to other things we’d like to do, and how to generalize the pieces of argument that go into it. If we think of a mathematical argument as a narrative, then, we learn how to write the better narrative.

Then, too, we get better at notation, at isolating what concepts we want to describe and how to describe them. For example, to write the fourth power of a number such as ‘x’, mathematicians used to write ‘xxxx’, which is fair enough, but cumbersome. Or then xqq, the ‘q’ standing for quadratic, that is, square, of the thing before. That’s better. At least it’s less stuff to write. How about “xiiii” (as in the Roman numeral IV)? Getting to “$x^4$” took time, and thought, and practice with what we wanted to raise numbers to powers to do. In short, we had to get the bugs worked out.

John Rose’s Barney Google and Snuffy Smith for the 12th of February is your normal student-resisting-word-problems joke. And hey, at least they have train service still in Smith’s hometown.

Randy Glasbergen’s Glasbergen Cartoons for the 12th (a rerun; Glasbergen died last year) is a similar student-resisting-problems joke. Arithmetic gets an appearance no doubt because it’s the easiest kind of problem to put on the board and not distract from the actual joke.

Mark Pett’s Lucky Cow for the 14th (a rerun from the early 2000s) mentions the chaos butterfly. I am considering retiring chaos butterfly mentions from these roundups because I seem to say the same thing each time. But I haven’t yet, so I’ll say it. Part of what makes a system chaotic is that it’s deterministic and unpredictable. Wildly different outcomes can result from starting points so similar they can’t be told apart. There’s no guessing whether any action makes things better or worse, and whether that’s in the short or the long term.

Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 14th is surely not a response to that Pearls Before Swine from last time. I believe all the Saturday Morning Breakfast Cereal strips to appear on Gocomics are reruns from its earlier days as a web comic. But it serves as a riposte to the “nobody uses mathematics anyway” charge. And it’s a fine bit of revenge fantasy.

Historically, being the sole party that understands the financial calculations has not brought money lenders appreciation.

Tony Cochran’s Agnes for the 17th also can’t be a response to that Pearls Before Swine. The lead times just don’t work that way. But it gives another great reason to learn mathematics. I encourage anyone who wants to be Lord and Queen of Mathdom; it’s worth a try.

Tom Thaves’s Frank and Ernest for the 17th tells one of the obvious jokes about infinite sets. Fortunately mathematicians aren’t expected to list everything that goes into an infinitely large set. It would put a terrible strain on our wrists. Usually it’s enough to describe the things that go in it. Some descriptions are easy, especially if there’s a way to match the set with something already familiar, like counting numbers or real numbers. And sometimes a description has to be complicated.

There are urban legends among grad students. Many of them are thesis nightmares. One is about such sets. The story goes of the student who had worked for years on a set whose elements all had some interesting collection of properties. At the defense her advisor — the person who’s supposed to have guided her through finding and addressing an interesting problem — actually looks at the student’s work for the first time in ages, or ever. And starts drawing conclusions from it. And proves that the only set whose elements all have these properties is the null set, which hasn’t got anything in it. The whole thesis is a bust. Thaves probably didn’t have that legend in mind. But you could read the comic that way.

Percy Crosby’s Skippy for the 17th gives a hint how long kids in comic strips have been giving smart answers to teachers. This installment’s from 1928 sometime. Skippy’s pretty confident in himself, it must be said.

The Set Tour, Stage 2: The Real Star

For the second of my little tour of sets that get commonly used as domains and ranges I want to name the most common of them all.

R

This is the real numbers. In text that’s written with a bold R. Written by hand, and often in text, that’s written with a capital R that has a double stroke for the main vertical line. That’s an easy-to-write way to distinguish it from a plain old civilian R. The double-vertical-stroke convention is used for many of the most common sets of numbers. It will get used for letters like I and J (the integers), or N (the counting numbers). A vertical stroke will even get added to symbols that technically don’t have any vertical strokes, like Q (the rational numbers). There it’s just put inside the loop, on the left side, far enough from the edge that the reader can notice the vertical stroke is there.

R is a big one. It’s not just a big set. It’s also a popular one. It may as well be the default domain and range. If someone fails to tell you what either set is, you can suppose she meant R and be only rarely wrong. The real numbers are familiar and popular and it feels like we know what they are. It’s a bit tricky to define them exactly, though, and you’ll notice that I’m not doing that. You know what I mean, though. It’s whole numbers, and rational numbers, and irrational numbers like the square root of pi, and for that matter pi, and a whole bunch of other boring numbers nobody looks at. Let’s leave it at that.

All the intervals I talked about last time are subsets of R. If we really wanted to, we could turn a function with domain an interval like [0, 1] into a function with a domain of R. That’s a kind of “embedding”. Let me call the function with domain [0, 1] by the name “f”. I’ll then define g, on the domain R, by the rule “whatever f(x) is, if x is from 0 to 1; and some other, harmless value, if x isn’t”. Probably the harmless value is zero. Sometimes we need to change the domain a function’s defined on, and this is a way to do it.
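
That rule for g reads almost like code already. Here is a minimal sketch of it, assuming zero as the harmless value. The helper `embed` and the sample f are hypothetical, made up for the illustration.

```python
def embed(f, low=0.0, high=1.0, harmless=0.0):
    """Extend f, defined only on [low, high], to a function on all reals."""
    def g(x):
        if low <= x <= high:
            return f(x)      # whatever f(x) is, if x is from low to high
        return harmless      # some other, harmless value, if x isn't
    return g

# Hypothetical f, meant only for inputs between 0 and 1.
def f(x):
    return x * x

g = embed(f)
print(g(0.5), g(2.0), g(-3.0))
```

Inside the interval g agrees with f exactly. Outside it, g just reports the harmless value and never asks f about a number f wasn't meant for.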

If we only want to talk about the positive real numbers we can denote that by putting a plus sign in superscript: $\textbf{R}^+$. If we only want the negative numbers we put in a minus sign: $\textbf{R}^-$. Do either of these include zero? My heart tells me neither should, but I wouldn’t be surprised if in practice either did, because zero is often useful to have around. To be careful we might explicitly include zero, using the notations of set theory. Then we might write $\textbf{R}^+ \cup \left\{0\right\}$.

Sometimes the rule for a function doesn’t make sense for some values. For example, if a function has the rule $f: x \mapsto 1 / (x - 1)$ then you can’t work out a value for f(1). That would require dividing by zero and we dare not do that. A careful mathematician would say the domain of that function f is all the real numbers R except for the number 1. This exclusion gets written as “R \ {1}”. The backslash means “except the numbers in the following set”. It might be a single number, such as in this example. It might be a lot of numbers. The function $g: x \mapsto \log\left(1 - x\right)$ is meaningless for any x that’s equal to or greater than 1. We could write its domain then as “R \ { x: x ≥ 1 }”.

That’s if we’re being careful. If we get a little careless, or if we’re writing casually, or if the set of non-permitted points is complicated we might omit that. Mathematical writing includes an assumption of good faith. The author is supposed to be trying to say something interesting and true. The reader is expected to be skeptical but not quarrelsome. Spotting a flaw in the argument because the domain doesn’t explicitly rule out some points it shouldn’t have is tedious. Finding that the interesting thing only holds true for values that are implicitly outside the domain is serious.

The set of real numbers is a group; it has an operation that works like addition. We call it addition. For that matter, it’s a ring. It has an operation that works like multiplication. We call it multiplication. And it’s even more than a ring. Everything in R except for the additive identity — 0, the number you can add to anything without changing what the thing is — has a multiplicative inverse. That is, any number except zero has some number you can multiply it by to get 1. This property makes it a “field”, to people who study (abstract) algebra. This “field” hasn’t got anything to do with gravitational or electrical or baseball or magnetic fields. But the overlap in names does serve to sometimes confuse people.

But having this multiplicative inverse means that we can do something that operates like division. Divide one thing by a second by taking the first thing and multiplying it by the second thing’s multiplicative inverse. We call this division-like operation “division”.

It’s not coincidence that the algebraic “addition” and “multiplication” and “division” operations are the ones we call addition and multiplication and division. What makes abstract algebra abstract is that it’s the study of things that work kind of like the real numbers do. The operations we can do on the real numbers inspire us to look for other sets that can let us do similar things.

Lewis Carroll Tries Changing The Way You See Trigonometry

Today’s On This Day In Math tweet was well-timed. I’d recently read Robin Wilson’s Lewis Carroll In Numberland: His Fantastical Mathematical Logical Life. It’s a biography centered around Charles Dodgson’s mathematical work. It shouldn’t surprise you that he was fascinated with logic, and wrote texts — and logic games — that crackle with humor. People who write logic texts have a great advantage on other mathematicians (or philosophers). Almost any of their examples can be presented as a classically structured joke. Vector calculus isn’t so welcoming. But Carroll was good at logic-joke writing.

Developing good notation was one of Dodgson/Carroll’s ongoing efforts, though. I’m not aware of any of his symbols that have got general adoption. But he put forth some interesting symbols to denote the sine and cosine and other trigonometric functions. In 1861, the magazine The Athenaeum reviewed one of his books, with its new symbols for the basic trigonometric functions. (The link shows off all these symbols.) The reviewer was unconvinced, apparently.

I confess that I am, too, but mostly on typographical grounds. It is very easy to write or type out “sin θ” and get something that makes one think of the sine of angle θ. And I’m biased by familiarity, after all. But Carroll’s symbols have a certain appeal. I wonder if they would help people learning the functions keep straight what each one means.

The basic element of the symbols is a half-circle. The sine is denoted by the half-circle above the center, with a vertical line in the middle of that. So it looks a bit like an Art Deco ‘E’ fell over. The cosine is denoted by the half circle above the center, but with a horizontal line underneath. It’s as if someone started drawing Chad and got bored and wandered off. The tangent gets the same half-circle again, with a horizontal line on top of the arc, literally tangent to the circle.

There’s a subtle brilliance to this. One of the ordinary ways to think of trigonometric functions is to imagine a circle with radius 1 that’s centered on the origin. That is, its center has x-coordinate 0 and y-coordinate 0. And we imagine drawing the line that starts at the origin, and that is off at an angle θ from the positive x-axis. (That is, the line that starts at the origin and goes off to the right. That’s the direction where the x-coordinate of points is increasing and the y-coordinate is always zero.) (Yes, yes, these are line segments, or rays, rather than lines. Let it pass.)

The sine of the angle θ is also going to be the y-coordinate of the point where the line crosses the unit circle. That is, it’s the vertical coordinate of that point. So using a vertical line touching a semicircle to suggest the sine represents visually one thing that the sine means. And the cosine of the angle θ is going to be the x-coordinate of the point where the line crosses the unit circle. So representing the cosine with a horizontal line and a semicircle again underlines one of its meanings. And, for that matter, the line might serve as a reminder to someone that the sine of a right angle will be 1, while the cosine of an angle of zero is 1.
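
The coordinates-on-the-unit-circle picture can be checked numerically. This sketch uses Python's `math` module, which works in radians, so a right angle is π/2. The function name `unit_circle_point` is my own.

```python
import math

def unit_circle_point(theta):
    """Point where the ray at angle theta (radians) meets the unit circle."""
    return (math.cos(theta), math.sin(theta))  # (x-coordinate, y-coordinate)

right_angle = math.pi / 2

x0, y0 = unit_circle_point(0.0)          # ray along the positive x-axis
x1, y1 = unit_circle_point(right_angle)  # ray pointing straight up

print(x0, y0)
print(x1, y1)
```

At an angle of zero the point sits at the right edge of the circle, so the cosine is 1 and the sine is 0. At a right angle the point sits at the top, so the sine is 1 and the cosine is 0, give or take a tiny floating-point leftover.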

The tangent has a more abstract interpretation. But a line that comes up to and just touches a curve at a single point is, literally, a tangent line. This might not help one remember any useful values for the tangent. (That the tangent of zero is zero, the tangent of half a right angle is 1, the tangent of a right angle is undefined). But it’s still a guide to what things mean.

The cotangent is just the tangent upside-down. Literally; it’s the lower half of a circle, with a horizontal line touching it at its lowest point. That’s not too bad a symbol, actually. The cotangent of an angle is the reciprocal of the tangent of an angle. So making its symbol be the tangent flipped over is mnemonic.

The secant and cosecant are worse symbols, it must be admitted. The secant of an angle is the reciprocal of the cosine of the angle, and the cosecant is the reciprocal of the sine. As far as I can tell they’re mostly used because it’s hard to typeset $\frac{1}{\sin\left(\theta\right)}$. And to write instead $\sin^{-1}\left(\theta\right)$ would be confusing as that’s often used for the inverse sine, or arcsine, function. I don’t think these symbols help matters any. I’m surprised Carroll didn’t just flip over the cosine and sine symbols, the way he did with the cotangent.

The versed sine function is one that I got through high school without hearing about. I imagine you have too. The versed sine, or the versine, of an angle is equal to one minus the cosine of the angle. Why do we need such a thing? … Computational convenience is the best answer I can find. It turns up naturally if you’re trying to work out the distance between points on the surface of a sphere, so navigators needed to know it.

And if we need to work with small angles, then this can be more computationally stable than the cosine is. The cosine of a small angle is close to 1, and the difference between 1 and the cosine, if you need such a thing, may be lost to roundoff error. But the versed sine … well, it will be the same small number. But the table of versed sines you have to refer to will list more digits. There’s a difference between working out “1 – 0.9999” and working with “0.0001473”, if you need three digits of accuracy.
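
The same roundoff problem turns up in floating-point arithmetic, and it can be shown in a few lines. One caution: this sketch does not use a table of versed sines. It swaps in the half-angle identity $1 - \cos\theta = 2\sin^2\left(\frac{\theta}{2}\right)$ as the stable way to get the same number. The function names and the test angle are my own choices.

```python
import math

def versine_naive(theta):
    """One minus the cosine, computed directly."""
    return 1.0 - math.cos(theta)

def versine_stable(theta):
    """The same quantity by the half-angle identity 2 sin^2(theta / 2)."""
    return 2.0 * math.sin(theta / 2.0) ** 2

small_angle = 1e-9  # radians; an arbitrary very small angle
print(versine_naive(small_angle))
print(versine_stable(small_angle))
print(small_angle ** 2 / 2)  # small-angle approximation, for comparison
```

For an angle this small the cosine is so close to 1 that the subtraction has nothing meaningful left to report. The half-angle form never subtracts two nearly equal numbers, so it stays close to the small-angle approximation.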

But now we don’t need printed tables of trigonometric functions to get three (or many more) digits of accuracy. So we can afford to forget the versed sine ever existed. I learn (through Wikipedia) that there are also functions called versed cosines, coversed sines, hacoversed cosines, and excosecants, among others. These names have a wonderful melody and are almost poems by themselves. Just the same I’m glad I don’t have to remember what they all are.

Carroll’s notation just replaces the “sin” or “cos” or “tan” with these symbols, so you would have the half-circle and the line followed by θ or whatever variable you used for the angle. So the symbols don’t save any space on the line. They take fewer pen strokes to write, just two for each symbol. Writing the abbreviations out by hand takes three or four strokes (or for cosecant, as many as five), unless you’re writing in cursive. The symbols are still probably faster than the truncated words, though. So I don’t know why precisely the symbols didn’t take hold. I suppose part is that people were probably used to writing “sin θ”. And typesetters already got enough hazard pay dealing with mathematicians and their need for specialized symbols. Why add in another half-dozen or more specialized bits of type for something everyone’s already got along without?

Still, I think there might be some use in these as symbols for mathematicians in training. I’d be interested to know how they serve people just learning trigonometry.

N-tuple.

We use numbers to represent things we want to think about. Sometimes the numbers represent real-world things: the area of our backyard, the number of pets we have, the time until we have to go back to work. Sometimes the numbers mean something more abstract: an index of all the stuff we’re tracking, or how its importance compares to other things we worry about.

Often we’ll want to group together several numbers. Each of these numbers may measure a different kind of thing, but we want to keep straight what kind of thing it is. For example, we might want to keep track of how many people are in each house on the block. The houses have an obvious index number — the street number — and the number of people in each house is just what it says. So instead of just keeping track of, say, “32” and “34” and “36”, and “3” and “2” and “3”, we would keep track of pairs: “32, 3”, and “34, 2”, and “36, 3”. These are called ordered pairs.

They’re not called ordered because the numbers are in order. They’re called ordered because the order in which the numbers are recorded contains information about what the numbers mean. In this case, the first number is the street address, and the second number is the count of people in the house, and woe to our data set if we get that mixed up.

And there’s no reason the ordering has to stop at pairs of numbers. You can have ordered triplets of numbers — (32, 3, 2), say, giving the house number, the number of people in the house, and the number of bathrooms. Or you can have ordered quadruplets — (32, 3, 2, 6), say, house number, number of people, bathroom count, room count. And so on.

An n-tuple is an ordered set of some collection of numbers. How many? We don’t care, or we don’t care to say right now. There are two popular ways to pronounce it. One is to say it the way you say “multiple” only with the first syllable changed to “enn”. Others say it about the same, but with a long u vowel, so, “enn-too-pull”. I believe everyone worries that everyone else says it the other way and that they sound like they’re the weird ones.

You might care to specify what your n is for your n-tuple. In that case you can plug in a value for that n right in the symbol: a 3-tuple is an ordered triplet. A 4-tuple is that ordered quadruplet. A 26-tuple seems like rather a lot but I’ll trust that you know what you’re trying to study. A 1-tuple is just a number. We might use that if we’re trying to make our notation consistent with something else in the discussion.

If you’re familiar with vectors you might ask: so, an n-tuple is just a vector? It’s not quite. A vector is an n-tuple, but in the same way a square is a rectangle. It has to meet some extra requirements. To be a vector we have to be able to add corresponding numbers together and get something meaningful out of it. The ordered pair (32, 3) representing “32 blocks north and 3 blocks east” can be a vector. (32, 3) plus (34, 2) can give us (66, 5). This makes sense because we can say, “32 blocks north, 3 blocks east, 34 more blocks north, 2 more blocks east gives us 66 blocks north, 5 blocks east.” At least it makes sense if we don’t run out of city. But to add together (32, 3) plus (34, 2) meaning “house number 32 with 3 people plus house number 34 with 2 people gives us house number 66 with 5 people”? That’s not good, whatever town you’re in.
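A small Python sketch may make the distinction concrete. The helper name `add_tuples` is my own invention for illustration. The point is that the arithmetic is identical either way; whether the answer means anything depends on what the entries stand for.

```python
def add_tuples(a, b):
    """Add two n-tuples entry by entry, the way vectors are added."""
    if len(a) != len(b):
        raise ValueError("can only add tuples of the same length")
    return tuple(x + y for x, y in zip(a, b))


# (blocks north, blocks east): adding these is meaningful.
first_walk = (32, 3)
second_walk = (34, 2)
total_walk = add_tuples(first_walk, second_walk)
print(total_walk)

# (house number, people in house): the code adds these just as happily,
# but "house number 66 with 5 people" describes nothing real.
first_house = (32, 3)
second_house = (34, 2)
nonsense = add_tuples(first_house, second_house)
print(nonsense)
```

Nothing in the program knows which tuples are vectors. That knowledge lives in our heads, or in the comments.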

I think the commonest use of n-tuples is to talk about vectors, though. Vectors are such useful things.

Reading the Comics, April 15, 2015: Tax Day Edition

Since it is mid-April, and most of the comic strips at Comics Kingdom and GoComics.com are based in the United States, Comic Strip Master Command ordered quite a few comics about taxes. Most of those are simple grumbling, but the subject naturally comes around to arithmetic and calculation and sometimes even logic. Thus, this is a Tax Day edition, though it’s bookended with Mutt and Jeff.

Bud Fisher’s Mutt And Jeff (April 11), a rerun from goodness only knows when and almost certainly neither written nor drawn by Bud Fisher at that point, recounts a joke that has the form of a word problem in which a person’s age is deduced from information about the age. It’s an old form, but jokes about cutting the Gordian knot are probably always going to be reliable. I’m reminded there’s a story of Thomas Edison giving a new hire, a mathematician, the problem of working out the volume of a light bulb. Edison got impatient with the mathematician treating it as a calculus problem (the volume of a rotationally symmetric object like a bulb is the sort of thing you can do by the end of Freshman Calculus) and instead filled a bulb with water, poured the water into a graduated cylinder, and read the volume off that.

Sandra Bell-Lundy’s Between Friends (April 12) uses Calculus as the shorthand for “the hardest stuff you might have to deal with”. The symbols on the left-hand side are fair enough, although I’d think of them more as precalculus or linear algebra or physics, but they do parse well enough as long as I suppose that what sure looks like a couple of extraneous + signs are meant to refer to “t”. But “t” is a common enough variable in calculus problems, usually representing time, sometimes just representing “some parameter whose value we don’t really care about, but we don’t want it to be x”, and it looks an awful lot like a plus sign there too. On the right side, I have no idea what a root of forty minutes on a treadmill might be. It’s symbolic.

Reading the Comics, February 24, 2014: Getting Caught Up Edition

And now, I think, I’ve got caught up on the mathematics-themed comics that appeared at Comics Kingdom and at Gocomics.com over the past week and a half. I’m sorry to say today’s entries don’t get to be about as rich a set of topics as the previous bunch’s, but on the other hand, there’s a couple Comics Kingdom strips that I feel comfortable using as images, so there’s that. And come to think of it, none of them involve the setup of a teacher asking a student in class a word problem, so that’s different.

Mason Mastroianni, Mick Mastroianni, and Perri Hart’s B.C. (February 21) tells the old joke about how much of fractions someone understands. To me the canonical version of the joke was a Sydney Harris panel in which one teacher complains that five-thirds of the class doesn’t understand a word she says about fractions, but it’s all the same gag. I’m a touch amused that three and five turn up in this version of the joke too. That probably reflects writing necessity — especially for this B.C. the numbers have to be a pair that obviously doesn’t give you one-half — and that, somehow, odd numbers seem to read as funnier than even ones.

Bud Fisher’s Mutt and Jeff (February 21) decimates one of the old work-rate problems, this one about how long it takes a group of people to eat a pot roast. It was surely an old joke even when this comic first appeared (and I can’t tell you when it was; Gocomics.com’s reruns have been a mixed bunch of 1940s and 1950s ones, but they don’t say when the original run date was), but the spread across five panels treats the joke well as it’s able to be presented as a fuller stage-ready sketch. Modern comic strips value an efficiently told, minimalist joke, but pacing and minor punch lines (“some men don’t eat as fast as others”) add their charm to a comic.

Combining Matrices And Model Universes

I would like to resume talking about matrices and really old universes and the way nucleosynthesis in these model universes causes atoms to keep settling down to a peculiar but unchanging distribution.

I’d already described how a matrix offers a nice way to organize elements, and in ways that encode information about the context of the elements by where they’re placed. That’s useful and saves some writing, certainly, although by itself it’s not that interesting. Matrices start to get really powerful when, first, the elements being stored are things on which you can do something like arithmetic with pairs of them. Here I mostly just mean that you can add together two elements, or multiply them, and get back something meaningful.

This typically means that the matrix is made up of a grid of numbers, although that isn’t actually required, just, really common if we’re trying to do mathematics.

Then you get the ability to add together and multiply together the matrices themselves, turning pairs of matrices into some new matrix, and building something that works a lot like arithmetic on these matrices.

Adding one matrix to another is done in almost the obvious way: add the element in the first row, first column of the first matrix to the element in the first row, first column of the second matrix; that’s the first row, first column of your new matrix. Then add the element in the first row, second column of the first matrix to the element in the first row, second column of the second matrix; that’s the first row, second column of the new matrix. Add the element in the second row, first column of the first matrix to the element in the second row, first column of the second matrix, and put that in the second row, first column of the new matrix. And so on.

This means you can only add together two matrices that are the same size — the same number of rows and of columns — but that doesn’t seem unreasonable.

You can also do something called scalar multiplication of a matrix, in which you multiply every element in the matrix by the same number. A scalar is just a number that isn’t part of a matrix. This multiplication is useful, not least because it lets us talk about how to subtract one matrix from another: to find the difference of the first matrix and the second, scalar-multiply the second matrix by -1, and then add the first to that product. But you can do scalar multiplication by any number, by two or minus pi or by zero if you feel like it.

I should say something about notation. When we want to write out these kinds of operations efficiently, of course, we turn to symbols to represent the matrices. We can, in principle, use any symbols, but by convention a matrix usually gets represented with a capital letter, A or B or M or P or the like. So to add matrix A to matrix B, with the result being matrix C, we can write out the equation “A + B = C”, which is about as simple as we could hope to see. Scalars are normally written in lowercase letters, often Greek letters, if we don’t know what the number is, so that the scalar multiplication of the number r and the matrix A would be the product “rA”, and we could write the difference between matrix A and matrix B as “A + (-1)B” or “A – B”.
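Addition, scalar multiplication, and subtraction are short enough to write out in code. This is only a sketch, assuming a matrix is stored as a Python list of rows; the function names are hypothetical, chosen to match the words used above.

```python
def add(A, B):
    """Add two matrices of the same size, element by element."""
    same_shape = len(A) == len(B) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(A, B)
    )
    if not same_shape:
        raise ValueError("matrices must have the same rows and columns")
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def scalar_multiply(r, A):
    """Multiply every element of the matrix A by the scalar r."""
    return [[r * a for a in row] for row in A]


def subtract(A, B):
    """A - B, built exactly as described: A + (-1)B."""
    return add(A, scalar_multiply(-1, B))


A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]
C = add(A, B)        # "A + B = C"
D = subtract(A, B)   # "A - B"
print(C)
print(D)
```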

Matrix multiplication, now, that is done by a process that sounds like doubletalk, and it takes a while of practice to do it right. But there are good reasons for doing it that way and we’ll get to one of those reasons by the end of this essay.

To multiply matrix A and matrix B together, we do multiply various pairs of elements from both matrix A and matrix B. The surprising thing is that we also add together sets of these products, per this rule.

Take the element in the first row, first column of A, and multiply it by the element in the first row, first column of B. Add to that the product of the element in the first row, second column of A and the second row, first column of B. Add to that total the product of the element in the first row, third column of A and the third row, first column of B, and so on. When you’ve run out of columns of A and rows of B, this total is the first row, first column of the product of the matrices A and B.

Plenty of work. But we have more to do. Take the product of the element in the first row, first column of A and the element in the first row, second column of B. Add to that the product of the element in the first row, second column of A and the element in the second row, second column of B. Add to that the product of the element in the first row, third column of A and the element in the third row, second column of B. And keep adding those up until you’re out of columns of A and rows of B. This total is the first row, second column of the product of matrices A and B.

This does mean that you can multiply matrices of different sizes, provided the first one has as many columns as the second has rows. And the product may be a completely different size from the first or second matrices. It also means it might be possible to multiply matrices in one order but not the other: if matrix A has four rows and three columns, and matrix B has three rows and two columns, then you can multiply A by B, but not B by A.
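The rule may be easier to follow as a loop than as a paragraph. Here is a sketch in Python, again assuming matrices stored as lists of rows. The function name and the sample numbers are made up for illustration; only the sizes (four by three, and three by two) come from the example above.

```python
def multiply(A, B):
    """Multiply matrix A by matrix B using the row-times-column rule."""
    rows_a, cols_a = len(A), len(A[0])
    rows_b, cols_b = len(B), len(B[0])
    if cols_a != rows_b:
        raise ValueError("A needs as many columns as B has rows")
    product = []
    for i in range(rows_a):          # each row of A
        new_row = []
        for j in range(cols_b):      # each column of B
            # Walk along row i of A and down column j of B,
            # multiplying matched pairs and adding up the products.
            total = sum(A[i][k] * B[k][j] for k in range(cols_a))
            new_row.append(total)
        product.append(new_row)
    return product


A = [[1, 0, 2], [0, 1, 0], [3, 0, 1], [1, 1, 1]]  # four rows, three columns
B = [[1, 2], [0, 1], [4, 0]]                      # three rows, two columns
AB = multiply(A, B)                               # four rows, two columns
print(AB)

try:
    multiply(B, A)
except ValueError as error:
    print("B times A fails:", error)
```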

My recollection on learning this process was that this was crazy, and the workload ridiculous, and I imagine people who get this in Algebra II, and don’t go on to using mathematics later on, remember the process as nothing more than an unpleasant blur of doing a lot of multiplying and addition for some reason or other.

So here is one of the reasons why we do it this way. Let me define two matrices:

$A = \left(\begin{tabular}{c c c} 3/4 & 0 & 2/5 \\ 1/4 & 3/5 & 2/5 \\ 0 & 2/5 & 1/5 \end{tabular}\right)$

$B = \left(\begin{tabular}{c} 100 \\ 0 \\ 0 \end{tabular}\right)$

Then matrix A times B is

$AB = \left(\begin{tabular}{c} 3/4 * 100 + 0 * 0 + 2/5 * 0 \\ 1/4 * 100 + 3/5 * 0 + 2/5 * 0 \\ 0 * 100 + 2/5 * 0 + 1/5 * 0 \end{tabular}\right) = \left(\begin{tabular}{c} 75 \\ 25 \\ 0 \end{tabular}\right)$

You’ve seen those numbers before, of course: the matrix A contains the probabilities I put in my first model universe to describe the chances that over the course of a billion years a hydrogen atom would stay hydrogen, or become iron, or become uranium, and so on. The matrix B contains the original distribution of atoms in the toy universe, 100 percent hydrogen and nothing of anything else. And the product of A and B was exactly the distribution after that first billion years: 75 percent hydrogen, 25 percent iron, no uranium.

If we multiply the matrix A by that product again — well, you should expect we’re going to get the distribution of elements after two billion years, that is, 56.25 percent hydrogen, 33.75 percent iron, 10 percent uranium, but let me write it out anyway to show:

$\left(\begin{tabular}{c c c} 3/4 & 0 & 2/5 \\ 1/4 & 3/5 & 2/5 \\ 0 & 2/5 & 1/5 \end{tabular}\right)\left(\begin{tabular}{c} 75 \\ 25 \\ 0 \end{tabular}\right) = \left(\begin{tabular}{c} 3/4 * 75 + 0 * 25 + 2/5 * 0 \\ 1/4 * 75 + 3/5 * 25 + 2/5 * 0 \\ 0 * 75 + 2/5 * 25 + 1/5 * 0 \end{tabular}\right) = \left(\begin{tabular}{c} 56.25 \\ 33.75 \\ 10 \end{tabular}\right)$

And if you don’t know just what would happen if we multiplied A by that product, you aren’t paying attention.

This also gives a reason why matrix multiplication is defined this way. The operation captures neatly the operation of making a new thing — in the toy universe case, hydrogen or iron or uranium — out of some combination of fractions of an old thing — again, the former distribution of hydrogen and iron and uranium.

Or here’s another reason. Since this matrix A has three rows and three columns, you can multiply it by itself and get a matrix of three rows and three columns out of it. That matrix, which we can write as $A^2$, then describes how two billion years of nucleosynthesis would change the distribution of elements in the toy universe. A times A times A would give three billion years of nucleosynthesis; $A^{10}$ ten billion years. The actual calculating of the numbers in these matrices may be tedious, but it describes a complicated operation very efficiently, which we always want to do.
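The tedium is exactly the sort of thing to hand to a computer. This Python sketch uses the matrices A and B defined above, with exact fractions so no roundoff creeps in. The function names `multiply` and `power` are my own, and the repeated-multiplication loop is the plainest way to get a power, not the fastest.

```python
from fractions import Fraction as F


def multiply(A, B):
    """Row-times-column matrix product of lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("A needs as many columns as B has rows")
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def power(A, n):
    """Multiply the square matrix A by itself n times, starting from the identity."""
    size = len(A)
    result = [[F(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = multiply(result, A)
    return result


A = [
    [F(3, 4), F(0), F(2, 5)],
    [F(1, 4), F(3, 5), F(2, 5)],
    [F(0), F(2, 5), F(1, 5)],
]
B = [[F(100)], [F(0)], [F(0)]]  # all hydrogen to start

for billions_of_years in (1, 2, 3, 10):
    distribution = multiply(power(A, billions_of_years), B)
    print(billions_of_years, [float(row[0]) for row in distribution])
```

Multiplying by $A^2$ in one step gives the same two-billion-year distribution as multiplying by A twice in a row.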

I should mention another bit of notation. We usually use capital letters to represent matrices; but, a matrix that’s just got one column is also called a vector. That’s often written with a lowercase letter, with a little arrow above the letter, as in $\vec{x}$, or in bold typeface, as in $\mathbf{x}$. (The arrows are easier to put in writing, the bold easier when you were typing on typewriters.) But if you’re doing a lot of writing this out, and know that (say) x isn’t being used for anything but vectors, then even that arrow or boldface will be forgotten. Then we’d write the product of matrix A and vector x as just Ax. (There are also cases where you put a little caret over the letter; that’s to denote that it’s a vector that’s one unit of length long.)

When you start writing vectors without an arrow or boldface you start to run the risk of confusing what symbols mean scalars and what ones mean vectors. That’s one of the reasons that Greek letters are popular for scalars. It’s also common to put scalars to the left and vectors to the right. So if one saw “rMx”, it would be expected that r is a scalar, M a matrix, and x a vector, and if they’re not then this should be explained in text nearby, preferably before the equations. (And of course if it’s work you’re doing, you should know going in what you mean the letters to represent.)

16,000 and a Square

I reached my 16,000th page view, sometime on Thursday. That’s a tiny bit slower than I projected based on May’s readership statistics, but May was a busy month and I’ve had a little less time to write stuff this month, so I’m not feeling bad about that.

Meanwhile, while looking for something else, I ran across a bit about mathematical notation in Florian Cajori’s A History of Mathematical Notation which has left me with a grin since. The book is very good about telling the stories of just what the title suggests. It’s a book well worth dipping into because everything you see written down is the result of a long process of experimentation and fiddling about to find the right balance of “expressing an idea clearly” and “expressing an idea concisely” and “expressing an idea so it’s not too hard to work with”.

The idea here is the square of a variable, which these days we’d normally write as $a^2$. According to Cajori (section 304), René Descartes “preferred the notation $aa$ to $a^2$.” Cajori notes that Carl Gauss had this same preference and defended it on the grounds that doubling the symbol didn’t take any more (or less) space than the superscript 2 did. Cajori lists other great mathematicians who preferred doubling the letter for squaring, including Christiaan Huygens, Edmond Halley, Leonhard Euler, and Isaac Newton. Among mathematicians who preferred $a^2$ were Blaise Pascal, David Gregory (who was big in infinite series), and Wilhelm Leibniz.

Well of course Newton and Leibniz would be on opposite sides of the $aa$ versus $a^2$ debate. How could the universe be sensible otherwise?

What I Call Some Impossible Logic Problems

I’m sorry to go another day without following up the essay I meant to follow up, but it’s been a frantically busy week on a frantically busy month and something has to give somewhere. But before I return the Symbolic Logic book to the library — Project Gutenberg has the first part of it, but the second is soundly in copyright, I would expect (its first publication in a recognizable form was in the 1970s) — I wanted to pick some more stuff out of the second part.

When last we discussed divisibility rules, particularly, rules for just adding up the digits in a number to tell what it might divide by, we had worked out rules for testing divisibility by eight. In that, we take the sum of four times the hundreds digit, plus two times the tens digit, plus the units digit, and if that sum is divisible by eight, then so was the original number. This hasn’t got the slick, smooth memorability of the rules for three and nine — just add all the numbers up — or the simplicity of checking for divisibility by ten, five, or two — just look at the last digit — but it’s not a complicated rule either.

Still, we came at it through an experimental method, fiddling around with possible rules until we found one which seemed to work. It seemed to work, and since we found out there are only a thousand possible cases to consider we can check that it works in every one of those cases. That’s tiresome to do, but functions, and it’s a legitimate way of forming mathematical rules. Quite a number of proofs amount to dividing a problem into several different cases and showing that whatever we mean to prove is so in each case.
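Checking a thousand cases is tiresome by hand but trivial for a computer. Here is a sketch in Python; the function name is hypothetical. It looks only at the last three digits, on the assumption (worked out earlier in this thread) that any whole number of thousands is already a multiple of eight.

```python
def passes_eight_test(n):
    """Apply the digit rule: 4 * hundreds + 2 * tens + units divisible by 8."""
    hundreds = (n // 100) % 10
    tens = (n // 10) % 10
    units = n % 10
    return (4 * hundreds + 2 * tens + units) % 8 == 0


# Brute force: compare the rule with honest division for 0 through 999.
mismatches = [
    n for n in range(1000)
    if passes_eight_test(n) != (n % 8 == 0)
]
print("cases where the rule disagrees with division:", mismatches)
```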

Let’s see what we can do to tidy up the proof, though, and see if we can make it work without having to test out so many cases. We can, or I’d have been foolish to start this essay rather than another; along the way, though, we can remove the traces that show the experimenting that led to the technique. We can put forth the cleaned-up reasoning and look all the more clever because it isn’t so obvious how we got there. This is another common property of proofs; the most attractive or elegant method of presenting them can leave the reader wondering how it was ever imagined.

Pinball and Large Numbers

I had another little occasion to reflect on the ways of representing numbers, as well as the chance to feel a bit foolish, this past weekend so I’m naturally driven to share it. This came about on visiting the Silverball Museum, a pinball museum, or arcade, in Asbury Park, New Jersey. (I’m not sure the exact difference between a museum in which games are playable by visitors and an arcade, except for the signs affixed to nearly all the games.) Naturally I failed to bring my camera, so I can’t easily show what I had in mind; too bad.

Pinballs, at least once they got around to having electricity installed, need to show the scores. Since about the mid-1990s these have been shown by dot matrix displays, which are pretty easy to read — the current player’s score can be shown extremely large, for example — and make it easy for the game to go into different modes, where the scoring and objectives of play vary for a time. From about the mid-1970s to the mid-1990s eight-segment light-emitting diodes were preferred, for that “small alarm clock” look. And going before that were rotating number wheels, which are probably the iconic look to pinball score boards, to the extent anyone thinks of a classic pinball machine in that detail.

But there’s another score display, which I must admit offends my sense of order. In this, which I noticed mostly in the machines from the 1950s, with a few outliers in the early 60s (often used in conjunction with the rotating wheels), the parts of the number are broken apart, and the score is read by adding up the parts which are lit up. The machine I was looking at had one column of digits for the millions, another for hundreds of thousands, and then another with two-digit numbers.

A Quick Impersonation Of Base Nine

I now resume the thread of spotting multiples of numbers easily. Thanks to the way positional notation lets us write out numbers as some multiple of our base, which is so nearly always ten it takes some effort to show where it’s not, it’s easy to spot whether a number is a multiple of that base, or some factor of the base, just by looking at the last digit. And if we’re interested in factors of some whole power of the base, of the ten squared which is a hundred, or the ten cubed which is a thousand, or so, we can find all we want to know just by looking at the last two or last three or last or-so digits.

Sadly, three and nine don’t go into ten, and never go into any power of ten either. Six and seven won’t either, although that exhausts the numbers below ten which don’t go into any power of ten. Of course, we also have the unpleasant point that eleven won’t go into a hundred or thousand or ten-thousand or more, and so won’t many other numbers we’d like.

If we didn’t have to use base ten, if we could use base nine, then we could get the benefits of instantly recognizing multiples of three or nine that we get for multiples of five or ten. If the digits of a number are some strand R finished off with an a, then the number written as Ra means the number gotten by multiplying nine by R and adding to that a. The whole strand will be divisible by nine whenever a is, which is to say when a is zero; and the whole strand will be divisible by three when a is, that is, when a is zero, three, or six.
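To see the last-digit rule at work, here is a Python sketch that writes numbers out in base nine and inspects only the final digit. The function name `to_base_nine` is made up for illustration.

```python
def to_base_nine(n):
    """Return the base-nine digits of a non-negative integer, leading digit first."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, 9)  # peel off the last base-nine digit
        digits.append(remainder)
    return digits[::-1]


for number in (27, 30, 42, 81, 100):
    digits = to_base_nine(number)
    last = digits[-1]
    print(
        number, "is", digits, "in base nine;",
        "multiple of nine:", last == 0, ";",
        "multiple of three:", last in (0, 3, 6),
    )
```

The digit lists play the part of the strand R followed by a: everything before the last entry is R, and the last entry is a.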

Some Names Which e Doesn’t Have

I’ve outlined now some of the numbers which grew important enough to earn their own names. Most of them are counting numbers; the stragglers are a handful of irrational numbers which proved themselves useful, such as π (pi), or attractive, such as φ (phi), or physically important, such as the fine structure constant. Unnamed except in the list of categories is the number whose explanation I hope to be the first movement of this blog: e.

It’s an important number physically, and a convenient and practical number mathematically. For all that, it defies a simple explanation like π enjoys. The simplest description of which I’m aware is that it is the base of the natural logarithm, which perfectly clarifies things to people who know what logarithms are, know which one is the natural logarithm, and know what the significance of the base is. This I will explain, but not today. For now it’s enough to think of the base as a size of the measurement tool, and to know that switching between one base and another is akin to switching between measuring in centimeters and measuring in inches. What the logarithm is will also wait for explanation; for now, let me hold off on that by saying it’s, in a way, a measure of how many digits it takes to write down a number, so that “81” has a logarithm twice that of “9”, and “49” twice that of “7”, and please don’t take this description so literally as to think the logarithm of “81” is equal to that of “49”.

I agree it’s not clear why we should be interested in the natural logarithm when there are an infinity of possible logarithms, and we can convert a logarithm base e into a logarithm base 10 just by multiplying by the correct number. That, too, will come.
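Both claims can be checked numerically before the proper explanation arrives. This is a small sketch using Python's standard `math` module; the helper name `log10_from_natural` is my own, and the "correct number" it multiplies by is 1 divided by the natural logarithm of 10.

```python
import math


def log10_from_natural(x):
    """Turn a natural logarithm into a base-10 logarithm by one multiplication."""
    conversion = 1.0 / math.log(10.0)  # the same constant for every x
    return math.log(x) * conversion


print(log10_from_natural(1000.0), math.log10(1000.0))

# The logarithm of 81 is twice that of 9, and 49 twice that of 7,
# though the logarithms of 81 and 49 are not equal to each other.
print(math.log(81) / math.log(9))
print(math.log(49) / math.log(7))
print(math.log(81), math.log(49))
```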

Another common explanation is to say that e describes how fast savings will grow under the influence of compound interest. A dollar invested at one-hundred-percent interest, compounded daily, for a year, will grow to just about e dollars. Compounded hourly it grows even closer; compounded by the second it grows closer still; compounded annually, it stays pretty far away. The comparison is probably perfectly clear to those who can invest in anything with interest compounded daily. For my part I note when I finally opened an individual retirement account I put a thousand dollars into an almost thoughtfully selected mutual fund, and within mere weeks had lost \$15. That about finishes off compound interest to me.
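For anyone whose savings have fared as well as mine, the compounding claim can at least be tested for free. This Python sketch assumes an annual rate of 100 percent, which is the rate at which the compounded dollar approaches e; the function name and the choice of compounding periods are mine.

```python
import math


def grow_one_dollar(periods_per_year, annual_rate=1.0):
    """Value of one dollar after a year, with interest compounded this often."""
    return (1.0 + annual_rate / periods_per_year) ** periods_per_year


schedules = [
    ("annually", 1),
    ("daily", 365),
    ("hourly", 365 * 24),
    ("by the second", 365 * 24 * 3600),
]
for label, periods in schedules:
    value = grow_one_dollar(periods)
    gap = abs(math.e - value)
    print(f"{label:>14}: {value:.8f} (differs from e by {gap:.2e})")
```

Ordinary floating point starts to get in the way if the compounding periods are made much shorter than a second, so the sketch stops there.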