I’m writing this on Thursday, because I’m expecting to be busy Friday and Saturday. It might be a good policy if I planned the deadline for all these Reading the Comics posts to be a couple days before publishing. But it’ll probably ever come to that. I am not yet begun resisting treating this blog like a professional would. Well, what’s been interesting this week so far have been comic strips presenting or about story problems. That’s enough for a theme.
Olivia Jaimes’s Nancy for the 13th makes the familiar complaint that story problems aren’t “useful”. Perhaps not; she can cut the Gordion knot for most common setups. Well, students aren’t likely to get problems for which there’s no other way to find a solution. I suppose what’s happening is that many mathematical puzzles come from questions like, what’s the least amount of information you need to deduce something? Or what’s an indirect way to find that? Mathematicians are often drawn to questions like this. At least Nancy has found there are problems she’s legitimately interested in, questions about how to do a thing she finds important.
Mort Walker and Dik Browne’s vintage Hi and Lois for the 17th has Chip’s new arithmetic book trying to be more relevant. Chip’s still bored by the problem, which chooses to be about foreign aid. (In the 50s and 60s comics discovered it was very funny that the United States would just give money to other countries and not get anything back out of it except maybe their economies staying stable or their countries not going to war too much.)
The End 2016 A-To-Z feels like the one where I figured out how to do these things. Normal Numbers, this entry, is a piece that felt like I was making the breakthrough I wanted. Some of it is about the technical definition of normal numbers. But I also got into why normal numbers are interesting. Mysterious, even, in the medieval-theologist sense of of mystery. Almost every number is normal, but we only know a handful of normal numbers. So far as I’m aware, though, we don’t know of any number that’s interesting in its own right that’s also normal. It seems like a paradox.
My first A-to-Z sequence was in the summer of 2015. It’s the most primitive of my sequences and I really see how my writing style has changed from then. Well, there’s not much point to writing a lot if your style doesn’t change. Today, though, I’d like to bring up a piece that still holds up. It’s about Measure, another of those concepts that weaves through so many sections of mathematics. It’s also got a charming little anecdote of maybe dubious relevance as I bring spackle into things. That’s quite me.
Today’s A To Z term is another free choice. So I’m picking a term from the world of … mathematics. There are a lot of norms out there. Many are specialized to particular roles, such as looking at complex-valued numbers, or vectors, or matrices, or polynomials.
Still they share things in common, and that’s what this essay is for. And I’ve brushed up against the topic before.
The norm, also, has nothing particular to do with “normal”. “Normal” is an adjective which attaches to every noun in mathematics. This is security for me as while these A-To-Z sequences may run out of X and Y and W letters, I will never be short of N’s.
A “norm” is the size of whatever kind of thing you’re working with. You can see where this is something we look for. It’s easy to look at two things and wonder which is the smaller.
There are many norms, even for one set of things. Some seem compelling. For the real numbers, we usually let the absolute value do this work. By “usually” I mean “I don’t remember ever seeing a different one except from someone introducing the idea of other norms”. For a complex-valued number, it’s usually the square root of the sum of the square of the real part and the square of the imaginary coefficient. For a vector, it’s usually the square root of the vector dot-product with itself. (Dot product is this binary operation that is like multiplication, if you squint, for vectors.) Again, these, the “usually” means “always except when someone’s trying to make a point”.
Which is why we have the convention that there is a “the norm” for a kind of operation. The norm dignified as “the” is usually the one that looks as much as possible like the way we find distances between two points on a plane. I assume this is because we bring our intuition about everyday geometry to mathematical structures. You know how it is. Given an infinity of possible choices we take the one that seems least difficult.
Every sort of thing which can have a norm, that I can think of, is a vector space. This might be my failing imagination. It may also be that it’s quite easy to have a vector space. A vector space is a collection of things with some rules. Those rules are about adding the things inside the vector space, and multiplying the things in the vector space by scalars. These rules are not difficult requirements to meet. So a lot of mathematical structures are vector spaces, and the things inside them are vectors.
A norm is a function that has these vectors as its domain, and the non-negative real numbers as its range. And there are three rules that it has to meet. So. Give me a vector ‘u’ and a vector ‘v’. I’ll also need a scalar, ‘a. Then the function f is a norm when:
. This is a famous rule, called the triangle inequality. You know how in a triangle, the sum of the lengths of any two legs is greater than the length of the third leg? That’s the rule at work here.
. This doesn’t have so snappy a name. Sorry. It’s something about being homogeneous, at least.
If then u has to be the additive identity, the vector that works like zero does.
Norms take on many shapes. They depend on the kind of thing we measure, and what we find interesting about those things. Some are familiar. Look at a Euclidean space, with Cartesian coordinates, so that we might write something like (3, 4) to describe a point. The “the norm” for this, called the Euclidean norm or the L2 norm, is the square root of the sum of the squares of the coordinates. So, 5. But there are other norms. The L1 norm is the sum of the absolute values of all the coefficients; here, 7. The L∞ norm is the largest single absolute value of any coefficient; here, 4.
A polynomial, meanwhile? Write it out as . Take the absolute value of each of these terms. Then … you have choices. You could take those absolute values and add them up. That’s the L1 polynomial norm. Take those absolute values and square them, then add those squares, and take the square root of that sum. That’s the L2 norm. Take the largest absolute value of any of these coefficients. That’s the L∞ norm.
These don’t look so different, even though points in space and polynomials seem to be different things. We designed the tool. We want it not to be weirder than it has to be. When we try to put a norm on a new kind of thing, we look for a norm that resembles the old kind of thing. For example, when we want to define the norm of a matrix, we’ll typically rely on a norm we’ve already found for a vector. At least to set up the matrix norm; in practice, we might do a calculation that doesn’t explicitly use a vector’s norm, but gives us the same answer.
If we have a norm for some vector space, then we have an idea of distance. We can say how far apart two vectors are. It’s the norm of the difference between the vectors. This is called defining a metric on the vector space. A metric is that sense of how far apart two things are. What keeps a norm and a metric from being the same thing is that it’s possible to come up with a metric that doesn’t match any sensible norm.
It’s always possible to use a norm to define a metric, though. Doing that promotes our normed vector space to the dignified status of a “metric space”. Many of the spaces we find interesting enough to work in are such metric spaces. It’s hard to think of doing without some idea of size.
Today, I’m just listing the comics from last week that mentioned mathematics, but which didn’t raise a deep enough topic to be worth discussing. You know what a story problem looks like. I can’t keep adding to that.
Hector D. Cantú and Carlos Castellanos’s Baldo for the 10th quotes René Descartes, billing him as a “French mathematician”. Which is true, but the quote is one about living properly. That’s more fairly a philosophical matter. Descartes has some reputation for his philosophical work, I understand.
Today’s A To Z term was nominated again by @aajohannas. The other compelling nomination was from Vayuputrii, for the Mittag-Leffler function. I was tempted. But I realized I could not think of a clear way to describe why the function was interesting. Or even where it comes from that avoided being a heap of technical terms. There’s no avoiding technical terms in writing about mathematics, but there’s only so much I want to put in at once either. It also makes me realize I don’t understand the Mittag-Leffler function, but it is after all something I haven’t worked much with.
The Mittag-Leffler function looks like it’s one of those things named for several contributors, like Runge-Kutta Integration or Cauchy-Kovalevskaya Theorem or something. Not so here; this was one person, Gösta Mittag-Leffler. His name’s all over the theory of functions. And he was one of the people helping Sofia Kovalevskaya, whom you know from every list of pioneering women in mathematics, secure her professorship.
A martingale is how mathematicians prove you can’t get rich gambling.
Well, that exaggerates. Some people will be lucky, of course. But there’s no strategy that works. The only strategy that works is to rig the game. You can do this openly, by setting rules that give you a slight edge. You usually have to be the house to do this. Or you can do it covertly, using tricks like card-counting (in blackjack) or weighted dice or other tricks. But a fair game? Meaning one not biased towards or against any player? There’s no strategy to guarantee winning that.
We can make this more technical. Martingales arise form the world of stochastic processes. This is an indexed set of random variables. A random variable is some variable with a value that depends on the result of some phenomenon. A tossed coin. Rolled dice. Number of people crossing a particular walkway over a day. Engine temperature. Value of a stock being traded. Whatever. We can’t forecast what the next value will be. But we now the distribution, which values are more likely and which ones are unlikely and which ones impossible.
The field grew out of studying real-world phenomena. Things we could sample and do statistics on. So it’s hard to think of an index that isn’t time, or some proxy for time like “rolls of the dice”. Stochastic processes turn up all over the place. A lot of what we want to know is impossible, or at least impractical, to exactly forecast. Think of the work needed to forecast how many people will cross this particular walk four days from now. But it’s practical to describe what are more and less likely outcomes. What the average number of walk-crossers will be. What the most likely number will be. Whether to expect tomorrow to be a busier or a slower day.
And this is what the martingale is for. Start with a sequence of your random variables. How many people have crossed that street each day since you started studying. What is the expectation value, the best guess, for the next result? Your best guess for how many will cross tomorrow? Keeping in mind your knowledge of how all these past values. That’s an important piece. It’s not a martingale if the history of results isn’t a factor.
Every probability question has to deal with knowledge. Sometimes it’s easy. The probability of a coin coming up tails next toss? That’s one-half. The probability of a coin coming up tails next toss, given that it came up tails last time? That’s still one-half. The probability of a coin coming up tails next toss, given that it came up tails the last 40 tosses? That’s … starting to make you wonder if this is a fair coin. I’d bet tails, but I’d also ask to examine both sides, for a start.
So a martingale is a stochastic process where we can make forecasts about the future. Particularly, the expectation value. The expectation value is the sum of the products of every possible value and how probable they are. In a martingale, the expected value for all time to come is just the current value. So if whatever it was you’re measuring was, say, 40 this time? That’s your expectation for the whole future. Specific values might be above 40, or below 40, but on average, 40 is it.
Put it that way and you’d think, well, how often does that ever happen? Maybe some freak process will give you that, but most stuff?
Well, here’s one. The random walk. Set a value. At each step, it can increase or decrease by some fixed value. It’s as likely to increase as to decrease. This is a martingale. And it turns out a lot of stuff is random walks. Or can be processed into random walks. Even if the original walk is unbalanced — say it’s more likely to increase than decrease. Then we can do a transformation, and find a new random variable based on the original. Then that one is as likely to increase as decrease. That one is a martingale.
It’s not just random walks. Poisson processes are things where the chance of something happening is tiny, but it has lots of chances to happen. So this measures things like how many car accidents happen on this stretch of road each week. Or where a couple plants will grow together into a forest, as opposed to lone trees. How often a store will have too many customers for the cashiers on hand. These processes by themselves aren’t often martingales. But we can use them to make a new stochastic process, and that one is a martingale.
Where this all comes to gambling is in stopping times. This is a random variable that’s based on the stochastic process you started with. Its value at each index represents the probability that the random variable in that has reached some particular value by this index. The language evokes a gambler’s decision: when do you stop? There are two obvious stopping times for any game. One is to stop when you’ve won enough money. The other is to stop when you’ve lost your whole stake.
So there is something interesting about a martingale that has bounds. It will almost certainly hit at least one of those bounds, in a finite time. (“Almost certainly” has a technical meaning. It’s the same thing I mean when I say if you flip a fair coin infinitely many times then “almost certainly” it’ll come up tails at least once. Like, it’s not impossible that it doesn’t. It just won’t happen.) And for the gambler? The boundary of “runs out of money” is a lot closer than “makes the house run out of money”.
Oh, if you just want a little payoff, that’s fine. If you’re happy to walk away from the table with a one percent profit? You can probably do that. You’re closer to that boundary than to the runs-out-of-money one. A ten percent profit? Maybe so. Making an unlimited amount of money, like you’d want to live on your gambling winnings? No, that just doesn’t happen.
This gets controversial when we turn from gambling to the stock market. Or a lot of financial mathematics. Look at the value of a stock over time. I write “stock” for my convenience. It can be anything with a price that’s constantly open for renegotiation. Stocks, bonds, exchange funds, used cars, fish at the market, anything. The price over time looks like it’s random, at least hour-by-hour. So how can you reliably make money if the fluctuations of the price of a stock are random?
Well, if I knew, I’d have smaller student loans outstanding. But martingales seem like they should offer some guidance. Much of modern finance builds on not dealing with a stock price varying. Instead, buy the right to buy the stock at a set price. Or buy the right to sell the stock at a set price. This lets you pay to secure a certain profit, or a worst-possible loss, in case the price reaches some level. And now you see the martingale. Is it likely that the stock will reach a certain price within this set time? How likely? This can, in principle, guide you to a fair price for this right-to-buy.
The mathematical reasoning behind that is fine, so far as I understand it. Trouble arises because pricing correctly means having a good understanding of how likely it is prices will reach different levels. Fortunately, there are few things humans are better at than estimating probabilities. Especially the probabilities of complicated situations, with abstract and remote dangers.
So martingales are an interesting corner of mathematics. They apply to purely abstract problems like random walks. Or to good mathematical physics problems like Brownian motion and the diffusion of particles. And they’re lurking behind the scenes of the finance news. Exciting stuff.
I’m hopefully going to pass the halfway point on this year’s mathematics A-To-Z. This makes it a good time to panel for topics for the next several letters in the alphabet. It’s easier for me to keep my notes straight if you post requests as comments on this thread, but I’ll try to keep up if you do comment on other threads.
As ever, I’m happy to consider most mathematical topics, including ones that I’ve written about in the past if I think I can better an old essay. If there’s several suggestions for the same letter, I’ll pick the one that I think I can do most interestingly. If several seem interesting I might try rephrasing, if the subject allows for that.
And I do thank everyone who makes a suggestion, especially if it’s one that surprises me and that makes me learn something along the way.
Here’s the essays I’ve written in past years for the letters O through T.
Comic Strip Master Command hoped to give me an easy week, one that would let me finally get ahead on my A-to-Z essays and avoid the last-minute rush to complete tasks. I showed them, though. I can procrastinate more than they can give me breaks. This essay alone I’m writing about ten minutes after you read it.
Eric the Circle for the 7th, by Shoy, is one of the jokes where Eric’s drawn as something besides a circle. I can work with this, though, because the cube is less far from a circle than you think. It gets to what we mean by “a circle”. If it’s all the points that are exactly a particular distance from a given center? Or maybe all the points up to that particular distance from a given center? This seems too reasonable to argue with, so you know where the trick is.
The trick is asking what we mean by distance? The ordinary distance that normal people use has a couple names. The Euclidean distance, often. Or Euclidean metric. Euclidean norm. It has some fancier names that can wait. Give two points. You can find this distance easily if you have their coordinates in a Cartesian system. (There’s infinitely many Cartesian systems you could use. You can pick whatever one you like; the distance will be the same whatever they are.) That’s that thing about finding the distance between corresponding coordinates, squaring those distances, adding that up, and taking the square root. And that’s good.
That’s not our only choice, though. We can make a perfectly good distance using other rules. For example, take the difference between corresponding coordinates, take the absolute value of each, and add all those absolute values up. This distance even has real-world application. It’s how far it is to go from one place to another on a grid of city squares, where it’s considered poor form to walk directly through buildings. There’s another. Instead of adding those absolute values up? Just pick the biggest of the absolute values. This is another distance. In it, circles look like squares. Or, in three dimensions, spheres look like cubes.
Ryan North’s Dinosaur Comics for the 9th builds on a common science fictional premise, that contact with an alien intelligence is done through mathematics first. It’s a common supposition in science fiction circles, and among many scientists, that mathematics is a truly universal language. It’s hard to imagine a species capable of communication with us that wouldn’t understand two and two adding up to four. Or about the ratio of a circle circumference to its diameter being independent of that diameter. Or about how an alternating knot for which the minimum number of crossing points is odd can’t ever be amphicheiral.
All right, I guess I can imagine a species that never ran across that point. Which is one of the things we suppose in using mathematics as a universal language. Its truths are indisputable, if we allow the rules of logic and axioms and definitions that we use. And I agree I don’t know that it’s possible not to notice basic arithmetic and basic geometry, not if one lives in a sensory world much like humans’. But it does seem to me at least some of mathematics is probably idiosyncratic. In representation at least; certainly in organization. I suspect there may be trouble in using universal and generically true things to express something local and specific. I don’t know how to go from deductive logic to telling someone when my birthday is. Well, I’m sure our friends in the philosophy department have considered that problem and have some good thoughts we can use, if there were only some way to communicate with them.
Bill Whitehead’s Free Range for the 12th is your classic blackboard-full-of-symbols. I like the beauty of the symbols used. I mean, the whole expression doesn’t parse, but many of the symbols do and are used in reasonable ways. Long trailing strings of arrows to extend one line to another are common and reasonable too. In the middle of the second line is , which doesn’t make sense, but which doesn’t make sense in a way that seems authentic to working out an idea. It’s something that could be cleaned up if the reasoning needed to be made presentable.
I don’t use pictures enough for any of my essays. Even ones that would really benefit from them, like the Julia Set. So let me share one of the rare times I did. I got to use some pictures of my first visit to Niagara Falls, and tell about stepping into the river above the Falls, all as part of discussing what mathematicians mean by “local”.
I couldn’t find a place to fit this in the essay proper. But it’s too good to leave out. The simplex method, discussed within, traces to George Dantzig. He’d been planning methods for the US Army Air Force during the Second World War. Dantzig is a person you have heard about, if you’ve heard any mathematical urban legends. In 1939 he was late to Jerzy Neyman’s class. He took two statistics problems on the board to be homework. He found them “harder than usual”, but solved them in a couple days and turned in the late homework hoping Neyman would be understanding. They weren’t homework. They were examples of famously unsolved problems. Within weeks Neyman had written one of the solutions up for publication. When he needed a thesis topic Neyman advised him to just put what he already had in a binder. It’s the stuff every grad student dreams of. The story mutated. It picked up some glurge to become a narrative about positive thinking. And mutated further, into the movie Good Will Hunting.
Every three days one of the comic strips I read has the elderly main character talk about how they never used algebra. This is my hyperbole. But mathematics has got the reputation for being difficult and inapplicable to everyday life. We’ll concede using arithmetic, when we get angry at the fast food cashier who hands back our two pennies before giving change for our $6.77 hummus wrap. But otherwise, who knows what an elliptic integral is, and whether it’s working properly?
Linear programming does not have this problem. In part, this is because it lacks a reputation. But those who have heard of it, acknowledge it as immensely practical mathematics. It is about something a particular kind of human always finds compelling. That is how to do a thing best.
There are several kinds of “best”. There is doing a thing in as little time as possible. Or for as little effort as possible. For the greatest profit. For the highest capacity. For the best score. For the least risk. The goals have a thousand names, none of which we need to know. They all mean the same thing. They mean “the thing we wish to optimize”. To optimize has two directions, which are one. The optimum is either the maximum or the minimum. To be good at finding a maximum is to be good at finding a minimum.
It’s obvious why we call this “programming”; obviously, we leave the work of finding answers to a computer. It’s a spurious reason. The “programming” here comes from an independent sense of the word. It means more about finding a plan. Think of “programming” a night’s entertainment, so that every performer gets their turn, all scene changes have time to be done, you don’t put two comedians right after the other, and you accommodate the performer who has to leave early and the performer who’ll get in an hour late. Linear programming problems are often about finding how to do as well as possible given various priorities. All right. At least the “linear” part is obvious. A mathematics problem is “linear” when it’s something we can reasonably expect to solve. This is not the technical meaning. Technically what it means is we’re looking at a function something like:
Here, x, y, and z are the independent variables. We don’t know their values but wish to. a, b, and c are coefficients. These values are set to some constant for the problem, but they might be something else for other problems. They’re allowed to be positive or negative or even zero. If a coefficient is zero, then the matching variable doesn’t affect matters at all. The corresponding value can be anything at all, within the constraints.
I’ve written this for three variables, as an example and because ‘x’ and ‘y’ and ‘z’ are comfortable, familiar variables. There can be fewer. There can be more. There almost always are. Two- and three-variable problems will teach you how to do this kind of problem. They’re too simple to be interesting, usually. To avoid committing to a particular number of variables we can use indices. for values of j from 1 up to N. Or we can bundle all these values together into a vector, and write everything as . This has a particular advantage since when we can write the coefficients as a vector too. Then we use the notation of linear algebra, and write that we hope to maximize the value of:
(The superscript T means “transpose”. As a linear algebra problem we’d usually think of writing a vector as a tall column of things. By transposing that we write a long row of things. By transposing we can use the notation of matrix multiplication.)
This is the objective function. Objective here in the sense of goal; it’s the thing we want to find the best possible value of.
We have constraints. These represent limits on the variables. The variables are always things that come in limited supply. There’s no allocating more money than the budget allows, nor putting more people on staff than work for the company. Often these constraints interact. Perhaps not only is there only so much staff, but no one person can work more than a set number of days in a row. Something like that. That’s all right. We can write all these constraints as a matrix equation. An inequality, properly. We can bundle all the constraints into a big matrix named A, and demand:
Also, traditionally, we suppose that every component of is non-negative. That is, positive, or at lowest, zero. This reflects the field’s core problems of figuring how to allocate resources. There’s no allocating less than zero of something.
But we need some bounds. This is easiest to see with a two-dimensional problem. Try it yourself: draw a pair of axes on a sheet of paper. Now put in a constraint. Doesn’t matter what. The constraint’s edge is a straight line, which you can draw at any position and any angle you like. This includes horizontal and vertical. Shade in one side of the constraint. Whatever you shade in is the “feasible region”, the sets of values allowed under the constraint. Now draw in another line, another constraint. Shade in one side or the other of that. Draw in yet another line, another constraint. Shade in one side or another of that. The “feasible region” is whatever points have taken on all these shades. If you were lucky, this is a bounded region, a triangle. If you weren’t lucky, it’s not bounded. It’s maybe got some corners but goes off to the edge of the page where you stopped shading things in.
So adding that every component of is at least as big as zero is a backstop. It means we’ll usually get a feasible region with a finite volume. What was the last project you worked on that had no upper limits for anything, just minimums you had to satisfy? Anyway if you know you need something to be allowed less than zero go ahead. We’ll work it out. The important thing is there’s finite bounds on all the variables.
I didn’t see the bounds you drew. It’s possible you have a triangle with all three shades inside. But it’s also possible you picked the other sides to shade, and you have an annulus, with no region having more than two shades in it. This can happen. It means it’s impossible to satisfy all the constraints at once. At least one of them has to give. You may be reminded of the sign taped to the wall of your mechanics’ about picking two of good-fast-cheap.
But impossibility is at least easy. What if there is a feasible region?
Well, we have reason to hope. The optimum has to be somewhere inside the region, that’s clear enough. And it even has to be on the edge of the region. If you’re not seeing why, think of a simple example, like, finding the maximum of , inside the square where x is between 0 and 2 and y is between 0 and 3. Suppose you had a putative maximum on the inside, like, where x was 1 and y was 2. What happens if you increase x a tiny bit? If you increase y by twice that? No, it’s only on the edges you can get a maximum that can’t be locally bettered. And only on the corners of the edges, at that.
(This doesn’t prove the case. But it is what the proof gets at.)
So the problem sounds simple then! We just have to try out all the vertices and pick the maximum (or minimum) from them all.
OK, and here’s where we start getting into trouble. With two variables and, like, three constraints? That’s easy enough. That’s like five points to evaluate? We can do that.
We never need to do that. If someone’s hiring you to test five combinations I admire your hustle and need you to start getting me consulting work. A real problem will have many variables and many constraints. The feasible region will most often look like a multifaceted gemstone. It’ll extend into more than three dimensions, usually. It’s all right if you just imagine the three, as long as the gemstone is complicated enough.
Because now we’ve got lots of vertices. Maybe more than we really want to deal with. So what’s there to do?
The basic approach, the one that’s foundational to the field, is the simplex method. A “simplex” is a triangle. In three dimensions, anyway. In four dimensions it’s a tetrahedron. In two dimensions it’s a line segment. Generally, however many dimensions of space you have? The simplex is the simplest thing that fills up volume in your space.
You know how you can turn any polygon into a bunch of triangles? Just by connecting enough vertices together? You can turn a polyhedron into a bunch of tetrahedrons, by adding faces that connect trios of vertices. And for polyhedron-like shapes in more dimensions? We call those polytopes. Polytopes we can turn into a bunch of simplexes. So this is why it’s the “simplex method”. Any one simplex it’s easy to check the vertices on. And we can turn the polytope into a bunch of simplexes. And we can ignore all the interior vertices of the simplexes.
So here’s the simplex method. First, break your polytope up into simplexes. Next, pick any simplex; doesn’t matter which. Pick any outside vertex of that simplex. This is the first viable possible solution. It’s most likely wrong. That’s okay. We’ll make it better.
Because there are other vertices on this simplex. And there are other simplexes, adjacent to that first, which share this vertex. Test the vertices that share an edge with this one. Is there one that improves the objective function? Probably. Is there a best one of those in this simplex? Sure. So now that’s our second viable possible solution. If we had to give an answer right now, that would be our best guess.
But this new vertex, this new tentative solution? It shares edges with other vertices, across several simplexes. So look at these new neighbors. Are any of them an improvement? Which one of them is the best improvement? Move over there. That’s our next tentative solution.
You see where this is going. Keep at this. Eventually it’ll wind to a conclusion. Usually this works great. If you have, like, 8 constraints, you can usually expect to get your answer in from 16 to 24 iterations. If you have 20 constraints, expect an answer in from 40 to 60 iterations. This is doing pretty well.
But it might take a while. It’s possible for the method to “stall” a while, often because one or more of the variables is at its constraint boundary. Or the division of polytope into simplexes got unlucky, and it’s hard to get to better solutions. Or there might be a string of vertices that are all at, or near, the same value, so the simplex method can’t resolve where to “go” next. In the worst possible case, the simplex method takes a number of iterations that grows exponentially with the number of constraints. This, yes, is very bad. It doesn’t typically happen. It’s a numerical algorithm. There’s some problem to spoil any numerical algorithm.
You may have complaints. Like, the world is complicated. Why are we only looking at linear objective functions? Or, why only look at linear constraints? Well, if you really need to do that? Go ahead, but that’s not linear programming anymore. Think hard about whether you really need that, though. Linear anything is usually simpler than nonlinear anything. I mean, if your optimization function absolutely has to have in it? Could we just say you have a new variable that just happens to be equal to the square of y? Will that work? If you have to have the sine of z? Are you sure that z isn’t going to get outside the region where the sine of z is pretty close to just being z? Can you check?
Maybe you have, and there’s just nothing for it. That’s all right. This is why optimization is a living field of study. It demands judgement and thought and all that hard work.
John Hambrock’s The Brilliant Mind of Edison Lee for the 1st of October is a calendar joke. Well, many of the months used to have names that denoted their count. Month names have changed more than you’d think. For a while there every Roman Emperor was renaming months after himself. Most of these name changes did not stick. Lucius Aurelius Commodus, who reined from 177 to 192, gave all twelve months one or another of his names.
I’m more fluent in graph theory, and my writing will reflect that. But its critical insight involves looking at spaces and ignoring things like distance and area and angle. It is amazing that one can discard so much of geometry and still have anything to consider. What we do learn then applies to very many problems.
Königsberg Bridge Problem.
Once upon a time there was a city named Königsberg. It no longer is. It is Kaliningrad now. It’s no longer in that odd non-contiguous chunk of Prussia facing the Baltic Sea. It’s now in that odd non-contiguous chunk of Russia facing the Baltic Sea.
I put it this way because what the city evokes, to mathematicians, is a story. I do not have specific reason to think the story untrue. But it is a good story, and as I think more about history I grow more skeptical of good stories. A good story teaches, though not always the thing it means to convey.
The story is this. The city is on two sides of the Pregel river, now the Pregolya River. Two large islands are in the river. For several centuries these four land masses were connected by a total of seven bridges. And we are told that people in the city would enjoy free time with an idle puzzle. Was there a way to walk all seven bridges one and only one time? If no one did something fowl like taking a boat to cross the river, or not going the whole way across a bridge, anyway? There were enough bridges, though, and enough possible ways to cross them, that trying out every option was hopeless.
Then came Leonhard Euler. Who is himself a preposterous number of stories. Pick any major field of mathematics; there is an Euler’s Theorem at its center. Or an Euler’s Formula. Euler’s Method. Euler’s Function. Likely he brought great new light to it.
And in 1736 he solved the Königsberg Bridge Problem. The answer was to look at what would have to be true for a solution to exist. He noticed something so obvious it required genius not to dismiss it. It seems too simple to be useful. In a successful walk you enter each land mass (river bank or island) the same number of times you leave it. So if you cross each bridge exactly once, you use an even number of bridges per land mass. The exceptions are that you must start at one land mass, and end at a land mass. Maybe a different one. How you get there doesn’t count for the problem. How you leave doesn’t either. So the land mass you start from may have an odd number of bridges. So may the one you end on. So there are up to two land masses that may have an odd number of bridges.
Once this is observed, it’s easy to tell that Königsberg’s Bridges did not match that. All four land masses in Königsberg have an odd number of bridges. And so we could stop looking. It’s impossible to walk the seven bridges exactly once each in a tour, not without cheating.
Graph theoreticians, like the topologists of my prologue, now consider this foundational to their field. To look at a geographic problem and not concern oneself with areas and surfaces and shapes? To worry only about how sets connect? This guides graph theory in how to think about networks.
The city exists, as do the islands, and the bridges existed as described. So does Euler’s solution. And his reasoning is sound. The reasoning is ingenious, too. Everything hard about the problem evaporates. So what do I doubt about this fine story?
Teo Paoletti, author of that web page, says Danzig mayor Carl Leonhard Gottlieb Ehler wrote Euler, asking for a solution. This falls short of proving that the bridges were a common subject of speculation. It does show at least that Ehler thought it worth pondering. Euler apparently did not think it was even mathematics. Not that he thought it was hard; he simply thought it didn’t depend on mathematical principles. It took only reason. But he did find something interesting: why was it not mathematics? Paoletti quotes Euler as writing:
This question is so banal, but seemed to me worthy of attention in that [neither] geometry, nor algebra, nor even the art of counting was sufficient to solve it.
I am reminded of a mathematical joke. It’s about the professor who always went on at great length about any topic, however slight. I have no idea why this should stick with me. Finally one day the professor admitted of something, “This problem is not interesting.” The students barely had time to feel relief. The professor went on: “But the reasons why it is not interesting are very interesting. So let us explore that.”
The Königsberg Bridge Problem is in the first chapter of every graph theory book ever. And it is a good graph theory problem. It may not be fair to say it created graph theory, though. Euler seems to have treated this as a little side bit of business, unrelated to his real mathematics. Graph theory as we know it — as a genre — formed in the 19th century. So did topology. In hindsight we can see how studying these bridges brought us good questions to ask, and ways to solve them. But for something like a century after Euler published this, it was just the clever solution to a recreational mathematics puzzle. It was as important as finding knight’s tours of chessboards.
That we take it as the introduction to graph theory, and maybe topology, tells us something. It is an easy problem to pose. Its solution is clever, but not obscure. It takes no long chains of complex reasoning. Many people approach mathematics problems with fear. By telling this story, we promise mathematics that feels as secure as a stroll along the riverfront. This promise is good through about chapter three, section four, where there are four definitions on one page and the notation summons obscure demons of LaTeX.
Still. Look at what the story of the bridges tells us. We notice something curious about our environment. The problem seems mathematical, or at least geographic. The problem is of no consequence. But it lingers in the mind. The obvious approaches to solving it won’t work. But think of the problem differently. The problem becomes simple. And better than simple. It guides one to new insights. In a century it gives birth to two fields of mathematics. In two centuries these are significant fields. They’re things even non-mathematicians have heard of. It’s almost a mathematician’s fantasy of insight and accomplishment.
But this does happen. The world suggests no end of little mathematics problems. Sometimes they are wonderful. Richard Feynman’s memoirs tell of his imagination being captured by a plate spinning in the air. Solving that helped him resolve a problem in developing Quantum Electrodynamics. There are more mundane problems. One of my professors in grad school remembered tossing and catching a tennis racket and realizing he didn’t know why sometimes it flipped over and sometimes didn’t. His specialty was in dynamical systems, and he could work out the mechanics of what a tennis racket should do, and when. And I know that within me is the ability to work out when a pile of books becomes too tall to stand on its own. I just need to work up to it.
The story of the Königsberg Bridge Problem is about this. Even if nobody but the mayor of Danzig pondered how to cross the bridges, and he only got an answer because he infected Euler with the need to know? It is a story of an important piece of mathematics. Good stories will tell us things that are true, which are not necessarily the things that happen in them.
I enjoyed a very busy September 2019 around here. This in several senses. For one, I had a posting every day through September. This is a state I sometimes achieve during A-to-Z months. This time around I’m helped by making two days of the week “Exploiting My Archives” posts. All they do is point to older posts. But that still counts as a new post. And I continue to believe without checking that the number of times I post is the one thing within my control that affects my readership. So let’s dig in to the details of that readership, mm? Thanks.
69 countries or country-like polities sent me readers in September. That’s up from the 65 of August and 64 of July. 19 of these were single-page-view countries. That’s up from the 17 of the past two months. Which countries were these all?
United Arab Emirates
Hong Kong SAR China
Trinidad & Tobago
Hungary, Portugal, and Ukraine had a single page view in August too. Vietnam has settled for one page view a month for five months now. Yes, I’d love to know what the post is.
It’s another month in a row that the Philippines have sent me the second-greatest number of page views. I don’t know why. I’m curious if, like, some teacher found I had a good reference for something and a couple classes have gone looking that up reliably.
But if I were to guess what anyone was reading? Well, 296 pages besides my home page got at least one page view in September. 172 of them got more than one page view. 37 got at least ten views. The most popular pieces? That selection slightly defied my expectations.
It’s the first time a Reading the Comics post hasn’t made the top five since, well, August. The first Reading the Comics post to appear was in ninth place, though. At least record grooves and the real number system diagram are staying popular.
So this was a popular month around here, as mentioned. I logged 2,444 page views from 1,387 unique visitors. This is well above the twelve-month running average of 1363.9 views from 845.9 unique visitors. Both figures are my new record highs. And the views-per-visitor average of 1.76 is the highest I’ve logged since November of 2018.
There were 110 things given likes in September, nearly double the twelve-month running average of 64.0. And there were 36 comments, again nearly double the twelve-month running average of 21.1. Which, combined with the fact there were 30 posts, more than double the twelve-month running average of 14.2 in the month, implies …
Well, if we look at things per post, the implication is clear: daily is too much me. There were 81.5 views per posting, compared to the running average of 99.6. There were 46.2 unique visitors per posting, compared to the running average of 63.1. 3.7 likes per posting, compared to the 4.6 running average. 1.2 comments per posting, below the average of 1.4. It does suggest that while posting a lot is good, posting every day is not. Or at least it’s not necessary, if I want to maximize views per posting.
As of the start of October I’d published 122 things this year, for a total of 115,143 words. This was 27,695 words published in September. That’s an average of 923.2 words per post in September, still somehow below the 944 average words per post all this year. I credit the “Exploiting my Archives” posts, which are a couple of sentences each.
From the start of the blog to the start of October 2019 I’d posted 1,324 things. This collected 85,194 page views in total, from 44,139 logged unique visitors.
And I had a new record busiest day. The 6th of September saw 249 page views from 162 unique visitors. I have to suspect that I got mentioned in some online forum.
If you’d like to be a regular reader, please do. You can use the “Follow Nebusresearch” button in the upper right corner of the page to add my essays to your WordPress reader. If you’d rather read without my ability to detect you, try using the https://nebusresearch.wordpress.com/feed/ RSS feed. If you need an RSS reader, you can still get a free Livejournal or Dreamwidth account and add any RSS feeds you like through that.
Several of the mathematically-themed comic strips from last week featured the fine art of calculation. So that was set to be my title for this week. Then I realized that all the comics worth some detailed mention were published last Sunday, and I do like essays that are entirely one-day affairs. There are a couple of other comic strips that mentioned mathematics tangentially and I’ll list those later this week.
John Hambrock’s The Brilliant Mind of Edison lee for the 29th has Edison show off an organic computer. This is a person, naturally enough. Everyone can do some arithmetic in their heads, especially if we allow that sometimes approximate answers are often fine. People with good speed and precision have always been wonders, though. The setup may also riff on the ancient joke of mathematicians being ways to turn coffee into theorems. (I would imagine that Hambrock has heard that joke. But it is enough to suppose that he’s aware many adult humans drink coffee.)
John Kovaleski’s Daddy Daze for the 29th sees Paul, the dad, working out the calculations his son (Angus) proposed. It’s a good bit of arithmetic that Paul’s doing in his head. The process of multiplying an insubstantial thing by many, many times until you get something of moderate size happens all the time. Much of integral calculus is based on the idea that we can add together infinitely many infinitesimal numbers, and from that get something understandable on the human scale. Saving nine seconds every other day is useless for actual activities, though. You need a certain fungibility in the thing conserved for the bother to be worth it.
Dan Thompson’s Harley for the 29th gets us into some comic strips not drawn by people named John. The comic has some mathematics in it qualitatively. The observation that you could jump a motorcycle farther, or higher, with more energy, and that you can get energy from rolling downhill. It’s here mostly because of the good fortune that another comic strip did a joke on the same topic, and did it quantitatively. That comic?
Bill Amend’s FoxTrot for the 29th. Young prodigies Jason and Marcus are putting serious calculation into their Hot Wheels track and working out the biggest loop-the-loop possible from a starting point. Their calculations are right, of course. Bill Amend, who’d been a physics major, likes putting authentic mathematics and mathematical physics in. The key is making sure the car moves fast enough in the loop that it stays on the track. This means the car experiencing a centrifugal force that’s larger than that of gravity. The centrifugal force on something moving in a circle is proportional to the square of the thing’s speed, and inversely proportional to the radius of the circle. This for a circle in any direction, by the way.
So they need to know, if the car starts at the height A, how fast will it go at the top of the loop, at height B? If the car’s going fast enough at height B to stay on the track, it’s certainly going fast enough to stay on for the rest of the loop.
The hard part would be figuring the speed at height B. Or it would be hard if we tried calculating the forces, and thus acceleration, of the car along the track. This would be a tedious problem. It would depend on the exact path of the track, for example. And it would be a long integration problem, which is trouble. There aren’t many integrals we can actually calculate directly. Most of the interesting ones we have to do numerically or work on approximations of the actual thing. This is all right, though. We don’t have to do that integral. We can look at potential energy instead. This turns what would be a tedious problem into the first three lines of work. And one of those was “Kinetic Energy = Δ Potential Energy”.
But as Peter observes, this does depend on supposing the track is frictionless. We always do this in basic physics problems. Friction is hard. It does depend on the exact path one follows, for example. And it depends on speed in complicated ways. We can make approximations to allow for friction losses, often based in experiment. Or try to make the problem one that has less friction, as Jason and Marcus are trying to do.
I’ve written some good essays for i-words. There’s one that stands out from the pack. If I did not refer people to the Infinite Monkey Theorem I would be giving bad advice. This is the one about monkeys typing Shakespeare. Along the way of researching it I discovered that Bob Newhart seems to have played a role in turning a thought-experiment about probability into something any comic strip can make a quick joke about. I still would like to know more about the history of monkeys-at-typewriters jokes. If somebody knows how to get a research grant for this sort of thing, please, hook me up.
Today’s A To Z term is my pick again. So I choose the Julia Set. This is named for Gaston Julia, one of the pioneers in chaos theory and fractals. He was born earlier than you imagine. No, earlier than that: he was born in 1893.
The early 20th century saw amazing work done. We think of chaos theory and fractals as modern things, things that require vast computing power to understand. The computers help, yes. But the foundational work was done more than a century ago. Some of these pioneering mathematicians may have been able to get some numerical computing done. But many did not. They would have to do the hard work of thinking about things which they could not visualize. Things which surely did not look like they imagined.
We think of things as moving. Even static things we consider as implying movement. Else we’d think it odd to ask, “Where does that road go?” This carries over to abstract things, like mathematical functions. A function is a domain, a range, and a rule matching things in the domain to things in the range. It “moves” things as much as a dictionary moves words.
Yet we still think of a function as expressing motion. A common way for mathematicians to write functions uses little arrows, and describes what’s done as “mapping”. We might write . This is a general idea. We’re expressing that it maps things in the set D to things in the set R. We can use the notation to write something more specific. If ‘z’ is in the set D, we might write . This describes the rule that matches things in the domain to things in the range. represents the evaluation of this rule at a specific point, the one where the independent variable has the value ‘2’. represents the evaluation of this rule at a specific point without committing to what that point is. represents a collection of points. It’s the set you get by evaluating the rule at every point in D.
And it’s not bad to think of motion. Many functions are models of things that move. Particles in space. Fluids in a room. Populations changing in time. Signal strengths varying with a sensor’s position. Often we’ll calculate the development of something iteratively, too. If the domain and the range of a function are the same set? There’s no reason that we can’t take our z, evaluate f(z), and then take whatever that thing is and evaluate f(f(z)). And again. And again.
My age cohort, at least, learned to do this almost instinctively when we discovered you could take the result on a calculator and hit a function again. Calculate something and keep hitting square root; you get a string of numbers that eventually settle on 1. Or you started at zero. Calculate something and keep hitting square; you settle at either 0, 1, or grow to infinity. Hitting sine over and over … well, that was interesting since you might settle on 0 or some other, weird number. Same with tangent. Cosine you wouldn’t settle down to zero.
Serious mathematicians look at this stuff too, though. Take any set ‘D’, and find what its image is, f(D). Then iterate this, figuring out what f(f(D)) is. Then f(f(f(D))). f(f(f(f(D)))). And so on. What happens if you keep doing this? Like, forever?
We can say some things, at least. Even without knowing what f is. There could be a part of D that all these many iterations of f will send out to infinity. There could be a part of D that all these many iterations will send to some fixed point. And there could be a part of D that just keeps getting shuffled around without ever finishing.
Some of these might not exist. Like, doesn’t have any fixed points or shuffled-around points. It sends everything off to infinity. has only a fixed point; nothing from it goes off to infinity and nothing’s shuffled back and forth. has a fixed point and a lot of points that shuffle back and forth.
Thinking about these fixed points and these shuffling points gets us Julia Sets. These sets are the fixed points and shuffling-around points for certain kinds of functions. These functions are ones that have domain and range of the complex-valued numbers. Complex-valued numbers are the sum of a real number plus an imaginary number. A real number is just what it says on the tin. An imaginary number is a real number multiplied by . What is ? It’s the imaginary unit. It has the neat property that . That’s all we need to know about it.
Oh, also, zero times is zero again. So if you really want, you can say all real numbers are complex numbers; they’re just themselves plus . Complex-valued functions are worth a lot of study in their own right. Better, they’re easier to study (at the introductory level) than real-valued functions are. This is such a relief to the mathematics major.
And now let me explain some little nagging weird thing. I’ve been using ‘z’ to represent the independent variable here. You know, using it as if it were ‘x’. This is a convention mathematicians use, when working with complex-valued numbers. An arbitrary complex-valued number tends to be called ‘z’. We haven’t forgotten x, though. We just in this context use ‘x’ to mean “the real part of z”. We also use “y” to carry information about the imaginary part of z. When we write ‘z’ we hold in trust an ‘x’ and ‘y’ for which . This all comes in handy.
But we still don’t have Julia Sets for every complex-valued function. We need it to be a rational function. The name evokes rational numbers, but that doesn’t seem like much guidance. is a rational function. It seems too boring to be worth studying, though, and it is. A “rational function” is a function that’s one polynomial divided by another polynomial. This whether they’re real-valued or complex-valued polynomials.
So. Start with an ‘f’ that’s one complex-valued polynomial divided by another complex-valued polynomial. Start with the domain D, all of the complex-valued numbers. Find f(D). And f(f(D)). And f(f(f(D))). And so on. If you iterated this ‘f’ without limit, what’s the set of points that never go off to infinity? That’s the Julia Set for that function ‘f’.
There are some famous Julia sets, though. There are the Julia sets that we heard about during the great fractal boom of the 1980s. This was when computers got cheap enough, and their graphic abilities good enough, to automate the calculation of points in these sets. At least to approximate the points in these sets. And these are based on some nice, easy-to-understand functions. First, you have to pick a constant C. This C is drawn from the complex-valued numbers. But that can still be, like, ½, if that’s what interests you. For whatever your C is? Define this function:
And that’s it. Yes, this is a rational function. The numerator function is . The denominator function is .
This produces many different patterns. If you picked C = 0, you get a circle. Good on you for starting out with something you could double-check. If you picked C = -2? You get a long skinny line, again, easy enough to check. If you picked C = -1? Well, now you have a nice interesting weird shape, several bulging ovals with peninsulas of other bulging ovals all over. Pick other numbers. Pick numbers with interesting imaginary components. You get pinwheels. You get jagged streaks of lightning. You can even get separate islands, whole clouds of disjoint threatening-looking blobs.
There is some guessing what you’ll get. If you work out a Julia Set for a particular C, you’ll see a similar-looking Julia Set for a different C that’s very close to it. This is a comfort.
You can create a Julia Set for any rational function. I’ve only ever seen anyone actually do it for functions that look like what we already had. . Sometimes . I suppose once, in high school, I might have tried but I don’t remember what it looked like. If someone’s done, say, please write in and let me know what it looks like.
The Julia Set has a famous partner. Maybe the most famous fractal of them all, the Mandelbrot Set. That’s the strange blobby sea surrounded by lightning bolts that you see on the cover of every pop mathematics book from the 80s and 90s. If a C gives us a Julia Set that’s one single, contiguous patch? Then that C is in the Mandelbrot Set. Also vice-versa.
The ideas behind these sets are old. Julia’s paper about the iterations of rational functions first appeared in 1918. Julia died in 1978, the same year that the first computer rendering of the Mandelbrot set was done. I haven’t been able to find whether that rendering existed before his death. Nor have I decided which I would think the better sequence.
The second half of last week’s mathematically-themed comic strips had an interesting range of topics. Two of them seemed to circle around the making of models. So that’s my name for this installment.
Ryan North’s Dinosaur Comics for the 26th has T-Rex trying to build a model. In this case, it’s to project how often we should expect to see a real-life Batman. T-Rex is building a simple model, which is fine. Simple models, first, are usually easier to calculate with. How they differ from reality can give a guide to how to make a more complex model. Or they can indicate the things that have to be learned in order to make a more complex model. The difference between a model’s representation and the observed reality (or plausibly expected reality) can point out problems in one’s assumptions, too.
For example, T-Rex supposes that a Batman needs to have billionaire parents. This makes for a tiny number of available parents. But surely what’s important is that a Batman be wealthy enough he doesn’t have to show up to any appointments he doesn’t want to make. Having a half-billion dollars, or a “mere” hundred million, would allow that. Even a Batman who had “only” ten million dollars would be about as free to be a superhero. Similarly, consider the restriction to Olympic athletes. Astronaut Ed White, who on Gemini IV became the first American to walk in space, was not an Olympic athlete; but he certainly could have been. He missed by a split-second in the 400 meter hurdles race. Surely someone as physically fit as Ed White would be fit enough for a Batman. Not to say that “Olympic athletes or NASA astronauts” is a much bigger population than “Olympic athletes”. (And White was unusually fit even for NASA astronauts.) But it does suggest that merely counting Olympic athletes is too restrictive.
But that’s quibbling over the exact numbers. The process is a good rough model. List all the factors, suppose that all the factors are independent of one another, and multiply how likely it is each step happens by the population it could happen to. It’s hard to imagine a simpler model, but it’s a place to start.
Greg Wallace’s Nothing Is Not Something for the 26th is a bit of a geometry joke. It’s built on the idiom of the love triangle, expanding it into more-sided shapes. Relationships between groups of people like this can be well-represented in graph theory, with each person a vertex, and each pair of involved people an edge. There are even “directed graphs”, where each edge contains a direction. This lets one represent the difference between requited and unrequited interests.
Brian Anderson’s Dog Eat Doug for the 27th has Sophie the dog encounter some squirrels trying to disprove a flat Earth. They’re not proposing a round Earth either; they’ve gone in for a rhomboid. Sophie’s right to point out that drilling is a really hard way to get through the Earth. That’s a practical matter, though.
Is it possible to tell something about the shape of a whole thing from a small spot? In the terminology, what kind of global knowledge can we get from local information? We can do some things. For example, we can draw a triangle on the surface of the Earth and measure the interior angles to see what they sum to. If this could be done perfectly, finding that the interior angles add up to more than 180 degrees would show the triangle’s on a spherical surface. But that also has practical limitations. Like, if we find that locally the planet is curved then we can rule out it being entirely flat. But it’s imaginable that we’d be on the one dome of an otherwise flat planet. At some point you have to either assume you’re in a typical spot, or work out ways to find what’s atypical. In the Conspiracy Squirrels’ case, that would be the edge between two faces of the rhomboid Earth. Then it becomes something susceptible to reason.
Zach Weinersmith’s Saturday Morning Breakfast Cereal for the 27th has the mathematician making another model. And this is one of the other uses of a model: to show a thing can’t happen, show that it would have results contrary to reason. But then you have to validate the model, showing that its premises do represent reality so well that its conclusion should be believed. This can be hard. There’s some nice symbol-writing on the chalkboard here, although I don’t see that they parse. Particularly, the bit on the right edge of the panel, where the writing has a rotated-by-180-degrees ‘E’ followed by an ‘x’, a rotated-by-180-degrees ‘A’, and then a ‘z’, is hard to fit inside an equation like this. The string of symbols mean “there exists some x for which, for all z, (something) is true”. This fits at the start of a proof, or before an equation starts. It doesn’t make grammatical sense in the middle of an equation. But, in the heat of writing out an idea, mathematicians will write out ungrammatical things. As with plain-text writing, it’s valuable to get an idea down, and edit it into good form later.
Tom Batiuk’s Funky Winkerbean Vintage for the 28th sees the school’s Computer explaining the nature of its existence, and how it works. Here the Computer claims to just be filled with thousands of toes to count on. It’s silly, but it is the case that there’s no operation a computer does that isn’t something a human can do, manually. If you had the paper and the time you could do all the steps of a Facebook group chat, a game of SimCity, or a rocket guidance computer’s calculations. The results might just be impractically slow.
Today’s A To Z term is a free pick. I didn’t notice any suggestions for a mathematics term starting with this letter. I apologize if you did submit one and I missed it. I don’t mean any insult.
What I’ve picked is a concept from analysis. I’ve described this casually as the study of why calculus works. That’s a good part of what it is. Analysis is also about why real numbers work. Later on you also get to why complex numbers and why functions work. But it’s in the courses about Real Analysis where a mathematics major can expect to find the infimum, and it’ll stick around on the analysis courses after that.
The infimum is the thing you mean when you say “lower bound”. It applies to a set of things that you can put in order. The order has to work the way less-than-or-equal-to works with whole numbers. You don’t have to have numbers to put a number-like order on things. Otherwise whoever made up the Alphabet Song was fibbing to us all. But starting out with numbers can let you get confident with the idea, and we’ll trust you can go from numbers to other stuff, in case you ever need to.
A lower bound would start out meaning what you’d imagine if you spoke English. Let me call it L. It’ll make my sentences so much easier to write. Suppose that L is less than or equal to all the elements in your set. Then, great! L is a lower bound of your set.
You see the loophole here. It’s in the article “a”. If L is a lower bound, then what about L – 1? L – 10? L – 1,000,000,000½? Yeah, they’re all lower bounds, too. There’s no end of lower bounds. And that is not what you mean be a lower bound, in everyday language. You mean “the smallest thing you have to deal with”.
But you can’t just say “well, the lower bound of a set is the smallest thing in the set”. There’s sets that don’t have a smallest thing. The iconic example is positive numbers. No positive number can be a lower bound of this. All the negative numbers are lowest bounds of this. Zero can be a lower bound of this.
For the postive numbers, it’s obvious: zero is the lower bound we want. It’s smaller than all of the positive numbers. And there’s no greater number that’s also smaller than all the positive numbers. So this is the infimum of the positive numbers. It’s the greatest lower bound of the set.
The infimum of a set may or may not be part of the original set. But. Between the infimum of a set and the infimum plus any positive number, however tiny that is? There’s always at least one thing in the set.
And there isn’t always an infimum. This is obvious if your set is, like, the set of all the integers. If there’s no lower bound at all, there can’t be a greatest lower bound. So that’s obvious enough.
Infimums turn up in a good number of proofs. There are a couple reasons they do. One is that we want to prove a boundary between two kinds of things exist. It’s lurking in the proof, for example, of the intermediate value theorem. This is the proposition that if you have a continuous function on the domain [a, b], and range of real numbers, and pick some number g that’s between f(a) and f(b)? There’ll be at least one point c, between a and b, where f(c) equals g. You can structure this: look at the set of numbers x in the domain [a, b] whose f(x) is larger than g. So what’s the infimum of this set? What does f have to be for that infimum?
It also turns up a lot in proofs about calculus. Proofs about functions, particularly, especially integrating functions. A proof like this will, generically, not deal with the original function, which might have all kinds of unpleasant aspects. Instead it’ll look at a sequence of approximations of the original function. Each approximation is chosen so it has no unpleasant aspect. And then prove that we could make arbitrarily tiny the difference between the result for the function we want and the result for the sequence of functions we make. Infimums turn up in this, since we’ll want a minimum function without being sure that the minimum is in the sequence we work with.
This is the terminology of stuff to work as lower bounds. There’s a similar terminology to work with upper bounds. The upper-bound equivalent of the infimum is the supremum. They’re abbreviated as inf and sup. The supremum turns up most every time an infimum does, and for the reasons you’d expect.
If an infimum does exist, it’s unique; there can’t be two different ones. Same with the supremum.
And things can get weird. It’s possible to have lower bounds but no infimum. This seems bizarre. This is because we’ve been relying on the real numbers to guide our intuition. And the real numbers have a useful property called being “complete”. So let me break the real numbers. Imagine the real numbers except for zero. Call that the set R’. Now look at the set of positive numbers inside R’. What’s the infimum of the positive numbers, within R’? All we can do is shrug and say there is none, even though there are plenty of lower bounds. The infimum of a set depends on the set. It also depends on what bigger set that the set is within. That something depends both on a set and what the bigger set of things is, is another thing that turns up all the time in analysis. It’s worth becoming familiar with.