[ I’d like to thank all who’ve read me or passed on links to me for getting my total hit count above 3,000. In fact, as I write this, the total seems to be 3,033, which is a pleasantly 3-ish number. I suppose that it’s ungrateful to look for 4,000 right away, but after all, I do hope to be interesting or useful, and both of those seem to correlate pretty strongly with being read. In any case, I’ll see how long it takes to reach 3,100, and be silent about that if it’s a number of days too embarrassing to mention. ]
The task I’ve set myself is finding an approximation to the population of Charlotte, North Carolina, for the year 1975. The tools I have on hand are the data that I’m fairly sure I believe for Charlotte’s population in 1970 and in 1980. I have to accept one thing or I’ll be hopelessly disappointed ever after: I’m not going to get the right answer. I’m not going to do my job badly, at least not on purpose; it’s just that, barring a remarkable stroke of luck, I won’t get Charlotte’s actual 1975 population. That’s the nature of interpolations (and extrapolations). But there are degrees of wrongness. Guessing that Charlotte had no people in it in 1975, or twenty million people, would be obviously, ridiculously wrong. Guessing that it had somewhere between 840,347 people (its 1970 Census population) and 971,391 people (its 1980 Census population) seems much more plausible. So let me make my first interpolation to Charlotte’s 1975 population.
I project a 1975 population of 840,347.
That foils everyone expecting this to go into estimating the population as a straight line between the two data points, doesn’t it? I’ve picked out an even simpler interpolation than that. I’m pretending that the population of Charlotte is a fixed number, that 1970 Census Day population, until some point before the 1980 Census Day when it leaps up by 131,044 people. Then it stays fixed at that level until sometime before the next census.
This may not sound like much of an interpolation. For one, how can I get away with claiming that the population in 1975 is exactly the same as it was in 1970?
Let me start defending it by pointing this out: suppose I knew Charlotte’s population as of the opening of government offices this morning. Would I be justified in saying the population of Charlotte right this minute is exactly the same number? I’d probably be wrong; surely someone has moved in or out since the start of the day, and it’d be a stroke of luck if exactly as many people moved out as in. But it’s going to be a pretty good guess. If I look at a short enough interval, the population of Charlotte is indeed constant, at least to within tolerable error margins.
Supposing the population doesn’t change over the course of a day is something that, probably, anyone would agree is just fine. Supposing it doesn’t change over the course of a week, well, that’s a little less fine but still not really objectionable. Over the course of a month? There’s a little more error being made, but then, if I needed to know how many people were in the city and I had a population figure from 26 days ago I’d be wasting my time and money to recalculate it. So what’s so bad over the course of a year?
All right, as the stretch of time between my data point and the point at which I’m making my interpolation grows, the probable error between my interpolation and the actual population count grows. I can’t avoid that. Is it going to be too big an error to tolerate? Well, that depends on my needs.
This sort of interpolation, where we suppose the value to be some fixed number over a whole interval, neither increasing nor decreasing except in these sudden jumps, is called a “piecewise constant” interpolation. That’s a mathematical term barely worth the energy to define once you’ve gone to the effort of reading it. The projected population is a constant number, but in multiple pieces. If we drew this projected population on a chart, it would look like the horizontal parts of a staircase: several short flat lines, each at a different height.
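For anyone who would rather see the idea as code, here is a minimal sketch in Python. The function name `piecewise_constant` and the `CENSUS` list are my own inventions for illustration, and the sketch makes one assumption the prose has not settled: that each jump happens exactly at the next data point. That is only one of several defensible choices.

```python
from bisect import bisect_right

# The two Census figures quoted in this post, as (year, population) pairs.
CENSUS = [(1970, 840_347), (1980, 971_391)]

def piecewise_constant(year, data=CENSUS):
    """Return the value at the most recent data point at or before `year`.

    Assumption: the value holds from each data point up to (but not
    including) the next one, where it jumps to the new value.
    """
    years = [y for y, _ in data]
    # Index of the last data point whose year is <= the requested year.
    i = bisect_right(years, year) - 1
    if i < 0:
        raise ValueError("year is before the first data point")
    return data[i][1]

print(piecewise_constant(1975))  # 840347
print(piecewise_constant(1980))  # 971391
```

Asking for any year from 1970 through 1979 gives back the 1970 figure, which is the whole of the scheme.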
It’s an easy scheme to work with. Many of the things we’re interested in doing with functions, particularly finding the area underneath a function, are very easy with piecewise constant functions. And it’s easy to match to whatever our data set is, even if our data points are clustered very close together, or very far apart, or even if the data points are close together in some regions and distant in others.
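To show why the area is so easy, here is a rough Python sketch: the area under a piecewise constant function is just a stack of rectangles, each one a value times the width of its flat piece. The name `area_under_steps` is hypothetical, and as an assumption each value is taken to hold from its own data point until the next one. For a population the “area” comes out in person-years.

```python
def area_under_steps(points, end):
    """Add up value * width for each flat piece.

    `points` is a list of (x, value) pairs sorted by x. Assumption: each
    value holds from its own x until the next x, and the last value
    holds until `end`.
    """
    total = 0
    for i, (x, value) in enumerate(points):
        # The piece ends at the next data point, or at `end` for the last one.
        next_x = points[i + 1][0] if i + 1 < len(points) else end
        total += value * (next_x - x)
    return total

census = [(1970, 840_347), (1980, 971_391)]
print(area_under_steps(census, end=1980))  # 8403470
```

The data points do not need to be evenly spaced; each rectangle simply uses whatever width it has.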
And this piecewise interpolation can be just fine. Consider the picture drawn on any computer monitor. Each pixel is a little roughly square island of a color that does not vary. But this (two-dimensional) piecewise constant approximation to the picture we mean to represent can look very good indeed, and one may have to get very close up to notice the difference between (say) a picture of the sky and the actual sky.
So there’s my first approximation: to declare that Charlotte’s population in 1975 was exactly the same as its population in 1970, and that it kept that population constant right until …
Say, there’s a question.