* [ We didn’t break 3,100 yet, and too bad that. But over the day I did get my first readers from Turkey and the second from the United Arab Emirates that I’ve noticed. Also while my many posts about trapezoids are drawing search engine results, “frazz sequins” comes up a lot. ] *

I think I’ve established, more or less, that a piecewise constant interpolation is the simplest way to estimate the population of Charlotte, North Carolina, when all I have to work with is the population data from the 1970 and the 1980 censuses. In 1970 the city had 840,347 people; in 1980 it had 971,391, and therefore the easiest guess for the population in 1975 would be the 1970 value, of 840,347. We suppose that on the 1st of April, 1970 (that Census Day) the population was the lower value, and then sometime before the 1st of April, 1980, it leapt up at once by the 131,044-person difference. Only … how do I know the population jumped up sometime after 1975?

Here’s the thing about my piecewise constant interpolation. I want it to match the actual data I have of 840,347 people on the 1st of April, 1970. And I want it to match the actual data I have of 971,391 people on the 1st of April, 1980. I’m supposing that somewhere in-between the population leapt up by 131,044 people. After the last essay probably the easiest supposition was that the population leapt up so on the last day before the 1980 census, or maybe the last day of 1979. That’s my first thought, anyway.

But why can’t I have the population jump up the 2nd of April, 1970? I’d still have this piecewise-constant approximation, and I would still hit exactly correctly the two pieces of data I have. Why can’t I take the whole population increase all at once and enjoy it for the whole of the 1970s?

If my only considerations are that I want to exactly match the data points I have, and I want to have a piecewise constant interpolation, and I want to take that whole increase all at once, there’s not really any reason I can’t draw my interpolation that way. It may go against instinct — it goes against my instinct, anyway, and until I hear otherwise I’ll suppose you feel the same — but there’s nothing wrong with doing so.

So there’s one of the neat little complications of my nice simplest-possible-interpolation scheme. There are actually two schemes that fit my known data of these two points. Imagine what I could do if I had three data points. (Well, I could come up with four piecewise-constant interpolations, as I make it out, although I’d say only two of them would ever actually be used. One of them is too dull to use. Another is a little ad hoc. If you’d like to spend a little time doodling, see if I’m right or if I’ve overlooked some.)

Although, really, why should I limit myself to two interpolations? If the population can leap upwards on the 2nd of April, 1970, or on the 31st of March, 1980, why couldn’t it leap up on the 1st of January, 1975? Or the 31st of December, 1975? Why not the 18th of September, 1978? Or the 29th of February, 1972? For that matter, why not make the leap at 12:37 pm, the 11th of July, 1979? 12:37 and 14 seconds that same day? 12:37 and 14 and one-quarter seconds?

In my interpolation for Charlotte’s population for 1975 — at least, if I pick out one moment in 1975 — these many variations aren’t going to matter. It’ll be either the low number or the high number. But they’re unmistakably different interpolations: they don’t agree on the projected population for every moment throughout the decade.
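To make that concrete, here is a minimal sketch in Python of a step interpolation whose jump date is a free parameter. The function name `step_estimate` and the argument `jump` are hypothetical labels chosen for this illustration; the only real data are the two census figures quoted above.

```python
from datetime import date

# The two known data points from the text: (Census Day, population).
CENSUS_1970 = (date(1970, 4, 1), 840_347)
CENSUS_1980 = (date(1980, 4, 1), 971_391)

def step_estimate(when, jump):
    """Piecewise constant estimate with a single jump.

    Returns the 1970 figure for any day before `jump` and the 1980
    figure from `jump` onward. `jump` is assumed to fall after the
    1970 Census Day and no later than the 1980 Census Day, so both
    known data points are matched exactly.
    """
    if not CENSUS_1970[0] < jump <= CENSUS_1980[0]:
        raise ValueError("jump must fall between the two census days")
    return CENSUS_1970[1] if when < jump else CENSUS_1980[1]

# Two of the infinitely many choices of jump date.
early_jump = date(1970, 4, 2)
late_jump = date(1980, 3, 31)

# Both agree with the census on the two Census Days ...
for jump in (early_jump, late_jump):
    assert step_estimate(CENSUS_1970[0], jump) == 840_347
    assert step_estimate(CENSUS_1980[0], jump) == 971_391

# ... but disagree about the middle of 1975.
print(step_estimate(date(1975, 7, 1), early_jump))  # 971391
print(step_estimate(date(1975, 7, 1), late_jump))   # 840347
```

Any other `jump` between the two Census Days passes the same two checks, which is all the interpolation was asked to do.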

I find it wonderful we can come up with, literally, infinitely many different interpolations from just these two data points and the simplest possible function connecting the two.

We can get more complicated yet.

I feel very naive and embarrassed when I speculate about mathematical things, even when there’s something that seems to me like logic driving my best guesses. But I’ll go out on a limb and say that it seems to me that the best thing to do would be to make the jump exactly midway between the two known data points.

I’d hope you don’t feel too embarrassed. You have a good intuitive feeling for mathematics, at least based on the speculations you offer whenever we do talk mathematics topics. You’re well-trained in logical thought and that makes up for a lot of unfamiliarity with the background.

All the possible ways of having the jump are equally justified, at least from the goals I had set out. In practice, there are three ways to set the leap that actually get used: put the jump just at the later date, so we’d keep the 1970 figure up to 1980; put the jump at the earlier date, so we’d use the 1980 figure immediately after 1970; or have the jump be at the midpoint. Other points are all right, they just look like eccentric choices, unless the problem carries some special property which makes it look compelling.
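As a rough illustration of those three conventions, here is a sketch in Python that works for any number of sorted data points. The function name `piecewise_constant` and the rule labels `"left"`, `"right"`, and `"midpoint"` are made up for this example, and for simplicity it treats the census dates as the plain numbers 1970 and 1980.

```python
from bisect import bisect_right

def piecewise_constant(xs, ys, x, rule="left"):
    """Estimate y at x from sorted sample points xs with values ys.

    rule="left":     hold each value until the next sample
                     (the jump sits at the later date).
    rule="right":    adopt the next sample's value right away
                     (the jump sits just after the earlier date).
    rule="midpoint": switch halfway between neighboring samples.
    """
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled range")
    i = bisect_right(xs, x) - 1      # last sample at or before x
    if x == xs[i]:
        return ys[i]                 # always hit the data exactly
    # From here on, xs[i] < x < xs[i + 1].
    if rule == "left":
        return ys[i]
    if rule == "right":
        return ys[i + 1]
    if rule == "midpoint":
        middle = (xs[i] + xs[i + 1]) / 2
        return ys[i] if x < middle else ys[i + 1]
    raise ValueError(f"unknown rule: {rule!r}")

years = [1970, 1980]
people = [840_347, 971_391]
print(piecewise_constant(years, people, 1974, "left"))      # 840347
print(piecewise_constant(years, people, 1974, "right"))     # 971391
print(piecewise_constant(years, people, 1974, "midpoint"))  # 840347
print(piecewise_constant(years, people, 1976, "midpoint"))  # 971391
```

Whether the midpoint itself gets the earlier or the later value is an arbitrary choice; this sketch gives it the later one.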

I suspect, and say this without being a specialist in interpolations, that the midpoint method is the least popular of the actually-used three at least at the teaching level, just because it’s more tedious to write out. Usually, at the teaching level, the data points are given at nice whole numbers, and so writing out the definition of the piecewise constant function requires writing a lot of 1/2’s, and that’s tedious, so the left- or right-point methods get used instead.

I can say that in the big use I have for interpolations (which actually is done in two dimensions but is the same idea) I have the jumps done midway between my data points. My sense is that when one doesn’t have to bring a class to understanding the basic idea the midpoint becomes more popular. However, I am open to correction from people with deeper experience about what the actual practices are.

Wouldn’t the midpoint be the most likely to minimize the maximum deviation from the true population? That’s why I chose it.

I would expect, typically, that it should. The error depends on how fast the function being interpolated changes, and how far you are from the data point, and how far you are from the data point is the only part you can really control.

(Of course, it won’t always be true. Given any interpolation scheme, it’s always possible to find a function which the interpolation scheme fails hilariously badly at approximating. We’ll suppose that the data isn’t trying to be difficult.)
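Here is a small numerical check of that expectation, in Python, under one loudly stated assumption: that the true population grew along a straight line between the two censuses. Nothing in the data says it did; the straight line is only a well-behaved stand-in. The function name `max_deviation` and its parameters are hypothetical, and time and population are rescaled so that both run from 0 at the first census to 1 at the second.

```python
def max_deviation(jump_fraction, samples=1000):
    """Largest gap between a step estimate and a straight-line curve.

    Assumption: the 'true' curve is f(t) = t on 0 <= t <= 1, where
    t = 0 is the first data point and t = 1 is the second.
    The step estimate is 0 before `jump_fraction` and 1 from it on.
    """
    worst = 0.0
    for k in range(samples + 1):
        t = k / samples
        estimate = 0.0 if t < jump_fraction else 1.0
        worst = max(worst, abs(estimate - t))
    return worst

print(max_deviation(0.5))   # 0.5   (jump at the midpoint)
print(max_deviation(0.25))  # 0.75  (jump a quarter of the way along)
print(max_deviation(1.0))   # 0.999 (jump at the later data point)
```

For this straight-line stand-in the midpoint jump keeps the worst error to about half the total change, and moving the jump toward either end pushes the worst error toward the full change. A curve that does most of its growing near one end could favor a different jump point.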
