Reading the Comics, March 4, 2015: Driving Me Crazy Edition


I like it when there are themes to these collections of mathematical comics, but since I don’t decide what subjects cartoonists write about — Comic Strip Master Command does — it depends on luck and my ability to dig out loose connections to find any. Sometimes, a theme just drops into my lap, though, as with today’s collection: several cartoonists tossed off bits that had me double-checking their work and trying to figure out what it was I wasn’t understanding. Ultimately I came to the conclusion that they just made mistakes, and that’s unnerving since how could a mathematical error slip through the rigorous editing and checking of modern comic strips?

Mac and Bill King’s Magic in a Minute (March 1) tries to show off how to do a magic trick based on parity, using the spots on a die to tell whether it was turned in one direction or another. It’s a good gimmick, and parity — whether something is odd or even — can be a great way to encode information or to do simple checks against slight errors. That said, I believe the Kings made a mistake in describing the system: I can’t figure out how the parity of the three sides of a die facing you could not change, from odd to even or from even to odd, as the die is rotated one turn. I believe they mean that you should just count the dots on the vertical sides, so that for example in the “Howdy Do It?” panel in the lower right corner, add two and one to make three. But with that corrected it should be a good trick.


How To Build Infinite Numbers


I had missed it, as mentioned in the above tweet. The link is to a page on the Form And Formalism blog, reprinting a translation of one of Georg Cantor’s papers in which he founded the modern understanding of sets, of infinite sets, and of infinitely large numbers. Although it gets into pretty heady topics, it doesn’t actually require a mathematical background, at least as I look at it; it just requires a willingness to follow long chains of reasoning, which I admit is much harder than algebra.

Cantor — whom I’d talked a bit about in a recent Reading The Comics post — was deeply concerned and intrigued by infinity. His paper enters into that curious space where mathematics, philosophy, and even theology blend together, since it’s difficult to talk about the infinite without people thinking of God. I admit the philosophical side of the discussion is difficult for me to follow, and the theological side harder yet, but a philosopher or theologian would probably have symmetric complaints.

The translation is provided as scans of a typewritten document, so you can see what it was like trying to include mathematical symbols in non-typeset text in the days before LaTeX (which is great at it, but requires annoying amounts of setup) or HTML (which is mediocre at it, but requires less setup) or Word (I don’t use Word) were available. Somehow, folks managed to live through times like that, but it wasn’t pretty.

How February 2015 Treated My Mathematics Blog


Of course I’m going to claim February 2015 was a successful month for my mathematics blog here. When have I ever claimed it was a dismal month? Probably I have, though last month wasn’t a case of it.

Anyway, according to WordPress’s statistics page, both the old and the new (which they’re getting around to making less awful), in February the mathematics blog had 859 views, down from January’s 944, but up from December’s 831. This is my second-highest on record. That said, I do want to point out that with a mere 28 days February was at a relative disadvantage for page clicks, and that January saw an average of 30.45 views per day, while February came in at 30.68, which is a record high.

There were 407 visitors in February, down from January’s 438 and December’s 424. 407 is the fourth-highest visitor count I have on record, though its 14.54 visitors per day beats January 2015’s 14.13, while falling short of the all-time record, January 2013’s 15.26 visitors per day.

The views per visitor were at 1.96 in December, 2.16 in January, and dropped surely insignificantly to 2.11 for February, and there’s no plausibly splitting that up per day. Anyway, the mathematics blog started March at 21,815 views so there’s every reason to hope it’ll hit that wonderfully uniform count of 22,222 views soon.

The new statistics page lets me see that I drew 179 “likes” in February, down from 196 in January, but well up from December’s 128. Not to get too bean-counting but that is 6.39 likes per day in February against a mere 6.32 per day in January.
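
The per-day figures are easy to check. Here is a minimal Python sketch of the arithmetic, using only the monthly totals quoted above; the helper name `per_day` is made up for illustration.

```python
# Monthly totals quoted in the post; day counts are for January and February 2015.
DAYS = {"January": 31, "February": 28}
VIEWS = {"January": 944, "February": 859}
LIKES = {"January": 196, "February": 179}


def per_day(total, days):
    """Average count per day, rounded to two places as in the post."""
    return round(total / days, 2)


for month in DAYS:
    print(month,
          "views/day:", per_day(VIEWS[month], DAYS[month]),
          "likes/day:", per_day(LIKES[month], DAYS[month]))
```

The short month is what lets February edge out January per day despite the lower totals.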

The most popular posts in February were mostly the comic strip posts, with the perennial favorite of trapezoids sneaking in. Getting more than thirty views each in February were:

  1. Reading the Comics, February 4, 2015: Neutral Edition, where I really showed off the weakness of naming each edition.
  2. Reading the Comics, February 14, 2015: Valentine’s Eve Edition, again, an edition name that’s not really better than just giving the date.
  3. Reading the Comics, January 29, 2015: Returned Motifs Edition, which is the one where I learned anything about the history of blackjack.
  4. How Many Trapezoids I Can Draw, which is the closest I’ll come to classifying the sporadic finite simple groups.
  5. Reading the Comics, February 20, 2015: 19th-Century German Mathematicians Edition, because Saturday Morning Breakfast Cereal name-dropped Georg Cantor and Bernhard Riemann.
  6. How To Re-Count Fish, describing problems in the post …
  7. How To Count Fish, which was somehow read three fewer times than the Re-Count one was.
  8. Denominated Mischief, in which a bit of arithmetic manipulation proves that 7 equals 11.

In the listing of nations: as ever the countries sending me the most readers were the United States, with a timely 555; Canada with 83, and the United Kingdom with 66. The United States is down from January, but Canada and the United Kingdom strikingly higher. Germany sent 27 (up from 22), Austria 23 (down from 32), and Slovenia came from out of nowhere to send 21 readers this time around. India dropped from 18 to 6.

There were sixteen single-reader countries in February, up from January’s 14: Chile, Czech Republic, Hungary, Iceland, Ireland, Jamaica, Japan, Mexico, New Zealand, Philippines, Poland, Romania, Swaziland, Sweden, Venezuela, and Vietnam. The repeats from January are Hungary, Japan, and Mexico; Mexico is on a three-month streak.

There weren’t any really good, strange, amusing search terms bringing people here this past month, sad to say. The most evocative of them were:

  • topic about national mathematics day (I think this must be a reference to India’s holiday)
  • price is right piggy bank game (I’ve never studied this one, but I have done bits on the Item Up For Bid and on the Money Game)
  • jokes about algebraic geometry (are there any?)
  • groove spacing 78 and 45 (Yeah, I couldn’t find a definitive answer, but something like 170 grooves per inch seems plausible. Nobody’s taken me up on my Muzak challenge.)
  • two trapezoids make a (well, at least someone’s composing modernist, iconoclastic poetry around here)
  • sketch on how to inscribe more than one in a cycle in a triangle according to g.m green (I think this guy should meet the algebraic geometry jokester)

Reading the Comics, February 28, 2015: Calendar Reform Edition


It’s the last day of the shortest month of the year, a day that always makes me think about whether the calendar could be different. I was bit by the calendar-reform bug as a child and I’ve mostly recovered from the infection, but some things can make it flare up again and I’ve never stopped being fascinated by the problem of keeping track of days, which you’d think would not be so difficult.

That’s why I’m leading this review of comics with Jef Mallett’s Frazz (February 27) even if it’s not transparently a mathematics topic. The biggest problem with calendar reform is there really aren’t fully satisfactory ways to do it. If you want every month to be as equal as possible, yeah, 13 months of 28 days each, plus one day (in leap years, two days) that doesn’t belong to any month or week is probably the least obnoxious, if you don’t mind 13 months to the year meaning there’s no good way to make a year-at-a-glance calendar tolerably symmetric. If you don’t want the unlucky, prime number of 13 months, you can go with four blocks of months with 31-30-30 days and toss in a leap day that’s, again, not in any month or week. But people don’t seem perfectly comfortable with days that belong to no month — suggest it to folks, see how they get weirded out — and a day that doesn’t belong to any week is right out. Ask them. Changing the default map projection in schools is an easier task to complete.
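
The day counts behind both schemes can be checked in a few lines. This is only a sketch of the arithmetic for an ordinary (non-leap) year; the variable names are invented for the example.

```python
# Day counts for the two reform schemes described above (ordinary, non-leap year).
thirteen_months = 13 * 28            # thirteen equal months
quarters = 4 * (31 + 30 + 30)        # four quarters of 31-30-30 days

for name, days in [("13 x 28", thirteen_months), ("4 x (31+30+30)", quarters)]:
    extra = 365 - days               # days left over outside any month
    print(name, "covers", days, "days;", extra, "left over;",
          days // 7, "whole weeks, remainder", days % 7)
```

Both schemes cover 364 days, which is exactly 52 weeks. That is why the leftover day has to sit outside the week as well as outside the months if every year is to start on the same weekday.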

There are several problems with the calendar, starting with the year being more nearly 365 days than a nice, round, supremely divisible 360. Also a factor is that the calendar tries to hack together the moon-based months with the sun-based year, and those don’t fit together on any cycle that’s convenient to human use. Add to that the need for Easter to be close to the vernal equinox without being right at Passover and you have a muddle of requirements, and the best we can hope for is that the system doesn’t get too bad.


How Not To Count Fish


I’d discussed a probability/sampling-based method to estimate the number of fish that might be in our pond out back, and then some of the errors that have to be handled if you want to have a reliable result. Now, I want to get into why the method doesn’t work, at least not without much greater insight into goldfish behavior than simply catching a couple and releasing them will do.

Catching a sample, re-releasing it, and counting how many of that sample we re-catch later on is a logically valid method, provided the assumptions the method requires are close, or at least close enough, to the way the actual thing works. Here are some of the ways goldfish fall short of the ideal.

First faulty assumption: Goldfish are perfectly identical. In this goldfish-trapping we make the assumption that there is some fixed, constant probability of a goldfish being caught in the net. We have to assume that this is the same number for every goldfish, and that it doesn’t change as goldfish go through the experience of getting caught and then released. But goldfish have personality, as you learn if you have a bunch in a nice setting and do things like try feeding them koi treats or introduce something new like a wire-mesh trap to their environment. Some are adventurous and will explore the unfamiliar thing; some are shy and will let everyone else go first and then maybe not bother going at all. I empathize with both positions.

If there are enough goldfish, the variation between personalities is probably not going to matter much. There’ll be some that are easy to catch, and they’ll probably be roughly as common as the ones who can’t be coaxed into the trap at all. It won’t be exactly balanced unless we’re very lucky, but this would probably only throw off our calculations a little bit.

Whether the goldfish learn, and become more, or less, likely to be trapped in time is harder. Goldfish do learn, certainly, although it’s not obvious to me that the trapping and releasing experience would be one they draw much of a lesson from. It’s only a little inconvenience, really, and not at all harmful; what should they learn? Other than that there’s maybe an easy bit of food to be had here so why not go in? So this might change their behavior and it’s hard to predict how.

(I note that animal capture studies get quite frustrated when the animals start working out how to game the folks studying them. Bil Gilbert’s early-70s study of coatis — Latin American raccoons, written up in the lovely popularization Chulo: A Year Among The Coatimundis — was plagued by some coatis who figured out going into the trap was an easy, safe meal they’d be released from without harm, and wouldn’t go back about their business and leave room for other specimens.)

Second faulty assumption: Goldfish are not perfectly identical. This is the biggest challenge to counting goldfish population by re-catching a sample of them. How do you know if you caught a goldfish before? When they grow to adulthood, it’s not so bad, since they grow fairly distinctive patterns of orange and white and black and such, and they’ll usually settle into different sizes. (That said, we do have two adult fish who were very distinct when we first got them, but who’ve grown into near-twins.)

But baby goldfish? They’re basically all tiny black things, meant to hide in the mud at the bottom of ponds and rivers — their preferred habitat — and pretty near indistinguishable. As they get larger they get distinguishable, a bit, and start to grow patterns, but for the vast number of baby fish there’s just no telling one from another.

When we were trying to work out whether some mice we found in the house were ones we had previously caught and put out in the garage, we were able to mark them by squirting some food dye at their heads as they were released. The mice would rub the food dye from their heads onto their whole bodies and it would take a while before the dye would completely fade out. (We didn’t re-catch any mice, although it’s hard to dye a wild mouse efficiently because they will take off like bullets. Also one time when we thought we’d captured one there were actually three in the humane trap and you try squirting the food dye bottle at two more mice than you thought were there, fleeing.) But you can see how the food dye wouldn’t work here. Animal researchers with a budget might go on to attach collars or somehow otherwise mark animals, but if there’s a way to mark and track goldfish with ordinary household items I can’t think of it.

(No, we will not be taking the bits of americium in our smoke detectors out and injecting them into trapped goldfish; among the objections, I don’t have a radioactivity detector.)

Third faulty assumption: Goldfish are independent entities. The first two faulty assumptions are ones that could be kind of worked around. If there are enough goldfish then the distribution of how likely any one is to get caught will probably be near enough normal that we can pretend there’s an identical chance of catching each, and if we really thought about it we could probably find some way of marking goldfish to tell if we re-caught any. Independence, though; this is the point on which so many probability-based schemes fall.

Independence, in the language of probability, is the principle that one thing’s happening does not affect the likelihood of another thing happening. For our problem, it’s the assumption that one goldfish being caught does not make it any more or less likely that another goldfish will be caught. We like independence, in studying probability. It makes so many problems easier to study, or even possible to study, and it often seems like a reasonable supposition.

A good number of interesting scientific discoveries amount to finding evidence that two things are not actually independent, and that one thing happening makes it more (or less) likely the other will. Sometimes these turn out to be vapor — there was a 19th-century notion suggesting a link between sunspot activity and economic depressions (because sunspots correlate to solar activity, which could affect agriculture, and up to 1893 the economy and agriculture were pretty much the same thing) — but when there is a link the results can be profound, as with the smoking-and-cancer link, or, for something promising but still (to my understanding) under debate, the link between leaded gasoline and crime rates.

How this applies to the goldfish population problem, though, is that goldfish are social creatures. They school, loosely, forming and re-forming groups, and would much rather be around another goldfish than not. Even as babies they form these adorable tiny little schools; that may be in the hopes that someone else will get eaten by a bigger fish, but they keep hanging around other fish their own size through their whole lives. If there’s a goldfish inside the trap, it is hard to believe that other goldfish are not going to follow it just to be with the company.

Indeed, the first day we set out the trap for the winter, we pulled in all but one of the adult fish, all of whom apparently followed the others into the enclosure. I’m sorry I couldn’t photograph that because it was both adorable and funny to see so many fish just station-keeping beside one another — they were even all looking in the same direction — and waiting for whatever might happen next. Throughout the months we were able to spend bringing in fish, the best bait we could find was to have one fish already in the trap, and a couple days we did leave one fish in a few more hours or another night so that it would be joined by several companions the next time we checked.

So that’s something which foils the catch and re-catch scheme: goldfish are not independent entities. They’re happy to follow one another into the trap. I would think the catch and re-catch scheme should be salvageable, if it were adapted to the way goldfish actually behave. But that requires a mathematician admitting that he can’t just blunder into a field with an obvious, simple scheme to solve a problem, and instead requires the specialized knowledge and experience of people who are experts in the field, and that of course can’t be done. (For example, I don’t actually know that goldfish behavior is sufficiently non-independent as to make an important difference in a population estimate of this kind. But someone who knew goldfish or carp well could tell me, or tell me how to find out.)
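
To get a feel for how much schooling could matter, here is a toy calculation in Python. It is a sketch under an assumed model, not anything measured: suppose the fish move in fixed schools of `school_size`, and each whole school enters the trap independently with probability p. The values N = 50 and p = 0.125 are borrowed from the How To Re-Count Fish post, and `estimate_spread` is a made-up name.

```python
import math


def estimate_spread(n_fish, p, school_size=1):
    """Standard deviation of the estimate M / p when fish are caught
    in whole schools (ASSUMED model: n_fish / school_size schools, each
    entering the trap independently with probability p)."""
    n_schools = n_fish / school_size
    # The catch M is school_size times a binomial count of schools.
    sd_catch = school_size * math.sqrt(n_schools * p * (1 - p))
    return sd_catch / p


for size in (1, 5, 10):
    print("school size", size, "-> spread of estimate about",
          round(estimate_spread(50, 0.125, size), 1), "fish")
```

Under this assumption the average estimate is unchanged, but its spread grows like the square root of the school size, so any single catch says less about the population. Whether real goldfish behave anything like this is exactly the question that needs someone who knows goldfish.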

Several dozen goldfish, most of them babies, within a 150-gallon rubber stock tank, their wintering home.
Goldfish brought indoors, to a stock tank, for the winter.

For those curious how the goldfish worked out, though, we were able to spend about two and a half months catching fish before the pond froze over for the winter, though the number we caught each week dropped off as the temperature dropped. We have them floating about in a stock tank in the basement, waiting for the coming of spring and the time the pond will be warm enough for them to re-occupy it. We also know that at least some of the goldfish we didn’t catch made it to, well, about a month ago. I’d seen one of the five orange baby fish who refused to go into the trap through a hole in the ice then. It was holding close to the bottom but seemed to be in good shape.

This coming year should be an exciting one for our fish population.

Reading the Comics, February 24, 2014: Getting Caught Up Edition


And now, I think, I’ve got caught up on the mathematics-themed comics that appeared at Comics Kingdom and at Gocomics.com over the past week and a half. I’m sorry to say today’s entries don’t get to be about as rich a set of topics as the previous bunch’s, but on the other hand, there’s a couple Comics Kingdom strips that I feel comfortable using as images, so there’s that. And come to think of it, none of them involve the setup of a teacher asking a student in class a word problem, so that’s different.

Mason Mastroianni, Mick Mastroianni, and Perri Hart’s B.C. (February 21) tells the old joke about how much of fractions someone understands. To me the canonical version of the joke was a Sidney Harris panel in which one teacher complains that five-thirds of the class doesn’t understand a word she says about fractions, but it’s all the same gag. I’m a touch amused that three and five turn up in this version of the joke too. That probably reflects writing necessity — especially for this B.C. the numbers have to be a pair that obviously doesn’t give you one-half — and that, somehow, odd numbers seem to read as funnier than even ones.

Bud Fisher’s Mutt and Jeff (February 21) decimates one of the old work-rate problems, this one about how long it takes a group of people to eat a pot roast. It was surely an old joke even when this comic first appeared (and I can’t tell you when it was; Gocomics.com’s reruns have been a mixed bunch of 1940s and 1950s ones, but they don’t say when the original run date was), but the spread across five panels treats the joke well as it’s able to be presented as a fuller stage-ready sketch. Modern comic strips value an efficiently told, minimalist joke, but pacing and minor punch lines (“some men don’t eat as fast as others”) add their charm to a comic.


Reading the Comics, February 20, 2015: 19th-Century German Mathematicians Edition


So, the mathematics comics ran away from me a little bit, and I didn’t have the chance to write up a proper post on Thursday or Friday. So I’m writing what I probably would have got to on Friday had time allowed, and there’ll be another in this sequence sooner than usual. I hope you’ll understand.

The title for this entry is basically thanks to Zach Weinersmith, because his comics over the past week gave me reasons to talk about Georg Cantor and Bernhard Riemann. These were two of the many extremely sharp, extremely perceptive German mathematicians of the 19th Century who put solid, rigorously logical foundations under the work of centuries of mathematics, only to discover that this implied new and very difficult questions about mathematics. Some of them are good material for jokes.

Eric and Bill Teitelbaum’s Bottomliners panel (February 14) builds a joke around everything in some set of medical tests coming back negative, as well as the bank account. “Negative”, the word, has connotations that are … well, negative, which may inspire the question of why a medical test coming back “negative” corresponds with what is usually good news, nothing being wrong. As best I can make out the terminology derives from statistics. The diagnosis of any condition amounts to measuring some property (or properties), and working out whether it’s plausible that the measurements could reflect the body’s normal processes, or whether they’re such that there just has to be some special cause. A “negative” result amounts to saying that we are not forced to suppose something is causing these measurements; that is, we don’t have a strong reason to think something is wrong. And so in this context a “negative” result is the one we ordinarily hope for.


How To Re-Count Fish


Last week I chatted a bit about a probabilistic, sampling-based method to estimate the population of fish in our backyard pond. The method estimates the population N of a thing, in this case the fish, by capturing a sample of size M and dividing that M by the probability of catching one of the things in your sampling. Since we might not know the chance of catching the thing beforehand, we estimate it: catch some number n of the fish or whatever, then put them back, and then re-catch as many. Some number m of those will be re-caught, so we can estimate the chance of catching one fish as \frac{m}{n} . So the original population will be somewhere about N = M \div \frac{m}{n} = M \cdot \frac{n}{m} .
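
As a minimal sketch, the estimate can be written as one small Python function. The function and argument names are invented for illustration, and the numbers in the usage line are hypothetical, not from any real catch.

```python
def estimate_population(catch_size, marked, recaught):
    """Estimate N = M * n / m, in the notation of the paragraph above.

    catch_size -- M, the number of fish in the sample catch
    marked     -- n, the number of fish caught and released earlier
    recaught   -- m, how many of those n turned up again
    """
    if recaught == 0:
        # m = 0 makes the estimated probability m/n zero; no estimate possible.
        raise ValueError("no fish were re-caught, so m/n cannot be inverted")
    return catch_size * marked / recaught


# Hypothetical numbers, chosen only for illustration.
print(estimate_population(catch_size=6, marked=8, recaught=1))  # 6 * 8 / 1 = 48.0
```

The explicit check for m = 0 is a design choice: with small samples it is quite possible to re-catch nothing, and then the formula has no answer to give.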

I want to talk a little bit about why that won’t work.

There is of course the obvious reason to think this will go wrong; it amounts to exactly the same reason why a baseball player with a .250 batting average — meaning the player can expect to get a hit in one out of every four at-bats — might go an entire game without getting on base, or might get on base three times in four at-bats. If something has N chances to happen, and it has a probability p of happening at every chance, it’s most likely that it will happen N \cdot p times, but it can happen more or fewer times than that. Indeed, we’d get a little suspicious if it happened exactly N \cdot p times. If we flipped a fair coin twenty times, it’s most likely to come up tails ten times, but there’s nothing odd about it coming up tails only eight or as many as fourteen times, and it’d stand out if it always came up tails exactly ten times.

To apply this to the fish problem: suppose that there are N = 50 fish in the pond; that 50 is the number we want to get. And suppose we know for a fact that every fish has a 12.5 percent chance — p = 0.125 — of being caught in our trap. Ignore for right now how we know that probability; just pretend we can count on that being exactly true. The expectation value, the most probable number of fish to catch in any attempt, is N \cdot p = 50 \cdot 0.125 = 6.25 fish, which presents our first obvious problem. Well, maybe a fish might be wriggling around the edge of the net and fall out as we pull the trap out. (This actually happened as I was pulling some of the baby fish in for the winter.)
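
The catch size here follows a binomial distribution, so the expected value and the most probable whole-number catch can be computed directly. This is a sketch using Python's standard library; `binomial_pmf` is a helper written for this example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent tries."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


N, p = 50, 0.125
expected = N * p                                   # 6.25, not a whole fish
mode = max(range(N + 1), key=lambda k: binomial_pmf(k, N, p))
print("expected catch:", expected)
print("most probable whole-number catch:", mode)
print("chance of exactly that many:", round(binomial_pmf(mode, N, p), 3))
```

The expected value is an average over many catches, so it need not be a number of fish any single catch can produce.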

It's a pond about ten feet across and maybe two feet deep, with (at the time of the photograph) at most eleven fish in it.
This is the backyard pond; pictured are several fish, though not all of them.

With these numbers it’s most probable to catch six fish, slightly less probable to catch five fish, less probable yet to catch seven, then four, then eight, and so on. But these are all tolerably plausible numbers. I used a mathematics package (Octave, an open-source clone of Matlab) to run ten simulated catches, from fifty fish each with a probability of .125 of being caught, and came out with these sizes M for the fish harvests:

M = 4 6 3 6 7 7 5 7 8 9

Since we know, by some method, that the chance p of catching any one fish is exactly 0.125, this implies fish populations N = M \div p of:

M = 4 6 3 6 7 7 5 7 8 9
N = 32 48 24 48 56 56 40 56 64 72

Now, none of these is the right number, although 48 is respectably close and 56 isn’t too bad. But the range is hilarious: there might be as few as 24 or as many as 72 fish, based on just this evidence. That might as well be guessing.
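
The simulated catches above were run in Octave. A rough Python equivalent, as a sketch, looks like this. The seed and the helper name `simulate_catch` are arbitrary choices made for the example, and a different random generator will give different catches, so it will not reproduce the exact numbers in the table.

```python
import random


def simulate_catch(n_fish, p, rng):
    """Number of fish caught when each of n_fish is caught independently
    with probability p."""
    return sum(1 for _ in range(n_fish) if rng.random() < p)


rng = random.Random(2015)          # arbitrary seed, fixed so reruns match
N_true, p = 50, 0.125

catches = [simulate_catch(N_true, p, rng) for _ in range(10)]
estimates = [m / p for m in catches]

print("M =", catches)
print("N =", estimates)
print("range of estimates:", min(estimates), "to", max(estimates))
```

Changing the seed a few times gives a quick sense of how widely ten single-catch estimates can scatter around the true fifty.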

This is essentially a matter of error analysis. Any one attempt at catching fish may be faulty, because the fish are too shy of the trap, or too eager to leap into it, or are just being difficult for some reason. But we can correct for the flaws of one attempt at fish-counting by repeating the experiment. We can’t always be unlucky in the same ways.

This is conceptually easy, and extremely easy to do on the computer; it’s a little harder in real life but certainly within the bounds of our research budget, since I just have to go out back and put the trap out. And redoing the experiment even pays off, too: average those population samples from the ten simulated runs there and we get a mean estimated fish population of 49.6, which is basically dead on.

(That was lucky, I must admit. Ten attempts isn’t really enough to make the variation comfortably small. Another run with ten simulated catchings produced a mean estimate population of 56; the next one … well, 49.6 again, but the one after that gave me 64. It isn’t until we get into a couple dozen attempts that the mean population estimate gets reliably close to fifty. Still, the work is essentially the same as the problem of “I flipped a fair coin some number of times; it came up tails ten times. How many times did I flip it?” It might have been any number ten or above, but I most probably flipped it about twenty times, and twenty would be your best guess absent more information.)
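
The 49.6 figure, and the reason ten attempts is not quite enough, can both be checked with a short sketch. The first part uses the ten simulated catches quoted earlier; the second uses the standard binomial variance N p (1 - p), which assumes the catches really are independent.

```python
import math
import statistics

# The ten simulated catches quoted earlier, and the known p.
M = [4, 6, 3, 6, 7, 7, 5, 7, 8, 9]
p, N_true = 0.125, 50

estimates = [m / p for m in M]
print("mean estimate:", statistics.mean(estimates))       # 49.6

# Spread of a single estimate M / p, from the binomial variance N p (1 - p).
sd_single = math.sqrt(N_true * p * (1 - p)) / p
for runs in (1, 10, 25, 100):
    # Averaging `runs` independent estimates shrinks the spread by sqrt(runs).
    print(runs, "runs: spread about", round(sd_single / math.sqrt(runs), 1))
```

The spread of the average shrinks only with the square root of the number of runs, which is why it takes a couple dozen attempts before the mean settles near fifty.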

The same problem affects working out what the probability of catching a fish is, since we do that by catching some small number n of fish and then seeing how many of them, some smaller number m, we re-catch later on. Suppose the probability of catching a fish really is p = 0.125 , but we’re only trying to catch n = 6 fish. Here’s a couple rounds of ten simulated catchings of six fish, and how many of those were re-caught:

2 0 1 0 1 0 1 0 0 1
2 0 1 1 0 3 0 0 1 1
0 1 0 1 0 0 1 0 0 0
1 0 0 0 0 0 0 0 2 1

Obviously any one of those indicates a probability ranging from 0 to 0.5 of re-catching a fish. Technically, yes, 0.125 is a number between 0 and 0.5, but it hasn’t really shown itself. But if we average out all these probabilities … well, those forty attempts give us a mean estimated probability of 0.092. This isn’t excellent but at least it’s in range. If we keep doing the experiment we’d do better; one simulated batch of a hundred experiments turned up a mean estimated probability of 0.12833. (And there’s variations, of course; another batch of 100 attempts estimated the probability at 0.13333, and then the next at 0.10667, though if you use all three hundred of these that gets to an average of 0.12278, which isn’t too bad.)
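
The averages quoted above can be recomputed from the four rounds of re-catch counts. This is a plain arithmetic check in Python, with variable names chosen for the example.

```python
# Re-catch counts m from the four rounds above; each entry is out of n = 6 fish.
rounds = [
    [2, 0, 1, 0, 1, 0, 1, 0, 0, 1],
    [2, 0, 1, 1, 0, 3, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 2, 1],
]
n = 6

recatches = [m for row in rounds for m in row]
estimates = [m / n for m in recatches]          # one estimate m/n per attempt

print("attempts:", len(estimates))
print("smallest and largest single estimate:", min(estimates), max(estimates))
print("mean estimated probability:", round(sum(estimates) / len(estimates), 3))

# Averaging the three batches of 100 quoted above (equal sizes, so a plain mean).
batches = [0.12833, 0.13333, 0.10667]
print("mean of the three batches:", round(sum(batches) / len(batches), 5))
```

Note that any single attempt with m = 0 gives no usable population estimate at all, since turning it into a population would mean dividing by zero.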

This inconvenience amounts to a problem of working with small numbers in the original fish population, in the number of fish sampled in any one catching, and in the number of catches done to estimate their population. Small numbers tend to be problems for probability and statistics; the tools grow much more powerful and much more precise when they can work with enormously large collections of things. If the backyard pond held infinitely many fish we could have a much better idea of how many fish were in it.

Reading the Comics, February 14, 2015: Valentine’s Eve Edition


I haven’t had the chance to read today’s comics, what with it having snowed just enough last night that we have to deal with it instead of waiting for the sun to melt it, so, let me go with what I have. There’s a sad lack of strips I feel justified including the images of, since they’re all Gocomics.com representatives and I’m used to those being reasonably stable links. Too bad.

Eric the Circle has a pair of strips by Griffinetsabine, the first on the 7th of February, and the next on February 13, both returning to “the Shape Single’s Bar” and both working on “complementary angles” for a pun. That all may help folks remember the difference between complementary angles — those add up to a right angle — and supplementary angles — those add up to two right angles, a straight line — although what it makes me wonder is the organization behind the Eric the Circle art collective. It hasn’t got any nominal author, after all, and there’s what appear to be different people writing and often drawing it, so, who does the scheduling so that the same joke doesn’t get repeated too frequently? I suppose there’s some way of finding that out for myself, but this is the Internet, so it’s easier to admit my ignorance and let the answer come up to me.

Mark Anderson’s Andertoons (February 10) surprised me with a joke about the Dewey decimal system that I hadn’t encountered before. I don’t know how that happened; it just did. This is, obviously, a use of decimal that’s distinct from the number system, but it’s so relatively rare to think of decimals as apart from representations of numbers that pointing it out has the power to surprise me at least.


How To Count Fish


We have a pond out back, and in 2013, added some goldfish to it. The goldfish, finding themselves in a comfortable spot with clean water, went about the business of making more goldfish. They didn’t have much time to do that before winter of 2013, but they had a very good summer in 2014, producing so many baby goldfish that we got a bit tired of discovering new babies. The pond isn’t quite deep enough that we could be sure it was safe for them to winter over, so we had to work out moving them to a tub indoors. This required, among other things, having an idea how many goldfish there were. The question then was: how many goldfish were in the pond?

It's a pond about ten feet across and maybe two feet deep, with (at the time of the photograph) at most eleven fish in it.
This is the backyard pond; pictured are several fish, though not all of them.

It’s not hard to come up with a maximum estimate: a goldfish needs some amount of water to be healthy. Wikipedia seems to suggest a single fish needs about twenty gallons — call it 80 liters — and I’ll accept that since it sounds plausible enough and it doesn’t change the logic of the maximum estimate if the number is actually something different. The pond’s about ten feet across, and roughly circular, and not quite two feet deep. Call that a circular cylinder, with a diameter of three meters, and a depth of two-thirds of a meter, and that implies a volume of about pi times (3/2) squared times (2/3) cubic meters. That’s about 4.7 cubic meters, or 4700 liters. So there probably would be at most 60 goldfish in the pond. Could the goldfish have reached the pond’s maximum carrying capacity that quickly? Easily; you would not believe how fast goldfish will make more goldfish given fresh water and a little warm weather.
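For anyone who wants to check that arithmetic, here is a minimal sketch in Python. The function names and the unit constant are my own, made up for illustration; the 80 liters per fish is the same rough figure as above, and the pond is treated as a perfect circular cylinder, which it is not.

```python
import math

LITERS_PER_CUBIC_METER = 1000


def pond_volume_liters(diameter_m, depth_m):
    """Volume of a circular cylinder, converted to liters."""
    radius_m = diameter_m / 2
    return math.pi * radius_m ** 2 * depth_m * LITERS_PER_CUBIC_METER


def max_fish(volume_liters, liters_per_fish=80):
    """Upper bound on the fish count if each fish needs liters_per_fish."""
    return volume_liters / liters_per_fish


# Assumed dimensions from the text: three meters across, two-thirds of a meter deep.
volume = pond_volume_liters(diameter_m=3, depth_m=2 / 3)
print(round(volume))            # 4712, that is, about 4700 liters
print(round(max_fish(volume)))  # 59, so "at most 60" is a fair rounding
```

Changing `liters_per_fish` changes the number that comes out but not the logic of the estimate.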

It can be a little harder to quite believe in the maximum estimate. For one, smaller fish don’t need as much water as bigger ones do and the baby fish are, after all, small. Or, since we don’t really know how deep the pond is — it’s not a very regular bottom, and it’s covered with water — might there be even more water and thus capacity for even more fish? That might sound ridiculous but consider: an error of two inches in my estimate of the pond’s depth amounts to a difference of 350 liters or room for four or five fish.
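The two-inch figure can be checked the same way. This is only a sketch under the same cylinder assumption; `extra_liters` is a hypothetical helper name, and the 80 liters per fish is the rough figure used above.

```python
import math

INCH_IN_METERS = 0.0254


def extra_liters(diameter_m, extra_depth_m):
    """Added volume, in liters, from extra depth on a circular cylinder."""
    surface_area_m2 = math.pi * (diameter_m / 2) ** 2
    return surface_area_m2 * extra_depth_m * 1000


# Two more inches of depth on a pond three meters across.
extra = extra_liters(diameter_m=3, extra_depth_m=2 * INCH_IN_METERS)
print(round(extra))  # 359 liters, close to the 350 quoted above
print(extra / 80)    # between four and five fish at 80 liters each
```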

We can turn to probability, though. If we have some way of catching fish — and we have; we’ve got a wire trap and a mesh trap, which we’d use for bringing in fish — we could set them out and see how many fish we can catch. If we suppose there’s a certain probability p of catching any one fish, and if there are N fish in the pond any of which might be caught, then we could expect that some number M =  N \cdot p fish are going to be caught. So if, say, we have a one-in-three chance of catching a fish, and after trying we’ve got some number M fish — let’s say there were 8 caught, so we have some specific number to play with — we could conclude that there must have been about M \div p = 8 \div \frac{1}{3} or 24 fish in the population to catch.

This does bring up the problem of how to guess what the probability of catching any one fish is. But if we make some reasonable-sounding assumptions we can get an estimate of that: set out the traps and catch some number, call it n , of fish. Then set them back and after they’ve had time to recover from the experience, put the traps out again to catch n fish again. We can expect that of that bunch there will be some number, call it m , of the fish we’d previously caught. The ratio of the fish we catch twice to the number of fish we caught in the first place should be close to the chance of catching any one fish.

So let’s lay all this out. If there is some unknown number N of fish in the pond, and there is a chance of \frac{m}{n} of any one fish being caught, and in trying seriously we’ve caught M fish, then: M = N \cdot \frac{m}{n} and therefore N = M \cdot \frac{n}{m} .

For example, suppose in practice we caught ten fish, and were able to re-catch four of them. Then in trying seriously we caught twelve fish. From this we’d conclude that n = 10, m = 4, M = 12 and therefore there are about N = M \cdot \frac{n}{m} = 12 \cdot \frac{10}{4} = 30 fish in the pond.

Or suppose in practice we’d caught twelve fish, five of them a second time, and then in trying seriously we caught eleven fish. Then since n = 12, m = 5, M = 11 we get an estimate of N = M \cdot \frac{n}{m} = 11 \cdot \frac{12}{5} = 26.4 or call it 26 fish in the pond.

Or for another variation: suppose the first time out we caught nine fish, and the second time around, catching another nine, we re-caught three of them. If we’re feeling a little lazy we can skip going around and catching fish again, and just use the figures that n = 9, m = 3, M = 9 and from that conclude there are about N = 9 \cdot \frac{9}{3} = 27 fish in the pond.
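The three worked examples above all use the same rule, N = M \cdot \frac{n}{m} , so they can be run through one small function. This is a sketch, not anything official: `estimate_population` and its argument names are my own labels for the quantities n, m, and M in the text.

```python
def estimate_population(n, m, M):
    """Estimate the population N from a practice catch and a serious catch.

    n: fish caught in the practice round
    m: how many of those were caught again in the second practice round
    M: fish caught when trying seriously
    """
    if m == 0:
        # With no re-caught fish the estimated chance p = m / n is zero,
        # and the rule gives no usable answer.
        raise ValueError("need at least one re-caught fish to estimate p")
    return M * n / m


print(estimate_population(n=10, m=4, M=12))  # 30.0
print(estimate_population(n=12, m=5, M=11))  # 26.4
print(estimate_population(n=9, m=3, M=9))    # 27.0
```

The check for m = 0 is my addition; the text does not say what to do if none of the fish turn up a second time.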

So, in principle, if we’ve made assumptions about the fish population that are right, or at least close enough to right, we can estimate what the fish population is without having to go to the work of catching every single one of them.


Since this is a generally useful scheme for estimating a population let me lay it out in an easy-to-follow formula.

To estimate the size of a population of N things, assuming that they are all equally likely to be detected by some system (being caught in a trap, being photographed by someone at a spot, anything), try this:

  1. Catch some particular number n of the things. Then let them go back about their business.
  2. Catch another n of them. Count the number m of them that you caught before.
  3. The chance of catching one is therefore about p = m \div n .
  4. Catch some number M of the things.
  5. Since — we assume — every one of the N things had the same chance p of being caught, and since we caught M of them, then we estimate there to be N = M \div p of the things to catch.

Warning! There is a world of trouble hidden in that “we assume” on the last step there. Do not use this for professional wildlife-population-estimation until you have fully understood those two words.
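The five steps can also be tried out on imaginary fish. What follows is a toy simulation under assumptions I am making up for the purpose: every fish is caught independently with the same chance p each time the traps go out, which builds the "we assume" directly into the code. One difference from the recipe: the second catch is whatever the traps bring in rather than exactly n fish, and the size of the first catch is used as n. The names `trap_session` and `simulate_estimate` are hypothetical.

```python
import random


def trap_session(population, p, rng):
    """Return the set of fish caught when each is caught with chance p."""
    return {fish for fish in population if rng.random() < p}


def simulate_estimate(true_population, p, seed=None):
    """Run the five steps once on simulated fish; return the estimate of N.

    Returns None when no fish is caught twice, since then p cannot be
    estimated.
    """
    rng = random.Random(seed)
    fish = range(true_population)

    first = trap_session(fish, p, rng)    # step 1: catch n, let them go
    second = trap_session(fish, p, rng)   # step 2: catch again
    recaught = len(first & second)        # m: fish caught both times
    if recaught == 0:
        return None
    p_estimate = recaught / len(first)    # step 3: p is about m / n

    final = trap_session(fish, p, rng)    # step 4: catch M
    return len(final) / p_estimate        # step 5: N is about M / p


# Thirty imaginary fish, each with a one-in-three chance of being caught.
print(simulate_estimate(true_population=30, p=1 / 3, seed=2015))
```

With only thirty fish the estimate jumps around a good deal from one run to the next. With a much larger imaginary population it settles closer to the true number, but only because the simulated fish really are all equally easy to catch.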