We have a pond out back, and in 2013 we added some goldfish to it. The goldfish, finding themselves in a comfortable spot with clean water, went about the business of making more goldfish. They didn’t have much time to do that before winter of 2013, but they had a very good summer in 2014, producing so many baby goldfish that we got a bit tired of discovering new babies. The pond isn’t quite deep enough that we could be sure it was safe for them to winter over, so we had to work out moving them to a tub indoors. This required, among other things, having an idea how many goldfish there were. The question then was: how many goldfish were in the pond?
It’s not hard to come up with a maximum estimate: a goldfish needs some amount of water to be healthy. Wikipedia seems to suggest a single fish needs about twenty gallons — call it 80 liters — and I’ll accept that since it sounds plausible enough and it doesn’t change the logic of the maximum estimate if the number is actually something different. The pond’s about ten feet across, and roughly circular, and not quite two feet deep. Call that a circular cylinder, with a diameter of three meters, and a depth of two-thirds of a meter, and that implies a volume of about pi times (3/2) squared times (2/3) cubic meters. That’s about 4.7 cubic meters, or 4700 liters. So there probably would be at most 60 goldfish in the pond. Could the goldfish have reached the pond’s maximum carrying capacity that quickly? Easily; you would not believe how fast goldfish will make more goldfish given fresh water and a little warm weather.
It can be a little harder to quite believe in the maximum estimate. For one, smaller fish don’t need as much water as bigger ones do and the baby fish are, after all, small. Or, since we don’t really know how deep the pond is — it’s not a very regular bottom, and it’s covered with water — might there be even more water and thus capacity for even more fish? That might sound ridiculous but consider: an error of two inches in my estimate of the pond’s depth amounts to a difference of 350 liters or room for four or five fish.
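The arithmetic behind the maximum estimate, and how much it moves with the depth guess, can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 80 liters per fish and the cylinder shape are the same rough assumptions as in the text.

```python
import math

LITERS_PER_CUBIC_METER = 1000
LITERS_PER_FISH = 80  # the "twenty gallons, call it 80 liters" figure; an assumption


def pond_liters(diameter_m, depth_m):
    """Volume in liters of a pond treated as a circular cylinder."""
    radius = diameter_m / 2
    return math.pi * radius**2 * depth_m * LITERS_PER_CUBIC_METER


def max_fish(diameter_m, depth_m, liters_per_fish=LITERS_PER_FISH):
    """Upper estimate of how many fish the pond's water could support."""
    return pond_liters(diameter_m, depth_m) / liters_per_fish


# Three meters across, two-thirds of a meter deep.
volume = pond_liters(3, 2 / 3)
print(f"volume: {volume:.0f} liters, room for about {max_fish(3, 2 / 3):.0f} fish")

# What if the depth guess is off by two inches?
two_inches_m = 2 * 0.0254
extra = pond_liters(3, 2 / 3 + two_inches_m) - volume
print(f"two more inches: {extra:.0f} liters, {extra / LITERS_PER_FISH:.1f} more fish")
```

Changing `LITERS_PER_FISH` rescales the answer but not the logic, which is the point made above about the Wikipedia figure.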
We can turn to probability, though. If we have some way of catching fish (and we have; we’ve got a wire trap and a mesh trap, which we’d use for bringing in fish) we could set them out and see how many fish we can catch. If we suppose there’s a certain probability $p$ of catching any one fish, and if there are $N$ fish in the pond any of which might be caught, then we could expect that some number $M = N \cdot p$ fish are going to be caught. So if, say, we have a one-in-three chance of catching a fish, and after trying we’ve got some number $M$ fish (let’s say there were 8 caught, so we have some specific number to play with) we could conclude that there must have been about $M \div p = 8 \div \frac{1}{3}$ or 24 fish in the population to catch.
This does bring up the problem of how to guess what the probability of catching any one fish is. But if we make some reasonable-sounding assumptions we can get an estimate of that: set out the traps and catch some number, call it $n$, of fish. Then set them back and after they’ve had time to recover from the experience, put the traps out again to catch $n$ fish again. We can expect that of that bunch there will be some number, call it $m$, of the fish we’d previously caught. The ratio of the fish we catch twice to the number of fish we caught in the first place, $m \div n$, should be close to the chance of catching any one fish.
So let’s lay all this out. If there are some unknown number $N$ fish in the pond, and there is a chance of $p = \frac{m}{n}$ of any one fish being caught, and we’ve caught in seriously trying $M$ fish, then: $M = N \cdot p$ and therefore $N = M \div p$.
For example, suppose in practice we caught ten fish, and were able to re-catch four of them. Then in trying seriously we caught twelve fish. From this we’d conclude that the chance of catching any one fish is about $\frac{4}{10}$ and therefore there are about $12 \div \frac{4}{10} = 30$ fish in the pond.
Or suppose in practice we’d caught twelve fish, five of them a second time, and then in trying seriously we caught eleven fish. Then since the chance of a catch is about $\frac{5}{12}$ we get an estimate of $11 \div \frac{5}{12} = 26.4$, or call it 26 fish in the pond.
Or for another variation: suppose the first time out we caught nine fish, and the second time around, catching another nine, we re-caught three of them. If we’re feeling a little lazy we can skip going around and catching fish again, and just use the figures that the chance of a catch is about $\frac{3}{9}$ and that the catch was $9$, and from that conclude there are about $9 \div \frac{3}{9} = 27$ fish in the pond.
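The three worked examples above can be run through one small function. This is a minimal sketch; the function and parameter names are invented for illustration, and it assumes, as the text does, that every fish is equally likely to be caught.

```python
def estimate_population(first_catch, recaught, serious_catch):
    """Estimate population size as serious_catch / (recaught / first_catch).

    first_catch   -- fish caught (and released) the first time out
    recaught      -- how many of those turned up again the second time
    serious_catch -- fish caught when trying seriously
    """
    if recaught == 0:
        raise ValueError("no recaptures, so the catch chance can't be estimated")
    # Dividing by the chance (recaught / first_catch) is the same as
    # multiplying by first_catch / recaught.
    return serious_catch * first_catch / recaught


print(estimate_population(10, 4, 12))  # ten caught, four re-caught, then twelve
print(estimate_population(12, 5, 11))  # twelve caught, five re-caught, then eleven
print(estimate_population(9, 3, 9))    # the lazy variation: reuse the second catch
```

The zero-recapture check is a design choice of this sketch, not something the text spells out: with no fish caught twice the estimated chance is zero and the formula would divide by it.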
So, in principle, if we’ve made assumptions about the fish population that are right, or at least close enough to right, we can estimate what the fish population is without having to go to the work of catching every single one of them.
Since this is a generally useful scheme for estimating a population let me lay it out in an easy-to-follow formula.
To estimate the size of a population of things, assuming that they are all equally likely to be detected by some system (being caught in a trap, being photographed by someone at a spot, anything), try this:
- Catch some particular number $n$ of the things. Then let them go back about their business.
- Catch another $n$ of them. Count the number $m$ of them that you caught before.
- The chance of catching one is therefore about $p = m \div n$.
- Catch some number $M$ of the things.
- Since (we assume) every one of the things had the same chance $p$ of being caught, and since we caught $M$ of them, then we estimate there to be $N = M \div p$ of the things to catch.
Warning! There is a world of trouble hidden in that “we assume” on the last step there. Do not use this for professional wildlife-population-estimation until you have fully understood those two words.
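One way to get a feel for the scheme is to simulate it in a make-believe pond where the assumption really does hold: every fish has the same chance of being caught, every round. The sketch below does that. The names, the population of 300, and the one-in-three chance are all made up for illustration, and it says nothing about what happens when some fish are easier to catch than others.

```python
import random
import statistics


def one_round(population, p, rng):
    """Return the set of fish (numbered 0..population-1) caught in one round."""
    return {fish for fish in range(population) if rng.random() < p}


def simulate_estimate(population, p, rng):
    """Run the catch, re-catch, catch-seriously scheme once; return the estimate.

    Returns None when no fish is caught twice, since the chance of a catch
    can't be estimated in that case.
    """
    first = one_round(population, p, rng)    # catch, then release
    second = one_round(population, p, rng)   # catch again
    recaught = len(first & second)           # fish seen both times
    if recaught == 0:
        return None
    p_hat = recaught / len(first)            # estimated chance of a catch
    serious = one_round(population, p, rng)  # the serious attempt
    return len(serious) / p_hat


rng = random.Random(0)  # fixed seed so the run is repeatable
estimates = [simulate_estimate(300, 1 / 3, rng) for _ in range(500)]
estimates = [e for e in estimates if e is not None]
print("median estimate:", statistics.median(estimates))
```

Individual runs scatter quite a bit around the true population, which is worth remembering before trusting any single estimate.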