Today’s A To Z term is another from goldenoj. It’s one important to probability, and it’s one at the center of the field.
The sample space is a tool for probability questions. We need such tools. Humans are bad at probability questions. Thinking of sample spaces helps us. It’s a way to recast probability questions so that our intuitions about space, which are pretty good, will guide us to probabilities.
A sample space collects the possible results of some experiment. “Experiment” means what mathematicians intend by it, so, not something with test tubes and colorful liquids that might blow up. Instead it’s things like tossing coins and dice and pulling cards out of reduced decks. At least while we’re learning. In real mathematical work this turns into more varied stuff. Fluid flows or magnetic field strengths or economic forecasts. The experiment is the doing of something which gives us information. This information is the result of flipping this coin or drawing this card or measuring this wind speed. Once we know the information, that’s the outcome.
So each possible outcome we represent as a point in the sample space. Describing it as a “space” might cause trouble. “Space” carries connotations of something three-dimensional and continuous and contiguous. This isn’t necessarily so. We can be interested in discrete outcomes. A coin’s toss has two possible outcomes. Three, if we count losing the coin. The day of the week on which someone’s birthday falls has seven possible outcomes. We can also be interested in continuous outcomes. The amount of rain over the day is some nonnegative real number. The amount of time spent waiting at this traffic light is some nonnegative real number. We’re often interested in discrete representations of something continuous. We did not write down the exact real number of inches of rain that fell overnight, even if some exact amount did fall. We recorded 0.71 inches after the storm.
We don’t demand that every point in the sample space be equally probable. There seems to be a circularity to requiring that. What we do demand is that the sample space come with a “sigma algebra”, or σ-algebra to write it briefly. I don’t know how σ came to be the shorthand for this kind of algebra. Here “algebra” means a thing with a bunch of rules. These rules are about what you’d guess if you read pop mathematics blogs and had to bluff your way through a conversation of rules about sets. Call the sample space X. The algebra’s a collection of sets made up of the elements of X. X itself has to be in the collection. The complement of any set in the collection is also a set in the collection. The union of any countable bunch of sets in the collection has to be in the collection too.
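For a finite sample space those rules are easy to check by brute force. Here’s a minimal sketch in Python, assuming a finite set of outcomes, where closure under pairwise unions is enough to give closure under all unions. The function name `is_sigma_algebra` and the coin example are made up for illustration.

```python
from itertools import combinations


def is_sigma_algebra(space, collection):
    """Check the sigma-algebra rules for a collection of subsets of a FINITE space."""
    space = frozenset(space)
    sets = {frozenset(s) for s in collection}
    # Every member has to be a subset of the sample space.
    if any(not s <= space for s in sets):
        return False
    # The whole sample space has to be in the collection.
    if space not in sets:
        return False
    # The complement of every member has to be in the collection.
    if any(space - s not in sets for s in sets):
        return False
    # The union of any two members has to be in the collection.
    # (For a finite collection this covers every possible union.)
    return all(a | b in sets for a, b in combinations(sets, 2))


coin = {"H", "T"}
every_subset = [set(), {"H"}, {"T"}, {"H", "T"}]
missing_complement = [set(), {"H"}, {"H", "T"}]

print(is_sigma_algebra(coin, every_subset))        # True
print(is_sigma_algebra(coin, missing_complement))  # False: {"T"} is absent
```

The collection holding only the empty set and the whole space passes too. It’s a legitimate σ-algebra, just not one that lets you ask interesting questions.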
So the sample space is a set. All the possible outcomes of the experiment we’re thinking about are its elements. Every experiment must have some outcome that’s inside the sample space. And any two different outcomes have to be mutually exclusive. That is, if outcome A has happened, then outcome B has not happened. And vice-versa; I’m not so fond of A that I would refuse B.
I see your protest. You’ve worked through probability homework problems where you’re asked the chance a card drawn from this deck is either a face card or a diamond. The jack of diamonds is both. This is true; but it’s not what we’re looking at. The outcome of this experiment is the card that’s drawn, which might be any of 52 options.
If you like treating it that way. You might build the sample space differently, like saying that it’s an ordered pair. One part of the pair is the suit of the card. The other part is the value. This might be better for the problem you’re doing. This is part of why the probability department commands such high wages. There are many sample spaces that can describe the problem you’re interested in. This does include one where one event is “draw a card that’s a face card or diamond” and the other is “draw one that isn’t”. (These events don’t have an equal probability.) The work is finding a sample space that clarifies your problem.
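The ordered-pair version of the deck can be sketched in a few lines of Python. This assumes a standard 52-card deck with every card equally likely to be drawn; the names `deck`, `is_face`, and `is_diamond` are just for illustration.

```python
from fractions import Fraction
from itertools import product

SUITS = ("clubs", "diamonds", "hearts", "spades")
VALUES = ("A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K")

# The sample space: every (suit, value) ordered pair, 52 outcomes in all.
deck = list(product(SUITS, VALUES))


def is_face(card):
    return card[1] in ("J", "Q", "K")


def is_diamond(card):
    return card[0] == "diamonds"


# The event is a set of outcomes. The jack of diamonds is one outcome,
# so it gets counted once even though it satisfies both conditions.
face_or_diamond = [card for card in deck if is_face(card) or is_diamond(card)]

chance = Fraction(len(face_or_diamond), len(deck))
print(len(deck), len(face_or_diamond), chance)  # 52 22 11/26
```

The two-event sample space mentioned above would lump those 22 cards into one point and the other 30 into another, with probabilities 11/26 and 15/26.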
Working out the sample space that clarifies the problem is the hard part, usually. Not being rigorous about the space gives us many probability paradoxes. You know, like the puzzle where you’re told someone’s two children are either boys or girls. One walks in and it’s a girl. You’re told the probability the other is a boy is two-thirds. And you get mad. Or the Monty Hall Paradox, where you’re asked to pick which of three doors has the grand prize behind it. You’re shown one that you didn’t pick which hasn’t. You’re given the chance to switch to the remaining door. You’re told the probability that the grand prize is behind that other door is two-thirds, and you get mad. There are probability paradoxes that don’t involve a chance of two-thirds. Having a clear idea of the sample space avoids getting the answers wrong, at least. There’s not much to do about not getting mad.
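Both puzzles come out by listing the sample space and counting. What follows is a sketch under assumptions the informal tellings leave open. For the children, I assume each child is independently a boy or a girl with equal chance, and I compare two readings of what you learned. For Monty Hall, I assume the prize and your pick are uniformly random and the host always opens an unpicked, prizeless door, choosing at random when there are two. The function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def chance_other_is_boy(condition):
    """Chance the family has a boy, among two-child families meeting `condition`."""
    space = list(product("BG", repeat=2))      # (older, younger): 4 outcomes
    allowed = [kids for kids in space if condition(kids)]
    with_boy = [kids for kids in allowed if "B" in kids]
    return Fraction(len(with_boy), len(allowed))


def chance_switching_wins():
    """Total weight of (prize, pick, opened) outcomes where switching wins."""
    doors = (0, 1, 2)
    win = Fraction(0)
    for prize, pick in product(doors, repeat=2):
        openable = [d for d in doors if d != pick and d != prize]
        for opened in openable:
            # Prize and pick are each uniform; the host splits evenly
            # among the doors he is allowed to open.
            weight = Fraction(1, 9) / len(openable)
            switched_to = next(d for d in doors if d not in (pick, opened))
            if switched_to == prize:
                win += weight
    return win


# Reading 1: all you know is that at least one child is a girl.
print(chance_other_is_boy(lambda kids: "G" in kids))      # 2/3
# Reading 2: you know one particular child (say the older) is a girl.
print(chance_other_is_boy(lambda kids: kids[0] == "G"))   # 1/2
print(chance_switching_wins())                            # 2/3
```

The two readings of the children puzzle differ only in which points of the sample space are left standing. That’s where the arguments come from.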
Like I said, we don’t insist that every point in the sample space have an equal probability of being the outcome. Or, if it’s a continuous space, that every region of the same area has the same probability. It is certainly easier if it does. Then finding the probability of some result becomes easy. You count the number of outcomes that satisfy that result, and divide by the total number of outcomes. You see this in problems about throwing two dice and asking the chance the total is seven, or five, or twelve.
For a continuous sample space, you’d find the area of all the outcomes that satisfy the result. Divide that by the area of the sample space and there’s the probability of that result. (It’s possible for a result to have an area of zero, which gives it a probability of zero and suggests that the thing cannot happen. This presents a paradox. A thing is in the sample space because it is a possible outcome. What these measure-zero results are, typically, is something like every one of infinitely many tossed coins coming up tails. That can’t happen, but it’s not like there’s any reason it can’t.)
If every outcome isn’t equally likely, though? Sometimes we can redesign the sample space to something that is. The result of rolling two dice is a familiar example. The chance of the dice totalling 2 is different from the chance of them totalling 4. So a sample space that’s just the sums, the numbers 2 through 12, is annoying to deal with. But rewrite the space as the ordered pairs, the result of die one and die two? Then we have something nice. The chance of die one being 1 and die two being 1 is the same as the chance of die one being 2 and die two being 2. There happen to be other die combinations that add up to 4 is all.
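Here’s that redesign as a short Python sketch, assuming two fair six-sided dice. It builds the 36 ordered pairs, counts, and then recovers the lopsided probabilities for the sums. The names `pairs`, `chance`, and `sum_measure` are just for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# The nice sample space: (die one, die two), 36 equally likely points.
pairs = list(product(range(1, 7), repeat=2))


def chance(event):
    """Count the outcomes satisfying `event`, divide by the total number."""
    favorable = [pair for pair in pairs if event(pair)]
    return Fraction(len(favorable), len(pairs))


# Any single ordered pair is as likely as any other.
print(chance(lambda p: p == (1, 1)), chance(lambda p: p == (2, 2)))  # 1/36 1/36

# The annoying sample space, the sums 2 through 12, inherits its
# probabilities from how many pairs land on each sum.
sum_counts = Counter(a + b for a, b in pairs)
sum_measure = {total: Fraction(n, len(pairs)) for total, n in sum_counts.items()}

for total in (2, 4, 5, 7, 12):
    print(total, sum_measure[total])
```

A total of 4 comes from three pairs, (1, 3), (2, 2), and (3, 1), so it is three times as likely as a total of 2.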
Sometimes there’s no finding a sample space which describes what you’re interested in and that makes every point equally probable. Or nearly enough. The world is vast and complicated. That’s all right. We can have a function that describes, for each point in the sample space, the probability of its turning up. Really we had that already, for equally-probable outcomes. It’s just that it was the same number everywhere. This function is called the probability measure. If we combine together a sample space, and a collection of all the events we’re interested in, and a probability measure for all these events, then this triad is a probability space.
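For a small discrete case the whole triad fits in a few lines of Python. This is a sketch only: the coin that sometimes gets lost borrows the joke from earlier, and the particular probabilities are invented for the example, not measured from anything.

```python
from fractions import Fraction
from itertools import combinations

# 1. The sample space: outcomes that are not all equally likely.
sample_space = ("heads", "tails", "lost")

# Made-up probabilities for each point, purely for illustration.
point_mass = {
    "heads": Fraction(49, 100),
    "tails": Fraction(49, 100),
    "lost": Fraction(1, 50),
}


# 2. The events: here, every subset of the sample space.
def all_events(space):
    return [
        frozenset(subset)
        for size in range(len(space) + 1)
        for subset in combinations(space, size)
    ]


# 3. The probability measure: an event's probability is the total
#    of the probabilities of the points inside it.
def measure(event):
    return sum((point_mass[outcome] for outcome in event), Fraction(0))


events = all_events(sample_space)
probability_space = (sample_space, events, measure)

print(len(events))                         # 8
print(measure(frozenset(sample_space)))    # 1
print(measure(frozenset({"heads", "lost"})))
```

Adding up point probabilities only works like this for discrete spaces. For a continuous space the measure has to be defined on the events directly, since single points typically have probability zero.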
And probability spaces give us all sorts of great possibilities. Dearest to my own work are Monte Carlo methods, in which we look for particular points inside the sample space. We do this by starting out anywhere, picking a point at random. And then we try moving to a different point, picking the “direction” of the change at random. We decide whether that move succeeds by a rule that depends in part on the probability measure, and in part on how well whatever we’re looking for holds true. This is a scheme that demands a lot of calculation. You won’t be surprised that it only became a serious tool once computing power was abundant.
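A toy version of that wandering can be sketched in Python. The acceptance rule here is the Metropolis rule, which takes a move with probability min(1, weight of new point / weight of old point). That’s one common choice and an assumption on my part, not necessarily the scheme described above; it also uses only the probability measure and leaves out any “how well does this point suit us” score. The five-point ring of states and its weights are invented for the example.

```python
import random
from collections import Counter


def metropolis(weights, steps, seed=0):
    """Random walk on a ring of states that visits each state in proportion
    to its (unnormalized, positive) weight."""
    rng = random.Random(seed)
    n = len(weights)
    state = rng.randrange(n)                 # start out anywhere
    visits = Counter()
    for _ in range(steps):
        # Pick the "direction" of the change at random: one step either way.
        proposal = (state + rng.choice((-1, 1))) % n
        # Metropolis rule: always accept a move to a likelier point,
        # accept a move to a less likely point only some of the time.
        if rng.random() < min(1.0, weights[proposal] / weights[state]):
            state = proposal
        visits[state] += 1
    return {s: visits[s] / steps for s in range(n)}


weights = [1, 2, 3, 2, 1]                    # hypothetical measure, times 9
frequencies = metropolis(weights, steps=200_000, seed=1)
for s, w in enumerate(weights):
    print(s, round(frequencies[s], 3), "target", round(w / sum(weights), 3))
```

The walk never needs the weights to add up to one, only their ratios. That matters for something like millions of magnets, where normalizing would mean summing over the whole sample space.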
So for many problems there is no actually listing all the sample space. A real problem might include, say, the up-or-down orientation of millions of magnets. This is a sample space of unspeakable vastness. But thinking out this space, and what it must look like, helps these probability questions become ones that our intuitions help us with instead. If you do not know what to do with a probability question, think to the sample spaces.
This and other essays for the Fall 2019 A to Z should be at this link. Later this week I hope to publish the letter T. And all of the A to Z essays ought to be at this link. Thanks for reading.