## The Music Goes Round And Round

So. The really big flaw in my analysis of an “Infinite Jukebox” tune is that I made allowance for only the one jump. That analysis had a song free to jump between two points, with a probability of $\frac13$ of jumping from the one-minute mark to the two-minute mark, and an equal likelihood of jumping from the two-minute mark to the one-minute mark. My conclusion was that, on average, the song would lose a minute just as often as it gained one, and so we could expect the song to be just as long as the original. The three-minute song with two points at which it could jump, which I used for the model, can play straight through with no cuts or jumps (three minutes long), or it can play jumping from the one-minute to the two-minute mark (a two-minute version), or it can play from the start to the second minute, jump back to the first, and continue to the end (a four-minute version). But if you play any song on the Infinite Jukebox you see that more can happen.
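To get a feel for what repeated jumps do, here is a minimal simulation sketch of the three-minute model song. The jump rules are my assumptions, not something the post pins down: a jump is only considered when playback reaches a mark by playing up to it, never immediately after landing on it from another jump. The function name `play_length` and the seed are illustrative only.

```python
import random

def play_length(rng, p_jump=1/3):
    """Length in minutes of one playthrough of the three-minute model song."""
    minutes = 1                      # 0:00 to 1:00 always plays once
    if rng.random() < p_jump:        # at 1:00, skip ahead to 2:00
        return minutes + 1           # then play 2:00 to 3:00
    while True:
        minutes += 1                 # play 1:00 to 2:00
        if rng.random() >= p_jump:   # at 2:00, no jump back this time
            break
    return minutes + 1               # finish with 2:00 to 3:00

rng = random.Random(2012)            # fixed seed so runs are repeatable
lengths = [play_length(rng) for _ in range(100_000)]
average = sum(lengths) / len(lengths)
print(f"average: {average:.3f} minutes, longest: {max(lengths)} minutes")
```

Under these rules the song can loop through its middle minute any number of times, which is exactly the kind of outcome a one-jump analysis leaves out. A different reading of the jump rules would need a different `play_length`.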

## Infinite Buggles

Working through my circle of friends have been links to The Infinite Jukebox, an amusing web site which takes a song, analyzes points at which clean edits can be made, and then randomly jumps through them so that the song just never ends. The idea is neat, and its visual representation of the song and the places where it can — but doesn’t have to — jump forward or back can be captivating. My Dearly Beloved has been particularly delighted with the results on “I Am A Camera”, by the Buggles, as it has many good edit points and can sound quite natural after the jumps if you aren’t paying close attention to the lyrics. I recommend playing that at least a bit so you get some sense of how it works, although listening to an infinitely long rendition of the Buggles, or any other band, is asking for a lot.

One question that comes naturally to mind, at least to my mind, is: given there are these various points where the song can skip ahead or skip back, how long should we expect such an “infinite” rendition of a song to take? What’s the average, that is the expected value, of the song’s playing? I wouldn’t dare jump into analyzing “I Am A Camera”, not without working on some easier problems to figure out how it should be done, but let’s look.

## It Would Have Been One More Ride Because

I apologize for being slow writing the conclusion of the explanation for why my Dearly Beloved and I would expect one more ride following our plan to keep re-riding Disaster Transport as long as a fairly flipped coin came up tails. It’s been a busy week, and actually, I’d got stuck trying to think of a way to explain the sum I needed to take using only formulas that a normal person might find, or believe. I think I have it.

## The Help Needed To Get to One

So, it’s established that my little series, representing the number of rides we could expect to get if we based re-riding on a fair coin flip, is convergent. So trying to figure out the sum will get a meaningful answer. The question is, how do we calculate it?

My first impulse is to see if someone else solved the problem first, for exactly the reasons you might guess. This is a case where mathematics textbooks can have an advantage over the web, really, since an introduction to calculus book is almost certain to have page after page of Common Series Sums. Figuring out the right combination of keywords to search the web for it can be an act of elaborate guesswork. Mercifully, Wikipedia has a List of Mathematical Series which covers my problem exactly. Almost.
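For reference, one standard closed form of the sort such a list contains is sketched below. That this is the entry the post has in mind is my assumption, and the symbols $z$ and $k$ are my own labels.

```latex
\sum_{k=1}^{\infty} k z^{k} = \frac{z}{(1 - z)^{2}}, \qquad |z| < 1
```

Setting $z = \frac12$ gives a sum of 2. If each term also carries one extra factor of $\frac12$, as it would when the chance of exactly $k$ re-rides is $\left(\frac12\right)^{k+1}$, the total comes to 1.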

## Reblog: Random matrix theory and the Coulomb gas

inordinatum’s guest blog post here discusses something which, I must confess, isn’t going to be accessible to most of my readers. But it’s interesting to me, since it addresses many topics that are either directly in or close to my mathematical research interests.

The random matrix theory discussed here is the study of what we can say about matrices when we aren’t told the numbers in the matrix, but are told the distribution of the numbers — how likely any cell within the matrix is to be within any particular range. From that start it might sound like almost nothing could be said; after all, couldn’t anything go? But in exactly the same way that we’re able to speak very precisely about random events in the context of probability and statistics — for example, that a (fair) coin flipped a million times will come up tails pretty near 500,000 times, and will not come up tails 600,000 times — we’re able to speak confidently about the properties of these random matrices.
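The coin example can be made concrete with a few lines of arithmetic. This is a rough sketch using the usual binomial standard deviation, with variable names of my own choosing; it measures how far a given tails count sits from the expected 500,000.

```python
import math

flips = 1_000_000
p_tails = 0.5

expected = flips * p_tails                             # mean number of tails
std_dev = math.sqrt(flips * p_tails * (1 - p_tails))   # binomial standard deviation

for tails in (500_000, 501_000, 600_000):
    z = (tails - expected) / std_dev
    print(f"{tails:>7} tails is {z:6.1f} standard deviations from the mean")
```

A count a couple of standard deviations out is unremarkable; a count hundreds of standard deviations out is what “will not come up” means in practice.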

In any event, please do not worry about understanding the whole post. I found it fascinating and that’s why I’ve reblogged it here.

Today I have the pleasure of presenting you a guest post by Ricardo, a good physicist friend of mine in Paris, who is working on random matrix theory. Enjoy!

After writing a nice piece of hardcore physics to my science blog (in Portuguese, I am sorry), Alex asked me to come by and repay the favor. I am happy to write a few lines about the basis of my research in random matrices, and one of the first nice surprises we have while learning the subject.

In this post, I intend to present you a few neat tricks I learned while tinkering with Random Matrix Theory (RMT). It is a pretty vast subject, whose ramifications extend to nuclear physics, information theory, particle physics and, surely, mathematics as a whole. One of the main questions on this subject is: given a matrix $M$ whose entries are taken randomly from a…


## Why Not Infinitely Many More Rides?

Returning to the Disaster Transport ride problem: by flipping a coin after each ride of the roller coaster we’d decide whether to go around again. How many more times could I expect to ride? Using the letter k to represent the number of rides, and p(k) to represent the probability of getting that many rides, it’s a straightforward use of the formula for expectation value — the sum of all the possible outcomes times the probability of that particular outcome — to find the expected number of rides.

Where this gets to be a bit of a bother is that there are, properly speaking, infinitely many possible outcomes. There’s no reason, in theory, that a coin couldn’t come up tails every single time, and only the impatience of the Cedar Point management would keep us from riding a million times, a billion times, an infinite number of times. Common sense tells us this can’t happen; the chance of getting a billion tails in a row is just impossibly tiny. But how do we know all these outcomes that are incredibly unlikely don’t add up to something moderately likely? It happens in integral calculus all the time that a huge enough pile of tiny things adds up to a moderate thing, so why not here?
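One way to probe that worry numerically is to add up the expectation sum a few terms at a time and watch whether it settles down. The sketch below assumes a fair coin and takes p(k), the chance of exactly k re-rides, to be k tails followed by one head; the function names are mine.

```python
from fractions import Fraction

def p(k):
    """Chance of exactly k re-rides: k tails in a row, then one head."""
    return Fraction(1, 2) ** (k + 1)

def partial_expectation(n_terms):
    """Sum of k * p(k) for k = 0 .. n_terms, kept as an exact fraction."""
    return sum(k * p(k) for k in range(n_terms + 1))

for n in (1, 5, 10, 20, 50):
    print(n, float(partial_expectation(n)))
```

Exact fractions are used so that rounding cannot be blamed for whatever pattern shows up in the partial sums. This is evidence rather than proof; the convergence argument still has to be made separately.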

## Just One More Ride?

Given that we know the chance of getting any arbitrary number — let’s say k, because that’s a good arbitrary number — of rides in a row on Disaster Transport, using the scheme where we re-ride if the flipped coin comes up tails and stop if it comes up heads, the natural follow-up to me is: how many more rides can we expect? It’s more likely that we’d get one more ride than two, two more rides than three, three more rides than four; there’s a tiny chance we might get ten more rides; there’s a real if vanishingly tiny chance we’d get a million more rides, if Cedar Point didn’t throw us out of the park and tear the roller coaster down first.

## How Many Last Rides?

So our scheme for getting a last ride in on Disaster Transport without knowing in advance it was our last ride was to flip a coin after each ride, and then re-ride if the coin came up tails. (Maybe it was heads. It doesn’t matter, since we’re supposing the coin is equally likely to come up heads as tails.) The obvious question is, how many times could we expect to ride? Or put another way, how many times in a row could I expect a flipped coin to come up tails, before the first time that it came up heads? The probability tool used here is called the geometric distribution.
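The scheme is easy to act out in code. This is a minimal simulation sketch, assuming a fair coin and a fixed random seed of my choosing; it counts tails before the first head over many imaginary visits and compares the observed frequencies with the geometric distribution's prediction of (1/2)^(k+1) for exactly k tails.

```python
import random
from collections import Counter

def tails_before_first_head(rng):
    """Flip until the first head; return how many tails came first."""
    count = 0
    while rng.random() < 0.5:   # treat this outcome as tails: ride again
        count += 1
    return count

rng = random.Random(13)         # fixed seed so runs are repeatable
trials = 100_000
tally = Counter(tails_before_first_head(rng) for _ in range(trials))

for k in range(6):
    observed = tally[k] / trials
    predicted = 0.5 ** (k + 1)
    print(f"{k} tails first: observed {observed:.4f}, predicted {predicted:.4f}")
```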

## The Last Ride Of A Roller Coaster

Cedar Point amusement park, in Sandusky, Ohio, built in the mid-1980s a bobsled-style roller coaster named Avalanche Run, because it was the mid-1980s and bobsled-style roller coasters seemed like a good idea. My home amusement park, Great Adventure, had something called the Sarajevo Bobsled opened in that time because back then Sarajevo was thought to be a pretty good city apart from that unpleasantness seventy years before. But Cedar Point’s bobsled roller coaster had a longer existence than Great Adventure’s, and around 1990, it was rebuilt to something newer and more exciting, with a building enclosing it and a whole backstory behind the ride.

## Reading the Comics, September 26, 2012

I haven’t time to write a short piece today so let me go through a fresh batch of math-themed comic strips instead. There might be a change coming to these features soon, both in the strips I read and in how I present them, since Comics Kingdom, which provides the King Features Syndicate comic strips, has shown signs that they’re tightening up web access to their strips.

I can’t blame them for wanting to make sure people go through paths they control — and, pay for, at least in advertising clicks — but I can fault them for doing a rotten job of it. They’re just not very good web masters, and end up serving strips — you may have seen them if you’ve gone to the comics page of your local newspaper — that are tiny, which kills plot-heavy features like The Phantom or fine-print heavy features like Slylock Fox Sunday pages, and loaded with referrer-based and cookie-based nonsense that makes it too easy to fail to show a comic altogether or to screw up hopelessly loading up several web browser tabs with different comics in them.

For now that hasn’t happened, at least, but I’m warning that if it does, I might not necessarily read all the King Features strips — their advertising claims they have the best strips in the world, but then, they also run The Katzenjammer Kids which, believe it or not, still exists — and might not be able to comment on them. We’ll see. On to the strips for the middle of September, though:

## Proving Something With One Month’s Counting

One week, it seems, isn’t enough to tell the difference conclusively between the first bidder on Contestants Row having a 25 percent chance of winning — winning one out of four times — or a 17 percent chance of winning — winning one out of six times. But we’re not limited to watching just the one week of The Price Is Right, at least in principle. Some more episodes might help us, and we can test how many episodes are needed to be confident that we can tell the difference. I won’t be clever about this. I have a tool — Octave — which makes it very easy to figure out whether it’s plausible for something which happens 1/4 of the time to turn up only 1/6 of the time in a set number of attempts, and I’ll just keep trying larger numbers of attempts until I’m satisfied. Sometimes the easiest way to solve a problem is to keep trying numbers until something works.

In two weeks (or any ten episodes, really, as talked about above), with 60 items up for bids, a 25 percent chance of winning suggests the first bidder should win 15 times. A 17 percent chance of winning would be a touch over 10 wins. The chance of 10 or fewer successes out of 60 attempts, with a 25 percent chance of success each time, is about 8.6 percent, still none too compelling.
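The post does this with Octave; the same keep-trying-larger-numbers approach can be sketched in Python. The 5 percent cutoff used to stop the loop, and the choice to compare against one-sixth of the items rounded down, are my assumptions for illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

ITEMS_PER_WEEK = 30   # six items per episode, five episodes per week

two_week_tail = binom_cdf(10, 60, 0.25)
print(f"two weeks: P(10 or fewer first-seat wins) = {two_week_tail:.3f}")

# Keep adding weeks until a 1/6 share of wins would be implausible at p = 1/4.
weeks = 1
while binom_cdf(ITEMS_PER_WEEK * weeks // 6, ITEMS_PER_WEEK * weeks, 0.25) >= 0.05:
    weeks += 1
print(f"first week count with tail probability under 5 percent: {weeks}")
```

Changing the cutoff from 0.05 to 0.01 in the loop shows how much more watching a stricter standard would demand.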

Here we might turn to despair: 6,000 episodes — about 35 years of production — weren’t enough to give perfectly unambiguous answers about whether there were fewer clean sweeps than we expected. There were too few at the 5 percent significance level, but not too few at the 1 percent significance level. Do we really expect to do better with only 60 shows?

## What Can One Week Prove?

We have some reason to think the chance of winning an Item Up For Bids, if you’re the first one of the four to place bids (let’s call this the first bidder or first seat so there’s a name for it), is lower than the 25 percent which we’d expect if every contestant in The Price Is Right’s Contestants Row had an equal shot at it. Based on the assertion that only one time in about six thousand episodes had all six winning bids in one episode come from the same seat, we reasoned that the chance for the first bidder, who sits in the same seat as won the previous bid, could be around 17 percent. My next question is: how could we test this? The chance for the first bidder to win might be higher than 17 percent (around 1/6, which is near enough and easier to work with), or lower than 25 percent (exactly 1/4), or conceivably even be outside that range.

The obvious thing to do is test: watch a couple episodes, and see whether nearer to 1/6 or to 1/4 of the winning bids come from the first seat. It’s easy to tally the number of items up for bid and how often the first bidder wins. However, there are only six items up for bid each episode, and there are five episodes per week, for 30 trials in all. I talk about a week’s worth of episodes because it’s a convenient unit, easy to record on the Tivo or an equivalent device, easy to watch at The Price Is Right’s online site, but it doesn’t have to be a single week. It could be any five episodes. But I’ll say a week just because it’s convenient to do so.

If the first seat has a chance of 25 percent of winning, we expect 30 times 1/4, or seven or eight, first-seat wins per week. If the first seat has a 17 percent chance of winning, we expect 30 times 1/6, or 5, first-seat wins per week. That’s not much difference. What’s the chance we see 5 first-seat wins if the first seat has a 25 percent chance of winning?
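That question is a direct binomial calculation. Here is a short sketch, with helper names of my own, that evaluates the chance of exactly 5 and of 5 or fewer first-seat wins in 30 items under both candidate probabilities, so the two hypotheses can be compared side by side.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def at_most(k, n, p):
    """Probability of k or fewer successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n = 30   # six items per episode, five episodes
for label, p in (("1/4", 1 / 4), ("1/6", 1 / 6)):
    print(f"p = {label}: P(exactly 5) = {binom_pmf(5, n, p):.3f}, "
          f"P(5 or fewer) = {at_most(5, n, p):.3f}")
```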

## Figuring Out The Penalty Of Going First

Let’s accept the conclusion that the small number of clean sweeps of Contestants Row is statistically significant, that all six winning contestants on a single episode of The Price Is Right come from the same seat less often than we would expect from chance alone, and that the reason for this is that whichever seat won the last item up for bids is less likely to win the next. It seems natural to suppose the seat which won last time — and which is therefore bidding first this next time — is at a disadvantage. The irresistible question, to me anyway, is: how big is that disadvantage? If no seats had any advantage, the first, second, third, and fourth bidders would be expected to have a probability of 1/4 of winning any particular item. How much less a chance does the first bidder need to have to get the one clean sweep in 6,000 episodes reported?

Chiaroscuro came to an estimate that the first bidder had a probability of about 17.6 percent of winning the item up for bids, and I agree with that, at least if we make a couple of assumptions which I’m confident we are making together. But it’s worth saying what those assumptions are because if the assumptions do not hold, the answers come out different.

The first assumption was made explicitly in the first paragraph here: that the low number of clean sweeps is because the chance of a clean sweep is less than the 1 in 1000 (or to be exact, 1 in 1024) chance which supposes every seat has an equal probability of winning. After all, the probability that we saw so few clean sweeps from chance alone was only a bit under two percent; that’s unlikely but hardly unthinkable. We’re supposing there is something to explain.
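One way to arrive at a figure near 17.6 percent is sketched below. It rests on a reading of the problem that is my assumption rather than anything spelled out here: the first item's winner can come from any seat, that seat then bids first on each of the remaining five items, and a clean sweep needs the first bidder to win all five.

```python
episodes = 6000
observed_sweeps = 1
repeat_wins_needed = 5   # items two through six must go to the same seat

# Chance of a sweep implied by one sweep in 6,000 episodes.
target_sweep_chance = observed_sweeps / episodes

# If each repeat win has the same chance q, then q**5 = target, so q is the fifth root.
first_bidder_chance = target_sweep_chance ** (1 / repeat_wins_needed)
print(f"implied first-bidder chance: {first_bidder_chance:.1%}")

# For comparison, the equal-odds case: q = 1/4 gives the 1-in-1024 sweep chance.
equal_odds_sweep = 0.25 ** repeat_wins_needed
print(f"equal-odds sweep chance: 1 in {1 / equal_odds_sweep:.0f}")
```

If the seats are not independent from item to item, or if the disadvantage varies over the show, the fifth-root shortcut no longer applies.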

## Interpreting Drew Carey

If we’ve decided that at the significance level we find comfortable there are too few clean sweeps of any position in Contestants Row, the natural question is why there are so few. We estimated there should have been six clean sweeps, based on modelling clean-sweep occurrences as a binomial distribution. Something in the model went wrong. Let’s try to reason out what it was.

One assumption for a binomial distribution is that we have some trial, some event, which happens many times. Each episode is the obvious trial here. The outcome we’re interested in seeing has some probability of happening on each trial; there is indeed some probability of a clean sweep each episode. The binomial distribution assumes that this probability is constant for every trial, that it doesn’t become more or less likely the tenth or hundredth or thousandth time around, and this seems likely to hold for The Price Is Right episodes. Granted there is some chance of a clean sweep in one episode; what could be done to increase or decrease the likelihood from episode to episode?

## The Significance of the Item Up For Bids

The last important idea missing before we can judge this problem about The Price Is Right clean sweeps of Contestants Row is the significance level. Whenever an experiment is run — whether it’s the classic probability class problems of flipping coins or rolling dice, or whether it’s watching 6,000 episodes of a game show to see whether any seat produces the most winners, or whether it’s counting the number of red traffic lights one gets during the commute — there are some outcomes which are reasonably likely, some which are unlikely, and some which are vanishingly improbable.

We have to decide that some outcomes have such a low probability of happening naturally that they represent something going on, and are not just the result of chance. How low that probability should be is our decision. There are some common dividing lines, but they’re common just because they represent numbers which human beings find to be nice round figures: five percent, one percent, half a percent, one-tenth of a percent. What significance level one picks depends on many factors, including what’s common in the field, how different outcomes are expected to be, even what one can afford. Physicists looking for evidence of new subatomic particles have an extremely high standard before declaring something is definitely a new particle, but, they can run particle detection experiments until they get such clear evidence.

To be fair, we ought to pick our significance level before we’ve worked out the probability of something happening, but this is the earliest I could discuss it with motivation for you to read about it. But if we take the five percent significance level, we see we know already that there’s a little more than a one and a half percent chance of there being as few clean sweeps as observed. The conclusion is obvious: all six winning contestants in an episode should have come from the same seat, over 6,000 episodes, more often than the one time Drew Carey claimed they had. We can start looking for explanations for why there should be this deficiency.

Or …

## The First Tail

We became suspicious of the number of clean sweeps in The Price Is Right when there were not the expected six of them in 6,000 episodes. The chance there would be only one was about one and a half percent, not very high. But are there so few clean sweeps that we should be suspicious? That is, is the difference between the expected number of sweeps and the observed number so large as to be significant? Is it too big to just result from chance?

This is significance testing: is whatever quantity we mean to observe dramatically less than what is expected? Is it dramatically more? Is it at least different? Are these differences bigger than what could be expected by mere chance? For every statistician’s favorite example, a tossed fair coin will come up tails half the time; that means, of twenty flips, there are expected to be ten tails. But there being merely nine or as many as twelve is reasonable. Three or fifteen tails may be a little unlikely. Zero or twenty seem impossible. There’s a point where if our observations are so different from what we expect then we have to reject the idea that our observations and our expectations agree.

It’s not enough to say there’s a probability of only 1.5 percent that there should be exactly one clean sweep episode out of 6,000, though. It’s unlikely that should happen, but if we look at it, it’s unlikely there should be any outcome. Even the most likely result of 6,000 episodes, six clean sweeps, has only about one chance in six of happening. That’s near the chance that the next person you meet will have a birthday in either September or November. That isn’t absurdly unlikely, but, the person betting against it has the surer deal.
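The figures quoted here can be checked directly. This sketch assumes the binomial model with 6,000 episodes and a clean-sweep chance of 1/1000 per episode, and it also adds up the lower tail (one sweep or none), which is the quantity a significance test cares about.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

N = 6000        # episodes
p = 1 / 1000    # assumed chance of a clean sweep in any one episode

exactly_one = binom_pmf(1, N, p)
exactly_six = binom_pmf(6, N, p)
one_or_fewer = binom_pmf(0, N, p) + binom_pmf(1, N, p)

print(f"P(exactly 1 sweep)  = {exactly_one:.4f}")
print(f"P(exactly 6 sweeps) = {exactly_six:.4f}")
print(f"P(1 sweep or fewer) = {one_or_fewer:.4f}")
```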

## Significance Intrudes on Contestants Row

We worked out the likelihood that there would be only one clean sweep, with all six contestants getting on stage coming from the same seat in Contestants Row, out of six thousand episodes of The Price Is Right. That turned out to be not terribly likely: it had about a one and a half percent chance of being the case. For a sense of scale, that’s around the same probability that the moment you finish reading this sentence will be exactly 26 seconds past the minute. It’s pretty safe to bet that it wasn’t.

However, it isn’t particularly outlandish to suppose that it was. I’d certainly hope at least some reader found that it was. Events which aren’t particularly likely do happen, all the time. Compare the likelihood of this single-clean-sweep or the 26-seconds-past-the-minute thing happening to the likelihood of any given hand of poker: any specific hand is phenomenally less likely, but something has to happen once you start dealing. So do we have any grounds for saying the particular outcome of one clean sweep in 6,000 shows is improbable? Or for saying that it’s reasonable?

## A Simple Demonstration Which Does Not Clarify

When last we talked about the “clean sweep” of winning contestants coming from the same one of four seats in Contestants Row for all six Items Up For Bid on The Price Is Right, we had established the pieces needed if we suppose this to be a binomial distribution problem. That is, we suppose that any given episode has a probability, p, of successfully having all six contestants from the same seat, and a probability 1 – p of failing to have all six contestants from the same seat. There are N episodes, and we are interested in the chance of x of them being clean sweeps. From the production schedule we know the number of episodes N is about 6,000. We supposed the probability of a clean sweep to be about p = 1/1000, on the assumption that the chance of winning isn’t any better or worse for any contestant. The probability of there not being a clean sweep is then 1 – p = 999/1000. And we expected x = 6 clean sweeps, while Drew Carey claimed there had been only 1.

The chance of finding x successes out of N attempts, according to the binomial distribution, is the probability of any one particular combination of x successes and N – x failures, which is equal to $p^x (1 - p)^{N - x}$, times the number of ways there are to select x items out of N candidates. Either of those is easy enough to calculate, up to the point where we try calculating it. Let’s start out by supposing x to be the expected 6, and later we’ll look at it being 1 or other numbers.
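Written out as a single formula, using the same p, x, and N as above, the binomial probability looks like this. The notation $P(x)$ and the binomial-coefficient symbol are the standard textbook ones, not something taken from this post.

```latex
P(x) = \binom{N}{x} \, p^{x} \, (1 - p)^{N - x},
\qquad
\binom{N}{x} = \frac{N!}{x! \, (N - x)!}
```

The first factor counts the ways to choose which x of the N episodes are the clean sweeps; the rest is the chance of any one such arrangement.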

## Off By A Factor Of 720 (Or More)

To figure out whether it was plausible that there had been only one “clean sweep”, of all six contestants winning the Item Up For Bid on The Price Is Right coming from the same seat, we had started a little into the binomial distribution. The key ideas included that we have “Bernoulli trials”, a number of independent chances for some condition to happen (in this case, we had about 6,000 such trials, the number of hourlong episodes of The Price Is Right), and a probability p of successfully seeing some event occur on any one episode. We worked that out to be somewhere about p = 1/1000, if every seat is equally likely to win every time. There is also a probability of 1 – p, or 999/1000, of failing to see this event, that is, that one or more contestants comes from a different seat.

To find the probability of seeing some number, call it x since we don’t particularly care what it is, of successes out of some larger number, call it N because that’s a convenient number, of trials, we need to figure out how many ways there are to arrange x successes out of N trials. For small x and N values we can figure this out by hand, given time. For large numbers, we’d never finish if we tried by hand. But we can solve it, if we attack the problem methodically.
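Attacking it methodically mostly means being careful about whether order matters. The sketch below compares the ordered and unordered counts for x = 6 successes among N = 6,000 trials. Reading the heading's 720 as 6!, the number of orderings of six chosen episodes, is my assumption.

```python
from math import comb, factorial, perm

N, x = 6000, 6

ordered = perm(N, x)     # pick x episodes one after another, order counted
unordered = comb(N, x)   # pick x episodes as a set, order ignored

print(f"ordered choices:   {ordered}")
print(f"unordered choices: {unordered}")
print(f"ratio: {ordered // unordered} (compare {x}! = {factorial(x)})")
```

Using the ordered count where the unordered one belongs overstates the answer by exactly the number of ways to shuffle the x chosen trials.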

## From Drew Carey To An Imaginary Baseball Player

So, we calculated that on any given episode of The Price Is Right there’s around one chance in a thousand of all six winners of the Item Up For Bid coming from the same seat. And we know there have been about six thousand episodes with six Items Up For Bid. So we expect there to have been about six clean sweep episodes; yet if Drew Carey is to be believed, there has been just the one. What’s wrong?

Possibly, nothing. Just because there is a certain probability of a thing happening does not mean it happens all that often. Consider an analogous situation: a baseball batter might hit safely one time out of every three at-bats; but there would be nothing particularly odd in the batter going hitless in four at-bats during a single game, however much we would expect him to get at least one. There wouldn’t be much very peculiar in his hitting all four times, either. Our expected value, the number of times something could happen times the probability of it happening each time, is not necessarily what we actually see. (We might get suspicious if we always saw the expected value turn up.)

Still, there must be some limits. We might accept a batter who hits one time out of every three getting no hits in four at-bats. If he got no hits in four hundred at-bats, we’d be inclined to say he’s not a decent hitter having some bad luck. More likely he’s failing to bring the bat with him to the plate. We need a tool to say whether some particular outcome is tolerably likely or so improbable that something must be up.
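The batter's odds are quick to work out exactly. This sketch assumes each at-bat is independent with a one-in-three chance of a hit, which is the idealization the example relies on; the variable names are mine.

```python
from fractions import Fraction

p_hit = Fraction(1, 3)
p_out = 1 - p_hit

hitless_in_4 = p_out ** 4          # four outs in a row
perfect_in_4 = p_hit ** 4          # four hits in a row
expected_hits_in_4 = 4 * p_hit     # the expected value for one game

hitless_in_400 = float(p_out ** 400)

print(f"hitless in 4 at-bats:   {hitless_in_4} (about {float(hitless_in_4):.3f})")
print(f"4-for-4:                {perfect_in_4} (about {float(perfect_in_4):.3f})")
print(f"expected hits in 4:     {expected_hits_in_4}")
print(f"hitless in 400 at-bats: {hitless_in_400:.2e}")
```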

## Came On Down

On the December 15th episode of The Price Is Right, host Drew Carey mentioned, as the sixth Item Up For Bids began, that all the contestants who had won their Item Up For Bids so far that show (and so got on-stage for the pricing games) had come from the same spot, five out of six. He said that only once before on the show had all the contestants come from the same seat in Contestants Row. That seems awfully few, but, how many should there be?

We can say roughly how many “clean sweep” shows we should expect. There’ve been just about 6,000 episodes of The Price Is Right played in the current hour-long format (the show was a half-hour its first few years after being revived in 1972; it was a very different show in previous decades). If we know the probability that all six contestants who win their Item Up For Bids in one show come from the same seat, and multiply that probability by the number of shows, we have the number of shows we should expect to have had such a clean sweep. (Properly speaking, the Item Up For Bids is called the One-Bid, but nobody cares.) This product, the chance of something happening times the number of times it could happen, is termed the “expected value” or “expectation value”, or sometimes just the “mean”, as in the average number to be, well, expected.
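As a rough sketch of that multiplication, assume every one of the four seats is equally likely to win each of the six items, independently of the others. The variable names are mine; the 6,000 shows is the figure given above.

```python
from fractions import Fraction

seats = 4
items_per_show = 6
shows = 6000

# One named seat winning all six items, then any of the four seats doing so.
sweep_by_one_seat = Fraction(1, seats) ** items_per_show
sweep_by_any_seat = seats * sweep_by_one_seat

expected_sweeps = shows * sweep_by_any_seat

print(f"chance of a clean sweep in one show: {sweep_by_any_seat}")
print(f"expected clean sweeps in {shows} shows: {float(expected_sweeps):.2f}")
```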

This makes a couple of assumptions. All probability problems do. For example, it assumes the chance of a clean sweep in one show is unaffected by clean sweeps in other shows. That is, if everyone in the red seat won on Thursday, that wouldn’t make everyone in the blue seat winning Friday more or less likely. That condition is termed “independence”, and it is frequently relied upon to make probability problems work out. Unfortunately, it’s often hard to prove: how do you prove that one thing happening doesn’t affect the other?

## Ted Baxter and the Binomial Distribution

There are many hard things about teaching, although I appreciate that since I’m in mathematics I have advantages over many other fields. For example, students come in with the assumption that there are certainly right and certainly wrong answers to questions. I’m generally spared the problem of convincing students that I have authority to rule some answers in or out. There’s actually a lot of discretion and judgement and opinion involved, but most of that comes in when one is doing research. In an introductory course, there are some techniques that have gotten so well-established and useful we could fairly well pretend there isn’t any judgement left.

But one hard part is probably common to all fields: how closely to guide a student working out something. This case comes from office hours, as I tried getting a student to work out a problem in binomial distributions. Binomial distributions come up in studying the case where there are many attempts at something; and each attempt has a certain, fixed, chance of succeeding; and you want to know the chance of there being exactly some particular number of successes out of all those tries. For example, imagine rolling four dice, and being interested in getting exactly two 6’s on the four dice.
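For the four-dice example, the calculation can be sketched in a few lines. This assumes fair six-sided dice rolled independently; the variable names are mine.

```python
from fractions import Fraction
from math import comb

attempts = 4                 # dice rolled
successes = 2                # how many 6's we want exactly
p_six = Fraction(1, 6)       # chance a single die shows a 6
p_not_six = 1 - p_six        # chance it shows anything else

ways = comb(attempts, successes)   # which two of the four dice are the 6's
chance = ways * p_six**successes * p_not_six**(attempts - successes)

print(f"{ways} arrangements, chance = {chance} (about {float(chance):.3f})")
```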

To work it out, you need the number of attempts, and the number of successes you’re interested in, and the chance of each attempt at something succeeding, and the chance of each attempt failing. For the four-dice problem, each attempt is the rolling of one die; there are four attempts at rolling a die; we’re interested in finding two successful rolls of 6; the chance of successfully getting a 6 on any roll is 1/6; and the chance of failure on any one roll is —

## One Explanation For Friday the 13th’s Chance

So to give one answer to my calendar puzzle, which you may recall as this: for any given month and year, we know with certainty whether there’s a Friday the 13th in it. And yet, we can say that “Friday the 13ths are more likely than any other day of the week”, and mean something by it, and even mean something true by it. Thanks to the patterns of the Gregorian calendar we are more likely to see a Friday the 13th than we are a Thursday the 13th, or Tuesday the 13th, or so on. (We’re also more likely to see a Saturday the 14th than the 14th being any other day of the week, but somehow that’s not so interesting.)
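The claim about the Gregorian calendar can be checked by brute force, since the calendar repeats exactly every 400 years. This sketch tallies the weekday of every 13th over one such cycle; the particular starting year is an arbitrary choice of mine.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# date.weekday() returns 0 for Monday through 6 for Sunday.
counts = Counter(
    DAY_NAMES[date(year, month, 13).weekday()]
    for year in range(2001, 2401)      # one full 400-year cycle
    for month in range(1, 13)
)

for name, n in counts.most_common():
    print(f"{name:<9} {n}")
```

The differences between days are small, a handful of months out of 4,800, which fits the description of Friday as only slightly more likely.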

Here’s one way to look at it. In December 2011 there’s zero chance of encountering a Friday the 13th. As it happens, 2011 has only one month with a Friday the 13th in it, the lowest case which happens. In January 2012 there’s a probability of one of encountering a Friday the 13th; it’s right there on the schedule. There’ll also be Fridays the 13th in April and July of 2012. For the other months of 2012, there’s zero probability of encountering a Friday the 13th.

Imagine that I pick one of the months in either 2011 or 2012. What is the chance that it has a Friday the 13th? If I tell you which month it is, you know right away the chance is zero or one; or, at least, you can tell as soon as you find a calendar. Or you might work out from various formulas what day of the week the 13th of that month should be, but you’re more likely to find a calendar before you are to find that formula, much less work it out.
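Here is the find-a-calendar approach done by machine. The sketch lists the months of 2011 and 2012 whose 13th is a Friday and, assuming each of the 24 months is equally likely to be the one picked, gives the chance that the chosen month has one.

```python
from datetime import date
from fractions import Fraction

FRIDAY = 4   # date.weekday() counts Monday as 0

months = [(year, month) for year in (2011, 2012) for month in range(1, 13)]
with_friday_13th = [
    (year, month) for (year, month) in months
    if date(year, month, 13).weekday() == FRIDAY
]

chance = Fraction(len(with_friday_13th), len(months))
print(with_friday_13th)
print(f"chance a randomly picked month has a Friday the 13th: {chance}")
```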

## How Did Friday The 13th Get A Chance?

Here’s a little puzzle in probability which, in a slightly different form, I gave to my students to work out. I get the papers back tomorrow. To brace myself against that I’m curious what my readers here would make of it.

Possibly you’ve encountered a bit of calendrical folklore which says that Friday the 13ths are more likely than any other day of the week’s 13th. That’s not that there are more Fridays the 13th than all the other days of the week combined, but rather that a Friday the 13th is more likely to happen than a Thursday the 13th, or a Sunday, or what have you. And this is true; one is slightly more likely to see a Friday the 13th than any other specific day of the week being that 13th.

And yet … there’s a problem in talking about the probability of any month having a Friday the 13th. Arguably, no month has any probability of holding a Friday the 13th. Consider.

Is there a Friday the 13th this month? For the month of this writing, December 2011, the answer is no; the 13th is a Tuesday; the Fridays are the 2nd, 9th, 16th, 23rd, and 30th. But were this January 2012, the answer would be yes. For February 2012, the answer is no again, as the 13th comes on a Monday. But altogether, every month has a Friday the 13th or it hasn’t. Technically, we might say that a month which definitely has a Friday the 13th has a probability of 1, or 100%; and a month which definitely doesn’t has a probability of 0, or 0%, but we tend to think of those as chances in the same way we think of white or black as colors, mostly when we want to divert an argument into nitpicking over definitions.

## What is .19 of a bathroom?

I’ve had a little more time attempting to teach probability to my students and realized I had been overlooking something obvious in the communication of ideas such as the probability of events or the expectation value of a random variable. Students have a much easier time getting the abstract idea if the examples used for it are already ones they find interesting, and if the examples can avoid confusing interpretations. This is probably about 3,500 years behind the curve in educational discoveries, but at least I got there eventually.

A “random variable”, here, sounds a bit scary, but shouldn’t. It means that the variable, for which x is a popular name, is some quantity which might be any of a collection of possible values. We don’t know for any particular experiment what value it has, at least before the experiment is done, but we know how likely it is to be any of those. For example, the number of bathrooms in a house is going to be one of 1, 1.5, 2, 2.5, 3, 3.5, up to the limits of tolerance of the zoning committee.

The expectation value of a random variable is kind of the average value of that variable. You find it by taking the sum of each of the possible values of the random variable times the probability of the random variable having that value. This is at least for a discrete random variable, where the imaginable values are, er, discrete: there are no continuous ranges of possible values. Number of bathrooms is clearly discrete; the number of seconds one spends in the bathroom is, at least in principle, continuous. For a continuous random variable you don’t take the sum, but instead take an integral, which is just a sum that handles the idea of infinitely many possible values quite well.
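The sum-of-value-times-probability recipe is short enough to show in full. The probabilities below are entirely made up for illustration, not data about real houses; only the method is the point.

```python
from fractions import Fraction

# Hypothetical distribution: number of bathrooms -> probability (invented numbers).
bathroom_chances = {
    Fraction(1):    Fraction(3, 10),
    Fraction(3, 2): Fraction(1, 4),
    Fraction(2):    Fraction(1, 4),
    Fraction(5, 2): Fraction(1, 10),
    Fraction(3):    Fraction(1, 10),
}

# A valid discrete distribution has probabilities that add up to exactly 1.
total_probability = sum(bathroom_chances.values())

# Expectation value: each possible value times its probability, summed.
expected_bathrooms = sum(value * prob for value, prob in bathroom_chances.items())

print(f"probabilities sum to {total_probability}")
print(f"expected number of bathrooms: {float(expected_bathrooms)}")
```

Notice that the expectation value need not be one of the possible values; no house in this made-up list has that number of bathrooms.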