Tagged: baseball

  • Joseph Nebus 6:00 pm on Monday, 18 July, 2016 Permalink | Reply
    Tags: baseball, birds

    Reading the Comics, July 13, 2016: Catching Up On Vacation Week Edition 

    I confess I spent the last week on vacation, away from home and without the time to write about the comics. And it was another of those curiously busy weeks that happens when it’s inconvenient. I’ll try to get caught up ahead of the weekend. No promises.

    Art and Chip Sansom’s The Born Loser for the 10th talks about the statistics of body measurements. Measuring bodies is one of the foundations of modern statistics. Adolphe Quetelet, in the mid-19th century, found a rough relationship between body mass and the square of a person’s height, used today as the base for the body mass index. Francis Galton spent much of the late 19th century developing the tools of statistics and how they might be used to understand human populations, with work I will describe as “problematic” because I don’t have the time to get into how much trouble the right mind with the wrong idea can be.

    No attempt to measure people’s health with a few simple measurements and derived quantities can be fully successful. Health is too complicated a thing for one or two or even ten quantities to describe. Measures like height-to-waist ratios and body mass indices and the like should be understood as filters, the way temperature and blood pressure are. If one or more of these measurements are in dangerous ranges there’s reason to think there’s a health problem worth investigating here. It doesn’t mean there is; it means there’s reason to think it’s worth spending resources on tests that are more expensive in time and money and energy. And similarly just because all the simple numbers are fine doesn’t mean someone is perfectly healthy. But it suggests that the person is more likely all right than not. They’re guides to setting priorities, easy to understand and requiring no training to use. They’re not a replacement for thought; no guides are.
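    Quetelet’s mass-to-height-squared relationship is simple enough to sketch in a few lines of code. The function name and sample figures here are my own illustration, not anything from the comic:

    ```python
    def bmi(mass_kg, height_m):
        """Quetelet's index: body mass divided by the square of height."""
        return mass_kg / height_m ** 2

    # A hypothetical 70 kg person standing 1.75 m tall:
    print(round(bmi(70, 1.75), 1))  # 22.9
    ```

    As with any of these screening numbers, the output is a filter, not a diagnosis.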

    Jeff Harris’s Shortcuts educational panel for the 10th is about zero. It’s got a mix of facts and trivia and puzzles with a few jokes on the side.

    I don’t have a strong reason to discuss Ashleigh Brilliant’s Pot-Shots rerun for the 11th. It only mentions odds in a way that doesn’t open up to discussing probability. But I do like Brilliant’s “Embrace-the-Doom” tone and I want to share that when I can.

    John Hambrock’s The Brilliant Mind of Edison Lee for the 13th of July riffs on the world’s leading exporter of statistics, baseball. Organized baseball has always been a statistics-keeping game. The Olympic Ball Club of Philadelphia’s 1837 rules set out what statistics to keep. I’m not sure why the game is so statistics-friendly. It must be in part that the game lends itself to representation as a series of identical events — pitcher throws ball at batter, while runners wait on up to three bases — with so many different outcomes.

    'Edison, let's discuss stats while we wait for the opening pitch.' 'Statistics? I have plenty of those. A hot dog has 400 calories and costs five dollars. A 12-ounce root beer has 38 grams of sugar.' 'I mean *player* stats.' 'Oh'. (To his grandfather instead) 'Did you know the average wait time to buy nachos is eight minutes and six seconds?'

    John Hambrock’s The Brilliant Mind of Edison Lee for the 13th of July, 2016. Properly speaking, the waiting time to buy nachos isn’t a player statistic, but I guess Edison Lee did choose to stop talking to his father for it. Which is strange considering his father’s totally natural and human-like word emission ‘Edison, let’s discuss stats while we wait for the opening pitch’.

    Alan Schwarz’s book The Numbers Game: Baseball’s Lifelong Fascination With Statistics describes much of the sport’s statistics and record-keeping history. The things recorded have varied over time, with the list of things mostly growing. The number of statistics kept has also tended to grow. Sometimes they get dropped. Runs Batted In were first calculated in 1880, then dropped as an inherently unfair statistic to keep; leadoff hitters were necessarily cheated of chances to get someone else home. How people’s idea of what is worth measuring changes is interesting. It speaks to how we change the ways we look at the same event.

    Dana Summers’s Bound And Gagged for the 13th uses the old joke about computers being abacuses and the like. I suppose it’s properly true that anything you could do on a real computer could be done on the abacus, just, with a lot more time and manual labor involved. At some point it’s not worth it, though.

    Nate Fakes’s Break of Day for the 13th uses the whiteboard full of mathematics to denote intelligence. Cute birds, though. But any animal in eyeglasses looks good. Lab coats are almost as good as eyeglasses.

    LERBE ( O O - O - ), GIRDI ( O O O - - ), TACNAV ( O - O - O - ), ULDNOA ( O O O - O - ). When it came to measuring the Earth's circumference, there was a ( - - - - - - - - ) ( - - - - - ).

    David L Hoyt and Jeff Knurek’s Jumble for the 13th of July, 2016. The link will be gone sometime after mid-August I figure. I hadn’t thought of a student being baffled by using the same formula for an orange and a planet’s circumference because of their enormous difference in size. It feels authentic, though.

    David L Hoyt and Jeff Knurek’s Jumble for the 13th is about one of geometry’s great applications, measuring how large the Earth is. It’s something that can be worked out through ingenuity and a bit of luck. Once you have that, some clever argument lets you work out the distance to the Moon, and its size. And that will let you work out the distance to the Sun, and its size. The Ancient Greeks had worked out all of this reasoning. But they had to make observations with the unaided eye, without good timekeeping — time and position are conjoined ideas — and without photographs or other instantly-made permanent records. So their numbers are, to our eyes, lousy. No matter. The reasoning is brilliant and deserves respect.

  • Joseph Nebus 3:00 pm on Tuesday, 31 May, 2016 Permalink | Reply
    Tags: baseball, Poisson, soccer

    How Interesting Is A Low-Scoring Game? 

    I’m still curious about the information-theory content, the entropy, of sports scores. I haven’t found the statistics about baseball or soccer game outcomes that I need. I’d also like hockey score outcomes if I could get them. If anyone knows a reference I’d be glad to know of it.

    But there’s still stuff I can talk about without knowing details of every game ever. One such topic suggested itself when I looked at The Washington Post’s graphic. I mean the one giving how many times each score came up in baseball’s history.

    I had planned to write about this when one of my Twitter friends wrote —

    By “distribution” mathematicians mean almost what you would imagine. Suppose we have something that might hold any of a range of values. This we call a “random variable”. How likely is it to hold any particular value? That’s what the distribution tells us. The higher the distribution, the more likely it is we’ll see that value. In baseball terms, that means we’re reasonably likely to see a game with a team scoring three runs. We’re not likely to see a game with a team scoring twenty runs.

    Frequency (in thousands) of various baseball scores. I think I know what kind of distribution this is and I mean to follow up about that.

    Philip Bump writes for The Washington Post on the scores of all basketball, football, and baseball games in (United States) major league history. Also I have thoughts about what this looks like.

    There are many families of distributions. Feloni Mayhem suggested the baseball scores look like one called the Beta Distribution. I can’t quite agree, on technical grounds. Beta Distributions describe continuously-valued variables. They’re good for stuff like the time it takes to do something, or the height of a person, or the weight of a produced thing. They’re for measurements that can, in principle, go on forever after the decimal point. A baseball score isn’t like that. A team can score zero points, or one, or 46, but it can’t score four and two-thirds points. Baseball scores are “discrete” variables.

    But there are good distributions for discrete variables. Almost everything you encounter taking an Intro to Probability class will be about discrete variables. So will most any recreational mathematics puzzle. The distribution of a tossed die’s outcomes is discrete. So is the number of times tails comes up in a set number of coin tosses. So are the birth dates of people in a room, or the number of cars passed on the side of the road during your ride, or the number of runs scored by a baseball team in a full game.

    I suspected that, of the simpler distributions, the best model for baseball should be the Poisson distribution. It also seems good for any other low-scoring game, such as soccer or hockey. The Poisson distribution turns up whenever you have a large number of times that some discrete event can happen. But that event can happen only once each chance. And it has a constant chance of happening. That is, happening this chance doesn’t make it more likely or less likely it’ll happen next chance.

    I have reasons to think baseball scoring should be well-modelled this way. There are hundreds of pitches in a game. Each of them is in principle a scoring opportunity. (Well, an intentional walk takes three pitches without offering any chance for scoring. And there’s probably some other odd case where a pitched ball can’t even in principle let someone score. But these are minor fallings-away from the ideal.) This is part of the appeal of baseball, at least for some: the chance is always there.

    We only need one number to work out the Poisson distribution of something. That number is the mean, the arithmetic mean of all the possible values. Let me call the mean μ, which is the Greek version of m and so a good name for a mean. The probability that you’ll see the thing happen n times is \mu^n e^{-\mu} \div (n!) . Here e is the base of the natural logarithm, that 2.71828 et cetera number. n! is the factorial. That’s n times (n – 1) times (n – 2) times (n – 3) and so on all the way down to times 2 times 1.
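    The formula is easy to evaluate directly. This little sketch, with naming of my own invention, confirms that with a mean of 3.4 runs the single most likely score is three:

    ```python
    import math

    def poisson_pmf(n, mu):
        """Probability of exactly n events when the mean is mu."""
        return mu ** n * math.exp(-mu) / math.factorial(n)

    # Probabilities for scores 0 through 20, mean of 3.4 runs
    probs = {n: poisson_pmf(n, 3.4) for n in range(21)}
    print(max(probs, key=probs.get))  # 3, the most likely score
    ```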

    And here is the Poisson distribution for getting numbers from 0 through 20, if we take the mean to be 3.4. I can defend using the Poisson distribution much more than I can defend picking 3.4 as the mean. Why not 3.2, or 3.8? Mostly, I tried a couple means around the three-to-four runs range and picked one that looked about right. Given the lack of better data, what else can I do?

    The Poisson distribution starts pretty low, with zero, then rises up high at three runs and dwindles down for ever-higher scores.

    A simulation of baseball, or other low-scoring games, based on a Poisson distribution with mean of 3.4.

    I don’t think it’s a bad fit. The shape looks about right, to me. But the Poisson distribution suggests fewer zero- and one-run games than the actual data offers. And there are more high-scoring games in the real data than in the Poisson distribution. Maybe there’s something that needs tweaking.

    And there are several plausible causes for this. A Poisson distribution, for example, supposes that there are a lot of chances for a distinct event. That would be scoring on a pitch. But in an actual baseball game there might be up to four runs scored on one pitch. It’s less likely to score four runs than to score one, sure, but it does happen. This I imagine boosts the number of high-scoring games.

    I suspect this could be salvaged by a model that’s kind of a chain of Poisson distributions. That is, have one distribution that represents the chance of scoring on any given pitch. Then use another distribution to say whether the scoring was one, two, three, or four runs.

    Low-scoring games I have a harder time accounting for. My suspicion is that each pitch isn’t quite an independent event. Experience shows that pitchers lose control of their game the more they pitch. This results in the modern close watching of pitch counts. We see pitchers replaced at something like a hundred pitches even if they haven’t lost control of the game yet.

    If we ignore reasons to doubt this distribution, then, it suggests an entropy of about 2.9 for a single team’s score. That’s lower than the 3.5 bits I estimated last time, using score frequencies. I think that’s because of the multiple-runs problem. Scores are spread out across more values than the Poisson distribution suggests.
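    That entropy figure can be checked directly from the distribution. This is my own calculation, truncating the infinite sum once the terms become negligible:

    ```python
    import math

    def poisson_pmf(n, mu):
        """Probability of exactly n events when the mean is mu."""
        return mu ** n * math.exp(-mu) / math.factorial(n)

    # Entropy in bits of a Poisson-distributed score with mean 3.4;
    # terms past 50 runs are vanishingly small, so stop there.
    entropy = -sum(p * math.log2(p)
                   for p in (poisson_pmf(n, 3.4) for n in range(51))
                   if p > 0)
    print(round(entropy, 1))  # 2.9
    ```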

    If I am right this says we might model games like soccer and hockey, with many chances to score a single run each, with a Poisson distribution. A game like baseball, or basketball, with many chances to score one or more points at once needs a more complicated model.

  • Joseph Nebus 3:00 pm on Wednesday, 18 May, 2016 Permalink | Reply
    Tags: baseball

    How Interesting Is A Baseball Score? Some Further Results 

    While researching for my post about the information content of baseball scores I found some tantalizing links. I had wanted to know how often each score came up. From this I could calculate the entropy, the amount of information in the score. That’s the sum, taken over every outcome, of minus one times the frequency of that score times the base-two logarithm of the frequency of the outcome. And I couldn’t find that.

    An article in The Washington Post had a fine lead, though. It offers, per the title, “the score of every basketball, football, and baseball game in league history visualized”. And as promised it gives charts of how often each number of runs has turned up in a game. The most common single-team score in a game is 3, with 4 and 2 almost as common. I’m not sure of the date range for these scores. The chart says it includes (and highlights) data from “a century ago”. And as the article was posted in December 2014 it can hardly use data from after that. I can’t imagine that the 2015 season has changed much, though. And whether they start their baseball statistics at 1871, 1876, 1883, 1891, or 1901 (each a defensible choice) should only change details.

    Frequency (in thousands) of various baseball scores. I think I know what kind of distribution this is and I mean to follow up about that.

    Philip Bump writes for The Washington Post on the scores of all basketball, football, and baseball games in (United States) major league history. Also I have thoughts about what this looks like.

    Which is fine. I can’t get precise frequency data from the chart. The chart offers how many thousands of times a particular score has come up. But there are no reference lines to say definitely whether a zero was scored closer to 21,000 or 22,000 times. I will accept a rough estimate, since I can’t do any better.

    I made my best guess at the frequency, from the chart. And then made a second-best guess. My best guess gave the information content of a single team’s score as a touch more than 3.5 bits. My second-best guess gave the information content as a touch less than 3.5 bits. So I feel safe in saying a single team’s score is about three and a half bits of information.

    So the score of a baseball game, with two teams scoring, is probably somewhere around twice that, or about seven bits of information.

    I have to say “around”. This is because the two teams aren’t scoring runs independently of one another. Baseball doesn’t allow for tie games except in rare circumstances. (It would usually be a game interrupted for some reason, and then never finished because the season ended with neither team in a position where winning or losing could affect their standing. I’m not sure that would technically count as a “game” for Major League Baseball statistical purposes. But I could easily see a roster of game scores counting that.) So if one team’s scored three runs in a game, we have the information that the other team almost certainly didn’t score three runs.

    This estimate, though, does fit within my range estimate from 3.76 to 9.25 bits. And as I expected, it’s closer to nine bits than to four bits. The entropy seems to be a bit less than (American) football scores — somewhere around 8.7 bits — and college basketball — probably somewhere around 10.8 bits — which is probably fair. There are a lot of numbers that make for plausible college basketball scores. There are slightly fewer pairs of numbers that make for plausible football scores. There are fewer still pairs of scores that make for plausible baseball scores. So there’s less information conveyed in knowing what the game’s score is.

  • Joseph Nebus 3:00 pm on Friday, 13 May, 2016 Permalink | Reply
    Tags: baseball

    How Interesting Is A Baseball Score? Some Partial Results 

    Meanwhile I have the slight ongoing quest to work out the information-theory content of sports scores. For college basketball scores I made up some plausible-looking score distributions and used that. For professional (American) football I found a record of all the score outcomes that’ve happened, and how often. I could use experimental results. And I’ve wanted to do other sports. Soccer was asked for. I haven’t been able to find the scoring data I need for that. Baseball, maybe the supreme example of sports as a way to generate statistics … has been frustrating.

    The raw data is available. Retrosheet.org has logs of pretty much every baseball game, going back to the forming of major leagues in the 1870s. What they don’t have, as best I can figure, is a list of all the times each possible baseball score has turned up. That I could probably work out, when I feel up to writing the scripts necessary, but “work”? Ugh.

    Some people have done the work, although they haven’t shared all the results. I don’t blame them; the full results make for a boring sort of page. “The Most Popular Scores In Baseball History”, at ValueOverReplacementGrit.com, reports the top ten most common scores from 1871 through 2010. The essay also mentions that as of then there were 611 unique final scores. And that lets me give some partial results, if we trust that blog posts from people I never heard of before are accurate and true. I will make that assumption over and over here.

    There’s, in principle, no limit to how many scores are possible. Baseball contains many implied infinities, and it’s not impossible that a game could end, say, 580 to 578. But it seems likely that after 139 seasons of play there can’t be all that many more scores practically achievable.

    Suppose then there are 611 possible baseball score outcomes, and that each of them is equally likely. Then the information-theory content of a score’s outcome is negative one times the logarithm, base two, of 1/611. That’s a number a little bit over nine and a quarter. You could deduce the score for a given game by asking usually nine, sometimes ten, yes-or-no questions from a source that knew the outcome. That’s a little higher than the 8.7 I worked out for football. And it’s a bit less than the 10.8 I estimate for college basketball.
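    The arithmetic behind that nine-and-a-quarter figure is a one-liner:

    ```python
    import math

    # 611 equally likely outcomes: the entropy is -log2(1/611) bits,
    # which is the same as log2(611)
    print(math.log2(611))  # a little over 9.25
    ```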

    And there’s obvious rubbish there. In no way are all 611 possible outcomes equally likely. “The Most Popular Scores In Baseball History” says that right there in the essay title. The most common outcome was a score of 3-2, with 4-3 barely less popular. Meanwhile it seems only once, on the 28th of June, 1871, has a baseball game ended with a score of 49-33. Some scores are so rare we can ignore them as possibilities.

    (You may wonder how incompetent baseball players of the 1870s were that a game could get to 49-33. Not so bad as you imagine. But the equipment and conditions they were playing with were unspeakably bad by modern standards. Notably, the playing field couldn’t be counted on to be flat and level and well-mowed. There would be unexpected divots or irregularities. This makes even simple ground balls hard to field. The baseball, instead of being replaced with every batter, would stay in the game. It would get beaten until it was a little smashed shell of unpredictable dynamics and barely any structural integrity. People were playing without gloves. If a game ran long enough, they would play at dusk, without lights, with a muddy ball on a dusty field. And sometimes you just have four innings that get out of control.)

    What’s needed is a guide to what are the common scores and what are the rare scores. And I haven’t found that, nor worked up the energy to make the list myself. But I found some promising partial results. In a September 2008 post on Baseball-Fever.com, user weskelton listed the 24 most common scores and their frequency. This was for games from 1993 to 2008. One might gripe that the list only covers fifteen years. True enough, but if the years are representative that’s fine. And the top scores for the fifteen-year survey look to be pretty much the same as the 139-year tally. The 24 most common scores add up to just over sixty percent of all baseball games, which leaves a lot of scores unaccounted for. I am amazed that about three in five games will have a score that’s one of these 24 choices though.

    But that’s something. We can calculate the information content for the 25 outcomes, one each of the 24 particular scores and one for “other”. This will under-estimate the information content. That’s because “other” is any of 587 possible outcomes that we’re not distinguishing. But if we have a lower bound and an upper bound, then we’ve learned something about what the number we want can actually be. The upper bound is that 9.25, above.

    The information content, the entropy, we calculate from the probability of each outcome. We don’t know what that is. Not really. But we can suppose that the frequency of each outcome is close to its probability. If there’ve been a lot of games played, then the frequency of a score and the probability of a score should be close. At least they’ll be close if games are independent, if the score of one game doesn’t affect another’s. I think that’s close to true. (Some games at the end of pennant races might affect each other: why try so hard to score if you’re already out for the year? But there’s few of them.)

    The entropy then we find by calculating, for each outcome, a product. It’s minus one times the probability of that outcome times the base-two logarithm of the probability of that outcome. Then add up all those products. There are good reasons for doing it this way, and in the college-basketball link above I give some rough explanations of what the reasons are. Or you can just trust that I’m not lying or getting things wrong on purpose.
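    In code the calculation looks like this. The probabilities below are invented stand-ins, since the actual 1993–2008 frequencies aren’t reproduced here; they only show the shape of the sum:

    ```python
    import math

    def entropy_bits(probs):
        """Sum of minus p times log2(p), taken over every outcome."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Made-up example: four common scores, twenty rarer ones, and one
    # lumped "other" bucket.  These are NOT the real baseball figures.
    probs = [0.06, 0.05, 0.05, 0.04] + [0.02] * 20 + [0.40]
    assert abs(sum(probs) - 1) < 1e-9
    print(round(entropy_bits(probs), 2))
    ```

    Note how the big “other” bucket contributes only one term to the sum, which is why lumping outcomes together drags the entropy estimate down.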

    So let’s suppose I have calculated this right, using the 24 distinct outcomes and the one “other” outcome. That makes out the information content of a baseball score’s outcome to be a little over 3.76 bits.

    As said, that’s a low estimate. Lumping about two-fifths of all games into the single category “other” drags the entropy down.

    But that gives me a range, at least. A baseball game’s score seems to be somewhere between about 3.76 and 9.25 bits of information. I expect that it’s closer to nine bits than it is to four bits, but will have to do a little more work to make the case for it.

    • Bunk Strutts 10:38 pm on Friday, 13 May, 2016 Permalink | Reply

      Unrelated, but it reminded me of a literature class in High School. The teacher gave multiple-choice quizzes every Friday, and I spotted patterns. By mid-semester I’d compiled a list of likely correct answers for each of the questions (i.e, 1. D; 2. B; 3. A, etc.). The pattern was consistent enough that I sold crib sheets that guaranteed a C for those who hadn’t studied. No one ever asked for a refund, and I never read Ethan Frome.


      • Joseph Nebus 11:27 pm on Saturday, 14 May, 2016 Permalink | Reply

        I can believe this. It reminds me of the time in Peanuts when Linus figured he could pass a true-or-false test without knowing anything. The thing students don’t realize about multiple choice questions is they are hard to write. The instructor has to come up with a reasonable question, and not just the answer but several plausible alternatives, and then has to scramble where in the choices the answer comes up.

        I remember at least once I gave out a five-question multiple choice section where all the answers were ‘B’, but my dim recollection is that I did that on purpose after I noticed I’d made ‘B’ the right answer the first three times. I think I was wondering if students would chicken out of the idea that all five questions had the same answer. But then I failed to check what the results were and if students really did turn away from the right answer just because it was too neat a pattern.


        • FlowCoef 1:04 am on Sunday, 15 May, 2016 Permalink | Reply

          Sometime professional MCSA test author here. Writing those things can be a bear, especially getting the distractors right.


          • Bunk Strutts 3:30 am on Sunday, 15 May, 2016 Permalink | Reply

            My story dates back to the days of mimeograph prints. I never considered the difficulty in generating the tests. In retrospect, we had a very good math department, and some of the teachers would do just what JN said – all answers were “B.” Spooked the hell out of me, and yeah, I punted to the next likely answers.

            The bonus questions were always bizarre. You could miss all the questions, but if you got the bonus you got credit for the whole thing. We were still learning how to factor and cross-multiply when we got this:

            Given: a = 1, b = 2, c = 3 etc.
            [(x-a)(x-b)(x-c) … (x-z)] = ?


            • Bunk Strutts 3:43 am on Sunday, 15 May, 2016 Permalink | Reply

              Last one. Got a timed geometry quiz, 10 questions. At the top of the quiz were the directions to read through all of the problems before answering. Each of the problems 1 through 9 were impossible to complete in the time allotted, but Number 10 said, “Disregard problems 1 through 9, sign your name at the top of the page and turn it in.”


              • Joseph Nebus 3:16 am on Monday, 16 May, 2016 Permalink | Reply

                You know, I have a vague memory of getting that sort of quiz myself, back around 1980 or so. It wasn’t in mathematics, although I’m not sure just which class it was. This was elementary school for me so all the classes kind of blended together.

                I suspect there was something in the air at the time, since I remember hearing stories about impossible-quizzes like that with a disregard-all-above-problems notes. And I can’t be sure I haven’t conflated a memory of taking one with the stories of disregard-all-above-problems tests being given.


            • Joseph Nebus 3:10 am on Monday, 16 May, 2016 Permalink | Reply

              I only barely make it back to the days of mimeograph machines, as a student, although it’s close.

              That bonus question sounds maddening, although its existence makes me suspect there’s a trick I’ll have to poke it with to see.


          • Joseph Nebus 3:03 am on Monday, 16 May, 2016 Permalink | Reply

            I had interviewed once to write mathematics questions for a standardized test corporation. I didn’t get it, though, and I suspect my weakness in coming up with good distractors was the big problem. I suspect I’d do better now.


  • Joseph Nebus 3:00 pm on Thursday, 13 August, 2015 Permalink | Reply
    Tags: baseball

    At The Home Field 

    There was a neat little fluke in baseball the other day. All fifteen of the Major League Baseball games on Tuesday were won by the home team. This appears to be the first time it’s happened since the league expanded to thirty teams in 1998. As best as the Elias Sports Bureau can work out, the last time every game was won by the home team was on the 23rd of May, 1914, when all four games in each of the National League, American League, and Federal League were home-team wins.

    This produced talk about the home field advantage never having it so good, naturally. Also at least one article claimed the odds of fifteen home-team wins were one in 32,768. I can’t find that article now that I need it; please just trust me that it existed.

    The thing is this claim is correct, if you assume there is no home-field advantage. That is, if you suppose the home team has exactly one chance in two of winning, then the chance of fifteen home teams winning is one-half raised to the fifteenth power. And that is one in 32,768.

    This also assumes the games are independent, that is, that the outcome of one has no effect on the outcome of another. This seems likely, at least as long as we’re far enough away from the end of the season. In a pennant race a team might credibly relax once another game decided whether they had secured a position in the postseason. That might affect whether they win the game under way. Whether results are independent is always important for a probability question.

    But stadium designers and the groundskeeping crew would not be doing their job if the home team had an equal chance of winning as the visiting team does. It’s been understood since the early days of organized professional baseball that the state of the field can offer advantages to the team that plays most of its games there.

    Jack Jones, at Betfirm.com, estimated that for the five seasons from 2010 to 2014, the home team won about 53.7 percent of all games. Suppose we take this as accurate and representative of the home field advantage in general. Then the chance of fifteen home-team wins is 0.537 raised to the fifteenth power. That is approximately one divided by 11,230.
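    The two estimates come straight from raising the single-game probability to the fifteenth power:

    ```python
    # Chance that all fifteen home teams win on one day, two ways
    p_fair = 0.5 ** 15     # assuming no home-field advantage
    p_home = 0.537 ** 15   # using the 53.7 percent home win rate

    print(round(1 / p_fair))   # 32768
    print(round(1 / p_home))   # about 11,230
    ```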

    That’s a good bit more probable than the one in 32,768 you’d expect from the home team having exactly a 50 percent chance of winning. I think that’s a dramatic difference considering the home team wins a bit less than four percent more often than 50-50.

    The follow-up question, and one that’s good for a probability homework, would be to work out the odds that we’d see one day with fifteen home-team wins in the mere eighteen years since it became possible.
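    A rough sketch of that homework problem, under an assumption I should flag loudly: the number of days per season with a full fifteen-game slate is my guess, not a looked-up figure.

    ```python
    # Chance of at least one all-home-team day, assuming (my guess)
    # about 100 full fifteen-game slates per season over 18 seasons
    p_day = 0.537 ** 15                 # chance one day goes all home teams
    days = 100 * 18
    p_ever = 1 - (1 - p_day) ** days    # at least one such day ever
    print(round(p_ever, 2))
    ```

    Changing the guessed number of full-slate days moves the answer around, but the structure of the calculation stays the same.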

    • sheldonk2014 6:26 pm on Thursday, 13 August, 2015 Permalink | Reply

      Ok yet another strike for Joseph


    • ivasallay 7:43 pm on Thursday, 13 August, 2015 Permalink | Reply

      I shared this post on facebook with my son who loves baseball. He then wrote to me, “It’s also assuming not just that there is no homefield advantage, but that the two teams are evenly matched. In 10 of the 15 games, the team with the better overall record won. In only 9 of the games did the winning pitcher have a better ERA than the losing pitcher.”


      • Joseph Nebus 6:03 am on Saturday, 15 August, 2015 Permalink | Reply

        You’re right, it does assume the teams are equally matched. That’s not a justified assumption except as an admission of ignorance, that we might not know which team is better. (I assume that the full season is enough to indicate the strongest and the weakest teams, although it’s probably not enough to distinguish which is the 15th versus the 16th versus the 17th-strongest teams.)


    • Angie Mc 2:20 pm on Friday, 14 August, 2015 Permalink | Reply

      Sending this to my college pitcher and will be reading it with my 2 ball players at home today. Very cool, Joseph. Thank you!


      • Joseph Nebus 6:04 am on Saturday, 15 August, 2015 Permalink | Reply

        I hope he enjoyed, though I’d imagine he had seen some talk about the fluke already.


        • Angie Mc 5:10 pm on Saturday, 15 August, 2015 Permalink | Reply

          My oldest son is our biggest baseball trivia geek, although we all enjoy cool stuff like this. There’s so much to keep up with so thanks for this one :D


          • Joseph Nebus 4:59 am on Tuesday, 18 August, 2015 Permalink | Reply

            Oh, good. I’ve had the good fortune the past few years to read up on the history of baseball statistics — it grew up with the organization of baseball — and that’s fascinating stuff. Partly for the history, partly for the mathematics, partly for the sociology of trying to figure out what needs quantifying and how to do that.


            • Angie Mc 3:26 pm on Tuesday, 18 August, 2015 Permalink | Reply

              Exactly! When my oldest began playing baseball as a kid, I knew nothing about the sport. Nothing. So, I read everything I could get my hands on about it and fell in love, mainly through the history at first. Baseball is so rich!


              • Joseph Nebus 9:19 pm on Saturday, 22 August, 2015 Permalink | Reply

                I have to admit I’m not much for playing baseball. I would have this problem of not sufficiently holding on to the bat after swinging, although the third baseman was able to jump out of the way of the bat every time. But the lore and the history and the evolution of the game are hard to resist. In short, I’ll buy pretty near anything Peter Morris writes.


                • Angie Mc 11:56 pm on Saturday, 22 August, 2015 Permalink | Reply

                  LOL! I’m with you about hitting a round ball with a round bat…how can anyone do that?! Morris’s “A Game of Inches” sounds terrific. Have you read it?


                  • Joseph Nebus 7:09 pm on Monday, 24 August, 2015 Permalink | Reply

                    It is a terrific book. I’ve read it and keep going back to leaf through it; there’s something fascinating on pretty near every page.

                    For example, you know how it’s legal for the runner to over-run first base? But not second or third? That’s the fossilized result of the decade or so when it was fashionable to play winter baseball on frozen ponds with all the players wearing ice skates.


                    • Angie Mc 6:01 am on Tuesday, 25 August, 2015 Permalink | Reply

                      All right, then, it’s official. I need to read this book! Likely after Christmas, before spring training when I need a baseball fix :D


                      • Joseph Nebus 11:13 pm on Thursday, 27 August, 2015 Permalink

                        Certainly, yes. It’ll fill that niche very nicely. It’s a thick book but everything in it is half- or one-page chunks, basically. And you can browse and graze rather than reading straight through.


                      • Angie Mc 11:19 pm on Thursday, 27 August, 2015 Permalink

                        Perfect! Just added it to my Amazon wish list :D


  • Joseph Nebus 4:08 pm on Tuesday, 7 April, 2015 Permalink | Reply
    Tags: baseball, cryptography, ENIAC, peace

    Reading the Comics, April 6, 2015: Little Infinite Edition 

    As I warned, there were a lot of mathematically-themed comic strips the last week, and here I can at least get us through the start of April. This doesn’t include the strips that ran today, the 7th of April by my calendar, because I have to get some serious-looking men to look at my car and I just know they’re going to disapprove of what my CV joint covers look like, even though I’ve done nothing to them. But I won’t be reading most of today’s comic strips until after that’s done, and so commenting on them later.

    Mark Anderson’s Andertoons (April 3) makes its traditional appearance in my roundup, in this case with a business-type guy declaring infinity to be “the loophole of all loopholes!” I think that’s overstating things a fair bit, but strange and very counter-intuitive things do happen when you try to work out a problem in which infinities turn up. For example: in ordinary arithmetic, the order in which you add together a bunch of real numbers makes no difference. If you want to add together infinitely many real numbers, though, it is possible to have them add to different numbers depending on what order you add them in. Most unsettlingly, it’s possible to have infinitely many real numbers add up to literally any real number you like, depending on the order in which you add them. And then things get really weird.
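    That order-dependence can be made concrete. Here’s a minimal Python sketch using the alternating harmonic series 1 − 1/2 + 1/3 − 1/4 + …, which in its natural order sums to the natural logarithm of 2, about 0.693. Greedily reordering the same terms steers the partial sums toward any target you care to pick (0.25 here is an arbitrary choice of mine):

```python
import math

def rearranged_sum(target, n_terms=100_000):
    """Greedily reorder the terms +1/1, -1/2, +1/3, -1/4, ... so the
    partial sums home in on `target`. This works because the series
    converges only conditionally."""
    next_odd, next_even = 1, 2   # denominators of the unused +/- terms
    s = 0.0
    for _ in range(n_terms):
        if s <= target:
            s += 1.0 / next_odd   # below target: take the next positive term
            next_odd += 2
        else:
            s -= 1.0 / next_even  # above target: take the next negative term
            next_even += 2
    return s

# Natural order gives ln 2, about 0.693; rearranged, the same terms approach 0.25.
print(math.log(2))
print(rearranged_sum(0.25))
```

    The trick works because the positive terms alone, and the negative terms alone, each diverge, so there’s always enough left in either direction to overshoot the target by as little as you like.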

    Keith Tutt and Daniel Saunders’s Lard’s World Peace Tips (April 3) is the other strip in this roundup to at least name-drop infinity. I confess I don’t see how “being infinite” would help in bringing about world peace, but I suppose being finite hasn’t managed the trick just yet so we might want to think outside the box.

    (More …)

    • ivasallay 4:39 pm on Tuesday, 7 April, 2015 Permalink | Reply

      Andertoons and Birdbrains were the best for me.


    • abyssbrain 3:57 am on Wednesday, 8 April, 2015 Permalink | Reply

      Hilbert’s infinite hotel rooms paradox can also show how weird the concept of infinity can get.


      • Joseph Nebus 10:08 pm on Wednesday, 8 April, 2015 Permalink | Reply

        They do, yes. They also suggest to me why the mathematics of infinity draws in a lot of … well, they’re generally called cranks, but maybe the less judgmental way to put it is non-standard mathematicians. The subject is astoundingly accessible; you can understand an interesting problem without any background. But the results are counter-intuitive, and so reasoning carefully is required, and it takes time and practice to do all the careful reasoning involved and to understand why the intuitive answers break down.


  • Joseph Nebus 10:50 pm on Thursday, 18 December, 2014 Permalink | Reply
    Tags: baseball

    Gaussian distribution of NBA scores 

    The Prior Probability blog points out an interesting graph, showing the most common final scores in basketball games, based on every NBA game played. It’s actually got three sets of data there: one for all basketball games, one for games this decade, and one for basketball games of the 1950s. Unsurprisingly there are many more results for this decade — the seasons are longer, and there are thirty teams in the league today, as opposed to eight or nine in 1954. (The Baltimore Bullets played fourteen games before folding, and the games were expunged from the record. The league dropped from eleven teams in 1950 to eight for 1954-1959.)

    I’m fascinated by this just as a depiction of probability distributions: any team can, in principle, reach most any non-negative score in a game, but it’s most likely to be around 102. Surely there’s a maximum possible score, based on the fact a team has to get the ball and get into position before it can score; I’m a little curious what that would be.

    Prior Probability itself links to another blog which reviews the distribution of scores for other major sports, and the interesting result of what the most common basketball score has been, per decade. It’s increased from the 1940s and 1950s, but it’s considerably down from the 1960s.


    prior probability

    You can see the most common scores in such sports as basketball, football, and baseball in Philip Bump’s fun Wonkblog post here. Mr Bump writes: “Each sport follows a rough bell curve … Teams that regularly fall on the left side of that curve do poorly. Teams that land on the right side do well.” Read more about Gaussian distributions here.


  • Joseph Nebus 9:27 pm on Monday, 20 October, 2014 Permalink | Reply
    Tags: art history, baseball, linguistics, Major League Baseball, theses

    Reading The Comics, October 20, 2014: No Images This Edition 

    Since I started including Comics Kingdom strips in my roundups of mathematically-themed strips I’ve been including images of those, because I’m none too confident that Comics Kingdom’s pages are accessible to normal readers after some time has passed. Gocomics.com has — as far as I’m aware, and as far as anyone has told me — no such problems, so I haven’t bothered doing more than linking to them. So this is the first roundup in a long while I remember that has only Gocomics strips, with nothing from Comics Kingdom. It’s also the first roundup for which I’m fairly sure I’ve done one of these strips before.

    Guy Endore-Kaiser and Rodd Perry and Dan Thompson’s Brevity (October 15, but a rerun) is an entry in the anthropomorphic-numbers line of mathematics comics, and I believe it’s one that I’ve already mentioned in the past. This particular strip is a rerun; in modern times the apparently indefatigable Dan Thompson has added this strip to the estimated fourteen he does by himself. In any event it stands out in the anthropomorphic-numbers subgenre for featuring non-integers that aren’t pi.

    Ralph Hagen’s The Barn (October 16) ponders how aliens might communicate with Earthlings, and, like pretty much everyone who’s considered the question, supposes mathematics would be the way they’d do it. It’s easy to see why mathematics is plausible as a universal language: a mathematical truth should be true anywhere that deductive logic holds, and it’s difficult to conceive of a universe existing in which it could not hold true. I have somewhere around here a mention of a late-19th-century proposal to try contacting Martians by planting trees in Siberia which, in bloom, would show a proof of the Pythagorean theorem.

    In modern times we tend to think contact with aliens would more likely be done by radio (or at least some modulated-light signal), which makes a signal like a series of pulses counting out prime numbers sound likely. It’s easy to see why prime numbers should be interesting too: any species that has understood multiplication has almost certainly noticed them, and you can send enough prime numbers in a short time to make clear that there is a deliberate signal being sent. For comparison, perfect numbers — whose factors add up to the original number — are also almost surely noticed by any species that understands multiplication, but the first several of those are 6, 28, 496, and 8,128; by the time 8,128 pulses of anything have been sent the whole point of the message has been lost.

    And yet finding prime numbers is still not really quite universal. You or I might see prime numbers as key, but why not triangular numbers, like the sequence 1, 3, 6, 10, 15? Why not square or cube numbers? The only good answer is, well, we have to pick something, so to start communicating let’s hope we find something that everyone will be able to recognize. But there’s an arbitrariness that can’t be fully shed from the process.
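    The counting argument about primes versus perfect numbers is easy to check. A sketch, assuming nothing beyond the definitions (a sieve for the primes, a divisor sum for the perfect numbers):

```python
def primes_up_to(n):
    """All primes up to n, by the sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            is_prime[p * p::p] = [False] * len(is_prime[p * p::p])
    return [i for i, flag in enumerate(is_prime) if flag]

def proper_divisor_sum(m):
    """Sum of the divisors of m other than m itself (for m > 1)."""
    total, d = 1, 2
    while d * d <= m:
        if m % d == 0:
            total += d + (m // d if d != m // d else 0)
        d += 1
    return total

def perfect_up_to(n):
    """Numbers equal to the sum of their proper divisors."""
    return [m for m in range(2, n + 1) if proper_divisor_sum(m) == m]

print(primes_up_to(30))       # ten primes before you even reach 30
print(perfect_up_to(10_000))  # [6, 28, 496, 8128]: the fourth needs 8,128 pulses
```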

    John Zakour and Scott Roberts’s Maria’s Day (October 17) reminds us of the value of having a tutor for mathematics problems — if you’re having trouble in class, go to one — and of paying them appropriately.

    Steve Melcher’s That Is Priceless (October 17) puts comic captions to classic paintings and so presented Jusepe de Ribera’s 1630 Euclid, Letting Me Copy His Math Homework. I confess I have a broad-based ignorance of art history, but if I’m using search engines correctly the correct title was actually … Euclid. Hm. It seems like Melcher usually has to work harder at these things. Well, I admit it doesn’t quite match my mental picture of Euclid, but that would have mostly involved some guy in a toga wielding a compass. Ribera seems to have had a series of Greek Mathematician pictures from about 1630, including Pythagoras and Archimedes, with similar poses that I’ll take as stylized representations of the great thinkers.

    Mark Anderson’s Andertoons (October 18) plays around with statistical ideas that include expectation values and the gambler’s fallacy, but it’s a good puzzle: has the doctor done the procedure hundreds of times without a problem because he’s better than average at it, or because he’s been lucky? For an alternate formulation, baseball offers a fine question: Ted Williams is the most recent Major League Baseball player to have a season batting average over .400, getting a hit in at least two-fifths of his at-bats over the course of the season. Was he actually good enough to get a hit that often, though, or did he just get lucky? Consider that a .250 hitter — with a 25 percent chance of a hit at any at-bat — could quite plausibly get hits in three out of his four chances in one game, or for that matter even two or three games. Why not a whole season?

    Well, because at some point it becomes ridiculous, rather the way we would suspect something was up if a tossed coin came up tails thirty times in a row. Yes, possibly it’s just luck, but there’s good reason to suspect this coin doesn’t have a fifty percent chance of coming up heads, or that the hitter is likely to do better than one hit for every four at-bats, or, to the original comic, that the doctor is just better at getting through the procedure without complications.
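    How plausible the luck explanation is can be put to numbers. A rough Python sketch (the 500 at-bats are my round figure for a full season, not Williams’s actual total):

```python
import math

def binom_tail(n, k_min, p):
    """P(at least k_min successes in n independent tries, each with
    probability p), computed in log space to dodge overflow."""
    logs = [math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p)
            for k in range(k_min, n + 1)]
    m = max(logs)
    return math.exp(m) * sum(math.exp(l - m) for l in logs)

# A .250 hitter going 3-for-4 in one game: unremarkable, about 5 percent.
print(binom_tail(4, 3, 0.25))
# The same hitter batting .400 over 500 at-bats purely by luck: essentially never.
print(binom_tail(500, 200, 0.25))
```

    The single-game fluke happens all the time; the season-long fluke is so improbable we’re justified in concluding the hitter really was better than .250.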

    Ryan North’s quasi-clip-art Dinosaur Comics (October 20) thrilled the part of me that secretly wanted to study language instead by discussing “light verb constructions”, a grammatical touch I hadn’t paid attention to before. The strip is dubbed “Compressed Thesis Comics”, though, from the notion that the Tyrannosaurus Rex is inspired to study “computationally” what forms of light verb construction are more and what are less acceptable. The impulse is almost the perfect thesis project, really: notice a thing and wonder how to quantify it. A good piece of this thesis would probably be just working out how to measure acceptability of a particular verb construction. I imagine the linguistics community has a rough idea how to measure these or else T Rex is taking on way too big a project for a thesis, since that’d be an obvious point for the thesis to crash against.

    Well, I still like the punch line.

  • Joseph Nebus 1:04 pm on Friday, 3 October, 2014 Permalink | Reply
    Tags: baseball, playoffs, postseason, subway series

    How weird is it that three pairs of same-market teams made the playoffs this year? 

    The “God Plays Dice” blog has a nice little baseball-themed post built on the coincidence that a number of the teams in the postseason this year are from the same or at least neighboring markets — two from Los Angeles, a pair from San Francisco and Oakland, and another pair from Washington and Baltimore. It can’t be likely that this should happen much, but, how unlikely is it? Michael Lugo works it out in what’s probably the easiest way to do it.


    God plays dice

    The Major League Baseball postseason is starting just as I write this.

    From the National League, we have Washington, St. Louis, Pittsburgh, Los Angeles, and San Francisco.
    From the American League, we have Baltimore, Kansas City, Detroit, Los Angeles (Anaheim), and Oakland.

    These match up pretty well geographically, and this hasn’t gone unnoticed: see for example the New York Times blog post “the 2014 MLB playoffs have a neighborly feel” (apologies for not providing a link; I’m out of NYT views for the month, and I saw this back when I wasn’t); a couple mathematically inclined Facebook friends of mine have mentioned it as well.

    In particular there are three pairs of “same-market” teams in here: Washington/Baltimore, Los Angeles/Los Angeles, San Francisco/Oakland. How likely is that?

    (People have pointed out St. Louis/Kansas City as being both in Missouri, but that’s a bit more of a judgment call, and St. Louis…


  • Joseph Nebus 2:32 pm on Thursday, 12 June, 2014 Permalink | Reply
    Tags: baseball

    Reading the Comics, June 11, 2014: Unsound Edition 

    I can tell the school year is getting near the end: it took a full week to get enough mathematics-themed comic strips to put together a useful bundle of them this time. I don’t know what I’m going to do this summer when there’s maybe two comic strips I can talk about per week and I have to go finding my own initiative to write about things.

    Jef Mallet’s Frazz (June 6) is a pun strip, yeah, although it’s one that’s more or less legitimate for a word problem. The reason I have to say “more or less” is that it’s not clear to me whether, per Caulfield’s specification, the amount of ore lost across each Great Lake is three percent of the original cargo or three percent of the remaining cargo. But writing a word problem so that there’s only the one correct solution is a skill that needs development no less than solving word problems is, and probably if we imagine Caulfield grading he’d realize there was an ambiguity when a substantial number of the papers make the opposite assumption to what he’d had in his mind.
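    For what it’s worth, the two readings give measurably different answers, which is exactly why the ambiguity matters for grading. A sketch, assuming five lakes and Caulfield’s three percent:

```python
def ore_remaining_fixed(cargo=1.0, lakes=5, loss=0.03):
    """Reading 1: each lake costs three percent of the ORIGINAL cargo."""
    return cargo - lakes * loss * cargo

def ore_remaining_compound(cargo=1.0, lakes=5, loss=0.03):
    """Reading 2: each lake costs three percent of whatever REMAINS."""
    for _ in range(lakes):
        cargo *= (1 - loss)
    return cargo

print(ore_remaining_fixed())     # reading 1 leaves 85 percent of the cargo
print(ore_remaining_compound())  # reading 2 leaves about 85.9 percent
```

    A student using the “wrong” reading would be off by nearly a percentage point of cargo, enough to look like an arithmetic mistake when it isn’t one.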

    Ruben Bolling’s Tom the Dancing Bug (June 6, and I believe it’s a rerun) steps into some of the philosophically heady waters that one gets into when you look seriously at probability, and that get outright silly when you mix omniscience into the mix. The Supreme Planner has worked out what he concludes to be a plan certain of success, but: does that actually mean one will succeed? Even if we assume that the Supreme Planner is able to successfully know and account for every factor which might affect his success — well, for a less criminal plan, consider: one is certain to toss heads at least once, if one flips a fair coin infinitely many times. And yet it would not actually be impossible to flip a fair coin infinitely many times and have it turn up tails every time. That something can have a probability of 1 (or 100%) of happening and nevertheless not happen — or equivalently, that something can have a probability of 0 (0%) of happening and still happen — is exactly analogous to how a concept can be true almost everywhere, that is, it can be true with exceptions that in some sense don’t matter. Ruben Bolling tosses in the troublesome notion of the multiverse, the idea that everything which might conceivably happen does happen “somewhere”, to make these impossible events all the more imminent. I’m impressed Bolling is able to touch on so much, with a taste of how unsettling the implications are, in a dozen panels and stay funny about it.

    Enos cheats, badly, on his test.

    Bud Grace’s The Piranha Club for the 9th of June, 2014.

    Bud Grace’s The Piranha Club (June 9) gives us Enos cheating with perfectly appropriate formulas for a mathematics exam. I’m kind of surprised the Pythagorean Theorem would rate cheat-sheet knowledge, actually, as I thought that had reached the popular culture at least as well as Einstein’s E = mc² had, although perhaps it’s reached it much as Einstein’s has, as a charming set of sounds without any particular meaning behind them. I admit my tendency in giving exams, too, has been to allow students to bring their own sheet of notes, or even to have open-book exams, on the grounds that I don’t really care whether they’ve memorized formulas and am more interested in whether they can find and apply the relevant formulas. But that doesn’t make me right; I agree there’s value in being able to identify what the important parts of the course are and to remember them well, and even more value in being able to figure out the area of a triangle or a trapezoid from thinking hard about the subject on your own.

    Jason Poland’s Robbie and Bobbie (June 10) is looking for philosophy and mathematics majors, so, here’s hoping it’s found a couple more. The joke here is about the classification of logical arguments. A valid argument is one in which the conclusion does indeed follow from the premises according to the rules of deductive logic. A sound argument is a valid argument in which the premises are also true. The reason these aren’t exactly the same thing is that whether a conclusion follows from the premise depends on the structure of the argument; the content is irrelevant. This means we can do a great deal of work, reasoning out things which follow if we suppose that proposition A being true implies B is false, or that we know B and C cannot both be false, or whatnot. But this means we may fill in, Mad-Libs-style, whatever we like to those propositions and come away with some funny-sounding arguments.

    So this is how we can have an argument that’s valid yet not sound. It is valid to say that, if baseball is a form of band organ always found in amusement parks, and if amusement parks are always found in the cubby-hole under my bathroom sink, then, baseball is always found in the cubby-hole under my bathroom sink. But as none of the premises going into that argument are true, the argument’s not sound, which is how you can have anything be “valid but not sound”. Identifying arguments that are valid but not sound is good for a couple questions on your logic exam, so, be ready for that.

    Edison Lee fails to catch a ball because he miscalculates where it should land.

    John Hambrock’s The Brilliant Mind of Edison Lee, 11 June 2014.

    John Hambrock’s The Brilliant Mind of Edison Lee (June 11) has the brilliant yet annoying Edison trying to prove his genius by calculating precisely where the baseball will drop. This is a legitimate mathematics/physics problem, of course: one could argue that the modern history of mathematical physics comes from the study of falling balls, albeit more of cannonballs than baseballs. If there’s no air resistance and if gravity is uniform, the problem is easy and you get to show off your knowledge of parabolas. If gravity isn’t uniform, you have to show off your knowledge of ellipses. Either way, you can get into some fine differential equations work, and that work gets all the more impressive if you do have to pay attention to the fact that a ball moving through the air loses some of its speed to the air molecules. That said, it’s amazing that people are able to, in effect, work out approximate solutions to “where is this ball going” in their heads, not to mention to act on it and get to the roughly correct spot, at least when they’ve had some practice.
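    The easy, uniform-gravity, no-air-resistance case Edison should have started with fits in a couple of lines. A sketch (the launch speed and angle are numbers I made up):

```python
import math

def landing_distance(speed, angle, g=9.81):
    """Horizontal range of a projectile launched from ground level,
    ignoring air resistance: the parabola case."""
    t_flight = 2 * speed * math.sin(angle) / g   # time until it returns to y = 0
    return speed * math.cos(angle) * t_flight

# A ball hit at 40 m/s at 45 degrees; this matches the textbook
# closed form v^2 * sin(2 * angle) / g.
print(landing_distance(40, math.radians(45)))
```

    The real problem, with drag, has no such closed form, which is part of why outfielders’ instincts are so impressive.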

  • Joseph Nebus 3:39 pm on Wednesday, 14 May, 2014 Permalink | Reply
    Tags: baseball

    Reading the Comics, May 13, 2014: Good Class Problems Edition 

    Someone in Comic Strip Master Command must be readying for the end of term, as there’s been enough comic strips mentioning mathematics themes to justify another of these entries, and that’s before I even start reading Wednesday’s comics. I can’t say that there seem to be any overarching themes in the past week’s grab-bag of strips, but, there are a bunch of pretty good problems that would fit well in a mathematics class here.

    Darrin Bell’s Candorville (May 6) comes back around to the default application of probability, questions in coin-flipping. You could build a good swath of a probability course just from the questions the strip implies: how many coins have to come up heads before it becomes reasonable to suspect that something funny is going on? Two is obviously too few; two thousand is likely too many. But improbable things do happen, without it signifying anything. So what’s the risk of supposing something’s up when it isn’t? What’s the risk of dismissing the hints that something is happening?
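    The strip’s question — how many heads before suspicion is reasonable — makes a nice one-liner. Under a fair coin a run of n heads has probability 1/2ⁿ; this sketch finds where that drops below a chosen significance level (the 5 percent threshold is the conventional statistician’s choice, not anything in the strip):

```python
def heads_needed(alpha=0.05):
    """Smallest run of consecutive heads whose probability under a
    fair coin falls below alpha."""
    n = 1
    while 0.5 ** n >= alpha:
        n += 1
    return n

print(heads_needed())        # 5 heads in a row: probability under 5 percent
print(heads_needed(0.001))   # 10 in a row: under one in a thousand
```

    Of course the threshold is a judgment call, which is the strip’s real point: the mathematics tells you how improbable the run is, not whether to be suspicious.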

    Mark Anderson’s Andertoons (May 8) is another entry in the wiseacre schoolchild genre (I wonder if I’ve actually been consistent in describing this kind of comic, but, you know what I mean) and suggesting that arithmetic just be done on the computer. I’m sympathetic, however much fun it is doing arithmetic by hand.

    Justin Boyd’s Invisible Bread (May 9) is honestly a marginal inclusion here, but it does show a mathematics problem that’s correctly formed and would reasonably be included on a precalculus or calculus class’s worksheets. It is a problem that’s a no-brainer, really, but that fits the comic’s theme of poorly functioning.

    Steve Moore’s In The Bleachers (May 12) uses baseball scores and the start of a series. A series, at least once you’re into calculus, is the sum of a sequence of numbers, and if there’s only finitely many of them, here, there’s not much that’s interesting to say. Each sequence of numbers has some sum and that’s it. But if you have an infinite series — well, there, all sorts of amazing things become possible (or at least logically justified), including integral calculus and numerical computing. The series from the panel, if carried out, would come to a pair of infinitely large sums — this is called divergence, and is why your mathematician friends on Facebook or Twitter are passing around that movie poster with a math formula for a divergent series on it — and you can probably get a fair argument going about whether the sum of all the even numbers would be equal to the sum of all the odd numbers. (My advice: if pressed to give an answer, point to the other side of the room, yell, “Look, a big, distracting thing!” and run off.)
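    The partial sums make the divergence, and the even-versus-odd argument, vivid. A quick sketch:

```python
def partial_sums(term, n):
    """First n partial sums of the series whose k-th term is term(k)."""
    s, out = 0, []
    for k in range(1, n + 1):
        s += term(k)
        out.append(s)
    return out

print(partial_sums(lambda k: 2 * k, 6))      # evens: 2, 6, 12, 20, 30, 42, ...
print(partial_sums(lambda k: 2 * k - 1, 6))  # odds: 1, 4, 9, 16, 25, 36, ...
```

    Both run off to infinity — the odd partial sums hitting every perfect square along the way — so neither “sum” exists, which is why the argument about whether they’re equal can’t be settled in ordinary arithmetic.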

    Samson’s Dark Side Of The Horse (May 13) is something akin to a pun, playing as it does on the difference between a number and a numeral and shifting between the ways we might talk about “three”. Also, I notice for the first time that apparently the little bird sometimes seen in the comic is named “Sine”, which is probably why it flies in such a wavy pattern. I don’t know how I’d missed that before.

    Rick Detorie’s One Big Happy (May 13, rerun) is also a strip that plays on the difference between a number and its representation as a numeral, really. Come to think of it, it’s a bit surprising that in Arabic numerals there aren’t any relationships between the representations for numbers; one could easily imagine a system in which, say, the symbol for “four” were a pair of whatever represents “two”. In A History Of Mathematical Notations Florian Cajori notes that there really isn’t any system behind why a particular numeral has any particular shape, and he takes a section (Section 96 in Book 1) to get engagingly catty about people who do. I’d like to quote it because it’s appealing, in that way:

    A problem as fascinating as the puzzle of the origin of language relates to the evolution of the forms of our numerals. Proceeding on the tacit assumption that each of our numerals contains within itself, as a skeleton so to speak, as many dots, strokes, or angles as it represents units, imaginative writers of different countries and ages have advanced hypotheses as to their origin. Nor did these writers feel that they were indulging simply in pleasing pastimes or merely contributing to mathematical recreations. With perhaps only one exception, they were as convinced of the correctness of their explanations as are circle-squarers of the soundness of their quadratures.

    Cajori goes on to describe attempts to rationalize the Arabic numerals as “merely … entertaining illustrations of the operation of a pseudo-scientific imagination, uncontrolled by all the known facts”, which gives some idea why Cajori’s engaging reading for seven hundred pages about stuff like where the plus sign comes from.

  • Joseph Nebus 11:23 pm on Monday, 5 May, 2014 Permalink | Reply
    Tags: baseball

    Reading the Comics, May 4, 2014: Summing the Series Edition 

    Before I get to today’s round of mathematics comics, a legend-or-joke, traditionally starring John von Neumann as the mathematician.

    The recreational word problem goes like this: two bicyclists, twenty miles apart, are pedaling toward each other, each at a steady ten miles an hour. A fly takes off from the first bicyclist, heading straight for the second at fifteen miles per hour (ground speed); when it touches the second bicyclist it instantly turns around and returns to the first at again fifteen miles per hour, at which point it turns around again and heads for the second, and back to the first, and so on. By the time the bicyclists reach one another, the fly — having made, incidentally, infinitely many trips between them — has travelled some distance. What is it?

    And this is not a hard problem to set up, inherently: each leg of the fly’s trip is going to be a certain ratio of the previous leg, which means that formulas for a geometric infinite series can be used. You just need to work out what the lengths of those legs are to start with, and what that ratio is, and then work out the formula in your head. This is a bit tedious and people given the problem may need some time and a couple sheets of paper to make it work.

    Von Neumann, who was an expert in pretty much every field of mathematics and a good number of those in physics, allegedly heard the problem and immediately answered: 15 miles! And the problem-giver said, oh, he saw the trick. (Since the bicyclists will spend one hour pedaling before meeting, and the fly is travelling fifteen miles per hour all that time, it travels a total of fifteen miles. Most people don’t think of that, and try to sum the infinite series instead.) And von Neumann said, “What trick? All I did was sum the infinite series.”
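    Both routes to the answer can be checked by brute force. A sketch that sums the legs directly, the way von Neumann claimed to (in spirit, anyway; he did it in his head):

```python
def fly_distance(gap=20.0, rider_speed=10.0, fly_speed=15.0, legs=60):
    """Total distance flown, summed leg by leg. On each leg the fly and
    the oncoming rider close the gap at their combined speed; by symmetry
    every leg is the same problem with a smaller gap."""
    total = 0.0
    for _ in range(legs):
        t = gap / (fly_speed + rider_speed)  # time until fly meets the oncoming rider
        total += fly_speed * t
        gap -= 2 * rider_speed * t           # both riders kept pedaling during the leg
    return total

print(fly_distance())   # the leg-by-leg sum converges to 15, matching the shortcut
```

    Each leg here shrinks the gap by a factor of five, so sixty legs is far more than enough for the geometric series to settle down.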

    Did this charming story of a mathematician being all mathematicky happen? Wikipedia’s description of the event credits Paul Halmos’s recounting of Nicholas Metropolis’s recounting of the story, which as a source seems only marginally better than “I heard it on the Internet somewhere”. (Other versions of the story give different distances for the bicyclists and different speeds for the fly.) But it’s a wonderful legend and can be linked to a Herb and Jamaal comic strip from this past week.

    Paul Trap’s Thatababy (April 29) has the baby “blame entropy”, which fits as a mathematical concept, it seems to me. Entropy as a concept was developed in the mid-19th century as a thermodynamical concept, and it’s one of those rare mathematical constructs which becomes a superstar of pop culture. It’s become something of a fancy word for disorder or chaos or just plain messes, and the notion that the entropy of a system is ever-increasing is probably the only bit of statistical mechanics an average person can be expected to know. (And the situation is more complicated than that; for example, it’s just more probable that the entropy is increasing in time.)

    Entropy is a great concept, though, as besides capturing very well an idea that’s almost universally present, it also turns out to be meaningful in surprising new places. The most powerful of those is in information theory, which is just what the label suggests; the field grew out of the problem of making messages understandable even though the telegraph or telephone lines or radio beams on which they were sent would garble the messages some, even if people sent or received the messages perfectly, which they would not. The most captivating (to my mind) new place is in black holes: the event horizon of a black hole has a surface area which is (proportional to) its entropy, and consideration of such things as the conservation of energy and the link between entropy and surface area allow one to understand something of the way black holes ought to interact with matter and with one another, without the mathematics involved being nearly as complicated as I might have imagined a priori.
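    The information-theory incarnation is easy to demonstrate. A sketch of Shannon entropy, measured in bits per symbol (the two test strings are mine, not anything from the strip):

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Average information, in bits per symbol, of a message drawn
    from this text's letter frequencies."""
    counts = Counter(text)
    n = len(text)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(shannon_entropy("aaaa"))  # 0.0: perfectly predictable, no information
print(shannon_entropy("abcd"))  # 2.0: four equally likely symbols, two bits each
```

    The formula deliberately has the same shape as thermodynamic entropy: many equally likely states mean high entropy, one certain state means none.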

    Meanwhile, Lincoln Peirce’s Big Nate (April 30) mentions how Nate’s Earned Run Average has changed over the course of two innings. Baseball is maybe the archetypical record-keeping statistics-driven sport; Alan Schwarz’s The Numbers Game: Baseball’s Lifelong Fascination With Statistics notes that the keeping of some statistical records was required at least as far back as 1837 (in the Constitution of the Olympic Ball Club of Philadelphia). Earned runs — along with nearly every other baseball statistic the non-stathead has heard of other than batting averages — were developed as a concept by the baseball evangelist and reporter Henry Chadwick, who presented them from 1867 as an attempt to measure the effectiveness of batting and fielding. (The idea of the pitcher as an active player, as opposed to a convenient way to get the ball into play, was still developing.) But — and isn’t this typical? — he would come to oppose the earned run average as a measure of pitching performance, because things that were really outside the pitcher’s control, such as stolen bases, contributed to it.
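    The statistic Nate is watching is simple arithmetic, for the record: earned runs scaled to a nine-inning game. A sketch (the sample figures are invented, and I’m ignoring the convention of counting thirds of innings):

```python
def earned_run_average(earned_runs, innings_pitched):
    """ERA: earned runs allowed per nine innings pitched."""
    return 9 * earned_runs / innings_pitched

# Give up 4 earned runs over 9 innings and your ERA is 4.00;
# give up the same 4 runs in only 2 innings and it balloons to 18.00.
print(earned_run_average(4, 9))
print(earned_run_average(4, 2))
```

    The scaling by nine is why a bad couple of innings moves the number so dramatically early in a season, which is presumably what caught Nate’s attention.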

    It seems to me there must be some connection between the record-keeping of baseball and the development of statistics as a concept in the 19th century. Granted, the 19th century was a century of statistics, starting with nation-states measuring their populations, their demographics, and their economies, and projecting what this would imply for future needs; then with science, as statistical mechanics found it possible to understand quite well the behavior of millions of particles despite it being impossible to perfectly understand four; and in business, as manufacturing and money were made less individual and more standard. There was plenty to drive the field without an amusing game, but I can’t help thinking of sports as a gateway into the field.

    Creators.com's _Donald Duck_ for 2 May 2014: Ludwig von Drake orders his computer to stop with the thinking.

    The Disney Company’s Donald Duck (May 2, rerun) suggests that Ludwig von Drake is continuing to have problems with his computing machine. Indeed, he’s apparently still having the same problem. I’d like to know when these strips originally ran, but the host site, creators.com, doesn’t give any hint.

    Stephen Bentley’s Herb and Jamaal (May 3) has the kid whose name I don’t really know fret how he spent “so much time” on an equation which would’ve been easy if he’d used “common sense” instead. But that’s not a rare phenomenon mathematically: it’s quite possible to set up an equation, or a process, or a something which does indeed inevitably get you to a correct answer but which demands a lot of time and effort to finish, when a stroke of insight or recasting of the problem would remove that effort, as in the von Neumann legend. The commenter Dartpaw86, on the Comics Curmudgeon site, brought up another excellent example, from Katie Tiedrich’s Awkward Zombie web comic. (I didn’t use the insight shown in the comic to solve it, but I’m happy to say, I did get it right without going to pages of calculations, whether or not you believe me.)
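    The von Neumann legend is the fly-and-cyclists puzzle mentioned in the comments below: in the version I know (the numbers are the traditional illustrative ones, nothing canonical), two cyclists 20 miles apart ride toward each other at 10 miles per hour while a fly shuttles between them at 15 miles per hour, and the question is how far the fly travels before the cyclists meet. The insight: the cyclists meet in one hour, so the fly flies 15 miles. Summing the legs of the fly's journey, the way von Neumann supposedly did in his head, gets the same answer the long way; a sketch:

```python
def fly_distance_series(gap, cyclist_speed, fly_speed, legs=60):
    """Sum the fly's back-and-forth legs explicitly."""
    total = 0.0
    for _ in range(legs):
        # The fly and the oncoming cyclist close at their combined speed.
        t = gap / (fly_speed + cyclist_speed)
        total += fly_speed * t
        # Meanwhile both cyclists kept riding, shrinking the gap.
        gap -= 2 * cyclist_speed * t
    return total

# The insight: cyclists 20 miles apart at 10 mph each meet in one hour,
# so a 15 mph fly simply flies 15 miles.
print(fly_distance_series(20, 10, 15))  # ≈ 15.0
```

    The geometric series converges fast enough that sixty legs is vast overkill, which is rather the point: the recasting replaces an infinite sum with one multiplication.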

    However, having insights is hard. You can learn many of the tricks people use for different problems, but, say, no amount of studying the Awkward Zombie puzzle about a square inscribed in a circle inscribed in a square inscribed in a circle inscribed in a square will help you in working out the area left behind when a cylindrical tube is drilled out of a sphere. Setting up an approach that will, given enough work, get you a correct solution is worth knowing how to do, especially if you can give the boring part of actually doing the calculations to a computer, which is indefatigable and, certain duck-based operating systems aside, pretty reliable. That doesn’t mean you don’t feel dumb for missing the recasting.
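    The drilled-sphere problem, by the way, is the classic "napkin ring", usually stated in terms of the volume left behind; the surprise is that the remaining volume depends only on the height of the hole, not on the size of the original sphere. A numerical sketch (my own illustration, not from the comic), summing thin slices:

```python
import math

def napkin_ring_volume(sphere_radius, hole_height, steps=100_000):
    """Volume left when a cylindrical hole of the given height is drilled
    straight through the center of a sphere, by summing annular slices."""
    R, h = sphere_radius, hole_height
    r_sq = R * R - (h / 2) ** 2  # squared radius the drill must have
    total = 0.0
    dz = h / steps
    for i in range(steps):
        z = -h / 2 + (i + 0.5) * dz
        total += math.pi * ((R * R - z * z) - r_sq) * dz
    return total

# Spheres of very different sizes leave the same volume behind,
# pi * h**3 / 6, once the hole heights agree.
h = 2.0
print(napkin_ring_volume(1.5, h))   # ≈ 4.19
print(napkin_ring_volume(50.0, h))  # ≈ the same
```

    Which is exactly the sort of result that feels impossible until the right recasting (subtracting the drilled-out cylinder and caps slice by slice) makes it obvious.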

    Rick Detorie's _One Big Happy_ for 3 May 2014: Joe names the whole numbers.

    Rick DeTorie’s One Big Happy (May 3) puns a little on the meaning of whole numbers. It might sound a little silly to have a name for only a handful of numbers, but there’s no reason not to if the group is interesting enough. It’s possible (although I’d be surprised if it were the case) that there are only 47 Mersenne primes (a number, such as 7 or 31, that is one less than a whole power of 2), and we have the concept of the “odd perfect number” even though there might well not be any such thing.
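    Mersenne primes are rare enough that each candidate exponent gets checked individually, by the Lucas-Lehmer test; and each one found yields an even perfect number, by a correspondence going back to Euclid and Euler. A small sketch of both facts:

```python
def is_mersenne_prime(p):
    """Lucas-Lehmer test: for odd prime p, 2**p - 1 is prime exactly
    when the test residue reaches zero (p = 2 handled directly)."""
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime
    m = 2 ** p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
exponents = [p for p in primes if is_mersenne_prime(p)]
print(exponents)  # [2, 3, 5, 7, 13, 17, 19, 31]

# Each Mersenne prime 2**p - 1 yields an even perfect number:
print([2 ** (p - 1) * (2 ** p - 1) for p in exponents[:4]])  # [6, 28, 496, 8128]
```

    Note that 11, 23, and 29 drop out: 2 to a prime power minus one is not automatically prime, which is why anyone bothers with the test.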

    • elkement 9:33 am on Friday, 9 May, 2014 Permalink | Reply

      I think I was once introduced to the puzzle of the fly and the bicyclists with Landau as the main protagonist ;-)


      • Joseph Nebus 4:16 am on Saturday, 10 May, 2014 Permalink | Reply

        Oh, that’s an interesting variant. I don’t think I’ve encountered it with Landau. I seem to remember a version of it with Robert Oppenheimer.


  • Joseph Nebus 5:00 pm on Saturday, 21 September, 2013 Permalink | Reply
    Tags: baseball, , , , ,   

    Reading the Comics, September 21, 2013 

    It must have been summer vacation that made comic strip artists take time off from mathematics-themed jokes; now there’s a fresh batch of them a mere ten days after my last roundup.

    John Zakour and Scott Roberts’s Maria’s Day (September 12) tells the basic “not understanding fractions” joke. I suspect that Zakour and Roberts — who’re pretty well-steeped in nerd culture, as their panel strip Working Daze shows — were summoning one of those warmly familiar old jokes. Well, Sydney Harris got away with the same punch line; why not them?

    Brett Koth’s Diamond Lil (September 14) also mentions fractions, but as an example of one of those inexplicably complicated mathematics things that’ll haunt you rather than be useful or interesting or even understandable. I choose not to be offended by this insult of my preferred profession and won’t even point out that Koth totally redrew the panel three times over so it’s not a static shot of immobile talking heads.

    (More …)

    • elkement 10:30 am on Sunday, 22 September, 2013 Permalink | Reply

      My favorite is the weather joke – as I had once really been asked, as a ‘science expert’, what those probabilities quoted on forecast websites mean.


  • Joseph Nebus 3:16 am on Monday, 22 July, 2013 Permalink | Reply
    Tags: baseball, , ,   

    Distribution of the batting order slot that ends a baseball game 

    The God Plays Dice blog has a nice piece attempting to model a baseball question. Baseball is wonderful for all kinds of mathematics questions, partly because the game has since its creation kept data about the plays made, partly because the game breaks its action neatly into discrete units with well-defined outcomes.

    Here, Dr Michael Lugo ponders whether games are more likely to end at any particular spot in the batting order. Lugo points out that we could certainly just count where games actually end, since baseball records are complete enough to make an estimate by that route possible. But that’s tedious, and it’s easier to work out a simple model and see what that suggests. Lugo also uses the number of perfect games as a test of whether the model is remotely plausible, and a test like this — a simple check of whether the scheme could possibly tell us something meaningful — is worth doing whenever one builds a model of something interesting.
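    Lugo's actual model is behind the link; here is a cruder toy version of the same idea, which I've made up for illustration. Assume each plate appearance ends in an out with a fixed (invented) probability of 0.68, each inning lasts until three outs, and there are always nine innings, with no walk-offs or extra innings. Then the slot ending the game is the total number of plate appearances, mod 9:

```python
import random
from collections import Counter

def plate_appearances_one_game(p_out=0.68, rng=random):
    """Toy model: nine innings, each lasting until three outs, with
    every plate appearance an out with probability p_out."""
    pa = 0
    for _ in range(9):
        outs = 0
        while outs < 3:
            pa += 1
            if rng.random() < p_out:
                outs += 1
    return pa

random.seed(20130722)  # arbitrary seed, for reproducibility
games = 20_000
slots = Counter(plate_appearances_one_game() % 9 for _ in range(games))
for slot in range(9):
    print(slot, round(slots[slot] / games, 3))
```

    Even in this crude model the spread of total plate appearances is wide enough, compared to a cycle of nine batters, that every residue comes out near one-ninth, which is what the uniformity assumption in the excerpt below amounts to.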


    God plays dice

    Tom Tango, while writing about lineup construction in baseball, pointed out that batters batting closer to the top of the batting order have a greater chance of setting records that are based on counting something – for example, Chris Davis’ chase for 62 home runs. (It’s interesting that enough people see Roger Maris’ 61 as the “real” record that 62 is a big deal.) He observes that over a 162-game season, each slot further down in the batting order (of 9) means 18 fewer plate appearances.

    Implicitly this means that every slot in the batting order is equally likely to end the game — that is, that the number of plate appearances for a team in a game, mod 9, is uniformly distributed over {0, 1, …, 8}.

    Can we check this? There are two ways to check it:

    • 1. find the number of plate appearances in every game…

    View original post 652 more words

  • Joseph Nebus 12:37 am on Tuesday, 29 January, 2013 Permalink | Reply
    Tags: baseball, , NBA, ,   

    Reblog: Lawler’s Log 

    I don’t intend to transform my writings here into a low-key sports mathematics blog. I just happen to have run across a couple of interesting problems and, after all, sports do offer a lot of neat questions about probability and statistics.

    benperreira here makes mention of “Lawler’s Law”, something I had not previously noticed. The “Law” is the observation that the first basketball team to make it to 100 points wins the game just about 90 percent of the time. It was apparently first observed by Los Angeles Clippers announcer Ralph Lawler and has been supported by a review of the statistics of NBA teams over the decades.

    benperreira is unimpressed with the law, regarding it as just a restatement, in unduly wise-sounding phrasing, of the principle that a team which scores more than the league-average number of points per game will tend to have a winning record. I’m inclined to agree the Law doesn’t amount to particularly much, though I was caught by the implication that the team which lets the other get to 100 points first still pulls out a victory one time out of ten.

    To underscore his point benperreira includes a diagram purporting to show the likelihood of victory against points scored, although it’s pretty obviously meant as a quick joke extrapolating from the data that both teams start with a 50 percent chance of victory and zero points, and that 100 points gives a nearly 90 percent chance of victory. I am curious about a more precise chart, showing how often the first team to make 10, or 25, or 50, or so points goes on to victory, but I certainly haven’t got time to compile that data.
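    I haven't compiled the real data, but a toy Monte Carlo gives a feel for how such a chart might look. Everything here is invented for illustration: two evenly matched teams alternate possessions, each possession scoring 0, 2, or 3 points with made-up weights, over a made-up 90 possessions per side:

```python
import random

THRESHOLDS = (10, 25, 50, 100)

def simulate_game(possessions=90, rng=random):
    """Toy model: evenly matched teams alternate possessions, each
    scoring 0, 2, or 3 points with invented probabilities."""
    score = [0, 0]
    first_to = {}  # threshold -> index of the team that reached it first
    for i in range(2 * possessions):
        team = i % 2
        score[team] += rng.choices((0, 2, 3), weights=(50, 35, 15))[0]
        for n in THRESHOLDS:
            if n not in first_to and score[team] >= n:
                first_to[n] = team
    winner = 0 if score[0] > score[1] else 1
    return first_to, winner, score

random.seed(2013)
wins = {n: 0 for n in THRESHOLDS}
counts = {n: 0 for n in THRESHOLDS}
for _ in range(5000):
    first_to, winner, score = simulate_game()
    if score[0] == score[1]:
        continue  # ignore the occasional tie in this toy model
    for n, team in first_to.items():
        counts[n] += 1
        wins[n] += (team == winner)

for n in THRESHOLDS:
    print(n, round(wins[n] / counts[n], 3))
```

    The shape that comes out is the interesting part: being first to 10 points predicts victory only weakly, while being first to 100, so close to the end of the game, predicts it strongly, which is the sense in which Lawler's Law is nearly vacuous.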

    Well, perhaps I do, but my reading in baseball history and my brushes with people who have SABR connections make it very clear I have every possible risk factor for getting lost in the world of sports statistics, so I want to stay far from the meat of actual games.

    Still, there are good probability questions to be asked about things like how big a lead is effectively unbeatable, and I’ll leave this post and reblog as a way to nag myself into maybe thinking about them later.


    Ben Perreira

    Lawler’s Law states that the NBA team that reaches 100 points first will win the game. It is based on Lawler’s observations and confirmed by looking back at NBA statistics that show it is true over 90% of the time.

    Its brilliance lies in its uselessness. Like NyQuil helps us sleep but does little to help our immune systems make us well, Lawler’s Law soothes us by making us think it means something more than it does.

    Why is it so useless, one may venture to ask?


    This is a graphical representation of Lawler’s Law. Point A represents the beginning of a game. This team (which ultimately wins this game) has roughly a 50% chance of winning at that point. As the game goes on, and more points are scored, the team depicted here increases its chance of victory based on the number of points it has scored. Point B…

    View original post 142 more words

  • Joseph Nebus 5:08 am on Sunday, 27 January, 2013 Permalink | Reply
    Tags: baseball, baseball game, , pigeon hole principle, , ,   

    Trivial Little Baseball Puzzle 

    I’ve been reading a book about the innovations of baseball so that’s probably why it’s on my mind. And this isn’t important and I don’t expect it to go anywhere, but it did cross my mind, so, why not give it 200 words where they won’t do any harm?

    Imagine one half-inning in a baseball game; imagine that there’s no substitutions or injuries or anything requiring the replacement of a batter. Also suppose there are none of those freak events like when a batter hits out of order and the other team doesn’t notice (or pretends not to notice), the sort of things which launch one into the wonderful and strange world of stuff baseball does because they did it that way in 1835 when everyone playing was striving to be a Gentleman.

    What’s the maximum number of runs that could be scored while still having at least one player not get a run?

    (More …)

    • Rocket the Pony (@Blue_Pony) 3:44 am on Monday, 28 January, 2013 Permalink | Reply

      I’m not certain enough of the rules to be sure this would work, but… What if 24 runs had been scored, and the bases were loaded, with the unlucky #9 batter on third base. The batter at the plate gets a hit that bounces all over the place, staying fair, and the outfielders stumble all over themselves trying to retrieve it, kind of like when we play on Spindizzy. The unlucky #9 batter fails to tag home plate, but thinks that he has, trotting off to the dugout. Meanwhile, the other three runners score before the defending team can get the ball to home plate to tag #9 out. Would that work?


      • Joseph Nebus 8:56 pm on Monday, 28 January, 2013 Permalink | Reply

        I’m not sure. I think that it goes against the spirit of “no freak events”, since a runner missing a base is a fairly abnormal event. But allowing it as the sort of glitch that does happen often enough not to send people running to the rulebooks to find out whether it even is a rule …

        I don’t know. I’m fairly confident that this would put the unlucky runner out, but whether the runs that came in after he missed home plate count or whether they’re voided I’m not sure. I could certainly see a trivia book or column a la Ripley’s claiming there were 27 runs scored in that fateful inning even if the last three were annulled, though.


      • Joseph Nebus 6:11 am on Tuesday, 29 January, 2013 Permalink | Reply

        OK, per D F Manno in alt.fan.cecil-adams, if Unlucky #9 fails to touch home plate, then, he’d be out and neither his run nor the ones after him would count.

        However, it is not an automatic thing: per rule 7.10(d), the defending team would have to tag home plate and appeal to the umpire before the next pitch is thrown or any play (or attempted play) made. (See my comments about stuff being done as if it were still 1835.)

        If the defending team doesn’t tag the plate, or doesn’t appeal the play in time, or the umpire doesn’t agree the runner missed the base, though, then the run counts, which does spoil the setup about Unlucky #9 not getting a run.


  • Joseph Nebus 3:38 am on Thursday, 5 January, 2012 Permalink | Reply
    Tags: attempts, baseball, Bernoulli, , , , , ,   

    From Drew Carey To An Imaginary Baseball Player 

    So, we calculated that on any given episode of The Price Is Right there’s around one chance in a thousand of all six winners of the Item Up For Bid coming from the same seat. And we know there have been about six thousand episodes with six Items Up For Bid. So we expect there to have been about six clean-sweep episodes; yet if Drew Carey is to be believed, there has been just the one. What’s wrong?

    Possibly, nothing. Just because there is a certain probability of a thing happening does not mean it happens all that often. Consider an analogous situation: a baseball batter might hit safely one time out of every three at-bats; but there would be nothing particularly odd in the batter going hitless in four at-bats during a single game, however much we would expect him to get at least one. There wouldn’t be much very peculiar in his hitting all four times, either. Our expected value, the number of times something could happen times the probability of it happening each time, is not necessarily what we actually see. (We might get suspicious if we always saw the expected value turn up.)

    Still, there must be some limits. We might accept a batter who hits one time out of every three getting no hits in four at-bats. If he got no hits in four hundred at-bats, we’d be inclined to say he’s not a decent hitter having some bad luck. More likely he’s failing to bring the bat with him to the plate. We need a tool to say whether some particular outcome is tolerably likely or so improbable that something must be up.
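    The arithmetic behind those hunches is the binomial distribution: with a 1-in-3 chance per at-bat, each run of at-bats has a computable probability. A quick sketch (the batter and his numbers are, of course, hypothetical):

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

p_hit = 1 / 3
print(binom_pmf(0, 4, p_hit))    # ≈ 0.198: hitless games happen all the time
print(binom_pmf(4, 4, p_hit))    # ≈ 0.012: 4-for-4 is rarer, but hardly freakish
print(binom_pmf(0, 400, p_hit))  # ≈ 3.7e-71: not bad luck; something is wrong
```

    One hitless game in five is unremarkable; a hitless four hundred at-bats is so improbable that the sensible conclusion is that the one-in-three model of the batter was wrong all along, which is the germ of hypothesis testing.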

    (More …)
