
How Unlucky am I?
Written 08-27-2024
Introduction
As mentioned in a few other places around this website, the things in life that have caught my interest the most are those with a large, but finite, set of possibilities guided by seemingly arbitrary constraints. In my post about ADV Draft, I bring up a few examples: Century Pong, card games like Hearts, and, of course, ADV Draft itself. Baseball falls somewhat neatly into this box -- the game itself is guided by a set of arbitrary rules that present a finite set of possibilities.
Every baseball that has ever left a bat can be described by two numbers: launch angle and exit velocity. If we want to go further, we can add the spray angle relative to the center-field line and the temperature and density of the air the ball cuts through. Every combination of these coordinates has the ball landing in a specific spot, which makes baseball almost a deterministic game.
Almost, however, is the key word here. Fielder positioning and the fielders themselves are prone to error and create the excitement often seen in baseball. Batter sprint speed and baserunning ability add more variables -- two baseballs with the same coordinates mentioned above in the same ballpark can have different results. Hitting itself is also not deterministic, even though pitching is once the ball leaves the pitcher’s hand. Away from the mound, some of the most exciting moments in all of baseball are the races between a batter and a baseball, hurled from one fielder to another, in a dead heat to a base. On the mound, a hitter getting put on one knee by a wicked pitch, or the triumph of a great hitter in a tough matchup, like Barry Bonds getting the best of Eric Gagne in that famous at-bat, helps close the loop on a truly fascinating game.
These diverse and exciting moments can, however, be boiled down into outcomes. Baseball is a game with rigid scoring. Millions of the coordinate pairs mentioned above are always home runs. Millions of plays with slightly different fielder positioning and ball landing spots have resulted in doubles. No matter how many pitches it takes or how many hits and walks are allowed, if a game ends with a team scoring zero runs, it is a shutout. Some of these outcomes, of course, are much rarer than others -- at the time of writing, Baseball Reference (which we’ll be leaning on pretty heavily during the course of this post) lists over 414,000 double plays in MLB history, while SABR’s Triple Plays Database lists only 736 triple plays. There have been 24 perfect games (although it should be 25) and 325 no-hitters (although it should be 326).
Armando Galarraga’s 28-out perfect game should be recorded as a perfect game. This is no secret and not up for debate outside of the worry about altering history. Everyone who has seen the play, including umpire Jim Joyce, knows the call was blown. However, the 326 comment I made earlier is purely my own opinion and not at all based in fact (every call was correct).
When I visited Globe Life Field to watch the Astros take on the Rangers, I witnessed Astros pitcher Framber Valdez carry a no-hit bid through 26 outs. The would-be 27th out, Josh Smith, came up to bat, the count went full, the 3-2 pitch was thrown... and a walk was issued. The next batter, Corey Seager, hit a no-doubt, 30/30 home run on the next pitch. No-hit bid ruined. I would love nothing more than to witness a no-hitter someday, and I came quite literally as close as I possibly could to witnessing one without doing so: one strike away. At that time, I could only think: How unlucky am I?
Let's Quantify It!
First, I want to expressly state that this is an exploration into probability. I’m not considering several variables that very likely contributed to the situation, such as pitch count, fatigue, weather conditions, and times through the lineup. I am solely considering the number of games in which something occurs against the entire sample of historical MLB games. Second, I am treating the probability of this happening to me as a series of independent events for the purposes of this exercise. Because Valdez was pitching for both Josh Smith’s walk and Corey Seager’s home run, these events are not truly independent. Third, the historical sample of MLB games I will compare against is the live-ball era outside of the current season: 1920 to 2023, with data from Sports Reference and Retrosheet. This is a sample of 183,857 regular-season games. Finally, the analysis I’m doing does not take eras into account -- different events are going to have different likelihoods of occurring in different eras. However, with a collective sample of over one hundred years of baseball, I am assuming these fluctuations in probability come out in the wash.
Now, there are two distinct events that make what I witnessed truly unlucky: a batter reaching base safely on a two-strike count with two outs in the ninth inning (i.e. by a walk, hit by pitch, error, or catcher interference), and a hit breaking up a no-hitter with two outs in the ninth. First came the tantalizing closeness of being one strike away, then a second opportunity to seal the deal, extinguished. By multiplying the probabilities of these two events, treated as independent, we can find the likelihood that I would witness the tragedy I did at Globe Life Field on August 6th, 2024. I am not considering extra-inning games in this analysis, as I am focused on the traditional 27-out no-hitter.
The Math
Josh Smith reached base safely on a 2-strike count in the ninth inning with two outs. Baseball Reference has the following information available for plate appearances with 2 outs in the 9th inning since 1920:
35,270 walks. 7,555 with a two-strike count, 18,028 with an unknown count.
5,089 intentional walks. 10 with a two-strike count, 2,679 with an unknown count.
4,387 times reached on error. 649 with a two-strike count, 2,774 with an unknown count.
2,283 hit by pitches. 557 with a two-strike count, 870 with an unknown count.
37 times reached on catcher interference. 7 with a two-strike count, 21 with an unknown count.
372,568 total plate appearances.
For events with an unknown count, we can apply the percentage of two-strike instances of the event to the unknown number to get an estimate. For example, 7,555 of the 35,270 walks that fit our description (21.42%) are known to have occurred with two strikes. Applying that share to the 18,028 walks with an unknown count gives an estimated 3,862 more that occurred on a 3-2 count, making our total two-strike walk count 11,417. Repeating this for each event type gives us a total of 13,271 plate appearances that match our criteria.
13,271 divided by 372,568 total ninth-inning, two-out plate appearances between 1920 and 2023 gives us a probability of 3.56% of the batter reaching base safely on a two-strike count. However, the worst was yet to come.
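The estimate above can be sketched in a few lines of Python. This is only an illustration built from the counts quoted in this post: the names `EVENTS`, `estimate_two_strike`, and `reach_probability` are made up for the example, and the scaling rule (known two-strike share of all events, applied to the unknown-count events) is the one described above, not the only reasonable way to fill in the unknowns.

```python
# Counts for ninth-inning, two-out plate appearances since 1920, as quoted above.
# Each entry is (total events, known two-strike events, unknown-count events).
EVENTS = {
    "walk": (35_270, 7_555, 18_028),
    "intentional walk": (5_089, 10, 2_679),
    "reached on error": (4_387, 649, 2_774),
    "hit by pitch": (2_283, 557, 870),
    "catcher interference": (37, 7, 21),
}
TOTAL_PA = 372_568


def estimate_two_strike(total, known_two_strike, unknown):
    """Known two-strike events plus an estimated share of the unknown-count ones.

    The share is (known two-strike / total), the ratio used in this post.
    """
    share = known_two_strike / total
    return round(known_two_strike + share * unknown)


def reach_probability(events=EVENTS, total_pa=TOTAL_PA):
    """Return (estimated matching plate appearances, probability)."""
    matching = sum(estimate_two_strike(*counts) for counts in events.values())
    return matching, matching / total_pa


matching, prob = reach_probability()
print(f"{matching:,} matching plate appearances, {prob:.2%} of the total")
```

Rounding each event type to a whole plate appearance before summing keeps the sketch in line with the per-event totals given in the text.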
Corey Seager broke up the no-hitter in the ninth inning with two outs. For this, we don’t have to worry about count since Seager blasted it on a 0-0 pitch -- we're just looking for games where this occurred. Luckily, I stumbled across a blog that had the exact relevant information I needed, up to date as of the very game I’m talking about. According to this source, including combined no-hit bids where multiple pitchers collectively held a team hitless up to this point, this has happened 55 times in the time range we’re looking at. 55 divided by our 183,857-game sample gives us a probability of 0.03% that a no-hit bid is broken up in the ninth with two outs.
Combining these two events gives us a probability of 0.00107%, or around 1 in 93,847. Given that the sample we took spans 104 seasons, this is statistically likely to happen once every 53 years. With a basic estimate of a baseball game lasting two and a half hours, it would take me around 26 years and 9 months to go to 93,847 baseball games -- consecutively, with no breaks for human functions such as sleep or work. And I was fortunate enough to be able to witness it in fewer than 40 games! It’s cruel, in a way, given that a no-hitter (and by extension, a perfect game, although this is unrealistic) is the single thing I want to witness most at a baseball game, and I managed to witness something roughly 125 times less common than a no-hitter, which shows up in about 1 of every 750 games in this sample.
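For anyone who wants to check the arithmetic, here is a rough Python sketch of the combination step. It assumes, as this post does, that the two events are independent, and it reuses the counts quoted above; the function names and the `HOURS_PER_GAME` constant are invented for the illustration.

```python
HOURS_PER_GAME = 2.5  # the rough game length assumed in this post


def combined_odds(p_reach, p_breakup):
    """Multiply two probabilities (treated as independent) and return (p, 1-in-N)."""
    p = p_reach * p_breakup
    return p, 1 / p


def years_of_nonstop_games(games, hours_per_game=HOURS_PER_GAME):
    """Years needed to sit through `games` back to back, with no breaks at all."""
    return games * hours_per_game / (24 * 365.25)


p_reach = 13_271 / 372_568   # batter reaches on two strikes, two outs, ninth inning
p_breakup = 55 / 183_857     # no-hit bid broken up with two outs in the ninth

p, one_in = combined_odds(p_reach, p_breakup)
print(f"Combined probability: {p:.5%}, or about 1 in {one_in:,.0f}")
print(f"Nonstop viewing time: {years_of_nonstop_games(one_in):.1f} years")
```

Because the inputs here are unrounded ratios, the last few digits of the "1 in N" figure can shift a little depending on where intermediate values get rounded.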
Other Baseball Events
The Valdez almost-no-hitter had me thinking about other cool things I’ve witnessed at baseball games. I’ve witnessed some rarer things, such as an inside-the-park home run, and more mundane things, like a triple. This prompted me to do some digging: what is the probability that, given I am headed to a baseball game, I see a certain event?
For this analysis, I’ll be using the 2015 through 2023 regular seasons, also known as the Statcast Era, as my sample. This gives us a sample size of 20,334 games. The data is extremely complete for these games, and even with rule changes such as extra-inning ghost runners, bigger bases, and the pitch clock arriving in the middle of this sample, I think it provides a good basis to compare against. Using a more current sample also relates more closely to my ballpark experience, which realistically started in 2023.
So, what are the odds that I go to a baseball game and see a...
Event | Probability | Game Total |
---|---|---|
Single | 100% | 20,334 |
Double | 95.67% | 19,454 |
Home Run (Any Team) | 88.81% | 18,059 |
Home-Team Home Run | 67.24% | 13,672 |
Triple | 26.64% | 5,417 |
10+ Strikeouts from Single Pitcher | 9.72% | 1,977 |
Extra Innings | 8.38% | 1,704 |
Walk-Off | 8.22% | 1,671 |
18+ Team Hits | 2.52% | 513 |
Complete Game Shutout | 1.09% | 221 |
Inside-the-Park Home Run | 0.51% | 103 |
Digging into this table made me realize I had witnessed another shockingly unlikely event in MLB history -- the extra-innings walk-off walk, which I saw with my buddy Lucas at Dodger Stadium. This has happened 313 times in the live-ball era sample we looked at earlier, for a likelihood of 0.17% to go to a random ballgame any time in the last century and see it. This isn’t even considering the fact the game went into the 12th inning. How unlucky am I?
Using the table above, and the estimate that I’ve been to around 40 baseball games, I can also calculate the likelihood that I would have seen at least one...
Event | 40-Game Probability | 50% Likely in... |
---|---|---|
Single | 100% | 1 Game |
Double | 100% | 1 Game |
Home Run (Any Team) | 100% | 1 Game |
Home-Team Home Run | 100% | 1 Game |
Triple | 100% | 3 Games |
10+ Strikeouts from Single Pitcher | 98.33% | 7 Games |
Extra Innings | 96.98% | 8 Games |
Walk-Off | 96.76% | 9 Games |
18+ Team Hits | 64.02% | 28 Games |
Complete Game Shutout | 35.41% | 64 Games |
Inside-the-Park Home Run | 18.38% | 137 Games |
In the above, I also included the number of games it would take for me to have a 50% chance of seeing the event occur at least once. This is a simple fact of probability -- the more “tries” one takes at an independent event, the more likely the event is to occur at least once across those tries. There are rarer events I could take this dive into, and I’ll want to do this at some point in a separate post. However, the events I’ve listed above all share a common trait: I’ve witnessed them. Whether it was at a new stadium as a part of the chase or at one of my home parks, I’ve been able to bear witness to some cool things as a baseball fan thus far.
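Both columns in the table above come from the same idea: if an event happens in a fraction p of games, the chance of missing it n games in a row is (1 - p)^n, so the chance of seeing it at least once is 1 - (1 - p)^n. Below is a small Python sketch of that calculation, assuming every game is an independent draw from the 2015 to 2023 sample. The names `EVENT_GAMES`, `at_least_once`, and `games_for_chance` are made up for the example, and only a few rows of the table are included.

```python
import math

GAMES_IN_SAMPLE = 20_334  # 2015-2023 regular-season games, per this post

# Number of sample games in which each event occurred (a few rows from the table).
EVENT_GAMES = {
    "Triple": 5_417,
    "Walk-Off": 1_671,
    "Complete Game Shutout": 221,
    "Inside-the-Park Home Run": 103,
}


def at_least_once(p, games):
    """Probability of seeing an event at least once in `games` independent games."""
    return 1 - (1 - p) ** games


def games_for_chance(p, target=0.5):
    """Smallest whole number of games giving at least `target` chance of one sighting."""
    if p >= target:
        return 1  # a single game already clears the bar (also avoids log(0) when p == 1)
    return math.ceil(math.log(1 - target) / math.log(1 - p))


for event, count in EVENT_GAMES.items():
    p = count / GAMES_IN_SAMPLE
    print(f"{event}: {at_least_once(p, 40):.2%} over 40 games, "
          f"50% likely in {games_for_chance(p)} games")
```

The early return for p at or above the target is there because any event that common only needs one game, which matches the "1 Game" rows in the table.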
If I want to finally see that no-hitter, which has occurred 245 times in our 1920 to 2023 sample for a 0.133% chance of occurring per game, I simply must go to more games. I’ve seen the extra-innings walk-off walk with around the same chance of occurring, after all. By the time I’ve gone to 520 games, I will have had a 50% chance of seeing one. So, I guess only time has the answer to my question: How unlucky am I?
Acknowledgements
I would like to thank Stew Thornley for aggregating and talking to me about his "Lost in the Ninth" post, and the various baseball tools available to me for this analysis (Baseball Savant and Baseball Reference's Stathead).