I think just about anyone reading this is familiar with Fangraphs. One of the neat things about Fangraphs is that they have Win Expectancy charts, showing the likelihood of a team winning based on similar situations (i.e. same run deficit, same inning, and same baserunner situation). Here, for example, is Game 5 of the Reds-Giants NLDS.
The game was tense in the beginning, with neither side having an advantage as both starters kept the opposition off the scoreboard. That changed with a six-run outburst in the fifth inning, and the Giants became near-locks to win the game. Of course, there was still plenty of excitement left, as Cain nearly surrendered the lead in the 6th, and the Reds again threatened in the 9th.
I've always been curious about the win expectancy in a given series, because ultimately, it's not just about winning the game (unless it's a Game 5 or 7 winner-take-all), but about winning the best-of-five or –seven series. Therefore, I've charted the chance that the Giants would win the series and advance to the NLCS as a function of each event in each game.
A brief word about methodology, and some caveats. Skip below if you just want the data dump. First: I assumed that neither team was a favorite over the other, i.e. that each had a 50% chance to win each game, regardless of home field advantage or personnel on the field. I think the former is a reasonable assumption given that they finished within three games of each other. The latter bit, I think, would be difficult, if not impossible, to correct for (how much did starting Zito hurt the Giants' chances in Game 4?). Nor does this method portion out credit for fielding. This is therefore only meant to be a rough guide to the Giants' chance of advancing at each point.
I took the play-by-play win expectancy from the Fangraphs play logs and manually entered it into Excel. I then scaled each individual game's win expectancy to the boundaries of the series. For example: at the end of Game 2, the Giants were losing 9-0, and had a 0% chance of winning the game. They therefore had a series probability of 12.5%, since they had to win the next three in a row (50% * 50% * 50% = 12.5%). If they had been winning 9-0, the series probability would have been 50%, because the Reds and Giants would have been tied 1-1 with 3 games left to play. So a 0-100% win expectancy in Game 2 maps onto a 12.5-50% series probability.
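For anyone who would rather not do this by hand in Excel, here is a rough Python sketch of the scaling described above. It assumes the same coin-flip (50%) odds for every game, and the function names `series_prob` and `series_expectancy` are my own inventions for illustration, not anything from Fangraphs.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def series_prob(wins, losses, needed=3, p=0.5):
    """Chance of reaching `needed` wins before the opponent does.

    `wins`/`losses` are the games already decided; `p` is the assumed
    chance of winning any single game (50% here, as in the post).
    """
    if wins == needed:
        return 1.0
    if losses == needed:
        return 0.0
    return (p * series_prob(wins + 1, losses, needed, p)
            + (1 - p) * series_prob(wins, losses + 1, needed, p))


def series_expectancy(game_we, wins, losses, needed=3, p=0.5):
    """Scale an in-game win expectancy (0.0 to 1.0) to the series boundaries."""
    lo = series_prob(wins, losses + 1, needed, p)  # series odds if this game is lost
    hi = series_prob(wins + 1, losses, needed, p)  # series odds if this game is won
    return lo + game_we * (hi - lo)


# Game 2: the Giants came in down 0-1 in a best-of-five.
print(series_expectancy(0.0, wins=0, losses=1))  # 0.125
print(series_expectancy(1.0, wins=0, losses=1))  # 0.5
```

A 50% win expectancy in the middle of Game 2 would land halfway between those two boundaries, at 31.25%.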
A natural consequence of the probabilities scaling in such a manner is that in any winner-take-all game, the Series Expectancy is the same thing as the Win Expectancy within the game. Also, events earlier in the series, or for the team already ahead in the series, tend to be downplayed. Intuitively, I think that makes a lot of sense. We all know that going up 1-0 in the 1st inning isn't as meaningful as going up 1-0 in the 9th inning. Moreover, that first run is far more important than a sixth run.
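To put rough numbers on that, the sketch below computes how much series probability is up for grabs in a single game for a few series states, again assuming 50% odds per game in a best-of-five. The helper names `advance_prob` and `game_swing` are hypothetical, made up for this illustration.

```python
def advance_prob(wins, losses, needed=3):
    """Chance of winning the series from a given record, at 50% per game."""
    if wins == needed:
        return 1.0
    if losses == needed:
        return 0.0
    return 0.5 * advance_prob(wins + 1, losses, needed) \
         + 0.5 * advance_prob(wins, losses + 1, needed)


def game_swing(wins, losses, needed=3):
    """Width of the series-probability range that one game spans."""
    return advance_prob(wins + 1, losses, needed) - advance_prob(wins, losses + 1, needed)


# (wins, losses) going into the game
for record in [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]:
    print(record, game_swing(*record))
```

At 2-2 the swing is the full 1.0, so the in-game win expectancy and the series expectancy coincide. At 0-0 a whole game only moves the series by 0.375, which is why the same play counts for less in Game 1 than in Game 5.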
Alright, now that I've laid all that out, let's go ahead and take a look at the Series Expectancy. I think the legend should be relatively clear, but for the most part I've tried to note what I thought were the biggest plays at the time, especially plays that would've stood out in anyone following the game.
Since I had all the data available, I decided to also see what the biggest changes in Series Probability added were. These aren't necessarily pro-Giants, but you'll see that most of them are.
1) Crawford RBI triple, Game 5, +20.4% (was surprised this was number 1)
2) Hanigan SO, Bruce CS double play, Game 5, +14.3% (initially tabulated this as two plays since FG split up the plays as Hanigan SO and Rolen SB followed by Bruce CS)
3) Posey grand slam, Game 5, +13.1%
4) Bruce flies out to LF in B9, Game 5, +10.3%
5) Ludwick RBI single in B9, Game 5, -9.4%
6) Blanco 2R HR, Game 4, +8.7%
7) Rolen K’s to end Game 5, +8.5%
8) Phillips 2R HR, Game 1, -8.4%
9) Posey scores on Rolen’s error, Game 3, +8.3%
10) Rolen single to center, Bruce to 2B, Game 5, -7.8%
So 7 of the top 10 are from Game 5. Not surprising, given that it was winner-take-all.
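If you have the series expectancy after every play, ranking the swings is just a matter of taking differences between consecutive values and sorting by size. Here is a minimal sketch; the function name `biggest_swings` is hypothetical, and the play labels and numbers in the example are made-up placeholders, not the actual NLDS values.

```python
def biggest_swings(start, plays, top=10):
    """Rank plays by the absolute change in series expectancy they caused.

    `start` is the series expectancy before the first play, and `plays`
    is an ordered list of (label, series_expectancy_after_play) pairs.
    Returns (label, change) pairs, largest swing first.
    """
    swings = []
    previous = start
    for label, after in plays:
        swings.append((label, after - previous))
        previous = after
    return sorted(swings, key=lambda s: abs(s[1]), reverse=True)[:top]


# Placeholder data for illustration only (not real play-by-play values).
example = [("play A", 0.55), ("play B", 0.40), ("play C", 0.42)]
for label, change in biggest_swings(0.50, example, top=3):
    print(f"{label}: {change:+.1%}")
```

Sorting by absolute value is what lets the anti-Giants plays show up in the same list with negative signs.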
Let's take a look at some position players, their traditional slash lines in the NLDS, and their series probability added.
Buster Posey: .211/.318/.526 (22 PA), +5.8%
Joaquin Arias: .500/.500/.833 (6 PA), +12.7%
Hunter Pence: .200/.200/.200 (20 PA), -5.8%
Pablo Sandoval: .333/.318/.571 (22 PA), +3.1%
Brandon Phillips: .375/.360/.625 (25 PA), -5.0%
Ryan Ludwick: .333/.455/.833 (22 PA), -9.7%
(The last two were good for the Reds, and their negative scores reflect bad outcomes for the Giants' probabilities but good ones for the Reds' chances.)
Also noteworthy: Aubrey Huff was the least bad Giants hitter in Game 2, with a 0.000 WPA. Well done, Giants hitters! Also, Arias gets a ton of credit by this method for the Rolen error scoring Posey. That's a good example of one limitation of this method.
And now for some random pitchers:
George Kontos: 3.2 IP, 2 H, 2 BB, 0 R, 0 ER, 2 K, +4.1%
Mat Latos: 8.1 IP, 11 H, 2 BB, 7 R, 6 ER, 5 K, +21.8%
Sean Marshall: 4.0 IP, 0 H, 0 BB, 0 R, 0 ER, 3 K, -8.3%
Santiago Casilla: 3.1 IP, 6 H, 1 BB, 2 R, 1 ER, 5 K, -2.4%
Bronson Arroyo: 7.0 IP, 1 H, 1 BB, 0 R, 0 ER, 4 K, -9.8%
Tim Lincecum: 6.1 IP, 3 H, 0 BB, 1 R, 1 ER, 8 K, +6.2%
In conclusion, Mat Latos, not Buster Posey, was the MVP for the Giants.
Sorry for the formatting mess, and please let me know what you think in the comments!