What's Wrong With OPS
The metric "OPS" (On-base Plus Slugging) is commonly used as a measure of both player and team performance. While it is clearly better than the other "standard" metrics, it is far from being a perfect, or even really good, measure, and it's worth understanding how and why it comes up short.
1. It adds things that should be multiplied.
The basics of run-scoring are to first get a man on base, then drive him in. (One conceptual failure in many run-scoring analyses is a failure to recognize batters as being, in essence, a runner on zeroth base, who can be driven in by a home run.) There are, that is, two components to the process: an on-base event and an RBI event. To get the probability of two discrete events both occurring, one multiplies the individual probabilities of those events. OPS is able to approximate the correct approach only because the actual range of most statistics in baseball is rather small--no one bats .003 or .792 over a season. But the difference can nevertheless be significant, especially in comparisons. Consider two players, with these stat lines:
Player   PA   BB   AB    H   TB
A       600   24  576  156  230
B       600   60  540  180  243
(We assume, for simplicity, no SF, SH, HBP, or CI.)
Thus, Player A has an OBA of .300 and SA of .400; Player B has an OBA of .400 and SA of .450.
OK, Player B is obviously much better than Player A--but how much better? Let's call the product of OBA times SA the "OTS" (On-base Times Slugging):
Player A: OPS = .700; OTS = .120
Player B: OPS = .850; OTS = .180
OPS says B performs about 21.4% better than A. But OTS says B performs 50% better. Quite a difference! (And one that, while extreme, is well within the range of plausible real-world values.) So adding instead of multiplying causes the true scale of differences to be compressed, sometimes badly.
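The arithmetic above can be checked with a short Python sketch. The `Batter` class and its property names are illustrative only (they come from no published library), and the sketch uses the same simplification as the table: PA = AB + BB, with no SF, SH, HBP, or CI.

```python
from dataclasses import dataclass


@dataclass
class Batter:
    """Season line under the simplification PA = AB + BB (hypothetical helper)."""
    pa: int
    bb: int
    ab: int
    h: int
    tb: int

    @property
    def oba(self) -> float:
        # On-base average: times on base per plate appearance.
        return (self.h + self.bb) / self.pa

    @property
    def sa(self) -> float:
        # Slugging average: total bases per at-bat.
        return self.tb / self.ab

    @property
    def ops(self) -> float:
        # On-base Plus Slugging: the two rates added.
        return self.oba + self.sa

    @property
    def ots(self) -> float:
        # On-base Times Slugging: the two rates multiplied.
        return self.oba * self.sa


player_a = Batter(pa=600, bb=24, ab=576, h=156, tb=230)
player_b = Batter(pa=600, bb=60, ab=540, h=180, tb=243)

ops_edge = player_b.ops / player_a.ops - 1  # B's advantage as measured by OPS
ots_edge = player_b.ots / player_a.ots - 1  # B's advantage as measured by OTS

print(f"OPS: A={player_a.ops:.3f}  B={player_b.ops:.3f}  edge={ops_edge:.1%}")
print(f"OTS: A={player_a.ots:.3f}  B={player_b.ots:.3f}  edge={ots_edge:.1%}")
```

Because the sketch works from unrounded rates (Player A's SA is 230/576, a shade under .400), the percentages it prints differ by a fraction of a point from the rounded figures in the text.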
2. Slugging Average is a mediocre marker of RBI events.
SA has exactly the same defect as Batting Average: its basis is at-bats, not plate appearances. At-bats are an artificial construct having no relation to anything in the real world. When we are looking for what drives in runners on base, and make the assumption (itself somewhat faulty) that it is hits, and in particular the "size" of those hits, what we need to consider is Total Bases per total Plate Appearances. (That metric, which I have used for decades, I call, with no great ingenuity, "TBA", as parallel with OBA.)
In the example above, by simple arithmetic we find that Player A has a TBA of .383, while Player B has a TBA of .405. Now we can compare them thus (where we use "OXT" for On-base Times TBA, because OTT has other significances, from a player's name to "over the top"):
Player A: OPS = .700; OTS = .120; OXT = .115
Player B: OPS = .850; OTS = .180; OXT = .162
That suggests that Player B is actually outperforming Player A by about 41%. That is closer to what OTS shows than what OPS does, but is more meaningful yet than OTS.
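As a rough check of the TBA and OXT figures, here is a minimal Python sketch. The function names `oba`, `tba`, and `oxt` are made up for this illustration, and PA is taken as AB + BB, as in the example.

```python
def oba(h: int, bb: int, pa: int) -> float:
    """On-base average: (hits + walks) per plate appearance."""
    return (h + bb) / pa


def tba(tb: int, pa: int) -> float:
    """Total bases per plate appearance (not per at-bat, unlike SA)."""
    return tb / pa


def oxt(h: int, bb: int, tb: int, pa: int) -> float:
    """On-base Times TBA."""
    return oba(h, bb, pa) * tba(tb, pa)


# Stat lines of Players A and B from the example.
oxt_a = oxt(h=156, bb=24, tb=230, pa=600)
oxt_b = oxt(h=180, bb=60, tb=243, pa=600)
edge = oxt_b / oxt_a - 1  # B's advantage as measured by OXT

print(f"OXT: A={oxt_a:.3f}  B={oxt_b:.3f}  edge={edge:.1%}")
```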
Another defect is the implicit assumption that Total Bases properly represents RBI events. That is wrong on two counts: first, there are other events that have some RBI value (and remember that "RBI value" actually signifies moving runners along, not necessarily scoring them directly), notably walks; second, the RBI-factor value of hits is not in simple proportion to their bases value. To put that second point more simply: a triple is not 50% more valuable in moving runners along than is a double (3 TB versus 2 TB), but that is what using TB assigns. Getting reasonable multipliers for the true "RBI-event" values for extra-base hits is a complicated matter with no definitive methodology. Nonetheless, if we are simply looking for a better but still not wildly complex metric, TB alone suffices, provided we use it as TBA, not SA.
3. The "compound-interest" value of OBA is understated.
If you look at most "runs created" formulae--virtually all that do not derive from "linear weights"--you will see that at bottom they have that same basic arrangement: on-base rate times total-base rate times plate appearances. Bill James' original "Runs Created" is exactly so--
$$\text{Runs} = \frac{H + BB}{AB + BB} \times \frac{TB}{AB + BB} \times (AB + BB)$$
--though it is usually just presented as its simplified algebraic equivalent:
$$\text{Runs} = \frac{(H + BB) \times TB}{AB + BB}$$
If you look, you see that in the expanded version, the first element is OBA, the second is what I have called TBA, and the third is PA. (This uses the same simplification I mentioned above, to wit, ignoring the minor data SF, SH, HBP, and CI.)
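A small Python sketch can confirm that the expanded and simplified forms agree. The function names are hypothetical, and feeding in the two example players' lines is purely for illustration.

```python
def runs_created_expanded(h: int, bb: int, tb: int, ab: int) -> float:
    """Basic Runs Created written as OBA x TBA x PA, with PA = AB + BB."""
    pa = ab + bb
    on_base_rate = (h + bb) / pa   # OBA
    total_base_rate = tb / pa      # TBA
    return on_base_rate * total_base_rate * pa


def runs_created_simplified(h: int, bb: int, tb: int, ab: int) -> float:
    """The algebraically equivalent form: (H + BB) x TB / (AB + BB)."""
    return (h + bb) * tb / (ab + bb)


# Stat lines of Players A and B from the earlier example.
rc_a = runs_created_simplified(h=156, bb=24, tb=230, ab=576)
rc_b = runs_created_simplified(h=180, bb=60, tb=243, ab=540)

print(f"Runs Created: A={rc_a:.1f}  B={rc_b:.1f}")
```

Both functions return the same value for any line with AB + BB greater than zero; the expanded one only makes the three factors visible.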
It is, as I have said, correct to postulate that the product of OBA and some RBI-production metric approximates the probability that any given batter will eventually become a run. And, if you then multiply that probability times the total number of men who come to the plate, you get actual runs. But . . . a critical point is that the higher the OBA, the more men who will come to the plate, in a game, in a series, over the season. What is fixed is not the number of plate appearances, but the number of outs: and the higher the OBA, the less the chance of a given batter making an out, so the more batters it takes to reach the available total of outs. So OBA influences both the probability that a given batter will become a run scored and the number of men who will come to the plate so as to have that chance.
That "compound-interest" effect is the cause of most "runs created" formulae tending to go wrong for teams with great amounts of power. It is not that the usual formulae over-estimate power, but that they under-estimate OBA. And neither OPS nor either of the two better metrics I cited above can correct for that; only a more complete equation that takes into account the effect of OBA on total plate appearances can do the job properly.
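To make the fixed-outs point concrete, here is a minimal Python sketch. It rests on a strong assumption that is not part of the argument above: every plate appearance independently ends in an out with probability 1 - OBA, and no outs are made on the bases. Under that assumption the expected number of PA needed to use up 27 outs is 27 / (1 - OBA). The function names and the "runs index" (OBA x TBA x PA, not a calibrated run estimator) are hypothetical.

```python
OUTS_PER_GAME = 27  # nine innings of three outs


def expected_pa(oba: float, outs: int = OUTS_PER_GAME) -> float:
    """Expected plate appearances needed to use up `outs` outs.

    Assumption: each PA independently ends in an out with probability
    1 - OBA, and no outs are made on the bases.
    """
    if not 0.0 <= oba < 1.0:
        raise ValueError("oba must be in [0, 1)")
    return outs / (1.0 - oba)


def runs_index_fixed_pa(oba: float, tba: float, pa: float) -> float:
    # OBA x TBA x PA with PA held constant.
    return oba * tba * pa


def runs_index_fixed_outs(oba: float, tba: float, outs: int = OUTS_PER_GAME) -> float:
    # Same product, but PA grows with OBA because outs are what is fixed.
    return oba * tba * expected_pa(oba, outs)


# Team-level rates borrowed from Players A and B in the earlier example.
low = {"oba": 0.300, "tba": 230 / 600}
high = {"oba": 0.400, "tba": 243 / 600}

pa_low = expected_pa(low["oba"])
pa_high = expected_pa(high["oba"])

ratio_fixed_pa = runs_index_fixed_pa(**high, pa=pa_low) / runs_index_fixed_pa(**low, pa=pa_low)
ratio_fixed_outs = runs_index_fixed_outs(**high) / runs_index_fixed_outs(**low)

print(f"PA per 27 outs: {pa_low:.1f} vs {pa_high:.1f}")
print(f"edge with PA fixed:   {ratio_fixed_pa - 1:.1%}")
print(f"edge with outs fixed: {ratio_fixed_outs - 1:.1%}")
```

Under these assumptions, the same OBA gap that gives roughly a 41% edge with PA held fixed gives roughly a 64% edge with outs held fixed, because the higher-OBA side sends more men to the plate.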
OPS became popular for the simple reason that it is awfully easy to compute: OBA and SA are both commonly published metrics, for both teams and individuals, and adding them is apple-pie easy. But really, is it so much harder to figure OXT? PA is also a commonly published stat; all that is involved, in this day of hand-held calculators, is to divide TB by PA, then multiply by OBA. (Baseball America, after asking Billy Beane about the formula the A's used: "Then, smiling awkwardly, he adds: 'It had division in it.'")
This FanPost is reader-generated, and it does not necessarily reflect the views of McCovey Chronicles. If the author uses filler to achieve the minimum word requirement, a moderator may edit the FanPost for his or her own amusement.
Comments
I think there’s a problem with: “OPS says B performs about 21.4% better than A. But OTS says B performs 50% better. Quite a difference! (And one that, while extreme, is well within the range of plausible real-world values.) So adding instead of multiplying causes the true scale of differences to be compressed, sometimes badly.” and the underlying math.
Comparing .850 to .700 doesn’t make sense because the scale of ability to get and keep a major league job doesn’t start at .000 in terms of OPS. For a position player, it starts at replacement level. WAGging that RL OPS is around .600, the difference is between .250 and .100, or 150%.
Fred Lewis can stand under my umbrella.
31 May 2007, 21:38 EST - the last time Matteh's career W-L wasn't below .500
We are at war with Los Angeles. We have always been at war with Los Angeles.
Lowering the Quality of Internet Discourse Since 1985™
by S.F. Giangst on Oct 19, 2009 4:00 AM PDT
So?
Whether or not that is a reasonable way to look at things, the base fact remains that OPS seriously—and falsely—contracts the range of variation between players. After all, the same remarks apply to the multiplicative form.
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 1:52 PM PDT
Almost every problem with OPS led to using linear weights and wOBA. I’d just recommend using this.
Still the loving, adoptive father of Hector Sanchez. And who doesn't love switch-hitting catchers with power and patience?
by tedfordfan on Oct 19, 2009 6:12 AM PDT
I wouldn't.
Linear weights has a whole separate set of problems of its own, which I will not digress to analyze. (Anyway, Bill James did it many years ago.)
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 1:55 PM PDT
Is any of this discussion online somewhere? I’d be interested to read about it.
by Missing Barry on Oct 19, 2009 2:05 PM PDT
I doubt it.
It was in one of his annuals. Right now, I’m parked in front of my TV watching the Angels complete their self-destruction, but I’ll try to take a look in my library between innings.
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 2:47 PM PDT
Sorry.
I can’t find it by just a quick scan of tables of contents. If I get time later today (which, to be honest, is unlikely), I’ll try more detailed combing. But I remember that James, while staying polite, was rather harsh about his disagreement.
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 3:35 PM PDT
Depends
What are we trying to measure here? Individual hitters or an entire team?
If we’re looking to measure team runs, Linear Weights aren’t great. Base Runs tends to have the lowest RMSE. LWTS works better for individual hitters than BsR or RC: with those dynamic estimators, the “B” and “C” rates can help to overrate good hitters and underrate others. Unless you’re using a theoretical team approach, which James is finally beginning to adopt. Even then, there are still issues with the framework.
James is notorious for disliking linear weights, but it’s one of the better ways to estimate an individual hitter’s contributions.
by Anticon23 on Oct 19, 2009 3:13 PM PDT
tl;dr
Short answer: DOESN’T FACTOR IN GAMER-TUDE.
STEVE HOLM! refuses to be the odd man out.
by UnleashTheGore on Oct 19, 2009 7:47 AM PDT
What’s OPS?
Signed Gianta Management
FIRE BRUCE BOCHY NOW!!!!!!
AND TAKE BRIAN SABEAN WITH HIM!!!!!
by 49er16 on Oct 19, 2009 8:09 AM PDT
Making the assumption that this posting’s focus upon OPS is meant to be:
A. Evidence of the Giants’ failings…
B. More specifically, evidence of Sabean’s failings…
My opinion would be that team OPS is more important than individual OPS. Clearly the Giants were one of the worst run scoring teams in MLB, and at my last glance, had the worst team OPS in MLB. So small sample size supports that, and I will trust that Bill James, Sandy Alderson, Paul DePodesta, and Billy Beane did the homework properly and say that this is within expected trends. And since any discussion about runs scored, and OPS, and team building goes back to Moneyball and Sabermetrics, it would be incorrect to use PA instead of AB in statistical analysis since any out is considered to be bad. So your use of PA is bad Moneyball/Sabermetrics.
Based purely upon (runs scored = success) the Giants are an anomaly. The record speaks for itself. The Giants had the 9th best record in MLB this year out of 30 teams. However, I would say that more correctly they were 5th in the NL this year out of 16 teams since these were the teams they played mostly head to head. So, actually I don’t think (runs scored = success) is true. How about:
$$\frac{\text{Team } (RS - RA)}{\text{MLB } (RS - RA)} = \text{Win\%}$$
This takes into account runs allowed as being an important ingredient. Since, in a baseball game, the team that scores more runs than the other wins the game, I think this is more accurate. Plus, a positive delta runs will be directly proportional to win%. Since we are now also considering pitching and fielding, it more accurately measures a successful team.
I think the Giants have a huge amount of room for improvement offensively and I am not very encouraged by Bochy’s commitment to more small ball, nor Sabean’s thinking that the addition of a patient hitting role model will help the younger players morph themselves into OBA guys. However, this team is much better than simple offensive analysis tends to say.
If you were simply trying to wonk out on OPS, then fuck it, have a good time. Your post is way too complicated for me.
by toofruss on Oct 19, 2009 9:57 AM PDT
My opinion would be that team OPS is more important than individual OPS
Just to elaborate on this besides the obvious point that the total team RS is what mattered – when looking at the failings of OPS as a stat, team OPS is better than individual player OPS for simple sample size reasons. Team OPS, due to a much larger sample of PA’s and multiple players with different skills, will tend to be closer to the mean in terms of OBP and SLG as individual components – that is, you won’t find extreme cases of low OBP/high power or high OBP/low power that you find in individuals, and these are the individuals that OPS incorrectly values the most.
$$\frac{\text{Team } (RS - RA)}{\text{MLB } (RS - RA)} = \text{Win\%}$$
I don’t know what the exact pythagorean formula people use is, but that’s basically the concept. Run differential is a much, much better predictor of a team’s future success (and thus a more accurate reflection of the team’s actual skill level) than simple W-L.
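For reference, the usual Pythagorean expectation (Bill James's original form, with an exponent of 2) estimates winning percentage as RS² / (RS² + RA²), a ratio of squares rather than the run-differential ratio quoted above. A minimal Python sketch follows; the function name and the season totals in it are made up for illustration and are not any team's actual figures.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Estimated winning percentage from runs scored and runs allowed."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals must be non-negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("at least one run total must be positive")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical season totals, chosen only to show the calculation.
win_pct = pythagorean_win_pct(runs_scored=700, runs_allowed=650)
expected_wins = win_pct * 162  # over a 162-game schedule

print(f"expected win% = {win_pct:.3f}, about {expected_wins:.0f} wins")
```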
However, this team is much better than simple offensive analysis tends to say.
Not sure what you’re trying to say here. If you’re referring to the fact that the Giants are excellent at preventing runs because of their pitching and defense, I think everyone agrees (and we know that’s half of the battle in winning games), but their offense really is as bad as the most extreme “lunatic fringer” thinks. The Giants were dead last in wOBA, and it wasn’t even close. .305 for the team. The next lowest were the Pirates and Padres at .310. The Yankees were at .366. The median NL team average was the Diamondbacks and Cardinals at .324 and .325.
it would be incorrect to use PA instead of AB in statistical analysis since any out is considered to be bad. So your use of PA is bad Moneyball/Sabermetrics
???
by Missing Barry on Oct 19, 2009 10:31 AM PDT
but their offense really is as bad as the most extreme "lunatic fringer" thinks. The Giants were dead last in wOBA, and it wasn’t even close. .305 for the team. The next lowest were the Pirates and Padres at .310. The Yankees were at .366. The median NL team average was the Diamondbacks and Cardinals at .324 and .325.
I thought I had agreed with this when I said that the Giants have room to improve offensively. I actually think that if they moved close to average offensively that they would be a dominant team. IMHO, the pitching prevented sustained losing streaks, but the hitting prevented sustained winning streaks. If we moved up toward league average in hitting our 8-2 clips would be sustained longer.
The Giants delta runs was pretty decent this year. Not dominant like LA or Philly, but decent.
I just read Moneyball. Outs are bad. The difference between PA and AB is usually events which result in an out. So PA is not good Moneyball.
by toofruss on Oct 19, 2009 10:50 AM PDT
Well, for the last part, most of the difference between PA’s and AB’s is walks, I think, at least for most players. Second, even assuming a lot of them are outs, that helps the point that you need to use PA’s. PA’s take those outs into account, while AB’s do not count them.
by Missing Barry on Oct 19, 2009 11:11 AM PDT
Hm.
I don’t quite follow the reasoning in some of the comments above.
Runs scored and runs allowed have exactly equal value in determining wins. How wins are derived from R and OR values can be reckoned in a number of ways, all of which are rough approximations to a very complicated mathematical formulation, but approximations plenty good enough to live with.
So PA is not good Moneyball. Say what? I think perhaps I might say that from at least a couple of aspects you are incorrect.
As to the Giants’ offense, based on what they did on the field (as opposed to projections from career stats), they were only 9 runs above projection, which isn’t much; thus, they truly were—excluding sheer chance—the weakest offense in MLB.
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 3:05 PM PDT
I never said the Giants offense was better than it was. I actually said it was the worst. However I did say that a baseball game is won by scoring more runs than the opponent. The Giants did that effectively this year, and based upon the fundamental make-up of the team’s pitching there is no reason to expect them to not be able to repeat this, barring an injury to one or two of the starters.
I did say that the Giants are better than most people give them credit for being, because the focus of these discussions, and the OP, is usually offense. I went on to say that because our pitching and defense are better than average, we can be a dominant team with a few upgrades to the offense.
I also said that I do not have faith that Sabean will be able to pull this off, because he said that the presence of a patient hitter will influence the rest of the younger players. Which leads me to believe he is open minded about that approach, but not committed to making the amount of changes that need to be made. Also, Bochy talked about small ball. Not a good way to start the off-season in my opinion.
Yes, walks and HBP are “non out events”. But bunting and sacrifices, which are PA, are “out events”. An out is bad moneyball. That is one of the fundamentals of moneyball.
It would be better to make a new statistic like “non out events”, instead of using PA. Compromise?
by toofruss on Oct 20, 2009 9:24 AM PDT
The Giants did that effectively this year, and based upon the fundamental make-up of the team’s pitching there is no reason to expect them to not be able to repeat this, barring an injury to one or two of the starters.
I did say that the Giants are better than most people give them credit for being, because the focus of these discussions, and the OP, is usually offense. I went on to say that because our pitching and defense are better than average, we can be a dominant team with a few upgrades to the offense.
Gotta disagree with all of this, basically. We’re likely to not pitch as well next year. That’s simply the concept of regression to the mean – a team is likely to move in the direction of the average in the future. We also weren’t an 88 win team this year. Focusing on the whole team, we got good situational play (which is generally thought of as out of a team’s control, so we can’t count on a repeat performance) in both scoring runs and preventing runs. On average, we should have scored fewer runs than we did and prevented fewer runs than we did (meaning we win fewer than 88 games). So I think we’re less good than our 88 win record indicates…
Yes, walks and HBP are "non out events". But bunting and sacrifices, which are PA, are "out events". An out is bad moneyball. That is one of the fundamentals of moneyball.
It would be better to make a new statistic like "non out events", instead of using PA. Compromise?
I think you’re just mixed up right now. We really agree on this point – outs are very bad. Increasing your denominator from AB’s to PA’s makes the offensive stat less good – this is because those outs are bad, so we are factoring the outs in there like you want. Using PA’s gives us the results you’re going for.
by Missing Barry on Oct 20, 2009 9:41 AM PDT
not to mention that “barring an injury to one or two of the starters” is a pretty friggin’ fragile bar. All teams suffer injuries to a starter somewhere through the season. Given that we don’t have really any good candidate for a 5th starter on the team as of yet (and the FA market is really thin in starting pitching this year) I’d sure hate to see what we’d have to come up with if one of the other 4 went down for any extended period.
My Bucardo is better than yours.
A hot August weekday, before a small crowd, when the only thing at stake is the tissue-thin difference between a thing done well and a thing done ill. Insofar as the clutch hitter is not a sportswriter's myth, it is a vulgarity, like a writer who writes only for money.
by Roger on Oct 20, 2009 1:33 PM PDT
The point is . . .
. . . that we are not ancient Greeks: we do not settle differences of opinion by the criterion of who argues most glibly. We are moderns, living in the age of science, and we settle differences of opinion in the laboratory: we experiment to see what actually works. Certain measures are quite good at actually deriving—accurately—true, real-world total runs scored by teams from those teams’ basic performance stats. Any metric that does that has demonstrated its validity.
The full metric is too complicated for anything but a computer run, but simplifications of it that can be calculated with relative ease will give numbers that are fairly useful. One such is multiplying OBA by TBA, where TBA is Total Bases per PA. That is, as I say, not something invented ab origine, but rather a modest simplification of a proven full metric.
(There is a link in one or another of these posts to a graph and discussion of extensive results from the full metric.)
Professional baseball analyst since 1980.
by owlcroft on Oct 20, 2009 4:09 PM PDT
The vast majority of PA’s that don’t count as AB’s are walks and HBP’s, neither of which results in outs.
HA HA HA LOOK AT ME I'M ALL HAPPY AND STUFF NO REALLY CAN WE STOP WITH THE COOKYMAN IS SAD JOKES?
:-) :-) :-)
by Cookyman on Oct 19, 2009 3:50 PM PDT
Math hurts my brain.
"Being a McCoven is like being a member of the Green party. It’s powerlessness is part of the appeal." - oldjacket
by scout6 on Oct 19, 2009 10:04 AM PDT
Think about it:
Jack Paar once famously said that he had yet to watch a television program he didn’t like . . . .
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 9:07 PM PDT
wOBA
Still in despair.
"Use the stencil! Do it!"
konakona: "Being told to go learn from Tsukasa... somehow I feel like I've really lost."
Shun Kakazu: MOAR JAPANESE PROSPECTS PLZ
by Zetsuboushita on Oct 19, 2009 10:30 AM PDT
I never understood OPS’s popularity when AVG/OBP/SLG tells you so much more with one glance.
by Grant on Oct 19, 2009 10:42 AM PDT
and the fact that OPS is just OBP+SLG
I R 5
by say hey nation on Oct 19, 2009 10:46 AM PDT
From The Good Phight
Carlos Ruiz has been great in the post-season, but it hasn’t been consistent. He’s hit well in the World Series. Against the Rays, he posted a .375/.500/.688 triple-slash line for a 1.188 OPS. And he’s been great in the NLCS. Combining this year and last year, he has a .417/.500/.625 line for a 1.125 OPS. But, in his three NLDS’s, Ruiz has been almost invisible. He has hit .222/.300/.250 for a .550 OPS over the past three years.
Is giving us both the “slashies” and OPS necessary? I don’t think so.
/isn’t a big deal
I R 5
by say hey nation on Oct 19, 2009 11:49 AM PDT
Saves us from having to do the addition ourselves…
by Missing Barry on Oct 19, 2009 11:50 AM PDT
One more glance?
How dare you cut into my laziness!
Also, OPS is pretty much fine on the team level.
Aaron King is still my homeboy... iffy mechanics and all
McFAQ for all you newcomers out there.
GET THAT VORP AND WHIP SH!T OUTTA HERE!!!
by baetown415 on Oct 19, 2009 10:50 AM PDT
“AVG/OBP/SLG” isn’t catchy enough.
"It's too late now."
by ResDog on Oct 19, 2009 11:07 AM PDT
Slashies! Check his slashies! What are his slashies?
Slashies!
by Grant on Oct 19, 2009 11:08 AM PDT
Please rank the 2009 San Francisco Giants by slashies.
Meet my new son: Sundrendy Windster, on the Curacao-SF express (via Arizona).
by EliminateMe on Oct 19, 2009 1:57 PM PDT
Using the word “slashy” excessively might get you some unwanted search engine hits.
Please hit better, Randy Winn.
by oldjacket on Oct 19, 2009 3:03 PM PDT
Guys love the OPS cause it gives it up easy.
GROUGTHINK ALERT
The first Chester Arthur fanboy ever.
by groug on Oct 19, 2009 11:08 AM PDT
Secretly, though, OPS is resentful because we just see the O as another hole. And part of the P. And we can work with the S if we have to.
But it won’t say anything or stop being a big stats slut because it likes being liked too much.
Context, people. More context is good. Less context is bad. If you're willing to be reductive, then you're willing to be wrong.
by howtheyscored on Oct 19, 2009 11:39 AM PDT
Man, I am happy that I am NOT howies GF!
I R 5
by say hey nation on Oct 19, 2009 11:45 AM PDT
The best part about screwing around with OPS is that the girlfriend never has to know…
Context, people. More context is good. Less context is bad. If you're willing to be reductive, then you're willing to be wrong.
by howtheyscored on Oct 19, 2009 1:09 PM PDT
P.S. Infidelity is bad, and I would never partake in it. No matter how many holes happen to be involved.
Context, people. More context is good. Less context is bad. If you're willing to be reductive, then you're willing to be wrong.
by howtheyscored on Oct 19, 2009 1:10 PM PDT
If you were promised four holes, you wouldn’t be tempted?
GROUGTHINK ALERT
The first Chester Arthur fanboy ever.
by groug on Oct 19, 2009 4:00 PM PDT
Wait… four? Seriously? I thought three was the max. Oh man… I need to go think for a while.
Context, people. More context is good. Less context is bad. If you're willing to be reductive, then you're willing to be wrong.
by howtheyscored on Oct 19, 2009 4:42 PM PDT
It requires multiple personalities.
And damned willing ones at that.
Ya know...ignorance really IS bliss.
Well - I do , anyway.
by victor frankenstein on Oct 19, 2009 7:35 PM PDT
That’s 3 numbers. We like stats that sum it up in 1 number, especially when we’re talking to people face to face instead of on a nerdy blog.
by Missing Barry on Oct 19, 2009 11:12 AM PDT
Eric Walker?
Aaron King is still my homeboy... iffy mechanics and all
McFAQ for all you newcomers out there.
GET THAT VORP AND WHIP SH!T OUTTA HERE!!!
by baetown415 on Oct 19, 2009 10:33 AM PDT
Yes?
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 1:58 PM PDT
Just wondering
Aaron King is still my homeboy... iffy mechanics and all
McFAQ for all you newcomers out there.
GET THAT VORP AND WHIP SH!T OUTTA HERE!!!
by baetown415 on Oct 19, 2009 3:42 PM PDT
That really threw me for a second. I went to High School with an Eric Walker (unrelated).
My Bucardo is better than yours.
A hot August weekday, before a small crowd, when the only thing at stake is the tissue-thin difference between a thing done well and a thing done ill. Insofar as the clutch hitter is not a sportswriter's myth, it is a vulgarity, like a writer who writes only for money.
by Roger on Oct 19, 2009 2:13 PM PDT
There are an awful lot of Eric Walkers.
As Google will quickly reveal.
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 2:19 PM PDT
It’s popular because it’s stable, easy to calculate and correlates with runs scored on a team level very well. There’s lots of ways to improve on it, but I’d wager that the differences between the various methods aren’t huge.
Please hit better, Randy Winn.
by oldjacket on Oct 19, 2009 11:23 AM PDT
This is explained in detail here
Weighted On Base Average or wOBA
On-base percentage is a great statistic because it tells you something important, and in a clear language: at what rate did this player reach base? It doesn’t tell you how far he reached base (second base? third? home?), but only whether he did or did not.
Slugging percentage is another great statistic because it tells you something important, and in a clear language: how many bases did the batter gain for himself per at-bat? It doesn’t consider walks as either a positive or negative event (it simply strips them away as if they don’t exist). It also tries to establish the importance of the single and HR by weighting the HR four times as much as the single.
We have one statistic that is deficient in one area, and another one that is deficient in another. Why not simply combine them as: OBP plus SLG, and call it this new-age statistic named OPS? Might this statistic allow the deficiencies in OBP and SLG to cancel each other out? Let’s see.
From the preceding section, we know the run values of each event. For example, we know that the run value of the HR is 1.4 runs above average, and 1.7 runs above the run value of the out. In rate measures, like OBP, the value of the out in the numerator is zero. If we recast the run values of the most common events relative to the out (rather than relative to the result of an average plate appearance), we get the following:
HR 1.70, 3B 1.37, 2B 1.08, 1B 0.77, NIBB 0.62.
Those numbers are the values of each of our events (again, relative to an out, which now has a value of zero). If we apply these weights to the statistics of a league-average hitter, and divide by plate appearances, we end up with a rate of almost 0.300. This is a fairly convenient number for an average, but we can do better. Since we like OBP as a measure of a batter’s effectiveness, let’s scale our new statistic so that the resulting values are similar to OBP values. It turns out that, if we add 15% to this 0.300 figure, we get the league-average OBP. Therefore, we will add 15% to the weights of each event and define our new statistic as follows:
(0.72×NIBB + 0.75×HBP + 0.90×1B + 0.92×RBOE + 1.24×2B + 1.56×3B + 1.95×HR) / PA
Note: Depending on the specific analysis, the PA term (plate appearances) may exclude bunts, IBB, and a few of the more obscure plays.
Do we really need another statistic? Yes, we do. Instead of trying to take two statistics (OBP, SLG) and combine and correct their flaws in the hopes of getting one number, we prefer to start from scratch. Furthermore, by recasting the number onto the OBP scale, it makes it much easier for the reader to get a grasp on the number. wOBA is weighted on-base average (we call it an average rather than a percentage). When you look at wOBA numbers throughout the book, just think OBP, and you’ll be fine. In other words, an average hitter is around 0.340 or so, a great hitter is 0.400 or higher, and a poor hitter would be under 0.300.
If you are a little more experienced with run values, you might have figured out the following:
Run value per PA above average = (wOBA for player – wOBA for league) / 1.15
So the run value chart, which we presented in the previous section, and the wOBA statistic defined in this section are directly related.
http://www.insidethebook.com/woba.shtml
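The quoted wOBA formula and its run-value conversion can be sketched in a few lines of Python. The weights and the 1.15 scale factor are copied from the excerpt above; the function names, the sample stat line, and the .340 league figure (the excerpt's "around 0.340 or so") are assumptions made only for illustration.

```python
from typing import Mapping

# Event weights as given in the quoted formula (already scaled up by 15%).
WOBA_WEIGHTS = {
    "NIBB": 0.72, "HBP": 0.75, "1B": 0.90, "RBOE": 0.92,
    "2B": 1.24, "3B": 1.56, "HR": 1.95,
}
OBP_SCALE = 1.15  # factor that puts wOBA on the OBP scale


def woba(events: Mapping[str, int], pa: int) -> float:
    """Weighted on-base average: weighted event counts per plate appearance."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return sum(WOBA_WEIGHTS[name] * count for name, count in events.items()) / pa


def runs_above_average_per_pa(player_woba: float, league_woba: float) -> float:
    """Run value per PA above average, per the relation quoted above."""
    return (player_woba - league_woba) / OBP_SCALE


# Hypothetical stat line, not any real player's season.
sample_line = {"NIBB": 50, "HBP": 5, "1B": 100, "RBOE": 5, "2B": 30, "3B": 3, "HR": 20}
player_woba = woba(sample_line, pa=600)
raa_per_pa = runs_above_average_per_pa(player_woba, league_woba=0.340)

print(f"wOBA = {player_woba:.3f}; runs above average per PA = {raa_per_pa:.4f}")
```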
Obviously any links in the above post are probably NSFW
by jctGamer on Oct 19, 2009 11:58 AM PDT
The Giants don't rely on OPS ("awps" as Jon Miller pronounces it)
They rely on OOPS.
Neal before Zod!
Official Sponsor of the 1997 San Francisco Giants
by nostocksjustbonds on Oct 19, 2009 12:24 PM PDT
Let's review some basics:
Any measure whatever is just so much tavern-talk fodder unless it can provide absolute, not relative, results. By that I mean that the end product of the metric on a team basis had better be actual runs scored or runs yielded, and that individual men’s performances, so measured, can be combined to yield a team total that—again—equals actual runs scored or yielded. If you cannot readily extract actual runs values from a measure, it is just so much vaporing.
Such measures exist. Some are better than others, though how that is reckoned depends on who is defending which measure, but the essence is that all such absolute metrics give results pretty similar to one another (they have to, else they are scarcely valid, since the output is a number, runs, that can be and is measured against real, knowable results).
OPS is a relative, not absolute, measure. Of the common set of such measures, it is the best, but the point here—which was not specifically aimed at the Giants’ current situation, just as general background—is that “better” is not “good”, and that basing all reckoning, especially of individual men, on OPS can lead to significant mis-evaluations, especially when comparing men.
Professional baseball analyst since 1980.
by owlcroft on Oct 19, 2009 2:17 PM PDT
It’s interesting that you say it that way, because I think it goes to the very root of what you’re attempting to do with the statistic. If you’re trying to figure out exactly how much a given player added to a team’s total offensive output, your points seem valid to me. I’m not sure that’s always the goal of our statistics, though. Sometimes we simply want to know who the better player is – thinking about it theoretically, if two players played a full season under the exact same conditions – which would contribute more to his team? I’m not sure coming up with absolute results based on runs scored (which is a product of factors that are not equal) adequately answers that question.
Similarly, I also think relative stats as opposed to absolute stats have the potential to better predict what a player will do in the future. Again, if we’re trying to figure out a player’s true talent level (which is basically the same answer as in my first example) – I think relative stats can take out the noise created by absolute stats (situational hitting, luck, teammates, and other factors beyond a hitter’s control that show up in total runs scored), and may be a better method for figuring out what a player will do in the future in a given set of conditions.
Thoughts?
by Missing Barry on Oct 19, 2009 2:49 PM PDT
Good questions.
But I disagree. A basic run-scoring equation will derive from analysis of team scoring. If one then applies such a formula to the stats of an individual player, the result will be, in effect, what a line-up of nine exact clones of that player would normally score in a season. The criticism usually levelled against such measuring is that it is not “contextual”—that is, that the man’s personal OBA interacts with the rest of the team’s TBA, and vice-versa. That is, in a technical sense, true, but the differential is not great.
If one simply weights the individual men’s metrics by playing time and adds them, one typically gets a number that is about 1.5% higher than one gets by doing the correct procedure of separately figuring team OBA and TBA from weighted individual-player stats, then combining those team-level data. That suggests—to me, strongly—that applying the equation derived from team analysis to individual men yields a datum that well reflects both the man’s ability to help his team and serves as a valid and useful way to compare men in the abstract.
It is also as good a predictor of future performance as any other reasonable metric. Indeed, because it is a combining of several aspects of performance, it tends to smooth out annual variations in this or that one stat.
As a sidebar, while performance can be, for calculation purposes, thought of as having two main factors, on-base and power, it is better seen as separated out into three elements, walks, hits, and power (that is, by teasing apart the components of OBA). I look at it this way, to relate stats to actual player performance: walks reflect the man’s ability to judge which pitches to swing at; hits reflect his ability to make good contact with those he does choose to swing at; and power reflects his ability to drive what he does make good contact with.
by owlcroft on Oct 19, 2009 3:33 PM PDT
Hmmm…at this point I’m not sure I follow your methodology. Do you take an individual team’s runs scored and then divide credit for each of those runs among players based on playing time, and then the three elements (walks, hits and power) that you discussed? What about that significantly differs from linear weights, other than the fact that linear weights is derived from many years’ worth of team-level data across the league? It seems to me they’re very similar conceptually. If you’re looking for a runs scored end number, it’s easy enough to go from wOBA to RAR (runs above replacement) if you’re into setting a replacement bench level, or you can easily enough derive an aggregate runs scored setting the bench level to 0 from that point.
My first statement was basically thinking along the lines of linear weights as the relative stat as opposed to OPS, by the way.
by Missing Barry on Oct 19, 2009 6:07 PM PDT
Let me try to clarify.
The runs equation I speak of takes a collection of stats and cranks out the number of seasonal runs that “should” be scored by a team producing at those rates. The actual number of games over which the raw stats are collected is immaterial, because everything is converted to a rate before being used. (Except, of course, for the usual caveat about sample sizes.)
Because the result is always a seasonal run total regardless of quantities of the raw data, the formula can be applied to any coherent collection: a team’s stats for one game, the entirety of a league at any point in a season—or over many years—or, finally, to the numbers for a particular player, whether seasonal or career or whatever. It says that a team playing at that quality of play is thus-and-so good for scoring runs.
(That also has the advantage that the result is intuitively meaningful, because anyone who follows the game understands that, say, 570 runs is a piss-poor season, while 925 runs is pretty good.)
If one uses the formula on the batting roster of a team on a man-by-man basis, then weights those individual “seasonal runs” numbers by percentage of playing time, the net result for the team—as assembled that way, not by using the team stats per se—will be a bit high, usually by something very close to 1.5%. Of course, if one instead weights each man’s OBA and TBA and combines those data for a team OBA and TBA, the result will be correct, since we are back to actual team data. The reason for the discrepancy is that the product of the sums is not, in general, equal to the sum of the products.
To clarify that possibly cryptic remark, consider these two sets of data:
M = A × B
N = C × D
If we average A and C, then average B and D, and multiply those two averages, we will not—except by chance—get the same result as if we average M and N.
6 = 2 × 3
20 = 4 × 5
Average 2 and 4 and you get 3; average 3 and 5 and you get 4; multiply 3 by 4 and you get 12. But average 6 and 20 and you get 13.
To keep it very simple conceptually, assume that each batter who appeared at all got the same number of plate appearances as all the others, so that no weighting is involved: we can just straight average all the individual player numbers produced by the formula. Because each such number is—to oversimplify—the product of two base data, OBA and TBA, such averaging will produce a result different from what you’d get from first separately averaging those two components, then multiplying those two averages (which yields the correct team result).
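The 6-and-20 example above can be checked in a few lines of Python. This is only a sketch of the arithmetic, using the same two factor pairs; the variable names are made up here.

```python
from statistics import mean

# The two factor pairs from the example above: 6 = 2 x 3 and 20 = 4 x 5.
pairs = [(2, 3), (4, 5)]

# Average of the products (what averaging the by-man results does).
avg_of_products = mean(a * b for a, b in pairs)

# Product of the averages (what working from team-level rates does).
product_of_avgs = mean(a for a, _ in pairs) * mean(b for _, b in pairs)

print(avg_of_products, product_of_avgs)
```

The gap between the two numbers is the (population) covariance of the two factors, so it stays small when the factors do not vary strongly together across the men being averaged.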
The crux, though, is that the differential between simple averaging of by-man results and of proper, full calculation is only a very small one, strongly suggesting that the measure is quite satisfactory for rating individual players, even in a “team context”.
I hope that clarified, rather than further muddying . . . .
by owlcroft on Oct 19, 2009 6:47 PM PDT
Oh, and . . .
I keep meaning to remark that I, too, miss Barry.
by owlcroft on Oct 19, 2009 6:53 PM PDT
That made it much clearer, thanks. I’d still like to learn more about potential issues with linear weights, but in general it seems to me like it really wouldn’t be difficult to do some basic math on linear weights (wOBA) to give you a (runs/162 games) rate stat for a player over a given time period that’s basically the same units as that one. The methodology would be very similar, and I think the results would be pretty comparable…
by Missing Barry on Oct 19, 2009 7:53 PM PDT
You could look it up . . .
If anyone is interested in the accuracy of the particular full formula I use for both team runs and player evaluation, a graph and some discussion is available on line.
by owlcroft on Oct 19, 2009 8:33 PM PDT
I don’t understand numbers at all. Can I just keep rating players by who I would personally like to be friends with?
Brian Sabean strongly encourages you to disregard the drudgery of your employment responsibilities and join him in the consumption of spirituous libations.
by satyricrash on Oct 19, 2009 10:44 PM PDT
No.
You are only allowed to rate players who would personally like to be friends with you. An affidavit will be required.
by owlcroft on Oct 20, 2009 2:34 AM PDT
Well that’s ridiculous, because they all want to be my friend. Your methods STINK.
by satyricrash on Oct 20, 2009 11:13 AM PDT
How do stolen bases and caught stealing fit into the calculations?
by bradleybear on Oct 19, 2009 11:34 PM PDT
That depends.
In the formulation for true team runs scored, CS subtracts out from OB, since it kills off a runner who had reached base successfully. In that same formulation, SB is added in, with a “significance” multiplier (which is about 0.8), to what I have loosely called “RBI events”, since—like hits, and to a lesser extent walks, and to a much lesser extent SH and SF—it acts to move runners already on base along. All in all, though, stolen base tries, successful or not, are not a particularly significant factor in run scoring overall, and in any event the break-even point, whatever it may exactly be, is not much different from the MLB-average success rate. (The full formula appears on the graph linked a post or two above.)
When evaluating individual players, what I (at least) do is set SB and CS to zero, because even for fast, successful stealers they add (or subtract) little. I also set SH to zero, as those are managerially ordered and typically decrease scoring (they also need to be subtracted out from PA).
by owlcroft on Oct 20, 2009 2:33 AM PDT
NERDS!
/ General Manager Brian Sabean
by kornstar2004 on Oct 20, 2009 1:42 AM PDT
Just a passing word . . . .
I intensely dislike the way Mr. Sabean has assembled his teams, and the way Mr. Bochy has managed them. That does not mean that I think either is some Dark Lord of Baseball, lusting only to do evil. Each is a hard-working man, doing the best he can by his lights. I would like little more than a chance to sit down with either, or both, for a couple of hours of talk; it is my belief—perhaps naive—that any man can be made to see reason if that reason is explained clearly and logically from first principles. Consider ex-player Billy Beane . . . .
by owlcroft on Oct 20, 2009 2:38 AM PDT
Weren’t you on the Giants’ staff for a while? Isn’t there some way you could make this happen?
Aaron King is still my homeboy... iffy mechanics and all
McFAQ for all you newcomers out there.
GET THAT VORP AND WHIP SH!T OUTTA HERE!!!
by baetown415 on Oct 20, 2009 9:47 AM PDT
With a delorean and crazy dude.
I R 5
by say hey nation on Oct 20, 2009 9:58 AM PDT
Um . . .
. . . that was a quarter-century ago, and it didn’t last long or work out well. I think Tom Haller already knew he was in trouble and was grasping at any straw to try for help, but he never really paid me much mind. But I did get to go to an All-Star party.
by owlcroft on Oct 20, 2009 4:13 PM PDT
Random unrelated question to owlcroft (or anyone else that wants to participate). This is something I’ve been struggling with for a little bit of time now – I generally use all Fangraphs methodology for player valuation for the simple fact that it’s free and easy to access. Fangraphs doesn’t give relievers a whole lot of credit for helping teams win games. Essentially their methodology assumes 1 IP = 1 IP, whether it’s a starter in the 2nd inning or closer in the 9th. Now to a degree I buy this, but at the same time, I love win probability things…and the simple truth is the degree an action in the 9th inning of a game affects win probability is much greater than the degree an action in the 2nd (or other early inning) affects win probability.
That would imply that 1 IP does not always equal 1 IP. Given that relievers (especially good ones) pitch in much higher-leverage situations where win probability fluctuates more, what are your thoughts on their importance to winning games for a team as a whole? Does Fangraphs undervalue them? How do you deal with that?
by Missing Barry on Oct 20, 2009 1:47 PM PDT
Don’t they have some sort of leverage measure on fangraphs?
by Evan on Oct 20, 2009 2:11 PM PDT
Yes – it’s not factored into WAR calculations, though, as far as I know.
by Missing Barry on Oct 20, 2009 2:22 PM PDT
I don't know
If FanGraphs implements LI into their WAR for relievers. I do, though. The formula that I recommend (provided by JinAZ):
WAR = (((RPG – LgRPG) / 9 * IP * -1 * pLI) + ((0.07 * LgRPG) / 9 * IP)) / 10
by Anticon23 on Oct 20, 2009 2:24 PM PDT
What kind of results does that give you? Are good relievers still generally at the 1-2 WAR level or so (or even less)? Do you implement LI with starters? What are RPG and LgRPG?
by Missing Barry on Oct 20, 2009 2:51 PM PDT
I’m guessing Runs Per Game and League Runs Per Game.
I know you nerds know NOTHING about the real game of baseball, or any other athletic endeavor requiring teamwork under physical stress.
Mr. F! | comics | art | New Nattowear | Unofficial McImage Directory
by Natto on Oct 20, 2009 2:56 PM PDT
No need to implement LI with a starter; their pLI is always exactly 1.00. As for RPG and LgRPG, that stands for “Runs Per Game” and “League Runs Per Game,” as Natto stated.
Let’s run through an example. Here’s what Brian Wilson’s WAR was for 2009, if we’re using FIP (up to you if you want to use BsR, FIP, tRA or what-have-you):
(((2.50 – 4.20) / 9 * 72.1 * -1 * 2.28) + ((0.07 * 4.20) / 9 * 72.1)) / 10 = 3.34 WAR
So Wilson was a well-above average reliever according to WAR, and it’s his pLI that really brings up his value (led the league in pLI; Jonathan Papelbon came in second at 2.17).
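The formula and the Wilson example can be checked with a few lines of Python. This is only a sketch: the function and argument names are made up here, the comments on what each term represents are one reading of the formula rather than anything stated in the thread, and 72.1 IP is entered literally as in the post rather than converted to 72 1/3.

```python
def reliever_war(rpg, lg_rpg, ip, pli):
    """WAR from the formula quoted above, with 0.07 * LgRPG as stated there."""
    # First term: runs better than league over these innings, scaled by leverage.
    leveraged_runs = (rpg - lg_rpg) / 9 * ip * -1 * pli
    # Second term: read here as a replacement-level adjustment (an assumption).
    adjustment = (0.07 * lg_rpg) / 9 * ip
    # The formula divides the run total by 10 to convert it to wins.
    return (leveraged_runs + adjustment) / 10


# Inputs from the Brian Wilson example: FIP 2.50, league 4.20, 72.1 IP, pLI 2.28.
war = reliever_war(rpg=2.50, lg_rpg=4.20, ip=72.1, pli=2.28)
print(round(war, 2))
```

With those inputs the function reproduces the 3.34 figure, and setting pli=1.0 shows how much of the total comes from leverage alone.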
by Anticon23 on Oct 20, 2009 3:37 PM PDT
Interesting, it seems this method, at least in Wilson’s case, gives him a higher WAR than Fangraphs’ method does. What’s the p in pLI, by the way, and how is that calculated?
by Missing Barry on Oct 20, 2009 4:48 PM PDT
pLI: A player’s average LI for all game events.
It’s used via Win Probability.
It’ll give relievers with a higher leverage index (i.e. closers) more credit for pitching in tougher situations, as compared to assuming all relievers enter a game in the same situation- which we know is most certainly not the case.
by Anticon23 on Oct 20, 2009 5:25 PM PDT
I put it that way so it’d be easier to copy it into Excel, if someone wanted to try it out for themselves. But yeah, it’s a doozy.
by Anticon23 on Oct 20, 2009 3:37 PM PDT
I don't think so:
[T]he degree an action in the 9th inning of a game affects win probability is much greater than the degree an action in the 2nd (or other early inning) . . . .
There is no weighting scheme in baseball. Games won or lost in early April mean as much as games won or lost “down the stretch”. Runs scored in the second inning are not discounted as compared to runs scored in the ninth inning. If a game is close in the ninth, it’s because teams scored (or failed to score) in earlier innings.
The current mindless focus on “closers” and “saves” (rot in Hell, Holtzman) fails to understand that, and results in the gross misuse of pitching. Indeed, pitching-resource use and misuse is the last major area of the game still wildly out of touch with the basic facts of both baseball and human physiology. If the game is scoreless in the second inning and the bases are loaded with no one out, and got that way not by chance but because the starter obviously has nothing that day, then is when using the best pitcher in your bullpen might really “save” the game.
by owlcroft on Oct 20, 2009 4:19 PM PDT
Games won or lost in early April mean as much as games won or lost "down the stretch". Runs scored in the second inning are not discounted as compared to runs scored in the ninth inning. If a game is close in the ninth, it’s because teams scored (or failed to score) in earlier innings.
Yeah, that would be the assumption for 1 IP = 1 IP regardless of situation. For games won and lost I think it’s more than reasonable – for it to be true it has to mean that what happens in one game is unrelated to what happens in another game, and I don’t see any reason to question that. I’m not sure if what happens in one inning is independent of what happens in another inning of the same game, though (or at least close enough to independent to ignore). I think I need to give the win probability thing a little more thought, I don’t know what to make of it right now.
As for your second point, for all Bochy’s failings, I actually think he does a good job of using the pen relative to his peers. He isn’t tied to the traditional closer role – he used Wilson late in the game, yes, but brought him in when we needed it instead of just for save situations. He did a good job getting Affeldt in earlier in the game when we needed it. Overall I think he does a pretty good job getting our best bullpen guys into important situations.
by Missing Barry on Oct 20, 2009 4:54 PM PDT
A player who gets a hit in the bottom of the ninth in a tie game with two outs and a runner on third has done far more to help his team win than a player who gets a hit with two down and nobody on in the sixth of a nine-run blowout. I don’t know why one would want to build this into an estimate of a player’s value, since it’s pretty clear at this point that it’s a function not of skill but of opportunity, but it is the reality of things.
“Rot in Hell, Holtzman”? Really?
by Evan on Oct 20, 2009 4:59 PM PDT
The thing is . . .
“Clutch hitting” seems rather clearly not to exist. There is no point in trying to build something that doesn’t exist into any metric. Events tend to scatter out randomly. Do they always “even out”? No; nor do we expect them to, whence the term “outlier”. But the scatter is still random, demonstrably so.
As to Mr. Holtzman, while I wouldn’t literally wish Hell on any soul, he does have a lot to answer for. Statistics are supposed to record the game, not control it. Any time a stat records something that is not actually a game occurrence, it is dubious. We were—and are—bad enough off with the “win”, which itself too often to at least some extent controls how a manager uses his staff; the “save” has had a drastic, and, I think, terrible effect on bullpen use. (I won’t even speak of the abominable “hold”, or the blessedly departed GWRBI).
by owlcroft on Oct 20, 2009 5:53 PM PDT
My only thought is that while situations should tend to even out for hitters, as in Evan’s example, for relief pitchers that’s not true. You look at someone like Mariano Rivera and his contributions – his outings, on average, will be more important, more leveraged, than those of lesser relievers. A simple model based on something like FIP and IP doesn’t do anything to take this bias, which clearly exists, into account…
Like I said before, I need to really give these issues some deep thought. They trouble me, but I don’t know at the moment if my concerns are valid or not, or whether I ultimately agree or disagree with how we currently value relievers…
by Missing Barry on Oct 20, 2009 8:22 PM PDT
(No clever post title occurs to me.)
We should value relievers as we do any pitcher: by the quality level of their pitching. The importance of situations in which they are used is immaterial, and not even something they have any control over. Mind, if one wants to believe that some pitchers cannot perform as well in “pressure situations” as in “non-pressure”, that might be another matter, but I for one would be deeply skeptical of such claims for a given pitcher till shown data that meets the 3-SD significance test, and even more skeptical of it as a general proposition about pitching.
“Quality of pitching”, especially for relievers, who often enter games with one or two out, and equally often leave with fewer than three outs, is often hard to correlate with ERA. Over a sufficiently long period of time, the two will fall into good agreement, but for short relievers, that time can be a while, even several seasons. (“Quality of pitching” is readily reckoned by applying the same formulae to pitching results as for batter stats—something that was not possible till not so many years ago, when opponents’-batting-against stats became readily available.)
As a sidebar note, it has long been a theory of mine, which I never generated enough energy to follow out in investigation, that performance, especially pitcher performance, is more a function of outings than of innings. I suspect that on a given day, a given pitcher is likely to either be pretty good or pretty bad, with his long-term results being an averaging. It’s a bit tricky, since if that is so, a man will obviously get more “good-day” innings than “bad-day” innings. It would be amusing to track pitching outing by outing, simply classing a given outing as good, bad, or indifferent (or maybe by five, rather than three, levels), and see how those are distributed.
by owlcroft on Oct 20, 2009 8:45 PM PDT
But there is a difference
Late-inning relievers will have to face tougher hitters. If it’s a close game in the 9th inning, no pitcher is going to bat for the team that’s behind. The light-hitting SS might even get pulled for a power-hitting pinch-hitter. So unless your pitching metric accounts for quality of hitters faced (and I don’t know of any that do), 1 IP = 1 IP does not hold when it comes to comparing the 2nd inning to the 9th inning of a close game.
That said, I would tend to agree with owlcroft that pitchers should not get bonus points for retiring guys in the 9th inning, unless we can come up with some way of quantifying the value. I don’t see why “leverage” matters. If we don’t believe in “clutch hitting,” why would we believe leverage is important?
by taliesin on Oct 21, 2009 11:31 AM PDT
One point by owlcroft that makes sense is that when evaluating a player’s actual ability (which is what we care about for his future performance), pitching situations and opportunities do not matter. Those get in the way of evaluating the quality of the pitcher. Makes sense.
On the other hand, what if we want to know how much each player actually adds to the end result (W-L record)? I still think it’s a relevant point. Adding a great bullpen piece, who you use in the right situations, may add additional value you don’t capture just by looking at his ability in terms of quality (tRA, FIP, or whatever other stat) * quantity (IP) – because you use him in situations where the game is on the line. That doesn’t mean the pitcher himself is better, but it may mean the team adds to their W-L record more than expected.
Again, going back to the 9th inning vs. 1st inning stuff, is each inning independent enough of each other inning that we can just treat them all the same? I don’t have an answer to that. Like I said before, I buy that games are independent enough of each other that a W is a W no matter what day or month it is, but I’m still unsure about treating separate innings in the same game that way.
So unless your pitching metric accounts for quality of hitters faced
I honestly don’t have any clue whether this is a factor or not, but I suspect for the most part the answer to that is it’s largely inconsequential…
If we don’t believe in "clutch hitting," why would we believe leverage is important?
Mind, if one wants to believe that some pitchers cannot perform as well in "pressure situations" as in "non-pressure",
Don’t worry, I’m not trying to argue that there’s some special closer mentality here. I’m just asking whether giving up a run in an earlier inning is as important to winning the game as giving up a run in the 9th inning. Earlier in the game, there will still be plenty more opportunities for the team to come back and win (which is why when it’s 1-0 in the 2nd the leading team’s probability of winning is much closer to 50% than when it’s 1-0 in the 8th) – so again, the question is… how independent is each individual inning from each other?
Thanks for the discussion.
by Missing Barry on Oct 21, 2009 1:26 PM PDT
Not especially . . .
[Y]ou use him in situations where the game is on the line . . . .
As the title says, not especially. Going to the bottom of the ninth with the away team leading by one run does not put the game “on the line” any more than it does going to the bottom of the second with that score. Yes, if the home team scores in the ninth, the game is tied or lost; but if all else is equal so is it lost or tied if they score in the bottom of the second.
The fact is that most of the time, the closer is just being asked to start and get through one inning without giving up more than a run or two; but even asking a man to get through an inning scoreless is not that big a deal: a league average pitcher will necessarily (from simple arithmetic) pitch at least half his innings scoreless, and in reality probably a deal more—if he never gives up more than one run in any inning in which he gives up any runs at all, he will pitch about half his innings scoreless; every time he gives up more than one run in an inning, that means more innings somewhere else in which he gave up zero runs. DTM. Closing per se is not a big deal.
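The arithmetic behind the scoreless-innings claim can be sketched in Python. The 4.5 runs per nine innings used for a league-average pitcher and the 18-inning run log are assumptions picked for illustration, and the function names are made up here.

```python
def min_scoreless_fraction(runs_per_nine):
    """Lower bound on the share of scoreless innings.

    Every inning with scoring in it uses up at least one run, so at most
    runs_per_nine / 9 of all innings can be scored upon.
    """
    return max(0.0, 1 - runs_per_nine / 9)


def scoreless_fraction(runs_by_inning):
    """Actual share of scoreless innings in a list of per-inning run counts."""
    return sum(1 for r in runs_by_inning if r == 0) / len(runs_by_inning)


# Assumed league-average rate of 4.5 runs per nine innings.
floor = min_scoreless_fraction(4.5)

# Hypothetical log: 9 runs over 18 innings, some of them bunched together.
innings = [0, 0, 3, 0, 1, 0, 0, 2, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0]
actual = scoreless_fraction(innings)

print(floor, round(actual, 3))
```

At that assumed rate the floor is half, and because some of the runs in the sample log are bunched into multi-run innings, the actual scoreless share comes out higher than the floor.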
Innings are independent. You can take the final box score and interchange all the innings, or half innings, and the end result is the same.
The game is “on the line” when the makings of a big inning are assembling, regardless of which inning it is: say, men on second and third with no one out and at the top or in the middle of the lineup. That is when you need your best reliever to take the mound, even if it’s the third inning. But that is not how teams use their best relievers (thanks in good part to Mr. Holtzman and his gross misunderstanding of the game).
by owlcroft on Oct 21, 2009 5:59 PM PDT
Well, no, innings are not fungible. As the visiting team, if you have the lead, you can afford to go down one or more runs in the second or third inning because you’ve still got a handful of frames to get even or go ahead. What you can’t afford to do if you’re up a run in the ninth is to give up two runs and “go behind” because in this context that equates to a loss. Nor is it desirable to give up one run because that generally increases the likelihood of another run, or entering into the dice-roll of extra innings.
Also it should be kept in mind that the correct managerial strategy shouldn’t be to maximize the odds of winning one particular game but to maximize wins over the season. If the starter “just doesn’t have it that day” and the bullpen is tired, well sometimes you just have to take your lumps, rest some arms and take the loss so your chances in the next few games are much improved. You let the starter get some pitches in, pull him before he’s totally humiliated and put the game in the hands of the long man who almost without exception is in that role because he’s not so good.
Don’t take all this the wrong way, I do agree that the importance of the closer is vastly overrated and that the save as written now is a junk stat.
Fred Lewis can stand under my umbrella.
31 May 2007, 21:38 EST - the last time Matteh's career W-L wasn't below .500
We are at war with Los Angeles. We have always been at war with Los Angeles.
Lowering the Quality of Internet Discourse Since 1985™
by S.F. Giangst on Oct 21, 2009 8:09 PM PDT
Play 'em one at a time . . .
Innings are fungible after the fact. To say that “As the visiting team, if you have the lead, you can afford to go down one or more runs in the second or third inning because you’ve still got a handful of frames to get even or go ahead” implies possession of a really, really good crystal ball. No manager can ever know what is to come, and so must play at all times to maximize runs scored and minimize runs allowed.
(The only exception is a half inning that will end the game, such as the bottom of the ninth, or a later inning in a tie.)
The problem of the tired bullpen and playing tomorrow’s game today is another matter. Pitching is still not used in anything like a rational way, but even so relievers are run in and out much too often, mostly for the same reasons doctors order too many expensive tests: so no one can say afterwards “Why didn’t you . . . ?” Not enough rested arms to get through a given game is almost always a sign of poor managerial use of pitching.
As best I recall, the biggest run differential ever overcome for a win was 12 runs—now do you suppose that either team in that game supposed that that was going to happen? As the old press-box saying runs, come out to the ballyard every day for twenty years and you’ll still see something new every time. Today might be the day that a team comes back from down 13. It’s the game without a clock: two out, nobody on, bottom of the ninth, down a lot, you can still win. Unlikely, sure, but in baseball—unlike clock-bound sports—the unlikely is not the impossible.
You play each game as if it means the season, because it might. I get so, so tired of “analysts” talking about this or that late September game costing a team its division, while forgetting all the early April games that they might as likely have won.
by owlcroft on Oct 22, 2009 3:43 AM PDT
You’re arguing in circles. Pulling your starter in the third in favor of a “closer” just because a potentially risky situation occurs results in exactly the wasteful “relievers are run in and out much too often” scenario you decry a couple of paragraphs later. You then have to strip your bullpen to get through the ninth, or go to the long man who’s generally a failed starter or failed set-up man. What you take away by a slightly higher probability of snuffing out the opposition rally early, you give back in spades by having to use the worst part of your pitching corps.
Once again, I have no beef with you, and I’m actually a huge fan of contrarian strategies. I think pitchers are used wrong in a lot of ways also.
I’ve always been a proponent of the infrequently-suggested idea that a team should have “openers” rather than “closers”: 3 low-WHIP, short-recuperation guys who can start a game and go two innings, 55 or 60 times a season. Then go to the bullpen, where there are 2 fresh guys, hopefully one lefty and one righty, who can go 6 or 7 innings, i.e., fill the starter’s expected role. If things follow the averages, your “starter” gets to ease in against the bottom of the opposing line-up and will quite frequently have a lead. By switching handedness, you get the matchup advantage or force the opposing manager to go to his bench early.
You may now shred…
Fred Lewis can stand under my umbrella.
31 May 2007, 21:38 EST - the last time Matteh's career W-L wasn't below .500
We are at war with Los Angeles. We have always been at war with Los Angeles.
Lowering the Quality of Internet Discourse Since 1985™
by S.F. Giangst on Oct 22, 2009 4:42 AM PDT
This gets complicated.
No, I’m not arguing in circles. Running relievers in and out, typically on batter handedness, is not the same as pulling a badly failing pitcher, especially a starter.
Your idea is an extremely good beginning: I have been strenuously advocating such a system for 15 or 20 years now. In the AL, with the DH simplifying things, the way to go is an “opener” (I like that term) who pitches to 9 men, then gives way to a “starter-type” who pitches to 18 men, who in turn gives way to a “finisher” who pitches to 9 more men. Since there are usually more than 36 PA in a game, one can choose whether to have the last man just go a little longer, or whether to bring in another man to wind up the game; this last man, if one is used, is not a standard “closer” but simply a utility guy.
I have asked quite a few people how often a man can pitch to 9 batters, especially if he knows just when he is coming in and exactly how many men he has to face, and never gotten a good answer. My gut feeling is that every other day is too often (though not by sheer seasonal batter or pitch counts), so you’d need 6 opener/finisher types plus 4 “starter” types. If your staff is 13 (because a well-constituted team doesn’t need over a dozen position players), you have 3 spots open; those can be a utility man for closing games or extra-inning affairs, plus two “firemen,” your best pitchers, who come in when there are deep problems and exit as soon as the crisis is over.
At times, that can get a bit strained or out of whack, but it should do so far less often than the current conventional methods, which are still based, even if now only semi-consciously, on the idea that the starter “ought to” finish every game.
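For anyone who wants to check the roster arithmetic in that scheme, here is a rough sketch. The rest cycles are assumptions read off the numbers above: 6 opener/finisher types covering two such slots per game implies each works every third day, and 4 “starter” types implies every fourth day. The function and variable names are made up for illustration.

```python
# Rough roster arithmetic for the opener / starter-type / finisher scheme.
# Rest cycles are ASSUMPTIONS inferred from the staff sizes quoted above.

BATTERS_FACED = {"opener": 9, "starter_type": 18, "finisher": 9}


def arms_needed(slots_per_game, rest_cycle_days):
    """Pitchers needed to fill a role every game if each man works once per cycle."""
    return slots_per_game * rest_cycle_days


# Opener and finisher are the same kind of pitcher: two slots per game.
short_men = arms_needed(slots_per_game=2, rest_cycle_days=3)   # assumed 3-day cycle
starter_types = arms_needed(slots_per_game=1, rest_cycle_days=4)  # assumed 4-day cycle

# The three remaining spots: one utility man plus two "firemen".
extras = 1 + 2

planned_batters = sum(BATTERS_FACED.values())
staff_size = short_men + starter_types + extras

print(planned_batters, short_men, starter_types, staff_size)
```

Changing either rest cycle by a day shows quickly how sensitive the 13-man staff is to the unanswered question of how often a man can face 9 batters.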
There is a very great deal more that could be said, but we’re off-topic enough already.
Professional baseball analyst since 1980.
by owlcroft on Oct 23, 2009 5:58 PM PDT
But . . .
Late-inning relievers will have to face tougher hitters.
Pinch hitting is over-rated. The upside from, say, the platoon differential is roughly equalled or even exceeded by the downside from coming into the game cold—especially if, as is often the case, the pinch hitter is not just an everyday player who was being rested that day, but rather a bench reserve.
The number of cases in which a batter is replaced by another who is truly superior, even allowing for the “off the bench” effect, is just not a large fraction of all men a closer ends up facing over a season.
(In principle, pitcher merit metrics could be adjusted by quality of batters faced, and for that matter batter metrics by quality of pitcher faced, but the implicit assumption that over any reasonable sample size those factors will largely or wholly even out is a good one.)
Professional baseball analyst since 1980.
by owlcroft on Oct 21, 2009 5:39 PM PDT
This effect is not just from pinch hitters. Oftentimes managers will use their best non-closer against the toughest part of the lineup, while the long reliever gets used against the B-lineup in blowouts.
I think that for starting pitchers assuming that quality of lineup belongs in the error term is fine, but not for relievers. For example, Romo and Affeldt had to face batters with average OPS of .761 and .773, while Justin Miller faced guys who averaged .723.
Please hit better, Randy Winn.
by oldjacket on Oct 21, 2009 10:31 PM PDT
Yes, but . . .
. . . does one season for a reliever constitute a “reasonable sample size”? I think not. But whether significant quality-of-opponent differences persist over multiple seasons is a good question, and one that were I still active I’d take the trouble to pursue, now that such detailed data are readily available.
Your point about when and how managers use relievers is in general quite sound, and is part of what I’ve been saying: use your best when it matters most.
Professional baseball analyst since 1980.
by owlcroft on Oct 22, 2009 3:49 AM PDT
In the NL
It is a dead certainty that this effect is not going to be evened out by sample size, because closers never face pitchers and starters (and long relievers) do. So even if the PH effect is insignificant (and it is probably small), in the National League relievers in close games face tougher hitters on average than starters and mop-up guys. One out of nine guys going from .400 OPS to .730 is surely a large enough effect to matter.
by taliesin on Oct 22, 2009 7:19 AM PDT
“One out of nine guys going from .400 OPS to .730 is surely a large enough effect to matter.”
Well, maybe, it’s certainly plausible, but I don’t think we can say without at least looking into it a little bit if it really makes a significant difference or not…
by Missing Barry on Oct 22, 2009 9:10 AM PDT
Quick stab
A back-of-the-envelope calculation would go thus: an average pitcher has a wOBA of around .200. Take out the silly 1.15 adjustment and he contributes about 0.175 R/PA. An average position player has a wOBA around .330, which means he should contribute about .285 R/PA. So divide by 9 (for 1/9 hitters), multiply by 9 (per 9 innings), and the average closer should, all else equal, have a tRA about 0.11 higher than a starter who pitched just as well.
I think I did that right… Using owlcroft’s OTS yields a similar result: pitcher 0.04, position player 0.13. So either way about a hundredth of a run per inning. Doesn’t seem that significant, I guess — probably results in a blown save every other year or something like that.
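Here is the same arithmetic as a small script, in case anyone wants to poke at the inputs. Every number in it is one of the round figures quoted above, not measured data, and the names are made up for illustration. Note that it copies the shortcut of treating the per-PA gap as a per-inning gap; that is the simplification in the estimate itself, not something the script corrects.

```python
# Back-of-the-envelope check of the pitcher-vs-position-player gap.
# All inputs are the round figures quoted in the comment, not real data.

WOBA_SCALE = 1.15        # the "1.15 adjustment" being backed out
PITCHER_WOBA = 0.200     # assumed average pitcher at the plate
POSITION_WOBA = 0.330    # assumed average position player
LINEUP_SPOTS = 9


def runs_per_pa(woba, scale=WOBA_SCALE):
    """Convert wOBA to a rough runs-per-PA figure by removing the scale factor."""
    return woba / scale


def diluted_gap(weak, strong, spots=LINEUP_SPOTS):
    """Gap averaged over the lineup when one of `spots` slots changes."""
    return (strong - weak) / spots


# wOBA route: gap for the one slot, then spread over nine slots.
slot_gap = runs_per_pa(POSITION_WOBA) - runs_per_pa(PITCHER_WOBA)
per_slot = diluted_gap(runs_per_pa(PITCHER_WOBA), runs_per_pa(POSITION_WOBA))

# Same shortcut as the comment: multiply back by 9 for a per-nine figure.
per_nine = per_slot * 9

# OTS route, using the rounded 0.04 and 0.13 quoted above.
ots_per_slot = diluted_gap(0.04, 0.13)

print(round(slot_gap, 3), round(per_slot, 4), round(per_nine, 3), round(ots_per_slot, 3))
```

Both routes land at roughly a hundredth per lineup slot, which is all the estimate claims.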
by taliesin on Oct 22, 2009 10:31 AM PDT
There are just too many assumptions I’m not comfortable with in that math to buy it without seeing some data. Owlcroft says PH’ing takes away from a player’s production. Will the average PH’er be an average hitter overall? How often do starters face PH’ers? Are there times when the closer faces lesser competition (better hitting players have been subbed out for some reason)? Does double-switching have any impact on the 1/9 assumption?
Basically, there are a lot of open questions that I don’t think we can answer theoretically. I think to get an answer we’d just have to go to the data and see what it tells us.
by Missing Barry on Oct 22, 2009 10:39 AM PDT
The difference is remarkable.
The actual study I did, over 15 years ago now, involved finding men who, in the preceding 5 seasons, had had at least two seasons of 300 or fewer plate appearances and at least two seasons of 400 or more PAs. (That way there was no nonsense like a man at 298 PA and another at 307 PA being put in different categories.)
For each such man, his performance in part-time play versus his performance in full-time play was compared. By assuring at least two seasons of each sort, we tend to weed out those men who either were part-time because injured or full-time because pressed into service to replace an injured man.
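To make the selection rule concrete, here is a minimal sketch of how such a filter and comparison might be coded. It is not the original study's code; the function names and the sample plate-appearance figures are hypothetical, and the performance metric is left open because the description above does not name one.

```python
# Sketch of the sample-selection rule described above (hypothetical names and data).

PART_TIME_MAX_PA = 300   # a "part-time" season: 300 or fewer PA
FULL_TIME_MIN_PA = 400   # a "full-time" season: 400 or more PA
MIN_SEASONS_EACH = 2     # at least two seasons of each kind


def qualifies(pa_by_season):
    """True if the seasons (the preceding 5) include enough of both kinds."""
    part_time = [pa for pa in pa_by_season if pa <= PART_TIME_MAX_PA]
    full_time = [pa for pa in pa_by_season if pa >= FULL_TIME_MIN_PA]
    return len(part_time) >= MIN_SEASONS_EACH and len(full_time) >= MIN_SEASONS_EACH


def relative_improvement(part_time_value, full_time_value):
    """Fractional change in some performance metric from part-time to full-time play."""
    return (full_time_value - part_time_value) / part_time_value


# Hypothetical five-season PA histories.
print(qualifies([250, 280, 450, 520, 610]))   # two part-time, three full-time seasons
print(qualifies([298, 307, 450, 500, 600]))   # the 307-PA season falls in neither bin
```

The gap between 300 and 400 PA is what keeps borderline seasons, like the 298 and 307 example, from landing on opposite sides of a single cutoff.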
The differences were simply shocking: 25% to 35% improvement by playing full time. (That is probably why so many scrubs seem to “step up” when a key player goes down—they were better than they had been allowed to show.) Now it does not follow as the night the day that a man who normally plays every day will, if sitting out a given game on the bench, be as far off his norm as that, but you may know the old player saying: “one day off and you know it; two days off and your teammates know it; three days off and the world knows it.” Clearly, players feel that even a day off dulls their batting edge, which—as sensitive as timing is—is not unreasonable.
Professional baseball analyst since 1980.
by owlcroft on Oct 23, 2009 6:09 PM PDT
