Understanding baseball analytics is easier than ever before

If you’re curious about how they work and why they’re useful, read on.

DKRZ Supercomputer Crunches Climate Data Photo by Morris MacMatzen/Getty Images

McCovey Chronicles will be covering news from around the league all season long with our new daily MLB Chronicles column.

You’ve probably seen a lot of Statcast data integrated into local and national telecasts of games. Launch angle, exit velocity, sprint speed, and defensive outs above average all make for good graphic overlays on TV replays.

If you’re looking for a little bit more, because it’s an off day and you’re bored or because your favorite team is really bad and is likely to be really bad for a couple more years, then definitely take some time to familiarize yourself with MLB.com’s Baseball Savant, which houses all publicly available Statcast data.

The landing page is pretty straightforward: categories, the “Gamefeed” links to each scheduled game for the day, and then a grid of the trending players. The search option in the categories bar lets you get really granular with that stuff, like figuring out how much weak contact Buster Posey had last season, for instance. If you don’t want to get into all that, then just stick with the basic Gamefeed.

The Gamefeed gives you all the data for that game in real time. So, if you had had the Giants-Nationals game open as it was going on, then you would’ve seen a live update of Buster Posey’s pinch hit groundout, which ranked #13 among the hardest-hit balls of the game (34 in total were put into play).

The trending players grid takes you to their player page, which can be sorted for all the basic stats, but then also their Statcast data as well as their rankings compared to the rest of the league. So, the #1 player as I look at the trending grid right now is Franmil Reyes. Go to his player page and you’re given this nifty graphic:

So, before even diving into the numbers, I can see that he hits the ball really hard and based on all that, the Statcast system expects him to be near the top of the league in most run-generating categories. Again, just based on batted ball data.

If this all feels very intimidating, consider that the nerds working on this system are also working really hard to make it more accessible. I can generally understand what’s going on, and if you’ve been reading the site since I took over from Grant, then you’ve caught on to my being one of the dumbest people to have ever lived.

Sure, stuff like wOBA and xwOBA and xwOBAcon look and sound hopelessly convoluted, but they’re just trying to evaluate a bunch of different parts of a player’s skill set simultaneously.

In this particular trio of categories, weighted On Base Average is an all-encompassing stat. It’s not just “on base percentage”; it weights the value of each event that leads to a batter getting on base. So, a walk, a hit by pitch, a single, a double, a triple, and a home run each get their own value, while intentional walks are left out and sacrifice flies count as plate appearances. xwOBA, or expected weighted On Base Average, takes wOBA and adds in Statcast data: specifically, “exit velocity, launch angle and, on certain types of batted balls, Sprint Speed.” xwOBAcon is a very specific derivation: it is the expected weighted On Base Average of just the contact, so it excludes walks and hit by pitches, but still uses Statcast data on the contact.
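If it helps to see the "weights" part written out, here is a rough Python sketch of the wOBA calculation in its commonly published form: weighted events on top, at-bats plus unintentional walks, sacrifice flies, and hit by pitches on the bottom. The weights are recalculated every season, so the numbers below are illustrative assumptions rather than official values, and the function name and the sample stat line are made up for the example.

```python
# Illustrative linear weights only. The real values change every season,
# so treat these as assumptions, not official figures.
WEIGHTS = {
    "ubb": 0.69,     # unintentional walk
    "hbp": 0.72,     # hit by pitch
    "single": 0.87,
    "double": 1.22,
    "triple": 1.53,
    "hr": 1.94,
}


def woba(ubb, hbp, singles, doubles, triples, hr, ab, sf):
    """Weighted On Base Average for one stat line (hypothetical helper)."""
    numerator = (
        WEIGHTS["ubb"] * ubb
        + WEIGHTS["hbp"] * hbp
        + WEIGHTS["single"] * singles
        + WEIGHTS["double"] * doubles
        + WEIGHTS["triple"] * triples
        + WEIGHTS["hr"] * hr
    )
    # Intentional walks are excluded entirely; sacrifice flies only
    # show up here, in the denominator.
    denominator = ab + ubb + sf + hbp
    return numerator / denominator if denominator else 0.0


# A made-up stat line, purely for demonstration.
sample_line = dict(ubb=10, hbp=1, singles=20, doubles=6, triples=1, hr=5, ab=100, sf=1)
print(f"{woba(**sample_line):.3f}")
```

The point of the sketch is only that a home run counts for more than twice as much as a single, which plain on base percentage ignores.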

All of this is premised on the idea that hitting the ball hard leads to offense. Let’s see how that holds up to scrutiny. Statcast also has a league leaderboard for the teams. The top 5 teams in weighted On Base Average (wOBA) are:

  1. Dodgers .371
  2. Mariners .355
  3. Braves .354
  4. Cubs .351
  5. Brewers .349

The top 5 teams by runs scored:

  1. Mariners, 132
  2. Dodgers, 122
  3. Athletics, 104
  4. Cubs, 103
  5. Mets, 103

That’s pretty close. And, keep in mind, the Mariners and A’s have way more plate appearances than the rest of the league at this point in the season because they started the year earlier. The Dodgers are right behind the A’s, though, (786 PA to Oakland’s 788) because, well, you remember how they opened the season: with a home run carousel.
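To put a number on "pretty close," here is a small Python sketch that counts how many teams show up on both top 5 lists above. The variable names are just for illustration; the team lists are copied straight from the two leaderboards.

```python
# Top 5 by wOBA and top 5 by runs scored, as listed above.
woba_top5 = ["Dodgers", "Mariners", "Braves", "Cubs", "Brewers"]
runs_top5 = ["Mariners", "Dodgers", "Athletics", "Cubs", "Mets"]

# Teams that appear on both lists, sorted for stable output.
shared = sorted(set(woba_top5) & set(runs_top5))

print(shared)  # ['Cubs', 'Dodgers', 'Mariners']
print(f"{len(shared)} of {len(woba_top5)} teams appear on both lists")
```

Three of the five wOBA leaders are also top 5 in runs scored.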

The Brewers are 8th in MLB with 99 runs scored, the Braves are 12th with 91. And when we look at the expected weighted On Base Average, the top 5 looks like this:

  1. Dodgers, .361
  2. Braves, .356
  3. Twins, .354
  4. Yankees, .352
  5. Nationals, .351

So, just based on how hard players have been hitting the ball, these teams are expected to be this great at generating runs. We know walks and strikeouts matter in the run creation game, which is why the Twins, Nationals, and Yankees didn’t appear in the wOBA or Runs Scored Top 5, but just based on the quality of contact, these are the five best hitting teams in baseball at the moment. That’s not too surprising, all things considered.

That doesn’t mean these are the five best offenses, but these are the most fearsome collections of mashers as recorded by Statcast. Again, at the moment. The bottom 5? Is this surprising?

  1. Reds, .269
  2. Rockies, .272
  3. Giants, .283
  4. Orioles, .289
  5. Cleveland, .299

The bottom five in runs scored:

  1. Miami, 48
  2. Detroit, 54
  3. Cincinnati, 58
  4. Pittsburgh & Colorado, 59
  5. Giants, 61

So, again, pretty close. We know the weirdness of a 9-inning game can explain why these charts don’t line up 1:1 (bad calls, replays, park dimensions, human error), but look at how much the data tells you that the bad baseball you’ve seen is actually bad.

To finish the set, here’s the top 5 for xwOBAcon:

  1. Nationals, .428
  2. Mets, .420
  3. Yankees, .420
  4. Rays, .413
  5. Braves, .412

Well, look at that. All those teams have very healthy offenses, and if they’re not at the top of the charts in terms of Runs Scored, they’re right there in terms of creating run opportunities. The quality of contact can matter a whole lot, especially if you’re a team that doesn’t get on base in other ways.

That bottom five, then, looks both a little surprising and not surprising:

  1. Reds, .319
  2. Angels, .327
  3. Rockies, .334
  4. Orioles, .335
  5. Giants, .340

Yeah, the Angels, outside of Mike Trout and with Justin Upton out of the lineup, really struggle to hit the ball with authority, and when you look at their wOBA (.288, 8th worst) you can see that overall, it’s not a good offense.

Interesting note about the Giants’ situation: their .340 xwOBAcon is significantly higher than their wOBAcon (that is, their actual weighted On Base Average on contact) of .313. This has to be because of Oracle Park, and it’s why they should absolutely consider redefining the dimensions.
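For a sense of scale, the gap between what the Giants' contact "should" have produced and what it actually produced works out like this. It is a trivial Python sketch using the two figures quoted above, and the variable names are just for illustration.

```python
# Giants' expected vs. actual wOBA on contact, as quoted above.
giants_xwobacon = 0.340
giants_wobacon = 0.313

# Round to three places, the precision the stats are reported in.
gap = round(giants_xwobacon - giants_wobacon, 3)
print(f"{gap:.3f}")  # 0.027
```

That is 27 points of wOBA on contact going missing somewhere between the bat and the box score.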

In any case, if you want to learn something new in this long baseball season, give Statcast a try. It might not make you feel better about the direction of your favorite team, but then again, it might actually provide some relief that you’re not just imagining things one way or the other.