## Archive for **April 2011**

## Rays Run Rampant

The tales of their untimely demise have been greatly exaggerated, etc., etc. But not just that, this, too: Desmond Jennings is still in the minors awaiting his shot to be the next Evan Longoria. Oh, yeah, and Evan Longoria is still awaiting his rehab assignment and eventual return to the team.

The Rays have been doing this, you see, without Manny Ramirez, Carl Crawford, Carlos Pena, Evan Longoria, multiple pieces of last year’s lights-out bullpen, and their Next Big Thing, Jennings.

And boy, oh boy, have they been doing it, and the words “Red Hot” may not be enough to cover it.

They’ve won five straight.

They blew the Twins out of the water, dropping the Twins to the worst record in the AL and the worst run differential in baseball. Good thing I’m nursing both a hangover and my hopes for Minnesota with a “small sample size” mantra-kinda’ thing: whenever I feel like I’m going to vomit I put my head between my knees and repeat “small sample size, small sample size, SMALL sample size,” over and over until the sensation passes.

As of April 10, the Rays were 1-8, and they had scored 20 runs while allowing 44 (Pythagorean winning percentage of .191).

Terrible.

But from April 11 through the 28th, the Rays are 13-3, and they have scored 88 runs while allowing only 46 (Pythagorean winning percentage of .766). Thrilling.
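For anyone who wants to check those two percentages, here is a minimal Python sketch. It assumes the 1.83 exponent rather than straight squaring, since that is what reproduces the .191 and .766 figures; the function name `pythagorean_pct` is just an illustrative choice.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.83):
    """Estimate winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Rays through April 10: 20 runs scored, 44 allowed
print(f"{pythagorean_pct(20, 44):.3f}")  # 0.191

# Rays from April 11 through April 28: 88 runs scored, 46 allowed
print(f"{pythagorean_pct(88, 46):.3f}")  # 0.766
```

Passing `exponent=2` gives the classic squared version, which lands a little further from .500 in both cases.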

They are good. They are fun to watch. Yes, their home games are still played inside an oversized tuna can. But they are a good team that is stocked with young talent and is fun to watch.

Watch the Rays.

## Quote of the Day, Red Sox Status-Wise

Ben Kabak at *Baseball Prospectus* nails it (several times) but this stood out to me as a quite balanced assessment:

Over their first 23 games, the Sox clearly haven’t been a team hitting on all cylinders. With a collective .237/.329/.372 line contrary to pre-season expectations, with Carl Crawford posting a .200 on-base percentage, and with just one home run from Adrian Gonzalez, this could be a team waiting to erupt. On the other hand, it could be a team vulnerable to left-handed pitching with no offensive production from its catching spot, a weak offensive bench, and streaky pitchers.

The stuff Kabak writes about Rafael Soriano’s struggles deserves reflection:

Soriano’s no-good, very bad month has served as a disappointing introduction to New York. He has thrown 10.1 innings with some heinous results. He has allowed 12 hits and eight walks while striking out just seven. His hits per nine innings are up to 10.5 from 5.2 last year, and although the 2011 Yanks’ defense is worse than the 2010 Rays’, that can’t account for the entirety of the difference. Soriano’s strikeout totals are exhibiting a downward trend as well. After peaking at 12.1 per nine IP in 2009, he struck out 8.2 per nine innings last year and is sitting on 6.1 this year—a full 3.4 strikeouts below his career average. After giving up 12 earned runs last year and walking just 14 batters, he’s now at nine and eight, respectively, on the season. All of this can be yours for the bargain price of $12 million a year.

The Yankees are proving why paying big for bullpen help is folly. Relief pitching is fungible (in terms of Rauch, specifically, look at this season’s Toronto Blue Jays pitching statistics), or, like running back in the NFL: find some young guys instead of paying big dollars.

## Futility and Sabermetrics

So, there is a new blog at the (Minneapolis) *Star-Tribune* website called Sabermetrics 101.

The first column discussed Pythagorean projections, and, boy, what an uninspiring response.

The first comment, by callmestupid, reads:

I don’t get why that’s cool…..If you already know the outcome….Runs scored for the year and runs against….Then you already know the games won/lost…So why do the math? It seems to me you can tweak it till you come up with something that works. Give us something that can predict wins and losses before they happen and I’ll be impressed

I responded to that comment with this long ‘graph:

To callmestupid: it matters and is cool for a couple of reasons. (1) This sort of “pythagorean” projection is a better predictor of a team’s record in the following season than their actual record is (see 2009-10 Seattle Mariners; in 2009 they way overperformed their runs scored/runs allowed numbers and thus their “regression to the mean” in 2010 should have been expected rather than having been a surprise). (2) At any point in the season, if you wish to make a projection of how well a team is actually playing, this formula gives you/us a basis for evaluation, though the later in the year you wait, the more accurate the projection (small sample size caveats apply); such a run-differential-based projection better predicts the relative quality of teams than their actual record–due to, among other things, luck. Also, John, the formula is more accurate if you use 1.83 as the exponent rather than 2 (squaring). That is, you raise runs scored and runs allowed to the power of 1.83 rather than simply squaring. This is established in the literature, but see the Hardball Times website for a better explanation.

My responses elicited the following comment from MyjahLeesa:

How does this actually predict anything? All I see here is a measure of things that have already happened–that’s not a prediction! It doesn’t give me any new information, it’s just putting the information in a slightly different form… I think that is the inherent limitation of statistics, they are all backward looking. They can’t tell you how all the tangible aspects of the game are causing these numbers just by multiplying the numbers together in different ways. And I think understanding the tangible aspects are key to making more useful predictions anyway. To me, calling these numbers predictions is a little like the tail wagging the dog, no?

Okay, I guess. But what kills me is that my explanation gets a thumbs down and both of the other comments get thumbs up. (!?!)

As you know, Pythagorean projections are better predictors of future performance than actual winning percentage over the course of the rest of the season (at least, those really smart kids over at the always awesome *Baseball Prospectus* tell me this is so; their info tends to pan out so I trust them; also there is some formal statistical support for this–see the section called “Theoretical Explanation”).

Hey, I can’t tell you *how* electricity makes my computer operate, but I *can* tell you that my computer works a lot better when it’s plugged in than when it’s not. In the same way, I can’t explain *how* the Pythagorean projection provides accurate predictions (within a standard error of +/- 4 wins), but I *can* tell you that ever since I started playing around with it in, oh, 1987, it has done a pretty good job of predicting wins and losses.
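To make that concrete, here is a rough sketch of turning run totals into expected wins. It reuses the Rays numbers from the post above (20 + 88 runs scored, 44 + 46 allowed, over 9 + 16 games) and assumes the 1.83 exponent; the function names are my own, not anything official.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.83):
    """Estimate winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games, exponent=1.83):
    """Scale the estimated winning percentage to a number of games."""
    return games * pythagorean_pct(runs_scored, runs_allowed, exponent)


# Rays through April 28, both stretches combined
rs, ra, games = 20 + 88, 44 + 46, 9 + 16

print(f"{pythagorean_pct(rs, ra):.3f}")       # estimated winning percentage
print(f"{expected_wins(rs, ra, games):.1f}")  # expected wins over 25 games
```

That works out to roughly a .583 clip, or about 14.6 expected wins in 25 games, against an actual record of 14-11. Twenty-five games is a small sample, so treat it as an illustration rather than proof.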

The response to my comments is sort of funny and sort of sad. The answerer says “it’s all backward looking,” which is kind of true but–deep breath–*so are ALL empirical explanations and the subsequent predictions they generate.* That is how social scientific theories are justified, for example: you collect your data set, evaluate it, and then you see if historical conditions conform to your hypotheses. If so, then they are considered more likely to be true in the future. And in any event, they are *descriptions*, and with accurate descriptions you have a better chance of making a valid prediction. Oh, whatever….

Where the rubber really hits the road, *arguing that the past isn’t relevant to the future completely invalidates the collection of ANY baseball statistics.* That is, every time you hear someone–a commentator, someone in your fantasy baseball league, whoever–say, “This guy is only hitting .250 this year, but he’s a .320 career hitter,” they are assuming that past performance predicts future performance. Any time someone says, “This pitcher had a 2.35 ERA last year, but this year it’s 4.20, so what gives,” s/he is assuming that the past tells us something about what we should expect in the future. So, if you stick to the whole “the past is meaningless for the future” position, you reject ANY sort of statistical analysis, like, say, actuarial tables, economic forecasts, expectations that today will be like yesterday, etc. No one lives their lives that way.

[Oh, regarding that pitcher with the ERA that blew up: one should probably look at the pitcher’s BABIP to see if his defense is letting him down, and then take a look at his defense-independent pitching stats, like FIP and xFIP.]

While the map is not the territory, and statistics can’t provide the ability to make perfect predictions, there is a difference between a wild-ass guess and making an inference rooted in some knowledge about how things are and how they’ve been.

Therefore, I consider the response to my response kind of, to be as nice as possible, *ill-conceived*. To say that I–and a host of other people way, way, way more talented than I will ever be–am letting the tail wag the dog is simply, uh, silly. No, those are not just some random numbers that we’re multiplying (first of all, a very well-trained spreadsheet does all the math). They are runs scored and runs allowed, the building blocks of wins and losses.

I am now getting off my soapbox and going to listen to a Jonah Keri–Dave Cameron podcast, two guys who are down with sabermetrics and worth every baseball fan’s attention.

## The Importance of the First Pitch Strike

I decided to find out how important getting ahead of hitters actually is, so I did a study of first pitch strikes compared to first pitch balls.

Bill Felber’s *The Book on the Book* got me thinking about this. Felber did a study of 5,000 pitcher/batter interactions, looking at results in various pitch counts. Thomas Boswell of *The Washington Post* cited Felber’s study in an article about watching baseball while paying more attention to the count, since the count can tell you a lot of things. Boswell focuses on one-ball, one-strike counts and what follows them, since 1-2 is a lot different for both pitchers and batters than is 2-1.

However, my focus is just on what happens after that first pitch. That is, what are the results after a 1-0 count compared to the results after an 0-1 count. In short, how important is the First Pitch Strike?

Now, I started to do this study the really, really hard way. I looked at MLB Gameday for every individual game and began recording results in Microsoft Excel. One day’s games did me in. Too much tedium. Instead, on a lark, I visited Baseball-Reference.com and found this: tabulated data for every pitch count and what happens after reaching a particular pitch count. By parsing this data I was able to find out how results differed between 1-0 counts and 0-1 counts.

First of all, the overall data for 2010 revealed that hitters went .259/.330/.406 with a wOBA of .346. Walks ended 9% of plate appearances and strikeouts ended 18.27%, which gives us a walk to strikeout ratio (BB/K) of 0.49.

In plate appearances that are determined by one (the first) pitch, batters did really well: .345/.350/.555 (AVG/OBP/SLG, also called the “slash line”). (Note also that the OBP is higher than the AVG because some batters manage to get hit by pitches.) Batters have a wOBA of .413 and an OPS of .905 in these plate appearances. Obviously there is no BB/K ratio since a batter can neither walk nor strike out in one pitch.

After 1-0 counts, first pitch balls, hitters produced the following statistical line: .273/.391/.436, a wOBA of .389, and an OPS of .827. The walk rate was 15.91% and the strikeout rate was 14.47%, producing a BB/K ratio of 1.10.

After 0-1 counts, first pitch strikes, hitters produced at the following rates: .228/.272/.348, with a collective (and woeful) wOBA of .294, with a similarly woeful OPS of .620. The walk rate for these batters was 4.92% and their strikeout rate was 25.67%. The BB/K ratio for these batters was 0.19 (yes, 0.19).
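The BB/K figures are nothing fancier than walk rate divided by strikeout rate. A quick sketch, using the 2010 rates quoted above (the dictionary labels are mine):

```python
def bb_per_k(walk_rate, strikeout_rate):
    """Walk-to-strikeout ratio from two rates expressed in the same units."""
    return walk_rate / strikeout_rate


# (walk rate %, strikeout rate %) for 2010
splits_2010 = {
    "all plate appearances": (9.00, 18.27),
    "after 1-0": (15.91, 14.47),
    "after 0-1": (4.92, 25.67),
}

for label, (bb, k) in splits_2010.items():
    print(f"{label}: {bb_per_k(bb, k):.2f}")
# all plate appearances: 0.49
# after 1-0: 1.10
# after 0-1: 0.19
```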

Now, let us assume that every time a plate appearance resolves on the first pitch that the pitch is a strike. This is an obviously untrue assumption since (a) HBP’s don’t happen on strikes, and (b) hitters such as Pablo Sandoval swing at first pitches that are not in the strike zone. But let’s make that assumption for the moment; after all, if the plate appearance resolves in one pitch, the batter almost always has swung at the pitch, and presumably, batters attempt to swing only at pitches in the strike zone, that is, pitches which are strikes.

If we combine the data from one-pitch plate appearances, which are very good for hitters, with the data from plate appearances that resolve after 0-1 counts (first pitch strikes), we *still see* that hitters performed less well than they did following 1-0 counts. The combined data from one-pitch plate appearances and those following 0-1 counts indicate a collective hitter slash line of .250/.286/.387, with a wOBA of .315 and an OPS of .673. Look at the OBP! .286? That is terrible, even for a utility middle infielder with broken glasses. By contrast, following 1-0 counts, hitters’ OBP figure is .391. Looking further at the first pitch strike data, we find a walk rate of 4.02% and a strikeout rate of 21.02%, which creates a BB/K ratio of 0.19.

2010 data is as follows:

Total Data shows us 63,516 AB, 16,458 Hits, .259 AVG/.330 OBP/.406 SLG, .346 wOBA, .736 OPS, 9.00% BB rate, 18.27% K rate, 0.49 BB/K.

First Pitch Resolution comes up great for hitters with 7,283 AB, 2,515 Hits, .345 AVG/.350 OBP/.555 SLG, .413 wOBA, .905 OPS, and nonexistent BB and K rates and a nonexistent BB/K ratio.

First Pitch Balls create the following results: 24,637 AB, 6,735 Hits, .273 AVG/.391 OBP/.436 SLG, .389 wOBA, .827 OPS, with a 15.91% BB rate and a 14.47% K rate, for a BB/K ratio of 1.10.

First Pitch Strikes are followed by these results: 31,596 AB, 7,208 Hits, .228 AVG/.272 OBP/.348 SLG, .294 wOBA, .620 OPS, with a 4.92% BB rate and a 25.67% K rate, for a BB/K ratio of 0.19.

Combining the results for First Pitch Resolution with those for First Pitch Strikes gives us the following numbers: 38,879 AB, 9,723 Hits, .250 AVG/.286 OBP/.387 SLG, .315 wOBA, .673 OPS, a walk rate of 4.02% and a K rate of 21.01%, for a BB/K ratio of 0.19 (which is a K/BB ratio of 5+!).
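If you want to check the combining step yourself, at-bats and hits simply add across splits, and batting average falls out of the sums. Here is a small sketch using the 2010 AB and hit totals listed above. The `Split` class is a made-up helper, and it only handles AVG, because OBP, SLG and wOBA need walks, HBP and total bases that aren’t listed here.

```python
from dataclasses import dataclass


@dataclass
class Split:
    """At-bats and hits for one group of plate appearances."""
    at_bats: int
    hits: int

    @property
    def avg(self):
        return self.hits / self.at_bats

    def __add__(self, other):
        # Counting stats are additive; rates must be recomputed from the sums.
        return Split(self.at_bats + other.at_bats, self.hits + other.hits)


first_pitch_resolved = Split(7283, 2515)
first_pitch_ball = Split(24637, 6735)
first_pitch_strike = Split(31596, 7208)

combined = first_pitch_resolved + first_pitch_strike
total = combined + first_pitch_ball

print(combined.at_bats, combined.hits, f"{combined.avg:.3f}")  # 38879 9723 0.250
print(total.at_bats, total.hits, f"{total.avg:.3f}")           # 63516 16458 0.259
```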

Thus *even if* your pitchers are giving up all that good stuff to hitters on plate appearances resolved on one pitch, **even then**, it appears that first pitch strikes are simply awesome for the pitching team. The on-base percentage differential is huge (.286 vs. .391 when the count goes 1-0), and it is attributable largely to the far lower proportion of walks that your pitching allows.

2010 data is no aberration. After accumulating the information back through the 2006 season, it proves remarkably consistent.

In plate appearances resolved on the first pitch: .341/.346/.552, .409 wOBA, .898 OPS.

In plate appearances begun with a First Pitch Ball (resolved after a 1-0 count): .279/.394/.456, .396 wOBA, .850 OPS, about 15.5% BB, 13.88% K, 1.12 BB/K.

In plate appearances begun with a First Pitch Strike (resolved after a 0-1 count): .236/.279/.361, .301 wOBA, .637 OPS, 4.82% BB, 25.05% K, 0.19 BB/K.

In plate appearances either resolved in one pitch or begun with a First Pitch Strike: .257/.292/.399, .332 wOBA, .684 OPS, 3.88% BB, 20.17% K, 0.19 BB/K.

The walk to strikeout ratio (BB/K) is the most, er, striking thing. After 1-0 counts, batters walk more often than they strike out, while after 0-1 counts batters walk less than a fifth as often as they strike out. That is a huge, huge difference. (Maybe it’s not the actual bases on balls that give managers their gray hairs; maybe it’s the first pitch balls that give them those gray hairs, as they spend the rest of the plate appearance envisioning the walks and other bad things that will follow.)

The conclusion is simply put: **It pays to throw First Pitch Strikes.**

Even though you are going to have some guys get on base or hit bombs because they are first pitch swinging, the on-base percentage difference between plate appearances starting with first pitch balls (.394) and those that don’t (.292) is simply huge. This represents a lot of runs saved in plate appearances that don’t start with balls.
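To put that gap in countable terms, here is a back-of-the-envelope sketch of the extra baserunners it implies. The 1,000 plate appearance baseline is an arbitrary round number I picked for illustration, and this counts baserunners only; turning them into runs would take run values that aren’t part of this data.

```python
def extra_baserunners(obp_high, obp_low, plate_appearances=1000):
    """Additional baserunners implied by an OBP gap over a number of PAs."""
    return (obp_high - obp_low) * plate_appearances


# 2006-2010: OBP after a first pitch ball vs. everything else
print(round(extra_baserunners(0.394, 0.292)))  # 102
```

That is roughly one extra baserunner for every ten plate appearances that start 1-0.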

In fact, if you look at the summarized data from 2006 to 2010, you will see that the **on-base percentage** in First Pitch Strike plate appearances (.279) is about the same as the **batting average** in First Pitch Ball plate appearances. The OBP in First Pitch Ball plate appearances, meanwhile, is a “robust” .394, which is All-Star-quality good (or bad, from the pitcher’s perspective). Runners simply do not get on base much after a First Pitch Strike.

An additional thing to consider is that many plate appearances that are resolved after one pitch are swings that occur when the batter thinks he’s gotten “his pitch,” that is, when the pitch is in the location where the batter is looking for a pitch to hit. This means that the first pitch swings that resolve plate appearances often happen when the batter thinks he has the best chance of doing something with the pitch. I can’t quantify how often this happens, but I *can* point out that batters are said to approach the first pitch in such a way.

Results obviously vary from batter to batter. After all, some guys don’t mind hitting with two strikes. Others mess up in 3-1 situations. But, *as a general rule*, the First Pitch Strike seems to be the best pitch to throw.

On the other hand, James Shields’ 2010 provides perhaps the most obvious counterexample/counterargument to what I’ve said, but still…the numbers are pretty conclusive.