Look: I am eager to learn stuff I don't know--which requires actively courting and posting smart disagreement.

But as you will understand, I don't like to post things that mischaracterize and are aimed to mislead.

-- Brad DeLong

Copyright Notice

Everything that appears on this blog is the copyrighted property of somebody. Often, but not always, that somebody is me. For things that are not mine, I either have obtained permission, or claim fair use. Feel free to quote me, but attribute, please. My photos and poetry are dear to my heart, and may not be used without permission. Ditto, my other intellectual property, such as charts and graphs. I'm probably willing to share. Let's talk. Violators will be damned for all eternity to the circle of hell populated by Roseanne Barr, Mrs Miller [look her up], and trombonists who are unable to play in tune. You cannot possibly imagine the agony. If you have a question, email me: jazzbumpa@gmail.com. I'll answer when I feel like it. Cheers!

Tuesday, May 6, 2014

What is a Good Batting Average?

I think we'd all agree that .300 is a good batting average, and that .200 isn't.  Also, that .280 is pretty good and .220 is pretty poor.  But where do you draw the line?  Let's say that a good batting average is anything above the mean for the league in that year, and the degree of goodness is defined by the difference from the mean.  Similarly, poor batting is hitting below the league mean.

Fortunately, mean batting average data is available at Baseball Almanac, going back to 1901, so the comparisons are easy to make.  "These totals include every player, every at-bat and every hit during the season listed."

Graph 1 shows the mean batting average for each league from 1901 through 2013.

Graph 1

There have been some pretty big changes over time as the tide shifts in the battle between hitters and pitchers.  Within the broad sweeps, there are also year-to-year variations that tend to run more or less parallel in the two leagues.  So, in general, whatever is happening, happens in both leagues.

To get a clearer picture of the broad sweeps, I took 13-year moving averages for both leagues.  [Yes - averages of averages of averages.]  This can be seen in Graph 2.
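For anyone who wants to try the smoothing at home, here is a minimal sketch of a 13-year moving average. It assumes a centered window (6 years on either side), which is only a guess at how the graph was built; a trailing window would shift the curve to the right. The function name and the sample numbers are made up for illustration and are not the Baseball Almanac data.

```python
def moving_average(values, window=13):
    """Centered moving average over an odd-sized window.

    Returns a list the same length as `values`; positions that do not
    have a full window around them are filled with None.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            smoothed.append(None)  # not enough neighbors on one side
        else:
            chunk = values[i - half : i + half + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Made-up league means, for illustration only.
sample = [0.250 + 0.001 * i for i in range(15)]
smoothed = moving_average(sample)
```

With 15 seasons of input, only the three middle seasons get a smoothed value; the first and last six stay empty because the window would run off the end of the data.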

Graph 2

This clarifies two things - times when either pitchers or hitters were gaining on the other, and times when one of the leagues had either better hitting or better pitching.  Batters became increasingly dominant from the mid-1910s through the early thirties.  Then pitchers took over until the early to mid seventies.  Afterward, batters gained ground until 2007.  That trend may now be reversing.

Back in the early days, the performance of the two leagues was, in gross terms, nearly identical.  There were big differences between the leagues in individual years, but a lot of year-to-year flipping eliminated dominance by one league over the other.  Since about 1950, there have been some robust separations.  Graph 3 shows this in detail.

Graph 3

From 1901 until 1938, a difference of .013 or more between the leagues was fairly common, occurring 8 times in those 37 years, 4 with the AL on top, and 4 with the NL on top.  Over that span the AL grand average was .002 above the NL average.  Since 1938 there have only been 3 occurrences of a .013+ difference.

From 1943 to 1972 the NL batters did better, by an average of .004 per year.  The greatest difference was .016 in 1966.  Since 1973 the AL has done better, outhitting the NL by an average of .007 per year, with peaks of .014 in 1989 and .015 in 1996.  These average differences in the three eras are indicated with heavy blue horizontal lines.  The yellow line is the average difference for the entire data set.  Since 2007, as batting averages in both leagues have dropped by large margins, the gap between the leagues has gotten smaller.
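The league-gap arithmetic above can be sketched in a few lines. This is an illustration under stated assumptions, not the spreadsheet behind Graph 3: the function names are hypothetical, the gap is taken as AL minus NL, and the three seasons of input are invented placeholders labeled 1, 2 and 3 rather than real years.

```python
def league_gaps(al_means, nl_means):
    """Return {season: AL mean minus NL mean} for seasons found in both."""
    return {s: al_means[s] - nl_means[s] for s in sorted(al_means) if s in nl_means}


def era_average(gaps, first, last):
    """Average gap over the seasons first..last, inclusive."""
    selected = [g for s, g in gaps.items() if first <= s <= last]
    return sum(selected) / len(selected)


def large_gap_seasons(gaps, threshold=0.013):
    """Seasons where either league led by `threshold` or more.

    Gaps are rounded to three places first, since batting averages are
    quoted to three places and float subtraction is not exact.
    """
    return [s for s, g in gaps.items() if round(abs(g), 3) >= threshold]


# Invented placeholder values, for illustration only.
al = {1: 0.270, 2: 0.260, 3: 0.255}
nl = {1: 0.255, 2: 0.258, 3: 0.259}
gaps = league_gaps(al, nl)
```

A positive gap means the AL hit better that season and a negative one means the NL did, so `era_average` over a range of seasons gives the kind of number the horizontal lines on the graph represent.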

It occurred to me that a good batter should be above the mean by some margin, and standard deviation ought to be a useful number.  I got the data here for the 2013 season, and arbitrarily eliminated players with fewer than 200 at bats.  The mean batting average of the 170 qualifying players was 0.2568, as compared to 0.2558 for the every-player, every-at-bat method cited above.  The standard deviation was .030.
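Here is a rough sketch of that calculation, to show why the two means can differ. The qualifier mean is a plain average of individual batting averages, so each player counts once; the every-player, every-at-bat figure is total hits over total at bats, so heavy users count more. The function name, the (hits, at bats) input format and the four sample players are all made up, and the population standard deviation is an assumption, since I can't say from the text which flavor was used; swap in `statistics.stdev` for the sample version.

```python
from statistics import mean, pstdev


def batting_summary(players, min_at_bats=200):
    """Summarize a list of (hits, at_bats) pairs.

    Players with fewer than `min_at_bats` at bats are dropped before the
    mean and standard deviation are taken; the pooled mean uses everyone.
    """
    qualified = [(h, ab) for h, ab in players if ab >= min_at_bats]
    averages = [h / ab for h, ab in qualified]
    pooled = sum(h for h, _ in players) / sum(ab for _, ab in players)
    return {
        "qualifiers": len(qualified),
        "mean": mean(averages),        # each qualifier counts once
        "std_dev": pstdev(averages),   # population SD (an assumption)
        "pooled_mean": pooled,         # total hits / total at bats
    }


# Invented players, for illustration only.
players = [(60, 200), (150, 500), (90, 400), (5, 50)]
summary = batting_summary(players)
```

In this toy example the 50-at-bat player is cut from the qualifier list but still drags down the pooled mean, which is the same direction as the small difference between 0.2568 and 0.2558.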

So, if you're lenient, a good batting average in the American League was anything over .257.  If you want to be more strict, use .287.  For 2 Std Devs above the mean, it's .317.
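Those cutoffs are just the mean plus zero, one or two standard deviations. A small sketch, using the 0.2568 mean and .030 standard deviation quoted above; the function and constant names are mine, chosen for illustration.

```python
MEAN_2013 = 0.2568  # mean of the 170 qualifying players
SD_2013 = 0.030     # standard deviation of their averages


def threshold(mean_avg, std_dev, k):
    """Batting average that sits k standard deviations above the mean."""
    return mean_avg + k * std_dev


def deviations_above_mean(avg, mean_avg, std_dev):
    """How many standard deviations a given average is above the mean."""
    return (avg - mean_avg) / std_dev


for k in (0, 1, 2):
    print(f"{k} SD above the mean: {threshold(MEAN_2013, SD_2013, k):.3f}")
# prints 0.257, 0.287 and 0.317, one per line

score = deviations_above_mean(0.300, MEAN_2013, SD_2013)
```

Run the other way, the second function says a .300 hitter was about 1.44 standard deviations above the mean that year.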

Miguel Cabrera's league-leading batting average of .348 for 2013 was a full 3 standard deviations above the mean.  Here is a list of all the qualifying AL players with batting averages over .300 in 2013.

Five full-season Tigers were on that list of 15 names.  They picked up Iglesias at the end of June 2013, then lost Peralta and Infante over the winter.  Iglesias is now out with stress fractures in both legs, and will miss the entire 2014 season.  He is expected to make a full recovery.

Afterthought: 2007 was the year MLB got serious about enforcing bans on PEDs.  Since then mean batting averages have dropped by about 20 points.  Coincidence?  I don't think so.


Jerry Critter said...

Interesting afterthought on the enforcement of the ban on PEDs. Do you think the increased use of PEDs accounts for the increase in the 70's?

Jazzbumpa said...

Jerry -

I haven't studied the issue, but I strongly suspect that that is the case.