Size as a Statistical Field: Part I
by David Luciani
published January 19, 2005

Occasionally I pull out a forecast that ultimately proves to be right despite expectations and, more importantly, despite what readers would consider to be statistical data pointing the other way.  This seems to puzzle some readers, who write to ask how someone as statistically oriented as I am could have projected the player to do so well when there was no data to support it.  In a similar spirit, a former big league GM told me that the problem with relying heavily on statistics, as I no doubt do, is that they can't capture or consider things like how fast a pitcher throws, the size of a player or his potential size.  I argued quite the opposite: these very things are statistics, and height, weight and other such measurements are a so-called "statistical field" that we can analyze like any other.  No doubt, a player's projected eventual height and weight (with weight being the more likely to change except in the youngest of players) is something you need an expert, whom we'll call a scout, to project, but the simple fact is that if you can quantify it in either a numerical or a categorical way, then it is already a statistic.  If there is a so-called "secret" to my forecasting methods, and I don't think I've been secretive about it, it is that while I do emphasize statistics in my forecasting, I'm looking at a player's performance in a different way than a person looking in from outside statistical analysis might expect.  I tried to highlight an example of this in this past week's "Ask David" when I gave a bit of insight into my Melvin Mora forecast for 2005, though even that is an oversimplification, since it attempts to respond to the reader's question in a single essay.

Admittedly, I tend to be overly defensive of those of us in the so-called statistical crowd, though I can no doubt be considered a crossover at times in that I often include scouting data in my forecasts, but I quantify it to make it statistical.  This way I know, for example, what percentage of hitters who get an "A+" for power in my ratings actually do end up hitting home runs.  It is far too common for those on the extreme scouting side of the debate to quickly conclude that the statistics we're looking at exclude much of the meaningful, good data they collect.  In fact, I not only support its collection but argue that it is crucial to the forecasting exercise, particularly the long-term forecasting of players who are five or more years away from their eventual prime.

The size of a player in particular has always fascinated me, and I have begun to compile some summary data for a new book I am writing that attempts to explain how I do my forecasting.  Some readers have questioned how I can suddenly downgrade a player's projected stolen bases when he shows up for spring training far overweight.  The data may be for a book, but I see no reason to hold it back from Baseball Notebook's readers, because it applies to much of the information you see here at the site.

Unlike analyzing the effects of minor league competition levels, parks, age and other elements we've already succeeded at neutralizing, the problem I have in analyzing the effect of the actual size of a player is that there isn't good data readily available on the changes that take place in a player's size within a season, particularly his weight.  Unfortunately, there isn't even data on how much weight has changed over player careers.  The reason I prefer to see the changes within a season is that I am then not confusing the effect of some other changing element, such as age, with the effect of the change in a player's size.  So, I approach this summary of data with both caution and a caveat:  This is not the way I would normally prefer to analyze data but it's the best I can do with the data I have available.

I finally decided that since I don't have the data I want, the best I can do until I acquire that data is to try to summarize the statistical performances of all players at different heights and weights to see if any patterns emerge.  Would the taller pitchers have better fastballs because of their increased leverage?  Would the heavier hitters be more likely to be home run hitters because of their strength and ability to resist the incoming fastball?  If a former stolen base champ shows up for spring training looking like he lived off a winter of double cheeseburgers, will he be able to run as he always has?

While these questions remain open, I think the reader will greatly benefit from seeing some of the far too obvious patterns that emerged in this, only the first round of a much-needed lengthier study on the issue.  Because I don't have the changing weight (and possibly changing height for very young players) over a player's career, I had to settle for the listed heights and weights available in commonly published sources and databases.  That means I don't have reliable data for a player like Babe Ruth, who started as a much skinnier young pitcher and developed into a heavier and more powerful hitter over his career.  In rare cases, I was able to plug in those known changes from biographical information but in general I just didn't have that data.

Ruth isn't a factor here anyway, as I also decided to consider only players from 1969 forward so that we are essentially looking at the current game, conceding that the 1970s are somewhat different from the 1990s but similar enough that I wanted to include them.  There's more than thirty years of data to work with here and it did yield some obvious patterns for purposes of this summary.

Finally, I decided to present all data in the form of either "per plate appearance" for hitters or "per batter faced" for pitchers because it makes it easy to compare the different groups.
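
As a rough sketch of what that rate-based presentation involves (not the author's actual code), the snippet below rescales a group's raw counting totals to a common number of plate appearances.  The function name `per_550_pa`, the dictionary layout and the sample totals are all assumptions invented for illustration; 550 is the denominator used in the tables later in this article.

```python
def per_550_pa(totals, pa_key="PA", scale=550):
    """Rescale raw counting totals to a common number of plate appearances.

    `totals` maps stat names to raw counts and must include the
    denominator under `pa_key` (plate appearances for hitters; batters
    faced could be passed the same way for pitchers).
    """
    pa = totals[pa_key]
    if pa <= 0:
        raise ValueError("the denominator must be positive")
    # Multiply before dividing so whole-number results stay exact.
    return {
        stat: round(value * scale / pa)
        for stat, value in totals.items()
        if stat != pa_key
    }


# Hypothetical group totals, made up purely to show the arithmetic.
group_totals = {"PA": 11000, "HR": 260, "BB": 980, "SB": 120}
rates = per_550_pa(group_totals)
print(rates)
```

Putting every group on the same denominator is what lets a group with thousands of player seasons sit next to a much smaller one in the same table.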

In part one, we'll look at hitting and next time out, I'll share with you some of the data we found for pitchers.

Let's start with just weight for a hitter.  I took all hitting data for all players since (and including) the 1969 season and split out performances by weight.  There is plenty of data for each group, and some of the peak groups, like the 190-199 range, have more than eight thousand seasonal performances considered.  I decided to break weight into twelve distinct categories as follows:

149 lbs or less
150-159
160-169
170-179
180-189
190-199
200-209
210-219
220-229
230-239
240-249
250+
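
The twelve categories above amount to a simple binning rule.  Here is a minimal sketch of it, assuming listed weights are whole pounds; the function name `weight_group` is hypothetical, and the labels are chosen to match the ones used in the summary table.

```python
def weight_group(weight_lbs):
    """Map a listed weight in pounds to one of the twelve groups."""
    w = int(weight_lbs)  # listed weights are assumed to be whole pounds
    if w <= 149:
        return "<=149 lbs"
    if w >= 250:
        return "250+"
    low = (w // 10) * 10  # lower bound of the ten-pound band
    return f"{low}-{low + 9}"


for sample in (145, 150, 195, 249, 260):
    print(sample, weight_group(sample))
```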

Here then is a summary of how each group performed per 550 plate appearances.  The Avg, Obp and Slg columns are computed from the actual non-rounded totals, so they are more precise than what you would get from the rounded counts displayed here:

Weight Group   AB    H   2b   3b   HR  HBP   BB    K   SB   CS   SH   SF   Avg   Obp   Slg
<=149 lbs     489  126   19    5    3    2   46   73   26   10   10    3  .258  .324  .336
150-159       497  129   19    4    4    3   38   51   19    8    9    3  .259  .314  .338
160-169       489  126   20    4    6    3   47   65   17    7    8    4  .259  .324  .351
170-179       492  128   22    4    8    3   45   72   14    6    7    4  .261  .324  .370
180-189       490  127   22    3   11    4   47   77   12    6    6    4  .259  .326  .383
190-199       488  128   23    3   13    3   49   82    8    4    5    4  .263  .332  .405
200-209       488  127   23    3   16    3   50   90    6    3    5    4  .260  .329  .416
210-219       488  125   23    2   18    3   49   99    5    3    5    4  .256  .326  .423
220-229       485  126   24    2   20    4   53   98    4    3    4    4  .260  .335  .438
230-239       488  128   24    2   20    5   47  111    5    3    6    4  .263  .331  .444
240-249       477  125   23    2   24    3   61  113    5    3    3    5  .262  .346  .469
250+          457  131   24    1   27    3   83   90    2    1    1    6  .288  .397  .522
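
For readers who want to check a row, the sketch below derives the rate columns from the counting columns.  It assumes the standard definitions (average is hits per at-bat, on-base percentage counts hits, walks and hit-by-pitches over at-bats, walks, hit-by-pitches and sacrifice flies, and slugging is total bases per at-bat) and treats plate appearances as AB + BB + HBP + SH + SF.  The article doesn't spell out its formulas, so those definitions and the function name `rate_stats` are assumptions.

```python
def rate_stats(row):
    """Derive PA, Avg, Obp and Slg from one row of counting stats."""
    singles = row["H"] - row["2b"] - row["3b"] - row["HR"]
    total_bases = singles + 2 * row["2b"] + 3 * row["3b"] + 4 * row["HR"]
    on_base = row["H"] + row["BB"] + row["HBP"]
    obp_chances = row["AB"] + row["BB"] + row["HBP"] + row["SF"]
    return {
        "PA": obp_chances + row["SH"],  # sacrifice bunts count as PA only
        "Avg": row["H"] / row["AB"],
        "Obp": on_base / obp_chances,
        "Slg": total_bases / row["AB"],
    }


# The 250+ row, copied from the table (rounded counts).
heaviest = {"AB": 457, "H": 131, "2b": 24, "3b": 1, "HR": 27,
            "HBP": 3, "BB": 83, "SH": 1, "SF": 6}
stats = rate_stats(heaviest)
print(stats["PA"], f'{stats["Avg"]:.3f} {stats["Obp"]:.3f} {stats["Slg"]:.3f}')
```

Run on the rounded 250+ row, the counts add up to 550 plate appearances and the three rates land within about two points of the published .288/.397/.522, which is what you would expect if the published rates come from unrounded totals.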

You can't help but look at the chart and be amazed at the obvious, and in some cases unsurprising, patterns that emerge.  The heavier a player gets, the more his home run skill improves but, just as importantly, his walk ability goes up.  One reason the walk ability could be improving is that as a player becomes more powerful, pitchers are understandably more reluctant to pitch to him, and so they nibble at the corners or are careful enough to throw him many more balls than they would a light hitter.  The increase in the doubles column isn't as pronounced.  The stolen base column is not only obvious, but it's critical to appreciate if you're to be a good forecaster.  Basically, the heavier a player becomes, the less frequently, on average, he's going to steal bases compared to what he would if he were lighter.  The batting average actually increases slightly as the player gets heavier, but this pattern isn't so obvious.  Triples, clearly, drop.  The light hitters, probably because they aren't considered to be as important to the offense, are asked to bunt more often.  The big hitters, by virtue of their power, get more sacrifice flies to go with their home runs.  In terms of strikeouts, there is a tendency to strike out more as players get heavier, but the pattern there is a bit sketchy.

A thinking reader might question why clubs don't actually ask players to gain weight, since it is clear that the more valuable performances are coming from the heavier players.  Well, they often do and in other cases, they shouldn't.  Remember that the player has to field a position, and this analysis in no way shows you the negative effects a weight gain would have on a player's range, a negative defensive effect that could more than offset any offensive gain from the player.  I do think that weight gain shouldn't be as frowned on as it is, especially at defensive positions, like first base, that don't require the same range, but in general there are many other components to a player's game and a weight gain isn't an instant recipe for success.

Now players can control their weight but they can't control their height.  Again, I've broken out the heights into categories as follows, this time thirteen groups that ensure we get many seasonal performances in each group:

5'6" or less
5'7"
5'8"
5'9"
5'10"
5'11"
6'
6'1"
6'2"
6'3"
6'4"
6'5"
6'6"+
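
The height groups can be expressed the same way.  This is a minimal sketch assuming heights are stored as whole inches; the function name `height_group` is hypothetical, and the labels follow the ones used in the summary table.

```python
def height_group(height_inches):
    """Map a listed height in inches to one of the thirteen groups."""
    h = int(height_inches)  # listed heights are assumed to be whole inches
    if h <= 66:
        return "<=5'6\""
    if h >= 78:
        return "6'6\"+"
    feet, inches = divmod(h, 12)
    # An even six feet is written 6' with no inches part.
    return f"{feet}'" if inches == 0 else f"{feet}'{inches}\""


for sample in (65, 70, 72, 77, 80):
    print(sample, height_group(sample))
```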

Again, here's the summary of each group per 550 plate appearances:

Height Group   AB    H   2b   3b   HR  HBP   BB    K   SB   CS   SH   SF   Avg   Obp   Slg
<=5'6"        496  120   17    4    3    4   37   67   21    8   10    3  .241  .298  .311
5'7"          470  124   23    3    9    3   68   63   22    7    5    4  .265  .358  .383
5'8"          487  132   21    5    7    4   49   64   21    8    7    4  .270  .339  .376
5'9"          490  129   21    3    7    4   46   64   17    7    6    4  .262  .328  .360
5'10"         488  127   21    4    9    3   48   69   14    6    6    4  .260  .328  .371
5'11"         490  129   22    4    9    3   47   69   12    6    6    4  .264  .329  .379
6'            489  129   23    3   12    3   48   76   10    5    5    4  .263  .331  .394
6'1"          490  126   22    3   13    3   47   86    8    4    5    4  .258  .325  .398
6'2"          489  128   23    3   14    3   49   85    8    4    5    4  .261  .330  .409
6'3"          489  126   23    3   17    3   47   96    7    4    6    4  .258  .325  .418
6'4"          486  122   23    2   16    3   49  104    5    3    7    4  .251  .321  .407
6'5"          483  121   22    2   17    3   51  108    6    4    9    4  .251  .324  .411
6'6"+         484  117   20    2   22    3   50  123    7    3    9    4  .242  .314  .428

Again, you can see obvious patterns, though I believe height is still less of a factor than weight and of course, both work in combination with each other to create the package we call the player.  In this case, as a player gets taller his power improves and, surprisingly, his walk ability improves, but his batting average and OBP decline.  With a larger strike zone working against him, it's possible that he gets into deeper pitcher's counts and has to swing at pitches he doesn't like, leading to the lower batting average, but that is difficult to reconcile with the increase in walks.  It's also possible that the taller players simply aren't players who try to hit for average, preferring the home run to the single.  It's also difficult to tell from this chart how much height is really impacting things because, on average, the taller a player is, the heavier he is and so weight can skew these results.
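
One common way to reduce that skew, offered here only as a sketch and not as something the article itself does, is to cross-tabulate: compute the rate within each combination of height and weight band, then compare heights only among players of similar weight.  The record layout, the function name `hr_per_550_by_cell` and the sample records are invented for illustration, and the simplified ten-pound bands ignore the open-ended groups used in the tables above.

```python
from collections import defaultdict


def hr_per_550_by_cell(records):
    """Home runs per 550 PA for each (height, ten-pound weight band) cell."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [home runs, plate appearances]
    for rec in records:
        cell = (rec["height_in"], (rec["weight_lbs"] // 10) * 10)
        totals[cell][0] += rec["HR"]
        totals[cell][1] += rec["PA"]
    return {cell: 550 * hr / pa for cell, (hr, pa) in totals.items() if pa > 0}


# Hypothetical player seasons, made up purely to show the mechanics.
records = [
    {"height_in": 72, "weight_lbs": 195, "PA": 600, "HR": 15},
    {"height_in": 72, "weight_lbs": 192, "PA": 500, "HR": 11},
    {"height_in": 75, "weight_lbs": 198, "PA": 550, "HR": 14},
    {"height_in": 75, "weight_lbs": 225, "PA": 550, "HR": 24},
]
cell_rates = hr_per_550_by_cell(records)
for cell in sorted(cell_rates):
    print(cell, round(cell_rates[cell], 1))
```

Comparing the two heights within the same weight band holds weight roughly constant, so any remaining difference is less likely to be weight in disguise.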

In any case, it's undeniable that weight and height are factors in a player's performance and if we treat them as a statistical field, we can and should apply them to our forecasts.  Next time out, I'll show you some performance data for the pitchers.

Baseball Notebook is an online editorial publication that actively reports on baseball at all levels.  All materials, unless otherwise stated, are © 1998-2010 Baseball Notebook.  All historical statistics displayed are obtained through licensed affiliation with Baseball Info Solutions.  As an independent editorial publication, Baseball Notebook is not sponsored by, authorized by, affiliated with or associated with any other mentioned entities, including those who own reserved trademarks that may be used for descriptive purposes only.  The terms Major League Baseball, World Series, National League, American League, All-Star Game and the names of the Major League Baseball teams are trademarks of Major League Baseball Entities.  Baseball Notebook does not offer, create, administer or endorse any fantasy baseball game or competition.