Making Contact and Margins of Error
by David Luciani
published February 23, 2005
Long-time readers of my forecasts know how much emphasis I put on breaking down
a player's actual skills when assembling a forecast. I gave examples of this in
recent "Ask David" columns when discussing the forecasts for Melvin Mora and
Andruw Jones. Last year, I gave similar examples early in the season when
explaining my continued optimism for Ichiro Suzuki and my doubts that Ron
Belliard could maintain a hot start.
Often, I will refer to a player's rate of making contact. It's not a statistic
I invented, though I was under that false and no doubt arrogant impression for a
few years until I found that Bill James had talked about the concept before me.
The contact rate is a crucial aspect of a hitter's forecast. It's the one skill
that a hitter can show early in the season, sometimes even in spring training,
to demonstrate that he is markedly improved, and an improvement there tends to
bring all of his other numbers up along with it.
First let me explain the simplified contact rate, which is calculated as
(AB-K)/AB. As I say, I didn't invent this and I'm not sure where to give credit,
as I thought I had come up with it independently and have since seen both Bill
James and Ron Shandler, among others, use the statistic in print. I suspect I
first heard about it from James. The idea is simple: the calculation shows you
the percentage of at bats in which a player puts a ball in play. It excludes
other events like walks and times hit by pitch, and it's a crucial element of
the forecasting exercise.
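As a minimal sketch of the (AB-K)/AB calculation, here it is in Python. The function name `contact_rate` is my own label for illustration, and the inputs are Ray Durham's at bats and strikeouts from the table later in this article.

```python
def contact_rate(at_bats: int, strikeouts: int) -> float:
    """Simplified contact rate: share of at bats that do not end in a strikeout."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return (at_bats - strikeouts) / at_bats


# Ray Durham's 2003 and 2004 lines (AB, K) as listed in this article
durham_2003 = contact_rate(410, 82)
durham_2004 = contact_rate(471, 60)

print(f"{durham_2003:.1%} -> {durham_2004:.1%}")  # prints "80.0% -> 87.3%"
```

Walks and hit-by-pitches never enter the calculation, since neither counts as an at bat.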
One benefit of this category is that, for a hitter, it's one of the first for
which you can draw quick conclusions about a change in skills, because the
sample size (at bats) gets so large so quickly. I have emphasized that one key
difference between my forecasts and others is that I don't often see the
denominator as being at bats for other categories. I recall explaining in this
space in May of 2004 why Ichiro Suzuki was not performing outside the margin of
error when I continued to project him to break out of a season-long batting
average slump. Ichiro even went on to far exceed my expectations later.
Basically, when it comes to categories like hits, I am often looking upon the
individual hit types like singles, doubles, triples and home runs, separately,
and the denominator for those categories is not at bats but rather categories
like the number of times a player puts a ball in play.
To further emphasize the importance of the contact rate, take the example of
a hitter whose real skill is to hit home runs 8% of the time he puts the ball in
play. If his pure power skill stays intact but his contact rate improves from
70% to 80% from one season to the next, then over a 550 at bat season he
suddenly goes from being roughly a 31 home run hitter to being a 35 home run
hitter, even without an improvement in his rate of home runs per ball in play.
Hand in hand with such improvements you will typically see increases in runs
scored, RBI and so on.
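The arithmetic behind that example can be sketched as follows. The function name is hypothetical, and "balls in play" is taken, as in this article, to mean at bats that do not end in a strikeout.

```python
def expected_home_runs(at_bats: int, contact_rate: float,
                       hr_per_ball_in_play: float) -> float:
    """Home runs implied by a contact rate and a home-run rate per ball in play."""
    balls_in_play = at_bats * contact_rate
    return balls_in_play * hr_per_ball_in_play


# Same 8% power skill and 550 at bats; only the contact rate changes.
before = expected_home_runs(550, 0.70, 0.08)  # 385 balls in play
after = expected_home_runs(550, 0.80, 0.08)   # 440 balls in play

print(f"{before:.1f} -> {after:.1f}")  # prints "30.8 -> 35.2"
```

The gain of four to five home runs comes entirely from the 55 extra balls in play.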
Batting average by its very nature should improve as a contact rate improves
because even if the player's number of hits per ball in play stays the same, if
he puts more balls in play, he's obviously going to get more hits per at bat. It
seems obvious and yet I see forecasts out there from other sources which seem to
disregard this, often accepting an increased skill in contact rate but not
accepting an increase in batting average. No doubt it's possible that as a
player makes a greater effort to make contact, he could do so by giving up some
bat speed, but it should be the exception rather than the rule to downgrade his
other skills when his contact rate improves.
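One way to see this is to split batting average into two factors: H/AB equals the contact rate, (AB-K)/AB, multiplied by hits per ball in play, H/(AB-K). The sketch below uses a made-up hitter whose hits per ball in play stay fixed at .320. Both that figure and the two contact rates are assumptions chosen only for illustration.

```python
def batting_average(contact_rate: float, hits_per_ball_in_play: float) -> float:
    """H/AB rewritten as ((AB-K)/AB) * (H/(AB-K))."""
    return contact_rate * hits_per_ball_in_play


# Hypothetical hitter: identical hits per ball in play in both seasons
low_contact = batting_average(0.82, 0.320)
high_contact = batting_average(0.89, 0.320)

print(f"{low_contact:.3f} -> {high_contact:.3f}")  # prints "0.262 -> 0.285"
```

With nothing else changing, seven more points of contact rate are worth a little over twenty points of batting average for this hitter.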
To further emphasize the point, check out the following ten hitters. They were
the leaders in improved contact rate from 2003 to 2004 (among players with a
minimum of 400 at bats). Note the improvements in other categories:
| Player | Year | AB | H | 2B | 3B | HR | R | RBI | K | AVG | Contact% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Durham, Ray | 2003 | 410 | 117 | 30 | 5 | 8 | 61 | 33 | 82 | 0.285 | 80.0% |
| | 2004 | 471 | 133 | 28 | 8 | 17 | 95 | 65 | 60 | 0.282 | 87.3% |
| Rollins, Jimmy | 2003 | 628 | 165 | 42 | 6 | 8 | 85 | 62 | 113 | 0.263 | 82.0% |
| | 2004 | 657 | 190 | 43 | 12 | 14 | 119 | 73 | 73 | 0.289 | 88.9% |
| Pierzynski, A.J. | 2003 | 487 | 162 | 35 | 3 | 11 | 63 | 74 | 55 | 0.333 | 88.7% |
| | 2004 | 471 | 128 | 28 | 2 | 11 | 45 | 77 | 27 | 0.272 | 94.3% |
| Kotsay, Mark | 2003 | 482 | 128 | 28 | 4 | 7 | 64 | 38 | 82 | 0.266 | 83.0% |
| | 2004 | 606 | 190 | 37 | 3 | 15 | 78 | 63 | 70 | 0.314 | 88.4% |
| Finley, Steve | 2003 | 516 | 148 | 24 | 10 | 22 | 82 | 70 | 94 | 0.287 | 81.8% |
| | 2004 | 628 | 170 | 28 | 1 | 36 | 92 | 94 | 82 | 0.271 | 86.9% |
| Wigginton, Ty | 2003 | 573 | 146 | 36 | 6 | 11 | 73 | 71 | 124 | 0.255 | 78.4% |
| | 2004 | 494 | 129 | 30 | 2 | 17 | 63 | 66 | 82 | 0.261 | 83.4% |
| Ramirez, Aramis | 2003 | 607 | 165 | 32 | 2 | 27 | 75 | 106 | 99 | 0.272 | 83.7% |
| | 2004 | 547 | 174 | 32 | 1 | 36 | 99 | 103 | 62 | 0.318 | 88.7% |
| Hillenbrand, Shea | 2003 | 515 | 144 | 35 | 1 | 20 | 60 | 97 | 70 | 0.280 | 86.4% |
| | 2004 | 562 | 174 | 36 | 3 | 15 | 68 | 80 | 49 | 0.310 | 91.3% |
| Burroughs, Sean | 2003 | 517 | 148 | 27 | 6 | 7 | 62 | 58 | 75 | 0.286 | 85.5% |
| | 2004 | 523 | 156 | 23 | 3 | 2 | 76 | 47 | 52 | 0.298 | 90.1% |
| Chavez, Endy | 2003 | 483 | 121 | 25 | 5 | 5 | 66 | 47 | 59 | 0.251 | 87.8% |
| | 2004 | 502 | 139 | 20 | 6 | 5 | 65 | 34 | 40 | 0.277 | 92.0% |
It's interesting to see how batting average overall gets affected by improved
contact rates. Of the 36 hitters who hit at least .300 with a minimum of 400 at
bats in 2004, a total of 23 of them or about 64% had better contact rates in
2004 than 2003. This was true in 2003 as well when 68% of the 38 hitters who hit
at least .300 with at least 400 at bats had improved their contact rate from a
year earlier. Eight of the nine hitters who hit .325 or better last year (again
with a minimum of 400 at bats) improved their contact rate from 2003 to 2004,
with Todd Helton being the only exception.
The key to recognizing such players, of course, is identifying early that
their contact rate has really improved and fortunately, it's a category that can
accumulate a significant sample size quickly.
This is where I want to give the reader the briefest of discussions about
so-called "standard deviation", emphasizing that this is an oversimplification.
Basically, when we take a sample of data and can assume a normal curve, we can
presume that the sample result will fall within about two standard deviations of
the true underlying rate about 95% of the time.
Much of what makes my baseball forecasting successful is the application of
statistical principles to baseball analysis. Much like an election poll takes a
sample of data, I treat the baseball performance as a sample of a player's skill
and then attempt to estimate what his real skill is based on the sample. Often,
the equations are not as you might expect and this leads to readers wondering
why I haven't quickly upgraded or downgraded a player.
When it comes to the contact rate, there are certain margins outside of which
a player can go, even in spring training, that can warrant a close revisiting of
their skill ratings. Basically, if a player gets two standard deviations away
from what I had previously projected his real ability is, I can assure you that
I will be taking an even closer look at his skills to see if something has
changed to affect his performance. Perhaps he is out of shape. Possibly an
injury enters the equation. Maybe he has just been unlucky.
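A rough sketch of that two-standard-deviation check follows. It assumes each at bat is an independent trial with the projected contact rate (a simple binomial model, which is my assumption and not necessarily the exact method behind these forecasts). The function names and the spring training lines are invented for illustration.

```python
import math


def contact_z_score(projected_rate: float, at_bats: int, strikeouts: int) -> float:
    """Standard deviations between the observed and projected contact rate."""
    if not 0.0 < projected_rate < 1.0 or at_bats <= 0:
        raise ValueError("need 0 < projected_rate < 1 and at_bats > 0")
    observed = (at_bats - strikeouts) / at_bats
    # Standard deviation of a sample proportion under the binomial assumption
    sd = math.sqrt(projected_rate * (1.0 - projected_rate) / at_bats)
    return (observed - projected_rate) / sd


def needs_review(projected_rate: float, at_bats: int, strikeouts: int,
                 threshold: float = 2.0) -> bool:
    """True when the sample sits more than `threshold` deviations from projection."""
    return abs(contact_z_score(projected_rate, at_bats, strikeouts)) > threshold


# Hypothetical spring lines for a hitter projected at an 80% contact rate
print(needs_review(0.80, 60, 15))  # 75.0% observed contact
print(needs_review(0.80, 60, 20))  # 66.7% observed contact
```

In this sketch the first line is within two deviations of the projection and the second is not. A flag is only a prompt for a closer look, since bad luck, conditioning or injury could each explain it.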
Contact rate is one of the few categories where, even in spring training,
enough data can be accumulated to give me a significant sample size that I can
use to make last-minute modifications to a player's forecast. The sample size is
key and here are some examples of ranges I would expect to see given a certain
number of at bats and my original theory about a player's contact rate:
| Original Theory / Contact Rate | 30 at bats | 60 at bats | 90 at bats |
|---|---|---|---|
| 65% | 48% - 82% | 53% - 77% | 55% - 75% |
| 70% | 53% - 86% | 59% - 81% | 60% - 79% |
| 75% | 59% - 90% | 64% - 86% | 66% - 84% |
| 80% | 66% - 94% | 70% - 90% | 72% - 88% |
|
In other words, as you're watching spring training, you might think that a
player is performing so well or so miserably that he deserves an upgrade or
downgrade. But consider that even after 60 at bats, a player for whom I've
projected about a 70% contact rate can have anywhere between a 59% and an 81%
contact rate and be doing exactly what he should be expected to do within the
95% confidence interval. Short runs can yield good or bad luck, and tables like
this one help us identify the players whose skills may have changed for real and
whose forecasts may really be worth reconsidering.
Take Ichiro for example. My current theory about his contact rate is an
amazing 90.8%. Standard deviation analysis tells us that if those were his real
skills (and remember that this is just our theory and we never know if we're
correct), in a 30 at bat sample, Ichiro's contact rate could be anywhere between
80% and 100%. When we've got a 60 at bat sample size, it tightens up where we
would expect his contact rate to be between 83% and 97%, 95% of the time.
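Ranges like those in the table and in the Ichiro example can be approximated with a simple binomial standard deviation, taking two deviations either side of the projected rate and capping the result at 0% and 100%. This is a sketch under that assumption. The article's figures may have been produced with slightly different rounding or a different method, so expect the output to differ from the published numbers by a point here and there.

```python
import math


def contact_range(projected_rate: float, at_bats: int,
                  num_sd: float = 2.0) -> tuple[float, float]:
    """Approximate 95% range for an observed contact rate over `at_bats`."""
    sd = math.sqrt(projected_rate * (1.0 - projected_rate) / at_bats)
    low = max(0.0, projected_rate - num_sd * sd)   # a rate cannot go below 0%
    high = min(1.0, projected_rate + num_sd * sd)  # or above 100%
    return low, high


# The four projected rates from the table, plus the 90.8% theory for Ichiro
for rate in (0.65, 0.70, 0.75, 0.80, 0.908):
    cells = []
    for at_bats in (30, 60, 90):
        low, high = contact_range(rate, at_bats)
        cells.append(f"{low:.0%} - {high:.0%}")
    print(f"{rate:.1%}: " + " | ".join(cells))
```

The width of each range shrinks with the square root of the at bats, which is why going from 30 to 60 at bats tightens the band far more than going from 60 to 90.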
Of course, we must keep in mind that if we're using a 95% confidence level,
about 5% of players should fall outside the margin through chance alone.
That's when statistical analysis gets more interesting because we will often
discover that there are more players than expected performing outside of those
ranges. Our job then as forecasters is to identify which are the real
non-performers and which are simply suffering from bad luck. In previous years,
there have been some players whose spring training was so good or bad that they
got that last minute modification in these pages.
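To get a feel for how many players should land outside the margin through luck alone, one can add up exact binomial probabilities instead of relying on the normal curve. The sketch below assumes independent at bats, and the pool of 200 hitters is a made-up number for illustration. Because contacts come in whole numbers, the exact share outside a two-deviation band is usually close to, but not exactly, 5%.

```python
import math


def prob_outside_band(true_rate: float, at_bats: int, num_sd: float = 2.0) -> float:
    """Exact binomial chance that an observed contact rate falls outside the band."""
    sd = math.sqrt(true_rate * (1.0 - true_rate) / at_bats)
    low = true_rate - num_sd * sd
    high = true_rate + num_sd * sd
    total = 0.0
    for contacts in range(at_bats + 1):
        observed = contacts / at_bats
        if observed < low or observed > high:
            # Probability of exactly `contacts` non-strikeout at bats
            total += (math.comb(at_bats, contacts)
                      * true_rate ** contacts
                      * (1.0 - true_rate) ** (at_bats - contacts))
    return total


p_out = prob_outside_band(0.70, 60)
hitters_tracked = 200  # hypothetical number of hitters being watched
print(f"chance outside band: {p_out:.1%}")
print(f"expected by luck alone: {hitters_tracked * p_out:.1f} hitters")
```

If noticeably more hitters than this fall outside their bands, luck is an unlikely explanation for all of them, and some probably reflect a real change in skill.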
Contact rate is likely to be one of the only categories where spring training
yields enough data that I may have to downgrade or upgrade a hitter's forecast,
with rare exceptions. Tables like this one give context to the notes I may make
late in spring training about players performing "outside the margin of error"
who subsequently get considered for significant last-minute modifications in the
forecasts before Opening Day. In general, spring training is more useful for
assessing playing time than performance, but in categories like the contact
rate, we can occasionally draw conclusions about a player's new ability even
before the first pitch of the season is thrown.