
Making Contact and Margins of Error
by David Luciani
published February 23, 2005

Long-time readers of my forecasts know how much emphasis I put on breaking down a player's actual skills when assembling a forecast. I gave examples of this in recent "Ask David" columns when discussing the forecasts for Melvin Mora and Andruw Jones. Early last season, I gave similar examples when explaining my continued optimism about Ichiro Suzuki and my pessimism about Ron Belliard's chances of maintaining a hot start.

Often, I will refer to a player's rate of making contact. It's not a statistic I invented, though I was under that false and no doubt arrogant impression for a few years until I found that Bill James had talked about the concept before me. The contact rate is a crucial aspect of a hitter's forecast. It is also the one skill in which a hitter can show marked improvement early in the season, sometimes even in spring training, and that improvement tends to bring all of his other numbers up along with it.

First, let me explain the simplified contact rate, which is calculated as (AB-K)/AB. As I say, I didn't invent this, and I'm not sure where credit belongs: I thought I had come up with it independently and have since seen both Bill James and Ron Shandler, among others, use the statistic in print. I suspect I first heard about it from James. The idea is simple: the calculation shows the percentage of at bats in which a player puts a ball in play. It excludes other events like walks and times hit by pitch, and it's a crucial element of the forecasting exercise.
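As a minimal sketch, the (AB-K)/AB formula can be written as a small Python function. The function and variable names are illustrative only; the sample inputs are Ray Durham's 2003 at bats and strikeouts from the table later in this article.

```python
def contact_rate(at_bats: int, strikeouts: int) -> float:
    """Simplified contact rate: share of at bats that do not end in a strikeout."""
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    return (at_bats - strikeouts) / at_bats


# Ray Durham, 2003: 410 at bats, 82 strikeouts
durham_2003 = contact_rate(410, 82)
print(f"{durham_2003:.1%}")  # 80.0%
```

Walks and hit-by-pitch events never enter the calculation, which matches the simplified definition above.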

One benefit of this category is that it's one of the first for which you can draw quick conclusions about a change in a hitter's skills, because the sample size (at bats) gets so large so quickly. I have emphasized that one key difference between my forecasts and others is that, for other categories, I don't often see the denominator as being at bats. I recall explaining in this space in May of 2004 why Ichiro Suzuki was not performing outside the margin of error when I continued to project him to break out of a season-long batting average slump. Ichiro even went on to far exceed my expectations later. Basically, when it comes to categories like hits, I am often looking at the individual hit types (singles, doubles, triples and home runs) separately, and the denominator for those categories is not at bats but rather something like the number of times a player puts a ball in play.

To further emphasize the importance of the contact rate, take the example of a hitter whose real skill is to hit home runs 8% of the time he puts the ball in play. If his pure power skill stays intact but his contact rate improves from 70% to 80% from one season to the next, then over a 550 at bat season he suddenly goes from being a 30 home run type to being a 35 home run player, even without an improvement in his rate of home runs per ball in play. Hand in hand with such improvements, you will typically see increases in runs scored, RBI and so on. Batting average by its very nature should improve as the contact rate improves: even if the player's number of hits per ball in play stays the same, if he puts more balls in play, he's going to get more hits per at bat. It seems obvious, and yet I see forecasts from other sources which seem to disregard this, often accepting an improved contact rate but not an increase in batting average. No doubt it's possible that a player who makes a greater effort to make contact does so by giving up some bat speed, but downgrading his other skills when his contact rate improves should be the exception rather than the rule.
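The arithmetic behind that example can be sketched in a few lines of Python. The 550 at bats, the 8% home run rate and the 70% and 80% contact rates come from the paragraph above. The function names and the 0.33 hits-per-ball-in-play figure are assumptions made up purely for illustration.

```python
def projected_home_runs(at_bats, contact_rate, hr_per_ball_in_play):
    """Home runs = balls put in play times the home run rate per ball in play."""
    balls_in_play = at_bats * contact_rate
    return balls_in_play * hr_per_ball_in_play


def projected_average(contact_rate, hits_per_ball_in_play):
    """Hits per at bat when the rate of hits per ball in play is held fixed."""
    return contact_rate * hits_per_ball_in_play


hr_before = projected_home_runs(550, 0.70, 0.08)  # 385 balls in play
hr_after = projected_home_runs(550, 0.80, 0.08)   # 440 balls in play
print(round(hr_before, 1), round(hr_after, 1))    # 30.8 35.2

# Hypothetical hitter who gets a hit on 33% of the balls he puts in play.
avg_before = projected_average(0.70, 0.33)
avg_after = projected_average(0.80, 0.33)
print(f"{avg_before:.3f} -> {avg_after:.3f}")
```

Nothing about the hitter's power or his hits per ball in play changes between the two lines; only the contact rate does.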

To further emphasize the point, check out the following ten hitters. They were the leaders in improved contact rate from 2003 to 2004 (among players with a minimum of 400 at bats). Note the improvements in other categories:

 

 

Player             Year   AB    H   2B  3B  HR    R  RBI    K    AVG  Contact%
Durham, Ray        2003  410  117   30   5   8   61   33   82  0.285     80.0%
Durham, Ray        2004  471  133   28   8  17   95   65   60  0.282     87.3%
Rollins, Jimmy     2003  628  165   42   6   8   85   62  113  0.263     82.0%
Rollins, Jimmy     2004  657  190   43  12  14  119   73   73  0.289     88.9%
Pierzynski, A.J.   2003  487  162   35   3  11   63   74   55  0.333     88.7%
Pierzynski, A.J.   2004  471  128   28   2  11   45   77   27  0.272     94.3%
Kotsay, Mark       2003  482  128   28   4   7   64   38   82  0.266     83.0%
Kotsay, Mark       2004  606  190   37   3  15   78   63   70  0.314     88.4%
Finley, Steve      2003  516  148   24  10  22   82   70   94  0.287     81.8%
Finley, Steve      2004  628  170   28   1  36   92   94   82  0.271     86.9%
Wigginton, Ty      2003  573  146   36   6  11   73   71  124  0.255     78.4%
Wigginton, Ty      2004  494  129   30   2  17   63   66   82  0.261     83.4%
Ramirez, Aramis    2003  607  165   32   2  27   75  106   99  0.272     83.7%
Ramirez, Aramis    2004  547  174   32   1  36   99  103   62  0.318     88.7%
Hillenbrand, Shea  2003  515  144   35   1  20   60   97   70  0.280     86.4%
Hillenbrand, Shea  2004  562  174   36   3  15   68   80   49  0.310     91.3%
Burroughs, Sean    2003  517  148   27   6   7   62   58   75  0.286     85.5%
Burroughs, Sean    2004  523  156   23   3   2   76   47   52  0.298     90.1%
Chavez, Endy       2003  483  121   25   5   5   66   47   59  0.251     87.8%
Chavez, Endy       2004  502  139   20   6   5   65   34   40  0.277     92.0%

It's interesting to see how batting average overall is affected by improved contact rates. Of the 36 hitters who hit at least .300 with a minimum of 400 at bats in 2004, 23 of them, or about 64%, had better contact rates in 2004 than in 2003. This was true in 2003 as well, when 68% of the 38 hitters who hit at least .300 with at least 400 at bats had improved their contact rate from a year earlier. Eight of the nine hitters who hit .325 or better last year (again with a minimum of 400 at bats) improved their contact rate from 2003 to 2004, with Todd Helton being the only exception.

The key to recognizing such players, of course, is identifying early that their contact rate has really improved and fortunately, it's a category that can accumulate a significant sample size quickly.

This is where I want to give the reader the briefest of discussions about so-called "standard deviation", while emphasizing that this is an oversimplification. Basically, when we take a sample of data and can assume a normal curve, we can presume that the sample result will fall within about two standard deviations of the real underlying rate about 95% of the time.

Much of what makes my baseball forecasting successful is the application of statistical principles to baseball analysis. Much like an election poll takes a sample of data, I treat the baseball performance as a sample of a player's skill and then attempt to estimate what his real skill is based on the sample. Often, the equations are not as you might expect and this leads to readers wondering why I haven't quickly upgraded or downgraded a player.

When it comes to the contact rate, there are certain margins outside of which a player can go, even in spring training, that warrant a close revisiting of his skill ratings. Basically, if a player gets two standard deviations away from what I had previously projected his real ability to be, I can assure you that I will be taking an even closer look at his skills to see if something has changed to affect his performance. Perhaps he is out of shape. Possibly an injury enters the equation. Maybe he has just been unlucky.
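The two-standard-deviation check might be sketched as follows. The article does not spell out its exact formula, so this assumes the textbook standard deviation of a sample proportion, sqrt(p(1-p)/n), with p the projected contact rate and n the at bats. The function name and the spring training line are hypothetical.

```python
import math


def standard_deviations_from_projection(projected_rate, at_bats, strikeouts):
    """How many standard deviations the observed contact rate sits from the projection.

    Assumes the standard deviation of a sample proportion, sqrt(p * (1 - p) / n).
    """
    if not 0.0 < projected_rate < 1.0:
        raise ValueError("projected_rate must be strictly between 0 and 1")
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    observed = (at_bats - strikeouts) / at_bats
    sd = math.sqrt(projected_rate * (1.0 - projected_rate) / at_bats)
    return (observed - projected_rate) / sd


# Hypothetical spring line: 40 at bats, 20 strikeouts, projected at 70% contact.
z = standard_deviations_from_projection(0.70, 40, 20)
needs_review = abs(z) >= 2
print(f"{z:.2f}", needs_review)  # about -2.76, so worth a closer look
```

A flag here only says the sample is unusual for the projected skill. It does not say why, which is where conditioning, injuries and plain luck still have to be sorted out by hand.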

Contact rate is one of the few categories where, even in spring training, enough data can be accumulated to give me a significant sample size that I can use to make last-minute modifications to a player's forecast. The sample size is key and here are some examples of ranges I would expect to see given a certain number of at bats and my original theory about a player's contact rate:

Original Theory / Contact Rate    30 at bats    60 at bats    90 at bats
65%                               48% - 82%     53% - 77%     55% - 75%
70%                               53% - 86%     59% - 81%     60% - 79%
75%                               59% - 90%     64% - 86%     66% - 84%
80%                               66% - 94%     70% - 90%     72% - 88%

In other words, as you're watching spring training, you might see a player performing so well or so miserably that you think he deserves an upgrade or downgrade. But consider that even after 60 at bats, a player for whom I've projected about a 70% contact rate can have anywhere between a 59% and an 81% contact rate and still be doing exactly what he should be expected to do within the 95% confidence interval. Short runs can simply reflect good or bad luck, and tables like this one help us identify the players whose skills may have changed for real and whose forecasts may really be worth reconsidering.

Take Ichiro for example. My current theory about his contact rate is an amazing 90.8%. Standard deviation analysis tells us that if those were his real skills (and remember that this is just our theory and we never know if we're correct), in a 30 at bat sample, Ichiro's contact rate could be anywhere between 80% and 100%. When we've got a 60 at bat sample size, it tightens up where we would expect his contact rate to be between 83% and 97%, 95% of the time.
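Ranges like the ones in the table and in the Ichiro example can be approximated with a short Python sketch. It assumes the usual normal approximation (projected rate plus or minus two standard deviations of a sample proportion, capped at 0% and 100%), which is not necessarily the exact method used for the published figures. Expect the output to differ from the table by a percentage point here and there. The function name is illustrative.

```python
import math


def expected_range(projected_rate, at_bats, num_sd=2.0):
    """Approximate range for an observed contact rate over a given number of at bats."""
    sd = math.sqrt(projected_rate * (1.0 - projected_rate) / at_bats)
    low = max(0.0, projected_rate - num_sd * sd)   # a rate cannot fall below 0%
    high = min(1.0, projected_rate + num_sd * sd)  # or rise above 100%
    return low, high


# Projected rates from the table, plus the 90.8% theory for Ichiro.
for rate in (0.65, 0.70, 0.75, 0.80, 0.908):
    cells = []
    for at_bats in (30, 60, 90):
        low, high = expected_range(rate, at_bats)
        cells.append(f"{low:.0%} - {high:.0%}")
    print(f"{rate:.1%}: " + " | ".join(cells))
```

The cap at 100% is what produces the upper bound in the 30 at bat Ichiro case: two standard deviations above 90.8% would otherwise run past a perfect contact rate.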

Of course, we must keep in mind that if we're using a 95% confidence level, about 5% of the players should be outside the margin by chance alone. That's when statistical analysis gets more interesting, because we will often discover that there are more players than expected performing outside of those ranges. Our job then as forecasters is to identify which are the real non-performers and which are simply suffering from bad luck. In previous years, there have been some players whose spring training was so good or bad that they got that last-minute modification in these pages.
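To get a feel for how many hitters should land outside the margin by luck alone, one rough approach is to add up exact binomial probabilities for a hitter whose true contact rate equals the projection. This is only a sketch under that assumption. The function name and the pool of 300 hitters are hypothetical. Because at bats come in whole numbers, the result will be in the neighborhood of 5% rather than exactly 5%.

```python
import math


def probability_outside_range(true_rate, at_bats, num_sd=2.0):
    """Chance that pure luck puts a hitter more than num_sd standard deviations
    from his true contact rate, using exact binomial probabilities."""
    mean = at_bats * true_rate
    sd = math.sqrt(at_bats * true_rate * (1.0 - true_rate))
    low, high = mean - num_sd * sd, mean + num_sd * sd
    inside = 0.0
    for contacts in range(at_bats + 1):
        if low <= contacts <= high:
            inside += (math.comb(at_bats, contacts)
                       * true_rate ** contacts
                       * (1.0 - true_rate) ** (at_bats - contacts))
    return 1.0 - inside


share = probability_outside_range(0.70, 60)
hypothetical_pool = 300  # made-up number of hitters being tracked
print(f"{share:.1%} of hitters, or about {share * hypothetical_pool:.0f} of {hypothetical_pool}")
```

If noticeably more hitters than that sit outside their ranges, luck alone is an unlikely explanation for all of them, which is the situation described above.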

Contact rate is likely to be one of the only categories where spring training yields enough data that I may have to downgrade or upgrade a hitter's forecast, with rare exceptions. Tables like this one will put in context the notes I may make late in spring training about players performing "outside the margin of error" who subsequently get considered for significant last-minute modifications in the forecasts before Opening Day. In general, spring training is more useful for assessing playing time than performance, but in categories like the contact rate, we can occasionally draw conclusions about a player's new ability even before the first pitch of the season is thrown.

 


Baseball Notebook is an online editorial publication that actively reports on baseball at all levels.  All materials, unless otherwise stated, are © 1998-2010 Baseball Notebook.  All historical statistics displayed are obtained through licensed affiliation with Baseball Info Solutions.  As an independent editorial publication, Baseball Notebook is not sponsored by, authorized by, affiliated with or associated with any other mentioned entities, including those who own reserved trademarks that may be used for descriptive purposes only.  The terms Major League Baseball, World Series, National League, American League, All-Star Game and the names of the Major League Baseball teams are trademarks of Major League Baseball Entities.  Baseball Notebook does not offer, create, administer or endorse any fantasy baseball game or competition.