SELECTING BASEBALL’S BEST BATTER: A MATHEMATICAL ANALYSIS!
CHRISTINA WALL
Latta High School, GRADE 10
ABSTRACT
This project employed a mathematical rating and weighting system to determine an “all time best batter” throughout history. The purpose of this project was to determine whether a weighted statistical analysis of batters’ statistics differs from an unweighted analysis in the selection of the best batter throughout history.
The hypothesis was “When comparing baseball-batting
statistics using a mathematical analysis which weights each desirable
characteristic, a significantly different result will be evident in the
selection of the ‘best baseball batter ever’ rather than the one selected using
an unweighted mathematical analysis which assigns all characteristics an equal
importance value.” A survey was
developed to receive coaches’ opinions to create a rating scale on batting
characteristics. Statistics on the players were collected from the internet,
and the top fifteen players in each of the eighteen different statistics were
ranked from one to fifteen and assigned values. The number of points each
player received in each category was totaled.
Three different ranking methods were employed:
1. an unweighted version, where all batting statistics were weighted equally,
2. a weighted version which weighted only the desirable characteristics chosen by the coaches’ surveys, and
3. another weighted version which weighted the top four desirable characteristics along with the other unweighted statistics.
The top batter in the unweighted statistical version
was Lou Gehrig, the top batter in the weighted only version was Babe Ruth, and
the top batter in the weighted plus other statistics version was Babe Ruth.
The top fifteen batters in each method were
determined and assigned a numerical value.
Points from all three versions were totaled and a final overall best
batter was decided.
The project’s hypothesis was only partially
supported. There was a difference in
the three versions but only a very slight one.
Lou Gehrig was the best batter statistically overall after all three
versions had been combined and points totaled up.
INTRODUCTION
What makes a great
batter? Is it their on base percentage?
Perhaps it is how many home runs a player hits during his career? Coaches and sports fans all around the world
have their own opinion on what makes a great batter. Most coaches and sports fans base their opinion on only one or
two statistics when in reality, there are many more statistics that people
should take into consideration.
“Hitting a baseball is one of the most challenging
skills in sports- an art in which 30% success rate is considered above
average.” (1) Batters have to deal with many things including the pitching
ability, the bat and ball both having round surfaces, batting stance,
positioning, attacking the ball, and thinking outside the box.
Baseball is the game best suited for statistical
analysis. To have a batting average
of “.300” means that for every ten times
at bat, the hitter got a hit three times. Statistics in baseball can be figured in many different ways. For example, a player can determine his
career home run average, or mean, by dividing the total number of home runs in
his career by the total seasons played. By ranking one’s statistics from highest to lowest, a player can
determine what characteristic he needs to work on more. Not only are statistics
useful in determining a batter’s own percentages, but they can also enable a
batter to evaluate where he stands in relation to other batters. Statistics are extremely useful in
determining baseball percentages and comparisons.
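The two calculations described above can be worked through in a short Python sketch. The function names and the sample numbers are illustrative assumptions and do not come from any real player's record.

```python
def batting_average(hits, at_bats):
    """Hits divided by at bats; returns 0.0 when there are no at bats."""
    return hits / at_bats if at_bats else 0.0


def career_home_run_mean(home_runs_by_season):
    """Career home runs divided by the number of seasons played."""
    return sum(home_runs_by_season) / len(home_runs_by_season)


# Hypothetical numbers, for illustration only.
avg = batting_average(3, 10)                       # three hits in ten at bats
mean_hr = career_home_run_mean([20, 35, 41, 28])   # four made-up seasons

print(f"{avg:.3f}")  # prints 0.300
print(mean_hr)       # prints 31.0
```

Three hits in ten at bats gives the “.300” average mentioned above, and 124 home runs over four seasons gives a mean of 31 per season.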
Many characteristics go into the
production of a great batter. Some
people say that the size of the batter influences the way a batter hits. “Current baseball scouts generally focus
their attention on larger prospects, particularly pitchers.” (2) However, if Babe Ruth were magically
transported into the 21st century, he would not stand out in any
team picture. Although he was a
legendary giant, if he were alive today, he would only stand taller than 48% of
the players who were on major-league 40-man rosters at the start of spring
training.
There are many statistics that should go into a great batter, but
frequently people consider only one or two batting statistics to be
important. Many fans feel that the
number of home runs is the ultimate statistic that makes a great batter. Home runs are the most exciting play of the
game. “Whoever called the triple the
most exciting play in baseball was either narrow minded or nuts. We all know what baseball’s most exciting
play is: the long ball. That’s what puts derrieres in the seats. Chicks don’t dig three-baggers. Come on, how many of you knew that Christian
Guzman led the major leagues with 20 triples last season?” (3) Home runs can be
very important during a ballgame, but home runs don’t count for
everything. Other statistics come into play,
such as on base percentage, runs batted in, and slugging percentage, yet these
are not even all of the important statistics which should be considered when
deciding on the best batter.
Just as the world has changed over
the past years, baseball has also changed.
“The ‘National Pastime,’ is a whole new ballgame compared to what it was
like in 1942.” (4) That may wound the sensibilities of those so entranced by
nostalgia and a romantic view of the game’s mythology as to contend baseball is
immutable. But to insist that Barry
Bonds, Sammy Sosa, Jason Giambi and Randy Johnson are playing the same game as
their great predecessors of the 1940’s, like Joe DiMaggio, Ted Williams, Stan
Musial, Bob Feller, Mort Cooper and the rest, is sheer self-delusion.
The purpose of this project is to determine the best batter
throughout history by using a statistical analysis over three versions: an unweighted
version, a weighted only version, and a weighted version plus other unweighted
statistics. The unweighted version
weights each of eighteen batting statistics equally, therefore giving all
statistics equal importance. The
weighted version uses only the top four desirable characteristics chosen by
the coaches’ survey, therefore giving only certain characteristics
importance. The version with weighted statistics
plus the other unweighted statistics uses both the unweighted and the weighted
statistics, therefore giving everything importance, but with some statistics gaining
more importance than others. The
hypothesis for this experiment was “When comparing baseball-batting statistics
using a mathematical analysis which weights each desirable characteristic, a
significantly different result will be evident in the selection of the ‘best
baseball batter ever’ rather than the one selected using an unweighted
mathematical analysis which assigns all characteristics an equal importance
value.”
METHODOLOGY
STEP 1: Prepare survey to be sent to coaches asking their opinion on the top
10 baseball batters old and new, and their opinions on characteristics of what
makes the best batter. (See survey on following page.)
STEP 2: Consult Coach Eddie Collins (Oklahoma Baseball Coach Hall of Famer) to
revise survey.
STEP 3: Fax or E-Mail 30 coaches.
STEP 4: After surveys have been returned, tabulate coaches’ responses to the
surveys to find top four characteristics they felt were most desirable and
record using Microsoft Excel. Also from the survey’s results, determine the top
50 players.
STEP 5: Gather data on 18 batting statistics for those top 50 players from the
Internet using www.baseballreference.com.
(See list of 18 statistics on the following pages) and record into a database
in Microsoft Excel.
STEP 6: Rank each player for each batting statistic by numbering each player
as 1 being the best and 50 being the worst.
STEP 7: Assign the top 15 players in each statistical category points by
giving the number 1 in each category 15 points, the number 2 in each category
14 points, and so forth until the top 15 players have been given points in each
category.
VERSION 1: UNWEIGHTED STATISTICAL VERSION:
STEP 8: Rank batters by their total points and determine who is the best
batter by using an unweighted statistical version.
VERSION 2: FOR WEIGHTED USING ONLY COACHES’ SELECTED CHARACTERISTICS:
STEP 9: Weight the top 4 statistics chosen by the survey by multiplying the
most desirable batting statistic by 5, the next most desirable statistic by 4,
the next most desirable characteristic by 3, and the last most desirable
characteristic by 2.
STEP 10: Use only the four most desirable characteristics and rank the batters
by their total of points and determine who the best batter is by using the
weighted statistics only.
VERSION 3: COMBINATION OF WEIGHTED COACH PREFERRED STATISTICS AND THE OTHER UNWEIGHTED STATISTICS:
STEP 11: Copy the four statistics used in version #2 for all the batters and
then record the other unweighted statistics from version #1 and record into a
new database.
STEP 12: Rank the batters by their total number of points and determine who the
best batter is by using the unweighted statistics and the weighted statistics.
STEP 13: Assign the top fifteen in all
three versions points by giving the number one fifteen points, the number two
fourteen, and so forth until all fifteen in each version have been given
points.
STEP 14: Place each player’s new points into a new spreadsheet.
STEP 15: Total each player’s points, rank, and determine the overall best
batter.
STEP 16: Analyze the difference between the three versions, if any, and decide
whether the project supported the hypothesis or not.
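The point system in Steps 6 through 12 can be sketched in Python as follows. This is only an illustration under stated assumptions: the project itself was done in Microsoft Excel, the players "A", "B", and "C" and their numbers are made up, the two example multipliers stand in for the four used in Step 9, and treating fewer strikeouts as the better rank is an assumption. Ties are not handled in any special way.

```python
def place_points(values, higher_is_better=True, top_n=15):
    """Steps 6-7: rank one statistic; first place earns top_n points, second top_n - 1, and so on."""
    ranked = sorted(values, key=values.get, reverse=higher_is_better)
    return {name: top_n - i for i, name in enumerate(ranked[:top_n])}


def total_points(stats, weights=None, weighted_only=False):
    """Sum each player's points over the statistics.

    stats maps a statistic name to (values, higher_is_better).
    weights maps a statistic name to its multiplier; other statistics count once.
    weighted_only=True skips every statistic that has no weight (Version 2).
    """
    weights = weights or {}
    totals = {}
    for stat, (values, higher_is_better) in stats.items():
        if weighted_only and stat not in weights:
            continue
        for name, points in place_points(values, higher_is_better).items():
            totals[name] = totals.get(name, 0) + weights.get(stat, 1) * points
    return totals


# Made-up players and numbers, for illustration only.
stats = {
    "BA": ({"A": 0.340, "B": 0.320, "C": 0.300}, True),
    "HR": ({"A": 200, "B": 500, "C": 300}, True),
    "SO": ({"A": 400, "B": 900, "C": 600}, False),  # assumed: fewer strikeouts ranks higher
}
weights = {"BA": 5, "HR": 3}  # example multipliers in the style of Step 9

version1 = total_points(stats)                               # unweighted
version2 = total_points(stats, weights, weighted_only=True)  # weighted statistics only
version3 = total_points(stats, weights)                      # weighted plus unweighted

print(version1)  # {'A': 43, 'B': 42, 'C': 41}
print(version2)  # {'A': 114, 'B': 115, 'C': 107}
print(version3)  # {'A': 129, 'B': 128, 'C': 121}
```

Even with this toy data, the leader changes from player A to player B when only the weighted statistics are counted, which is the kind of difference the hypothesis is looking for.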
THE 18 STATISTICS USED
(ABBREVIATION/STATISTIC)
H- Hits – career
H- Hits – single season
2B- Doubles – career
2B- Doubles – single season
3B- Triples – single season
HR- Home Runs – career
HR- Home Runs – single season
RBI- Runs Batted In – career
SO- Strikeouts – career
SO- Strikeouts – single season
BA- Batting average – career (hits / at bats)
BA- Batting average – single season
OBP- On base percentage – career ((hits + base on balls + hit by pitcher) / (at bats + base on balls + sacrifice flies + hit by pitcher))
OBP- On base percentage – single season
SLG- Slugging percentage – career (total bases / at bats)
SLG- Slugging percentage – single season
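The on base percentage and slugging percentage formulas in the list above can be written out as a small Python sketch. The function and parameter names are illustrative assumptions, and the season line is hypothetical. Total bases is taken in its usual sense: one base for a single, two for a double, three for a triple, and four for a home run.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """(hits + base on balls + hit by pitcher) / (at bats + base on balls + sacrifice flies + hit by pitcher)."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + sacrifice_flies + hit_by_pitch
    return reached / chances if chances else 0.0


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases divided by at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


# Hypothetical season: 180 hits (120 singles, 30 doubles, 5 triples, 25 home runs) in 550 at bats.
obp = on_base_percentage(hits=180, walks=70, hit_by_pitch=5, at_bats=550, sacrifice_flies=5)
slg = slugging_percentage(singles=120, doubles=30, triples=5, home_runs=25, at_bats=550)

print(f"{obp:.3f}")  # prints 0.405
print(f"{slg:.3f}")  # prints 0.536
```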
When it comes down to the final statistics, what makes a great batter?
Please give your opinion about the following questions.
• What is the most desirable characteristic of a hitter?
• Please rank the following list (1 being the best and 5 being the worst)
SINGLE SEASON
_______ A batter who has a great on-base percentage
_______ A batter who has many home runs
_______ A batter who has several home runs and a high average on-base percentage (.325 batting average)
_______ A batter who not only has a good on-base percentage but also gets multiple bases (doubles and triples)
_______ A batter who has many RBI’s
CAREER
_______ A batter who has a consistently high batting average over many seasons
_______ A batter who has a consistently high number of home runs each season
_______ A batter whose career lasts many years
_______ A batter who has one record breaking season
_______ A batter who has very few strikeouts in their career
PLEASE LIST YOUR TOP 10 FAVORITE OLD AND NEW BASEBALL BATTERS
RESULTS
Five sets of data resulted after the data was analyzed: the coaches’ survey, an unweighted statistical version, a version using only the weighted statistics, a weighted plus unweighted statistical version, and an overall total of all three versions combined. From the coaches’ survey, the top four desirable characteristics chosen by the coaches were used in the experiment. These characteristics were batting average (BA), home runs (HR), slugging percentage (SLG), and strikeouts (SO). Batting average, home runs, and strikeouts were all chosen from the career section of the survey, but slugging percentage was also included in the cumulative ranking from the single season section of the survey. After sending the coaches the surveys, it was realized that the survey should have been written more clearly to be properly rated by the coaches. Therefore, slugging percentage was used as an overall statistic that covered the top three choices: great on base percentage, several home runs and a high average on base percentage, and a high number of RBI’s. Eighty percent of the coaches selected batting average as the most desirable characteristic. Ten percent of the coaches chose home runs as their most desirable characteristic, and ten percent of the coaches chose fewest strikeouts as their most desirable characteristic. Though home runs and strikeouts received the same number of first-place selections, home runs received more second and third choices than strikeouts. Since slugging percentage was an overall desirable trait formed by the combination of three statistics, it was placed as second best in the top four desirable characteristics.
The second set of results is the
unweighted statistical version. The top
five in this category are Lou Gehrig with 143 points, Ty Cobb with 142 points,
Rogers Hornsby with 139 points, Tris Speaker with 134 points, and Babe Ruth
with 127 points.
The third set of results is the weighted only
statistical version. The top five in
this category are Babe Ruth with 152 points, Ted Williams with 150 points, Lou Gehrig
with 122 points, Todd Helton with 110 points, and Rogers Hornsby with 104 points.
The fourth set of results is the
weighted version plus the other unweighted statistics. The top five in this category are Babe Ruth
with 240 points, Lou Gehrig with 234 points, Ted Williams with 223 points,
Rogers Hornsby with 216 points, and Ty Cobb with 210 points.
The fifth and final set of results
is the complete total or overall best batter of all three versions
combined. The top five in this category
are Lou Gehrig with 42 points, Babe Ruth with 41 points, Rogers Hornsby with 36
points, Ted Williams with 36 points, and Ty Cobb with 35 points.
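The overall totals can be checked against the placings reported above with a short Python sketch. The helper name is an illustrative assumption. Only the three players who appear in the top five of all three versions are included, because the placings of the other players outside the top five are not listed here.

```python
def place_to_points(place, top_n=15):
    """First place earns top_n points, second top_n - 1, and so on; no points below top_n."""
    return top_n - place + 1 if 1 <= place <= top_n else 0


# Placings read from the top-five lists above, in the order:
# (unweighted, weighted only, weighted plus unweighted).
places = {
    "Lou Gehrig": (1, 3, 2),
    "Babe Ruth": (5, 1, 1),
    "Rogers Hornsby": (3, 5, 4),
}

overall = {name: sum(place_to_points(p) for p in ps) for name, ps in places.items()}
print(overall)  # {'Lou Gehrig': 42, 'Babe Ruth': 41, 'Rogers Hornsby': 36}
```

The totals of 42, 41, and 36 points agree with the overall results reported for these three players.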


SURVEY RESULTS
PLEASE RANK THE FOLLOWING (1 BEING THE BEST AND 5 BEING THE WORST)
SINGLE SEASON
___1___ A BATTER WHO HAS A GREAT ON-BASE PERCENTAGE
___2___ A BATTER WHO HAS MANY HOME RUNS
___3___ A BATTER WHO HAS SEVERAL HOME RUNS AND A HIGH AVERAGE ON-BASE % (.325)
___4___ A BATTER WHO NOT ONLY HAS A GOOD ON-BASE % BUT ALSO GETS MULTIPLE BASES
___5___ A BATTER WHO HAS MANY RBI'S
CAREER
___6___ A BATTER WHO HAS A CONSISTENTLY HIGH BATTING AVERAGE OVER MANY SEASONS
___7___ A BATTER WHO HAS A CONSISTENTLY HIGH NUMBER OF HOMERUNS EACH SEASON
___8___ A BATTER WHOSE CAREER LASTS MANY SEASONS
___9___ A BATTER WHO HAS ONE RECORD BREAKING SEASON
__10___ A BATTER WHO HAS VERY FEW STRIKEOUTS IN THEIR CAREER

CHOICE # | # OF 1'S | # OF 2'S | # OF 3'S | # OF 4'S | # OF 5'S
1        | 5        | 0        | 2        | 2        | 1
2        | 0        | 0        | 2        | 3        | 5
3        | 4        | 3        | 1        | 2        | 0
4        | 4        | 4        | 2        | 0        | 0
5        | 1        | 3        | 3        | 1        | 2
6        | 8        | 2        | 0        | 0        | 0
7        | 1        | 4        | 3        | 2        | 0
8        | 0        | 0        | 5        | 4        | 1
9        | 0        | 0        | 1        | 3        | 6
10       | 1        | 3        | 3        | 2        | 1
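The percentages quoted in the results can be recomputed from the career rows of the survey table with a short Python sketch. The counts are copied from the table. The function names and the way the tie between home runs and strikeouts is broken in code (by counting second and third choices) are illustrative assumptions that follow the description in the results, not a record of the original Excel work.

```python
# Career choices from the survey table: choice -> number of 1's, 2's, 3's, 4's, 5's.
career_counts = {
    6: [8, 2, 0, 0, 0],   # consistently high batting average
    7: [1, 4, 3, 2, 0],   # consistently high number of home runs
    8: [0, 0, 5, 4, 1],   # career lasts many seasons
    9: [0, 0, 1, 3, 6],   # one record breaking season
    10: [1, 3, 3, 2, 1],  # very few strikeouts
}


def first_place_share(counts):
    """Percentage of coaches who ranked this choice number 1."""
    return 100 * counts[0] / sum(counts)


def second_and_third(counts):
    """Number of coaches who ranked this choice 2 or 3 (used to break ties)."""
    return counts[1] + counts[2]


shares = {choice: first_place_share(c) for choice, c in career_counts.items()}
order = sorted(
    career_counts,
    key=lambda ch: (career_counts[ch][0], second_and_third(career_counts[ch])),
    reverse=True,
)

print(shares)     # {6: 80.0, 7: 10.0, 8: 0.0, 9: 0.0, 10: 10.0}
print(order[:3])  # [6, 7, 10]
```

Choice 6 (batting average) has 80 percent of the first-place selections, and choice 7 (home runs) edges out choice 10 (strikeouts) with seven second and third choices against six.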
From the survey’s results, it was
seen that coaches do indeed have their own favorite characteristics for
batters. The hypothesis of this
experiment was only partially supported in the sense that there was a slight
difference at the top of the three versions but not a major one. Lou Gehrig placed first in the unweighted
version and Babe Ruth placed first in the weighted only version and the
weighted plus unweighted statistics version.
The majority of the top batters placed in the top in
all three methods, but there are exceptions to this. For example, Todd Helton placed fourth and received 12 points in
the weighted only version, but only received a total of 12 points in the
other two versions combined. Among
the top five players, there was only a difference of seven points, and each of
the top players placed at least third in one of the three versions. Lou Gehrig,
the player with the most points after all three versions were completed, placed
1st in the unweighted version, 2nd in the weighted
version plus the unweighted statistics category, and 3rd in the
weighted only version.
It is interesting to note that all of the top five
players were from the older years.
Perhaps they were more dedicated to the sport instead of the
dollars. If this project were completed
again, the survey should be more clearly written with more precise directions
so the choices would be more specific and the wording would be clearer. This project could be used to show fans
around the world that there are many other characteristics of a batter that
should be considered while watching baseball games other than just home runs.
The project could also show people that more credit should be given to those
batters who excel in the other categories instead of just focusing on only a
couple of high profile batters.
I would like to thank my
computer teacher, Mrs. Dansby, for teaching me Microsoft Excel, without which
this project would have been impossible.
I would like to thank my science
teacher, Mrs. Stevens, for all the time and assistance she has given
me.
1. Bonavita, M., HITTING BASICS, SPORTING NEWS, JULY 17, 2000
2. Schmuck, P., DOES SIZE REALLY MATTER?, BASEBALL DIGEST, JULY, 2001
3. Deveney, S., DEEP THOUGHTS. (BASEBALL PLAYERS GIVE THEIR OPINION ON HITTING HOME RUNS), SPORTING NEWS, APRIL 23, 2001
4. Vass, G., HOW GAME HAS CHANGED IN 68 YEARS, BASEBALL DIGEST, FEBRUARY, 2002
5. Johnston, R., ELEMENTARY STATISTICS, seventh edition
6. COACH EDDIE COLLINS, OKLAHOMA BASEBALL COACH HALL OF FAMER
7. THE BASEBALL ENCYCLOPEDIA, tenth edition
8. Baseball statistic website, www.baseballreference.com
9. Jim Albert and Jay Bennett, CURVE BALL, July, 2001