From Batting Average to On-Base Percentage: How Inductive Analysis Transformed Baseball Metrics and Team Success

Advanced statistics have moved to the forefront of the professional sports landscape. As these metrics have emerged over the past few decades, relying on traditional statistics alone to analyze a team's performance has become harder to justify. In baseball, success was traditionally measured by a player's ability to post a high batting average (AVG) and drive in a lot of runs (RBI) (Burroughs, 2018, pp. 248-265). Batting average is the rate at which a player gets a hit: hits divided by at-bats (Hits / At-Bats = Batting Average) (Major League Baseball, 2024). A run batted in is credited to a batter for each runner who scores as a result of the batter's action, which can include a hit, fielder's choice, sacrifice fly, or a bases-loaded walk or hit by pitch (Major League Baseball, 2024). A run batted in does not necessarily raise the batter's batting average, although it often does, since many RBIs come on hits. These two statistics were long treated as the most critical indicators of offensive success and wins. As the game has evolved, however, economists, statisticians, and mathematicians have repeatedly questioned whether they are the best way to measure a player's and a team's success. Bill James, the creator of sabermetrics, applied inductive reasoning to large quantities of baseball data and showed that on-base percentage correlates more strongly with a team's win percentage and success than batting average or runs batted in (Slowinski, 2024).

On-base percentage has become one of the key metrics in baseball today (Lewis, 2017). This paper examines how the inductive approach revealed and isolated the relationship between on-base percentage and win percentage, and how that breakthrough reshaped the strategies and decisions behind building a professional baseball team.

The problem with traditional metrics like batting average and RBIs is that they make it hard for mathematicians and statisticians to isolate the variables that actually drive outcomes (Perry, 2004; Gregg, 2017). That is why the inductive approach matters in showing that on-base percentage correlates with win percentage better than the traditional metrics do. There was a longstanding belief that batting average and RBIs were the most crucial statistics for a team's success (Foley, 2020), yet both have limitations when it comes to demonstrating a direct correlation with team success. Batting average is based solely on hits: it is the ratio of hits to at-bats, and plate appearances that end in a walk, hit by pitch, or sacrifice do not count as at-bats. While batting average measures a player's ability to get a hit, it fails to paint the entire picture of a player's potential offensive output. On-base percentage is not focused on the success of getting a hit; rather, it focuses on the ability not to fail (Berri, 2017, p. 178). Batting average does not account for other ways of reaching base, such as drawing a walk. Players who have a good eye and can draw walks have been undervalued compared with players who post high RBI totals and batting averages. RBIs measure the number of runs that score because of the batter's performance. Although RBIs measure run production, which contributes to team success, the statistic has prerequisites. Unless a player hits a home run (which counts as an RBI on its own), productive players, that is, players who can get on base themselves, must bat ahead of him. Only when they reach base does the batter get the chance to drive them in. RBI is therefore not an independent measurement of how good the player is (Grabiner, 1999; Lewis, 2003, p. 71). These limitations are what opened the door for the inductive approach to change the game of baseball.
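To make the contrast between the two statistics concrete, here is a minimal Python sketch. It uses the standard definitions: batting average as hits over at-bats, and on-base percentage as (H + BB + HBP) / (AB + BB + HBP + SF). The two hitters and their stat lines are hypothetical, chosen only for illustration, and the class and function names are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats for one hitter (hypothetical values only)."""
    at_bats: int
    hits: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def batting_average(line: BattingLine) -> float:
    # Hits divided by at-bats; walks and hit-by-pitches are ignored entirely.
    return line.hits / line.at_bats


def on_base_percentage(line: BattingLine) -> float:
    # Times on base divided by the plate appearances that count toward OBP.
    reached = line.hits + line.walks + line.hit_by_pitch
    chances = (line.at_bats + line.walks
               + line.hit_by_pitch + line.sacrifice_flies)
    return reached / chances


# Hypothetical hitters, not real players.
contact_hitter = BattingLine(at_bats=600, hits=180, walks=20,
                             hit_by_pitch=0, sacrifice_flies=5)
patient_hitter = BattingLine(at_bats=500, hits=135, walks=100,
                             hit_by_pitch=5, sacrifice_flies=5)

for name, line in [("contact", contact_hitter), ("patient", patient_hitter)]:
    print(f"{name}: AVG {batting_average(line):.3f}, "
          f"OBP {on_base_percentage(line):.3f}")
# contact: AVG 0.300, OBP 0.320
# patient: AVG 0.270, OBP 0.393
```

In this made-up case the hitter with the lower batting average reaches base far more often, which is the kind of value that batting average alone hides.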

Towards the end of the 1970s, sabermetrics was introduced to baseball thanks to Bill James and a few other sabermetricians. Sabermetrics is the application of statistical analysis to baseball data. With the increased relevance of sabermetrics, the game of baseball has become better at discovering the true value of players (Lewis, 2003, p. 71). The key point about the inductive approach that James and other sabermetricians understood was that it would not lead directly to the answer they were looking for. Rather, it ruled out theories that had previously been believed to be true, eliminating the weaker ones and opening the door to other possibilities. They did this by analyzing large amounts of data. Eventually they uncovered patterns that brought new discoveries and changed how baseball statistics are viewed (Kelly, 2023).

Real-world evidence of applying on-base percentage theory to a professional team can be seen in the early 2000s Oakland Athletics. Under General Manager Billy Beane, the A's applied the inductive approach by prioritizing on-base percentage when acquiring players. They built a team of undervalued players who were proficient at getting on base. The A's did all of this with one of the most constrained budgets in baseball, still ranking in the bottom five of all teams to date (Orinick, 2024). With these factors apparently holding them down, the team went on to become one of the greatest Cinderella stories in sports. The strategy allowed the A's to compete consistently, year in and year out, with teams that had far larger budgets. The A's success in prioritizing on-base percentage is empirical evidence that the inductive approach worked. The team was able not only to build an efficient offense but also to compete at the highest level with one of the lowest budgets in the entire league. This challenged the traditional way players were valued and how a major league baseball team was "supposed" to be built (Digby, 2024).

Bill James's first concept was a theory called "offensive efficiency," the first theory of the sabermetric era. It holds that a player with a higher on-base percentage gives a team a better chance of scoring than a player with a higher batting average. With this theory came a stat known as runs created, which estimates the number of runs a hitter contributes to their team. A team with a higher collective on-base percentage tends to create more runs, so a team is more offensively efficient when its on-base percentage is higher. This makes on-base percentage a more reliable metric than batting average for predicting a team's performance and ability to score runs (Teeter, 2014; Hooper, 2017). The table below shows the collective team statistics for the 2016 MLB regular season. Its purpose is to compare each team's estimated runs (runs created) with the runs it actually scored, and to see whether team on-base percentage correlates with runs scored. The run conversion rate is calculated as: Run Conversion Rate = Runs Scored / Runs Created. It shows how efficiently each team turned its runs created into actual runs. The teams are ordered from most productive offense (runs scored) to least productive. Green marks teams ranked in the top 10 of a category and red marks teams in the bottom 10.

(Major League Baseball, 2016)(Sabermetrics, 2016)
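The run conversion rate described above can be sketched in a few lines of Python. This sketch assumes the basic version of James's runs created formula, (H + BB) x TB / (AB + BB); the source data may rely on a more refined version. The team totals are hypothetical placeholders, not 2016 figures, and the function names are assumptions of this sketch.

```python
def runs_created(hits: int, walks: int, total_bases: int, at_bats: int) -> float:
    # Basic runs created: (times on base * total bases) / opportunities.
    # Assumption: the simplest form of the formula, not a refined variant.
    return (hits + walks) * total_bases / (at_bats + walks)


def run_conversion_rate(runs_scored: int, runs_created_estimate: float) -> float:
    # Run Conversion Rate = Runs Scored / Runs Created.
    # Above 1.0: the team scored more than the estimate; below 1.0: fewer.
    return runs_scored / runs_created_estimate


# Hypothetical team totals for one season (placeholders for illustration).
estimate = runs_created(hits=1400, walks=500, total_bases=2300, at_bats=5500)
rate = run_conversion_rate(runs_scored=750, runs_created_estimate=estimate)

print(f"Runs created: {estimate:.1f}")     # Runs created: 728.3
print(f"Run conversion rate: {rate:.2f}")  # Run conversion rate: 1.03
```

A rate close to 1.0 suggests the estimate tracks actual scoring well, so ranking teams by it mostly shows who over- or under-performed their opportunities.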

I added the team on-base percentage (OBP) stat to determine whether OBP really does correlate with more runs scored, and hence with more chances of winning games. Plotting the numbers produced the scatter plot below.

Although there are a few outliers, the general trend in the scatter plot is clear: a higher on-base percentage goes with more runs scored. Now let's look at the scatter plot of team batting average and runs scored.

The relationship between team batting average and runs scored is much less consistent than the one between team on-base percentage and runs scored. There is a very slight trend in this scatter plot, yet some teams with lower batting averages still scored more runs than teams with higher batting averages. Taken together, the two scatter plots show that on-base percentage is the more reliable indicator of run scoring.
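The visual comparison of the two scatter plots can also be put into a single number with the Pearson correlation coefficient. The sketch below is one possible way to do that and is not necessarily how the plots above were produced. The five-team sample is made up for illustration and is not the 2016 data; substitute the real team columns to repeat the comparison.

```python
from math import sqrt


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical five-team sample (placeholder values, NOT real 2016 figures).
team_obp = [0.310, 0.320, 0.330, 0.340, 0.350]
team_avg = [0.255, 0.265, 0.250, 0.270, 0.260]
team_runs = [650, 690, 720, 760, 800]

r_obp = pearson_r(team_obp, team_runs)
r_avg = pearson_r(team_avg, team_runs)
print(f"OBP vs runs: r = {r_obp:.2f}")
print(f"AVG vs runs: r = {r_avg:.2f}")
```

Values of r near 1 indicate a tight upward trend and values near 0 indicate little linear relationship, which matches the difference between a clear trend and a very slight one.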

The empirical evidence has solidified the relationship between on-base percentage and a team's success. Empirical evidence pulls back the curtain on traditional statistics and helps dig deeper for answers; the scatter plots above indicate that traditional metrics do not tell the whole story about which statistics drive a team's success.

The discovery of on-base percentage's link to winning has altered how baseball is viewed. It is a compelling example of how the inductive approach can reveal where traditional metrics are unclear or conflicting as the basis for a theory. By challenging these metrics, people in the game like Bill James and Billy Beane have revolutionized how baseball is run, from how a team is built to how a player is valued.

Bibliography:

 Burroughs, B. (2018). Statistics and baseball fandom: Sabermetric Infrastructure of Expertise. Games and Culture, 15(3), 248–265. https://doi.org/10.1177/1555412018783319 

Batting average: Glossary. MLB.com. (2024). https://www.mlb.com/glossary/standard-stats/batting-average 

Runs batted in (RBI): Glossary. MLB.com. (2024b). https://www.mlb.com/glossary/standard-stats/runs-batted-in# 

Slowinski, P. (2024). OBP. Sabermetrics Library. https://library.fangraphs.com/offense/obp/ 

Lewis, J. (2017, October 3). On base percentage vs. batting average: Which is more important?. Bleacher Report. https://bleacherreport.com/articles/40132-on-base-percentage-vs-batting-average-which-is-more-important 

Perry, D. (2004, February 19). Baseball Prospectus basics: Measuring offense. Baseball Prospectus. https://www.baseballprospectus.com/news/article/2562/baseball-prospectus-basics-measuring-offense/ 

Franco, T. (2024, July 1). Use of machine learning and deep learning to predict the outcomes of Major League Baseball matches. https://www.researchgate.net/publication/351597865_Use_of_Machine_Learning_and_Deep_Learning_to_Predict_the_Outcomes_of_Major_League_Baseball_Matches 

Gregg, P. (2017, September 14). The Struggle to Define ‘Valuable’: Tradition vs. Sabermetrics in the 2012 AL MVP Race. Society for American Baseball Research. https://sabr.org/journal/article/the-struggle-to-define-valuable-tradition-vs-sabermetrics-in-the-2012-al-mvp-race/ 

Foley, J. (2020, May 11). Twinkie Town Analytics Fundamentals: The flaws of batting average. Twinkie Town. https://www.twinkietown.com/2020/5/11/21253031/twinkietown-analytics-fundamentals-come-learn-baseball-with-john-sabermetrics-batting-average-flaws 

Berri, D., Downward, P., Dawson, A., & Dejonghe, T. (2017). Sports economics (p. 178). Routledge. 

Grabiner, D. (1999, February 10). The Sabermetric Manifesto. Sabermetric manifesto - the baseball archive. https://web.archive.org/web/20090323044515/http://baseball1.com/bb-data/grabiner/manifesto.html 

Lewis, M. (2003). Moneyball: The art of winning an unfair game (pp. 71, 73). W.W. Norton & Company.

Kelly, M. (2023, February 16). What is Sabermetrics?. MLB.com. https://www.mlb.com/news/sabermetrics-in-baseball-a-casual-fans-guide 

Orinick, S. (2024, April). MLB team payrolls. Major League Baseball Team Payrolls 1998-2024. http://www.stevetheump.com/Payrolls.htm 

Digby, E. (2024, June 27). The real story behind Moneyball: How analytics changed baseball. RunPee. https://runpee.com/the-real-story-behind-moneyball-how-analytics-changed-baseball/ 

Teeter, C. (2014, February 5). Measuring a team’s offensive efficiency. Beyond the Box Score. https://www.beyondtheboxscore.com/2014/2/5/5380156/offensive-efficiency 

Hooper, T. (2017, April 7). Measuring offensive efficiency. Community Blog. https://community.fangraphs.com/measuring-offensive-efficiency/ 

2016 MLB team hitting stat leaders. MLB.com. (2016, October). https://www.mlb.com/stats/team/on-base-percentage/2016 

2016 Major League Baseball Sabermetric batting. Baseball. (2016, October). https://www.baseball-reference.com/leagues/majors/2016-sabermetric-batting.shtml 
















