Graphical Presentation of Batting Statistics for Major League Baseball Players and Teams
Historic and Current Stats

How good is Offense Ratio as a measure of run production?

To measure how well an estimate fits actual results, a regression analysis is used. If one set of data is graphed against a second set and the resulting points fall along a well-defined line or curve, the equation of that line or curve can be used to predict results.
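As a minimal sketch of what such a regression does, the following fits a straight line to paired data by ordinary least squares. The function name fit_line and the sample points are illustrative assumptions (the points are made up to lie on a known line and are not baseball data); the source does not say which tool was used for its own fit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x*y cross products.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1 (not baseball data).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real data, xs would hold the yearly Offense Ratios and ys the yearly runs per game.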

When the actual runs per game for each year from 1920 to 1997, for the National League and American League combined, are graphed against the corresponding Offense Ratio for each year (computed from the combined hits, walks, total bases and at bats of both leagues), the data points fall along a well-defined straight line.

The fit is even tighter when the data are broken up into two periods, 1920 to 1939 and 1950 to 1997 (see the chart below). The exclusion of the decade of the 40s is explained in the War Years file.

Correlation Chart

The equations for the lines representing the estimated runs per game based on the Offense Ratio are as follows:

1920-1939: RPG = 9.736 x OR - 1.719

1950-1997: RPG = 9.095 x OR - 1.622
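The two lines above can be applied directly. The sketch below is only an illustration: the coefficients are copied from the equations, while the names ERA_LINES and estimated_rpg and the sample Offense Ratio of 0.675 are assumptions chosen for the example.

```python
# (slope, intercept) for each period, copied from the two equations above.
ERA_LINES = {
    "1920-1939": (9.736, -1.719),
    "1950-1997": (9.095, -1.622),
}


def estimated_rpg(offense_ratio, era):
    """Estimated runs per game for a given Offense Ratio and period."""
    slope, intercept = ERA_LINES[era]
    return slope * offense_ratio + intercept


# 0.675 is an arbitrary illustrative Offense Ratio, not a value from the data.
for era in ERA_LINES:
    print(era, round(estimated_rpg(0.675, era), 3))
```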

The accuracy of these equations in predicting results is measured by standard error, or by a number called the regression coefficient.

The standard error is the magnitude of the spread of the actual results about the line representing the estimate. Assuming the deviations from the line are roughly normally distributed, about 68% of the actual runs per game will be within one standard error on either side of the estimate line, about 95% will be within two standard errors, and about 99.7% will be within three standard errors.

The standard errors for the two equations are:

Percentage of Results Falling Within Range

Period       Standard Error    68%            95%            99.7%
1920-1939    0.11 RPG          +/-0.11 RPG    +/-0.22 RPG    +/-0.33 RPG
1950-1997    0.06 RPG          +/-0.06 RPG    +/-0.12 RPG    +/-0.18 RPG

This means that when the Offense Ratio is used to predict the average number of runs per game in either league in any year between 1950 and 1997, 99.7% of the estimates will be within plus or minus 0.18 runs per game of the actual number.
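The ranges in the table are simply multiples of the standard error. A short sketch, assuming normally distributed deviations (which is what the 68%, 95% and 99.7% figures rely on): the standard errors are taken from the table, while the names STANDARD_ERROR and error_band and the 4.5 RPG estimate are illustrative assumptions.

```python
# Standard errors in runs per game, taken from the table above.
STANDARD_ERROR = {"1920-1939": 0.11, "1950-1997": 0.06}

# Number of standard errors -> share of results expected inside the band,
# assuming the deviations from the line are normally distributed.
COVERAGE = {1: "68%", 2: "95%", 3: "99.7%"}


def error_band(estimate, era, k):
    """Range of k standard errors on either side of an estimated RPG."""
    half_width = k * STANDARD_ERROR[era]
    return estimate - half_width, estimate + half_width


# 4.5 RPG is an illustrative estimate, not a fitted value.
for k, share in COVERAGE.items():
    low, high = error_band(4.5, "1950-1997", k)
    print(share, round(low, 2), round(high, 2))
```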

The regression coefficient (in standard statistical usage, the correlation coefficient) is a measure of how well the estimate line represents the actual data. A value of 1.0 is an exact fit; a value of zero means there is no linear relationship between the two sets of data. The regression coefficients for the two equations are:

Period       Regression Coefficient
1920-1939    0.96
1950-1997    0.98
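For readers who want to reproduce a figure of this kind, here is a minimal sketch. It assumes that the number the text calls the regression coefficient is the Pearson correlation coefficient r; the source does not say whether it reports r or its square, so both are printed. The function name pearson_r and the sample points are made up for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


# Made-up points that lie close to, but not exactly on, a straight line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]
r = pearson_r(xs, ys)
print(round(r, 3), round(r * r, 3))
```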

The average in each league over the years 1920 to 1997 was approximately 4.5 runs per game.

The equation for runs per game as a function of Offense Ratio predicts RPG, for either league, with an accuracy of +/-0.18 runs per game, or +/- 4 percent, in 99.7% of the years between 1950 and 1997.

Given this level of accuracy, it becomes clear that walks are relatively important in producing runs. Applying this analysis to individual batters proceeds with the following scenario:

Assume that the same batter goes to the plate every time until 27 outs are recorded (one game). A full season is played in this fashion. Then the average number of runs per game for this one-man team will be:

RPG = 9.095 x OR - 1.622 = 9.095 x [(TB + BB)/(AB - H)] - 1.622            (1950 - 1997)
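The one-man team calculation can be sketched as follows, using the 1950-1997 equation exactly as written above. The function names and the season line are hypothetical; the numbers were chosen only so that the arithmetic is easy to follow (380 bases and walks against 380 outs gives an Offense Ratio of 1.0).

```python
def offense_ratio(total_bases, walks, at_bats, hits):
    """Offense Ratio: (TB + BB) / (AB - H), i.e. bases gained per out made."""
    outs = at_bats - hits
    if outs <= 0:
        raise ValueError("at bats must exceed hits")
    return (total_bases + walks) / outs


def one_man_team_rpg(total_bases, walks, at_bats, hits):
    """Estimated runs per game for a lineup made up of this one batter,
    using the 1950-1997 line: RPG = 9.095 x OR - 1.622."""
    return 9.095 * offense_ratio(total_bases, walks, at_bats, hits) - 1.622


# Hypothetical season line, not a real player.
rpg = one_man_team_rpg(total_bases=300, walks=80, at_bats=550, hits=170)
print(round(rpg, 3))
```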

This is analogous to the method used to calculate a pitcher’s earned run average. Although few pitchers pitch a full nine-inning game anymore, we estimate the number of runs that a pitcher would allow in an average complete game. If it works for pitchers, it should work for batters.

Therefore, one can rank batters by Offense Ratio, and have a high level of confidence that they would fall into the same order in terms of potential run production.
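Because the estimate line has a positive slope, sorting batters by Offense Ratio gives the same order as sorting them by estimated runs per game. The sketch below illustrates this with three hypothetical batters; the names, stat lines and helper functions are assumptions made for the example, not data from the site.

```python
def offense_ratio(tb, bb, ab, h):
    """Offense Ratio: (TB + BB) / (AB - H)."""
    return (tb + bb) / (ab - h)


def estimated_rpg(offense_ratio_value):
    """1950-1997 line from the text: RPG = 9.095 x OR - 1.622."""
    return 9.095 * offense_ratio_value - 1.622


# Hypothetical season lines (not real players).
batters = {
    "Batter A": dict(tb=300, bb=80, ab=550, h=170),
    "Batter B": dict(tb=250, bb=40, ab=580, h=160),
    "Batter C": dict(tb=280, bb=100, ab=520, h=150),
}

ratios = {name: offense_ratio(**line) for name, line in batters.items()}

# Rank from best to worst, once by Offense Ratio and once by estimated RPG.
by_or = sorted(ratios, key=ratios.get, reverse=True)
by_rpg = sorted(ratios, key=lambda name: estimated_rpg(ratios[name]), reverse=True)

for name in by_or:
    print(name, round(ratios[name], 3), round(estimated_rpg(ratios[name]), 2))
```

Both orderings agree, which is the point the paragraph above makes.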

The following chart compares the actual runs per game for the National League to the runs per game predicted by the Offense Ratio correlation, from 1920 to 1997.

In the next file the OR is used to compare single season and career performances of individual players.

Copyright © 1999, Paul M. Adel, All Rights Reserved