What’s the most fashionable idea in major league baseball right now? That Jim Thome’s 600th home run is more impressive than Derek Jeter’s 3000th hit? I only wish. Shirseys? Nope. The MLB Fan Cave? Absolutely not.
No, the most fashionable idea right now is pointing out that the San Diego Padres, in last place in the National League West, have scored more runs than they’ve allowed, which gives the Padres a better run differential than a lot of other teams.
It’s true. At the end of play on Sunday, August 21, the Padres had scored 496 runs but allowed only 492, for a run differential of +4, yet their win-loss record was 59-70. The San Francisco Giants, on the other hand, had scored 439 runs and allowed 454 for a run differential of -15, yet had a win-loss record of 68-60.
And the AL Central? By the end of the day on August 21, the Tigers had scored 573 runs and allowed 572 (+1) but were leading the AL Central with a win-loss record of 68-58. The Indians were at -7 but with a winning record of 62-61 and the White Sox were at -8 with an even record of 63-63.
Why all this attention on run differentials? Because runs scored, runs allowed and the run differential are the key ingredients in the Pythagorean Winning Percentage, the formula originally developed by Bill James to determine a team’s expected winning percentage. James theorized that a team’s expected winning percentage was a better indicator of a team’s performance than its actual winning percentage.
As explained by Baseball-Reference.com:
The rationale behind Pythagorean Winning Percentage is that, while winning as many games as possible is still the ultimate goal of a baseball team, a team’s run differential (once a sufficient number of games have been played) provides a better idea of how well a team is actually playing. Therefore, barring personnel issues (injuries, trades), a team’s actual W-L record will approach the Pythagorean Expected W-L record over time, not the other way around. Expected W-L is almost always within 3 games of actual W-L at the end of a season (although a recent exception is the 2005 and 2007 Arizona Diamondbacks, who both beat their expected W-L by 11 games). Deviations from expected W-L are often attributed to the quality of a team’s bullpen, or more dubiously, “clutch play”; many sabermetrics advocates believe the deviations are the result of luck and random chance.
James’ original formula was simple and straightforward:
W%=[(Runs Scored)^2]/[(Runs Scored)^2 + (Runs Allowed)^2] (^2=to the power of 2)
Over the years, sabermetricians have developed variations on the original. If you’re interested in that nitty-gritty, you can read more here and here. I’m using the Baseball-Reference formula:
W%=[(Runs Scored)^1.83]/[(Runs Scored)^1.83 + (Runs Allowed)^1.83]
Back to the Padres. With 496 runs scored and 492 runs allowed, the Padres’ Pythagorean winning percentage is about .504, which would result in a win-loss record of 65-64. The Padres’ actual record is 59-70. That’s a difference of 6 games. How to explain this? Is it the result of trades or injuries or luck or something else?
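For anyone who wants to check the arithmetic, here is a minimal Python sketch of the Baseball-Reference version of the formula (exponent 1.83) applied to the Padres’ numbers above. The function names are my own, and rounding the expected wins to the nearest whole game is an assumption about how the expected record is derived.

```python
def pythagorean_pct(runs_scored, runs_allowed, exponent=1.83):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_record(runs_scored, runs_allowed, games, exponent=1.83):
    """Expected (wins, losses) over `games`, rounded to whole games (an assumption)."""
    wins = round(pythagorean_pct(runs_scored, runs_allowed, exponent) * games)
    return wins, games - wins


# Padres through August 21: 496 runs scored, 492 allowed, actual record 59-70.
games = 59 + 70
pct = pythagorean_pct(496, 492)
wins, losses = expected_record(496, 492, games)
print(f"{pct:.4f}", wins, losses)  # 0.5037 65 64
```

Passing `exponent=2` gives James’ original version of the formula instead.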
The answer is: I don’t know. But I’ve noticed something interesting about the Padres’ runs scored/runs allowed numbers that may shed some light on the question.
With 496 runs scored over 129 games, the Padres average 3.84 runs scored per game. And with 492 runs allowed over 129 games, they average 3.81 runs allowed per game. The Padres, like all teams, have outliers at the extremes: games where they scored significantly more than their average runs scored per game and games where they allowed significantly more than their average runs allowed per game. The outliers, it turns out, appear to be related to the fact that the Padres’ Pythagorean Winning Percentage is significantly better than their actual winning percentage.
Let me explain.
I looked at the final scores of all Padres games to date this season here. I identified the Padres’ five best games in terms of runs scored and added up the runs scored in those five games (69). I then calculated those runs as a percentage of the total number of runs scored in the season (69/496=13.9%). I did the same with the Padres’ five worst games in terms of runs allowed (59/492=12%).
The Padres’ five best games in terms of runs scored account for a greater percentage of the Padres’ total runs scored than the Padres’ five worst games in terms of runs allowed account for their total runs allowed. That tells me that the Padres’ five best games for run production are skewing the Pythagorean Winning Percentage toward a more favorable record than the Padres’ actual record.
I then ran the numbers for the top 3 teams in the AL Central. Remember, we started this journey with the observation that the Padres had a better run differential than all the teams in the AL Central. I wanted to see if the outliers for the Tigers, Indians and White Sox showed any relationship to the difference between those teams’ actual winning percentage and their Pythagorean winning percentage. I also ran the numbers for the other teams still contending for a playoff berth: Red Sox, Yankees, Rangers, Angels, Phillies, Braves, Brewers, Diamondbacks and Giants.
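The outlier calculation is easy to reproduce. Here is a rough Python sketch: `top_n_share` is a name I made up, and the short game log is hypothetical, included only to show how the function behaves. The Padres figures at the end use only the totals quoted above (69 of 496 and 59 of 492), not a real game-by-game log.

```python
def top_n_share(runs_by_game, n=5):
    """Fraction of the season total that comes from the n highest-run games."""
    total = sum(runs_by_game)
    top = sum(sorted(runs_by_game, reverse=True)[:n])
    return top / total


# Hypothetical 10-game log, NOT actual Padres results.
sample_log = [2, 0, 13, 4, 1, 3, 5, 2, 7, 3]
sample_share = top_n_share(sample_log, n=2)  # (13 + 7) / 40 = 0.5

# Padres totals quoted in the article.
scored_share = 69 / 496   # five best games for runs scored
allowed_share = 59 / 492  # five worst games for runs allowed
print(f"{scored_share:.1%} vs {allowed_share:.1%}")  # 13.9% vs 12.0%
```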
Here are the results:
| Team | Runs Scored | Runs Allowed | Actual Record | Pythagorean Record | Top 5 Games in Runs Scored | Top 5 Games Runs Scored/Total Runs Scored | Top 5 Games Runs Allowed | Top 5 Games Runs Allowed/Total Runs Allowed |
Like the Padres, the Yankees and the Rangers are underperforming when compared to their Pythagorean Winning Percentage. But unlike the Padres, the Yankees’ and the Rangers’ five best run-scoring games, as a percentage of their total runs scored, are no greater than their five worst runs-allowed games as a percentage of total runs allowed. For the Yankees, the percentages are the same (11.9%). For the Rangers, the five worst games for runs allowed % is slightly higher than the five best games for runs scored % (12.3% v. 11.8%). Perhaps this means that the Yankees and Rangers are due for regression to their actual record. Or perhaps it means that my observation about the Padres is meaningless. Or something else.
But wait, there’s more.
The Tigers, White Sox, Angels, Phillies, Braves and Brewers follow the opposite pattern. Those teams are all outperforming their Pythagorean Winning Percentages. And for all of them, the five worst games in terms of runs allowed as a percentage of total runs allowed is greater than the five best games for runs scored as a percentage of total runs scored. The runs-allowed outliers for these teams are skewing the Pythagorean Winning Percentage toward a less favorable record.
The Red Sox are performing precisely how their Pythagorean Winning Percentage expects. The Indians are right around there, too.
And that leaves us with the Giants and Diamondbacks, the two teams battling for the NL West division title. Well, if you can call their recent performances “battling.” Both teams are significantly outperforming their Pythagorean Winning Percentage and yet both teams’ five best games for runs scored as a percentage of total runs scored is significantly greater than the five worst games for runs allowed as a percentage of runs allowed. In other words, the Padres, Diamondbacks and Giants all follow the same pattern for their outliers, and yet the Padres are significantly underperforming, and the Giants and Diamondbacks are significantly outperforming, their Pythagorean Winning Percentages.
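To put a number on “underperforming” and “outperforming,” here is a small Python sketch that compares actual wins to Pythagorean expected wins for the Padres and Giants, using only the runs and records quoted earlier in this post. The helper name is mine, the exponent is the 1.83 from the Baseball-Reference formula, and rounding expected wins to whole games is an assumption. The Diamondbacks are left out because their run totals aren’t quoted here.

```python
def expected_wins(runs_scored, runs_allowed, games, exponent=1.83):
    """Pythagorean expected wins, rounded to whole games (an assumption)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return round(games * rs / (rs + ra))


# (runs scored, runs allowed, wins, losses) through August 21, from the article.
teams = {
    "Padres": (496, 492, 59, 70),
    "Giants": (439, 454, 68, 60),
}

# Positive gap = more actual wins than the formula expects.
gaps = {}
for name, (scored, allowed, won, lost) in teams.items():
    gaps[name] = won - expected_wins(scored, allowed, won + lost)

print(gaps)  # {'Padres': -6, 'Giants': 6}
```

By this measure the two teams are mirror images: the Padres are six wins short of their expected record and the Giants are six wins ahead of theirs.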
What does it mean? For sure, the Giants and Diamondbacks have had to contend with serious injuries to key players. And the Padres traded away key players at the deadline. So perhaps those factors are to blame for the topsy-turvy NL West.
At a minimum, it’s further confirmation that the National League West is the most volatile division right now. And the Padres are in line to be serious spoilers for either the Giants or the Diamondbacks. The Giants have five games remaining against the Padres (with two starting tonight) and the Diamondbacks have six games with the Padres.
The next five weeks will be interesting. Buckle up.