Posts Tagged most valueable player
I haven’t found an offshore book that currently has MVP odds posted, unfortunately. The odds above depend on the total number of players receiving votes, so if I limit the odds to only those in the top 10:
Once again, I decided to make an arbitrary formula for pitchers, since the distribution of voting points among pitchers who earned them is wildly inconsistent from year to year (largely due to the relatively low correlation between voting points and WPA, ERA, or WHIP). In contrast to the NL, where only five pitchers have even been considered for the award since 2000, 51 pitchers in the AL have received voting points over the last eleven years. Unfortunately this does little to clarify voting trends for pitchers, due to the aforementioned inconsistencies. Because of this, I used Pedro Martinez’s ’99 and ’00 seasons as models for what pitchers have to do, relative to the offensive players being considered for the MVP award, to finish in the top five. Essentially, a 10 WAR pitcher with a WPA around 7 or greater for a playoff team and an ERA+ of about 200 has a legitimate shot to win the MVP in any given season. Justin Verlander falls short of these arbitrary values, and the table above shows where he ranks in the top 15.
We can actually assess how many wins above average Verlander is worth, which may offer more clarity. The Tigers score 4.73 runs per game and are 25-9 when Verlander starts. For simplicity, let’s assume that psychological factors do not come into play, and that 4.73 r/g is solely contingent on the opposition’s listed starter. When Verlander doesn’t start, the Tigers allow 4.87 r/g. Using Pythagenpat, with an average pitcher taking Verlander’s 34 starts in the same run environment, the Tigers would win 16-17 of those 34 starts. This would place Verlander at between 8 and 9 wins above average for his team, and the Tigers would still win the division rather comfortably.
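The 16-17 win estimate can be reproduced in a few lines of Python. This is a minimal sketch, assuming the common Pythagenpat form in which the exponent is (runs per game) raised to 0.287; the post does not say which exponent constant was used, and the function name is only illustrative.

```python
def pythagenpat_win_pct(runs_scored_pg, runs_allowed_pg, exponent_power=0.287):
    """Expected winning percentage from per-game run rates (Pythagenpat)."""
    # The exponent floats with the run environment: x = (RS/G + RA/G) ** 0.287.
    # 0.287 is an assumed constant; nearby values are also in common use.
    x = (runs_scored_pg + runs_allowed_pg) ** exponent_power
    rs = runs_scored_pg ** x
    ra = runs_allowed_pg ** x
    return rs / (rs + ra)


# Figures quoted in the post: 4.73 runs scored per game, 4.87 allowed per
# game when Verlander doesn't start, 34 starts, 25 actual wins.
starts = 34
actual_wins = 25
avg_pitcher_wins = starts * pythagenpat_win_pct(4.73, 4.87)
wins_above_average = actual_wins - avg_pitcher_wins

print(round(avg_pitcher_wins, 1), round(wins_above_average, 1))
```

Under these assumptions the average-pitcher win total falls between 16 and 17, which leaves Verlander between 8 and 9 wins above average.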
Miguel Cabrera has made a vicious surge in September, with a ridiculous 1.291 OPS and an impressive 2.484 WPA, all this amidst a jaw-dropping .443 BABIP. For the season his BABIP is .363, not outlandish when you consider for his career his hit/contact rate approaches 35%.
Is he the MVP? He’s third in the AL in WAR, and again the table above merely reflects a voting trend for hitters since 2000. But this isn’t 2000. Sabermetrics is an unstoppable force for which there appears to be no barrier. If we rank the contenders solely by WAR, there is still a major flaw. WAR for pitchers and WAR for hitters are founded on different units. Can we convert performance metrics to one robust measure for both pitchers and hitters? Well one can measure runs allowed or runs produced per inning, but hitters account for three or four times as many innings as a typical starting pitcher.
One possible way would be to calculate how many runs the Tigers need to score to maintain that 25-9 record if an average pitcher pitched in place of Verlander. I’m going to use base runs to ensure the units are consistent, and the Tigers allow 4.79 BsR/g when Verlander doesn’t start. The quick way to find the runs needed to maintain a 69% winning percentage over 34 games is to use Solver in Microsoft Excel, and the answer is 7.16 BsR/g, which equates to .27 BsR/out over 27 outs. For Cabrera, use the BsR formula for offensive players to estimate total run production, and divide by the number of outs (AB – H). The result is .32 r/out. It’s an extremely crude way to compare hitters and pitchers, but intuitively, Cabrera being worth about .05 more r/out than Verlander is reasonable.
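For anyone without Excel, the Solver step can be approximated with a simple bisection search. This is a sketch under assumptions: it uses Pythagenpat with an assumed 0.287 exponent constant, takes the 69% target and 4.79 BsR/g allowed exactly as stated above, and the function names are my own rather than anything from the original spreadsheet.

```python
def pythagenpat_win_pct(rs_pg, ra_pg, exponent_power=0.287):
    """Expected winning percentage from per-game run rates (Pythagenpat)."""
    x = (rs_pg + ra_pg) ** exponent_power  # 0.287 is an assumed constant
    return rs_pg ** x / (rs_pg ** x + ra_pg ** x)


def runs_needed(target_win_pct, ra_pg, lo=0.1, hi=30.0, tol=1e-7):
    """Bisection search for the runs scored per game that hit a target win pct."""
    # For a fixed runs-allowed rate, expected win pct rises as runs scored
    # rise, so halving the bracket converges on the answer.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if pythagenpat_win_pct(mid, ra_pg) < target_win_pct:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


OUTS_PER_GAME = 27  # nine innings, three outs each

needed = runs_needed(0.69, 4.79)      # target and runs allowed from the post
per_out = needed / OUTS_PER_GAME      # convert per-game rate to per-out rate

print(round(needed, 2), round(per_out, 3))
```

With these inputs the search lands a little above 7 runs per game, close to the 7.16 BsR/g figure; small differences can come from the exponent constant or from rounding the target winning percentage.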
I’m not finished yet. In proportion one can create a scenario where Verlander’s hypothetical offensive output mirrors his pitching output by removing hitters of similar value after a certain number of innings pitched to express innings pitched per start. This scenario was reconciled by the calculations on Verlander in the previous paragraph, but much of the variance can at times be explained by how well the bullpen performs. The goal is for the offense to score 7.16 BsR/g to achieve 25 wins in 34 games. If Verlander averages 7 IP/GS, then after 7 IP his hypothetical offensive performers will be removed from the lineup accordingly, though in this case his equivalent worth will continue on through the 9th inning. The Tigers currently average 4.86 BsR/g during Verlander’s starts, or 1.08 BsR every two innings, which means the Tigers with an offensive player of Verlander’s value inserted into the lineup every inning would score .29 r/out, increasing his runs per out by .02 runs. This explanation at least accounts for a pitcher’s ability to pitch late in games.
The AL is actually much easier to deal with because there is no “Barry Bonds” factor. Regardless, the formula has been changing daily, and after some thoughtful and sensible analysis I’ve arrived at the conclusion that voters are not consistent evaluators of MVP candidacy. There are relationships to be found between the distribution of voting points and the metrics that we use to gauge player performance, but that is only because there are only about five players each season that could even be considered. From there the selection of the ultimate winner is mostly driven by the motives of the people voting, and where their loyalties lie (See 2006 AL MVP). To elucidate this concept, I created a graph showing WAR for each winner and average WAR for the top 5 since 1990. Now obviously I wouldn’t expect a straight line from left to right, nor a steady increase. The concept being elucidated is not one to show fault of the voters, but of the unpredictability of how voters view an MVP winner. It appears to change from year to year.
At first glance, one might think this is simply a representation of fluctuating talent. But the statistic itself adjusts for league-wide scoring trends in each particular season, and with each team having access to the diverse international talent pool, the average bio-mechanical limits of players are at a league-wide equilibrium, and have been since the talent pool expanded decades ago. Other than the steroid jerk from around 1998-2004, player ability, as revealed by the left side of the graph above, has neither increased nor decreased drastically in any given year. The year 2000 appears to be the only anomaly on the graph, steroids notwithstanding, and Pedro Martinez and his ridiculous 10.3 WAR (4th MVP) is enough to explain the spike. Stephen Jay Gould would be proud.
Statistics are becoming more and more sophisticated, and writers/bloggers are doing whatever they can to appear more sophisticated. Thus many of them have adopted WAR among other saber-stats. Because of this general propensity, I anticipated the lower WAR values for MVP winners to be from the 90s. To some degree this is true. Dennis Eckersley won the MVP in 1992, with a WAR of 3, outstanding for a relief pitcher (WAR is a counting statistic, so relievers have lower WARs by default). And Bill James will be happy to know swing-happy Juan Gonzalez has the lowest WAR for any MVP winner since 1990, at 2.8. But there is nothing else one can take from the graph other than randomness. Even the two highest WARs are from 1990 and 1991, Henderson and Ripken respectively. Obviously I didn’t expect the creation of WAR to bring an overall increase in player ability, which would be silly. I don’t know what I expected. It does seem I should increase my sample size to span the years dating back to 1990, and probably further, rather than only using the last eleven seasons. Having said that, the current formula correctly selected eight of the last eleven MVP winners, so all the extra effort would probably be wasted energy. I’m only doing this to find value.
As I said in the previous post, I separated the MVP candidates into three groups: hitters, starting pitchers, and relief pitchers. This should be obvious enough, as the metrics used to define the best players in each category are drastically different.
I had been entertaining the idea of including WPA (win probability added). Intuitively it makes sense that WPA is strongly linked to standard measures of offensive ability (AVG, HR, RBI), as most events within a game occur when the run differential is within three runs. However, pitchers aren’t always in control of their statistical fate, while WPA is taken directly from each individual event. Imagine a starting pitcher up 3-2 in the 7th inning with two outs leaving the game with runners on first and second. His replacement promptly surrenders a three-run HR. Two runs are charged to the starting pitcher, hurting his ERA, and he is now in line for the loss. Yet his WPA has not changed because of another pitcher’s event. The last measurement taken for his WPA was whatever occurred with the final batter he faced before being replaced. Because of this, there is a conspicuous asymmetry in the relationship between raw statistics and WPA.
Obviously there are situations when hitters could see an increase in WPA while seeing a reduction in AVG, perhaps due to an error by the defense. But the impact is not as severe.
Team wins is another variable I had used, but are team wins indicative of voting trends or merely a by-product of the best players playing on the best teams? If I replace team wins with just a binary appropriation of playoff outlook (0 for no, 1 for yes), the table is more in agreement with intuition while possessing similar descriptive statistics. Take a quick gander at the last MVP update and you’ll understand why I replaced team wins with a yes/no playoff variable. Human thought can occasionally outwit statistics, as long as it suits one’s agenda.
Batter: Playoff, WAR, WPA, BA, HR, RBI, C
Pitcher: Playoff, WAR, WHIP, W%, C
The variables above represent a trend from 2000-2010, so some statistics, like ERA, do not translate well to voting points. On a couple of occasions, a pitcher with a 4+ ERA received voting points, and the only reason WHIP is included is its lower overall variance. Nonetheless it works much better in this particular formula, and I can’t control what the voters decide. Again, I’m trying to find value based on historical data. Those who think Verlander is too low should consider that I only used data from 2000-2010, a span in which no pitcher won the MVP. If/when I include seasons dating back thirty years, Verlander’s odds may increase slightly.
The way the formula works, if a player is having an above average season on a great team, then they project favorably in the MVP predictor. Having said that, the formula allows for players having exceptional seasons on mediocre teams to make an impact on the distribution of voting points.
Last post was filled with out-loud ruminations on WPA and how it appears to correlate highly with the eventual MVP winners. Appearances can be converted to number form thanks to the invention of statistics. The linear correlation coefficient for the AL was .47, and for the NL it was .54. That’s a statistically significant relationship, which suggests that voting points and WPA are not independent data sets. The same can be said for WAR, which has a coefficient of .49 for the AL, and .57 for the NL.
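For readers who want to check a coefficient like these themselves, here is a minimal sketch of the linear (Pearson) correlation and the usual t-statistic for testing it against zero. The WPA and voting-point lists are made-up placeholders purely to show the mechanics, not real player data, and the function names are my own.

```python
import math


def pearson_r(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def t_statistic(r, n):
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Placeholder values only (NOT real data): one entry per player-season.
wpa = [1.0, 2.0, 3.0, 4.0, 5.0]
voting_points = [12.0, 9.0, 30.0, 25.0, 40.0]

r = pearson_r(wpa, voting_points)
print(round(r, 2), round(t_statistic(r, len(wpa)), 2))
```

Whether a coefficient around .5 is significant depends on the number of player-seasons in the sample, which is why the t-statistic takes the sample size as an input.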
WAR (bref version) was already included in the set of variables used for regression, and adding WPA appears to resolve more of the variance in voting points than before.
Here is the new AL MVP table:
Jacoby Ellsbury has made quite a surge lately, and when compared to the formula that doesn’t use WPA, his probability almost doubles. I should point out that I recently added stolen bases to the equation as well, which would explain why his chances increase by 100%, while Bautista, who has the highest WPA in the AL, only increases 2%.
Last year’s predictor formula did a solid job in predicting not only the eventual winners in both leagues (Votto and Hamilton) but also the order in which they finished. Thus I didn’t feel it was necessary to update the formula to include stats from 2010. The predictor uses numbers from 2000-2009 to regress certain statistics onto voting points. I also make adjustments for projecting stats to year-end numbers using a very crude and simple application of each player’s career statistics. In other words, I assume cumulative statistics are representative of a player’s overall ability. I made one further adjustment and regressed all batting averages to a .285 hitter. This may throw Bautista’s numbers askew, as obviously he’s not the same player he used to be. But at the same time, sustaining a 1.150 OPS for the rest of the year is a prodigious feat of hitting, certainly compared to league averages this year.
Many may be a little dumbfounded by the last row in the AL column. Never underestimate the power of a walk.
The winner of the MVP is ultimately constrained by the implications of the award name. Unlike the Cy Young, where from 2000-2009 a playoff appearance literally reduced a player’s chances of winning, the MVP is largely awarded to an offensive player from a team with post-season prospects.
Here are the latest odds via Sportsbook.com (AL followed by NL)
In the NL, the fact that neither Tulowitzki nor CarGo is even posted is puzzling to say the least. In that respect I might say the field at 4/1 has some consequential value. Some of the other candidates listed above seem out of place, and it’s obvious which ones. Sportsbook.com has failed to grasp a sense of reason here, though with Votto being the likely winner the other candidates are probably trivial. Still, the situation in the NL does warrant a few comments.
Since the Cardinals have fallen out of the race, I would expect Pujols to get little consideration in the final vote. Troy Tulowitzki’s historic September surge has catapulted him from practically nonexistent odds to viable contender, alongside his steady teammate Carlos Gonzalez. Votto is still the favorite; however, if the Rockies or Padres find a way to make the playoffs as a result of some heroic performance(s) by their respective MVP candidates, the voters will have to undergo thoughtful and sensible deliberation on the matter. Perhaps this may lead to a surprise winner.
In the AL, despite Josh Hamilton’s recent injury woes, a surprise winner is unlikely IMO. Hamilton is the considerable favorite, sitting at -300. Having said that, Cano at 5/1 is worth a look.
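To compare a book's posted odds against a model's probabilities, it helps to convert the odds into implied probabilities first. The sketch below uses the standard conversions for American (moneyline) and fractional odds; the function names are my own, and the results ignore the bookmaker's margin.

```python
def american_to_prob(odds):
    """Implied probability from American (moneyline) odds such as -300 or +500."""
    if odds < 0:
        # Favorite: risk |odds| to win 100.
        return -odds / (-odds + 100)
    # Underdog: risk 100 to win odds.
    return 100 / (odds + 100)


def fractional_to_prob(numerator, denominator):
    """Implied probability from fractional odds such as 5/1."""
    return denominator / (numerator + denominator)


hamilton = american_to_prob(-300)   # Hamilton at -300
cano = fractional_to_prob(5, 1)     # Cano at 5/1
nl_field = fractional_to_prob(4, 1) # NL field at 4/1

print(round(hamilton, 3), round(cano, 3), round(nl_field, 3))
```

By this conversion -300 implies a 75% chance, 5/1 implies about 16.7%, and 4/1 implies 20%. A bet has value only when the model's probability for that player exceeds the implied one.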
The odds are still not out, so my odds below have nothing to compare against yet.
(WAR stats are from baseball-ref.com)
The Cy Young race in both leagues is far more intriguing and layered. I’ll provide commentary at length with my odds compared to what Sportsbook.com shows.