Friday, July 25, 2014

End of the BCS: (spoiler) would have benefited from a four or eight team playoff

Bye, Bye BCS, we hardly knew you.

The controversial bowl/playoff system was the most exclusive in American sports. Only the top two teams in the country advanced to the postseason.

In 2014, a brand new four-team playoff will debut. The altered format sparked the question: Retrospectively, who would have benefited from the new system?
In Athens, the crystal trophy is a tease seen temporarily at a Wal-Mart display.
An expanded playoff gives teams like Georgia more chances. Photo courtesy: Dean Janssen
First, I'll look at the four-team playoff and will then expand into a hypothetical eight-team one.

Four Teams

The SEC, Big 12, Pac-12 and Big Ten would have nearly equally benefited from a four-team playoff system, according to the BCS rankings released prior to bowl games. 

Below is the number of teams ranked No. 3 and No. 4 in the final BCS standings, dating back to the system's inception in 1998. It's sorted first by conference and then by team.

Then Alignment/Present Alignment:
SEC: 7/7
Big 12: 7/7
Pac-12: 7/8
Big Ten: 6/7
ACC: 1/2
Big East/AAC: 2/1
MWC: 2/0

As if Alabama fans weren't spoiled enough, they were the team that would have benefited the most from a four-team playoff. The Crimson Tide finished in the top four of the BCS standings in 1999, 2008, 2009, 2011, 2012 and 2013, winning the championship under Nick Saban in 2009, 2011 and 2012.

By Team:
3: Alabama
2: Michigan, Ohio State, Oregon, Stanford, TCU, Texas, USC
1: Auburn, Cincinnati, Colorado, Florida, Georgia, Kansas State, LSU, Miami, Michigan State, Nebraska, Oklahoma, Oklahoma State, Penn State, Virginia Tech, Washington
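As a quick sanity check on the tallies above, here is a minimal Python sketch. It is only an illustration, and it assumes 16 BCS seasons (1998-2013) with two extra playoff spots (No. 3 and No. 4) per season, so each tally should add up to 32. The variable names are placeholders of my own.

```python
# Teams ranked No. 3 or No. 4 in the final BCS standings, copied from the lists above.
THEN_ALIGNMENT = {"SEC": 7, "Big 12": 7, "Pac-12": 7, "Big Ten": 6,
                  "ACC": 1, "Big East/AAC": 2, "MWC": 2}
PRESENT_ALIGNMENT = {"SEC": 7, "Big 12": 7, "Pac-12": 8, "Big Ten": 7,
                     "ACC": 2, "Big East/AAC": 1, "MWC": 0}
# From the "By Team" list: {times ranked No. 3/4: number of teams with that count}.
TEAMS_BY_COUNT = {3: 1, 2: 7, 1: 15}

# Assumption: one No. 3 and one No. 4 per season, 1998 through 2013.
SEASONS = 2013 - 1998 + 1
EXPECTED_SPOTS = SEASONS * 2

then_total = sum(THEN_ALIGNMENT.values())
present_total = sum(PRESENT_ALIGNMENT.values())
team_total = sum(times * teams for times, teams in TEAMS_BY_COUNT.items())

print(f"expected spots: {EXPECTED_SPOTS}")
print(f"then alignment: {then_total}, present alignment: {present_total}, by team: {team_total}")
```

If any of the three totals disagreed with the expected number of spots, that would point to a miscounted team or conference.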

Obviously the BCS standings aren't perfect. For example, I would have been surprised if Virginia Tech (BCS No. 3) and Oklahoma (BCS No. 4) had gotten into the postseason ahead of Georgia (BCS No. 5) in 2007.

Teams who would have had the most appearances in a four-team playoff are: Alabama (6), Ohio State (5), Oklahoma (5), Florida State (4), LSU (4), Texas (4), USC (4), Miami (3), Auburn (3) and Oregon (3).

Eight Teams

If we were to expand the playoff to eight teams, the following conferences and teams would benefit. (This counts ONLY teams ranked Nos. 5-8 in the final BCS standings.)

Then Alignment/Present Alignment

SEC: 15/17
Big Ten: 11/12
Big 12: 14/11
Pac-12: 13/15
ACC: 3/5
MWC: 3/3
Big East/AAC: 2/1
WAC: 2/0
Independent: 1/1

By Team:
4: Florida, Georgia, Kansas State, Ohio State
3: Boise State, Oregon, Tennessee, USC, Wisconsin
2: Arkansas, Missouri, Oklahoma, Stanford, Texas, Utah, Virginia Tech
1: Arizona, Baylor, California, Florida State, Illinois, Iowa, Kansas, LSU, Louisville, Michigan, Miami, Notre Dame, Nebraska, Oregon State, Penn State, Texas Tech, Texas A&M, UCLA, Washington State

Adding six extra teams to the postseason makes Ohio State the big winner. The Buckeyes would have appeared in a playoff in six additional seasons beyond their BCS title game appearances during that time. Georgia, Florida, Kansas State, Oregon and USC would have appeared five more times.

Teams with the most total appearances in the hypothetical eight-team playoff are: Ohio State (9), USC (7), Florida (7), Oklahoma (7), Oregon (6), Alabama (6), Texas (6), Georgia (5), Florida State (5), LSU (5) and Kansas State (5).

Thursday, July 17, 2014

MLB Team Moneyball: Part III

This is Part III of a three-part series. This is the "Moneyball" section. We'll determine how efficiently teams spend their money.

In Part I of this series, we found that opening day salaries in major league baseball rose 85.6% from 2000 to 2013, a 37.2% real increase. In addition, 18 teams spent record amounts on their opening day roster in 2014.  

In Part II of this series, we found that the link between spending and winning varies from year to year, but over time, teams with high payrolls win more than teams with lower ones in both the regular season and the postseason.

Some management circles in major league baseball are concerned with winning, and others focus on winning efficiently. If only there were a way to do the latter: win games without having to spend money.

Oakland Athletics general manager Billy Beane made this concept, “Moneyball,” famous. His teams always ranked near the bottom of MLB in payroll (never higher than 16th in the 2000s), yet they kept making the playoffs. Beane couldn’t afford elite free agents in Oakland, so he used other approaches and exploited inefficiencies in the market to win.

In Part III of this series, we’ll find out if the movie got it right. Were the A’s the real “Moneyball” champs of the 2000s?

The approach in Part III will be similar to Part II, where we looked at how teams finished at varying levels of spending. We determined there is some correlation over the long haul between spending and winning, but not as much on a year-to-year basis. This time, we'll compare how much teams spent with where they finished to see which ones spent efficiently.

This first graph plots teams into four quadrants based on the idea that if you spend more, you should win more, with payroll rank and MLB standing on the same 1-to-30 scale. The team carrying the highest payroll should finish with the best record. Most teams fall into two quadrants: either they are cheap spenders that finish better than expected, or they are heavy spenders that finish worse than expected.

The average payroll rank is plotted on the X-axis and the average MLB standing on the Y-axis because we’re trying to determine which teams have outperformed expectations based upon salary. The line on the graph is not a “best-fit” line per se; it’s just the equation Y = X.

The farther a team is from the Y = X line, the more its performance differed from expectations. Above the line: bad. Below: good. The vertical line crosses at X = 15.5 because that’s the midpoint of payroll rank.

Only five teams were above-average spenders that finished better than expected or below-average spenders that finished worse than expected. The Atlanta Braves and St. Louis Cardinals fall into the first category, and the Baltimore Orioles, Houston Astros and Colorado Rockies fall into the second.

The Rockies' well-documented problem is pitching at Coors Field. From 2000-13, the team's average ERA rank was 24.64 in MLB and 14th in the NL. The Orioles' issue is being trapped in a stacked division: the Yankees and Red Sox led MLB in opening day payroll from 2000-13.

Twenty-one of 30 teams averaged between 10th and 20th in the standings, which shows that teams gravitate toward the mean regardless of what they spend. The best-fit line, which is not shown, is Y = .467X + 8.26, so it's not a 1-to-1 relationship. The model predicts that the top payroll rank would finish at 8.73 in the standings. At 5, about 10.6. At 10, 12.93. At 15, 15.26.
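To read values off that best-fit line yourself, here is a minimal Python sketch. It only uses the rounded slope and intercept quoted above, so the results may differ slightly from the underlying spreadsheet; the function name is my own.

```python
# Rounded coefficients of the best-fit line quoted in the post.
SLOPE = 0.467
INTERCEPT = 8.26

def expected_standing(payroll_rank):
    """Average MLB standing the quoted line predicts for a given payroll rank."""
    return SLOPE * payroll_rank + INTERCEPT

for rank in (1, 5, 10, 15, 30):
    print(f"payroll rank {rank:>2}: expected standing {expected_standing(rank):.2f}")
```

Because the slope is well below 1, the line predicts that even the biggest spender finishes around eighth or ninth on average, and the smallest spender finishes well above last.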

Here’s the same data in table form, where we see that the A's were indeed the "Moneyball" kings of the 2000s. They finished 14.14 spots in the standings ahead of the theoretical expectation. Rounding out the top five are the Florida/Miami Marlins (+7.61), Tampa Bay Rays/Devil Rays (+6.72), the Minnesota Twins (+5.86), and the San Diego Padres (+5.39).

The common trend among the top-five teams is they all had basement payrolls. The A’s and Twins won despite the payroll, while the Marlins, Rays and Padres still finished below-average in the standings. Does finishing slightly below average with a bargain payroll constitute a success?

The five worst “Moneyball” teams according to this approach are the New York Mets (-10), the Chicago Cubs (-9.68), the Baltimore Orioles (-7.29), the Seattle Mariners (-6.17) and the Los Angeles Dodgers (-5.21). The Cubs and Mets both took on bad contracts in the 2000s, such as the Mets' four-year, $66 million deal with Jason Bay or the Cubs' eight-year, $136 million deal with Alfonso Soriano. The Mariners haven't made the playoffs since 2001, when the team tied the MLB record with 116 wins.

This next scatter plot takes the hard numbers – salary and winning – and puts them on the same graph. On the X-axis is the inflation-adjusted average salary (2000-13), with a team’s winning percentage from the same period on the Y-axis. Based on the actual data, we’ve come up with a best-fit line, which is Win % = (.000000000872) * inflation-adjusted average salary + .4203. I’ve started this line at $50 million, so don’t extrapolate and say that a team would win 42 percent of its games if it had $0 on its opening day payroll. That would truly be Moneyball.

Here's how to read the graph: at $50 million average payroll (inflation-adjusted), a team would win roughly 46.4% of its games. At $75 million, that turns out to 48.6%. At $100 million, it’s 50.75%. At $125 million, 52.9%.
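For anyone who wants to plug in other payrolls, here is a minimal Python sketch of that same line. It assumes the rounded coefficients quoted above and a 162-game season; the function names are my own, and it refuses payrolls below the $50 million starting point rather than extrapolating.

```python
# Rounded coefficients of the best-fit line quoted in the post.
SLOPE_PER_DOLLAR = 0.000000000872
INTERCEPT = 0.4203
MIN_PAYROLL = 50_000_000  # the line is only drawn from $50 million up

def expected_win_pct(avg_payroll_dollars):
    """Winning percentage the quoted line predicts for an inflation-adjusted average payroll."""
    if avg_payroll_dollars < MIN_PAYROLL:
        raise ValueError("below $50 million the line is an extrapolation")
    return SLOPE_PER_DOLLAR * avg_payroll_dollars + INTERCEPT

def expected_wins(avg_payroll_dollars, games=162):
    """Same prediction expressed as wins over a season (162 games assumed)."""
    return expected_win_pct(avg_payroll_dollars) * games

for payroll in (50_000_000, 75_000_000, 100_000_000, 125_000_000):
    print(f"${payroll / 1e6:.0f}M: {expected_win_pct(payroll):.1%}, "
          f"about {expected_wins(payroll):.0f} wins")
```

A team's "Moneyball" score in this approach is simply its actual winning percentage minus the value this function returns for its payroll.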

This approach gives us a different picture, although the A’s (+7.2%) are still the "Moneyball" kings and the Twins fourth (+2.6%). The St. Louis Cardinals (+5.2%) and Atlanta Braves (+4.4%) are the new two-three combo.

If you remember from the last chart, the Braves and Cardinals were the only two teams that had an above-average payroll rank and still finished higher in the standings than expected. The Rays (+0.3%) and Padres (+0.3%) showed only modest gains.

The new bottom five is as follows: Orioles (-5.0%), Kansas City Royals (-4.5%), Cubs (-4.0%), Mets (-3.4%) and Pittsburgh Pirates (-3.3%).

We've already touched on the Orioles, Cubs and Mets. The Royals and Pirates have made the playoffs just once between them during the period studied.

Thanks for reading this series! 

Tuesday, July 15, 2014

How do MLB home run derby participants fare after the All-Star break?

There's a popular theory around baseball that participating in MLB's home run derby messes up a hitter's swing. After practicing too many home run cuts, a hitter then loses his normal swing and morphs into a player only going for the long ball.

I decided to test this theory by analyzing data on every home run derby participant from the past decade.

Before the home run derby, the 80 hitters batted .297 with a home run every 15.71 at-bats. Following the All-Star break, the same group of sluggers hit .290 with a home run every 18.74 at-bats.
Prince Fielder is a regular in the home run derby. He has two wins in five appearances. -- Justin Janssen/UT San Diego
In the first half, 71 percent of derby participants hit home runs at a better rate per at-bat and 61 percent hit for a higher batting average than in the second half of the season.
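To put the at-bats-per-home-run figures above on a more familiar scale, here is a minimal Python sketch. It converts them to home runs per at-bat and computes the relative drop; the 500 at-bat season is a hypothetical round number of my own, not something from the data.

```python
FIRST_HALF_AB_PER_HR = 15.71   # before the derby
SECOND_HALF_AB_PER_HR = 18.74  # after the All-Star break

def hr_per_ab(ab_per_hr):
    """Convert at-bats per home run into home runs per at-bat."""
    return 1 / ab_per_hr

def relative_change(before, after):
    """Fractional change from before to after (negative means a drop)."""
    return after / before - 1

rate_change = relative_change(hr_per_ab(FIRST_HALF_AB_PER_HR),
                              hr_per_ab(SECOND_HALF_AB_PER_HR))
print(f"change in home run rate: {rate_change:.1%}")

# Hypothetical 500 at-bat season at each pace, for scale only.
HYPOTHETICAL_AT_BATS = 500
print(f"first-half pace: {HYPOTHETICAL_AT_BATS * hr_per_ab(FIRST_HALF_AB_PER_HR):.1f} HR")
print(f"second-half pace: {HYPOTHETICAL_AT_BATS * hr_per_ab(SECOND_HALF_AB_PER_HR):.1f} HR")
```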

While those figures are quite overwhelming, there are other factors to consider before concluding that the derby causes home runs to drop, such as luck, career-year effects or better game planning by opponents.

It generally takes a lot of first-half home runs to be invited to the home run derby. This selection bias means the field is almost always made up of players coming off a good first half, which leaves plenty of room for second-half funks.

To illustrate this bias, only Ivan Rodriguez (2005) belted fewer than 12 home runs in the first half before entering the derby, while 37 of the 80 players (granted, in roughly 20 fewer games per team) failed to reach 12 in the second half.

Perhaps some of the players invited are having a career season, or at least a career first half. These players pop up every year and are sometimes rewarded with trips to the All-Star Game or home run derby.

In this year's competition, the Reds' Todd Frazier and the Twins' Brian Dozier are having the best seasons of their young careers. Last year, the Orioles' Chris Davis would have been in Roger Maris territory if he had maintained the pace of his 37 first-half home runs.

While the home run rate from the 10 participants in this year's derby is likely to drop in the second half of the season, it's still anyone's guess as to whether the home run derby causes it.

Update: 9:55 p.m.

I went back through the past decade and found the players who entered the All-Star break in the top 10 in the majors in home runs (breaking ties by fewest at-bats needed for home run total).

What I found was that 75 of the 100 hitters -- 34 of whom were in the home run derby -- did not maintain their first-half home run rate following the break. 76.5% of the derby entrants and 74.2% of those not participating had lower home run outputs in the second half.
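Here is a minimal Python sketch that back-calculates the head counts behind those two percentages and checks that they add up to the 75 hitters mentioned. The per-group counts are derived by rounding, so treat them as my reconstruction rather than figures taken from the data.

```python
TOP_HITTERS = 100         # top 10 in home runs at the break, over 10 seasons
DERBY_ENTRANTS = 34       # of those, the ones who took part in the derby
NON_ENTRANTS = TOP_HITTERS - DERBY_ENTRANTS

DERBY_DECLINE_SHARE = 0.765
NON_ENTRANT_DECLINE_SHARE = 0.742

# Reconstructed counts (assumption: the quoted percentages were rounded to one decimal).
derby_declined = round(DERBY_DECLINE_SHARE * DERBY_ENTRANTS)
non_entrants_declined = round(NON_ENTRANT_DECLINE_SHARE * NON_ENTRANTS)
total_declined = derby_declined + non_entrants_declined

gap_in_points = (DERBY_DECLINE_SHARE - NON_ENTRANT_DECLINE_SHARE) * 100
print(f"derby: {derby_declined} of {DERBY_ENTRANTS} declined")
print(f"non-derby: {non_entrants_declined} of {NON_ENTRANTS} declined")
print(f"total: {total_declined}, gap: {gap_in_points:.1f} percentage points")
```

The gap between the two groups is only a couple of percentage points, which is why the drop looks more like a top-slugger effect than a derby effect.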

Monday, July 14, 2014

Does higher spending yield more World Series titles? Part II

This is Part II of a three-part series. This section will determine the relationship between spending and winning, both in the regular season and postseason. 

In Part I of this series, we found that opening day salaries in major league baseball rose 85.6% from 2000 to 2013, a 37.2% real increase. In addition, 18 teams spent record amounts on their opening day roster in 2014.  

In Part III of this series, we’ll explore the concept of “Moneyball,” and determine which teams spent their payroll dollars most efficiently. 

Money may not buy happiness, but does it buy wins?

The New York Yankees are often criticized for their high payroll, with detractors saying that the lack of a salary cap gives them a significant advantage over their competitors. But the Yankees have won only one championship in the last 13 seasons, despite having an inflation-adjusted opening day payroll 37.8% and 64.4% higher than the next two most expensive organizations from 2000-13. This year is the first since the 1990s in which the Yankees didn't lead the league in payroll.

Before getting into championships, we have to start in the regular season. After all, a team has to win from April-September to qualify for October baseball. 

To account for the rise in opening day payrolls, even beyond inflation rates, we’ll be using a metric known as payroll rank, as explained in Part I. We found there was a 90.8% correlation and an 88.9% R-square between payroll rank and adjusted salary paid.

In the first scatter plot shown below, each individual payroll rank from 2000-13 is on the X-axis and where that team finished is on the Y-axis, with the theory being that spending causes winning. The raw data doesn’t support this hypothesis because the two have just a 38.4% correlation, with 0 being the minimum and 1 (or -1) being the max. We can also explain only 14.7 percent of the variability in where a team finished from its payroll rank.
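The link between those two numbers is simple if we assume a straight-line fit with a single predictor: the share of variability explained is the correlation squared. A minimal Python sketch (the function name is my own):

```python
def r_squared_from_correlation(r):
    """For a straight-line fit with one predictor, R-square equals the correlation squared."""
    return r ** 2

RAW_CORRELATION = 0.384  # payroll rank vs. MLB standing, every team-season 2000-13

explained = r_squared_from_correlation(RAW_CORRELATION)
print(f"correlation {RAW_CORRELATION:.3f} -> explains {explained:.1%} of the variability")
```

This shortcut only holds for a straight line. A curved best-fit line can produce an R-square that is not the square of the correlation.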

A related graph with inflation-adjusted payroll on the X-axis and winning percentage on the Y-axis yields a similar 39.1% correlation (not pictured). Another almost identical graph (again not pictured) reveals a correlation of 38.9% between payroll rank and win percentage using all data from 2000-13. The discussion doesn’t end there, though. If we take the average finishes instead of each individual data point, we find a much stronger correlation. This next scatter plot has the exact same data, but averages the MLB standing for each payroll rank from 2000-13.

Not surprisingly, our correlation has spiked to 81.6% and we can now explain nearly 70% of the variability. The Y = X line is shown to indicate what an identical relationship between the payroll rank and average MLB standing would look like, but this is not the best-fit curve. The top opening day payroll averaged fourth in the standings (the Yankees every year), while the second-highest annual payroll finished at an average of 10.5.

In the graph below, the X-axis remains unchanged, but average winning percentage is used instead of MLB standing on the Y-axis. We found in Part I that there is a 97.6% correlation between MLB standing and winning percentage, so naturally, the two graphs have almost identical correlations. There appear to be two outliers in this graph, with the 11th-ranked payroll winning at a 54.9% clip (second-most) and the sixth-cheapest payroll at 51.1%.

From the graphs above, we can conclude winning and spending are moderately related over the long term, but on a year-to-year basis, there are too many other factors that contribute to winning.

For the rest of Part II, we'll look at the postseason to determine if spending buys championships. This first chart compares the number of playoff appearances with each payroll rank, clustered in groups of five. There were 116 playoff appearances in the stretch (14 seasons * 8 teams + 4 additional wild cards). There are 420 observed teams, so the league average is a 27.6 percent chance of making the postseason.
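The arithmetic behind that league-average figure can be sketched in a few lines of Python. The constants come straight from the paragraph above; the names are my own.

```python
SEASONS = 14                    # 2000 through 2013
TEAMS_PER_SEASON = 30
PLAYOFF_TEAMS_PER_SEASON = 8    # the standard field during the period
EXTRA_WILD_CARDS = 4            # the additional wild cards noted above

playoff_appearances = SEASONS * PLAYOFF_TEAMS_PER_SEASON + EXTRA_WILD_CARDS
team_seasons = SEASONS * TEAMS_PER_SEASON
base_rate = playoff_appearances / team_seasons

print(f"{playoff_appearances} appearances / {team_seasons} team-seasons = {base_rate:.1%}")
```

Any payroll cluster that reaches the postseason more often than this base rate is beating the league average.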

What we found is that a team is more likely to make the postseason if it spends more money. Since 2000, not one team with the lowest or the ninth-lowest opening day payroll has appeared in the postseason. The data is clustered because of some oddities, such as the second-lowest spending team advancing to the postseason more times (four) than the third-most expensive team (three). Teams in the top third of payroll accounted for one-half of all postseason appearances, while the middle third had 31.9% and the bottom third 18.1%.

Now that we've made it into the postseason, it's time to find out if those expensive teams actually win more championships.

Granted, it's still a small sample size, but 13 of the 14 World Series winners and 24 of the 28 teams to make it to the final series had opening day payrolls in the top half of MLB. The only team in the bottom half of payroll to win the World Series was the Florida Marlins (25th) in 2003. What's interesting is that Nos. 16-20 did not appear in the World Series from 2000-13, striking out on all 16 postseason appearances.

Teams in the top half of MLB in opening day payroll appeared in the World Series during 30.4% of their postseason appearances, with the top rate (42.9%) belonging to the No. 11-15 cluster. Teams in the bottom half of MLB in opening day payroll converted 10.8% of their postseason chances into World Series appearances, although Nos. 1-5 (25%) and Nos. 6-10 (26.9%) didn't perform drastically better than Nos. 21-25 (20%) and Nos. 26-30 (18.2%).

These next two tables compare how differing seeds in the postseason fared and the numbers behind each finish in the postseason, all from 2000-13. I'm excluding the wild card loser (which consists of four teams) in the first set.

What's interesting is that the wild card teams finished with a better record than the No. 3 overall seeds. These days wild card teams are severely punished for not winning their division, having to face each other in a one-game playoff to move on to the Division Series. The MLB postseason is often criticized for its crapshoot nature. Two examples: wild card teams have been to the World Series more times than any other seed from 2000-13, and the World Series winners have lower regular season winning percentages than the teams that lost in the World Series, ALCS/NLCS and ALDS/NLDS.

Teams that made the postseason spent more than teams that didn't. The teams with home-field advantage spent drastically more than any other seed, while the World Series winners only outspent the league championship series losers when converting historical dollars into 2013 dollars.

Check back later in the week for Part III!

Thursday, July 10, 2014

Exploring the Correlation Between Winning and MLB Salaries: Part I

This is Part I of a three-part series. This section serves as an introduction to the other two sections. It also provides background on the history of free agency. 

Money doesn’t buy happiness, but does it bring rings? We'll unveil the data of money and championships in Part II of this series. 

In Part III of this series, we’ll explore the concept of “Moneyball,” and determine if the movie got it right.

Picture a world where an employer controls an employee for the duration of his working life. The employee doesn't have to worry about his long-term status with the company. Sounds like a good deal, right?

What if that employer also severely underpays the employee and can send him to another company without his consent? Doesn't sound as good.

MLB operated like this for nearly a century under the old reserve clause. The reserve clause allowed teams to indefinitely keep their existing players prior to the free agency era.

This system suppressed player salaries because players didn’t have access to an open market. The only way a player could become a free agent was to be given an “unconditional release,” which means the team didn’t want the player anymore. A player could also change teams if his current one decided to trade him.

In 1970, owners and players agreed to the 10/5 rule, meaning that a player with 10 years of service, including five with his current team, has the right to veto any trade. While that applies to only a few players, others have no-trade clauses in their contracts. Five years later, free agency replaced the reserve clause system, which had been in place since the late 1800s.

Free agency sparked a fear that contracts would balloon into amounts owners couldn't afford. To this day, an often-cited problem with the game is skyrocketing contracts, all guaranteed. For example, stars Robinson Cano and Albert Pujols recently signed $240 million contracts. Alex Rodriguez has made nearly half a billion dollars in his career.

Still, some players don't receive their true value because signing top-tier free agents can cause that team to lose a first-round draft pick. 

For all of these mega-deals, many more players are bound to small minor league contracts, where they lack union protection under MLB's collective bargaining agreement. Former minor league players have filed a class action lawsuit saying their wages didn't meet legal standards. 

This isn’t like the NBA or the NFL where draftees instantly make the team and can become free agents after four seasons. In MLB, it takes six years of service to become a free agent. And that’s if they make it to the big leagues.

About one-third of all first-round picks and half of second-round picks never make it to the majors, according to my analysis of the MLB draft from 1976-2005.

Teams are spending more than ever before on player contracts. In addition to free agents, teams are locking up younger players to long-term deals, buying out arbitration and free agent years, effectively closing the market on the player.

In 2014, 18 teams had a record opening day payroll, although adjusting for inflation that figure drops to eight or nine. Those teams are the San Diego Padres, Toronto Blue Jays, Milwaukee Brewers, San Francisco Giants, Cincinnati Reds, Kansas City Royals, Washington Nationals/Montreal Expos, Detroit Tigers, and possibly the Colorado Rockies, depending on the final 2014 CPI.

Here is the source where I found opening day payrolls. I’ve seen differing totals everywhere, but the 2014 totals are also here, and 2013 here.

The average team had $103 million on its opening day payroll in 2013, which is an 85.6 percent nominal increase from 2000. Adjusting for inflation (CPI), it’s really a 37.2 percent increase in that period.
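Here is a minimal Python sketch of how a nominal increase turns into a real one. It uses only the two percentages quoted above and works backward to the CPI growth they imply, so the inflation figure is derived, not looked up. The function names are my own.

```python
NOMINAL_GROWTH = 0.856  # average opening day payroll, 2000 to 2013
REAL_GROWTH = 0.372     # the same increase after adjusting for CPI

def real_growth(nominal_growth, inflation):
    """Deflate a nominal growth rate by cumulative inflation over the same period."""
    return (1 + nominal_growth) / (1 + inflation) - 1

def implied_inflation(nominal_growth, real_growth_rate):
    """Cumulative inflation implied by a nominal and a real growth rate."""
    return (1 + nominal_growth) / (1 + real_growth_rate) - 1

cpi_growth = implied_inflation(NOMINAL_GROWTH, REAL_GROWTH)
print(f"implied CPI growth, 2000-13: {cpi_growth:.1%}")
print(f"real payroll growth: {real_growth(NOMINAL_GROWTH, cpi_growth):.1%}")
```

Note that the adjustment divides the growth factors. Subtracting inflation from the nominal increase would overstate the real increase.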

Here’s the full chart:  

The only time CPI decreased from 2000-13 was in 2009, when the recession hit. MLB contracts have increased at a relatively fast pace. 

To move on any further in this series, it's important to understand a few concepts.

R-square… I’ll be using a lot of scatter plots in Parts II and III. Basically, R-square is the amount of variability explained by a best-fit curve. The higher the R-square (which ranges from 0 to 1), the more the curve explains the data. An R-square of .9 would explain 90 percent of the variability in the data.
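For the curious, here is a minimal Python sketch of how R-square can be computed from the actual values and a curve's predictions. The sample numbers are made up purely for illustration and are not from the payroll data.

```python
def r_squared(actual, predicted):
    """Share of the variability in `actual` that the predictions account for."""
    mean_actual = sum(actual) / len(actual)
    # Total variability: how far the actual values sit from their own average.
    total = sum((a - mean_actual) ** 2 for a in actual)
    # Unexplained variability: how far the actual values sit from the curve.
    unexplained = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - unexplained / total

# Hypothetical example: four observations and a curve's guesses for them.
actual = [2.0, 4.0, 6.0, 8.0]
close_fit = [2.5, 3.5, 6.5, 7.5]
flat_guess = [5.0, 5.0, 5.0, 5.0]   # always guessing the average

print(f"close fit:  {r_squared(actual, close_fit):.2f}")
print(f"flat guess: {r_squared(actual, flat_guess):.2f}")
```

A curve that hits every point scores 1, while always guessing the average scores 0.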

Payroll Rank… This is just where a team ranked in amount of money on its opening day payroll. Because there are 30 teams in MLB, the average payroll rank is 15.5.

Here is a chart displaying the correlation between payroll rank and inflation-adjusted opening day payrolls from 2000-13:

The R-square of this chart (.8891) and the correlation (-.908) are pretty good. Don't worry about the negative correlation; all that means is this graph has a negative slope. Not shown, but I’ve also plotted unadjusted payroll dollars. There, the R-square was .772 and the correlation -.8443, which means adjusting for inflation explains about half of the variability that we previously could not explain. The 11 percent that we still cannot explain is likely because salaries have increased faster than inflation.

MLB Standing … I don’t break ties when doing MLB standings. If the New York Yankees and Boston Red Sox both finish with the best record in MLB, they both finish in 1.5th place, because 1.5 is the midpoint between 1 and 2. MLB, like any other sports organization, has arbitrary tie-breakers. If three teams finish with the best record, then they’re all in second: (1 + 2 + 3) / 3 = 2. This is done so the average MLB standing is always 15.5.
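The tie rule above can be sketched in Python as follows. This is a minimal illustration, not the spreadsheet I actually used; the function name and the sample records are hypothetical.

```python
def mlb_standings(win_pct_by_team):
    """Return {team: standing}, giving tied teams the average of the places they share."""
    # Best record first.
    ordered = sorted(win_pct_by_team.items(), key=lambda item: item[1], reverse=True)
    standings = {}
    start = 0
    while start < len(ordered):
        # Find the end of the group of teams tied with the team at `start`.
        end = start
        while end < len(ordered) and ordered[end][1] == ordered[start][1]:
            end += 1
        # The group occupies places start+1 through end; every member gets the average.
        shared_place = (start + 1 + end) / 2
        for team, _ in ordered[start:end]:
            standings[team] = shared_place
        start = end
    return standings

# Hypothetical records, for illustration only.
two_way_tie = mlb_standings({"Yankees": 0.605, "Red Sox": 0.605, "Rays": 0.556})
three_way_tie = mlb_standings({"A": 0.600, "B": 0.600, "C": 0.600, "D": 0.500})
print(two_way_tie)
print(three_way_tie)
```

Because tied teams split the places they occupy, the standings always sum to the same total, which keeps the league average at 15.5 with 30 teams.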

Here is the scatter between MLB standing and win percentage, from 2000-13:

All I can say is too bad, Seattle. The Mariners are the top outlier in this scatter. They won an MLB-record 116 games in 2001, and did NOT win the World Series. This graph is on point until about 29th in MLB standing. It’s hard to predict how bad the worst team in MLB will be.

Check back for further installments in this three-part series.