Something Great

Arsene Wenger cobbled together starting lineups with spit and duct tape and Denilson and somehow the team dragged its ass over the finish line in third or fourth.

Tuesday, August 25, 2009

Fun With Statistics Part 1 - Relegation

Welcome to the first installment of Fun With Statistics. In this post TLOCA will examine the unique system of relegation. Does the system work? Let's take a look.

Since the year 2000, teams with the following point totals have been relegated from the Premiership to the Football League Championship (formerly the First Division). In order, starting from the campaign that ended in 2000:

24, 31, 33, 26, 34, 34, 28, 30, 36, 19, 26, 42, 33, 33, 33, 33, 33, 32, 34, 30, 15, 38, 34, 28, 36, 35, 11, 34, 32, 32

It's approximately a normal distribution and without Derby (2008 - 11 points) the distribution fits a bell curve even better.

Perhaps the data are skewed to the left* but that is understandable when you consider how payrolls are also heavily skewed to the left. I digress; payroll stats are a topic for another day.
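For the curious, the direction of the skew can be checked with a few lines of Python. This is only a sketch: the function name sample_skewness is made up for illustration, and it uses the simple moment-based definition of skewness, which is one of several in common use. A negative result means the long tail is on the left.

```python
from statistics import fmean

# The 30 relegated point totals listed in the post (2000 through 2009).
points = [24, 31, 33, 26, 34, 34, 28, 30, 36, 19, 26, 42, 33, 33, 33,
          33, 33, 32, 34, 30, 15, 38, 34, 28, 36, 35, 11, 34, 32, 32]


def sample_skewness(values):
    """Moment-based skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


skew = sample_skewness(points)
# A negative value means the tail stretches to the left (the low totals).
print(f"skewness = {skew:.2f}")
```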

So with a reasonably normal distribution we can use the standard deviation to tell us if relegated teams are statistically different from those that remain in the top flight.

If you know a little bit about stats then enjoy; otherwise just skip to the punch line below.

Let y = relegation points,
and we'll model them with N(30.6, 6.49), that is, a mean of 30.6 points and a standard deviation of 6.49 points.

Using a 95% confidence interval, which is to say, we're 95% confident a particular relegation score is between two values based on the numbers, we get

30.6 points plus/minus 3.4 points, or an interval of (27.2, 34.0) points**.
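If you want to check the arithmetic, here is a rough Python sketch. The mean and sample standard deviation come straight from the 30 totals above. The 3.4-point half-width is taken as given from the post, which does not say how it was derived, so HALF_WIDTH below is an assumption rather than something the code computes.

```python
from statistics import mean, stdev

# The 30 relegated point totals listed in the post (2000 through 2009).
points = [24, 31, 33, 26, 34, 34, 28, 30, 36, 19, 26, 42, 33, 33, 33,
          33, 33, 32, 34, 30, 15, 38, 34, 28, 36, 35, 11, 34, 32, 32]

avg = mean(points)   # arithmetic mean of the relegated totals
sd = stdev(points)   # sample standard deviation (n - 1 in the denominator)

# Assumption: the plus/minus 3.4 is copied from the post, not derived here.
HALF_WIDTH = 3.4
lower = round(avg - HALF_WIDTH, 1)
upper = round(avg + HALF_WIDTH, 1)

print(f"mean = {avg:.1f}, sd = {sd:.2f}")  # mean = 30.6, sd = 6.49
print(f"interval = ({lower}, {upper})")    # interval = (27.2, 34.0)
```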

Does the system work? If the system works then the three worst point totals are justifiably being relegated each year. We only care about the upper limit - 34 points - and how many times a team has been relegated when earning more than 34 points. If you glance at the data:

24, 31, 33, 26, 34, 34, 28, 30, 36, 19, 26, 42, 33, 33, 33, 33, 33, 32, 34, 30, 15, 38, 34, 28, 36, 35, 11, 34, 32, 32

you'll see it happens quite a bit. The question, though, can be asked in a better way: how many times was a team not relegated when earning 34 points or fewer? West Ham was relegated with 42 points several years ago, but it was still one of the 3 worst teams by point total that year. The system works on a relative basis, sending down the 3 worst teams relative to the others. That happened to be a year of high parity near the bottom, so sorry West Ham.
The key, then, is to find those teams that are no different (well, we're 95% confident that they're no different) from relegated teams in the past but still managed to stay in the Gloryship.

This has happened only once in the past 10 years. In 2005, West Brom was not relegated but only earned 34 points. So in essence, 4 teams should have been relegated that year, not 3. But hey, one in thirty ain't bad.
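Both counts can be sketched in Python. The relegated totals are the ones listed in the post; the only survivor total used is West Brom's 34 from 2005, since the post gives no others. The name survived_inside_interval is a made-up helper for illustration.

```python
# The 30 relegated point totals listed in the post (2000 through 2009).
relegated_points = [24, 31, 33, 26, 34, 34, 28, 30, 36, 19, 26, 42, 33, 33, 33,
                    33, 33, 32, 34, 30, 15, 38, 34, 28, 36, 35, 11, 34, 32, 32]

UPPER_LIMIT = 34  # upper end of the post's interval, rounded to whole points

# Teams relegated despite earning more than the upper limit.
above_limit = sorted(p for p in relegated_points if p > UPPER_LIMIT)
print(len(above_limit), above_limit)  # 5 [35, 36, 36, 38, 42]


def survived_inside_interval(points, upper_limit=UPPER_LIMIT):
    """True if a team that stayed up finished at or below the upper limit."""
    return points <= upper_limit


# West Brom stayed up in 2005 with 34 points, the one case the post flags.
print(survived_inside_interval(34))  # True
```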

Next up - Promotion

*I know intuitively the data look skewed to the right. Statisticians for some silly reason have accepted the fact that skewed to the "left" means the "hump" is on the right side. ...I don't like it either.

**I took out the outliers (Derby, Sunderland, etc.) and the average plus the interval still rounds to 34 points. I even took out the top 3 and bottom 3 point totals and the upper limit is still 34 points. Rounding in this case is OK because a team must earn an integer value for points.


Ben said...

Who knew stats could prove useful? Actually, my professor, she talked about it many times. And yes, explained all about the skewness. How fun. Now I get to have stats in class and in soccer.

Jim said...

We should rename this blog the Legend of Juan Arhancet. Kind of like the St. Louis Albert Pujols.