A little while before IPL 5, I started seeing a bunch of tournament previews/speculation/prognostications pop up. Many were interesting and insightful, but few of them offered concrete data on the latest performances of participating teams. Given how fundamental some version of the Elo rating system has become to most competitive video games, I thought it would be interesting to calculate Elo ratings of professional League of Legends teams and maintain that list going forward.
Picking an algorithm
After researching the various Elo-style rating systems, I became intrigued by Microsoft's TrueSkill algorithm. TrueSkill extends the original Elo rating system in two ways: it tracks a degree of uncertainty associated with each rating (much like the Glicko rating system), and it can rate individual players within a team. The latter isn't necessary for team ratings, but would come in handy if I ever decided to track individual player ratings. TrueSkill also converges on a rating's "correct" value more quickly than Elo, requiring fewer data points to give an accurate assessment.
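For comparison, here is a minimal sketch of the classic Elo update that TrueSkill builds on. This is plain Elo, not TrueSkill, and it is not the code behind the ratings list. The function names and the K-factor of 32 are assumptions chosen for illustration. Notice that each team is described by a single number, with no notion of how confident we are in it.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one game.

    k is the K-factor (assumed value): the most points a single game can move.
    """
    expected_a = expected_score(rating_a, rating_b)
    actual_a = 1.0 if a_won else 0.0
    delta = k * (actual_a - expected_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta
```

Two evenly rated teams each have an expected score of 0.5, so the winner gains half the K-factor and the loser drops by the same amount. An upset moves the ratings more, an expected result moves them less.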
The best part of my research was coming across this gem of a blog post by software developer Jeff Moser, explaining in layman's terms the fundamentals behind TrueSkill (I certainly didn't have the requisite mathematics training to easily understand the original paper). If you're interested at all in rating systems, I highly suggest you read through his post. Even better, he provides a C# open source project that's well-documented and complete with unit tests. Since I'm using ASP.NET MVC for LoLPortal, this was perfect.
One thing I discovered while implementing his project is that it doesn't account for how uncertainty grows over time. Since a League of Legends competitive season is played over many months, and both the game and the competition are constantly changing, results from several months ago should matter much less than results from the past weekend. To address this, I took an approach similar to (albeit simpler than) the Glicko system: I introduced a constant that increases a team's uncertainty by a small amount for each day that the team does not play a match.
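The idea can be sketched roughly as follows. This is an illustration in Python rather than the actual C# code, and every name in it is hypothetical: `TeamRating`, `inflate_uncertainty`, and especially `SIGMA_PER_IDLE_DAY`, whose value here is a placeholder rather than the constant I use. Capping the uncertainty at TrueSkill's conventional starting value of 25/3 is also an assumption, made so that a long-idle team never looks less certain than a brand-new one.

```python
from dataclasses import dataclass
from datetime import date

INITIAL_SIGMA = 25.0 / 3.0   # TrueSkill's conventional starting uncertainty
SIGMA_PER_IDLE_DAY = 0.01    # hypothetical placeholder, not the real constant


@dataclass
class TeamRating:
    mu: float          # estimated skill
    sigma: float       # uncertainty about that estimate
    last_played: date  # date of the team's most recent match


def inflate_uncertainty(rating: TeamRating, match_day: date) -> TeamRating:
    """Widen sigma in proportion to the days since the team last played."""
    idle_days = max((match_day - rating.last_played).days, 0)
    new_sigma = min(rating.sigma + SIGMA_PER_IDLE_DAY * idle_days, INITIAL_SIGMA)
    # mu is untouched: we don't think the team got worse, we're just less sure.
    return TeamRating(rating.mu, new_sigma, rating.last_played)
```

Applied just before a team's next match is fed into the rating update, this makes a team returning from a long break move further on a single result than a team that played last weekend, which is exactly the behaviour I was after.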
I'd already compiled a list of tournament results for LoLPortal, but I hadn't tracked individual matches yet. Thankfully, the folks at LeaguePedia have a great database of major and minor tournament results and matches. Special thanks also to the many tournaments that archived their results rather than giving me 404 pages, as these archives were extremely helpful in determining exactly when matches were played. Where the dates were unclear, I estimated them based on the tournament dates and the LeaguePedia edit history. Manual data entry of hundreds of tournaments and thousands of matches wasn't the most intellectually stimulating thing I've ever done, but I got through it :).
The current algorithm has a couple of caveats that one should be aware of when analysing the list.
Every game is weighted equally
That is, a game from the grand finals of Season 2 is weighted the same as a game from a weekly online tournament. In practice, this hasn't turned out to be a big issue, since teams that are doing well generally post good results in both online and offline tourneys, and teams that are slumping likewise do poorly in both. I've thought about weighting the various tournaments differently, but haven't come up with a good way of doing so yet. If you have any thoughts about this, let me know! In the end, I chose the simpler route, which has worked out fine for now.
Update (December 13, 2012): I've experimented with tweaking certain variables in the TrueSkill algorithm based on tournament type. I've found that weighting offline tournaments slightly more heavily yields a very slight improvement in the number of matches the algorithm predicts correctly, and I've implemented this change. I'll continue to test different variable combinations and monitor how the algorithm does for upcoming tournaments.
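To make "matches correctly predicted" concrete, here is a minimal sketch of one way to score a set of variables. It is an assumption about the measurement, not my exact code: the function name and the tuple layout are made up for illustration, and the ratings passed in are assumed to be the ones each team held just before the match. A prediction counts as correct when the higher-rated team won.

```python
def prediction_accuracy(matches):
    """Fraction of matches won by the team that was rated higher beforehand.

    matches: iterable of (rating_a, rating_b, a_won) tuples, where the ratings
    are pre-match values (hypothetical layout for this sketch).
    Matches between equally rated teams make no prediction and are skipped.
    """
    correct = 0
    decided = 0
    for rating_a, rating_b, a_won in matches:
        if rating_a == rating_b:
            continue
        decided += 1
        predicted_a_wins = rating_a > rating_b
        if predicted_a_wins == a_won:
            correct += 1
    return correct / decided if decided else 0.0
```

Replaying the full match history once per variable combination and comparing the resulting fractions gives a simple way to tell whether a tweak helps or hurts.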
Comparison of ratings across regions is less certain than within the same region
League of Legends tournaments that feature cross-region competition are significantly rarer than same-region tournaments. The algorithm relies on these competitions so that it can hold in check any artificial inflation in ratings of top teams in one region that haven't competed against teams in another. While the abundance of competition between teams of the same region means that their ratings relative to each other are likely to be fairly accurate, comparing across regions is less accurate for now. Hopefully we'll see Season 3 feature more cross-region competitions as we saw towards the end of Season 2.
You can see the current ratings list here. In my next blog post, I'll examine how the ratings fared for IPL 5.