Reputation: 569
I'm using TrueSkill to build a rating system for a tennis tournament among my friends. Games are 1v1, so I'm trying out the following:
from trueskill import Rating, quality_1vs1, rate_1vs1
alice, bob = Rating(25), Rating(25)
print('No games')
print(alice)
print(bob)
alice, bob = rate_1vs1(alice, bob)
print('First game, winner alice')
print(alice)
print(bob)
alice, bob = rate_1vs1(bob, alice)
print('Second game, winner bob')
print(alice)
print(bob)
This outputs the following:
No games
trueskill.Rating(mu=25.000, sigma=8.333)
trueskill.Rating(mu=25.000, sigma=8.333)
First game, winner alice
trueskill.Rating(mu=29.396, sigma=7.171)
trueskill.Rating(mu=20.604, sigma=7.171)
Second game, winner bob
trueskill.Rating(mu=26.643, sigma=6.040)
trueskill.Rating(mu=23.357, sigma=6.040)
I would have expected both players having the same rating after these two games, but I'll go with that, no issue. However, if I replace the second game with a draw and re-run the script:
alice, bob = rate_1vs1(bob, alice, True)
print('Second game, draw')
print(alice)
print(bob)
I get the following:
First game, winner alice
trueskill.Rating(mu=29.396, sigma=7.171)
trueskill.Rating(mu=20.604, sigma=7.171)
Second game, draw
trueskill.Rating(mu=23.886, sigma=5.678)
trueskill.Rating(mu=26.114, sigma=5.678)
bob seems to have a better ranking when having drawn than when having won.
What's going on here? What am I doing wrong?
Reputation: 5467
"I would have expected both players having the same rating after these two games"
The TrueSkill FAQ mentions that it "takes more recent game outcomes more into account than older game outcomes".
It looks like TrueSkill only remembers two numbers per player (mu and sigma), so if it ever wants to forget the past, it has to do some kind of exponential decay of its old knowledge.
"Bob seems to have a better ranking when having drawn than when having won. What's going on here?"
I don't know, but I think you did everything right. The answer is probably in the formula (or maybe the implementation). Note how the sigma has decreased more after the draw outcome, so the algorithm seems to think that it gained much stronger evidence from the drawn result. That the mu values move a lot more when the evidence is stronger is only logical. So the question to ask is: Why should it consider the draw to be more informative?
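One thing that may be worth ruling out before blaming the formula: as far as I know, rate_1vs1 returns the new ratings in the same order as its arguments (winner first). If so, `alice, bob = rate_1vs1(bob, alice)` stores Bob's new rating in `alice` and Alice's in `bob`, which would swap the labels in the second game. Below is a rough pure-Python sketch of the two-player update equations from the TrueSkill paper, so the effect can be checked without the package. The function `rate_1vs1_sketch` is a hypothetical stand-in, not the package's code, and the constants are the defaults I am assuming the package uses (beta = sigma/2, tau = sigma/100, draw probability 10%).

```python
from math import sqrt
from statistics import NormalDist  # Python 3.8+

# Assumed defaults (believed to match the trueskill package, not verified here).
MU, SIGMA = 25.0, 25.0 / 3
BETA, TAU, DRAW_PROBABILITY = SIGMA / 2, SIGMA / 100, 0.10
N = NormalDist()  # standard normal distribution


def rate_1vs1_sketch(first, second, drawn=False):
    """Update two (mu, sigma) pairs; `first` beat `second` unless drawn.

    The new ratings come back in the SAME order as the arguments.
    """
    (mu1, s1), (mu2, s2) = first, second
    var1, var2 = s1**2 + TAU**2, s2**2 + TAU**2  # add dynamics noise
    c = sqrt(2 * BETA**2 + var1 + var2)
    t = (mu1 - mu2) / c
    # Draw margin, scaled by c like t.
    eps = N.inv_cdf((DRAW_PROBABILITY + 1) / 2) * sqrt(2) * BETA / c
    if drawn:
        denom = N.cdf(eps - t) - N.cdf(-eps - t)
        v = (N.pdf(-eps - t) - N.pdf(eps - t)) / denom
        w = v**2 + ((eps - t) * N.pdf(eps - t)
                    + (eps + t) * N.pdf(eps + t)) / denom
    else:
        v = N.pdf(t - eps) / N.cdf(t - eps)
        w = v * (v + t - eps)
    new1 = (mu1 + var1 / c * v, sqrt(var1 * (1 - var1 / c**2 * w)))
    new2 = (mu2 - var2 / c * v, sqrt(var2 * (1 - var2 / c**2 * w)))
    return new1, new2


alice = bob = (MU, SIGMA)
alice, bob = rate_1vs1_sketch(alice, bob)  # game 1: alice wins

# Game 2, two alternatives. Names are unpacked in argument order.
bob_win, alice_loss = rate_1vs1_sketch(bob, alice)              # bob wins
bob_draw, alice_draw = rate_1vs1_sketch(bob, alice, drawn=True)  # draw

print("bob after winning:", bob_win)
print("bob after drawing:", bob_draw)
```

If that reading of the return order is right, the fix in the original script would be `bob, alice = rate_1vs1(bob, alice)`, and the numbers printed in the question would mean Bob ends up at 26.643 after winning and 23.886 after drawing, which is the expected direction. The smaller sigma after a draw might simply be because, with a draw probability of only 10%, a draw pins the performance difference into a narrow band, whereas a win only bounds it from one side.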