Reputation: 4807

How can I sort this "matrix" of data?

I run a sporting website, and it's often useful to rank people against each other based on their previous meetings.

See this example set of data:

On the left is the raw "unsorted" view. On the right is the correctly sorted (in my opinion) list.

Each square shows the number of times they've competed against each other, and the percentage of victories. They're shaded based on the percentage.

I have this in a webpage, with "up" and "down" controls alongside each row, and I can manually nudge them around until I get what I want.

I'm just not quite sure of the best way to automate this.

The numbers at the end of each row are a quick idea I roughed out, and equal to the sum across the row of (percentage-50 * column number). As you can see, they do a fairly good job, with only the first two rows being "wrong". They don't give any weight to number of meetings though, only the win percentage.

The final-column numbers change depending on the row order as well, as can be seen by comparing the left and right tables in the image, so sorting on the initial values wouldn't work that well. Looping a sort + re-calc a couple of times might do the job.

I expect I can cobble something together to make this work... but I feel SO will have some much better ideas, and I'm all ears.

Upvotes: 1

Answers (2)

Edward Doolittle

Reputation: 4100

Just to add to j_random's answer, you can isolate cycles using something like Tarjan's strongly connected components algorithm. Within a strongly connected component, you could use another method for sorting the items.

Upvotes: 1

j_random_hacker

Reputation: 51226

A tournament is a graph in which there is exactly one directed edge between every pair of vertices; to create such a graph from your input, you could make a graph with a vertex for each player, and then "point" each edge between two players in the direction of the player who won a majority of games (you will need to break ties somehow).

If you do this, and if there are no cycles A > B > ... > A of the type I mentioned in my comment in this graph, then the graph is transitive, and you can order the players using a topological sort of the graph. This takes just linear time in the number of edges, i.e. O(n^2) for n players.

If there are such cycles, then there's no "perfect" ordering of players: any ordering will place at least one player after some player that they have beaten. In that case, a reasonable alternative is to look for orderings that minimise the number of these "edge violations". This turns out to be a well-studied NP-hard problem in computer science called (Minimum) Feedback Arc Set in Tournaments (FAST). A feedback arc set is a set of directed edges ("arcs") which, if deleted from the graph, would leave a graph with no directed cycles -- which can then be easily turned into an order using topological sort as before.

This paper describes a recent attack on the problem. I haven't read the paper, but this is an active area of research and so the algorithm is probably quite complicated -- but it might give you ideas about how to create a simpler algorithm that runs slower (but acceptably fast on such small instances), or how to create a heuristic.

Upvotes: 1

How can I sort this &quot;matrix&quot; of data?

Answers (2)

Related Questions

How can I sort this "matrix" of data?