Reputation: 650
There are two tables:
Table user:
+----+-----------+
| id | user_name |
+----+-----------+
| 1 | Alice |
| 2 | Steve |
| 3 | Tommy |
+----+-----------+
Table result:
+----+---------+-------+-------------+
| id | user_id | score | timestamp |
+----+---------+-------+-------------+
| 1 | 1 | 22 | 1410793838 |
| 2 | 1 | 16 | 1410793911 |
| 3 | 2 | 9 | 1410793920 |
| 4 | 1 | 27 | 1410794007 |
| 5 | 3 | 32 | 1410794023 |
+----+---------+-------+-------------+
What I have so far is a "top 3", which works great and looks like this:
SELECT MAX(m.score) AS score, u.user_name
FROM result AS r
INNER JOIN user AS u ON r.user_id = u.id
GROUP BY r.user_id
ORDER BY r.score DESC
LIMIT 3;
+-------+-----------+
| score | user_name |
+-------+-----------+
| 32 | Tommy |
| 27 | Alice |
| 9 | Steve |
+-------+-----------+
The table is actually filled with hundreds of results, this is just an example. I'm looking for a compact algorithm to get the rank of a specific user in relation to all other users in %. The goal is to output something like "you are in the top 5%/10%/20%/50%" or "you are below average". While it's easy to determine if someone is below average (score < AVG(score)), I have no clue how to determine the other ranks.
Upvotes: 2
Views: 1875
Reputation: 26784
Here is another version
SELECT user_name, score,(CASE
WHEN score BETWEEN @max-((@max-@min)/10) AND @max THEN '10'
WHEN score BETWEEN @max-((@max-@min)/5) AND @max THEN '20'
WHEN score BETWEEN @max-((@max-@min)/2) AND @max THEN '50'
ELSE 'more50'
END) as rangescore,
user_name
FROM result r
INNER JOIN user u ON r.user_id = u.id,
(SELECT @max := MAX(score) FROM result)x,
(SELECT @min := MIN(score) FROM result)y
ORDER BY score DESC
You can use AVG(score)
instead of MAX
f you want to compare the average score of the user.
Or remove the aggregate functions and GROUP BY if you want every score.
Upvotes: 1
Reputation: 2200
Ok, now with your new declarations, I will provide you another solution:
As you said you can use PHP and MySQL togheter, I provide you a combined one.
You want to compute your quantiles (to have an idea what this is quantiles on wikipedia), because if you have a top scorer with about 10000 points and all other players just have 100 points and below, the ones with 100 points are in the top 5% as player, but there score is under 50% of the top scorer.
With this in mind, we can count the number of players, get the score at the wanted percent of players and compare, were the players score fits in.
First select all the maximas, minimas, counter etc.
SELECT
COUNT(`result`.`score`) `count`,
MAX(`result`.`score`) `max`,
MIN(`result`.`score`) `min`,
AVG(`result`.`score`) `avg`
FROM
`result`
GROUP BY
`result`.`user_id`
ORDER BY
`result`.`score` DESC
After you have the complete data, you can compute your quantiles.
SELECT
`result`.`score`
FROM
`result`
GROUP BY
`result`.`user_id`
ORDER BY
`result`.`score` DESC
LIMIT FLOOR($count*$percent), 1
//where $count is the value from the first query and $percent is the wanted quantile e.g. 5%
After that you know the value for your quantile and you can compare the actual value with the ones here.
//where $percentNN is the score from the previous query
if($score > $percent50) echo "top 50%";
if($score > $percent20) echo "top 20%";
if($score > $percent10) echo "top 10%";
if($score > $percent5) echo "top 5%";
maybe, we can combine the multiple queries to one.
Upvotes: 0
Reputation: 37365
If I got all correctly, it's just relative maximum calculation:
SELECT
user_name,
MAX(score) AS max_score,
CASE
WHEN ROUND(100*MAX(score)/maximum, 2)>=95 THEN 'In top 5%'
WHEN ROUND(100*MAX(score)/maximum, 2)>=90 THEN 'In top 10%'
WHEN ROUND(100*MAX(score)/maximum, 2)>=75 THEN 'In top 25%'
WHEN ROUND(100*MAX(score)/maximum, 2)>=50 THEN 'In top 50%'
WHEN ROUND(100*MAX(score)/maximum, 2)>=0 THEN 'Below average'
END AS score_mark
FROM
`result`
INNER JOIN `user`
ON `result`.user_id=`user`.id
CROSS JOIN
(SELECT MAX(score) AS maximum FROM `result`) AS init
GROUP BY
user_id
So, counting from maximum score per all table and grouping it for specific user. Check the fiddle.
As mentioned below, this counting method involves simple way to determine average (i.e. all it's based on total maximum). This may be not the thing which is needed. By that I mean, that if question is about calculation relative position according to other scores (not maximum) - then it's more complicated:
SELECT
maxs.*,
@num:=@num+1 AS order_num,
CASE
WHEN 100*(@num-1)/(user_count-1) <= 5 THEN 'In top 5%'
WHEN 100*(@num-1)/(user_count-1) <= 10 THEN 'In top 10%'
WHEN 100*(@num-1)/(user_count-1) <= 25 THEN 'In top 25%'
WHEN 100*(@num-1)/(user_count-1) <= 50 THEN 'In top 50%'
WHEN 100*(@num-1)/(user_count-1) <= 100 THEN 'Below average'
END AS score_mark
FROM
(SELECT
user_name,
MAX(score) AS max_score
FROM
`result`
INNER JOIN `user`
ON `result`.user_id = `user`.id
GROUP BY
user_id
ORDER BY
max_score DESC) AS maxs
CROSS JOIN
(SELECT
@num:=0,
COUNT(DISTINCT user_id) AS user_count
FROM
`result`) AS init
-since now we must first re-count our positions and later build relative calculations over that. Here is the corresponding fiddle. Here, however, I'm applying linear formula to count 1-st position as "zero" and last position as "100". If that's not an intention (there will be edge-cases, like "3" in "50%" for "5 total" in fiddle) - then you may change divisor to user_count
Upvotes: 2