Reputation: 129
I have 3 fairly simple tables:
users
user_id
1
2
3
radio_songs
song_id song
1 SomeName
2 OtherName
radio_rates
user_id song_id rate (from 1 to 5)
1 1 5
2 1 4
1 2 2
2 2 2
I've wrote quite complex query targetting MySQL that calculates current "position" (rank) of the song according to lower bound of Wilson score confidence interval for a Bernoulli parameter.
SELECT rank FROM(
SELECT x.song AS song, x.ci_lower_bound AS ci_lower_bound, (@row:= @row + 1) AS rank FROM(
SELECT song, ((SUM((rate - 1) * 0.25) + 1.9208) / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25)) - 1.96 * SQRT((SUM((rate - 1) * 0.25) * SUM((5 - rate) * 0.25)) / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25)) + 0.9604) / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25))) / (1 + 3.8416 / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25))) AS ci_lower_bound
FROM radio_rates
INNER JOIN radio_songs ON radio_rates.song_id = radio_songs.song_id
GROUP BY radio_rates.song_id
ORDER BY ci_lower_bound DESC
) x, (SELECT @row := 0) r
) xx WHERE xx.song = @song
This query basically accepts @song
parameter and:
ROW_NUMBER()
in MySQLThe query works properly and I'm quite satisfied with it, but when we have multiple songs with the same score then due to sorting the resulting rank can vary between executions of the same SQL query. I wanted to avoid that by getting MIN()
rank from all songs that have the same score as the target one, but the query got so complicated that I'm struggling how to do that without a temporary table - is it even possible?
I'd appreciate help, as well as any recommendations in terms of performance/optimization of above query.
I know that it'd be worth it to consider simply adding another score column to songs table and calculating it on each insert/update via trigger, but I'd want to avoid that if possible and calculate rank on-demand. Therefore SQL query itself is most important for me.
Thank you in advance.
Upvotes: 0
Views: 411
Reputation: 31812
This might work for you:
SELECT rank FROM(
SELECT x.song AS song,
(@row:= @row + 1) AS rn,
IF(@last_score = x.ci_lower_bound, @rank, @rank := @row) AS rank
(@last_score := x.ci_lower_bound) AS ci_lower_bound
FROM(
SELECT song, ((SUM((rate - 1) * 0.25) + 1.9208) / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25)) - 1.96 * SQRT((SUM((rate - 1) * 0.25) * SUM((5 - rate) * 0.25)) / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25)) + 0.9604) / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25))) / (1 + 3.8416 / (SUM((rate - 1) * 0.25) + SUM((5 - rate) * 0.25))) AS ci_lower_bound
FROM radio_rates
INNER JOIN radio_songs ON radio_rates.song_id = radio_songs.song_id
GROUP BY radio_rates.song_id
ORDER BY ci_lower_bound DESC
) x, (SELECT @row := 0, @rank := null, @last_score := null) r
) xx WHERE xx.song = @song
The changes are:
SELECT x.song AS song,
(@row:= @row + 1) AS rn,
IF(@last_score = x.ci_lower_bound, @rank, @rank := @row) AS rank
(@last_score := x.ci_lower_bound) AS ci_lower_bound
and
(SELECT @row := 0, @rank := null, @last_score := null) r
In this line
IF(@last_score = x.ci_lower_bound, @rank, @rank := @row) AS rank
You set the rank to the row number only if the score has changed compared to the last row. If the score is the same, the rank from the last row is used.
Warning: Using session variables this way, you are always at risk that your code will return unexpected results when you upgrade to a new version. If it works, then it is because of how the engine is implemented. There is no guarantee, that the expressions will be executed in the expected order.
As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement. For example, to increment a variable, this is okay:
SET @a = @a + 1;
For other statements, such as SELECT, you might get the results you expect, but this is not guaranteed. In the following statement, you might think that MySQL will evaluate @a first and then do an assignment second:
SELECT @a, @a:=@a+1, ...;
However, the order of evaluation for expressions involving user variables is undefined.
Upvotes: 1