Reputation: 79
I am now trying to make random selections from each grouped column array, with chances followed by the weight of each row. For example, I have a table (DemoTable) like this: http://sqlfiddle.com/#!9/23470/3/0
Name | State | Grade | Weight |
---|---|---|---|
John | NY | 100 | 1 |
Liam | NY | 90 | 2 |
Olivia | NY | 90 | 3 |
Emma | NY | 80 | 4 |
James | CA | 10 | 1 |
Henry | CA | 20 | 1 |
Mia | NJ | 50 | 1 |
Ava | NJ | 30 | 4 |
For State = 'NY', there are four rows with grade array: [100, 90, 90, 80] and the weight [1, 2, 3, 4], respectively. So 80 has the largest chance to be picked while 100 has the least within its State group. I made a query for it:
SELECT a.*,
(SELECT b.Grade FROM DemoTable b WHERE a.State = b.State
ORDER BY RAND() * -b.Weight LIMIT 1) AS 'random_val' FROM DemoTable a;
and it worked with the result:
Name | State | Grade | Weight | random_val |
---|---|---|---|---|
John | NY | 100 | 1 | 80 |
Liam | NY | 90 | 2 | 80 |
Olivia | NY | 90 | 3 | 80 |
Emma | NY | 80 | 4 | 90 |
James | CA | 10 | 1 | 20 |
Henry | CA | 20 | 1 | 10 |
Mia | NJ | 50 | 1 | 30 |
Ava | NJ | 30 | 4 | 30 |
Though, I would like to know if there is any other method like join or union instead of using order by rand() alone.
Is there any other way to modify my MySQL query that gives the same result?
I've searched for solving this problem all day, but couldn't find the proper way to do so; and that's why I asked here for the aid.
I would sincerely appreciate if I could get some advice.
Upvotes: 0
Views: 87
Reputation: 86735
My first attempt using analytic functions, though I suspect yours is faster over larger datasets...
WITH
ranged AS
(
SELECT
*,
SUM(weight) OVER (PARTITION BY state ORDER BY id) - weight AS weight_range_lower,
SUM(weight) OVER (PARTITION BY state ORDER BY id) AS weight_range_upper,
SUM(weight) OVER (PARTITION BY state ) * rand() AS rand_threshold
FROM
DemoTable
)
SELECT
ranged.*,
lookup.grade AS random_grade
FROM
ranged
INNER JOIN
ranged AS lookup
ON lookup.state = ranged.state
AND lookup.weight_range_lower <= ranged.rand_threshold
AND lookup.weight_range_upper > ranged.rand_threshold
ORDER BY
ranged.id
Or, if you want all members of the same state to be given the same random_grade...
SELECT
*,
FIRST_VALUE(grade) OVER (PARTITION BY state ORDER BY weight * rand() DESC)
FROM
DemoTable
ORDER BY
id
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=133f9e86b013a477ac342d0295132dd5
Upvotes: 2