Reputation: 11879
Say I have a simple table with 3 fields: 'place', 'user' and 'bytes'. Let's say, that under some filter, I want to group by 'place', and for each 'place', to sum all the bytes for that place, and randomly select a user for that place (uniformly from all the users that fit the 'where' filter and the relevant 'place'). If there was a "select randomly from" aggregate function, I would do:
SELECT place, SUM(bytes), SELECT_AT_RANDOM(user) WHERE .... GROUP BY place;
...but I couldn't find such an aggregate function. Am I missing something? What could be a good way to achieve this?
Upvotes: 11
Views: 5590
Reputation: 1223
With a custom aggregate function, you could write expressions as simple as:
SELECT place, SUM(bytes), SELECT_AT_RANDOM(user) WHERE .... GROUP BY place;
SELECT_AT_RAMDOM
would be the custom aggregate function.
Here is precisely an implementation in PostgreSQL.
Upvotes: 1
Reputation: 49089
I think your question is DBMS specific. If your DBMS is MySql, you can use a solution like this:
SELECT place_rand.place, SUM(place_rand.bytes), place_rand.user as random_user
FROM
(SELECT place, bytes, user
FROM place
WHERE ...
ORDER BY rand()) place_rand
GROUP BY
place_rand.place;
The subquery orders records in random order. The outer query groups by place
, sums bytes
, and returns first random user, since user is not in an aggregate function and neither in the group by clause.
Upvotes: 3
Reputation: 1270593
I would do a bit of a variation on Martin's solution:
select place, sum(bytes), max(case when seqnum = 1 then user end) as random_user
from (select place, bytes,
row_number() over (partition by place order by newid()) as sequm
from t
) t
group by place
(Where newid() is just one way to get a random number, depending on the database.)
For some reason, I prefer this approach, because it still has the aggregation function in the outer query. If you are summarizing a bunch of fields, then this seems cleaner to me.
Upvotes: 0
Reputation: 453658
If your RDBMS supports analytical functions.
WITH T
AS (SELECT place,
Sum(bytes) OVER (PARTITION BY place) AS Sum_bytes,
user,
Row_number() OVER (PARTITION BY place ORDER BY random_function()) AS RN
FROM YourTable
WHERE .... )
SELECT place,
Sum_bytes,
user
FROM T
WHERE RN = 1;
For SQL Server Crypt_gen_random(4)
or NEWID()
would be examples of something that could be substituted in for random_function()
Upvotes: 5