Reputation: 35
I am trying to come up with a way to assign two people together from a larger dataset of about 6 people. I was toying around with the random() function in postgres but had no luck. I have access to postgres or oracle whichever may be easier to accomplish this. For example if I had 6 names I'd like to take those 6 names and assign them to one another using some sort of randomizing query:
The output would be something along the lines of:
Orignal Name | Match
Any help would be greatly appreciated!
Thank you.
Upvotes: 1
Views: 52
Reputation: 50200
In postgres you could generate a row_number() on a random number and then join on that. This is nice and fast, but it could cause people to be buddied up with themselves:
SELECT t1.name, t2.name
FROM (SELECT row_number() OVER (ORDER BY random()) as id, name FROM table) t1
INNER JOIN (SELECT row_number() OVER (order by random()) as id, name FROM table) t2
ON t1.id = t2.id;
Here is a method using the cartesian product that results from joining the table to itself. This is not a nice solution if the data is huge as there is an intermediate result set that is (N * (N - 1)) rows, but noone will get matched up with themselves:
SELECT name1,
name2
FROM (
SELECT t1.NAME name1,
t2.NAME name2,
row_number() OVER (PARTITION BY t1.NAME ORDER BY random()) AS rn
FROM yourtable t1,
yourtable t2
WHERE t1.NAME <> t2.NAME
) subquery
WHERE rn = 1;
Here is a hybrid of the two. Joining the table to itself on a range of randomly generated ids, also specifying that the names don't match. The intermediate result set will have 1-3 randomly chosen names from t2
for each name in t1
. Then we just grab one at random. This has an intermediate result set which will ALWAYS be less than (N*3) records which isn't too bad.
UPDATE: this will, however, match the same person up multiple times... Leaving it here in case it spawns so good idea for that INNER JOIN that prevents that from happening.
WITH randnames AS
(
SELECT row_number() OVER (ORDER BY random()) AS id,
NAME
FROM yourtable
)
SELECT name1, name2
FROM (
SELECT t1.NAME name1,
t2.NAME name2,
ROW_NUMBER() OVER (PARTITION BY t1.NAME ORDER BY 1) AS rn
FROM randnames t1
INNER JOIN randnames t2
ON t1.NAME <> t2.NAME
AND t2.id BETWEEN t1.id - 1 AND t1.id + 1
) subquery
WHERE rn = 1;
I feel like there is probably some prettier way to do this, but the complete lack of answers on this question an hour after it was asked suggests that it's not an easy problem to solve in SQL.
Upvotes: 1