Reputation: 35772
I have two tables like these:
CREATE TABLE people (
id INT NOT NULL,
PRIMARY KEY (id)
)
CREATE TABLE pairs (
person_a_id INT,
person_b_id INT,
FOREIGN KEY (person_a_id) REFERENCES people(id),
FOREIGN KEY (person_b_id) REFERENCES people(id)
)
I want to select pairs of people at random from the people table, and after selecting them I add the randomly select pair to the pairs table. person_a_id always refers to the person with the lower id of the pair (since the order of the pair is not relevant).
The thing is that I never want to select the same pair twice, so I need to check the pairs table before I return my randomly selected pair.
Is it possible to do this using just a single SQL query in a reasonably efficient and elegant manner?
(I'm doing this using the Java Persistence API, but hopefully I'll be able to translate any answers into JPA code)
Upvotes: 4
Views: 6059
Reputation: 107736
select a.id, b.id
from people1 a
inner join people1 b on a.id < b.id
where not exists (
select *
from pairs1 c
where c.person_a_id = a.id
and c.person_b_id = b.id)
order by a.id * rand()
limit 1;
Limit 1
returns just one pair if you are "drawing lots" one at a time. Otherwise, up the limit to however many pairs you need.
The above query assumes that you can get
1 - 2
2 - 7
and that the pairing 2 - 7
is valid since it doesn't exist, even if 2 is featured again. If you only want a person to feature in only one
pair ever, then
select a.id, b.id
from people1 a
inner join people1 b on a.id < b.id
where not exists (
select *
from pairs1 c
where c.person_a_id in (a.id, b.id))
and not exists (
select *
from pairs1 c
where c.person_b_id in (a.id, b.id))
order by a.id * rand()
limit 1;
If multiple pairs
are to be generated in one single query, AND the destination table is still empty, you could use this single query. Take note that LIMIT 6
returns only 3 pairs.
select min(a) a, min(b) b
from
(
select
case when mod(@p,2) = 1 then id end a,
case when mod(@p,2) = 0 then id end b,
@p:=@p+1 grp
from (
select id
from (select @p:=1) p, people1
order by rand()
limit 6
) x
) y
group by floor(grp/2)
Upvotes: 4
Reputation: 10444
This cannot be accomplished in a single-query set-based approach because your set will not have knowledge of what pairs are inserted into the pairs table.
Instead, you should loop
WHILE EXISTS(SELECT * FROM people
WHERE id NOT IN (SELECT person_a_id FROM pairs)
AND id NOT IN (SELECT person_b_id FROM pairs)
This will loop while there are unmatched people.
Then you should two random numbers from 1 to the CNT(*)
of that table
which gives you the number of unmatched people... if you get the same number twice, roll again. (IF you're worried about this, randomize numbers from the two halves of the set... but then you're losing some randomness based on your sort criteria)
Pair those people.
Wash, rinse, repeat.... Your only "redo" will be when you generate the same random number twice... more likely as you get few people but still only a 25% chance at most (much better than 1/n^2)
Upvotes: 1