Reputation: 2077
I have a very simple table:
create table #Person_Tasks( Person_ID int, Task_ID int)
I like to find all people that worked on the EXACT same tasks. For instance, P1 worked on T1 and T2. How can I find all the other people that ONLY worked on T1 and T2?
Here is some sample data:
insert into #Person_Tasks
values
(0,0),
(0,1),
(0,2),
(1,0),
(1,1),
(1,3),
(2,0),
(2,1),
(2,2)
In this example only P0 and P2 have the exact same tasks. P1 has the same number of tasks but one task doesn't match the other people's task ID.
The table is quite big with 1000s of people and 100Ks of Tasks.
Upvotes: 1
Views: 77
Reputation: 1269993
Here is one method, assuming that the rows are unique:
with pt as (
select pt.*, count(*) over (partition by person_id) as num_tasks
from person_tasks pt
)
select pt.person_id, pt2.person_id
from pt left join
pt pt2
on pt2.task_id = pt.task_id and pt.person_id <> pt2.person_id
group by pt.person_id, pt2.person_id, pt.num_tasks, pt2.num_tasks
having (count(*) = pt.num_tasks and pt.num_tasks = pt2.num_tasks) or
pt2.person_id is null
This does a self join and then counts the number of matches for each pair. If all the counts match, then the two people have the same tasks.
Here is a db<>fiddle showing it working.
Upvotes: 4
Reputation: 3498
based on your sample, this will work as well :
SELECT Person_ID
FROM (
SELECT *,
COUNT(Task_ID) OVER(PARTITION BY Person_ID) cnt
FROM Person_Tasks
WHERE
Task_ID IN(1,2)
) D
WHERE
cnt > 1
GROUP BY Person_ID
Upvotes: 0