Overlap of a column based on a different column

Question

I have a simple table in Postgres that contains days of the week and people.

INSERT INTO mytable (day, person)
values
('Monday', 'A'),
('Monday', 'A'),
('Monday', 'B'),
('Tuesday', 'A'),
('Wednesday', 'A'),
('Wednesday', 'B'),
('Thursday', 'B'),
('Thursday', 'B');

I would like to find the number of overlapping people for each pair of days. So for example, Monday has Person A and B. Tuesday has person B only, so that would yield a row as follows:

('Monday', 'Tuesday', 1)

The final output should look like this:

('Monday', 'Monday', 2),
('Monday', 'Tuesday', 1),
('Monday', 'Wednesday', 2),
('Monday', 'Thursday', 1),
('Tuesday', 'Tuesday', 1),
('Tuesday', 'Monday', 1),
('Tuesday', 'Wednesday', 1),
('Tuesday', 'Thursday', 0),
('Wednesday', 'Wednesday', 2),
('Wednesday', 'Monday', 2),
('Wednesday', 'Tuesday', 1),
('Wednesday', 'Thursday', 1),
('Thursday', 'Thursday', 1),
('Thursday', 'Monday', 1),
('Thursday', 'Tuesday', 0),
('Thursday', 'Wednesday', 1)

I have been playing around with cross joins and intersect, but nothing seems to yield the correct result.

Gordon Linoff · Accepted Answer

Use a self join:

select t1.day, t2.day, count(distinct t2.person) as num_overlaps
from mytable t1 join
     mytable t2
     on t1.person = t2.person
group by t1.day, t2.day
order by t1.day, t2.day;

EDIT:

I see you want to include zero overlaps as well. This is a little trickier. For this, you need to generate all combinations of days and then join in the data:

select d1.day, d2.day, count(distinct t2.person)
from (select distinct day from mytable) d1 cross join
     (select distinct day from mytable) d2 left join
     mytable t1
     on t1.day = d1.day left join
     mytable t2
     on t2.day = d2.day and t2.person = t1.person
group by d1.day, d2.day
order by d1.day, d2.day;

Here is a db<>fiddle.

Overlap of a column based on a different column

Answers (1)

Related Questions