Reputation: 777
I have a postgres table full of records which I would like to compare to every other record in the same table. Consider this table.
create table temp (rec char(1));
Populated with this data:
insert into temp (rec) values ('a');
insert into temp (rec) values ('b');
insert into temp (rec) values ('c');
insert into temp (rec) values ('d');
And queried with this SQL:
select a.rec, b.rec
from temp a, temp b
This returns 16 rows (as expected) but it includes extra rows where 'a' = 'a' and 'b' = 'b'. To remove these I can change the query to.
select a.rec, b.rec
from temp a, temp b
where a.rec <> b.rec
This cuts the results down to 12 rows:
"a" - "b"
"a" - "c"
"a" - "d"
"b" - "a"
"b" - "c"
"b" - "d"
"c" - "a"
"c" - "b"
"c" - "d"
"d" - "a"
"d" - "b"
"d" - "c"
However, it still pulls back some duplicates, not only does it pull back a record where 'a' = 'b' but it also pulls back 'b' = 'a' which I have no interest in (I've already compared record 'a' with record 'b' so have no interest in doing the same comparison the other way around)
So, how can I alter the query to compare every row with every other row once and once only?
Upvotes: 3
Views: 2637
Reputation: 1269973
Simply change your condition to <
:
select a.rec, b.rec
from temp a join
temp b
on a.rec < b.rec;
Upvotes: 2
Reputation:
Use distinct with your last query:
select distinct least(a.rec, b.rec), greatest(a.rec, b.rec)
from temp a
cross join temp b
where a.rec <> b.rec;
The expression least(a.rec, b.rec), greatest(a.rec, b.rec)
turns b,a
into a,b
and the distinct
then removes the duplicates.
Online example: https://rextester.com/TDZQ5153
Upvotes: 3