franco_b
franco_b

Reputation: 898

Postgresql remove duplicate reversed pairs

I have this table:

 origin     destination

new york      seattle
new york      chicago
new york      portland
seattle       new york
seattle       chicago
chicago       new york

I have to build a graph so I need to remove all the duplicated reversed pairs to have:

origin      destination oneway

new york      seattle     0
new york      chicago     0
new york      portland    1
seattle       chicago     1

I already read this post: SQL -- Remove duplicate pairs but it's not useful for me because I have string filed.

Thanks

Upvotes: 0

Views: 819

Answers (3)

If your table is a Cartesian product (What i see in that snippet, that means you have reversed duplicate of every row in your table) and you don't care what will be on each side, it is much easier then everything above

select origin, destination
from table t
origin > destination

Otherwise you can use other answers

Upvotes: 1

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49270

One option with row_number and count using least and greatest.

select origin,dest,case when cnt_per_pair=1 then 1 else 0 end as one_way
from (select t.*,row_number() over(partition by least(origin,dest),greatest(origin,dest) 
                                   order by dest) as rnum,
      count(*) over(partition by least(origin,dest),greatest(origin,dest)) as cnt_per_pair
      from tbl t
     ) t
where rnum=1

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270873

One method uses aggregation:

select origin, destination,
       (case when exists (select 1
                          from t t2
                          where t2.origin = t.destination and t2.destination = t.origin
                         )
             then 0 else 1
        end) as one_way
from t
where origin < destination
union all
select origin, destination, 1
from t
where origin > destination;

An alternative method uses window functions:

select origin, destination, (cnt = 1)::int as one_way
from (select t.*,
             count(*) over (partition by least(origin, destination), greatest(origin, destination)) as cnt
      from t
     ) t
where origin < destination or
      (origin > destination and cnt = 1);

Upvotes: 1

Related Questions