Ganesh M

Reputation: 638

How to find the intersection of a pair of columns in pandas dataframe with pairs in any order?

I have the dataframe below:

col1     col2
a        b
b        a
c        d
d        c
e        d

The desired output keeps one row per unique pair across the two columns, treating (a, b) and (b, a) as the same pair:

col1    col2
a        b
c        d
e        d

Upvotes: 1

Views: 110

Answers (1)

jezrael

Reputation: 863751

Convert each row's values to a frozenset, then filter with Series.duplicated in boolean indexing:

# Build an order-insensitive key per row and keep only the first occurrence of each key
df = df[~df[['col1','col2']].apply(frozenset, axis=1).duplicated()]
print(df)
  col1 col2
0    a    b
2    c    d
4    e    d
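
The approach works because a frozenset ignores element order and is hashable, so (a, b) and (b, a) produce equal keys that duplicated can compare. A minimal sketch of the intermediate steps, assuming the sample data from the question (the names keys and mask are illustrative, not part of the answer):

```python
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({"col1": list("abcde"), "col2": list("badcd")})

# One frozenset per row: order within the pair no longer matters
keys = df[["col1", "col2"]].apply(frozenset, axis=1)

# True for every row whose unordered pair was already seen earlier
mask = keys.duplicated()

# Keep only the first occurrence of each pair
result = df[~mask]
print(result)
```

Because the filter is applied to the original frame, the surviving rows keep their original index labels and column order.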
    

Alternatively, sort the values within each row with np.sort and remove duplicates with DataFrame.drop_duplicates. Note that each pair then appears in sorted order, so e, d becomes d, e:

# np.sort sorts along the last axis, i.e. within each row, so (b, a) becomes (a, b)
df = pd.DataFrame(np.sort(df[['col1','col2']]), columns=['col1','col2']).drop_duplicates()
print(df)
  col1 col2
0    a    b
2    c    d
4    d    e
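
If the original orientation of each pair matters (the desired output in the question lists e, d rather than d, e), the sorted array can be used only as a comparison key while the rows are taken from the original frame. A sketch under that assumption, again using the question's sample data; the names key and result are illustrative:

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question
df = pd.DataFrame({"col1": list("abcde"), "col2": list("badcd")})

# Sort each row's two values so (a, b) and (b, a) share the same key;
# reuse df's index so the boolean mask aligns with the original rows
key = pd.DataFrame(
    np.sort(df[["col1", "col2"]].to_numpy(), axis=1),
    index=df.index,
)

# Drop rows whose sorted key was already seen, leaving values untouched
result = df[~key.duplicated()]
print(result)
```

This variant avoids the row-wise apply, which may matter on larger frames, though that has not been measured here.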
    

Upvotes: 1
