msv123
msv123

Reputation: 43

Find out intersection of 2 pandas DataFrame according to 2 columns

I would to find out intersection of 2 pandas DataFrame according to 2 columns 'x' and 'y' and combine them into 1 DataFrame. The data are:

df[1]:
    x   y       id    fa
0   4   5  9283222   3.1
1   4   5  9283222   3.1
2  10  12  9224221   3.2
3   4   5  9284332   1.2
4   6   1    51249  11.2

df[2]:
    x   y        id   fa
0   4   5  19283222  1.1
1   9   3  39224221  5.2
2  10  12  29284332  6.2
3   6   1     51242  5.2
4   6   2     51241  9.2
5   1   1     51241  9.2

The expected output is something like (can ignore index):

    x   y       id    fa
0   4   5  9283222   3.1
1   4   5  9283222   3.1
2  10  12  9224221   3.2
3   4   5  9284332   1.2
4   6   1    51249  11.2
0   4   5  19283222  1.1
2  10  12  29284332  6.2
3   6   1     51242  5.2

Thank you very much!

Upvotes: 4

Views: 4701

Answers (2)

Bastien Delaval
Bastien Delaval

Reputation: 1

The simpliest solution:

df1.columns.intersection(df2.columns)

Upvotes: 0

akuiper
akuiper

Reputation: 214957

You can find out the intersection by joining the x,y columns from df1 and df2, with which you can filter df1 and df2 by inner join, and then concatenating the two results with pd.concat should give what you need:

intersection = df1[['x', 'y']].merge(df2[['x', 'y']]).drop_duplicates()
pd.concat([df1.merge(intersection), df2.merge(intersection)])

enter image description here

Upvotes: 3

Related Questions