EmJ

Reputation: 4608

How to identify common elements in two pandas dataframes

Given two pandas dataframes, I want to identify their common rows.

df1

  title description
0    mmm         mmm
1    mmm         mmm
2    mmm         mmm
3    mmm         mmm
4    mmm         mmm
5    mmm         mmm
6    mmm         mmm
7    nnn         nnn
8    nnn         nnn
9    lll         lll
10   jjj         jjj

df2

  title description
0    mm          mm
1    mmm         mmm
2    mmm         mmm
3    mmm         mmm
4    mmm         mmm
5    mmm         mmm
6    mmm         mmm
7    nn          nn
8    nn          nn
9    ll          ll
10   jjj         jjj

So the common elements should be:

  title description
0    mmm         mmm
1    jjj         jjj

I tried to use the following code.

import pandas as pd
df1 = pd.DataFrame({"title":["mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nnn", "nnn", "lll", "jjj"], "description":["mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nnn", "nnn", "lll", "jjj"]})
df2 = pd.DataFrame({"title":["mm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nn", "nn", "ll", "jjj"], "description":["mm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nn", "nn", "ll", "jjj"]})
df1.intersection(df2)

However, it raises an error: AttributeError: 'DataFrame' object has no attribute 'intersection'. What am I doing wrong?

I am happy to provide more details if needed.

Upvotes: 1

Views: 69

Answers (2)

piRSquared

Reputation: 294218

set intersection

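# Turn every row into a tuple and collect the rows of a frame into a set
def f(d): return {*zip(*map(d.get, d))}
# list(...) hands pandas a sequence rather than an unordered set; row order is arbitrary
pd.DataFrame(list(f(df1) & f(df2)), columns=[*df1])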

  title description
0   mmm         mmm
1   jjj         jjj
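
Since a Python set has no defined order, the two rows above can come back in either order. If the order matters, here is a minimal sketch (not part of the original answer) that keeps the rows in the order they first appear in df1. It uses a trimmed copy of the question's data, and the names rows1, rows2 and common are only illustrative.

```python
import pandas as pd

# Trimmed stand-in for the frames in the question
df1 = pd.DataFrame({"title": ["mmm", "mmm", "nnn", "lll", "jjj"],
                    "description": ["mmm", "mmm", "nnn", "lll", "jjj"]})
df2 = pd.DataFrame({"title": ["mm", "mmm", "mmm", "nn", "jjj"],
                    "description": ["mm", "mmm", "mmm", "nn", "jjj"]})

# Each row becomes a plain tuple such as ("mmm", "mmm")
rows1 = list(df1.itertuples(index=False, name=None))
rows2 = set(df2.itertuples(index=False, name=None))

# dict.fromkeys drops repeats while keeping first-seen order
common_rows = list(dict.fromkeys(row for row in rows1 if row in rows2))

common = pd.DataFrame(common_rows, columns=df1.columns)
print(common)
```

This assumes both frames have the same columns in the same order, because rows are compared position by position.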

Upvotes: 3

BENY

Reputation: 323226

We can use merge with how='inner' and then drop_duplicates:

df1.merge(df2,how='inner').drop_duplicates()
   title description
0    mmm         mmm
42   jjj         jjj
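
The index label 42 is left over from the merged frame: the 7 mmm rows in df1 pair with the 6 mmm rows in df2, giving 42 rows labelled 0 to 41, so jjj lands on label 42. As a rough sketch (my addition, on a trimmed copy of the question's data), the same approach with the join columns spelled out and the index renumbered could look like this. The name common is only illustrative.

```python
import pandas as pd

# Trimmed stand-in for the frames in the question
df1 = pd.DataFrame({"title": ["mmm", "mmm", "nnn", "lll", "jjj"],
                    "description": ["mmm", "mmm", "nnn", "lll", "jjj"]})
df2 = pd.DataFrame({"title": ["mm", "mmm", "mmm", "nn", "jjj"],
                    "description": ["mm", "mmm", "mmm", "nn", "jjj"]})

common = (
    df1.merge(df2, how="inner", on=["title", "description"])  # rows present in both frames
       .drop_duplicates()                                      # collapse the repeated matches
       .reset_index(drop=True)                                 # renumber the rows from 0
)
print(common)
```

Without on=, merge joins on all columns the two frames share, which is the same thing here. Naming the columns just makes that explicit.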

Upvotes: 3
