Reputation: 4608
Given two pandas dataframes, I want to identify the rows they have in common.
df1
title description
0 mmm mmm
1 mmm mmm
2 mmm mmm
3 mmm mmm
4 mmm mmm
5 mmm mmm
6 mmm mmm
7 nnn nnn
8 nnn nnn
9 lll lll
10 jjj jjj
df2
title description
0 mm mm
1 mmm mmm
2 mmm mmm
3 mmm mmm
4 mmm mmm
5 mmm mmm
6 mmm mmm
7 nn nn
8 nn nn
9 ll ll
10 jjj jjj
So the common elements should be:
title description
0 mmm mmm
1 jjj jjj
I tried to use the following code.
import pandas as pd
df1 = pd.DataFrame({"title":["mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nnn", "nnn", "lll", "jjj"], "description":["mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nnn", "nnn", "lll", "jjj"]})
df2 = pd.DataFrame({"title":["mm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nn", "nn", "ll", "jjj"], "description":["mm", "mmm", "mmm", "mmm", "mmm", "mmm", "mmm", "nn", "nn", "ll", "jjj"]})
df1.intersection(df2)
However, it raises an error: AttributeError: 'DataFrame' object has no attribute 'intersection'. I am wondering what I am doing wrong.
I am happy to provide more details if needed.
Upvotes: 1
Views: 69
Reputation: 294218
Using set intersection:
def f(d): return {*zip(*map(d.get, d))}
# Recent pandas versions reject a bare set ("Set type is unordered"), so wrap it in list()
pd.DataFrame(list(f(df1) & f(df2)), columns=[*df1])
title description
0 mmm mmm
1 jjj jjj
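For readers who find the one-liner dense, here is a more verbose sketch of the same idea: turn each row into a tuple, intersect the two sets of tuples, and rebuild a dataframe. The names row_set, common and result are only illustrative, and sorting the tuples is my own addition to get a stable row order, since sets are unordered.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df1 = pd.DataFrame({
    "title": ["mmm"] * 7 + ["nnn"] * 2 + ["lll", "jjj"],
    "description": ["mmm"] * 7 + ["nnn"] * 2 + ["lll", "jjj"],
})
df2 = pd.DataFrame({
    "title": ["mm"] + ["mmm"] * 6 + ["nn"] * 2 + ["ll", "jjj"],
    "description": ["mm"] + ["mmm"] * 6 + ["nn"] * 2 + ["ll", "jjj"],
})


def row_set(df):
    """Return the distinct rows of df as a set of plain tuples."""
    # index=False drops the index; name=None yields ordinary tuples.
    return set(df.itertuples(index=False, name=None))


# Set intersection keeps only the rows present in both frames.
common = row_set(df1) & row_set(df2)

# sorted() turns the set into a list with a deterministic order.
result = pd.DataFrame(sorted(common), columns=df1.columns)
print(result)
```

This assumes every cell is hashable (strings and numbers are), and note that the sorted row order may differ from the order in which rows appear in df1.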
Upvotes: 3
Reputation: 323226
We can use merge with how='inner', then drop_duplicates:
df1.merge(df2,how='inner').drop_duplicates()
title description
0 mmm mmm
42 jjj jjj
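The index 42 shows up because the 7 "mmm" rows in df1 pair with the 6 "mmm" rows in df2, giving 42 merged rows before the "jjj" match. A possible variant, sketched here as a suggestion rather than a required change, drops duplicates before merging so the intermediate result stays small, and then resets the index. The name common is only illustrative.

```python
import pandas as pd

# Same data as in the question, written more compactly.
df1 = pd.DataFrame({
    "title": ["mmm"] * 7 + ["nnn"] * 2 + ["lll", "jjj"],
    "description": ["mmm"] * 7 + ["nnn"] * 2 + ["lll", "jjj"],
})
df2 = pd.DataFrame({
    "title": ["mm"] + ["mmm"] * 6 + ["nn"] * 2 + ["ll", "jjj"],
    "description": ["mm"] + ["mmm"] * 6 + ["nn"] * 2 + ["ll", "jjj"],
})

common = (
    df1.drop_duplicates()                       # distinct rows of df1
       .merge(df2.drop_duplicates(), how="inner")  # join on all shared columns
       .reset_index(drop=True)                  # renumber rows from 0
)
print(common)
```

With no on= argument, merge joins on every column the two frames share, which here is both title and description.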
Upvotes: 3