nobodyAskedYouPatrice
nobodyAskedYouPatrice

Reputation: 131

How to use compare several dataframes and return the matches using pandas

If I have a several pandas dataframes that look like this:

Name       Score
Sam        4
Aaron      5
Neil       6
Ben        7

Name       Score
Morgan     5
Neil       6
Adam       8
Ben        5

Name       Score
Evan       5
Nathan     4
Neil       6
Ben        2

How can I use pandas.concat(join) to join all the dataframes in one large one and then only return the names that are found in all three dataframes?

Expected Output:

Name
Neil
Ben

Upvotes: 2

Views: 2621

Answers (2)

piRSquared
piRSquared

Reputation: 294258

You can use pd.concat with the parameter join='inner'

pd.concat([d1, d2, d3], axis=1, join='inner')

      Score  Score  Score
Name                     
Neil      6      6      6
Ben       7      5      2

If you needed to differentiate the columns, you can pass a keys parameter

pd.concat(
    [d.Score for d in [d1, d2, d3]],
    axis=1, join='inner', keys=['d1', 'd2', 'd3']
)

      d1  d2  d3
Name            
Neil   6   6   6
Ben    7   5   2

Upvotes: 3

Vaishali
Vaishali

Reputation: 38415

If you are interested in only the names, you can get intersection like this

pd.Series(list(set(df1.Name) & set(df2.Name) & set(df3.Name)))

0    Neil
1     Ben

Upvotes: 4

Related Questions