bayesianpower

Reputation: 91

Intersect two dataframes in Pandas with respect to first dataframe?

I want to intersect two Pandas dataframes (1 and 2) based on two columns (A and B) present in both dataframes. However, I would like to return a dataframe that only has data with respect to the data in the first dataframe, omitting anything that is not found in the second dataframe.

So for example:

Dataframe 1: 
A | B | Extra | Columns | In | 1 |
----------------------------------
1 | 2 | Extra | Columns | In | 1 |
1 | 3 | Extra | Columns | In | 1 |
1 | 5 | Extra | Columns | In | 1 |

Dataframe 2: 
A | B | Extra | Columns | In | 2 |
----------------------------------
1 | 3 | Extra | Columns | In | 2 |
1 | 4 | Extra | Columns | In | 2 |
1 | 5 | Extra | Columns | In | 2 |

should return:

A | B | Extra | Columns | In | 1 |
----------------------------------
1 | 3 | Extra | Columns | In | 1 |
1 | 5 | Extra | Columns | In | 1 |

Is there a way I can do this simply?

Upvotes: 1

Views: 98

Answers (1)

Mayank Porwal

Reputation: 34086

You can use df.merge:

df = df1.merge(df2, on=['A','B'], how='inner').drop('2', axis=1)

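how='inner' is the default; it is spelled out here only to make the join type explicit. Note that merging against the whole of df2 also brings in df2's other columns (any names shared with df1 get _x/_y suffixes), so they have to be dropped afterwards. The .drop call is meant to do that; adjust it to df2's actual column names.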

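As @piRSquared suggested, you can also select only the key columns from df2 before merging, so that none of df2's extra columns come along: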

df1.merge(df2[['A', 'B']], how='inner')
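
For reference, here is a minimal runnable sketch of the key-columns merge using the data from the question. The column names extra_1 and extra_2 are placeholders I made up for the "extra columns" in each frame. The drop_duplicates() call is my own addition, not part of the suggestion above: it assumes you do not want rows of df1 repeated when df2 contains the same (A, B) pair more than once.

```python
import pandas as pd

# Sample frames mirroring the question; extra_1 / extra_2 are placeholder names.
df1 = pd.DataFrame({"A": [1, 1, 1], "B": [2, 3, 5], "extra_1": ["x", "y", "z"]})
df2 = pd.DataFrame({"A": [1, 1, 1], "B": [3, 4, 5], "extra_2": ["p", "q", "r"]})

# Keep only the key columns of df2, de-duplicated so that a repeated
# (A, B) pair in df2 cannot multiply rows of df1 (an added safeguard).
keys = df2[["A", "B"]].drop_duplicates()

# The inner merge keeps the rows of df1 whose (A, B) pair also occurs in df2;
# only df1's columns appear in the result.
result = df1.merge(keys, on=["A", "B"], how="inner")
print(result)
```

With the sample data this keeps the (1, 3) and (1, 5) rows of df1 and drops (1, 2), which matches the output asked for in the question.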

Upvotes: 1
