How to get rows from one dataframe if values are in another dataframe

Question

I have one dataframe that is a result set of dropping duplicates from another dataframe.

changes = full_set.drop_duplicates(subset=['Employee ID', 'Benefit Plan Type', 'Sum of Premium'], keep='last')

Then I have another object where the ID and Plan Type are still listed twice:

dupe_accts = changes.set_index(['Employee ID', 'Benefit Plan Type']).index.get_duplicates()

What I'm trying to do now is build a third dataframe: if the ID and plan type are in dupe_accts, it would output the matching rows from changes into a new dataframe.

So far I have

dupes = changes[['Employee ID', 'Benefit Plan Type']].isin(dupe_accts)

but this is outputting

False False
False False
False False
False False
False False

piRSquared · Accepted Answer

You don't need to set the index and get dupes that way. You can use duplicated to get a boolean Series and mask the changes dataframe with it.

The keep=False parameter marks every row of a duplicated group as a duplicate. The other options, keep='first' (the default) and keep='last', leave the first or last occurrence of each group unmarked.
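To make the difference between the keep options concrete, here is a small sketch. The data in `sample` is made up for illustration (only the column names come from the question), so treat it as a toy case rather than the asker's real data.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
sample = pd.DataFrame({
    'Employee ID': [1, 1, 2, 3, 3],
    'Benefit Plan Type': ['Medical', 'Medical', 'Dental', 'Vision', 'Vision'],
    'Sum of Premium': [100, 120, 50, 30, 35],
})
keys = ['Employee ID', 'Benefit Plan Type']

# keep='first': the first row of each duplicated group is NOT flagged.
first_mask = sample.duplicated(subset=keys, keep='first')

# keep='last': the last row of each duplicated group is NOT flagged.
last_mask = sample.duplicated(subset=keys, keep='last')

# keep=False: every row of a duplicated group is flagged.
all_mask = sample.duplicated(subset=keys, keep=False)

print(pd.DataFrame({'first': first_mask, 'last': last_mask, 'all': all_mask}))
```

Only the keep=False column flags both rows of each repeated (Employee ID, Benefit Plan Type) pair, which is what is needed to pull out all of the still-duplicated rows.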

duplicated = changes.duplicated(
    subset=['Employee ID', 'Benefit Plan Type'], keep=False)
dupe_accts = changes[duplicated]
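If you already have a separate collection of key pairs and want the matching rows, as the title asks, one possible approach is to compare whole (ID, plan type) tuples through a MultiIndex. The original attempt most likely returned all False because DataFrame.isin tests each cell on its own against the given values, and a single ID or plan type never equals a tuple. The sketch below uses made-up data and hypothetical names (`row_keys`, `dupe_keys`); it is an alternative to the code above, not a replacement for it.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
changes = pd.DataFrame({
    'Employee ID': [1, 1, 2, 3, 3],
    'Benefit Plan Type': ['Medical', 'Medical', 'Dental', 'Vision', 'Vision'],
    'Sum of Premium': [100, 120, 50, 30, 35],
})
keys = ['Employee ID', 'Benefit Plan Type']

# One (Employee ID, Benefit Plan Type) tuple per row.
row_keys = changes.set_index(keys).index

# Key pairs that occur more than once, each listed a single time.
dupe_keys = row_keys[row_keys.duplicated()].unique()

# MultiIndex.isin compares whole tuples, so this is a row-wise membership test.
mask = row_keys.isin(dupe_keys)
dupes = changes[mask]

print(dupes)
```

On this toy data the result is the same set of rows that duplicated(..., keep=False) selects. The tuple-based test is mainly useful when the keys come from a different dataframe.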
