johnbumble

Reputation: 700

Pandas: how to create a new dataframe that only has the duplicate ids

I am trying to create a new dataframe that has the columns id and name, for all the duplicate ids in the dataframe.

My dataframe's structure is:

id, name, lat, lon, price, minimum_nights, review_cnt

I tried the .duplicated() method, but I am not getting what I need. I think I might be using it wrong.

Upvotes: 0

Views: 618

Answers (1)

fishmulch

Reputation: 376

By default, .duplicated() flags every duplicated row except its first occurrence (keep='first'). To get all rows that are duplicated on 'id' and 'name', including the first occurrence, pass keep=False:

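# Copy only the two columns of interest, leaving the original df untouched
subset = df[['id', 'name']].copy()
# keep=False flags every row of a duplicated group, not just the repeats
subset[subset.duplicated(keep=False)]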
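
Note that the snippet above compares 'id' and 'name' together, so two rows with the same id but different names would not be flagged. If the goal is duplicate ids regardless of name, as the question seems to ask, something like the sketch below might be closer. The sample data is made up for illustration (only the column names come from the question), and `dupes` is just a placeholder name:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 3, 4],
    "name": ["a", "b", "b2", "c", "c", "d"],
    "price": [10, 20, 25, 30, 30, 40],
})

# subset="id" compares on the id column only;
# keep=False flags every row of a duplicated id, including the first
mask = df.duplicated(subset="id", keep=False)

# Keep just the id and name columns for the flagged rows
dupes = df.loc[mask, ["id", "name"]]
print(dupes)
```

Here id 2 is picked up even though its two rows have different names, which the two-column comparison would miss.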

Upvotes: 1
