Beket Kassymbekov

Reputation: 33

How to drop duplicates in column with respect to values in another column in pandas?

I have a dataset with person names and the dates of their visits. I need to remove rows where the same "Visit_date" value is repeated for the same person in the "Person" column. The dataset is very large, so I need code that works reliably. I've spent several days on this without success. Here is a sample:

      Person      Visit_date
  0     John     11.09.2020
  1     John     11.09.2020
  2     John     11.08.2020
  3     Andy     11.07.2020
  4     Andy     11.09.2020
  5     Andy     11.09.2020
  6     George   11.09.2020
  7     George   11.09.2020
  8     George   11.07.2020
  9     George   11.07.2020

The code should return:

            Person   Visit_date
      0     John     11.09.2020
      1     John     11.08.2020
      2     Andy     11.07.2020
      3     Andy     11.09.2020
      4     George   11.09.2020
      5     George   11.07.2020

Upvotes: 0

Views: 109

Answers (1)

AziMez

Reputation: 2082

Hope this helps. Use df.drop_duplicates() to drop rows that are identical in every column, then df.reset_index(drop=True) to renumber the index:

import pandas as pd

df = pd.DataFrame({
    "Person": ['John', 'John', 'John', 'Andy', 'Andy', 'Andy', 'George', 'George', 'George', 'George'],
    "Visit_date": ['11.09.2020', '11.09.2020', '11.08.2020', '11.07.2020', '11.09.2020',
                   '11.09.2020', '11.09.2020', '11.09.2020', '11.07.2020', '11.07.2020'],
})

# Drop rows that are exact duplicates (same Person and same Visit_date)
df = df.drop_duplicates()
# Renumber the index from 0 after the duplicate rows are removed
df = df.reset_index(drop=True)

print(df)

[Result]:

    Person  Visit_date
0    John  11.09.2020
1    John  11.08.2020
2    Andy  11.07.2020
3    Andy  11.09.2020
4  George  11.09.2020
5  George  11.07.2020
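
If the real data has more columns than the sample, a plain drop_duplicates() only removes rows that match in every column, so it may keep rows you consider duplicates. A minimal sketch of one way to handle that, assuming you want the first row for each (Person, Visit_date) pair: pass the two columns as subset. The "Note" column here is hypothetical, added only to make the rows differ, and ignore_index=True assumes pandas 1.0 or newer (on older versions, chain .reset_index(drop=True) instead).

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Person": ['John', 'John', 'John', 'Andy', 'Andy', 'Andy',
               'George', 'George', 'George', 'George'],
    "Visit_date": ['11.09.2020', '11.09.2020', '11.08.2020', '11.07.2020', '11.09.2020',
                   '11.09.2020', '11.09.2020', '11.09.2020', '11.07.2020', '11.07.2020'],
})

# Hypothetical extra column (not in the question): every row is now unique,
# so drop_duplicates() without arguments would remove nothing.
df["Note"] = range(len(df))

# Compare only Person and Visit_date; keep the first row of each pair
# and renumber the index in the same call.
deduped = df.drop_duplicates(subset=["Person", "Visit_date"], ignore_index=True)

print(deduped)
```

This leaves the same six Person/Visit_date pairs as above. By default the first occurrence of each pair is kept; keep="last" keeps the last one instead.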

Upvotes: 1
