Reputation: 4398
I have a dataset, df, with the following data:
starttime              endtime             ID   Diff
1/10/2020 9:05:00 PM   1/10/2020 9:05:10   A    10
1/10/2020 9:05:00 PM   1/10/2020 9:05:10   A    10
1/10/2020 9:06:00 PM   1/10/2020 9:06:10   B    10
Desired outcome:
starttime              endtime             ID   Diff
1/10/2020 9:05:00 PM   1/10/2020 9:05:10   A    10
1/10/2020 9:06:00 PM   1/10/2020 9:06:10   B    10
Note that one of the rows with ID A was removed because it was an exact duplicate:
1/10/2020 9:05:00 PM   1/10/2020 9:05:10   A    10
This is the code I am using; however, I am unsure what to include in the parentheses, or whether it is correct at all:
df.drop_duplicates(subset=None, keep=False)
Any suggestions are appreciated.
Upvotes: 1
Views: 62
Reputation: 4456
Try looking at the docs. If you can't figure out what's most appropriate for your case, then ask again, providing some context (e.g. an example).
The link is for pandas 0.25
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html
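For what it's worth, here is a minimal sketch of what the documented defaults do on the sample data from the question. The DataFrame is rebuilt by hand from the question's table (timestamps are kept as plain strings, which is an assumption about how the real data is stored), and the name deduped is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question; timestamps are left as strings here.
df = pd.DataFrame({
    "starttime": ["1/10/2020 9:05:00 PM", "1/10/2020 9:05:00 PM", "1/10/2020 9:06:00 PM"],
    "endtime": ["1/10/2020 9:05:10", "1/10/2020 9:05:10", "1/10/2020 9:06:10"],
    "ID": ["A", "A", "B"],
    "Diff": [10, 10, 10],
})

# With the defaults (subset=None, keep="first") all columns are compared
# and the first copy of each duplicated row is retained.
deduped = df.drop_duplicates()

# drop_duplicates returns a new DataFrame; reset the index if a clean 0..n-1 index is wanted.
deduped = deduped.reset_index(drop=True)
print(deduped)
```

Since drop_duplicates is not in-place by default, the result has to be assigned back (or inplace=True passed) for the change to stick.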
Upvotes: 1
Reputation: 14094
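You can supply the column to compare on. Note that keep=False drops every row that has a duplicate, so to keep one copy of each, use keep='first' (the default) and assign the result back:
df = df.drop_duplicates(subset='ID', keep='first')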
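To see how the keep argument changes the result when deduplicating on the ID column, here is a small sketch. The two-column DataFrame is a cut-down stand-in for the question's data, and the names drop_all and keep_first are just illustrative.

```python
import pandas as pd

# Cut-down stand-in for the question's data: two rows share ID "A".
df = pd.DataFrame({"ID": ["A", "A", "B"], "Diff": [10, 10, 10]})

# keep=False removes every row whose ID occurs more than once,
# so both "A" rows disappear and only "B" is left.
drop_all = df.drop_duplicates(subset="ID", keep=False)

# keep="first" retains the first row for each ID and drops later repeats.
keep_first = df.drop_duplicates(subset="ID", keep="first")

print(drop_all)
print(keep_first)
```

Keep in mind that subset='ID' treats rows as duplicates whenever the ID matches, even if the other columns differ; if only exact duplicates should go, leaving subset at its default of None compares every column.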
Upvotes: 2