Michel1506

Reputation: 73

Python/Pandas select first duplicate in dataframe

I have the following dataframe df:

                 Value
Date     
2021-06-26       7
2021-06-27       3  
2021-06-28       9
2021-06-28       10

As you can see, I have two rows with the same index "2021-06-28" but with different values. Now I want to simply select the first row of this duplicate index. In this case:

                 Value
Date       
2021-06-28       9

I tried the following:

duplicateRows = df.loc["2021-06-28"]
firstRow = duplicateRows.iloc[0]

However, this gives me the error: "cannot concatenate object of type '<class 'numpy.float64'>'; only Series and DataFrame objs are valid".

Any help is appreciated :)

Upvotes: 1

Views: 1271

Answers (1)

jezrael

Reputation: 862771

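Chain two masks built with Index.duplicated and filter with boolean indexing. The inverted mask ~df.index.duplicated() is True for the first occurrence of each index value, and df.index.duplicated(keep=False) is True for every row whose index value occurs more than once. Combining them with & keeps only the first row of each duplicated index: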

df = df[~df.index.duplicated() & df.index.duplicated(keep=False)]
print(df)
            Value
Date             
2021-06-28      9
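
To see what each mask contributes, here is a minimal self-contained sketch. It rebuilds the frame from the question, assuming Date is a DatetimeIndex (the question does not show the dtype). The names first_seen, has_twin and first_of_duplicates are only illustrative and do not come from the original code.

```python
import pandas as pd

# Rebuild the question's frame; the DatetimeIndex dtype is an assumption.
df = pd.DataFrame(
    {"Value": [7, 3, 9, 10]},
    index=pd.DatetimeIndex(
        ["2021-06-26", "2021-06-27", "2021-06-28", "2021-06-28"], name="Date"
    ),
)

# duplicated() defaults to keep="first": later repeats are flagged True,
# so the inverted mask is True for the first occurrence of each index value.
first_seen = ~df.index.duplicated()

# keep=False flags every row whose index value appears more than once.
has_twin = df.index.duplicated(keep=False)

# Both conditions together: first row of each duplicated index value.
first_of_duplicates = df[first_seen & has_twin]
print(first_of_duplicates)
```

With the sample data this leaves the single row for 2021-06-28 with Value 9. Index values that occur only once are dropped by has_twin, and the second 2021-06-28 row is dropped by first_seen.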

Upvotes: 2
