Reputation:
I have the data below, with Date as the index. I want to remove the rows that have a duplicated Date.
Stock Open High Low Close Adj Close Volume
Date
2016-05-13 AAD 5.230000 5.260000 5.200000 5.260000 5.260000 5000
2016-05-16 AAD 5.220000 5.260000 5.220000 5.260000 5.260000 6000
2016-05-17 AAD 5.210000 5.260000 5.210000 5.260000 5.260000 2000
2016-05-17 AAD 5.210000 5.260000 5.210000 5.260000 5.260000 2000
2016-05-18 AAD 5.200000 5.250000 5.200000 5.250000 5.250000 3000
The output I need:
Stock Open High Low Close Adj Close Volume
Date
2016-05-13 AAD 5.230000 5.260000 5.200000 5.260000 5.260000 5000
2016-05-16 AAD 5.220000 5.260000 5.220000 5.260000 5.260000 6000
2016-05-17 AAD 5.210000 5.260000 5.210000 5.260000 5.260000 2000
2016-05-18 AAD 5.200000 5.250000 5.200000 5.250000 5.250000 3000
I tried df.drop_duplicates(),
but the output also drops the rows that come after the duplicated date:
Stock Open High Low Close Adj Close Volume
Date
2016-05-13 AAD 5.230000 5.260000 5.200000 5.260000 5.260000 5000
2016-05-16 AAD 5.220000 5.260000 5.220000 5.260000 5.260000 6000
2016-05-17 AAD 5.210000 5.260000 5.210000 5.260000 5.260000 2000
Upvotes: 2
Views: 8474
Reputation: 153460
Let's use the information Jezrael provided.
Input DataFrame:
print(df)
Stock Open High Low Close Adj Close Volume
2016-05-13 AAD 5.23 5.26 5.20 5.26 5.26 5000
2016-05-16 AAD 5.22 5.26 5.22 5.26 5.26 6000
2016-05-17 AAD 5.21 5.26 5.21 5.26 5.26 2000
2016-05-17 AAD 5.21 5.26 5.21 5.26 5.26 2000
2016-05-18 AAD 5.20 5.25 5.20 5.25 5.25 3000
# Keep only the last row for each repeated index label
df1 = df[~df.index.duplicated(keep='last')]
print(df1)
Output:
Stock Open High Low Close Adj Close Volume
2016-05-13 AAD 5.23 5.26 5.20 5.26 5.26 5000
2016-05-16 AAD 5.22 5.26 5.22 5.26 5.26 6000
2016-05-17 AAD 5.21 5.26 5.21 5.26 5.26 2000
2016-05-18 AAD 5.20 5.25 5.20 5.25 5.25 3000
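For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame by hand from the values in the question (the construction code and the names mask and df2 are my own, not from the original post) and shows the boolean mask that df.index.duplicated(keep='last') produces. Note that DataFrame.drop_duplicates() compares column values and ignores the index, so it is not checking the Date index at all. One possible alternative, shown at the end, is to move Date into a column first.

```python
import pandas as pd

# Sample data copied from the question; the DataFrame construction itself
# is an assumption about how the frame was built.
df = pd.DataFrame(
    {
        "Stock": ["AAD"] * 5,
        "Open": [5.23, 5.22, 5.21, 5.21, 5.20],
        "High": [5.26, 5.26, 5.26, 5.26, 5.25],
        "Low": [5.20, 5.22, 5.21, 5.21, 5.20],
        "Close": [5.26, 5.26, 5.26, 5.26, 5.25],
        "Adj Close": [5.26, 5.26, 5.26, 5.26, 5.25],
        "Volume": [5000, 6000, 2000, 2000, 3000],
    },
    index=pd.to_datetime(
        ["2016-05-13", "2016-05-16", "2016-05-17", "2016-05-17", "2016-05-18"]
    ).rename("Date"),
)

# True for each repeated index label except its last occurrence.
mask = df.index.duplicated(keep="last")

# Invert the mask to keep exactly one row per Date.
df1 = df[~mask]

# Alternative sketch: turn the index into a column, deduplicate on that
# column only, then restore it as the index.
df2 = (
    df.reset_index()
      .drop_duplicates(subset="Date", keep="last")
      .set_index("Date")
)

print(df1)
```

Use keep='first' instead if the earlier of the duplicated rows should survive. In this sample the two 2016-05-17 rows are identical, so either choice gives the same result.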
Upvotes: 3