Reputation: 906
My dataframe looks like this:
In [120]: data.head()
Out[120]:
date open high low close volume
0 2017-08-07 2.276 2.276 2.253 2.257 0.0
1 2017-08-08 2.260 2.291 2.253 2.283 0.0
2 2017-08-09 2.225 2.249 2.212 2.241 0.0
3 2017-08-10 2.241 2.241 2.210 2.212 0.0
4 2017-08-11 2.199 2.222 2.182 2.189 0.0
After doing:
data.index = pd.to_datetime(data['date'])
I end up with this:
In [122]: data.head()
Out[122]:
date open high low close volume
date
2017-08-07 2017-08-07 2.276 2.276 2.253 2.257 0.0
2017-08-08 2017-08-08 2.260 2.291 2.253 2.283 0.0
2017-08-09 2017-08-09 2.225 2.249 2.212 2.241 0.0
2017-08-10 2017-08-10 2.241 2.241 2.210 2.212 0.0
2017-08-11 2017-08-11 2.199 2.222 2.182 2.189 0.0
How can I avoid ending up with a duplicate date column? Grateful for your help. (Pandas 0.21.1)
Upvotes: 3
Views: 3568
Reputation: 122
I use drop instead of inplace, which also works.
data['date'] = pd.to_datetime(data['date'])
data = data.set_index('date', drop=True)  # set_index returns a new dataframe, so assign it back
Setting the drop parameter to True (which is the default) deletes the column that is used as the new index.
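To make the effect of drop visible, here is a minimal sketch that builds a small frame from the first three rows of the question (only the open and close columns are kept, for brevity) and compares drop=True with drop=False. The variable names dropped and kept are just illustrative.

```python
import pandas as pd

# Small sample taken from the question's data (subset of columns).
data = pd.DataFrame({
    "date": ["2017-08-07", "2017-08-08", "2017-08-09"],
    "open": [2.276, 2.260, 2.225],
    "close": [2.257, 2.283, 2.241],
})
data["date"] = pd.to_datetime(data["date"])

# drop=True (the default): the 'date' column moves into the index.
dropped = data.set_index("date", drop=True)

# drop=False: 'date' becomes the index AND stays as a column,
# which reproduces the duplicate seen in the question.
kept = data.set_index("date", drop=False)

print(list(dropped.columns))  # ['open', 'close']
print(list(kept.columns))     # ['date', 'open', 'close']
print(list(data.columns))     # ['date', 'open', 'close'] (original is unchanged)
```

Since set_index returns a new dataframe here, the original data keeps its date column until the result is assigned back.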
Upvotes: 2
Reputation: 85442
Convert the column to datetime first, then use .set_index():
data['date'] = pd.to_datetime(data['date'])
data.set_index('date', inplace=True)
print(data)
Output:
open high low close volume
date
2017-08-07 2.276 2.276 2.253 2.257 0.0
2017-08-08 2.260 2.291 2.253 2.283 0.0
2017-08-09 2.225 2.249 2.212 2.241 0.0
2017-08-10 2.241 2.241 2.210 2.212 0.0
2017-08-11 2.199 2.222 2.182 2.189 0.0
inplace=True modifies your dataframe directly instead of creating a new one.
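As a rough illustration of that difference, the sketch below runs both variants on the same sample rows from the question and checks that they end up identical. The helper make_frame is hypothetical and only exists to build two independent copies of the data.

```python
import pandas as pd

def make_frame():
    # Hypothetical helper: first three rows of the question's data.
    return pd.DataFrame({
        "date": ["2017-08-07", "2017-08-08", "2017-08-09"],
        "open": [2.276, 2.260, 2.225],
        "close": [2.257, 2.283, 2.241],
    })

# Variant 1: inplace=True mutates `a` and returns None.
a = make_frame()
a["date"] = pd.to_datetime(a["date"])
result = a.set_index("date", inplace=True)

# Variant 2: without inplace, a new dataframe is returned and reassigned.
b = make_frame()
b["date"] = pd.to_datetime(b["date"])
b = b.set_index("date")

print(result is None)  # True
print(a.equals(b))     # True
```

Either form avoids the duplicate column; which one to use is mostly a matter of taste, although the reassignment form is easier to chain with other calls.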
Upvotes: 6