steff

Reputation: 906

Set index in pandas df without creating duplicate column

My dataframe looks like this:

In [120]: data.head()
Out[120]: 
         date   open   high    low  close  volume
0  2017-08-07  2.276  2.276  2.253  2.257     0.0
1  2017-08-08  2.260  2.291  2.253  2.283     0.0
2  2017-08-09  2.225  2.249  2.212  2.241     0.0
3  2017-08-10  2.241  2.241  2.210  2.212     0.0
4  2017-08-11  2.199  2.222  2.182  2.189     0.0

After running:

data.index = pd.to_datetime(data['date'])

I end up with this:

In [122]: data.head()
Out[122]: 
                  date   open   high    low  close  volume
date                                                      
2017-08-07  2017-08-07  2.276  2.276  2.253  2.257     0.0
2017-08-08  2017-08-08  2.260  2.291  2.253  2.283     0.0
2017-08-09  2017-08-09  2.225  2.249  2.212  2.241     0.0
2017-08-10  2017-08-10  2.241  2.241  2.210  2.212     0.0
2017-08-11  2017-08-11  2.199  2.222  2.182  2.189     0.0

How can I avoid ending up with a duplicate date column? Grateful for your help. (Pandas 0.21.1)

Upvotes: 3

Views: 3568

Answers (2)

Sorginah

Reputation: 122

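I use drop instead of inplace, which works too. Without inplace, set_index returns a new DataFrame, so assign the result back:

data['date'] = pd.to_datetime(data['date'])

data = data.set_index('date', drop=True)

When the drop parameter is True (the default), the column used as the new index is removed from the columns.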
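
To make the effect of drop concrete, here is a minimal sketch. The three-row frame and the names dropped and kept are made up for illustration; only set_index and its drop parameter come from the answer above.

```python
import pandas as pd

# Small stand-in for the question's data (illustrative values only).
data = pd.DataFrame({
    "date": ["2017-08-07", "2017-08-08", "2017-08-09"],
    "open": [2.276, 2.260, 2.225],
    "close": [2.257, 2.283, 2.241],
})
data["date"] = pd.to_datetime(data["date"])

# set_index returns a new DataFrame; `data` itself is left unchanged.
dropped = data.set_index("date", drop=True)   # 'date' moves into the index
kept = data.set_index("date", drop=False)     # 'date' stays as a column too

print(list(dropped.columns))  # ['open', 'close']
print(list(kept.columns))     # ['date', 'open', 'close']
```

With drop=False you get the same duplicated column as in the question, so leaving drop at its default is usually what you want here.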

Upvotes: 2

Mike Müller

Reputation: 85442

Convert the column to datetime first, then use .set_index():

data['date'] = pd.to_datetime(data['date'])
data.set_index('date', inplace=True)
print(data)

Output:

             open   high    low  close  volume
date                                          
2017-08-07  2.276  2.276  2.253  2.257     0.0
2017-08-08  2.260  2.291  2.253  2.283     0.0
2017-08-09  2.225  2.249  2.212  2.241     0.0
2017-08-10  2.241  2.241  2.210  2.212     0.0
2017-08-11  2.199  2.222  2.182  2.189     0.0

inplace=True modifies your data instead of creating a new dataframe.
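
As a rough comparison of the two styles, the sketch below runs both on the same small frame. The helper make_frame and the sample values are hypothetical, added only for illustration.

```python
import pandas as pd

def make_frame():
    # Hypothetical helper: builds a tiny frame shaped like the question's data.
    df = pd.DataFrame({
        "date": ["2017-08-07", "2017-08-08"],
        "open": [2.276, 2.260],
    })
    df["date"] = pd.to_datetime(df["date"])
    return df

# In-place: the call mutates `a` and returns None.
a = make_frame()
result = a.set_index("date", inplace=True)

# Reassignment: the call returns a new DataFrame that replaces `b`.
b = make_frame()
b = b.set_index("date")

print(result is None)  # True
print(a.equals(b))     # True
```

Both routes end with the same frame; the practical difference is that the in-place call returns None, so its result should not be assigned to anything.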

Upvotes: 6
