Reputation: 711

pandas dataframe sort by date

I made a dataframe by importing a CSV file, converted the date column to datetime, and made it the index. However, sorting the index doesn't produce the result I wanted:

print(df.head())
df['Date'] = pd.to_datetime(df['Date'])
df.index = df['Date']
del df['Date']
df.sort_index()
print(df.head())

Here's the result:

         Date     Last
0  2016-12-30  1.05550
1  2016-12-29  1.05275
2  2016-12-28  1.04610
3  2016-12-27  1.05015
4  2016-12-23  1.05005
               Last
Date               
2016-12-30  1.05550
2016-12-29  1.05275
2016-12-28  1.04610
2016-12-27  1.05015
2016-12-23  1.05005

The dates actually go back to 1999, so if I sort by date, it should show the data in ascending order, right?

Upvotes: 17

Answers (2)

Marjan Moderc

Reputation: 2859

Just expanding on MaxU's correct answer: you used the correct method, but, as with many other pandas methods, sort_index() returns a new DataFrame instead of modifying the original, so you have to store the result for the change to take effect. As MaxU already suggested, you can assign the output of the method back to the same variable, e.g.:

df = df.sort_index()

or by passing the parameter inplace=True, which modifies the DataFrame in place, so there is no need to reassign it:

df.sort_index(inplace=True)

However, in my experience, I often feel "safer" using the first option. It also looks clearer and more consistent, since not all methods offer inplace. But it all comes down to scripting style, I guess...
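To make the difference concrete, here is a minimal sketch using a small made-up frame shaped like the one in the question (the three sample rows and the variable names first_before and first_after are only for illustration). It shows that calling sort_index() without keeping the result leaves df untouched:

```python
import pandas as pd

# Made-up sample in the same shape as the question's data (newest row first)
df = pd.DataFrame(
    {"Last": [1.05550, 1.05275, 1.04610]},
    index=pd.to_datetime(["2016-12-30", "2016-12-29", "2016-12-28"]),
)
df.index.name = "Date"

# The sorted copy is returned and immediately discarded; df is unchanged
df.sort_index()
first_before = df.index[0]

# Assigning the result back keeps the sorted copy
df = df.sort_index()
first_after = df.index[0]

print(first_before.date(), first_after.date())
```

If you want newest-first order again later, sort_index(ascending=False) does that.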

Upvotes: 15

Ishan Khatri

Reputation: 575

The data looks like this:

Date,Last
2016-12-30,1.05550
2016-12-29,1.05275
2016-12-28,1.04610
2016-12-27,1.05015
2016-12-23,1.05005

Read the data with pandas:

import pandas as pd
df = pd.read_csv('data', sep=',')
# Display the first rows
print(df.head())
         Date     Last
0  2016-12-30  1.05550
1  2016-12-29  1.05275
2  2016-12-28  1.04610
3  2016-12-27  1.05015
4  2016-12-23  1.05005

# Sort the rows by the Date column and keep the result
df = df.sort_values(by='Date')
print(df.head())
         Date     Last
4  2016-12-23  1.05005
3  2016-12-27  1.05015
2  2016-12-28  1.04610
1  2016-12-29  1.05275
0  2016-12-30  1.05550

# Reset the index; the old index is kept as a new column named 'index'
df.reset_index(inplace=True)

# Display the data head after the index reset
print(df.head())
       index        Date     Last
0      4  2016-12-23  1.05005
1      3  2016-12-27  1.05015
2      2  2016-12-28  1.04610
3      1  2016-12-29  1.05275
4      0  2016-12-30  1.05550

# Delete the leftover 'index' column
del df['index']

# Display the data head
print(df.head())
         Date     Last
0  2016-12-23  1.05005
1  2016-12-27  1.05015
2  2016-12-28  1.04610
3  2016-12-29  1.05275
4  2016-12-30  1.05550
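
The same steps can be written more compactly. This is only a sketch, assuming the CSV has the two columns shown above; the inline csv_text string stands in for the real file so the snippet runs on its own. parse_dates converts the Date column to datetime while reading (so the sort is on real dates, not strings), and reset_index(drop=True) renumbers the rows without creating the extra 'index' column:

```python
import io
import pandas as pd

# Inline stand-in for the CSV file, using the rows shown above
csv_text = """Date,Last
2016-12-30,1.05550
2016-12-29,1.05275
2016-12-28,1.04610
2016-12-27,1.05015
2016-12-23,1.05005
"""

# Parse the Date column as datetime while reading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Sort by date, then renumber rows 0..n-1 and discard the old index
df = df.sort_values(by="Date").reset_index(drop=True)

print(df.head())
```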

Upvotes: 8
