curios

Reputation: 163

Pandas Sort Values By Date Doesn't Sort By Year

I have a large data set that is in this format

I'd like to order this data set by the "created_at" column, so I converted the "created_at" column to type datetime following this guide: https://www.geeksforgeeks.org/how-to-sort-a-pandas-dataframe-by-date/

data = pd.read_csv(PATH_TO_CSV)

data['created_at'] = data['created_at'].str.split("+").str[0]
data['created_at'] = pd.to_datetime(data['created_at'],format="%Y-%m-%dT%H:%M:%S")

data.sort_values(by='created_at')

But the result is not sorted by year as I expected. The rows with dates in 2012 should be at the top, but they aren't.

print(data)
print(type(data['created_at'][0]))

What am I missing?

Upvotes: 0

Views: 48

Answers (2)

Rabinzel

Reputation: 7923

As already stated in the comments, the sorted DataFrame needs to be assigned back to a variable. sort_values does not sort in place by default; it returns a new, sorted DataFrame.

data = data.sort_values(by='created_at')

# OR 

data.sort_values(by='created_at', inplace=True)
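
To see the difference, here is a minimal sketch on a small made-up DataFrame (the three timestamps are hypothetical sample values, not the asker's data). It shows that calling sort_values without assigning the result leaves the original frame in its old order:

```python
import pandas as pd

# Hypothetical sample data standing in for the question's CSV
df = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2015-03-01T10:00:00",
        "2012-07-15T08:30:00",
        "2013-01-20T12:45:00",
    ])
})

# sort_values returns a new, sorted DataFrame; df itself is not modified
result = df.sort_values(by="created_at")

print(df["created_at"].dt.year.tolist())      # original order is unchanged
print(result["created_at"].dt.year.tolist())  # sorted copy, earliest first
```

Note that the sorted copy keeps the original index labels, so result['created_at'][0] still refers to the row labelled 0, not to the earliest date. Use result['created_at'].iloc[0] if you want the first row by position.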

Upvotes: 1

mozway

Reputation: 261964

With a datetime dtype, the column can be sorted directly. Just make sure to assign the output, since sorting is not done in place:

# no need for an intermediate column nor to pass the full format
data['created_at'] = pd.to_datetime(data['created_at'].str.split("+").str[0])

# assign output
data = data.sort_values(by='created_at')
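
As a rough end-to-end check, the same steps can be run on a few made-up strings. The "+00:00" suffix and the values themselves are assumptions about what the CSV might look like, and the reset_index call is an optional extra so that the row labels follow the new order:

```python
import pandas as pd

# Hypothetical rows; the real file is read with pd.read_csv in the question
data = pd.DataFrame({
    "created_at": [
        "2015-03-01T10:00:00+00:00",
        "2012-07-15T08:30:00+00:00",
        "2013-01-20T12:45:00+00:00",
    ]
})

# Drop everything after "+" and let pandas parse the ISO-style timestamp
data["created_at"] = pd.to_datetime(data["created_at"].str.split("+").str[0])

# Assign the sorted result; reset_index renumbers the rows 0..n-1 (optional)
data = data.sort_values(by="created_at").reset_index(drop=True)

print(data)
print(data["created_at"][0])  # label 0 is now the earliest timestamp
```

Without the reset_index step, data['created_at'][0] would still return the row that was first in the file, which can make a correctly sorted frame look unsorted when you inspect it by label.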

Upvotes: 2
