Data_User
Data_User

Reputation: 51

Sort pandas column with month year string

I have a pandas index column with values as month_year. Example 'March_2021', 'April 2020', 'December_2021'. I would like to sort the index in an intelligent manner that sorts the index in the right order as: 'April 2020', 'March_2021', 'December_2021' without having the correct order in a list as I have below.

My current implementation:

month_year_sorter = ['April 2020','March_2021', 'December_2021']
df = df.reindex(month_year_sorter)

Upvotes: 0

Views: 115

Answers (1)

Daweo
Daweo

Reputation: 36450

Observation: "April 2020" does work with pandas.to_datetime whilst "March_2021" does not

import pandas as pd
a2020 = pd.to_datetime("April 2020")
print(a2020.month)  # 4
print(a2020.year)  # 2020
m2021 = pd.to_datetime("March_2021")  # ValueError: ('Unknown string format:', 'March_2021')

Possible exploit: convert your strings into format which pandas.to_datetime can consume and use that for sorting that, for example

import pandas as pd
def convert_to_timestamp(x):
    return pd.to_datetime(x.replace('_',' '))
month_year_unsorted = ['December_2021','April 2020','March_2021']
month_year_sorted = sorted(month_year_unsorted,key=convert_to_timestamp)
print(month_year_sorted)  # ['April 2020', 'March_2021', 'December_2021']

Upvotes: 2

Related Questions