Reputation: 310
I have a column of date strings I know are from a single month, in this case the dates are all between January and February 2020. I want to sort them in ascending order. However, they are in different formats some in mm/dd/yy, some in dd/mm/yy. How can I sort them?
data = {
'date': ['1/1/2020','20/1/2020', '1/1/2020', '1/28/2020','21/1/2020', '1/25/2020', '29/1/2020'],
}
df = pd.DataFrame(data)
print(df)
Edit
Another sample of dates I'd like to be sorted
import pandas as pd
data = {'Tgl': {
1: '1/1/2023',
2: '1/1/2023',
3: '1/3/2023',
4: '1/5/2023',
5: '1/5/2023',
6: '1/9/2023',
7: '10/1/2023',
8: '12/1/2023',
9: '16/1/2023'}}
df = pd.DataFrame(data)
df = pd.to_datetime(df['Tgl'])
df = pd.to_datetime(df['Tgl'], dayfirst = True)
Upvotes: 0
Views: 119
Reputation: 260735
In the provided example, there is limited ambiguity as you don't have cases for which a day ≤ 12 is different from the month.
So you can use pandas.to_datetime
(pd.to_datetime(df['date'])
) to convert to a clean datetime, or, to sort while keeping the original strings:
df.sort_values(by='date', key=pd.to_datetime)
Output:
date
0 1/1/2020
2 1/1/2020
1 20/1/2020
4 21/1/2020
5 1/25/2020
3 1/28/2020
6 29/1/2020
If you have ambiguous dates (like 1/2/2020
) you can choose to give priority to days/months with the dayfirst
parameter:
df.sort_values(by='date', key=lambda x: pd.to_datetime(x, dayfirst=True))
Example:
date
2 2/1/2020 # Jan 2nd
1 20/1/2020
4 21/1/2020
5 1/25/2020
3 1/28/2020
6 29/1/2020
0 1/2/2020 # Feb 1st
Let's assume the first number is the day, unless the value is > 2, in which case we convert it to month.
def custom_date(s):
return (
pd.to_datetime(s, dayfirst=True)
.mask(lambda x: x.dt.month>2,
pd.to_datetime(s, dayfirst=False))
)
df.sort_values(by='date', key=custom_date)
Output (with an additional column to see the result of the custom conversion):
date converted
2 2/1/2020 2020-01-02
7 10/1/2020 2020-01-10 # both converted
8 1/10/2020 2020-01-10 # to Jan 10
1 20/1/2020 2020-01-20
4 21/1/2020 2020-01-21
5 1/25/2020 2020-01-25
3 1/28/2020 2020-01-28
6 29/1/2020 2020-01-29
0 1/2/2020 2020-02-01
Upvotes: 3