Reputation: 962
I have a column Date Time
in my dataframe which has the date and time as a string.
Product Date Time
ABC Wed Jan 2 11:14:24 2019
ABC123 Wed Jan 2 11:14:24 2019
ABCXZY Wed Jan 2 11:14:24 2019
BVF123 Mon Jan 14 10:24:20 2019
ABC456 Mon Jan 14 10:24:20 2019
ABC000 Mon Feb 4 10:44:08 2019
ABCXYZ Mon Feb 4 10:44:08 2019
ABC678 Mon Feb 4 10:44:08 2019
ABCQYZ Wed Feb 20 09:14:40 2019
ABC090 Wed Feb 20 09:14:40 2019
I have converted this column to a datetime format using -
df['Date'] = pd.to_datetime(df['Date Time']).dt.strtime('%d-%b-%Y')
I want to now sort this dataframe on the basis of the Date
column to plot the quantities for each date in ascending order of date, but when I use -
df.sort_values(by='Date', inplace=True, ascending=True)
it only gets sorted by the date and ignores the month name, i.e as
02-Jan-2019
04-Feb-2019
08-Mar-2019
13-Feb-2019
14-Jan-2019
20-Feb-2019
21-Mar-2019
instead of
02-Jan-2019
14-Jan-2019
04-Feb-2019
13-Feb-2019
20-Feb-2019
08-Mar-2019
21-Mar-2019
How can I get the desired sorting using pandas datetime or any other module?
Upvotes: 1
Views: 526
Reputation: 326
pd.to_datetime(df['Date Time']).dt.strtime('%d-%b-%Y')
returns a series of string ("object type" to be precise) but not a series of datetime. That's why your sorting is wrong. Here is a code to do it :
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
### Dataframe
data = {'Product' : ['ABC', 'ABC123', 'ABCXZY', 'BVF123', 'ABC456', 'ABC000', 'ABCXYZ', 'ABC678', 'ABCQYZ', 'ABC090'], 'Date Time' : ['Wed Jan 2 11:14:24 2019', 'Wed Jan 2 11:14:24 2019', 'Wed Jan 2 11:14:24 2019', 'Mon Jan 14 10:24:20 2019', 'Mon Jan 14 10:24:20 2019', 'Mon Feb 4 10:44:08 2019', 'Mon Feb 4 10:44:08 2019', 'Mon Feb 4 10:44:08 2019', 'Wed Feb 20 09:14:40 2019', 'Wed Feb 20 09:14:40 2019']}
df = pd.DataFrame(data)
### Conversion to datetime
df['Date'] = pd.to_datetime(df.loc[:, 'Date Time'])
### Sorting
df.sort_values(by = 'Date', inplace = True)
### Plot
ax = df.groupby('Date').count().Product.plot()
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d-%b-%Y')) # Formatting x labels
Upvotes: 1
Reputation: 323226
We can do argsort
df=df.iloc[pd.to_datetime(df['Date Time'],format='%d-%b-%Y').argsort()]
Out[20]:
Date Time
3 14-Jan-2019
0 04-Feb-2019
2 13-Feb-2019
4 20-Feb-2019
1 08-Mar-2019
5 21-Mar-2019
Update
s=df.groupby(['Date Time']).size()
s.index=pd.to_datetime(s.index,format='%d-%b-%Y')
s.sort_index(inplace=True)
s.index=s.index.strftime('%d-%b-%Y')
Upvotes: 1