Reputation: 113
Here is a sample of my data:
Date Count
11.01.2019 1
01.02.2019 7
25.01.2019 4
23.01.2019 4
16.03.2019 1
04.02.2019 5
06.04.2019 1
04.04.2019 5
Required output:
Month Total_Count
Jan 9
Feb 12
Mar 1
Apr 6
I have used the following code, for the above operation of summing up, and it works fine, but the months are all jumbled up and not sorted accordingly like Jan,Feb
(df.groupby(pd.to_datetime(df['Date'], format='%d.%m.%Y')
.dt.month_name()
.str[:3])['Count']
.sum()
.rename_axis('Month')
.reset_index(name='Total_Count'))
Upvotes: 3
Views: 672
Reputation: 399
Try this:
new_df = (df.sort_values('Date')
.groupby(df['Date'].dt.month_name().str[:3], sort=False)['Count']
.sum()
.rename_axis('Month')
.reset_index(name='Total_Count'))
print(new_df)
Upvotes: 0
Reputation: 863281
Idea is convert column to datetimes, then sorting and grouping with sort=False
for avoid default sort in groupby
:
df['Date'] = pd.to_datetime(df['Date'], format='%d.%m.%Y')
df1 = (df.sort_values('Date')
.groupby(df['Date'].dt.month_name().str[:3], sort=False)['Count']
.sum()
.rename_axis('Month')
.reset_index(name='Total_Count'))
print (df1)
Month Total_Count
0 Jan 9
1 Feb 12
2 Mar 1
3 Apr 6
Another idea, thank you anky is use ordered Categorical
s, then is necessary remove sort=False
:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df1 = (df.groupby(pd.Categorical(pd.to_datetime(df['Date'], format='%d.%m.%Y')
.dt.month_name().str[:3],ordered=True,categories=months))['Count']
.sum()
.rename_axis('Month')
.reset_index(name='Total_Count'))
Or using Series.reindex
:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
df1 = (df.groupby(pd.to_datetime(df['Date'], format='%d.%m.%Y')
.dt.month_name().str[:3])['Count']
.sum()
.rename_axis('Month')
.reindex(months, fill_value=0)
.reset_index(name='Total_Count'))
print (df1)
Month Total_Count
0 Jan 9
1 Feb 12
2 Mar 1
3 Apr 6
4 May 0
5 Jun 0
6 Jul 0
7 Aug 0
8 Sep 0
9 Oct 0
10 Nov 0
11 Dec 0
Upvotes: 5