Reputation: 71
I have a dataframe with data on a number of gained users by day. The dataframe:
users = {'Date':['13.02.2020', '15.02.2020', '19.02.2020', '20.02.2020', '23.02.2020', '26.02.2020'], 'Users_gain':['2', '1', '3', '1', '4', '5']}
df_users = pd.DataFrame(users)
Date Users_gain
0 13.02.2020 2
1 15.02.2020 1
2 19.02.2020 3
3 20.02.2020 1
4 23.02.2020 4
5 26.02.2020 5
I need to group by week, where the start day is Thursday. So the result should look like this:
Date Users_gain
0 13.02.2020 - 19.02.2020 6
1 20.02.2020 - 26.02.2020 10
I would appreciate if someone can help me with this question.
This is what I tried. But I don't know how to change the start date from Sunday to Thursday:
df_users.groupby(pd.Grouper(key='Date', freq='W')).sum()
Upvotes: 1
Views: 1755
Reputation: 862661
First convert columns to numbers and datetimes by to_datetime
:
df_users['Users_gain'] = df_users['Users_gain'].astype(int)
df_users['Date'] = pd.to_datetime(df_users['Date'], format='%d.%m.%Y')
Then aggregate by DataFrame.resample
or with Grouper
by day Wednesday
:
df_users = df_users.resample('W-Wed',on='Date')['Users_gain'].sum().reset_index()
#alternative
#df_users = df_users.groupby(pd.Grouper(key='Date', freq='W-Wed')).sum().reset_index()
Last change format of datetimes with subtract 6 days and Series.dt.strftime
:
s = (df_users['Date'] - pd.offsets.DateOffset(days=6)).dt.strftime('%d.%m.%Y-')
df_users['Date'] = s + df_users['Date'].dt.strftime('%d.%m.%Y')
print (df_users)
Date Users_gain
0 13.02.2020-19.02.2020 6
1 20.02.2020-26.02.2020 10
Upvotes: 1