Aigerim
Aigerim

Reputation: 71

How to group by week (start is Thursday) using pandas?

I have a dataframe with data on a number of gained users by day. The dataframe:

users = {'Date':['13.02.2020', '15.02.2020', '19.02.2020', '20.02.2020', '23.02.2020', '26.02.2020'], 'Users_gain':['2', '1', '3', '1', '4', '5']}
df_users = pd.DataFrame(users)

Date    Users_gain
0   13.02.2020  2
1   15.02.2020  1
2   19.02.2020  3
3   20.02.2020  1
4   23.02.2020  4
5   26.02.2020  5

I need to group by week, where the start day is Thursday. So the result should look like this:

Date    Users_gain
0   13.02.2020 - 19.02.2020 6
1   20.02.2020 - 26.02.2020 10

I would appreciate if someone can help me with this question.

This is what I tried. But I don't know how to change the start date from Sunday to Thursday:

df_users.groupby(pd.Grouper(key='Date', freq='W')).sum()

Upvotes: 1

Views: 1755

Answers (1)

jezrael
jezrael

Reputation: 862661

First convert columns to numbers and datetimes by to_datetime:

df_users['Users_gain'] = df_users['Users_gain'].astype(int)
df_users['Date'] = pd.to_datetime(df_users['Date'], format='%d.%m.%Y')

Then aggregate by DataFrame.resample or with Grouper by day Wednesday:

df_users = df_users.resample('W-Wed',on='Date')['Users_gain'].sum().reset_index()
#alternative
#df_users = df_users.groupby(pd.Grouper(key='Date', freq='W-Wed')).sum().reset_index()

Last change format of datetimes with subtract 6 days and Series.dt.strftime:

s = (df_users['Date'] - pd.offsets.DateOffset(days=6)).dt.strftime('%d.%m.%Y-')
df_users['Date'] = s + df_users['Date'].dt.strftime('%d.%m.%Y')

print (df_users)
                    Date  Users_gain
0  13.02.2020-19.02.2020           6
1  20.02.2020-26.02.2020          10

Upvotes: 1

Related Questions