Pandas GroupBy Date Chunks

Question

I am trying to group a pandas DataFrame into 2-day buckets. For example, if I run the following:

import pandas as pd

df = pd.DataFrame()
df['action_date'] = ['2017-01-01', '2017-01-01', '2017-01-03', '2017-01-04', '2017-01-04', '2017-01-05', '2017-01-06']
df['action_date'] = pd.to_datetime(df['action_date'], format="%Y-%m-%d")
df['user_name'] = ['abc', 'wdt', 'sdf', 'dfe', 'dsd', 'erw', 'fds']
df['number_of_apples'] = [1,2,3,4,5,6,2]
df = df.groupby(['action_date', 'number_of_apples']).sum()

I get a dataframe grouped by action_date with number_of_apples per day.

However, if I wanted to look at the dataframe in chunks of 2 days, how could I do so? I would then like to analyze the number_of_apples per date_chunk, either by making new dataframes for the dates 2017-01-01 & 2017-01-03, another for 2017-01-04 & 2017-01-05, and then one last one for 2017-01-06, OR just by regrouping and working within.

EDIT: I would ultimately like to make lists of users based on the number of apples they have in each day chunk, so I do not want the sum or the mean of each day chunk's apples. Sorry for the confusion!

Thank you in advance!

jezrael · Accepted Answer

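You can use resample on the original DataFrame (before the groupby step in the question, while action_date is still a column), passing on='action_date' to bin by that column: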

print (df.resample('2D', on='action_date')['number_of_apples'].sum().reset_index())
  action_date  number_of_apples
0  2017-01-01                 3
1  2017-01-03                12
2  2017-01-05                 8
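
The question also mentions making a separate DataFrame per 2-day chunk. A minimal sketch of one way to do that, assuming the same sample data and using pd.Grouper with freq='2D' (the name chunks is hypothetical, chosen for illustration):

```python
import pandas as pd

# Sample data from the question, before any groupby is applied.
df = pd.DataFrame({
    "action_date": pd.to_datetime(
        ["2017-01-01", "2017-01-01", "2017-01-03", "2017-01-04",
         "2017-01-04", "2017-01-05", "2017-01-06"]
    ),
    "user_name": ["abc", "wdt", "sdf", "dfe", "dsd", "erw", "fds"],
    "number_of_apples": [1, 2, 3, 4, 5, 6, 2],
})

# Build one sub-DataFrame per 2-day bin, keyed by the bin's start date.
# Bins that contain no rows are skipped.
chunks = {
    start: chunk
    for start, chunk in df.groupby(pd.Grouper(key="action_date", freq="2D"))
    if not chunk.empty
}

for start, chunk in chunks.items():
    print(start.date(), chunk["number_of_apples"].tolist())
```

Each value in chunks is an ordinary DataFrame, so it can be analyzed on its own without aggregating first.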

EDIT: To get a list of users for each 2-day chunk instead of a sum:

print (df.resample('2D', on='action_date')['user_name'].apply(list).reset_index())
  action_date        user_name
0  2017-01-01       [abc, wdt]
1  2017-01-03  [sdf, dfe, dsd]
2  2017-01-05       [erw, fds]
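
If the goal is a list of users per apple count within each 2-day chunk, one possible reading of the question's EDIT, a sketch along these lines may help. It assumes the same sample data and groups on both a pd.Grouper and the number_of_apples column; the variable name users_by_chunk_and_apples is hypothetical:

```python
import pandas as pd

# Sample data from the question, before any groupby is applied.
df = pd.DataFrame({
    "action_date": pd.to_datetime(
        ["2017-01-01", "2017-01-01", "2017-01-03", "2017-01-04",
         "2017-01-04", "2017-01-05", "2017-01-06"]
    ),
    "user_name": ["abc", "wdt", "sdf", "dfe", "dsd", "erw", "fds"],
    "number_of_apples": [1, 2, 3, 4, 5, 6, 2],
})

# Group by 2-day bin first, then by apple count inside each bin,
# and collect the matching user names into a list.
users_by_chunk_and_apples = (
    df.groupby([pd.Grouper(key="action_date", freq="2D"), "number_of_apples"])
      ["user_name"]
      .apply(list)
)

print(users_by_chunk_and_apples)
```

The result is a Series indexed by (chunk start date, number_of_apples), so the users for a given chunk and apple count can be looked up with .loc.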
