hope288
hope288

Reputation: 1445

Cumsum with groupby for date accumulation

I am trying to sum the number of categories for each month and cumulate the previous months. The group by needs to be by month,year, and category. I tried different ways of using cumsum but I just can't get it.

This is the df:

sample=[
{'name':11,'category':'A','year':2017,'month':1},
{'name':22,'category':'A','year':2017,'month':2},
{'name':33,'category':'B','year':2015,'month':1},
{'name':33,'category':'C','year':2017,'month':3},
{'name':33,'category':'B','year':2017,'month':8},
{'name':44,'category':'B','year':2016,'month':4},
{'name':44,'category':'A','year':2017,'month':6},
{'name':55,'category':'C','year':2016,'month':9},
{'name':55,'category':'C','year':2017,'month':5},
{'name':55,'category':'B','year':2017,'month':11}]

sample_df=pd.DataFrame(sample)

I grouped by month,year, and category and aggregated by month

sample_counts = sample_df.groupby(['month','year','category']).agg({
                                    'category': 'count',
                                }).rename(columns={'category':'category_count'}).reset_index()

The output of sample_counts is this:

month   year  category  category_count
1       2015     B         1
1       2017     A         1
2       2017     A         1
3       2017     C         1
4       2016     B         1
5       2017     C         1
6       2017     A         1
8       2017     B         1
9       2016     C         1
11      2017     B         1

So I want the output to add all the B's (for example) as the months progress

So by 11/2017 the 'B' sum should be 2.

I tried different variations of

cumul_df=sample_counts.groupby(['month','year','category']).sum().groupby(level=0).cumsum().reset_index()

I would like the FINAL output to look like this (only showing for 2017 below, but 2016 and 2018 would be the same logic and look and would be in the same table below):

Category    Month        Year       Cumulative_By_Category               
A            1            2017       1
A            2            2017       2
A            3            2017       2
A            4            2017       2
A            5            2017       2
A            6            2017       3    
A            7            2017       3 
A            8            2017       3
A            9            2017       3
A            10           2017       3
A            11           2017       3
A            12           2017       3
B            1            2017       0
B            2            2017       0
B            3            2017       0
B            4            2017       0
B            5            2017       0
B            6            2017       0
B            7            2017       0
B            8            2017       1
B            9            2017       1  
B            10           2017       1
B            11           2017       2
B            12           2017       2
C            1            2017       0
C            2            2017       0
C            3            2017       1
C            4            2017       1
C            5            2017       2
C            6            2017       2
C            7            2017       2
C            8            2017       2
C            9            2017       2
C            10           2017       2
C            11           2017       2
C            12           2017       2

Upvotes: 2

Views: 517

Answers (1)

Quang Hoang
Quang Hoang

Reputation: 150765

If I understand correctly, this is just cumcount:

sample_df['cat_count'] = sample_df.groupby(['year','category']).cumcount() + 1

Output:

   name category  year  month  cat_count
0    11        A  2017      1          1
1    22        A  2017      2          2
2    33        B  2015      1          1
3    33        C  2017      3          1
4    33        B  2017      8          1
5    44        B  2016      4          1
6    44        A  2017      6          3
7    55        C  2016      9          1
8    55        C  2017      5          2
9    55        B  2017     11          2

Upvotes: 1

Related Questions