astroboy
astroboy

Reputation: 197

how to cumulate the data by week wise in pandas - cumulative values

I have a sample dataframe ( Date format : YYYY-MM-DD )

Region   Area   Headquarter   Sales   Date              SalesPersonId     
 R1      A1        H1        2500    2020-01-02          AA
 R1      A1        H1        6000    2020-01-05          AA
 R1      A1        H1        8000    2020-01-11          AA
 R1      A1        H1        1000    2020-01-12          AA
 R1      A1        H1        2000    2020-01-16          AA
 R1      A1        H1        3000    2020-01-26          AA
 R1      A2        H2        1000    2020-01-03          BB
 R2      A2        H2        3000    2020-01-21          BB

The explanation of above table is sales created on different days in a month and salespersonId.

So I have to create

Region      Area        Headquarter       Sales         Till_week        SalesPersonId
 R1         A1            H1             8500            1              AA
 R1         A1            H1             17500           2              AA
 R1         A1            H1             19500           3              AA
 R1         A1            H1             22500           4              AA
 R1         A2            H2             1000            1              BB
 R1         A2            H2             4000            3              BB

The second table explanation is for example on 1st, 4th, 5th of a month, the sales is created. So it falls in first week. So summing it. On 8th, 9th, 11th of a month, the sales is created again which falls in second week, so summing the second week value and first week. Likewise the rest.

Please help me in solving this

Thanks in advance

Ps: I referred the other similar questions too,but I didn't help.

Upvotes: 1

Views: 815

Answers (2)

Mayank Porwal
Mayank Porwal

Reputation: 34046

Like this:

In [1307]: df.Date = pd.to_datetime(df.Date)
In [1309]: df['Till_Week'] = df['Date'].dt.week

In [1320]: res = df.groupby(['Region', 'Area','Headquarter', 'Till_Week', 'SalesPersonId'], as_index=False)['Sales'].sum()

In [1323]: cum_sum = res.groupby(['Region', 'Area'])['Sales'].cumsum().tolist()

In [1324]: res.Sales = cum_sum

In [1325]: res
Out[1325]: 
  Region Area Headquarter    Till_Week   SalesPersonId  Sales
0     R1   A1          H1            1            AA   8500
1     R1   A1          H1            2            AA  17500
2     R1   A1          H1            3            AA  19500
3     R1   A1          H1            4            AA  22500
4     R1   A2          H2            1            BB   1000
5     R2   A2          H2            4            BB   3000

Upvotes: 0

jezrael
jezrael

Reputation: 862581

Convert values to weeks by isocalendar, then aggregate sum and last cumulative sum:

df['Date'] = pd.to_datetime(df['Date'])
df['Till_week'] = df['Date'].dt.isocalendar().week

df1 = (df.groupby(['Region','Area','Headquarter','SalesPersonId','Till_week'])['Sales']
        .sum()
        .groupby(level=[0,1,2,3])
        .cumsum()
        .reset_index())

print (df1)
  Region Area Headquarter SalesPersonId  Till_week  Sales
0     R1   A1          H1            AA          1   8500
1     R1   A1          H1            AA          2  17500
2     R1   A1          H1            AA          3  19500
3     R1   A1          H1            AA          4  22500
4     R1   A2          H2            BB          1   1000
5     R2   A2          H2            BB          4   3000

Upvotes: 1

Related Questions