BLuta
BLuta

Reputation: 247

How to aggregate totals after every day

I have a table that looks like this.

      Date        Hospital Total Operations Errors
2001-10-01      Hospital A             1101      0
2001-10-01      Hospital B               32      0
2001-10-02      Hospital A             1184      0
2001-10-02      Hospital B               74      0
2001-10-03      Hospital A             1350      0
2001-10-03      Hospital B               72      0
2001-10-04      Hospital A             1364      0
2001-10-04      Hospital B              232      0
2001-10-05      Hospital A             1345      0
2001-10-05      Hospital B              212      0

And want to accrue totals for each hospital every day and add them to that specific day's row so that my final table can look like this.

      Date        Hospital Total Operations Errors Aggregated Operations Aggregated Errors
2001-10-01      Hospital A             1101      0                  1101                 0
2001-10-01      Hospital B               32      0                    32                 0
2001-10-02      Hospital A             1184      0                  2285                 0
2001-10-02      Hospital B               74      0                   106                 0
2001-10-03      Hospital A             1350      0                  3635                 0
2001-10-03      Hospital B               72      0                   178                 0
2001-10-04      Hospital A             1364      0                  4999                 0
2001-10-04      Hospital B              232      0                   410                 0
2001-10-05      Hospital A             1345      0                  6344                 0
2001-10-05      Hospital B              212      0                   622                 0

Is there a way to do that in python? Any assistance is truly appreciated.

Upvotes: 1

Views: 67

Answers (4)

Rob Raymond
Rob Raymond

Reputation: 31236

Simple use of transform() and concat results to original df

df = pd.concat([df, df.groupby(['Hospital']).transform("cumsum").add_suffix(" Cumulative")], axis=1)

Upvotes: 1

Celius Stingher
Celius Stingher

Reputation: 18377

You can use groupby() and cumsum() as I suggested previously in the comments:

df = pd.DataFrame({'Date':['2001-10-01','2001-10-01','2001-10-02','2001-10-02','2001-10-03','2001-10-03','2001-10-04','2001-10-04'],
                   'Hospital':['Hospital A','Hospital B','Hospital A','Hospital B','Hospital A','Hospital B','Hospital A','Hospital B'],
                   'Total Operations':[1101,32,1184,74,1350,72,1364,232],
                   'Errors':[0,0,0,0,0,0,0,0]})
df['Aggreated Operations'] = df.groupby(['Hospital'])['Total Operations'].cumsum()
df['Aggreated Erros'] = df.groupby(['Hospital'])['Errors'].cumsum()
print(df)

This will output:

         Date    Hospital  ...  Aggreated Operations  Aggreated Erros
0  2001-10-01  Hospital A  ...                  1101                0
1  2001-10-01  Hospital B  ...                    32                0
2  2001-10-02  Hospital A  ...                  2285                0
3  2001-10-02  Hospital B  ...                   106                0
4  2001-10-03  Hospital A  ...                  3635                0
5  2001-10-03  Hospital B  ...                   178                0
6  2001-10-04  Hospital A  ...                  4999                0
7  2001-10-04  Hospital B  ...                   410                0

Upvotes: 3

Chris
Chris

Reputation: 16172

df.join(df.sort_values(by=['Date','Hospital']).groupby(['Hospital']).cumsum(), rsuffix=' Aggregated')

Output

         Date    Hospital  Total Operations  Errors  Total Operations Aggregated  Errors Aggregated
0  2001-10-01  Hospital A              1101       0                         1101                  0
1  2001-10-01  Hospital B                32       0                           32                  0
2  2001-10-02  Hospital A              1184       0                         2285                  0
3  2001-10-02  Hospital B                74       0                          106                  0
4  2001-10-03  Hospital A              1350       0                         3635                  0
5  2001-10-03  Hospital B                72       0                          178                  0
6  2001-10-04  Hospital A              1364       0                         4999                  0
7  2001-10-04  Hospital B               232       0                          410                  0
8  2001-10-05  Hospital A              1345       0                         6344                  0
9  2001-10-05  Hospital B               212       0                          622                  0

Upvotes: 1

rhug123
rhug123

Reputation: 8778

This code should work.

df['Aggregated Operations'] = df.groupby('Hospital')['Total Operations'].cumsum()

Upvotes: 1

Related Questions