sum columns within a pandas dataframe

Question

I have a pandas DataFrame with columns of numbers indexed by datetime, like so:

                                  ABC  XYZ
2016-12-01 06:43:02.541287-06:00    2   11 
2016-12-01 06:43:06.341349-06:00   52    3
2016-12-01 06:43:11.745070-06:00   16    0 
2016-12-01 06:43:18.245247-06:00    9  100
2016-12-01 06:43:25.545450-06:00    0   75
2016-12-01 06:43:35.953279-06:00    4    0 
2016-12-01 06:43:45.553310-06:00   13    8 
2016-12-01 06:43:55.053600-06:00    3   12
2016-12-01 06:43:59.153890-06:00    6    9

I would like to "groupby" every 10 seconds and compute the sum of ABC and XYZ, and generate a new dataframe with those sums that looks like this:

                                  ABC  XYZ
2016-12-01 06:43:10.000000-06:00   54   14
2016-12-01 06:43:20.000000-06:00   25  100
2016-12-01 06:43:30.000000-06:00    0   75
2016-12-01 06:43:40.000000-06:00    4    0
2016-12-01 06:43:50.000000-06:00   13    8
2016-12-01 06:44:00.000000-06:00    9   21

i.e. the first two rows are within the same 10 second period, so for ABC I add 2+52 to get 54 and for XYZ I add 11+3 to get 14. Then I put these sums in the new DataFrame. My code is so horribly kludgey that I don't even want to include it. Any suggestions would be much appreciated.

jezrael · Accepted Answer

Use resample with sum:

df = df.resample('10S').sum()
print (df)
                     ABC  XYZ
2016-12-01 12:43:00   54   14
2016-12-01 12:43:10   25  100
2016-12-01 12:43:20    0   75
2016-12-01 12:43:30    4    0
2016-12-01 12:43:40   13    8
2016-12-01 12:43:50    9   21

But if need work with timezones, first tz_localize to default UTC and then convert to -6 timezone (like US/Indiana-Starke) by tz_convert:

df = df.tz_localize('UTC').resample('10S').sum().tz_convert('US/Indiana-Starke')
print (df)
                           ABC  XYZ
2016-12-01 06:43:00-06:00   54   14
2016-12-01 06:43:10-06:00   25  100
2016-12-01 06:43:20-06:00    0   75
2016-12-01 06:43:30-06:00    4    0
2016-12-01 06:43:40-06:00   13    8
2016-12-01 06:43:50-06:00    9   21

sum columns within a pandas dataframe

Answers (1)

Related Questions