Reputation: 37
i want to sum two columns independently aggregated weekly(on Monday) sum by id and date. .
df = pd.DataFrame({'id':['x2', 'x2', 'x1', 'x1', 'x1'],
'date':['2021-01-03','2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
'distance_europe':[100, 100, 200, 200, 100],
'distance_USA': [0, 200, 100, 100, 3]})
expected output
2020-12-28 x2 100 0
2020-12-28 x1 500 203
2021-01-04 x2 100 200
update ---- My solution below works finely ---------
df = df.groupby('id').resample('W', on='date').sum().reset_index()
df['due_date'] = df.date - pd.Timedelta('6D')
Upvotes: 1
Views: 350
Reputation: 862591
Use Grouper
with closed='left'
and label='left'
with id
and aggregate sum
:
df = pd.DataFrame({'id':['x2', 'x2', 'x1', 'x1', 'x1'],
'date':['2021-01-03','2021-01-09', '2021-01-02', '2021-01-01', '2021-01-01'],
'distance_europe':[100, 100, 200, 200, 100],
'distance_USA': [0, 200, 100, 100, 3]})
df['date'] = pd.to_datetime(df['date'])
df = (df.groupby([pd.Grouper(key='date', freq='W-Mon', closed='left', label='left'), 'id'])
.sum()
.reset_index())
print (df)
date id distance_europe distance_USA
0 2020-12-28 x1 500 203
1 2020-12-28 x2 100 0
2 2021-01-04 x2 100 200
Your solution is change for grouping from by Sundays
(for W is default Sundays)
to Mondays
with same parameters like above:
df = df.groupby('id').resample('W-Mon', on='date', closed='left', label='left').sum().reset_index()
print (df)
id date distance_europe distance_USA
0 x1 2020-12-28 500 203
1 x2 2020-12-28 100 0
2 x2 2021-01-04 100 200
Upvotes: 1