Reputation: 63
I have weekly data grouped by region. I'm trying to figure out how to sum a set of rows based on a condition for each region. For example:
Region | Week | Year | value
------------------------------
R1 | 53 | 2016 | 10
R1 | 1 | 2017 | 8
R2 | 53 | 2017 | 10
R2 | 1 | 2018 | 17
R3 | 53 | 2018 | 30
R3 | 1 | 2019 | 1
I would like to add the value of week 53 of each year to the first week of the following year, so that the data becomes:
Region | Week | Year | value
------------------------------
R1 | 1 | 2017 | 18
R2 | 1 | 2018 | 27
R3 | 1 | 2019 | 31
Thanks.
Upvotes: 1
Views: 4198
Reputation:
agg can be very useful here. If each region contains only the week-53 row and the following week-1 row, as in the example, try this:
df = df.groupby('Region', as_index=False).agg({'Year':'max', 'value':'sum'})
Output:
>>> df
Region Year value
0 R1 2017 18
1 R2 2018 27
2 R3 2019 31
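The output above drops the Week column that the question asks for. A minimal sketch that keeps it, using named aggregation on a frame rebuilt from the question's sample data, might look like the following. It assumes, as above, that each region holds only the week-53 row and the following week-1 row, so taking the minimum of Week yields 1.

```python
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Region": ["R1", "R1", "R2", "R2", "R3", "R3"],
    "Week":   [53, 1, 53, 1, 53, 1],
    "Year":   [2016, 2017, 2017, 2018, 2018, 2019],
    "value":  [10, 8, 10, 17, 30, 1],
})

# Named aggregation: one output column per keyword argument.
# Assumption: each region contains only the (week 53, week 1) pair.
out = df.groupby("Region", as_index=False).agg(
    Week=("Week", "min"),    # week 1 of the following year
    Year=("Year", "max"),    # the later of the two years
    value=("value", "sum"),  # week 53 + week 1
)
print(out)
```

If a region has other weeks as well, this collapses them all into one row, so a different grouping key would be needed.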
Upvotes: 5
Reputation: 26676
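Combine Year and Week into a value that can be parsed as a date, so that week 53 rolls over into January of the following year.
Then extract the year and ISO week from that date, group by them, and sum:
import numpy as np
import pandas as pd
# Build YYYYWWd (d = 0, i.e. Sunday) and parse it; week 53 lands in the next year
s=pd.to_datetime(df.Year * 1000 + df.Week * 10 + 0, format='%Y%W%w')
df=(df.assign(Year=np.where(df['Week']==53,s.dt.year, df['Year']),
Week=np.where(df['Week']==53,s.dt.isocalendar().week, df['Week']))
.groupby(['Region', 'Year', 'Week']).agg('sum'))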
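The same roll-forward can also be sketched without any date parsing, if it is acceptable to assume (as the question states) that week 53 always belongs with week 1 of the following year. The names is_53, rolled and result are illustrative only, and the frame is rebuilt from the question's sample data.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Region": ["R1", "R1", "R2", "R2", "R3", "R3"],
    "Week":   [53, 1, 53, 1, 53, 1],
    "Year":   [2016, 2017, 2017, 2018, 2018, 2019],
    "value":  [10, 8, 10, 17, 30, 1],
})

# Assumption: every week-53 row is relabelled as week 1 of the next year
is_53 = df["Week"].eq(53)
rolled = df.assign(
    Year=np.where(is_53, df["Year"] + 1, df["Year"]),
    Week=np.where(is_53, 1, df["Week"]),
)

# Rows that now share Region/Year/Week are summed together;
# any other weeks would stay in their own groups.
result = rolled.groupby(["Region", "Year", "Week"], as_index=False)["value"].sum()
print(result)
```

Unlike grouping on Region alone, this keeps Year and Week in the grouping key, so it should still behave sensibly when a region has more than two rows.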
Upvotes: 1