Reputation: 2929
I have a df with daily data and some levels per day:
date | value1 | value2 | level
2020-01-01 | 1 | 2 | "a"
2020-01-01 | 3 | 10 | "b"
2020-01-01 | 2 | 3 | "c"
2020-01-02 | 1 | 2 | "a"
2020-01-02 | 3 | 10 | "b"
2020-01-02 | 2 | 3 | "c"
... | ... | ... | ...
2021-02-01 | 10 | 1 | "a"
2021-02-01 | 8 | 4 | "b"
2021-02-01 | 1 | 5 | "c"
2021-02-03 | 10 | 1 | "a"
2021-02-03 | 8 | 4 | "b"
2021-02-03 | 1 | 5 | "c"
I need the sum per month of value1 and value2 in a new column while keeping the daily rows, like:
date | value1 | value2 | level | value1_permonth | value2_permonth
2020-01-01 | 1 | 2 | "a" | 12 | 30
2020-01-01 | 3 | 10 | "b" | 12 | 30
2020-01-01 | 2 | 3 | "c" | 12 | 30
2020-01-02 | 1 | 2 | "a" | 12 | 30
2020-01-02 | 3 | 10 | "b" | 12 | 30
2020-01-02 | 2 | 3 | "c" | 12 | 30
... | ... | ... | ... | ... | ...
2021-02-01 | 10 | 1 | "a" | 38 | 20
2021-02-01 | 8 | 4 | "b" | 38 | 20
2021-02-01 | 1 | 5 | "c" | 38 | 20
2021-02-03 | 10 | 1 | "a" | 38 | 20
2021-02-03 | 8 | 4 | "b" | 38 | 20
2021-02-03 | 1 | 5 | "c" | 38 | 20
How can I do this with pandas?
Upvotes: 1
Views: 224
Reputation: 862581
Use Grouper
with GroupBy.transform
for new columns filled by aggregate values:
cols = ['value1','value2']
df1 = df.groupby(pd.Grouper(freq='MS', key='date'))[cols].transform('sum')
cols = ['value1','value2']
df1 = df.resample('MS', on='date')[cols].transform('sum')
Or use monthle periods by Series.dt.to_period
passed to groupby
:
cols = ['value1','value2']
df1 = df.groupby(df['date'].dt.to_period('m'))[cols].transform('sum')
print (df1)
df2 = df.join(df1.add_suffix('_permonth'))
print (df2)
date value1 value2 level value1_permonth value2_permonth
0 2020-01-01 1 2 a 12 30
1 2020-01-01 3 10 b 12 30
2 2020-01-01 2 3 c 12 30
3 2020-01-02 1 2 a 12 30
4 2020-01-02 3 10 b 12 30
5 2020-01-02 2 3 c 12 30
6 2021-02-01 10 1 a 38 20
7 2021-02-01 8 4 b 38 20
8 2021-02-01 1 5 c 38 20
9 2021-02-03 10 1 a 38 20
10 2021-02-03 8 4 b 38 20
11 2021-02-03 1 5 c 38 20
Upvotes: 1