Reputation: 587
I have created the below pivotable in pandas
>>> out
Year 2021 2022 2023
Month Feb Mar Sep Oct Dec Jan Jun Aug Oct Jun Sep Nov Dec
ID
1 0 8 1.5 6.5 6 8 8 2 7.0 9 9 3 0
2 4 4 0.0 0.0 0 0 0 2 8.5 0 0 0 3
Code:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
months = pd.CategoricalDtype(months, ordered=True)
rng = np.random.default_rng(2023)
df = pd.DataFrame({'ID': rng.integers(1, 3, 20),
'Year': rng.integers(2021, 2024, 20),
'Month': rng.choice(months.categories, 20),
'Value': rng.integers(1, 10, 20)})
out = (df.astype({'Month': months})
.pivot_table(index='ID', columns=['Year', 'Month'], values='Value',
aggfunc='mean', fill_value=0))
Now I would like add a total column after each year:
Year 2021 Total 2022 Total 2023 Total
Month Feb Mar Sep Oct Dec Jan Jun Aug Oct Jun Sep Nov Dec
ID
1 0 8 1.5 9.5 6.5 6 8 8 2 31.5 7.0 9 9 3 0 28
2 4 4 0.0 8 0.0 0 0 0 2 2 8.5 0 0 0 3 11.5
How could I get this? Thanks!
Upvotes: 0
Views: 32
Reputation: 120391
Append a new category to months
:
months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec',
'Total'] # extra columns here before pivot
months = pd.CategoricalDtype(months, ordered=True)
# Pivot here
out = ...
tot = out.groupby(level='Year', axis=1).sum()
tot.columns = pd.MultiIndex.from_product([tot.columns, pd.Index(['Total'], name='Month', dtype=months)])
out = pd.concat([out, tot], axis=1).sort_index(axis=1)
Output:
>>> out
Year 2021 2022 2023
Month Feb Mar Sep Oct Dec Total Jan Jun Aug Oct Total Jun Sep Nov Dec Total
ID
1 0 8 1.5 6.5 6 22.0 8 8 2 7.0 25.0 9 9 3 0 21.0
2 4 4 0.0 0.0 0 8.0 0 0 2 8.5 10.5 0 0 0 3 3.0
Upvotes: 1
Reputation: 154
You can do:
df_year = df.groupby(by = ['Year', 'ID']).sum(numeric_only = True)
Upvotes: 0