Reputation: 3
I have a pandas DataFrame with some independent columns, and I'm looking for an efficient way to unwind / aggregate them.
So, let's say I have the table:
+-----+-----+-------+------+-------+
| One | Two | Three | Four | Count |
+-----+-----+-------+------+-------+
| a | x | y | y | 3 |
+-----+-----+-------+------+-------+
| b | z | x | x | 5 |
+-----+-----+-------+------+-------+
| c | y | x | y | 1 |
+-----+-----+-------+------+-------+
Where rows Two, Three and Four are independent.
I would like to end up with the table:
+-----+-------+-------+
| One | Other | Count |
+-----+-------+-------+
| a | x | 3 |
+-----+-------+-------+
| a | y | 6 |
+-----+-------+-------+
| b | x | 10 |
+-----+-------+-------+
| b | z | 5 |
+-----+-------+-------+
| c | x | 1 |
+-----+-------+-------+
| c | y | 2 |
+-----+-------+-------+
How would be the best way to achieve this?
Upvotes: 0
Views: 1287
Reputation: 215107
You can use melt
function from pandas
to reshape your data frame from wide to long format then groupby the One
and Other
columns and sum
the Count
column:
import pandas as pd
pd.melt(df, id_vars = ['One', 'Count'], value_name = 'Other').groupby(['One', 'Other'])['Count'].sum().reset_index()
One Other Count
0 a x 3
1 a y 6
2 b x 10
3 b z 5
4 c x 1
5 c y 2
Upvotes: 1