Reputation: 4842
I am new to Pandas
My DataFrame:
df
A B C 1 2 3 4 5 6 7 8 9
5 2 4 True False False True False True False True False
2 2 1 True True False False False True False True False
5 4 7 False False True False True True False True True
4 4 1 False True False False False True False True True
2 0 8 False False True False True True False True True
My goal:
To calculate sum per cateogory 1-9
and columns A
, B
, C
.
So that I could answer these kidn of questions:
What is the sum
of column A
values where column 1
is True
, what is the sum
of C
where column 5
is True
.
In reality, I have about 50 categories 1-50
and I want to know if there is a smart way of calculating these sums without having to have this kind of line 50 times:
df['Sum of A where 1 is True'] = df.A.where(df.1)).sum()
and so on.
Thank you for your suggestions.
Upvotes: 2
Views: 172
Reputation: 150825
IIUC, this is just matmul
:
# replace your columns accordingly
df[list('123456789')].T @ df[list('ABC')]
Output:
A B C
1 7 4 5
2 6 6 2
3 7 4 15
4 5 2 4
5 7 4 15
6 18 12 21
7 0 0 0
8 18 12 21
9 11 8 16
Upvotes: 0
Reputation: 863731
You can use DataFrame.melt
with filtering by True
s with DataFrame.pop
for extract column and then aggregate sum
:
df = (df.melt(['A','B','C'], var_name='Type', value_name='mask')
.loc[lambda x: x.pop('mask')]
.groupby('Type')
.sum())
print (df)
A B C
Type
1 7 4 5
2 6 6 2
3 7 4 15
4 5 2 4
5 7 4 15
6 18 12 21
8 18 12 21
9 11 8 16
Upvotes: 2