Reputation: 5279
I have dataframe like this
A B C
0 1 7 a
1 2 8 b
2 3 9 c
3 4 10 a
4 5 11 b
5 6 12 c
I would like to get groupby result (key=column C) below;
A B
d 12 36
"d" means a or b ,
so I would like to groupby only with "a" and "b".
and then put together as "d".
when I sum up with all the key elements then drop, it consume much time....
Upvotes: 0
Views: 172
Reputation: 214927
One option is to use pandas
where
to transform the C column so that where it was a
or b
becomes d
and then you can groupby the transformed column and do the normal summary on it, and if rows with c
is not desired, you can simply drop it after the summary:
df_sum = df.groupby(df.C.where(~df.C.isin(['a', 'b']), "d")).sum().reset_index()
df_sum
# C A B
#0 c 9 21
#1 d 12 36
df_sum.loc[df_sum.C == "d"]
# C A B
#1 d 12 36
To see more clearly how the where
clause works:
df.C.where(~df.C.isin(['a','b']), 'd')
# 0 d
# 1 d
# 2 c
# 3 d
# 4 d
# 5 c
# Name: C, dtype: object
It acts like a replace method and replace a
and b
with d
which will be grouped together when passed to groupby
function.
Upvotes: 1