Reputation: 7458
I am trying to create a new column, amount_0_flag, for a df. The values in that column are based on a groupby on another column, key: if the sum of amount within a group is 0, amount_0_flag is assigned True, otherwise False. The df looks like this:
key amount amount_0_flag negative_amount
1 1.0 True False
1 1.0 True True
2 2.0 False True
2 3.0 False False
2 4.0 False False
So with df.groupby('key'), the cluster with key=1 should be assigned True in amount_0_flag for each of its rows, since within that cluster one row has negative 1 and the other has positive 1 as its amount.
df.groupby('key')['amount'].sum() only gives the sum of amount for each cluster, without considering the values in negative_amount. I am wondering how to also find the clusters, and their rows, whose amounts sum to 0 when the negative_amount values are taken into account, using pandas/numpy.
Upvotes: 1
Views: 71
Reputation: 153500
Let's try this, where I created a 'new_column' to compare against your 'amount_0_flag':
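import numpy as np
# Negate amount where negative_amount is True, then flag keys whose signed sum is exactly 0
df['new_column'] = (df.assign(amount_n=df.amount * np.where(df.negative_amount, -1, 1))
                      .groupby('key')['amount_n']
                      .transform(lambda x: x.sum() == 0))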
Output:
key amount amount_0_flag negative_amount new_column
0 1 1.0 True False True
1 1 1.0 True True True
2 2 2.0 False True False
3 2 3.0 False False False
4 2 4.0 False False False
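For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample frame from the question and uses transform('sum') rather than a lambda. The use of np.isclose is my own assumption, added only to tolerate floating-point rounding in the sums; it is not part of the question. The intermediate names signed and group_sum are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "key": [1, 1, 2, 2, 2],
    "amount": [1.0, 1.0, 2.0, 3.0, 4.0],
    "negative_amount": [False, True, True, False, False],
})

# Signed amount: keep the value where negative_amount is False, negate it otherwise.
signed = df["amount"].where(~df["negative_amount"], -df["amount"])

# Per-key signed sum, broadcast back to every row of the group.
group_sum = signed.groupby(df["key"]).transform("sum")

# Assumption: compare with a tolerance instead of == 0 to absorb float rounding.
df["amount_0_flag"] = np.isclose(group_sum, 0.0)

print(df)
```

If the amounts are exact (integers, say), group_sum.eq(0) would do the same job without the tolerance.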
Upvotes: 2