Reputation: 31
Please help, I need to fix a column in my df that assigns a value (a
or b
) based on its group. If the group sums roughly to 0 (anywhere between -50 to 100 accepted), it is coded 'a'
. If not it is coded 'b'
.
My columns are: ['Assets', 'Value']
Asset Value
0 2000 500
1 2000 -500
2 3000 1000
3 3000 1000
4 4000 3000
5 4000 -100
6 5000 -100
7 5000 9000
so far I've done this:
df['Sum_Assets'] = df['Value'].groupby(df['Asset']).transform('sum')
Then I've done:
(df['Sum_Assets'] > -50) & (df['Sum_Assets'] <= 100),
(df['Sum_Assets'] <= -50),
(df['Sum_Assets'] > 100 ),
]
# create a list of the values to assign for each condition
values = ['a', 'b', 'b']
# create new column and assign values to it
df['Action'] = np.select(conditions, values)
However, I only need to do this if there is a negative number in the 'value'
column for each asset group. For instance with the expected output below: Assets grouped with asset code = 3000 (index 2 and 3) would not be assigned anything as it has no negative number under column 'values'
Expected Output:
Asset Value Action
0 2000 500 a
1 2000 -500 a
2 3000 1000
3 3000 1000
4 4000 3000 b
5 4000 -100 b
6 5000 -100 b
7 5000 9000 b
Upvotes: 1
Views: 209
Reputation: 13349
It should be:
df.Asset = df.Asset.astype(int)
df.Value = df.Value.str.replace(',', '').astype(float).astype(int)
df['Sum_Assets'] = df.groupby('Asset')['Value'].transform('sum')
conditions = [
(df['Sum_Assets'].gt(50)) & (df['Sum_Assets'].le(100)),
(df['Sum_Assets'].le(-50)) | (df['Sum_Assets'].gt(100))
]
values = ['a', 'b']
df['Action'] = np.where(df.Value<0, None,np.select(conditions, values))
Upvotes: 1