Reputation: 35
My dataframe looks like this:
Person Number
BOB 1.23
BOB -1.23
Jeff 18.01
Jack 82.04
Jack 14.02
Jeff -18.01
what I want to do is look at the Persons columns and then say if there are multiple people under the name BOB and we know that BOB is unique so its the same person. I want to then look at the Number column and if their is a negative value present for BOB as well as a Postive I want to change both numbers to Zero. Furthermore, for cases like Jack where there are two positive numbers I dont want to do anything for him. So all I want to do is find where the Person is the same and if their is a presence of a negative value and a positive value under the same name then change both numbers to 0
So the final dataframe would look like such...
Person Number
BOB 0
BOB 0
Jeff 0
Jack 82.04
Jack 14.02
Jeff 0
Another issue I am having is that Python isn't detecting the negatives appropriately so it changes negatives to look like this ($4.00) instead of keeping it like -1.23. I am just not sure how I can achieve the designated answer so any help would be greatly appreciated, thank you
Upvotes: 1
Views: 2619
Reputation: 836
You can find those Persons with positive values, negative values and then the intersection.
pos = set(df.loc[df.Number > 0, 'Person'])
neg = set(df.loc[df.Number < 0, 'Person'])
intersection = pos & neg
And then set Number = 0 where Person is in this intersection.
df.loc[df.Person.isin(intersection), 'Number'] = 0
df
Person Number
0 BOB 0.00
1 BOB 0.00
2 Jeff 0.00
3 Jack 82.04
4 Jack 14.02
5 Jeff 0.00
Note that the output is different from what you've posted as the expected output, but to me it seems there's a bit of contradiction between what you've asked and the expected output. I've gone with:
So all I want to do is find where the Person is the same and if their is a presence of a negative value and a positive value under the same name then change both numbers to 0
Upvotes: 4
Reputation: 323226
Let us do it by using filter
and .loc
df1=df.groupby('Person').filter(lambda x : x.Number.lt(0).any())
df.loc[df1.index,"Number"]=0
df
Out[608]:
Person Number
0 BOB 0.00
1 BOB 0.00
2 Jeff 0.00
3 Jack 82.04
4 Jack 14.02
5 Jeff 0.00
Or using isin
df.loc[df.Person.isin(df.loc[df.Number<0,'Person'].tolist()),'Number']=0
df
Out[616]:
Person Number
0 BOB 0.00
1 BOB 0.00
2 Jeff 0.00
3 Jack 82.04
4 Jack 14.02
5 Jeff 0.00
Upvotes: 0