Reputation: 326
I have a df with multiple columns and values. Say:
ID | Name | Cost |
---|---|---|
123 | Jo | $10 |
345 | Bella | $20 |
567 | IgnoreMe | $5000 |
I also have a defined list of names to ignore. In this example, it contains one value, but it can have more.
names_to_ignore = ['ignoreme']
The goal is to replace all cost values with null when Name is in the ignore list.
I tried:
#aligning conventions
df = df.apply(lambda var: var.lower())
ignore_set = [x.lower() for x in ignore_set]
#ignoring
df.loc[df['Name'] in ignore_set, 'Cost'] = ''
But it didn't work. I get:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Any thoughts?
Upvotes: 0
Views: 43
Reputation: 1247
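You can try np.where() with a boolean mask. Note the column is 'Cost', not 'cost', and the names are lower-cased first so they match the entries in names_to_ignore:
#import numpy as np
df['Cost'] = np.where(df['Name'].str.lower().isin(names_to_ignore), None, df['Cost'])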
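For reference, here is a minimal self-contained sketch of np.where() driven by a boolean mask. It assumes the sample frame from the question, rebuilt by hand below, and the variable names `is_ignored` and `raised` are only illustrative. It also shows why `df['Name'] in ignore_set` raises: `in` needs a single True/False, while `isin()` returns one boolean per row.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "ID": [123, 345, 567],
    "Name": ["Jo", "Bella", "IgnoreMe"],
    "Cost": ["$10", "$20", "$5000"],
})
names_to_ignore = ["ignoreme"]

# `in` tries to collapse a whole Series into one bool, which is ambiguous.
raised = False
try:
    df["Name"] in names_to_ignore
except ValueError:
    raised = True

# isin() is element-wise: one True/False per row.
# Lower-casing the names first makes the match case-insensitive.
is_ignored = df["Name"].str.lower().isin(names_to_ignore)

# Keep the original cost where the mask is False, blank it where True.
df["Cost"] = np.where(is_ignored, None, df["Cost"])
print(df)
```

Only the row whose name matches the ignore list loses its cost; the other rows are left as they were.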
Upvotes: 0
Reputation: 24324
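isin() is case-sensitive, so list every spelling you want to match:
names_to_ignore = ['ignoreme','IgnoreMe']
Then build a boolean mask and assign through loc: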
c=df['Name'].isin(names_to_ignore) #checking if this condition satisfies or not
df.loc[c,'Cost']=float('NaN')
OR
via np.where():
#import numpy as np
df['Cost']=np.where(c,np.nan,df['Cost']) #third argument must be the original column, not the mask
OR
via mask():
df['Cost']=df['Cost'].mask(c)
OR
via where():
df['Cost']=df['Cost'].where(~c)
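A quick way to check these variants side by side, as a sketch only: the frame below is rebuilt by hand from the question, and the names `via_loc`, `via_mask` and `via_where` are illustrative. Lower-casing both sides before isin() is an alternative to listing every spelling in names_to_ignore.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to be plain strings).
df = pd.DataFrame({
    "ID": [123, 345, 567],
    "Name": ["Jo", "Bella", "IgnoreMe"],
    "Cost": ["$10", "$20", "$5000"],
})
names_to_ignore = ["ignoreme"]

# Normalise both sides so a single lower-case entry is enough.
ignore_lower = [name.lower() for name in names_to_ignore]
c = df["Name"].str.lower().isin(ignore_lower)

# Variant 1: assign through loc on a copy, leaving df untouched.
via_loc = df.copy()
via_loc.loc[c, "Cost"] = float("nan")

# Variant 2: mask() blanks the rows where the condition is True.
via_mask = df["Cost"].mask(c)

# Variant 3: where() keeps the rows where the condition is True,
# so the condition has to be inverted.
via_where = df["Cost"].where(~c)

print(via_loc)
```

All three blank the same row. mask() and where() return a new Series, so assign the result back to df['Cost'] if you want to keep it.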
Upvotes: 1