Reputation: 25
I want to loop over all the rows in a df, checking that two conditions hold and, if they do, replace the value in a column with something else. I've attempted to do this two different ways:
if (sales.iloc[idx]['shelf'] in ("DRY BLENDS","LIQUID BLENDS")) & np.isnan(sales.iloc[idx]['traceable_blend']):
    sales.iloc[idx]['traceable_blend'] = False
and:
if (sales.iloc[idx]['shelf'] in ("DRY BLENDS","LIQUID BLENDS")) & (sales.iloc[idx]['traceable_blend'] == np.NaN):
    sales.iloc[idx]['traceable_blend'] = False
Using print statements, I've verified that the if branch is actually reached, but no assignment ever takes place. After the loop has run, the 'traceable_blend' column contains True and NaN values, but never False. Somehow the assignment is failing.
It looks like this might've worked:
if (sales.iloc[idx]['shelf'] in ("DRY BLENDS","LIQUID BLENDS")) & np.isnan(sales.iloc[idx]['traceable_blend']):
    sales.at[idx, 'traceable_blend'] = False
But I would still like to understand what's happening.
Upvotes: 0
Views: 96
Reputation: 476
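Pandas offers two functions for checking for missing data (NaN or None): isnull() and notnull(), also available under the names isna() and notna(). Called on a Series they return a boolean mask of the same length; called on a single value, as in pd.isnull(value), they return a single boolean. I suggest trying these instead of np.isnan().
You can also determine whether any value is missing in your Series by chaining: sales['traceable_blend'].isnull().values.any()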
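To illustrate the point about missing-value checks, here is a minimal sketch. The Series s and the variable names are made up for the example and are not from the question. It also shows why the second attempt in the question can never succeed: NaN compares unequal to everything, including itself. (As a side note, the np.NaN spelling was removed in NumPy 2.0, so np.nan is used here.)

```python
import numpy as np
import pandas as pd

value = np.nan

# An equality test against NaN is always False, even for NaN itself.
equals_nan = (value == np.nan)

# pd.isna (alias pd.isnull) is the reliable scalar check; it also accepts
# None, whereas np.isnan(None) raises a TypeError.
is_missing = pd.isna(value)
none_is_missing = pd.isna(None)

# Hypothetical column mixing booleans and missing values, like 'traceable_blend'.
s = pd.Series([True, np.nan, np.nan], dtype=object)

mask = s.isna()               # element-wise boolean mask
any_missing = s.isna().any()  # True if at least one entry is missing
```

On a Series, s.isna().any() gives the same answer as s.isnull().values.any(); the .values step is optional.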
Upvotes: 0
Reputation: 150735
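This, sales.iloc[idx]['traceable_blend'] = False, is chained indexing, and it will almost never work. sales.iloc[idx] first builds a new Series for that row (a copy, when the columns have different dtypes), and the assignment then changes that temporary object rather than sales. That is why sales.at[idx, 'traceable_blend'] = False works: it sets the value on the DataFrame itself in a single step. In fact, you don't need to loop: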
blends = sales['shelf'].isin(['DRY BLENDS', 'LIQUID BLENDS'])
sales.loc[blends & sales['traceable_blend'].isna(), 'traceable_blend'] = False
Upvotes: 1