Reputation: 207
I want to change NaN
values from one column based on the value of another column.
First table
A | B |
---|---|
2 | 3 |
0 | NaN |
2 | NaN |
4 | 2 |
Targeted table
A | B |
---|---|
2 | 3 |
0 | 0 |
2 | NaN |
4 | 2 |
So I want to change NaN to 0, if the value of column A in that row is also 0.
The python code that I create:
for i in range(len(df)):
if df.loc[i,'A'] == 0 and math.isnan(df.loc[i,'B']) = True:
df.loc[i,'B'] = 0
But this code seems so slow, because I want to do this for 20 million rows. Is there any faster way to do this?
Upvotes: 2
Views: 81
Reputation: 41437
First combine the conditions using &
(element-wise and
operator):
conditions = df['B'].isnull() & df['A'].eq(0)
# 0 False
# 1 True
# 2 False
# 3 False
# dtype: bool
Then you can vectorize the assignment (no loop):
Either use a single loc
:
df.loc[conditions, 'B'] = 0
Or mask
the values that satisfy those conditions with 0:
df['B'] = df['B'].mask(conditions, 0)
Output:
# A B
# 0 2 3.0
# 1 0 0.0
# 2 2 NaN
# 3 4 2.0
Upvotes: 2