el-cheapo
el-cheapo

Reputation: 207

Changing NaN values to 0 based on the value on other column in pandas DataFrame

I want to change NaN values from one column based on the value of another column.

First table

A B
2 3
0 NaN
2 NaN
4 2

Targeted table

A B
2 3
0 0
2 NaN
4 2

So I want to change NaN to 0, if the value of column A in that row is also 0.

The python code that I create:

for i in range(len(df)):
    if df.loc[i,'A'] == 0 and math.isnan(df.loc[i,'B']) = True:
        df.loc[i,'B'] = 0

But this code seems so slow, because I want to do this for 20 million rows. Is there any faster way to do this?

Upvotes: 2

Views: 81

Answers (1)

tdy
tdy

Reputation: 41437

First combine the conditions using & (element-wise and operator):

conditions = df['B'].isnull() & df['A'].eq(0)

# 0    False
# 1     True
# 2    False
# 3    False
# dtype: bool

Then you can vectorize the assignment (no loop):

  • Either use a single loc:

    df.loc[conditions, 'B'] = 0
    
  • Or mask the values that satisfy those conditions with 0:

    df['B'] = df['B'].mask(conditions, 0)
    

Output:

#    A    B
# 0  2  3.0
# 1  0  0.0
# 2  2  NaN
# 3  4  2.0

Upvotes: 2

Related Questions