ecjb

Reputation: 5449

If statement on a cell in pandas gives "The truth value of a Series is ambiguous"

I have a CSV file that I load into a pandas DataFrame in Python 3.7. I then want to check whether certain cells are NaN (i.e. empty, in my case) and, only in that case, replace the content of the cell with another value.

I select the cell using the values of other columns (family_name and first_name) in the same row. Here is an MWE:

import pandas as pd
import numpy as np
df = pd.DataFrame({"family_name":["smith", "duboule", "dupont"], "first_name":["john","jean-paul", "luc"], "weight":[70, 85, np.nan]})
value_to_replace = 90
# The comparison below returns a Series, which `if` cannot evaluate
if df["weight"][(df["family_name"] == "dupont") & (df["first_name"] == "luc")] == np.nan:
    df["weight"][(df["family_name"] == "dupont") & (df["first_name"] == "luc")] = value_to_replace

I get the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mymac/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 1576, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I also tried adding .bool() == True, in the following form, but I got the same error message:

if pd.isna(df["weight"][(df["family_name"] == family_name) & (df["first_name"] == first_name)]).bool() == True:
    df["weight"][(df["family_name"] == "dupont") & (df["first_name"] == "luc")] = value_to_replace

Upvotes: 0

Views: 344

Answers (2)

Terry

Reputation: 2811

Remove the if statement entirely and use this:

df.loc[ (df["family_name"] == "dupont") & 
        (df["first_name"] == "luc") & 
        (df["weight"].isnull()), 'weight'] = value_to_replace

I suggest reading the pandas documentation to learn how loc selects and edits data.
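For reference, here is the same idea as a self-contained sketch you can paste into a fresh interpreter. It reuses the data from the question; the intermediate name `mask` is only there for readability and is not required.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "family_name": ["smith", "duboule", "dupont"],
    "first_name": ["john", "jean-paul", "luc"],
    "weight": [70, 85, np.nan],
})
value_to_replace = 90

# Row-wise boolean mask: the right person AND a missing weight.
# `mask` is an illustrative name, not part of the original answer.
mask = (
    (df["family_name"] == "dupont")
    & (df["first_name"] == "luc")
    & df["weight"].isnull()
)

# A single .loc call picks the rows by mask and the column by label,
# so the assignment is written into df itself.
df.loc[mask, "weight"] = value_to_replace
```

Because the NaN test is part of the mask, rows that already have a weight are left alone and no `if` is needed at all.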

Upvotes: 1

Erfan

Reputation: 42886

Use np.where

It works as follows: np.where(condition, value if true, value if false)

df['weight'] = np.where((df.family_name == 'dupont') & (df.first_name == 'luc'), value_to_replace, df.weight)

print(df)
  family_name first_name  weight
0       smith       john    70.0
1     duboule  jean-paul    85.0
2      dupont        luc    90.0

Edit after OP's comment
To replace the value only if weight is NaN, add .isnull to the condition:

df['weight'] = np.where((df.family_name == 'dupont') & (df.first_name == 'luc') & (df.weight.isnull()), value_to_replace, df.weight)
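As a side note on why .isnull is needed here instead of comparing with np.nan: NaN never compares equal to anything, itself included, so `== np.nan` is False on every row. The comparison also returns a Series, which is what triggers the "truth value is ambiguous" error inside an `if`. A small sketch to illustrate both points (the names `weights`, `eq_result`, `isnull_result` and `raised` are made up for this example):

```python
import numpy as np
import pandas as pd

weights = pd.Series([70, 85, np.nan])

# Comparing with NaN is False everywhere, even on the NaN row.
eq_result = (weights == np.nan).tolist()

# .isnull() is the element-wise test that actually finds missing values.
isnull_result = weights.isnull().tolist()

# Using the Series directly in `if` raises the ValueError from the question.
try:
    if weights == np.nan:
        pass
    raised = False
except ValueError:
    raised = True
```

Here `eq_result` is all False, `isnull_result` is True only for the last row, and `raised` ends up True.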

Upvotes: 2
