Reputation: 5449
I have a CSV file that I'm loading into a pandas DataFrame in Python 3.7. I then want to check whether certain cells are NaN (i.e. empty, in my case) and, only in that case, replace the content of the cell with another value.
I select the cell using the values of other columns (family_name and first_name) on the same row. Here is an MWE:
import csv
import pandas as pd
import numpy as np
df = pd.DataFrame({"family_name":["smith", "duboule", "dupont"], "first_name":["john","jean-paul", "luc"], "weight":[70, 85, pd.np.nan]})
value_to_replace = 90
if df["weight"][(df["family_name"] == "dupont") & (df["first_name"] == "luc")] == pd.np.nan:
    df["weight"][(df["family_name"] == "dupont") & (df["first_name"] == "luc")] = value_to_replace
I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mymac/anaconda3/lib/python3.7/site-packages/pandas/core/generic.py", line 1576, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I also tried adding .bool() == True, in the following form, but I got the same error message:
if pd.isna(df["weight"][(df["family_name"] == family_name) & (df["first_name"] == first_name)]).bool() == True:
    df["weight"][(df["family_name"] == "dupont") & (df["first_name"] == "luc")] = value_to_replace
Upvotes: 0
Reputation: 2811
Remove the if statement entirely and use this instead:
df.loc[ (df["family_name"] == "dupont") &
(df["first_name"] == "luc") &
(df["weight"].isnull()), 'weight'] = value_to_replace
I suggest reading the pandas documentation to learn how loc selects and edits data.
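As a minimal sketch of why no if statement is needed, using the DataFrame from the question: each comparison produces a boolean Series, & combines them row by row, and loc assigns only to the rows where the combined mask is True. The NaN test uses isnull() because NaN never compares equal to itself, so == np.nan is False for every row. The name mask is only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "family_name": ["smith", "duboule", "dupont"],
    "first_name": ["john", "jean-paul", "luc"],
    "weight": [70, 85, np.nan],
})
value_to_replace = 90

# NaN is never equal to itself, so an equality test cannot detect it.
assert not (np.nan == np.nan)

# Row-wise boolean mask: True only where all three conditions hold.
mask = (
    (df["family_name"] == "dupont")
    & (df["first_name"] == "luc")
    & df["weight"].isnull()
)

# Assign only to the masked rows of the "weight" column.
df.loc[mask, "weight"] = value_to_replace
print(df)
```

If no row matches, the mask is all False and the assignment simply changes nothing, which is the behaviour the if statement was meant to provide.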
Upvotes: 1
Reputation: 42886
Use np.where.
It works as follows: np.where(condition, value_if_true, value_if_false)
df['weight'] = np.where((df.family_name == 'dupont') & (df.first_name == 'luc'), value_to_replace, df.weight)
print(df)
  family_name first_name  weight
0       smith       john    70.0
1     duboule  jean-paul    85.0
2      dupont        luc    90.0
Edit after OP's comment:
To replace the value only if weight is NaN, add .isnull() to the condition:
df['weight'] = np.where((df.family_name == 'dupont') & (df.first_name == 'luc') & (df.weight.isnull()), value_to_replace, df.weight)
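To illustrate the difference the extra .isnull() condition makes, here is a small sketch on hypothetical data (not the question's rows) in which the targeted person already has a weight. The names is_target, without_nan_check and with_nan_check are made up for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: same columns as the question, but "luc dupont"
# already has a weight, while two other rows are NaN.
df = pd.DataFrame({
    "family_name": ["smith", "dupont", "dupont"],
    "first_name": ["john", "luc", "marie"],
    "weight": [np.nan, 72.0, np.nan],
})
value_to_replace = 90

is_target = (df.family_name == "dupont") & (df.first_name == "luc")

# Without the NaN check, the existing 72.0 is overwritten.
without_nan_check = np.where(is_target, value_to_replace, df.weight)

# With the NaN check, the existing 72.0 is kept, and the NaN rows of
# other people are left untouched as well.
with_nan_check = np.where(is_target & df.weight.isnull(), value_to_replace, df.weight)

print(without_nan_check)
print(with_nan_check)
```

Note that np.where returns a NumPy array, so the result has to be assigned back to the column, as in the line above.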
Upvotes: 2