I have an issue trying to replace values using np.where without changing the NaN values as well.
Here is the code I used, but it changes the NaN values too, although I don't want them changed for now.
df["col"] = np.where(df["col"]>2, 1, 0)
I also tried using a mask, but that didn't work either.
df["col"] = np.ma.array(df["col"], mask=np.isnan(df["col"]))
df["col"] = np.where(df["col"]>2, 1, 0)
Is there a way to do that?
Upvotes: 0
Views: 249
Reputation: 23227
You can use .loc on column col to locate only the rows that are not NaN (.notna()) and apply np.where() to them, as follows:
df.loc[df['col'].notna(), 'col'] = np.where(df.loc[df['col'].notna(), 'col']>2, 1, 0)
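A minimal sketch of this .loc approach, run on a small sample frame (the data here is hypothetical, chosen only to show the NaN rows surviving):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing NaNs
df = pd.DataFrame({'col': [3, np.nan, 0, 2, np.nan]})

# Select only the non-NaN rows with .loc, apply np.where to just
# those rows, and write the 0/1 result back; NaN rows are untouched
mask = df['col'].notna()
df.loc[mask, 'col'] = np.where(df.loc[mask, 'col'] > 2, 1, 0)

print(df['col'].tolist())  # [1.0, nan, 0.0, 0.0, nan]
```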
Or use .mask() with .notna() and np.where(), as follows:
df['col'] = df['col'].mask(df['col'].notna(), np.where(df["col"]>2, 1, 0))
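The same sample frame through the .mask() variant; .mask replaces values where its condition is True, so the NaN rows (which fail .notna()) keep their original NaN even though np.where maps them to 0:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data containing NaNs
df = pd.DataFrame({'col': [3, np.nan, 0, 2, np.nan]})

# np.where produces 0/1 for every row (NaN > 2 is False, giving 0),
# but .mask only takes those values where .notna() is True
df['col'] = df['col'].mask(df['col'].notna(), np.where(df['col'] > 2, 1, 0))

print(df['col'].tolist())  # [1.0, nan, 0.0, 0.0, nan]
```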
Upvotes: 0
Reputation: 215117
Add another np.where condition to keep the NaNs:
np.where(df["col"] > 2, 1, np.where(np.isnan(df.col), np.nan, 0))
Or use pandas Series.where:
df.col.where(df.col.isnull(), (df.col > 2).astype(int))
df = pd.DataFrame({'col': [3, np.nan, 0, 2, np.nan]})
df
col
0 3.0
1 NaN
2 0.0
3 2.0
4 NaN
df['col'] = np.where(df["col"] > 2, 1, np.where(np.isnan(df.col), np.nan, 0))
df
col
0 1.0
1 NaN
2 0.0
3 0.0
4 NaN
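For completeness, the Series.where variant on the same sample frame gives the same result; .where keeps values where the condition is True (the NaN rows) and replaces the rest with the 0/1 integer comparison:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [3, np.nan, 0, 2, np.nan]})

# Keep NaN rows (isnull() is True there), replace everything else
# with the boolean comparison cast to int
df['col'] = df['col'].where(df['col'].isnull(), (df['col'] > 2).astype(int))

print(df['col'].tolist())  # [1.0, nan, 0.0, 0.0, nan]
```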
Upvotes: 0