user9364978

Replacing values in pandas with np.where without changing the NaN values

I am trying to replace values in a column using np.where, but the NaN values get replaced too.

Here is the code I used. It changes the NaN values as well, although I don't want them changed for now.

df["col"] = np.where(df["col"]>2, 1, 0)

I tried using a mask, but that didn't work either.

df["col"] = np.ma.array(df["col"], mask=np.isnan(df["col"]))
df["col"] = np.where(df["col"]>2, 1, 0)

Is there a way to do that?

Upvotes: 0

Views: 249

Answers (2)

SeaBean

Reputation: 23227

You can use .loc with .notna() to select only the rows of column col that are not NaN, and apply np.where() to just those rows, as follows:

df.loc[df['col'].notna(), 'col'] = np.where(df.loc[df['col'].notna(), 'col']>2, 1, 0)

Or use .mask() with .notna(), which replaces only the non-NaN entries with the np.where() result, as follows:

df['col'] = df['col'].mask(df['col'].notna(), np.where(df["col"]>2, 1, 0))
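For reference, here is a minimal runnable sketch of both variants side by side. The sample data and the names valid, out_loc and out_mask are only illustrative assumptions, not part of the question.

```python
import numpy as np
import pandas as pd

# Illustrative sample data (an assumption, chosen to include NaNs)
df = pd.DataFrame({"col": [3, np.nan, 0, 2, np.nan]})

# Variant 1: assign the np.where result only to the non-NaN rows
out_loc = df.copy()
valid = out_loc["col"].notna()
out_loc.loc[valid, "col"] = np.where(out_loc.loc[valid, "col"] > 2, 1, 0)

# Variant 2: Series.mask replaces entries where the condition is True,
# so NaN rows (condition False) keep their original value
out_mask = df.copy()
out_mask["col"] = out_mask["col"].mask(
    out_mask["col"].notna(),
    np.where(out_mask["col"] > 2, 1, 0),
)

print(out_loc)
print(out_mask)
```

Both variants should leave the NaN rows alone. The column stays float because NaN cannot be stored in a plain integer column.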

Upvotes: 0

akuiper

Reputation: 215117

The comparison NaN > 2 evaluates to False, which is why the NaNs end up as 0. Nest a second np.where condition to keep the NaNs:

np.where(df["col"] > 2, 1, np.where(np.isnan(df.col), np.nan, 0))

Or use pandas Series.where:

df.col.where(df.col.isnull(), (df.col > 2).astype(int))

df = pd.DataFrame({'col': [3, np.nan, 0, 2, np.nan]})
df
   col
0  3.0
1  NaN
2  0.0
3  2.0
4  NaN

df['col'] = np.where(df["col"] > 2, 1, np.where(np.isnan(df.col), np.nan, 0))
df
   col
0  1.0
1  NaN
2  0.0
3  0.0
4  NaN
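The Series.where variant can be checked the same way. This is a small sketch on the same sample data; the names s, flags and result are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col": [3, np.nan, 0, 2, np.nan]})
s = df["col"]

# 1 where the value is greater than 2, otherwise 0 (NaN compares as False)
flags = (s > 2).astype(int)

# Series.where keeps the original value where the condition is True
# (here: the NaN rows) and takes the value from `flags` everywhere else
result = s.where(s.isna(), flags)

print(result)
```

Note that Series.where returns a new Series, so assign it back (df['col'] = ...) if you want to update the column.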

Upvotes: 0
