Shawn

Reputation: 603

Replace string with numerical value while preserving NaN

I need to replace string values across a number of columns:

  1. The value to substitute changes per column
  2. I need to preserve existing NaN

I have a series of steps that seems like it should work, but does not: the 'inplace' step has no effect. Some dummy test code:

make a dataframe

import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2, np.nan, np.nan],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, 5],
                   [np.nan, 3, np.nan, 'foo']],
                  columns=list('ABCD'))

calculate substitute value, say from last column

special_value = pd.to_numeric(df['D'], errors='coerce').min() / 2
special_value
0.5

have a look

df

seems to work here

pd.to_numeric(df['D'].dropna(), errors='coerce').fillna(value=special_value) 
1    1.0
2    5.0
3    0.5
Name: D, dtype: float64

but no, with inplace=True the column is left unchanged

pd.to_numeric(df['D'].dropna(), errors='coerce').fillna(value=special_value, inplace=True)
df['D']
0    NaN
1      1
2      5
3    foo
Name: D, dtype: object
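
A likely explanation, sketched under the assumption that the goal is to write the filled values back into df: pd.to_numeric returns a new Series, so fillna with inplace=True only modifies that temporary object and df['D'] never sees the change. The names temporary and filled below are illustrative only, and writing back with df.loc is one possible fix, not necessarily the intended one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[np.nan, 2, np.nan, np.nan],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, 5],
                   [np.nan, 3, np.nan, 'foo']],
                  columns=list('ABCD'))
special_value = pd.to_numeric(df['D'], errors='coerce').min() / 2

# The in-place fill changes only this temporary Series, not df['D'].
temporary = pd.to_numeric(df['D'].dropna(), errors='coerce')
temporary.fillna(value=special_value, inplace=True)

# Assumed fix: keep the filled Series and write it back by index label.
# Row 0 was dropped before filling, so its NaN is left alone.
filled = pd.to_numeric(df['D'].dropna(), errors='coerce').fillna(value=special_value)
df.loc[filled.index, 'D'] = filled
```

After the last line, rows 1 to 3 of column D hold 1.0, 5.0 and 0.5, and row 0 is still NaN.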

Upvotes: 0

Views: 173

Answers (1)

LaChatonnn

Reputation: 159

If you use .fillna on the whole column, it is not going to preserve the existing NaN values.
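
To make that concrete, here is a small sketch using a stand-in Series for column D (the name col and the fill value 0.5 are just for illustration). Once the string has been coerced to NaN, it cannot be told apart from the NaN that was already there, so fillna fills both.

```python
import numpy as np
import pandas as pd

# Stand-in for column D of the question's dataframe.
col = pd.Series([np.nan, 1, 5, 'foo'], name='D')

# Coercion turns 'foo' into NaN, so it now looks the same as row 0.
numeric = pd.to_numeric(col, errors='coerce')

# fillna then fills both rows, including the NaN that should have been kept.
filled = numeric.fillna(0.5)
print(filled.tolist())  # [0.5, 1.0, 5.0, 0.5]
```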

Try this:

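def add_value(df, col):
    # Flag the integer entries; NaN (a float) and strings are excluded.
    condition = df[col].apply(lambda x: isinstance(x, int))
    # Substitute value: half the smallest integer in the column.
    sp_value = df[col][condition].min() / 2
    # Replace strings only, so existing NaN values pass through unchanged.
    df[col] = df[col].apply(lambda x: sp_value if isinstance(x, str) else x)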

The output is:

add_value(df, 'D')
df

     A    B   C    D
0  NaN  2.0 NaN  NaN
1  3.0  4.0 NaN  1.0
2  NaN  NaN NaN  5.0
3  NaN  3.0 NaN  0.5
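
Since the question mentions several columns with a different substitute per column, the same idea can be sketched without apply. This is only one possible approach: replace_strings is a hypothetical helper, it assumes the substitute is always half the column minimum (as in the question's example), and it treats any non-null entry that fails numeric conversion as a string to replace.

```python
import numpy as np
import pandas as pd

def replace_strings(df, cols):
    """Hypothetical helper: return a copy with strings in cols replaced by min/2."""
    out = df.copy()
    for col in cols:
        numeric = pd.to_numeric(out[col], errors='coerce')
        # Non-null originally, but NaN after coercion: these were the strings.
        was_string = out[col].notna() & numeric.isna()
        # Original NaN rows are not flagged, so they stay NaN.
        out[col] = numeric.mask(was_string, numeric.min() / 2)
    return out

df = pd.DataFrame([[np.nan, 2, np.nan, np.nan],
                   [3, 4, np.nan, 1],
                   [np.nan, np.nan, np.nan, 5],
                   [np.nan, 3, np.nan, 'foo']],
                  columns=list('ABCD'))

result = replace_strings(df, ['B', 'D'])
```

Column B has no strings, so it comes back unchanged; column D ends up as NaN, 1.0, 5.0, 0.5. The original df is not modified because the helper works on a copy.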

Upvotes: 1
