Caranarq Aramat

Reputation: 53

Pandas replace cells matching condition with None

Take the following sample dataframe:

import pandas as pd

df = pd.DataFrame([['de', None, None],
                   ['de ditos', 2, 3],
                   [4, None, None],
                   [None, None, 9],
                   ['de', 4, 6]])

which looks like

          0    1    2
0        de  NaN  NaN
1  de ditos  2.0  3.0
2         4  NaN  NaN
3      None  NaN  9.0
4        de  4.0  6.0

I want to replace every value in column 0 that is exactly 'de' with None, so that the dataframe ends up like this:

          0    1    2
0      None  NaN  NaN
1  de ditos  2.0  3.0
2         4  NaN  NaN
3      None  NaN  9.0
4      None  4.0  6.0

I have tried:

df[0].where(df[0] == 'de') = None

which raises SyntaxError: can't assign to function call

I also tried:

def erasedes(x):
    if x == 'de':
        return None
    else: pass
df[0] = df[0].apply(lambda x: erasedes(x))

But this replaces every value with None.

Upvotes: 2

Views: 2735

Answers (2)

jpp

Reputation: 164623

This should work:

df[0] = df[0].replace({'de': None})
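
As a quick check, here is a minimal sketch on the question's sample frame. It relies on the dict form of Series.replace matching whole cell values, so 'de ditos' should be left untouched; the names result and missing are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([['de', None, None],
                   ['de ditos', 2, 3],
                   [4, None, None],
                   [None, None, 9],
                   ['de', 4, 6]])

# Dict form: each key is an exact cell value to match, not a substring.
result = df[0].replace({'de': None})

# Rows 0 and 4 held 'de' and are now missing; row 3 was already missing.
missing = result.isna().tolist()
```

Checking with isna() avoids depending on whether the missing marker is stored as None or NaN.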

Upvotes: 2

juanpa.arrivillaga

Reputation: 95872

What you really want is:

In [3]: df
Out[3]:
          0    1    2
0        de  NaN  NaN
1  de ditos  2.0  3.0
2         4  NaN  NaN
3      None  NaN  9.0
4        de  4.0  6.0

In [4]: df.loc[df[0] == 'de', 0] = None

In [5]: df
Out[5]:
          0    1    2
0      None  NaN  NaN
1  de ditos  2.0  3.0
2         4  NaN  NaN
3      None  NaN  9.0
4      None  4.0  6.0

Note that your .apply approach does not work because your erasedes function always returns None: in the else branch, pass does nothing, so the function reaches its end and implicitly returns None. It would have worked if you had used else: return x

In [6]: df = pd.DataFrame([['de', None, None],
   ...:                    ['de ditos', 2, 3],
   ...:                    [4, None, None],
   ...:                    [None, None, 9],
   ...:                    ['de', 4, 6]])

In [7]: def erasedes(x):
   ...:     if x == 'de':
   ...:         return None
   ...:     else:
   ...:         return x
   ...:
In [8]: df[0]
Out[8]:
0          de
1    de ditos
2           4
3        None
4          de
Name: 0, dtype: object

In [9]: df[0].apply(erasedes)
Out[9]:
0        None
1    de ditos
2           4
3        None
4        None
Name: 0, dtype: object

You should prefer .loc/.iloc-based assignment over .apply, which is generally slow.
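
For completeness, the question's first attempt used .where, which keeps the values where the condition holds and returns a new Series, so it cannot be assigned to. A minimal sketch of its counterpart, Series.mask, is below. One assumption to note: with no replacement argument, the fill is a missing value (NaN by default) rather than the literal None, although isna() treats both the same way.

```python
import pandas as pd

df = pd.DataFrame([['de', None, None],
                   ['de ditos', 2, 3],
                   [4, None, None],
                   [None, None, 9],
                   ['de', 4, 6]])

# mask() blanks out entries where the condition is True;
# where() does the opposite and keeps only those entries.
df[0] = df[0].mask(df[0] == 'de')

# Rows 0 and 4 matched 'de'; row 3 was already missing.
missing = df[0].isna().tolist()
```

Like the .loc assignment, this is vectorized and avoids calling a Python function once per cell.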

Upvotes: 2
