Reputation: 77
I have a pandas dataframe with 3 columns. I want to check the values in each column against a particular string and, wherever a value matches, replace it with NaN.
For example, if there are 5 values in column 1 of the data frame:
abcd
abcd
defg
abcd
defg
and if the comparison string is defg, the end result for column 1 in the data frame should be:
abcd
abcd
NaN
abcd
NaN
Upvotes: 2
Views: 11938
Reputation: 162
There are a bunch of solutions. If you want to practice using lambda functions, you could always do:
import numpy as np
df['Col1'] = df['Col1'].apply(lambda x: np.nan if x == 'defg' else x)
Result:
0 abcd
1 abcd
2 NaN
3 abcd
4 NaN
Seconds: 0.0020899999999999253
Based on some quick timing tests, this is probably a little slower than the other solutions posted here.
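To make the comparison concrete, here is a minimal sketch that runs the lambda version next to a vectorized equivalent. The DataFrame is made up to mirror the question's column, and `Series.where` is just one vectorized alternative chosen for illustration, not something this answer originally used.

```python
import numpy as np
import pandas as pd

# Hypothetical data mirroring the column in the question.
df = pd.DataFrame({"Col1": ["abcd", "abcd", "defg", "abcd", "defg"]})

# Element-wise version: the lambda is called once per value in Python.
via_apply = df["Col1"].apply(lambda x: np.nan if x == "defg" else x)

# Vectorized version: keep values where the condition holds, NaN elsewhere.
via_where = df["Col1"].where(df["Col1"] != "defg")

print(via_apply.isna().tolist())  # [False, False, True, False, True]
print(via_where.isna().tolist())  # [False, False, True, False, True]
```

Both produce NaN in the same positions; on a frame this small the timing difference is negligible, and it should only start to matter on much larger data.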
Upvotes: 2
Reputation: 153460
You can use mask, which replaces 'defg' with NaN across the entire dataframe:
df.mask(df == 'defg')
Output:
0
0 abcd
1 abcd
2 NaN
3 abcd
4 NaN
You can do this for a column also:
df['col1'].mask(df['col1'] == 'defg')
Or use replace, as @pygo suggests in his solution:
df['col1'].replace('defg',np.nan)
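Since the question mentions three columns, here is a small sketch of the whole-frame mask on a three-column frame. The column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical three-column frame; names and values are made up.
df = pd.DataFrame({
    "col1": ["abcd", "abcd", "defg", "abcd", "defg"],
    "col2": ["defg", "wxyz", "wxyz", "defg", "wxyz"],
    "col3": ["abcd", "defg", "abcd", "abcd", "abcd"],
})

# df == "defg" is a boolean frame; mask() puts NaN wherever it is True.
masked = df.mask(df == "defg")

# Count the NaN values per column to confirm every match was replaced.
print(masked.isna().sum().tolist())  # [2, 2, 1]
```

Note that mask returns a new object, so assign the result back (for example `df = df.mask(df == "defg")`) if you want to keep the change.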
Upvotes: 1
Reputation: 8816
Use the built-in pandas replace method. Here it is called with regex=True, so the pattern is treated as a regular expression, and with inplace=True, so the change is applied without reassigning, while numpy supplies the NaN that replaces the matching values.
import pandas as pd
import numpy as np
Example DataFrame:
df
col1
0 abcd
1 abcd
2 defg
3 abcd
4 defg
Result:
df['col1'].replace(['defg'], np.nan, regex=True, inplace=True)
df
col1
0 abcd
1 abcd
2 NaN
3 abcd
4 NaN
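If you only need exact matches, a variant without regex and without inplace is sketched below. It rests on two assumptions: that cells merely containing defg should be left alone (with regex=True, a pattern can also match part of a longer string), and that you are on a pandas version where calling an inplace method on a column selection such as df['col1'] may warn or leave df unchanged, which is why the result is assigned back.

```python
import numpy as np
import pandas as pd

# Same example data as above.
df = pd.DataFrame({"col1": ["abcd", "abcd", "defg", "abcd", "defg"]})

# Plain replace: only cells exactly equal to "defg" are replaced.
# Assigning back to the column avoids inplace=True on a selection.
df["col1"] = df["col1"].replace("defg", np.nan)

print(df["col1"].isna().tolist())  # [False, False, True, False, True]
```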
Upvotes: 4
Reputation: 4660
You can use numpy where to set values based on boolean conditions:
import numpy as np
df["col_name"] = np.where(df["col_name"]=="defg", np.nan, df["col_name"])
Replace col_name with whatever your actual column name is.
An alternative is to use pandas .loc to change the values in the DataFrame in place:
df.loc[df["col_name"]=="defg", "col_name"] = np.nan
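The question asks about several columns, so here is a rough sketch of the .loc approach applied to each column in turn. The frame, its column names, and the `target` variable are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; column names and values are made up.
df = pd.DataFrame({
    "col1": ["abcd", "abcd", "defg", "abcd", "defg"],
    "col2": ["defg", "wxyz", "wxyz", "wxyz", "wxyz"],
})
target = "defg"  # the comparison string from the question

# For each column, select the rows equal to target and overwrite them.
for col in df.columns:
    df.loc[df[col] == target, col] = np.nan

print(df.isna().sum().tolist())  # [2, 1]
```

Because .loc assigns into df directly, no reassignment is needed afterwards.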
Upvotes: 1