Reputation: 57
I would like to replace all cells of a column if each value IS NOT in a specific value range.
E.g. value range between 0 and 10
The function should put np.NaN on all cells which are below 0 or above 10.
I tried with this:
df.loc[(df["B"] < 5 ), "B"] = np.NaN
but it only works with a specific value, not with a value range.
Is there a simple solution to replace all values outside a specific value range, without iterating through all rows?
Upvotes: 0
Views: 3300
Reputation: 351
You can use np.where, specifying the desired conditions. If True, yield x, otherwise yield y.
np.where(condition, x, y)
So, the solution would be:
df.B = np.where((df.B < 0) & (df.B > 10), np.NaN, df.B)
For example:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,10,size=(10, 4)), columns=list('ABCD'))
Will output something like that:
A B C D
0 2 5 6 2
1 0 4 0 0
2 4 3 9 0
3 5 1 1 8
4 2 3 6 5
5 3 0 3 9
6 0 4 3 4
7 4 1 4 5
8 0 5 1 5
9 6 7 4 4
Then if you apply the where condition:
df.B = np.where((df.B < 6) & (df.B > 2), np.NaN, df.B)
A B C D
0 2 NaN 6 2
1 0 NaN 0 0
2 4 NaN 9 0
3 5 1.0 1 8
4 2 NaN 6 5
5 3 0.0 3 9
6 0 NaN 3 4
7 4 1.0 4 5
8 0 NaN 1 5
9 6 7.0 4 4
You can find more information here: https://numpy.org/doc/stable/reference/generated/numpy.where.html
Upvotes: 3
Reputation: 2412
Yup, you can just do something like this:
df["B"] = df["B"].where((df["B"] >= 0) & (df["B"] <= 10))
# or
df["B"] = df["B"].map(lambda x: x if 0 <= x <= 10 else None)
# or
df.loc[(df["B"] < 0) | (df["B"] > 10), "B"] = None
Upvotes: 1
Reputation: 13656
More closely to your original syntax
df.loc[(df["B"] < 0 )|(df["B"] > 10 ), "B"] = np.NaN
Upvotes: 2