Reputation: 367
Is there a way to check whether a pandas Series value contains any numeric characters, and to replace the values that contain none with NaN?
Series.str.isnumeric
only checks whether all characters are numeric.
Given the following series:
import pandas as pd
# A dict literal keeps only the last value for a repeated key, so pass the values as a list
ser = pd.Series(["Python", "$|", "|32", "dos"], index=['a', 'b', 'c', 'c'])
The only label whose values contain numeric characters is c, so the values of a and b should be replaced with NaN. I am struggling to find a solution because c has multiple values, so ser["c"] is itself a pd.Series, according to type(ser["c"])
Upvotes: 0
Views: 1095
Reputation: 150745
Try str.contains(r'\d')
to check for the existence of a digit. Then groupby(level=0).transform('any')
to propagate that to the whole group of rows sharing an index label. Finally, use where
to mask the values of the series:
ser.where(ser.str.contains(r'\d').groupby(level=0).transform('any'))
Output:
a NaN
b NaN
c |32
c dos
dtype: object
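To make the three steps easier to follow, here is a minimal sketch that splits the one-liner into named intermediates. It assumes the series is built from a list of the four values in the question (a dict literal cannot hold two 'c' keys); the names has_digit and label_has_digit are only illustrative.

```python
import pandas as pd

# Assumed construction: a list keeps both 'c' entries, unlike a dict literal.
ser = pd.Series(["Python", "$|", "|32", "dos"], index=["a", "b", "c", "c"])

# Step 1: element-wise check for at least one digit.
has_digit = ser.str.contains(r"\d")

# Step 2: for each index label, flag every row if any row under that label has a digit.
label_has_digit = has_digit.groupby(level=0).transform("any")

# Step 3: keep rows whose label qualifies; the rest become NaN.
result = ser.where(label_has_digit)
print(result)
```

Because the flag is propagated per label, "dos" is kept: it shares the label c with "|32".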
Upvotes: 1
Reputation: 61910
res = ser.apply(lambda x: x if any(c.isnumeric() for c in x) else pd.NA)
print(res)
Output
a <NA>
b <NA>
c |32
c |32
dtype: object
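Note that this check is applied per element, not per index label. As a rough sketch, assuming the four-value series from the question is built from a list (so that "|32" and "dos" both sit under c), the apply version and a vectorized equivalent using str.contains would both blank out "dos" as well; the variable names here are only illustrative.

```python
import pandas as pd

# Assumed construction: a list keeps both 'c' entries, unlike a dict literal.
ser = pd.Series(["Python", "$|", "|32", "dos"], index=["a", "b", "c", "c"])

# Per-element check with apply, as in the answer above.
applied = ser.apply(lambda x: x if any(c.isnumeric() for c in x) else pd.NA)

# Vectorized per-element alternative: keep only values containing a digit.
vectorized = ser.where(ser.str.contains(r"\d"))

print(applied)
print(vectorized)
```

If the goal is to keep every value under a label that has at least one numeric value, the per-label propagation with groupby(level=0) is still needed.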
Upvotes: 1