xxgaryxx

Reputation: 367

Check whether a pandas Series value contains a numeric character

Is there a way to check whether a pandas Series value contains any numeric characters, and to replace the values that contain none with NaN? Series.str.isnumeric only checks whether all characters are numeric.

Given the following series:

import pandas as pd

# A dict literal keeps only the last value for a repeated key, so the values are passed as a list.
ser = pd.Series(["Python", "$|", "|32", "dos"], index=['a', 'b', 'c', 'c'])

Only the values under the label c contain numeric characters, so the values of a and b should be replaced with NaN. I am struggling to find a solution because c has multiple values, which means ser["c"] is itself a pd.Series, according to type(ser["c"]).
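To make the two observations above concrete, here is a minimal sketch. It assumes the series is built from a list of values, since a Python dict literal keeps only one value per key; the name all_numeric is introduced here purely for illustration.

```python
import pandas as pd

# Assumed construction: a list of values with a repeated index label "c".
ser = pd.Series(["Python", "$|", "|32", "dos"], index=["a", "b", "c", "c"])

# str.isnumeric is True only when every character of a value is numeric,
# so "|32" does not pass even though it contains digits.
all_numeric = ser.str.isnumeric()
print(all_numeric.tolist())  # [False, False, False, False]

# A repeated label selects every matching row, so the result is a Series.
print(type(ser["c"]))
print(ser["c"].tolist())  # ['|32', 'dos']
```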

Upvotes: 0

Views: 1095

Answers (2)

Quang Hoang

Reputation: 150745

Try str.contains(r'\d') to check for the existence of a digit. Then use groupby(level=0).transform('any') to propagate the result to every row that shares the same index label. Finally, use where to mask the values of the series:

ser.where(ser.str.contains(r'\d').groupby(level=0).transform('any'))

Output:

a    NaN
b    NaN
c    |32
c    dos
dtype: object
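The one-liner can be easier to follow when split into its three steps. The sketch below assumes the series holds "|32" and "dos" under the repeated label c; the intermediate names has_digit, group_has_digit and result are illustrative only.

```python
import pandas as pd

# Assumed input: label "c" appears twice, once with a digit and once without.
ser = pd.Series(["Python", "$|", "|32", "dos"], index=["a", "b", "c", "c"])

# Step 1: per-row check for at least one digit.
has_digit = ser.str.contains(r"\d")

# Step 2: a label passes if any of its rows passed; transform keeps the
# original length and index, so the mask lines up with ser.
group_has_digit = has_digit.groupby(level=0).transform("any")

# Step 3: keep values where the mask is True, fill the rest with NaN.
result = ser.where(group_has_digit)

print(has_digit.tolist())        # [False, False, True, False]
print(group_has_digit.tolist())  # [False, False, True, True]
print(result)
```

Step 2 is what lets "dos" survive: it has no digit itself, but it shares the label c with "|32".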

Upvotes: 1

Dani Mesejo

Reputation: 61910

Use any + apply:

res = ser.apply(lambda x: x if any(c.isnumeric() for c in x) else pd.NA)
print(res)

Output:

a    <NA>
b    <NA>
c     |32
c     |32
dtype: object
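Note that apply tests each value on its own. If the label c holds both "|32" and "dos", as the question describes, a per-value check would not keep "dos". A possible sketch, assuming the goal is to keep every row of a label when any of its rows has a numeric character, combines the character test with the groupby step from the other answer. The helper has_numeric_char and the mask names are hypothetical, and the fill value here is the default NaN rather than pd.NA.

```python
import pandas as pd

# Assumed input: label "c" appears twice, once with a digit and once without.
ser = pd.Series(["Python", "$|", "|32", "dos"], index=["a", "b", "c", "c"])


def has_numeric_char(value: str) -> bool:
    """Return True if at least one character of value is numeric."""
    return any(ch.isnumeric() for ch in value)


# Per-row test, coerced to a plain boolean mask.
row_mask = ser.apply(has_numeric_char).astype(bool)

# Spread a hit to every row that shares the same index label.
label_mask = row_mask.groupby(level=0).transform("any")

# Rows whose label never matched become NaN.
res = ser.where(label_mask)
print(res)
```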

Upvotes: 1
