dm0yang

Reputation: 23

Locate all non-number elements in a pandas.Series

For a pd.Series containing a mix of strings and numbers (integers and floats, all stored as strings), I need to identify all non-numeric elements. For example:

data = pd.Series(['1','wrong value','2.5','-3000','>=50','not applicable', '<40.5'])

I want it to return the following elements:

wrong value
>=50
not applicable
<40.5

What I'm currently doing is:

data[~data.str.replace(r'[\.\-]', '', regex=True).str.isnumeric()]

Because .str.isnumeric() returns False for strings containing a decimal point or a minus sign, I strip "." and "-" first and then select the entries that are still non-numeric.

Is there a better way of doing this? Are there any potential problems with my current method? Thanks!

Upvotes: 2

Views: 397

Answers (1)

Andy L.

Reputation: 25259

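Use pd.to_numeric with errors='coerce' to flag them. Values that cannot be parsed as numbers become NaN, so .isna() gives a mask of the non-numeric elements: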

data[pd.to_numeric(data, errors='coerce').isna()]

Out[1159]:
1       wrong value
4              >=50
5    not applicable
6             <40.5
dtype: object
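
For completeness, here is a self-contained sketch of the same idea, plus a quick comparison with the strip-and-isnumeric approach from the question. The names parsed, non_numeric, tricky, looks_numeric and flagged are only illustrative, and the extra test strings ('1.2.3', '3-4') are my own assumptions about the kind of input that might show up, not values from the question.

```python
import pandas as pd

data = pd.Series(['1', 'wrong value', '2.5', '-3000', '>=50',
                  'not applicable', '<40.5'])

# errors='coerce' turns anything that cannot be parsed as a number into NaN
parsed = pd.to_numeric(data, errors='coerce')

# Keep only the entries that failed to parse
non_numeric = data[parsed.isna()]
print(non_numeric.tolist())

# Hypothetical inputs that show where stripping "." and "-" can mislead:
# after removing those characters, '1.2.3' becomes '123' and '3-4' becomes '34'
tricky = pd.Series(['1.2.3', '3-4', '2.5'])
looks_numeric = tricky.str.replace(r'[.\-]', '', regex=True).str.isnumeric()
print(looks_numeric.tolist())

# pd.to_numeric rejects the malformed strings and keeps the valid one
flagged = tricky[pd.to_numeric(tricky, errors='coerce').isna()]
print(flagged.tolist())
```

So the main risk with the masking approach is that it checks which characters are present rather than whether the whole string is a valid number. Note also that on recent pandas versions str.replace treats the pattern as a literal string unless regex=True is passed.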

Upvotes: 3
