Error Replicator

Reputation: 292

Pandas - how to filter rows based on a regular expression

Can you please let me know how to filter rows in Pandas based on a character range like [0-9] or [A-Z]?

Consider a case like this, where all the columns have dtype object:

A      B
2.3    234
4.5    4b6
7b     275

I would like to check whether all the values in column A are floats, meaning they contain only [0-9] and '.'. I'm aware of pd.to_numeric, applymap, isreal, isdigit, etc., but this is an object column, and before I convert it to a numeric type I would like to know the scale of the problem, i.e. how many values are not floats.

I would also like to know which rows in the dataset contain characters other than [0-9].

Upvotes: 1

Views: 1797

Answers (1)

MaxU - stand with Ukraine

Reputation: 210842

Try this:

In [8]: df
Out[8]:
     A    B
0  2.3  234
1  4.5  4b6
2   7b  275
3   11   11

In [9]: df.A.str.match(r'^(?:\d+\.?\d*|\.\d+)$')
Out[9]:
0     True
1     True
2    False
3     True
Name: A, dtype: bool

In [10]: df.loc[df.A.str.match(r'^(?:\d+\.?\d*|\.\d+)$')]
Out[10]:
     A    B
0  2.3  234
1  4.5  4b6
3   11   11
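
To gauge the scale of the problem, the same kind of boolean mask can be inverted and summed. The following is a minimal sketch, assuming the sample frame from above; the variable names are made up for illustration, and the pattern is one reasonable choice (digits with at most one decimal point, empty strings rejected) rather than the only one. Passing na=False is also an assumption: it counts missing values as non-matching.

```python
import pandas as pd

# Sample frame mirroring the one shown above; every column is object dtype.
df = pd.DataFrame({"A": ["2.3", "4.5", "7b", "11"],
                   "B": ["234", "4b6", "275", "11"]})

# True where the whole string looks like an unsigned integer or decimal.
# na=False treats missing values as non-matching (an assumption).
is_float_like = df["A"].str.match(r"^(?:\d+\.?\d*|\.\d+)$", na=False)

# Number of values in A that do not look like floats.
bad_count = int((~is_float_like).sum())

# The offending rows themselves.
bad_rows = df.loc[~is_float_like]

# Rows where B contains any character other than 0-9.
non_digit_b = df.loc[df["B"].str.contains(r"[^0-9]", na=False)]

print(bad_count)
print(bad_rows)
print(non_digit_b)
```

Dividing bad_count by len(df) gives the share of problematic values, which may be easier to judge on a large dataset.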

UPDATE:

Starting from Pandas 0.20.1, the .ix indexer is deprecated in favor of the stricter .iloc and .loc indexers.
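
As a rough illustration of the difference between the two indexers, here is a small sketch. The non-default index labels (10, 20, 30, 40) and the variable names are hypothetical and only serve to make labels and positions distinguishable.

```python
import pandas as pd

# Same sample data, but with hypothetical non-default index labels.
df = pd.DataFrame({"A": ["2.3", "4.5", "7b", "11"],
                   "B": ["234", "4b6", "275", "11"]},
                  index=[10, 20, 30, 40])

# .loc selects by label: row labelled 20, column labelled "A".
by_label = df.loc[20, "A"]

# .iloc selects by integer position: second row, first column.
by_position = df.iloc[1, 0]

# .loc also accepts a boolean Series aligned on the index,
# which is what the regex filter relies on.
mask = df["A"].str.isdigit()
filtered = df.loc[mask]

print(by_label, by_position)
print(filtered)
```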

Upvotes: 2
