JA-pythonista

Reputation: 1323

How can I strip all non-numeric characters from a Pandas Series?

I have a Pandas DataFrame, and I want to reduce one particular column to only its numeric characters.

For example, the column contains rows like this:

4'> delay trip
4/
4'>book flight 'trip
34
4"> book flight delay
4"

How can I strip all non-numeric characters so that only the digits remain, like this:

4
4
4
[3,4]
4
4

Upvotes: 0

Views: 408

Answers (1)

Serge Ballesta

Reputation: 149075

You have 2 different problems here:

  • the first is to extract the digits from each cell of the column
  • the second is to keep a list only when a cell contains more than one digit

Just chain both operations:

df[col].str.findall(r'\d').apply(lambda x: x[0] if len(x) == 1 else '' if len(x) == 0 else x)

With your example it gives:

0         4
1         4
2         4
3    [3, 4]
4         4
5         4
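
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same approach. The column name `col` and the helper name `collapse` are assumptions for illustration only; the sample rows are copied from the question. Note that `findall` returns the digits as strings, so the values are `'4'` and `['3', '4']` rather than integers, even though the printed Series shows no quotes.

```python
import pandas as pd

# Sample rows copied from the question; the column name "col" is hypothetical.
df = pd.DataFrame({"col": [
    "4'> delay trip",
    "4/",
    "4'>book flight 'trip",
    "34",
    '4"> book flight delay',
    '4"',
]})


def collapse(digits):
    """Hypothetical helper: '' for no digits, the digit itself for one, else the list."""
    if len(digits) == 0:
        return ""
    if len(digits) == 1:
        return digits[0]
    return digits


# str.findall(r"\d") gives a list of single-digit strings for each cell;
# collapse() then unwraps the one-element lists.
result = df["col"].str.findall(r"\d").apply(collapse)
print(result)
```

The named helper does the same thing as the nested conditional in the one-liner above, just spelled out so the three cases (no digit, one digit, several digits) are easier to read.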

Upvotes: 2
