How to extract numbers from mixed dataframe column and replace with numbers only (inplace)?

Question

Given the following toy dataframe:

import pandas as pd
import numpy as np
df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})
df

    A
0   1a
1   NaN
2   10a
3   100b
4   0b

I want to remove all the non-numeric characters and keep only the numbers in column A. Many pandas methods accept inplace=True, but how can I extract the numbers and replace the original values in place?

I want to get:

    A
0   1
1   NaN
2   10
3   100
4   0

Here is how I am doing it now:

df['A'] = df['A'].str.extract(r'(\d+)', expand=False)

Quang Hoang · Accepted Answer

str.extract, as the name suggests, only extracts; it doesn't replace. Try:

df['A'].replace(r'(\D.*)', '', inplace=True, regex=True)

Output:

    A
0   1
1   NaN
2   10
3   100
4   0

The regex pattern works as follows:

\D matches any non-digit character
.* matches all the characters that follow that \D.

So the pattern replaces everything from the first non-digit character to the end of the string with the empty string ''.
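
As a minimal sketch of the pattern in action, the snippet below applies the same regex to the toy column but assigns the result back instead of calling inplace=True on the selected column. That choice is an assumption on my part, not part of the original answer: under pandas Copy-on-Write, an in-place call on df['A'] may not propagate to df, while plain assignment behaves the same either way. The name numeric is only illustrative.

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})

# Remove everything from the first non-digit character onward,
# then write the cleaned column back to the dataframe.
df['A'] = df['A'].replace(r'\D.*', '', regex=True)

# The cleaned values are still strings; convert them if real numbers are needed.
# NaN stays NaN, so the result is a float column.
numeric = pd.to_numeric(df['A'])

Note that for a value that starts with a non-digit, such as 'a1', this pattern leaves an empty string, whereas str.extract(r'(\d+)') would return '1'.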
