Reputation: 4040
Given the following toy dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A':['1a',np.nan,'10a','100b','0b'],
})
df
      A
0    1a
1   NaN
2   10a
3  100b
4    0b
I want to remove all the non-digit characters and keep only the numbers in column A. There is an inplace=True option, but how can I extract the numbers and replace the values in place?
I want to get:
     A
0    1
1  NaN
2   10
3  100
4    0
Here is how I am doing it now:
df.A = df.A.str.extract(r'(\d+)')
Upvotes: 0
Views: 325
Reputation: 150745
str.extract, as the name suggests, only extracts; it doesn't replace. Try:
df['A'].replace(r'(\D.*)', '', inplace=True, regex=True)
Output:
     A
0    1
1  NaN
2   10
3  100
4    0
More info on the regex pattern here. Basically:
\D matches any non-digit character.
.* matches all the characters that follow \D.
So the pattern replaces everything from the first non-digit character onward with the empty string ''.
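The same replacement can also be written as an assignment back to the column. This is a minimal sketch, not part of the original answer: it avoids calling an inplace method on the result of df['A'], a chained form that recent pandas versions warn about. The final pd.to_numeric step is an addition of mine, assuming real numbers rather than digit strings are wanted.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})

# Remove everything from the first non-digit character onward,
# then assign the result back to the column.
df['A'] = df['A'].replace(r'\D.*', '', regex=True)

# Optional addition (not in the answer): turn the digit strings into numbers.
# Missing values stay missing.
numbers = pd.to_numeric(df['A'])
```

After the replacement the column still holds strings; numbers holds the numeric version.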
Upvotes: 3
Reputation: 133518
With your shown samples, please try the following. A simple explanation: use the replace function of pandas with regex=True; the regex replaces anything that is not a digit with an empty string.
df['A'].replace('([^0-9]*)','', regex=True)
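Note that this line returns a new Series, so the result has to be assigned back to df['A'] to change the dataframe. Also, the two patterns suggested here agree on the sample data but differ when digits follow a letter. Below is a small check with the standard re module, assuming pandas applies the same regular-expression semantics; the value '1a2' is a made-up example, not from the question.

```python
import re

sample = '1a2'  # hypothetical value, not in the question's data

# \D.* removes everything from the first non-digit character onward.
cut_tail = re.sub(r'\D.*', '', sample)

# [^0-9]* removes only the non-digit characters and keeps later digits.
digits_only = re.sub(r'[^0-9]*', '', sample)

print(cut_tail, digits_only)
```

Which behaviour is right depends on whether digits after the first letter should be kept.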
Upvotes: 3