Python pandas: extracting a portion of a string based on the pattern around it

Question

I have a string column that follows this pattern:

yariyada up to a maximum of (number)% yariyada

For example:

will be granted up to a maximum of 75.5% If less, then nothing

I want to create another column that extracts the number that comes between "up to a maximum of" and "%".

So far I'm only able to detect whether the string column contains that pattern, using the .str.contains method.

If it helps to clarify, in Stata (I'm a Stata user) I would use regexm to match the string against a pattern and regexs to retrieve the matched parts. I'm wondering if pandas has a similar, or better, function.

Thanks for your help!

Zero · Accepted Answer

You could use the Series.str.extract method (pandas.core.strings.StringMethods.extract) to find capture groups in each string using the regular expression you pass:

df['col_name'].str.extract(r'up to a maximum of (.*?)%', expand=False)

This gives you the extracted number for each row, as a string. Rows that do not match the pattern get NaN.
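As a minimal sketch of how this could be assigned to a new column and converted to a number: the sample rows, the column name max_pct, and the stricter digit-only pattern below are assumptions for illustration, not part of the original question. The pattern only accepts digits with an optional decimal part, so stray text between the phrase and the "%" is not captured.

```python
import pandas as pd

# Hypothetical sample data; only the first row comes from the question.
df = pd.DataFrame({
    "col_name": [
        "will be granted up to a maximum of 75.5% If less, then nothing",
        "covered up to a maximum of 40% of the cost",
        "no limit stated here",
    ]
})

# Capture an integer or decimal number between the phrase and the percent sign.
pattern = r"up to a maximum of\s*(\d+(?:\.\d+)?)\s*%"

# expand=False returns a Series when the pattern has a single capture group.
extracted = df["col_name"].str.extract(pattern, expand=False)

# The captured values are strings; convert them to numbers.
# Rows without a match stay missing.
df["max_pct"] = pd.to_numeric(extracted)

print(df[["col_name", "max_pct"]])
```

If the values should stay as text, the pd.to_numeric step can simply be dropped.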
