crashwap

Reputation: 3068

Python regex - Ungreedy quantifier question

I messed up an earlier question and deleted it (I provided a poor example that led to incorrect solutions). Mea culpa.

Here it is again, stated more correctly. I also need to cover cases where the 2 is not the first char of the string.

I have this string:

bobsmith2kgminus10meshcompt3kgfredrogers

I wish to only return the 2.

Here is my regex:

.*(\d+?)kg.*

It is returning 3 and I don't see what I've missed.

My python code:

import re
val = 'bobsmith2kgminus10meshcompt3kgfredrogers'
out = re.sub(r'.*(\d+?)kg.*', r'\1', val)
print(out)  # prints: 3

I've also tried:

(.*)(\d+?)kg.*
(\d+?)kg.*

Upvotes: 1

Views: 95

Answers (2)

anubhava

Reputation: 785761

If you really want to use re.sub then use:

.*?(\d+)kg.*

Here .*? matches 0 or more characters as few times as possible, expanding only as far as needed before \d+ matches and captures the digits that precede the first kg.

Code:

>>> import re
>>> val = 'bobsmith2kgminus10meshcompt3kgfredrogers'
>>> print ( re.sub(r'.*?(\d+)kg.*', r'\1', val) )
2
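
As a rough side-by-side sketch of why the original pattern returns 3 (the variable names greedy and lazy are only illustrative): a greedy .* first consumes the whole string and then backtracks from the right, so the last digit followed by kg is captured, whereas the lazy .*? grows from the left and stops at the first one.

```python
import re

val = 'bobsmith2kgminus10meshcompt3kgfredrogers'

# Greedy prefix: .* grabs the whole string, then backtracks from the right
# until \d+?kg can match, so the LAST "<digit>kg" is captured.
greedy = re.sub(r'.*(\d+?)kg.*', r'\1', val)

# Lazy prefix: .*? starts empty and grows one character at a time,
# so the FIRST "<digits>kg" is captured.
lazy = re.sub(r'.*?(\d+)kg.*', r'\1', val)

print(greedy)  # 3
print(lazy)    # 2
```

Making \d+ lazy in the original pattern does not help, because the greedy .* in front of it has already decided where the match begins.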

Otherwise, you can use this simpler regex in re.search:

(\d+)kg

Code:

>>> print ( re.search(r'(\d+)kg', val).group(1) )
2
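
One caveat with the re.search version: re.search returns None when nothing matches, so calling .group(1) directly raises AttributeError on input without a "<digits>kg" token. A minimal sketch of a guard, assuming a hypothetical helper named first_kg that is not part of the original code:

```python
import re

def first_kg(text):
    """Return the digits before the first 'kg', or None if there is no match.

    first_kg is a hypothetical helper used only for illustration.
    """
    m = re.search(r'(\d+)kg', text)
    return m.group(1) if m else None

print(first_kg('bobsmith2kgminus10meshcompt3kgfredrogers'))  # 2
print(first_kg('abc200kgminus10mesh3kg'))                    # 200
print(first_kg('no weight here'))                            # None
```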

Upvotes: 1

Emma

Reputation: 27743

My guess is that this expression might simply work:

(\d+)kg.*

Test

import re

regex = r"(\d+)kg.*"

test_str = """
2kgminus10meshcomp3kg
some_content_before200kgminus10meshcomp3kg
"""
print(re.findall(regex, test_str))

Output

['2', '200']

Or with re.sub:

import re

regex = r".*?(\d+)kg.*"

test_str = """
2kgminus10meshcomp3kg
some_content_before200kgminus10meshcomp3kg
"""
subst = "\\1"
print(re.sub(regex, subst, test_str))
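
For a multi-line input like this, the substitution works line by line because "." does not match a newline unless re.DOTALL is set. A small sketch of that behaviour, using a shortened test string without the leading and trailing blank lines (the name per_line is only illustrative):

```python
import re

test_str = "2kgminus10meshcomp3kg\nsome_content_before200kgminus10meshcomp3kg"

# Without re.DOTALL, "." stops at "\n", so each line is matched
# and replaced independently of the others.
per_line = re.sub(r".*?(\d+)kg.*", r"\1", test_str)

print(per_line.splitlines())  # ['2', '200']
```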

Upvotes: 1
