Reputation: 3068
I messed up an earlier question and deleted it (I provided a poor example that led to incorrect solutions). Mea culpa.
Here it is again, stated more correctly. I also need to cover cases where the 2 is not the first char of the string.
I have this string:
bobsmith2kgminus10meshcompt3kgfredrogers
I want to return only the 2.
Here is my regex:
.*(\d+?)kg.*
It is returning 3, and I don't see what I've missed.
My python code:
import re
val = 'bobsmith2kgminus10meshcompt3kgfredrogers'
out = re.sub(r'.*(\d+?)kg.*', r'\1', val)
print(out) #prints: 3
I've also tried:
(.*)(\d+?)kg.*
(\d+?)kg.*
Upvotes: 1
Views: 95
Reputation: 785761
If you really want to use re.sub, then use:
.*?(\d+)kg.*
This matches 0 or more characters as few times as possible, expanding only as needed, before matching \d+ and capturing it in group 1.
Code:
>>> import re
>>> val = 'bobsmith2kgminus10meshcompt3kgfredrogers'
>>> print ( re.sub(r'.*?(\d+)kg.*', r'\1', val) )
2
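To see why the original pattern returned 3, here is a small side-by-side sketch of the greedy and lazy prefixes. The variable names greedy, lazy and clipped are only for illustration; the 'x10kg' input is an extra made-up case showing that a greedy prefix can also eat part of a multi-digit number.

```python
import re

val = 'bobsmith2kgminus10meshcompt3kgfredrogers'

# Greedy prefix: .* first consumes the whole string, then backtracks only as
# far as needed, so the group lands on the LAST digit run before "kg".
greedy = re.sub(r'.*(\d+)kg.*', r'\1', val)

# Lazy prefix: .*? consumes as little as possible, so the group lands on the
# FIRST digit run that is followed by "kg".
lazy = re.sub(r'.*?(\d+)kg.*', r'\1', val)

# With a greedy prefix, even \d+ only gets the final digit of "10".
clipped = re.sub(r'.*(\d+)kg.*', r'\1', 'x10kg')

print(greedy)   # 3
print(lazy)     # 2
print(clipped)  # 0
```

Making the digit group lazy (\d+?) does not help here, because it is the leading .* that decides where the group starts.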
Otherwise, you can use this simpler regex with re.search:
(\d+)kg
Code:
>>> print ( re.search(r'(\d+)kg', val).group(1) )
2
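One thing to watch with re.search: it returns None when nothing matches, so calling .group(1) directly raises AttributeError on such input. A minimal sketch of a guard, assuming you would rather get None back in that case (first_kg_value is a hypothetical helper name, not part of the re module):

```python
import re

def first_kg_value(text):
    """Hypothetical helper: first digit run directly before 'kg', else None."""
    match = re.search(r'(\d+)kg', text)
    # re.search returns None when the pattern is not found anywhere in text.
    return match.group(1) if match else None

print(first_kg_value('bobsmith2kgminus10meshcompt3kgfredrogers'))  # 2
print(first_kg_value('no weight here'))                            # None
```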
Upvotes: 1
Reputation: 27743
My guess is that this expression might simply work:
(\d+)kg.*
import re
regex = r"(\d+)kg.*"
test_str = """
2kgminus10meshcomp3kg
some_content_before200kgminus10meshcomp3kg
"""
print(re.findall(regex, test_str))
['2', '200']
Or with re.sub:
import re
regex = r".*?(\d+)kg.*"
test_str = """
2kgminus10meshcomp3kg
some_content_before200kgminus10meshcomp3kg
"""
subst = "\\1"
print(re.sub(regex, subst, test_str))
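For multi-line input, note that . does not match a newline by default, so the substitution is applied line by line, and a line without a number followed by kg is left untouched rather than removed. A short sketch with a made-up third line (no_weight_here) to show that:

```python
import re

regex = r".*?(\d+)kg.*"
# The last line is an invented example that contains no "<digits>kg".
test_str = (
    "2kgminus10meshcomp3kg\n"
    "some_content_before200kgminus10meshcomp3kg\n"
    "no_weight_here\n"
)

result = re.sub(regex, r"\1", test_str)
print(repr(result))  # '2\n200\nno_weight_here\n'
```

If unmatched lines should be dropped instead, re.findall or re.search per line is probably the better fit.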
Upvotes: 1