Valeriy Gaydar

Reputation: 500

Modify regular expression

I am trying to extract the first pair of digits from the file name "09_135624.jpg".

My code now:

import re

string = "09_135624.jpg"
pattern = r"(?P<pair>(.*))_135624.jpg"
match = re.findall(pattern, string)

print(match)

Output:

[('09', '09')]

Why do I get a tuple in the output?

Can you help me modify my code to get this:

['09']

Or:

'09'

Upvotes: 0

Views: 66

Answers (2)

vks

Reputation: 67968

(?P<pair>(?:.*))_135624.jpg

Try this. You are getting two results because the same text is captured twice: once by the named group and once by the inner parentheses. I have made the inner group non-capturing, so the text is captured only once:

http://regex101.com/r/lS5tT3/62
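
As a quick check in Python, here is a minimal sketch that assumes the same input string as in the question. The variable names are only illustrative.

```python
import re

string = "09_135624.jpg"

# Inner group is non-capturing (?:...), so only the named group is captured.
pattern = r"(?P<pair>(?:.*))_135624.jpg"

# Number of capturing groups in the compiled pattern.
group_count = re.compile(pattern).groups

# With a single capturing group, findall returns a flat list of strings.
result = re.findall(pattern, string)

print(group_count)
print(result)
```

With one capturing group left, this should print 1 and then ['09'].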

Upvotes: 1

falsetru

Reputation: 368954

re.findall returns different results depending on the number of capturing groups in the pattern. With a single group, you get a flat list of strings:

>>> re.findall(r"(?P<pair>.*)_135624\.jpg", "09_135624.jpg")
['09']

According to the documentation:

Return all non-overlapping matches of pattern in string, as a list of strings. The string is scanned left-to-right, and matches are returned in the order found. If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group. Empty matches are included in the result unless they touch the beginning of another match.
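
To make the quoted rule concrete, the following sketch runs re.findall on the question's file name with zero, one, and two capturing groups. The variable names are chosen here for illustration and are not part of the original answer.

```python
import re

name = "09_135624.jpg"

# No capturing group: each item is the whole match.
no_group = re.findall(r".*_135624\.jpg", name)

# One capturing group: each item is that group's text.
one_group = re.findall(r"(.*)_135624\.jpg", name)

# Two capturing groups (named outer group plus plain inner group):
# each item is a tuple with one entry per group.
two_groups = re.findall(r"(?P<pair>(.*))_135624\.jpg", name)

print(no_group)
print(one_group)
print(two_groups)
```

The third call reproduces the [('09', '09')] result from the question, because both the named group and the inner parentheses capture the same text.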


Alternative using re.search:

>>> re.search(r"(?P<pair>.*)_135624\.jpg", "09_135624.jpg")
<_sre.SRE_Match object at 0x00000000025D0D50>
>>> re.search(r"(?P<pair>.*)_135624\.jpg", "09_135624.jpg").group('pair')
'09'
>>> re.search(r"(?P<pair>.*)_135624\.jpg", "09_135624.jpg").group(1)
'09'
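
One caveat with the re.search form: it returns None when nothing matches, so calling .group() directly on the result raises AttributeError for a non-matching name. Below is a minimal sketch of a guarded version. first_pair is a hypothetical helper name, and "other.png" is a made-up input used only to show the failure case.

```python
import re

def first_pair(filename):
    """Return the text before '_135624.jpg', or None if the name does not match."""
    match = re.search(r"(?P<pair>.*)_135624\.jpg", filename)
    # re.search returns None on failure, so check before calling .group().
    return match.group("pair") if match else None

print(first_pair("09_135624.jpg"))
print(first_pair("other.png"))
```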

UPDATE

An unescaped . matches any character. To match a literal dot, escape it with a backslash: \.
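
The difference shows up with a name that has some other character where the dot should be. This is a small sketch; "09_135624xjpg" is a made-up input used only for illustration.

```python
import re

unescaped = r"(?P<pair>.*)_135624.jpg"   # bare . matches any character
escaped = r"(?P<pair>.*)_135624\.jpg"    # \. matches only a literal dot

# Hypothetical name with "x" in place of the dot.
odd_name = "09_135624xjpg"

loose = re.findall(unescaped, odd_name)
strict = re.findall(escaped, odd_name)

print(loose)
print(strict)
```

The unescaped pattern still matches and yields ['09'], while the escaped pattern yields an empty list.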

Upvotes: 1
