Reputation: 11
I have the following regular expression pattern:
.*\b(?P<core>[A-Z][0-9]?\b.*)(?P<extra>\b[0-9]+[xX][0-9]+.*)?\.png
It is meant to match strings such as:
UI SCREEN 5-1 F2 ROUND TAB REFLECTION 224x18px.png
In Python, I get the following result:
{u'core': u'F2 ROUND TAB REFLECTION 224x18px', u'extra': None}
instead of
{u'core': u'F2 ROUND TAB REFLECTION ', u'extra': u'224x18px'}
As far as I know, regex quantifiers are greedy by default in Python, so I think it should work.
What am I doing wrong?
Upvotes: 0
Views: 340
Reputation: 92627
Add a ? after the greedy .* inside your core group so that it becomes lazy:
import re
x = "UI SCREEN 5-1 F2 ROUND TAB REFLECTION 224x18px.png"
# The dot before png is escaped so that it only matches a literal "."
re.search(r'.*\b(?P<core>[A-Z][0-9]?\b.*?)(?P<extra>\b[0-9]+[xX][0-9]+.*)?\.png', x).groups()
# OUTPUT
('F2 ROUND TAB REFLECTION ', '224x18px')
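For a side-by-side check, here is a minimal sketch comparing the pattern from the question with the lazy version. The names greedy and lazy are only illustrative, and the dot in \.png is escaped in both patterns.

```python
import re

s = "UI SCREEN 5-1 F2 ROUND TAB REFLECTION 224x18px.png"

# Pattern from the question: the .* inside core is greedy.
greedy = re.compile(
    r'.*\b(?P<core>[A-Z][0-9]?\b.*)(?P<extra>\b[0-9]+[xX][0-9]+.*)?\.png'
)
# Same pattern, but the .* inside core is made lazy with a trailing ?.
lazy = re.compile(
    r'.*\b(?P<core>[A-Z][0-9]?\b.*?)(?P<extra>\b[0-9]+[xX][0-9]+.*)?\.png'
)

greedy_groups = greedy.search(s).groupdict()
lazy_groups = lazy.search(s).groupdict()

print(greedy_groups)  # core holds the dimensions too, extra is None
print(lazy_groups)    # core stops before the dimensions, extra is '224x18px'
```

The optional extra group can only capture something if core stops early enough to leave it some text, which is what the lazy quantifier allows.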
Upvotes: 1
Reputation: 1031
Could you post the regular expression exactly as you are using it? I can't see the group names in your regex.
>>> import re
>>> a = "UI SCREEN 5-1 F2 ROUND TAB REFLECTION 224x18px.png"
>>> re.match(r'(?P<core>[A-Z0-9 -]+) (?P<extra>[0-9]+[xX][0-9]+px)\.png', a).groups()
('UI SCREEN 5-1 F2 ROUND TAB REFLECTION', '224x18px')
Upvotes: 0
Reputation: 23206
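The expression (?P<core>[A-Z][0-9]?\b.*) probably doesn't do what you think it does. It will match an uppercase letter, an optional digit, a word boundary, and then any run of characters. Because that final .* is greedy, it swallows everything up to your terminating .png (which should be written \.png).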
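To see the swallowing in isolation, here is a small sketch that runs the core sub-pattern on its own, without the leading .* or the extra group. The variable names alone and with_suffix are just for illustration.

```python
import re

s = "UI SCREEN 5-1 F2 ROUND TAB REFLECTION 224x18px.png"

# The core sub-pattern alone: the greedy .* runs to the end of the string.
alone = re.search(r'\b(?P<core>[A-Z][0-9]?\b.*)', s).group('core')

# With a terminating \.png, the .* backtracks just far enough to release ".png".
with_suffix = re.search(r'\b(?P<core>[A-Z][0-9]?\b.*)\.png', s).group('core')

print(alone)        # includes the dimensions and the .png suffix
print(with_suffix)  # includes the dimensions, but not the .png suffix
```

In both cases the dimensions end up inside core, so an optional group placed after it has nothing left to capture.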
Upvotes: 1