Reputation: 8637
I have a string that looks like this:
<name>-<gender>-<age>.jpg
I want to be very liberal as far as what I accept. The requirements are:
The <name> component is required.
The .jpg file extension is required.
This means the following forms are accepted:
<name>.jpg
<name>-<gender>.jpg
<name>-<gender>-<age>.jpg
Examples of what is considered valid:
Beamin-M.jpg
Jean.jpg
Maria-F-23.jpg
I want to break down each component of the string using regular expressions, but I do not want to capture the dash (-). I tried using non-capturing groups but was not able to get the results I was looking for:
>>> import re
>>> r = re.compile(r'([^\-]*)((?:\-)[^\-]*)?((?:\-)[^\-]*)?\.jpg')
>>> for d in (
...     'Beamin-M.jpg',
...     'Jean.jpg',
...     'Maria-F-23.jpg',
... ):
...     print(r.match(d).groups())
...
('Beamin', '-M', None)
('Jean', None, None)
('Maria', '-F', '-23')
Does anyone have any suggestions?
Upvotes: 0
Views: 120
Reputation: 298096
I'm not a huge fan of regex when there's a more logic-friendly solution readily available, so I'd try something like this:
from os.path import splitext
test = '<name>-<gender>-<age>.jpg'
fname, ext = splitext(test)  # works with names like 'xxx.yyy.jpg'
if ext in ('.jpg', '.jpeg'):
    # Pad with None so a missing gender or age still unpacks cleanly.
    name, gender, age = (fname.split('-') + [None, None])[:3]
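The same idea can be wrapped in a small function. This is only a sketch, assuming Python 3; the name `parse_filename` and the choice to return `None` for an unsupported extension or an empty name are illustrative assumptions, not part of the snippet above.

```python
from os.path import splitext


def parse_filename(filename):
    """Split '<name>-<gender>-<age>.jpg' into (name, gender, age).

    Missing trailing components come back as None. Returns None when the
    extension is not .jpg/.jpeg or when the name component is empty.
    """
    fname, ext = splitext(filename)
    if ext not in ('.jpg', '.jpeg'):
        return None
    # Pad with None so unpacking always sees exactly three items.
    name, gender, age = (fname.split('-') + [None, None])[:3]
    if not name:
        return None
    return name, gender, age


for sample in ('Beamin-M.jpg', 'Jean.jpg', 'Maria-F-23.jpg'):
    print(parse_filename(sample))
```

Note that any components beyond the third are silently dropped by the `[:3]` slice, which fits the "very liberal" requirement in the question.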
Upvotes: 6
Reputation: 56809
Rewrite your regex as:
r'([^\-]*)(?:-([^\-]*))?(?:-([^\-]*))?\.jpg'
Technically, you don't need to escape - in the character class [], since it is the last character in the class. But I'll just leave it there to be on the safe side.
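As a quick check, the rewritten pattern can be run against the three sample filenames from the question. This is a minimal sketch assuming Python 3; the variable names `pattern` and `results` are chosen here for illustration.

```python
import re

# Each dash sits in a non-capturing group (?:...); only the text that
# follows the dash is inside capturing parentheses.
pattern = re.compile(r'([^\-]*)(?:-([^\-]*))?(?:-([^\-]*))?\.jpg')

results = [pattern.match(s).groups()
           for s in ('Beamin-M.jpg', 'Jean.jpg', 'Maria-F-23.jpg')]
for groups in results:
    print(groups)
```

Since the question says the name component is required, using `[^\-]+` instead of `[^\-]*` for the first group may be closer to the stated requirement; that is a suggestion rather than part of the pattern above.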
Upvotes: 2
Reputation: 20500
Huh?
You meant r'([^-]*)(?:(?:-)([^-]*))?(?:(?:-)([^-]*))?\.jpg'
Seriously, you are capturing the dash because it is inside the outer capturing parentheses.
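The effect of where the dash sits relative to the capturing parentheses can be seen on a reduced two-component pattern. This is a small sketch assuming Python 3; the shortened patterns and the names `outer` and `inner` are illustrative, not taken from the posts.

```python
import re

# Dash inside the capturing parentheses: it becomes part of the group.
outer = re.match(r'([^-]*)(-[^-]*)?\.jpg', 'Beamin-M.jpg').groups()

# Dash in a non-capturing group, with the capture nested after it:
# the dash is consumed by the match but not returned.
inner = re.match(r'([^-]*)(?:-([^-]*))?\.jpg', 'Beamin-M.jpg').groups()

print(outer)  # ('Beamin', '-M')
print(inner)  # ('Beamin', 'M')
```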
Upvotes: 0