Belmin Fernandez

Reputation: 8637

In Python Regular Expressions, how can I disregard an expression inside a captured group?

I have a string that looks like this:

<name>-<gender>-<age>.jpg

I want to be very liberal about what I accept. The requirements are:

  1. The <name> component is required.
  2. Must have the .jpg file extension
  3. You may leave a component blank or leave it out entirely as long as the end result is one of the following permutations:
    • <name>.jpg
    • <name>-<gender>.jpg
    • <name>-<gender>-<age>.jpg

Examples of what is considered valid:

Beamin-M.jpg
Jean.jpg
Maria-F-23.jpg

I want to break down each component of the string using regular expressions, but I do not want to capture the dash (-). I tried using non-capturing groups but was not able to get the results I was looking for:

>>> import re
>>> r = re.compile(r'([^\-]*)((?:\-)[^\-]*)?((?:\-)[^\-]*)?\.jpg')
>>> for d in (
...  'Beamin-M.jpg',
...  'Jean.jpg',
...  'Maria-F-23.jpg',
... ):
...  print(r.match(d).groups())
...
('Beamin', '-M', None)
('Jean', None, None)
('Maria', '-F', '-23')

Does anyone have any suggestions?

Upvotes: 0

Views: 120

Answers (3)

Blender

Reputation: 298096

I'm not a huge fan of regex when there's a more logic-friendly solution readily available, so I'd try something like this:

from os.path import splitext    

test = '<name>-<gender>-<age>.jpg'

fname, ext = splitext(test) # works with names like 'xxx.yyy.jpg'
if ext in ('.jpg', '.jpeg'):
    name, gender, age = (fname.split('-') + [None, None])[:3]
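To see how the split-and-pad idea behaves on the filenames from the question, here is a minimal sketch that wraps it in a function. The name parse_filename and the choice to return None for other extensions are assumptions made for illustration, not part of the snippet above.

```python
from os.path import splitext


def parse_filename(filename):
    """Hypothetical helper: split '<name>-<gender>-<age>.jpg' into its parts."""
    fname, ext = splitext(filename)
    if ext not in ('.jpg', '.jpeg'):
        # Assumption: files with other extensions are simply rejected.
        return None
    # Pad with None so missing components still unpack into three values.
    name, gender, age = (fname.split('-') + [None, None])[:3]
    return name, gender, age


for sample in ('Beamin-M.jpg', 'Jean.jpg', 'Maria-F-23.jpg'):
    print(sample, parse_filename(sample))
# Beamin-M.jpg ('Beamin', 'M', None)
# Jean.jpg ('Jean', None, None)
# Maria-F-23.jpg ('Maria', 'F', '23')
```

Note that this approach does not check that the name component is non-empty; that check would need to be added if it matters.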

Upvotes: 6

nhahtdh

Reputation: 56809

Rewrite your regex as:

r'([^\-]*)(?:-([^\-]*))?(?:-([^\-]*))?\.jpg'

Technically, you don't need to escape the - inside the character class [], since it is the last character in the class. I left the escape in to be on the safe side.
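As a quick check, the pattern can be run against the three filenames from the question. This is only a sketch in Python 3 syntax; the variable names are chosen for illustration.

```python
import re

# The dash sits in a non-capturing group; only the text after it is captured.
pattern = re.compile(r'([^\-]*)(?:-([^\-]*))?(?:-([^\-]*))?\.jpg')

filenames = ('Beamin-M.jpg', 'Jean.jpg', 'Maria-F-23.jpg')
results = [pattern.match(f).groups() for f in filenames]

for f, groups in zip(filenames, results):
    print(f, groups)
# Beamin-M.jpg ('Beamin', 'M', None)
# Jean.jpg ('Jean', None, None)
# Maria-F-23.jpg ('Maria', 'F', '23')
```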

Upvotes: 2

Charles Merriam

Reputation: 20500

Huh?

You meant r'([^-]*)(?:(?:-)([^-]*))?(?:(?:-)([^-]*))?\.jpg'

Seriously, you are capturing the dash because it is inside the outer capturing parentheses.
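The effect of where the capturing parentheses sit can be seen on a single component. The sketch below isolates one optional piece of the pattern; the variable names are illustrative only.

```python
import re

# Capturing group on the outside: the dash is part of what gets captured.
dash_inside = re.match(r'((?:-)[^-]*)', '-23').group(1)

# Non-capturing group on the outside, capturing group around the value only.
dash_outside = re.match(r'(?:-([^-]*))', '-23').group(1)

print(repr(dash_inside))   # '-23'
print(repr(dash_outside))  # '23'
```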

Upvotes: 0
