fuminlu

Reputation: 45

Regular Expression Matches

([0-9a-zA-Z_\-]+)/?([^.\/]+)[\.php|\.html]

For this expression, why does the string 'people' match but 'person' does not?

Upvotes: 2

Views: 129

Answers (1)

jcomeau_ictx

Reputation: 38432

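Because what you've got at the end in square brackets should be in parentheses. Square brackets define a character class, which matches exactly one character from the set . p h | t m l, not the alternatives .php or .html. 'people' matches up to the 'l', which is in that set, but 'person' has no character from the set after its first two characters, which the two capture groups have to consume.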
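To see the character class behaviour directly, here is a minimal sketch that runs the original pattern from the question against both strings. The variable name `original` is only for illustration.

```python
import re

# The pattern exactly as posted in the question. The trailing [...] is a
# character class, so it consumes ONE character out of: . p h | t m l
original = re.compile(r'([0-9a-zA-Z_\-]+)/?([^.\/]+)[\.php|\.html]')

for word in ('people', 'person'):
    m = original.match(word)
    if m:
        # group() is the full matched text, groups() the captured pieces
        print(word, '->', repr(m.group()), m.groups())
    else:
        print(word, '-> no match')
```

For 'people' the match stops at the 'l' (the matched text is 'peopl'), and for 'person' there is no match at all.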

And if you don't want to force a match with either .php or .html, you should follow that group with a question mark.

Here is a possible revised regex: ([0-9a-zA-Z_\-]+)/?([^.\/]+)(\.php|\.html)?

>>> import re
>>> p=r'([0-9a-zA-Z_\-]+)/?([^.\/]+)(\.php|\.html)?'
>>> p=re.compile(p)
>>> p.match('person')
<_sre.SRE_Match object at 0x9bac0c0>
>>> p.match('people')
<_sre.SRE_Match object at 0x9bac2f0>
>>> p.match('people').group()
'people'
>>> p.match('person').group()
'person'

Use the match object's group() method, or its equivalent in your favorite language, to see what part of the input a regex is actually matching. It can be very illuminating.
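As a rough illustration of that advice, the sketch below prints both the whole match and the individual capture groups for the revised pattern. The helper name `describe` and the sample inputs 'dir/page.html' and 'page.html' are made up for this example and are not from the question.

```python
import re

# Revised pattern from above: the extension is now an optional group.
revised = re.compile(r'([0-9a-zA-Z_\-]+)/?([^.\/]+)(\.php|\.html)?')

def describe(text):
    """Return (whole match, tuple of groups), or None if nothing matches."""
    m = revised.match(text)
    if m is None:
        return None
    return m.group(0), m.groups()

for sample in ('person', 'dir/page.html', 'page.html'):
    print(sample, '->', describe(sample))
```

Note how 'page.html' is split as 'pag' and 'e': the second group needs at least one character, so the first group has to give one back. Looking at groups() makes that kind of backtracking visible.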


For the revised question in the comments:

>>> p=re.compile(r'([0-9a-zA-Z_\-]+)(|\.html|\.php)$')
>>> p.match('ddd').group()
'ddd'
>>> p.match('ddd.html').group()
'ddd.html'
>>> p.match('ddd.jpeg').group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
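The traceback above is just match() returning None for a string that doesn't fit. In real code you would probably check for that first. A minimal sketch, assuming you want the base name and extension back (the helper name `split_name` is hypothetical):

```python
import re

# Same anchored pattern as above: a name followed by nothing, .html, or .php
anchored = re.compile(r'([0-9a-zA-Z_\-]+)(|\.html|\.php)$')

def split_name(filename):
    """Return (name, extension) if the filename is acceptable, else None."""
    m = anchored.match(filename)
    if m is None:
        # match() returns None on failure, so never call .group() blindly
        return None
    return m.group(1), m.group(2)

for sample in ('ddd', 'ddd.html', 'ddd.php', 'ddd.jpeg'):
    print(sample, '->', split_name(sample))
```

An empty string as the extension means the name had no suffix, and None means the string was rejected.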

Upvotes: 3
