Reputation: 45
([0-9a-zA-Z_\-]+)/?([^.\/]+)[\.php|\.html]
For this expression, why does the string 'people' match but 'person' does not?
Upvotes: 2
Views: 129
Reputation: 38432
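Because what you've got at the end in square brackets should be in parentheses. Square brackets define a character class, which matches a single character, so [\.php|\.html] matches any one of '.', 'p', 'h', '|', 't', 'm' or 'l'. 'people' matches up to the 'l', which is in the class. 'person' fails because the only character it has from the class is the leading 'p', and the two groups before the class each need at least one character first.
If you don't want to force a match with either .php or .html, follow the parenthesized group with a question mark.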
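To see this for yourself, here is a small sketch that runs the pattern from the question unchanged (only written as a raw string) and prints which part of each input it matches. The variable names are mine, not from the question.

```python
import re

# The pattern from the question. The trailing [...] is a character class:
# it matches exactly ONE character out of  . p h | t m l
original = re.compile(r'([0-9a-zA-Z_\-]+)/?([^.\/]+)[\.php|\.html]')

people = original.match('people')
person = original.match('person')

print(people.group())   # peopl  (the match stops at the 'l')
print(people.groups())  # ('peo', 'p')
print(person)           # None   (no class character after the first two letters)
```

The trailing 'e' of 'people' is never part of the match, which is the hint that the square brackets are not doing what was intended.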
Here is a possible revised regex: ([0-9a-zA-Z_\-]+)/?([^.\/]+)(\.php|\.html)?
>>> import re
>>> p=r'([0-9a-zA-Z_\-]+)/?([^.\/]+)(\.php|\.html)?'
>>> p=re.compile(p)
>>> p.match('person')
<_sre.SRE_Match object at 0x9bac0c0>
>>> p.match('people')
<_sre.SRE_Match object at 0x9bac2f0>
>>> p.match('people').group()
'people'
>>> p.match('person').group()
'person'
Use the match.group() method, or its equivalent in your favorite language, to see which part of the string a regex is actually matching. It can be very illuminating.
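Going one step further, groups() shows what each capture group took, not just the overall match. A minimal sketch using the revised regex; 'dir/page.php' is a made-up input for illustration, not something from the question.

```python
import re

revised = re.compile(r'([0-9a-zA-Z_\-]+)/?([^.\/]+)(\.php|\.html)?')

# 'dir/page.php' is a hypothetical input chosen to exercise all three groups.
for text in ('person', 'dir/page.php'):
    m = revised.match(text)
    print(text, '->', m.group(), m.groups())

# person -> person ('perso', 'n', None)
# dir/page.php -> dir/page.php ('dir', 'page', '.php')
```

Note how 'person' is split as 'perso' plus 'n': the first group has to give back one character so that the second group, which requires at least one, can match.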
>>> p=re.compile(r'([0-9a-zA-Z_\-]+)(|\.html|\.php)$')
>>> p.match('ddd').group()
'ddd'
>>> p.match('ddd.html').group()
'ddd.html'
>>> p.match('ddd.jpeg').group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
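The AttributeError above is just match() returning None when nothing matches. In real code you would check for that before calling group(). A minimal sketch, where matched_text is a hypothetical helper name of my own:

```python
import re

simple = re.compile(r'([0-9a-zA-Z_\-]+)(|\.html|\.php)$')

def matched_text(text):
    """Return the matched text, or None when the pattern does not match."""
    m = simple.match(text)
    return m.group() if m is not None else None

for name in ('ddd', 'ddd.html', 'ddd.jpeg'):
    print(name, '->', matched_text(name))

# ddd -> ddd
# ddd.html -> ddd.html
# ddd.jpeg -> None
```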
Upvotes: 3