Reputation: 11
I want to match an optionally double-quoted string with a regular expression, using Python's re module.
The expression should give the following results:
"Assets".
=> Should Match
Assets.
=> Should Match
"Assets.
=> Shouldn't Match
Assets".
=> Shouldn't Match
I tried to achieve this using a backreference in the regular expression:
("?)Assets\1
However, it matches even if there is no matching end quote.
"Assets.
-> neglects initial quote ", and matches the rest of the word.
What would be the right expression for this?
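A minimal reproduction of the problem, assuming the pattern is applied with re.search (the question does not say which function was used). It shows that the optional group can capture an empty string, so the backreference also matches nothing and the search simply starts after the quote:

```python
import re

# Pattern from the question. Using re.search here is an assumption.
pattern = re.compile(r'("?)Assets\1')

m = pattern.search('"Assets.')
# Group 1 is empty, so \1 matches the empty string and the
# match begins at index 1, right after the unbalanced quote.
print(repr(m.group(1)), m.group(0), m.span())

# re.match only tries position 0, where the unbalanced quote makes it fail.
print(pattern.match('"Assets.'))
```

So the backreference alone cannot reject the unbalanced case when the regex engine is free to start the match one character later.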
Upvotes: 1
Views: 210
Reputation: 12015
Your regexp pattern is almost correct. You just have to make sure there is no quote immediately before or after the match, which a negative lookbehind and a negative lookahead enforce. So use the pattern r'(?<!")("?)Assets\1(?!")'
>>> import re
>>> words = ['"Assets"', 'Assets', '"Assets', 'Assets"']
>>> ptrn = re.compile(r'(?<!")("?)Assets\1(?!")')
>>> [bool(ptrn.match(word)) for word in words]
[True, True, False, False]
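The same lookaround idea can be sketched for the exact strings in the question (with the trailing period) using re.search instead of re.match. The helper name quoted_or_bare is hypothetical, and re.escape is added only so the word is treated literally:

```python
import re

def quoted_or_bare(word):
    """Hypothetical helper: build the lookaround pattern for a literal word."""
    return re.compile(r'(?<!")("?)' + re.escape(word) + r'\1(?!")')

ptrn = quoted_or_bare('Assets')

# The four sample strings from the question, in the same order.
samples = ['"Assets".', 'Assets.', '"Assets.', 'Assets".']
results = [bool(ptrn.search(s)) for s in samples]
print(results)
```

Because the lookbehind rejects any start position preceded by a quote, the search can no longer skip over an unbalanced opening quote.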
Upvotes: 1
Reputation: 73450
You can use the following pattern. Note that it simply lists the two cases (quoted and unquoted) as separate alternatives, because paired delimiters, like parentheses, are notoriously not regular in general (balanced nesting needs a context-free grammar) and, thus, difficult to handle with regular expressions:
>>> import re
>>> p = re.compile(r'^(?:"[^"]+"|[^"]+)$')
>>> bool(p.match('"assets"'))
True
>>> bool(p.match('"assets'))
False
>>> bool(p.match('assets'))
True
This also assumes that there are no characters before or after the string that is being matched.
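A minimal sketch of the same two-alternative idea using re.fullmatch, which requires the whole string to match and so makes the ^ and $ anchors unnecessary. The function name is_well_quoted is hypothetical, and the fourth test string (closing quote only) is not in the session above:

```python
import re

# Either a fully quoted run of non-quote characters, or a bare run.
WHOLE = re.compile(r'"[^"]+"|[^"]+')

def is_well_quoted(s):
    """Hypothetical helper: True if s is quoted on both sides or on neither."""
    return WHOLE.fullmatch(s) is not None

checks = [is_well_quoted(s) for s in ['"assets"', '"assets', 'assets', 'assets"']]
print(checks)
```

Like the pattern above, this accepts any quote-free text, not only a specific word, so it only answers whether the quoting is balanced.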
Upvotes: 2