Reputation: 4676
I need to match the following inputs:
foo_abc_bar
foo_bar
and get "abc" or an empty string as the result.
So this is the regular expression I wrote:
r'foo_(abc|)[_|]bar'
But for some reason, this does not match the second string.
On further inspection, I found that [_|]
does not match an empty string.
So, how do I solve this problem?
Upvotes: 1
Views: 3468
Reputation: 500683
To make abc_
optional, you could use the question mark operator:
(abc_)?
Thus, the entire regex becomes:
r'foo_(abc_)?bar'
With this regex, the second underscore (if present) will become part of the capture group. If you don't want that, you could either remove it post-match with .rstrip('_')
or use a slightly more complex regex:
r'foo_(?:(abc)_)?bar'
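As a quick check of the regex above, here is a minimal sketch. The helper name extract and the use of re.fullmatch are my own assumptions for illustration; the question does not say how the pattern is applied.

```python
import re

# Pattern from the answer: the non-capturing group (?:...) makes "abc_"
# optional as a whole, while the inner group captures only "abc".
PATTERN = re.compile(r'foo_(?:(abc)_)?bar')

def extract(s):
    """Return 'abc' or '' for a matching string, and None if there is no match.

    extract is a hypothetical helper name, not something from the question.
    """
    m = PATTERN.fullmatch(s)
    if m is None:
        return None
    # group(1) is None when the optional part did not participate in the match.
    return m.group(1) or ''

print(extract('foo_abc_bar'))      # abc
print(repr(extract('foo_bar')))    # ''
```

Note that an optional group that did not take part in the match yields None rather than an empty string, hence the `or ''`.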
I found that
[_|]
does not match an empty string.
That's right. Square brackets denote a character class (also called a character set). The [_|]
matches exactly one character, either an underscore or a vertical bar, and nothing else. In other words, the vertical bar loses its special meaning when it appears inside a character class.
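To see this behaviour directly, the following sketch tests the bracket expression on its own and then the original pattern from the question. The variable names are illustrative only.

```python
import re

# Inside square brackets, "|" is a literal character, not alternation.
char_class = re.compile(r'[_|]')

print(bool(char_class.fullmatch('_')))   # True
print(bool(char_class.fullmatch('|')))   # True
print(bool(char_class.fullmatch('')))    # False: exactly one character is required

# The original pattern from the question therefore needs a character
# between the group and "bar".
original = re.compile(r'foo_(abc|)[_|]bar')

print(original.fullmatch('foo_bar'))                     # None
print(original.fullmatch('foo_abc_bar').group(1))        # abc
print(repr(original.fullmatch('foo__bar').group(1)))     # ''
```

The last line shows that the original pattern only yields an empty capture when a stray underscore (or vertical bar) sits before "bar".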
Upvotes: 5
Reputation: 263803
If you want to match strings of this general shape:
xxx_xxx_xxx
xxx_xxx
then you could use
([A-Za-z]{3})((_[A-Za-z]{3})+)?
but for your specific input, this will also work:
r'foo(_abc)?_bar'
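Here ? means the preceding group is optional (it may or may not be present). Note that with this pattern the captured group includes the leading underscore.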
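A small sketch of how this variant behaves. The helper name middle and the lstrip step are assumptions added for illustration, since the captured text here is "_abc" rather than "abc".

```python
import re

# Variant from this answer: the optional group includes the leading underscore.
pattern = re.compile(r'foo(_abc)?_bar')

def middle(s):
    """Return 'abc' or '' for a matching string, None otherwise.

    middle is a hypothetical helper; it strips the underscore that this
    pattern captures along with "abc".
    """
    m = pattern.fullmatch(s)
    if m is None:
        return None
    return (m.group(1) or '').lstrip('_')

print(pattern.fullmatch('foo_abc_bar').group(1))   # _abc
print(middle('foo_abc_bar'))                       # abc
print(repr(middle('foo_bar')))                     # ''
```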
Upvotes: 1