ewok

Reputation: 21463

regular expression to say either string ends or continues with specific character

I want to write a regex that will match if the string starts with "PR-\d+", but then either the string ends, or the next character is a hyphen. So, for instance, the following would match:

PR-123
PR-123-foo

But the following would not:

PR-123a
PR-
PR-foo

I tried re.match(r'PR-\d+[-$]', st), but that didn't work. It appears to look for a literal dollar sign character rather than the end of the string.

How can I write this expression?

Upvotes: 2

Views: 873

Answers (3)

Wiktor Stribiżew

Reputation: 626870

A dollar sign inside a character class is parsed as a literal $ char. You need to use an alternation group, or a positive lookahead.
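As a quick check of that claim, the sketch below runs the pattern from the question against two inputs. The string 'PR-123$' is a made-up input used only to show that the class matches a real dollar sign.

```python
import re

# The pattern from the question: $ sits inside a character class.
pattern = re.compile(r'PR-\d+[-$]')

# Inside [...] the $ is a literal character, so an actual dollar sign matches.
literal_dollar = pattern.match('PR-123$')  # hypothetical input, for illustration

# The end of the string is not a character, so the class cannot match it.
end_of_string = pattern.match('PR-123')

print(literal_dollar is not None)
print(end_of_string is not None)
```

The class always has to consume one character, which is why 'PR-123' on its own is rejected.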

Here is a version with a non-capturing group:

re.match(r'PR-\d+(?:-|$)', st) 


A positive lookahead version:

re.match(r'PR-\d+(?=-|$)', st)

Or an equivalent negative lookahead combined with a negated character class, which avoids the alternation and makes the pattern slightly more efficient:

re.match(r'PR-\d+(?![^-])', st)

The only difference is what the match returns: the non-capturing group version consumes the - and includes it in the match value, whereas the two lookahead versions only check what follows and leave the - out of the match. There is no difference if you are just checking whether the string matches.
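To make the difference in the returned values concrete, here is a small sketch that applies the three patterns to one of the sample strings from the question. The variable names are only illustrative.

```python
import re

st = 'PR-123-foo'

# The non-capturing group consumes the hyphen.
with_group = re.match(r'PR-\d+(?:-|$)', st)

# Lookaheads only assert what comes next; they consume nothing.
with_lookahead = re.match(r'PR-\d+(?=-|$)', st)
with_negative = re.match(r'PR-\d+(?![^-])', st)

print(with_group.group())      # PR-123-
print(with_lookahead.group())  # PR-123
print(with_negative.group())   # PR-123
```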

And just FYI: re.match only looks for a match at the start of the string, which is why there is no need for ^ at the start of the pattern. If you were to use re.search or another non-anchoring method, you would have to prepend ^ or \A to the pattern to anchor it at the start of the string.
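A minimal sketch of that anchoring behaviour, using a made-up string in which the PR reference does not sit at the start:

```python
import re

st = 'see PR-123'  # hypothetical input: the reference is not at the start

# re.search scans the whole string, so the unanchored pattern finds a match.
unanchored = re.search(r'PR-\d+(?:-|$)', st)

# Prepending ^ restricts re.search to the start of the string.
anchored = re.search(r'^PR-\d+(?:-|$)', st)

# re.match is already limited to the start, with or without ^.
via_match = re.match(r'PR-\d+(?:-|$)', st)

print(unanchored is not None)
print(anchored is not None)
print(via_match is not None)
```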

Upvotes: 3

Casimir et Hippolyte

Reputation: 89557

You can use a double negation with a negative lookahead and a negated character class:

re.match(r'PR-\d+(?![^-])', st)

In plain English: not followed by a character that isn't a hyphen.

This description covers both cases: followed by a hyphen, or followed by the end of the string.
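As a sanity check, the sketch below runs the double-negation pattern against the five sample strings from the question. The names samples and results are just illustrative.

```python
import re

pattern = re.compile(r'PR-\d+(?![^-])')

# The five examples given in the question.
samples = ['PR-123', 'PR-123-foo', 'PR-123a', 'PR-', 'PR-foo']

# Map each sample to whether the pattern matches at the start of the string.
results = {s: pattern.match(s) is not None for s in samples}

for s, matched in results.items():
    print(s, matched)
```

Only the first two strings match. 'PR-123a' is rejected even after backtracking, because every shorter run of digits is still followed by a character that is not a hyphen.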


Upvotes: 3

Geancarlo Murillo

Reputation: 507

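re.match(r"PR-\d+(?:-[-\w+]*)?$", st)

Anything after the digits has to start with a hyphen, and the $ anchors the match at the end of the string. Note that only word characters, + and - are allowed after that hyphen.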

Upvotes: -2
