marek_maras

Reputation: 23

python regex: looking for correct pattern with negative lookbehind

I am using a Python tool that checks git log commit messages to find out whether a feature with a given ID was introduced or reverted. I cannot change the tool's code; I can only provide suitable regexes as input. The input looks like this:

input_regexes = {
    "add_pattern": r".*\[\s*(ID\d{3})\s*\](.*)",
    "revert_pattern": r"[Rr]evert.*\[\s*(ID\d{3})\s*\](.*)",
}

The first capture group is used to get the feature ID and the second is used as the feature description. The problem is that when a string containing [Rr]evert appears, both patterns match. What I would like to achieve is:

In the following example, revert_pattern should match only revert_feature_message, and add_pattern should match only the strings in add_feature_messages:

revert_feature_message='Revert "[ID123] some cool feature."'
add_feature_messages=[
  '[ID123] some cool feature.',
  'some prefix [ID123] some cool feature'
]

I tried using:

(?<!Revert).*?\[\s*(ID\d{3})\s*\](.*)

as add_pattern, but it didn't work out. Could you help me correct it?

Upvotes: 1

Views: 57

Answers (1)

The fourth bird

Reputation: 163267

The revert pattern [Rr]evert.*\[\s*(ID\d{3})\s*\](.*) already matches only the revert_feature_message.

To match only the strings in add_feature_messages you can assert that the string does not contain revert or Revert.

^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)
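A minimal sketch to check this against the strings from the question. It assumes the tool applies the patterns with re.search (the question does not say), and the helper name matches is made up for illustration. It also runs the lookbehind attempt from the question for comparison: at the start of the string nothing precedes the current position, so (?<!Revert) succeeds there and the lazy .*? then consumes the word Revert itself.

```python
import re

# Patterns copied from the question and from the answer above.
REVERT_PATTERN = re.compile(r"[Rr]evert.*\[\s*(ID\d{3})\s*\](.*)")
ADD_PATTERN = re.compile(r"^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")
# The asker's lookbehind attempt, kept only for comparison.
LOOKBEHIND_ATTEMPT = re.compile(r"(?<!Revert).*?\[\s*(ID\d{3})\s*\](.*)")

revert_feature_message = 'Revert "[ID123] some cool feature."'
add_feature_messages = [
    "[ID123] some cool feature.",
    "some prefix [ID123] some cool feature",
]


def matches(pattern, message):
    """Return (feature_id, description) if the pattern matches, else None.

    Assumption: the real tool uses re.search; this helper is hypothetical.
    """
    m = pattern.search(message)
    return m.groups() if m else None


# The lookbehind attempt still matches the revert message.
print(matches(LOOKBEHIND_ATTEMPT, revert_feature_message))
# The lookahead version rejects it, while the revert pattern accepts it.
print(matches(ADD_PATTERN, revert_feature_message))
print(matches(REVERT_PATTERN, revert_feature_message))
# For each add message: add pattern result, then revert pattern result.
for message in add_feature_messages:
    print(matches(ADD_PATTERN, message), matches(REVERT_PATTERN, message))
```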


Or a bit more specific:

^(?!.*[Rr]evert [^][]*\[\s*ID\d{3}\s*]).*\[\s*(ID\d{3})\s*\](.*)
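As a rough illustration of what "more specific" buys you: the broad pattern rejects any message that contains the word revert anywhere, whereas this one only rejects messages where revert, a space, and optional non-bracket characters are followed by a bracketed ID. The commit message mentions_revert below is hypothetical, and re.search is again an assumption about how the tool applies the pattern.

```python
import re

# Both add patterns from the answer above.
BROAD_ADD_PATTERN = re.compile(r"^(?!.*[Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")
SPECIFIC_ADD_PATTERN = re.compile(
    r"^(?!.*[Rr]evert [^][]*\[\s*ID\d{3}\s*]).*\[\s*(ID\d{3})\s*\](.*)"
)

revert_feature_message = 'Revert "[ID123] some cool feature."'
# Hypothetical message (not from the question): a feature whose description
# happens to contain the word "revert".
mentions_revert = "[ID456] add revert button"

for name, pattern in [("broad", BROAD_ADD_PATTERN), ("specific", SPECIFIC_ADD_PATTERN)]:
    for message in (revert_feature_message, mentions_revert):
        m = pattern.search(message)  # assumption: the tool uses re.search
        print(name, repr(message), m.groups() if m else None)
```

Only the specific pattern treats the hypothetical message as an added feature; both reject the actual revert message.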


If Revert is always at the start of the string, you can omit the leading .* inside the lookahead.
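One way to read that last suggestion is the pattern below, where the lookahead only inspects the beginning of the message. This is a sketch under that reading, not a pattern given in the answer; the third message is hypothetical, and re.search is assumed as before.

```python
import re

# Assumed form of the simplified pattern: the lookahead has no leading .*,
# so only a message that *starts* with Revert/revert is rejected.
ANCHORED_ADD_PATTERN = re.compile(r"^(?![Rr]evert).*\[\s*(ID\d{3})\s*\](.*)")

messages = [
    'Revert "[ID123] some cool feature."',    # from the question
    "some prefix [ID123] some cool feature",  # from the question
    "[ID789] allow users to revert changes",  # hypothetical
]

results = {}
for message in messages:
    m = ANCHORED_ADD_PATTERN.search(message)
    results[message] = m.groups() if m else None
    print(repr(message), results[message])
```

The trade-off is that a revert message with any prefix before the word Revert would slip through as an added feature, so this variant only fits if that never happens in your log.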

Upvotes: 1
