Reputation: 35305
I am trying to come up with a Ruby regex that will match the following strings:
MAINT: Refactor something
STRY-1: Add something
STRY-2: Update something
But should not match the following:
MAINT: Refactored something
STRY-1: Added something
STRY-2: Updated something
MAINT: Refactoring something
STRY-3: Adding something
STRY-4: Updating something
Basically, the first word after the colon should not end with either ed or ing.
I have been using the following regex for GitLab commit messages for a while now.
^(MAINT|(STRY|PRB)-\d+):\s(?:(?!(?:ed|ing)\b)[A-Za-z])+\s([a-zA-Z0-9._\-"].*)
However, recently they seem to have switched to google/re2, which does not support lookahead.
Would it be possible to rewrite this regex so that lookahead is not used?
Upvotes: 0
Views: 1067
Reputation: 110725
str =<<_
MAINT: Refactor something
STRY-1: Added something
MAINT: Refactoring something
Add something
STRY-3: Adding something
STRY-1: Add something
MAINT: Refactored something
Refactor something
STRY-4: Updating something
STRY-9: Update something
STRY-2: Updated something
_
r = /
  ^                      # match beginning of line
  (?:                    # begin non-capture group
    MAINT\:[ ]+Refactor  # match string
    |                    # or
    STRY-\d+\:[ ]+       # match string
    (?:Add|Update)       # match 'Add' or 'Update'
  )                      # end non-capture group
  [ ]+something          # match one or more spaces followed by 'something'
  $                      # match end of line
/x                       # free-spacing regex definition mode
str.scan(r)
  #=> ["MAINT: Refactor something",
  #    "STRY-1: Add something",
  #    "STRY-9: Update something"]
To match a space in the regular expression I've used a character class containing a space ([ ]). That's needed because free-spacing mode removes spaces that are not in character classes. Written in the conventional way, the regular expression is as follows.
/^(?:MAINT\: +Refactor|STRY-\d+\: +(?:Add|Update)) +something$/
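As a quick check, the one-line form can be run against the same sample lines. This is only a sketch: the constant names (LINES, COMPACT, MATCHED) are made up for illustration, and the pattern is an allowlist of the three verbs in the sample, so any other verb would need to be added to it by hand.

```ruby
# Sample lines copied from the heredoc above, one message per element.
LINES = [
  "MAINT: Refactor something",
  "STRY-1: Added something",
  "MAINT: Refactoring something",
  "Add something",
  "STRY-3: Adding something",
  "STRY-1: Add something",
  "MAINT: Refactored something",
  "Refactor something",
  "STRY-4: Updating something",
  "STRY-9: Update something",
  "STRY-2: Updated something"
]

# The conventional (non free-spacing) form of the regex. No lookahead is used.
COMPACT = /^(?:MAINT\: +Refactor|STRY-\d+\: +(?:Add|Update)) +something$/

# Array#grep keeps the elements that match the pattern, in their original order.
MATCHED = LINES.grep(COMPACT)
p MATCHED
```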
Upvotes: 1
Reputation: 48741
You are dealing with a regex which has to be aware of three endings:
ed\b
ing\b
ied\b
You have to account for each position at which one of these endings can fail to match. For instance, e[^d]\b and [^e]d\b. Writing all of them out, you arrive at this regex:
^(MAINT|(STRY|PRB)-\d+):\s*(?i:\w*(e[a-ce-z]|[a-df-z]d|i(n[a-fh-z]|[a-mo-z]g|e[a-ce-z]|[a-df-z]d)|[a-hj-z]ng|[a-hj-z][a-df-mo-z][a-cefh-z])|\w)\s([a-zA-Z0-9._\-"].*)
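To make the position-by-position idea easier to follow, here is a smaller sketch that handles only the two endings from the question (ed and ing; ied is already covered by ed). It is not the regex above but a separately written, case-sensitive pattern, and the names LETTER, VERB_ALTERNATIVES, VERB, PATTERN, GOOD and BAD are invented for the example. It is checked here with Ruby's own engine; it avoids lookaround entirely, so it should be acceptable to RE2 as well, though that is an assumption worth verifying in GitLab itself.

```ruby
LETTER = "[A-Za-z]"

# A verb is accepted when its last letters rule out both "ed" and "ing".
VERB_ALTERNATIVES = [
  "#{LETTER}*[A-Za-cefh-z]",      # last letter is neither d nor g
  "(?:#{LETTER}*[A-Za-df-z])?d",  # ends in d, but the letter before is not e
  "(?:#{LETTER}*[A-Za-mo-z])?g",  # ends in g, but the letter before is not n
  "(?:#{LETTER}*[A-Za-hj-z])?ng"  # ends in ng, but the letter before is not i
]
VERB = "(?:#{VERB_ALTERNATIVES.join('|')})"

# Same prefix and tail as the regex in the question, with the lookahead replaced.
PATTERN = /^(MAINT|(STRY|PRB)-\d+):\s#{VERB}\s([a-zA-Z0-9._\-"].*)/

GOOD = ["MAINT: Refactor something", "STRY-1: Add something", "STRY-2: Update something"]
BAD  = ["MAINT: Refactored something", "STRY-1: Added something", "STRY-2: Updated something",
        "MAINT: Refactoring something", "STRY-3: Adding something", "STRY-4: Updating something"]

GOOD.each { |msg| puts "#{msg.inspect} accepted: #{PATTERN.match?(msg)}" }
BAD.each  { |msg| puts "#{msg.inspect} accepted: #{PATTERN.match?(msg)}" }
```

Like the lookahead version, this also rejects verbs that merely happen to end in ing, such as "Bring".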
Upvotes: 1
Reputation: 80085
Without regular expression:
str = "MAINT: Refactor something
STRY-1: Add something
STRY-2: Update something"
p str.lines.none?{|line| line.split[1].end_with?("ed", "ing")}
# => true
Upvotes: -1