Exocomp

Reputation: 1537

Regex ignore part of the string in matches

Suppose I have a tags object as such:

["warn-error-fatal-failure-exception-ok","parsefailure","anothertag","syslog-warn-error-fatal-failure-exception-ok"]

I would like to be able to use regex to match on "failure" but exclude "warn-error-fatal-failure-exception-ok".

So in the above case, a regex search for "failure" should match only the "failure" in parsefailure and ignore the rest.

How can this be accomplished using regex?

NOTE: The regex has to exclude the whole string "warn-error-fatal-failure-exception-ok"

Upvotes: 3

Views: 17663

Answers (2)

Luis Guzman

Reputation: 1026

EDIT

After documenting the answer below, I realized that maybe what you are looking for is:

(?<!warn-error-fatal-)failure(?!-exception-ok)

So I'm adding it here in case it fits your needs better. This regex just looks for "failure", but uses a Negative Lookbehind and a Negative Lookahead to require that "failure" is neither preceded by "warn-error-fatal-" nor followed by "-exception-ok".
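As a minimal sketch of how this regex behaves, the Python snippet below applies it to each tag from the question separately. The variable names (`tags`, `pattern`, `matching_tags`) are illustrative assumptions, and Python's `re` module is just one engine that supports this fixed-width lookbehind.

```python
import re

# The tags from the question, as a plain Python list.
tags = [
    "warn-error-fatal-failure-exception-ok",
    "parsefailure",
    "anothertag",
    "syslog-warn-error-fatal-failure-exception-ok",
]

# "failure" not preceded by "warn-error-fatal-" and not followed by "-exception-ok".
pattern = re.compile(r"(?<!warn-error-fatal-)failure(?!-exception-ok)")

# Keep only the tags in which the pattern finds a match.
matching_tags = [tag for tag in tags if pattern.search(tag)]
print(matching_tags)  # ['parsefailure']
```

Note that both lookarounds reject a match independently, so a tag such as "xfailure-exception-ok" would also be skipped.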

ANSWER DEVELOPED FROM COMMENTS:

The following regex captures the "failure" substring in the "parsefailure" tag, and it puts it in Group 1:

^.*"(?![^"]*warn-error-fatal-failure-exception-ok[^"]*)[^"]*(failure)[^"]*".*$

DETAIL

I will break the regex into parts and explain each one. First, let's forget about everything in between the first set of parentheses, and let's just look at the rest.

^.*"[^"]*(failure)[^"]*".*$

The important part of the regex is the capture group: the word "failure", which is itself part of a tag surrounded by double-quotes. The regular expression above matches the whole test string, but its core is a tag surrounded by double-quotes and containing the substring "failure".

^.*" matches any character from the beginning of the string to a quote

"[^"]*(failure)[^"]*" matches a tag surrounded by double-quotes and containing the substring "failure". Literally: a double-quote, followed by zero or more characters that are not double-quotes, followed by "failure", followed by zero or more characters that are not double-quotes, followed by a double-quote. The parentheses capture the word "failure" in group 1.

".*$ matches any character from the double-quote to the end of the test string

Because [^"]*(failure)[^"]* matches any tag containing the substring "failure", ^.*"[^"]*(failure)[^"]*".*$ will capture "failure" from whichever such tag the regex engine settles on first. Since the leading .* is greedy, that is the last tag containing the substring, here syslog-warn-error-fatal-failure-exception-ok, which is not what we want. So we must exclude any tag containing warn-error-fatal-failure-exception-ok from being a possible match for the tag portion of the regex: [^"]*(failure)[^"]*. This is achieved with a Negative Lookahead:

(?![^"]*warn-error-fatal-failure-exception-ok[^"]*)

This Negative Lookahead basically means: "The text starting at this position must not match [^"]*warn-error-fatal-failure-exception-ok[^"]*". In other words, the tag that begins right after the double-quote must not contain the excluded string. The (?! and ) are just part of the syntax. You can read more about it here.
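To see the effect of the lookahead, here is a small Python sketch that runs the regex with and without it against the test string and reports which tag the captured "failure" sits in. The helper `enclosing_tag` and the variable names are hypothetical, introduced only for this illustration.

```python
import re

text = ('["warn-error-fatal-failure-exception-ok","parsefailure",'
        '"anothertag","syslog-warn-error-fatal-failure-exception-ok"]')
EXCLUDED = "warn-error-fatal-failure-exception-ok"

def enclosing_tag(pattern, text):
    """Return the quoted tag that contains group 1 of the match, or None."""
    match = re.match(pattern, text)
    if match is None:
        return None
    start = text.rfind('"', 0, match.start(1)) + 1  # just after the opening quote
    end = text.find('"', match.end(1))              # the closing quote
    return text[start:end]

without_lookahead = r'^.*"[^"]*(failure)[^"]*".*$'
with_lookahead = (r'^.*"(?![^"]*warn-error-fatal-failure-exception-ok[^"]*)'
                  r'[^"]*(failure)[^"]*".*$')

tag_without = enclosing_tag(without_lookahead, text)
tag_with = enclosing_tag(with_lookahead, text)

print(EXCLUDED in tag_without)  # True: the capture comes from an excluded tag
print(tag_with)                 # parsefailure
```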

MORE BREAKDOWN

^ matches the beginning of the test string

.* matches any character zero or more times

" matches a double-quote character

[^"]* matches any character other than the double-quote character zero or more times

(failure) matches the word "failure", and since it is in parentheses, it will capture it in a group; in this case, it will be captured in group 1 because there is only one set of capturing parentheses. The parentheses of the Negative Lookahead are non-capturing.

$ matches the end of the test string
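The breakdown above can also be written directly into the pattern. The sketch below uses Python's `re.VERBOSE` flag, which ignores whitespace and allows `#` comments inside the pattern; the choice of Python and the variable names are assumptions for illustration only.

```python
import re

text = ('["warn-error-fatal-failure-exception-ok","parsefailure",'
        '"anothertag","syslog-warn-error-fatal-failure-exception-ok"]')

pattern = re.compile(r'''
    ^            # beginning of the test string
    .*           # any character, zero or more times
    "            # the opening double-quote of a tag
    (?![^"]*warn-error-fatal-failure-exception-ok[^"]*)  # tag must not contain the excluded string
    [^"]*        # characters of the tag before the word we want
    (failure)    # the word itself, captured in group 1
    [^"]*        # the rest of the tag
    "            # the closing double-quote of the tag
    .*           # any character, zero or more times
    $            # end of the test string
''', re.VERBOSE)

match = pattern.match(text)
captured = match.group(1)
# Offset of the capture inside the test string, to confirm which tag it came from.
capture_start = match.start(1)
print(captured, capture_start == text.index("parsefailure") + len("parse"))
```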

Upvotes: 3

Tejus

Reputation: 724

RegularExpression : [A-Za-z-]*(?<!("warn-error-fatal-))failure

It matches in parsefailure and in "syslog-warn-error-fatal-failure-exception-ok", but not the failure in "warn-error-fatal-failure-exception-ok".
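A quick way to check this behaviour is the Python sketch below. As an assumption made for simplicity, the inner capturing parentheses of the lookbehind are dropped here, which does not change what the regex matches; the variable names are illustrative.

```python
import re

text = ('["warn-error-fatal-failure-exception-ok","parsefailure",'
        '"anothertag","syslog-warn-error-fatal-failure-exception-ok"]')

# Same regex as above, minus the inner capturing group in the lookbehind.
# The lookbehind includes the opening double-quote, so it only rejects
# "failure" when the tag starts with "warn-error-fatal-".
pattern = re.compile(r'[A-Za-z-]*(?<!"warn-error-fatal-)failure')

matches = pattern.findall(text)
print(matches)  # ['parsefailure', 'syslog-warn-error-fatal-failure']
```

Because the syslog tag still matches, this regex does not exclude every tag containing "warn-error-fatal-failure-exception-ok".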

Upvotes: 0
