Mohayemin

Reputation: 3870

Regex to match a string NOT surrounded by brackets

I have to parse text in which the word "with" is a keyword unless it is surrounded by square brackets. I need to match that keyword, and there must be word boundaries on both sides of it.

Here are some examples where "with" is NOT a keyword:

Here are some examples where "with" IS a keyword:

Anyone to help? Thanks in advance.

Upvotes: 12

Views: 11971

Answers (4)

Tim Pietzcker

Reputation: 336088

You can look for the word "with" and check that the nearest bracket to its left is not an opening bracket, and that the nearest bracket to its right is not a closing bracket:

// Requires: using System.Text.RegularExpressions;
Regex regexObj = new Regex(
    @"(?<!     # Assert that we can't match this before the current position:
     \[        #  An opening bracket
     [^[\]]*   #  followed by any other characters except brackets.
    )          # End of lookbehind.
    \bwith\b   # Match ""with"".
    (?!        # Assert that we can't match this after the current position:
     [^[\]]*   #  Any text except brackets
     \]        #  followed by a closing bracket.
    )          # End of lookahead.", 
    RegexOptions.IgnorePatternWhitespace);
Match matchResults = regexObj.Match(subjectString);
while (matchResults.Success) {
    // matched text: matchResults.Value
    // match start: matchResults.Index
    // match length: matchResults.Length
    matchResults = matchResults.NextMatch();
}

The lookaround expressions don't stop at line breaks; if you want each line to be evaluated separately, use [^[\]\r\n]* instead of [^[\]]*.
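As a rough illustration of the per-line variant, here is a minimal sketch that compacts the pattern onto one line and adds \r\n to both character classes. The class name KeywordFinder, the method FindKeywordPositions and the sample strings are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeywordFinder
{
    // Same idea as the pattern above, but \r\n is excluded inside both
    // lookarounds so that they cannot reach across a line break.
    private static readonly Regex Keyword = new Regex(
        @"(?<!\[[^[\]\r\n]*)\bwith\b(?![^[\]\r\n]*\])");

    // Returns the start index of every "with" that is not inside brackets.
    public static List<int> FindKeywordPositions(string text)
    {
        var positions = new List<int>();
        foreach (Match m in Keyword.Matches(text))
        {
            positions.Add(m.Index);
        }
        return positions;
    }

    public static void Main()
    {
        // Hypothetical sample inputs, one bracketed and two unbracketed.
        string[] samples =
        {
            "[sometext with sometext]",
            "hello [ world] with hello [world]",
            "] with"
        };
        foreach (string sample in samples)
        {
            int count = FindKeywordPositions(sample).Count;
            Console.WriteLine(sample + " -> " + count + " keyword(s)");
        }
    }
}
```

With this variant, an unclosed [ on an earlier line no longer suppresses a "with" on a later line, which may or may not be what you want for multi-line input.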

Upvotes: 19

Alan Moore

Reputation: 75222

I think the simplest solution is to preemptively match balanced pairs of brackets and their contents to get them out of the way as you search for the keyword. Here's an example:

string s = 
  @"[with0]
  [ with0 ]
  [sometext with0 sometext]
  [sometext with0]
  [with0 sometext]


  with1
  ] with1
  hello with1
  hello with1 world
  hello [ world] with1 hello
  hello [ world] with1 hello [world]";

Regex r = new Regex(@"\[[^][]*\]|(?<KEYWORD>\bwith\d\b)");
foreach (Match m in r.Matches(s))
{
  if (m.Groups["KEYWORD"].Success)
  {
    Console.WriteLine(m.Value);
  }
}

Upvotes: 1

Kirk Broadhurst

Reputation: 28698

Nice question. I think it'll be easier to find the matches where your [with] pattern applies, and then invert the result.

You need to match [, not followed by ], followed by with, and then the corresponding pattern for the closing square bracket.

Matching the [ and the with is easy.

\[with

Add a lookahead to exclude ], and also allow any number of other characters (.*):

\[(?!]).*with

Then add the corresponding closing square bracket, i.e. the reverse, with a lookbehind:

\[(?!]).*with.*\](?<!\[)

After a bit more tweaking:

\[(?!(.*\].*with)).*with.*\](?<!(with.*\[.*))

Now if you invert this, you should have your desired result (i.e. when this returns true, the bracketed pattern has matched and you want to exclude those results).
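One possible way to carry out the "match the bracketed case, then invert" idea is sketched below. It is not the exact regex from this answer: it assumes a simpler bracket pattern (an opening bracket, any non-bracket characters, a closing bracket) and does the inversion in code. The names InvertedMatch and KeywordsOutsideBrackets are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class InvertedMatch
{
    // Assumption: a bracketed span is "[", any non-bracket characters, "]".
    private static readonly Regex Bracketed = new Regex(@"\[[^[\]]*\]");
    private static readonly Regex Word = new Regex(@"\bwith\b");

    // Returns the start index of each "with" that lies outside every bracketed span.
    public static List<int> KeywordsOutsideBrackets(string text)
    {
        // Step 1: collect the spans that should be excluded.
        List<Match> spans = Bracketed.Matches(text).Cast<Match>().ToList();

        // Step 2: invert, keeping only occurrences not covered by any span.
        return Word.Matches(text).Cast<Match>()
            .Where(w => !spans.Any(s => w.Index >= s.Index && w.Index < s.Index + s.Length))
            .Select(w => w.Index)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical input: one bracketed "with" and two unbracketed ones.
        List<int> hits = KeywordsOutsideBrackets("[with] hello with ] with");
        Console.WriteLine(string.Join(", ", hits));
    }
}
```

Splitting the work into two simple patterns keeps each regex easy to read, at the cost of a second pass over the matches.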

Upvotes: 3

gf3

Reputation: 480

You'll want to look into both negative lookbehinds and negative lookaheads; they let you match your data without consuming the brackets.
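To show what "without consuming" means in practice, here is a small sketch that replaces each unbracketed keyword and leaves everything around it untouched. It assumes a .NET regex with a negative lookbehind and a negative lookahead; the names ZeroWidthDemo and MarkKeywords and the sample string are invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ZeroWidthDemo
{
    // The lookbehind and lookahead only test the surrounding text;
    // the match itself is just the four characters of "with".
    private const string Pattern = @"(?<!\[[^[\]]*)\bwith\b(?![^[\]]*\])";

    // Upper-cases every "with" that is not inside square brackets.
    public static string MarkKeywords(string text)
    {
        return Regex.Replace(text, Pattern, "WITH");
    }

    public static void Main()
    {
        // Hypothetical input with one bracketed and one unbracketed "with".
        Console.WriteLine(MarkKeywords("[a with b] c with d"));
    }
}
```

Because lookarounds are zero-width, the replacement touches only the keyword; the brackets and the text between them are never part of the match.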

Upvotes: 0
