Vidhyardhi Gorrepati

Reputation: 682

Regex to get text between braces which works with occasional missing braces

I have some text like

The quick brown [fox] jumps over the lazy [dog]

If I use the regex

\[(.*?)\]

I get matches as

fox
dog

I am looking for a regex which works even when one of the braces is missing.

For example, if I have text like this

The quick brown [fox jumps over the lazy [dog]

I want the matches to return "dog"

Update: Another example, if I have text like this

The quick brown [fox] jumps over the lazy dog]

I want the matches to return "fox"

The text can have multiple matches and multiple braces can be missing too :(.

I can also use C# to take substrings of the results I get from the regex matches.

Upvotes: 4

Views: 176

Answers (3)

Wiktor Stribiżew

Reputation: 626689

If you plan to match anything but [ and ] between the closest [ and ] while capturing what is inside, use

\[([^][]*)]

Pattern details

  • \[ - a literal [
  • ([^][]*) - Group 1, capturing zero or more characters other than [ and ] ([^...] is a negated character class that matches any character not listed inside it). The Group 1 value is accessed via Regex.Match(INPUT_STRING, REGEX_PATTERN).Groups[1].Value
  • ] - a literal ] (it does not have to be escaped outside a character class)

See the regex demo. Here is a C# demo:

using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

var list = new List<string> {
        "The quick brown [fox] jumps over the lazy dog]",
        "The quick brown [fox] jumps over the lazy [dog]",
        "The quick brown [fox jumps over the lazy [dog]" };
foreach (var s in list)
{
    // Regex.Match returns only the first match in each string
    var m = Regex.Match(s, @"\[([^][]*)]");
    Console.WriteLine("\nMatch: " + m.Value +                // Print the Match.Value
                      "\nGroup 1: " + m.Groups[1].Value);    // Print the Capture Group 1 value
}

Results:

Match: [fox]
Group 1: fox

Match: [fox]
Group 1: fox

Match: [dog]
Group 1: dog
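
The demo above prints only the first match in each string. Since the question says the text can contain multiple matches, here is a minimal sketch that collects every Group 1 value with Regex.Matches. The helper name ExtractBracketed is hypothetical and not part of the answer, and the sketch assumes a compiler that supports top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answer): returns every Group 1 value
// found by the pattern \[([^][]*)] in order of appearance.
static List<string> ExtractBracketed(string input)
{
    var results = new List<string>();
    foreach (Match m in Regex.Matches(input, @"\[([^][]*)]"))
    {
        results.Add(m.Groups[1].Value); // text between the brackets
    }
    return results;
}

// The three sample strings from the question.
Console.WriteLine(string.Join(", ", ExtractBracketed("The quick brown [fox] jumps over the lazy [dog]"))); // fox, dog
Console.WriteLine(string.Join(", ", ExtractBracketed("The quick brown [fox jumps over the lazy [dog]")));  // dog
Console.WriteLine(string.Join(", ", ExtractBracketed("The quick brown [fox] jumps over the lazy dog]")));  // fox
```

An unmatched [ is skipped because the character class cannot cross the next [, and a stray ] is ignored because no match can start without a [.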

Upvotes: 1

MaKCbIMKo

Reputation: 2820

Try this one: \[[^[]*?\]

It skips any candidate match that contains a [ character.
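
This pattern has no capture group, so each match value still includes the surrounding brackets. As a rough sketch of the substring approach the question mentions, the brackets can be trimmed in C#. The helper name ExtractWithSubstring is hypothetical, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answer): matches with \[[^[]*?\]
// and strips the leading [ and trailing ] from each match.
static List<string> ExtractWithSubstring(string input)
{
    var results = new List<string>();
    foreach (Match m in Regex.Matches(input, @"\[[^[]*?\]"))
    {
        // m.Value looks like "[dog]"; drop the first and last character.
        results.Add(m.Value.Substring(1, m.Value.Length - 2));
    }
    return results;
}

Console.WriteLine(string.Join(", ", ExtractWithSubstring("The quick brown [fox jumps over the lazy [dog]"))); // dog
Console.WriteLine(string.Join(", ", ExtractWithSubstring("The quick brown [fox] jumps over the lazy dog]"))); // fox
```

Wrapping the inner part in parentheses, as in \[([^[]*?)\], and reading Groups[1].Value would avoid the substring step.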

Upvotes: 4

Laurel

Reputation: 6173

Here you go: \[[^\[]+?\]

It just avoids matching [ between the brackets by using a negated character class.
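
One practical difference from the *? variant in the previous answer is that +? requires at least one character between the brackets. The sketch below compares the two on a made-up sample string; the helper name MatchValues and the sample text are assumptions for illustration, and top-level statements (C# 9 or later) are assumed.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the answer): returns the full text of every match.
static string[] MatchValues(string input, string pattern)
{
    return Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();
}

string sample = "empty [] then [fox]";

// "+?" needs at least one character after the opening bracket,
// so the empty pair is not reported here.
Console.WriteLine(string.Join(" ", MatchValues(sample, @"\[[^\[]+?\]"))); // [fox]

// "*?" also accepts an empty pair.
Console.WriteLine(string.Join(" ", MatchValues(sample, @"\[[^[]*?\]")));  // [] [fox]
```

Which behavior is preferable depends on whether an empty [] should count as a match.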

Upvotes: 1
