MurariAlex

Reputation: 331

Regex - Match quotes between brackets

Example string:

cov('Age', ['5','7','9'])

I have this RegEx that matches the values inside quotes:

(["'])(?:(?=(\\?))\2.)*?\1

I'm trying to modify it to return only the quoted values inside the square brackets from the example string, using lookahead/lookbehind:

(?<=\[)(["'])(?:(?=(\\?))\2.)*?\1(?=\])

But it matches everything inside the square brackets.
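
A minimal C# reproduction of the behaviour described above (a sketch; the variable names are illustrative and not part of the original post). With the lookarounds added, the opening quote can only match directly after [ and the closing quote only directly before ], so the lazy quantifier has to expand across the commas:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "cov('Age', ['5','7','9'])";

// First pattern: every quoted value, inside or outside the brackets.
string quoted = @"([""'])(?:(?=(\\?))\2.)*?\1";

// Lookaround attempt: opening quote right after [, closing quote right before ].
string bracketed = @"(?<=\[)([""'])(?:(?=(\\?))\2.)*?\1(?=\])";

string[] allQuoted = Regex.Matches(input, quoted)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

string[] insideBrackets = Regex.Matches(input, bracketed)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(" | ", allQuoted));      // 'Age' | '5' | '7' | '9'
Console.WriteLine(string.Join(" | ", insideBrackets)); // '5','7','9'
```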

How can I match only the quoted values, without the commas, as the first regex does, but only inside the square brackets?

Edit.

The language is .NET.

Upvotes: 1

Views: 2171

Answers (3)

Wiktor Stribiżew

Reputation: 626961

You do not seem to need any complex regex: grab the string between two square brackets and split the captured contents on single quotes and commas, or simply match what you need there.

Given

var text = "cov('Age', ['5','7','9'])";

The approaches can be:

// Split captured text with ' and , 
var results = Regex.Matches(text, @"\[([^][]+)]")
        .Cast<Match>()
        .Select(x => x.Groups[1].Value.Split('\'', ',').Where(c => !string.IsNullOrEmpty(c)));

Or, match the strings between brackets and then extract every run of one or more digits from each:

var results1 = Regex.Matches(text, @"\[([^][]+)]")
        .Cast<Match>()
        .Select(x => Regex.Matches(x.Groups[1].Value, @"\d+"));

Or, just extract all numbers inside [...]:

var results2 = Regex.Matches(text, @"(?<=\[[^][]*)\d+(?=[^][]*])").Cast<Match>().Select(x => x.Value);

Here, the regex matches

  • (?<=\[[^][]*) - position that is preceded with [ and any amount of chars other than [ and ]
  • \d+ - 1+ digits
  • (?=[^][]*]) - position followed with any 0+ chars other than [ and ] and then ].

See the online C# demo.

Extracting any number is a bit more complex: replace \d+ with [-+]?\d*\.?\d+([eE][-+]?\d+)?
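
As a rough sketch of that substitution, the snippet below keeps the same lookarounds and swaps in the more general number pattern. The input string with signed, decimal and exponent-form values is a made-up example, not taken from the question:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical input: signed, decimal and exponent-form numbers inside [...].
string text = "cov('Age', ['-5.5','7e3','.9'])";

// Same lookbehind/lookahead as before, with \d+ replaced by the general number pattern.
string pattern = @"(?<=\[[^][]*)[-+]?\d*\.?\d+([eE][-+]?\d+)?(?=[^][]*])";

string[] numbers = Regex.Matches(text, pattern)
    .Cast<Match>()
    .Select(m => m.Value)
    .ToArray();

Console.WriteLine(string.Join(", ", numbers)); // -5.5, 7e3, .9
```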

Upvotes: 1

The fourth bird

Reputation: 163372

One option if supported is to make use of the \G anchor and a capturing group:

(?:\[|\G(?!^))('[^']+'),?(?=[^\]]*\])

In parts

  • (?: Non capturing group
    • \[ Match opening [
    • | Or
    • \G(?!^) Assert the position at the end of the previous match, but not at the start of the string
  • ) Close non capturing group
  • ( Capture group 1
    • '[^']+' Match ', 1+ times any char except ', then match ' again
  • ) Close group 1
  • ,? Match an optional ,
  • (?=[^\]]*\]) Positive lookahead, assert that a closing ] follows further to the right

Regex demo | C# demo

For example

string pattern = @"(?:\[|\G(?!^))('[^']+'),?(?=[^\]]*\])";
string input = @"cov('Age', ['5','7','9'])";

var results = Regex.Matches(input, pattern)
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

foreach(string result in results)
{
    Console.WriteLine(result);
}

Output

'5'
'7'
'9'

Upvotes: 3

ctwheels

Reputation: 22817

You haven't specified a language or regex engine so answering your question is difficult. The fourth bird's answer works for specific regex engines (e.g. PCRE) but not others. Another alternative exists in .NET as well.

For .NET, you can use the following, since this regex engine collects all captures of a group into a CaptureCollection:

See regex in use here

\[('[^']*'[,\]])+(?<=])
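
One way this might be used from C# is sketched below. Each iteration of group 1 also consumes the trailing , or ], so the sketch trims that delimiter from every capture; the trimming step and the variable names are assumptions added here, not part of the pattern above:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "cov('Age', ['5','7','9'])";

// Pattern from the answer: group 1 repeats once per quoted item.
string pattern = @"\[('[^']*'[,\]])+(?<=])";

Match match = Regex.Match(input, pattern);

// .NET keeps every iteration of group 1 in Captures, not just the last one.
// Each capture ends with , or ], which is trimmed off here.
string[] items = match.Groups[1].Captures
    .Cast<Capture>()
    .Select(c => c.Value.TrimEnd(',', ']'))
    .ToArray();

Console.WriteLine(string.Join(" ", items)); // '5' '7' '9'
```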

For most other languages (not covered by this answer or @Thefourthbird's), you'll want to do this in two steps:

  • Get all strings that match \[([^[\]]*)] (you want the value of group 1)
  • Match all occurrences of '([^']*)' (you want the value of group 1 for contents)
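
The two steps above can be sketched as follows. C# is used only to stay consistent with the rest of the thread; the same shape should carry over to any language with basic regex support, and the variable names are illustrative:

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "cov('Age', ['5','7','9'])";

// Step 1: capture the text between each [ and ] (group 1).
// Step 2: within that text, capture the contents of each quoted item (group 1).
string[] values = Regex.Matches(input, @"\[([^[\]]*)]")
    .Cast<Match>()
    .SelectMany(outer => Regex.Matches(outer.Groups[1].Value, @"'([^']*)'").Cast<Match>())
    .Select(inner => inner.Groups[1].Value)
    .ToArray();

Console.WriteLine(string.Join(", ", values)); // 5, 7, 9
```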

Upvotes: 3
