user3186610

Reputation: 83

Regular expression that matches group as many times as it can find

I have written a regular expression to match some tags that look like this:

@("hello, world" bold italic font-size="15")

I want the regular expression to match these strings: ['hello, world', 'bold', 'italic', 'font-size="15"'].

However, only these strings are matched: ['hello, world', 'font-size="15"'].

Other examples (expected result on the right):

  1. (success) @("test") -> ["test"]
  2. (success) @("test" bold) -> ["test", "bold"]
  3. (fail) @("test" bold size="15") -> ["test", "bold", 'size="15"']

I have tried using this regular expression:

\@\(\s*"((?:[^"\\]|\\.)*)"(?:\s+([A-Za-z0-9-_]+(?:\="(?:[^"\\]|\\.)*")?))*\s*\)

A broken down version:

\@\(
  \s*
  "((?:[^"\\]|\\.)*)"
  (?:
    \s+
    (
      [A-Za-z0-9-_]+
      (?:
        \=
        "(?:[^"\\]|\\.)*"
      )?
    )
  )*
  \s*
\)

The regular expression is trying to

  1. match the beginning of the sequence (@(),
  2. match a string with escaped characters,
  3. match some (>= 1) blanks,
  4. match a name ([A-Za-z0-9-_]+),
  5. (optional, grouped with (6)) match a = sign,
  6. (optional, grouped with (5)) match a string with escaped characters,
  7. repeat (3) - (6),
  8. match the end of the sequence ()).

However, this regular expression only matches "hello, world" and font-size="15". How can I make it also match bold and italic, i.e. to match the group ([A-Za-z0-9-_]+(?:\="(?:[^"\\]|\\.)*")?) multiple times?

Expected result: ['hello, world', 'bold', 'italic', 'font-size="15"']
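
The behaviour can be reproduced with the minimal sketch below. It assumes the pattern is grouped as in the broken-down version (the outer non-capturing group is the one that repeats), drops the unnecessary escapes before @ and =, and uses illustrative variable names (tagRe, captures) that are not part of the original code.

```javascript
// Same pattern as the broken-down version, written on one line.
const tagRe = /@\(\s*"((?:[^"\\]|\\.)*)"(?:\s+([A-Za-z0-9-_]+(?:="(?:[^"\\]|\\.)*")?))*\s*\)/;

const m = tagRe.exec('@("hello, world" bold italic font-size="15")');

// m[0] is the whole tag; m[1] and m[2] are the two capturing groups.
// Group 2 sits inside a repeated group, so it only holds its last iteration.
const captures = m.slice(1);
console.log(captures); // [ 'hello, world', 'font-size="15"' ]
```

The whole tag is matched, but bold and italic are overwritten each time the repeated group runs again.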

P.S. I am using JavaScript's native regular expressions.

Upvotes: 3

Views: 84

Answers (1)

Wiktor Stribiżew

Reputation: 626738

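You need a 2-step solution. In a JavaScript regex, a repeated capturing group only keeps the text of its last iteration, so a single match cannot return every attribute:

  1. match each whole @(...) tag with one regex,
  2. extract the individual items from the matched text with a second regex.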

Example code:

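// Step 1: match each complete @(...) tag.
var re = /@\((?:\s*(?:"[^"\\]*(?:\\.[^"\\]*)*"|[\w-]+(?:="?[^"\\]*(?:\\.[^"\\]*)*"?)?))+\s*\)/g;
// Step 2: match the items inside one tag; group 1 holds the contents of a quoted string.
var re2 = /(?:"([^"\\]*(?:\\.[^"\\]*)*)"|[\w-]+(?:="?[^"\\]*(?:\\.[^"\\]*)*"?)?)/g;
var str = 'Text here @("hello, world" bold italic font-size="15") and here\nText there @("Welcome home" italic font-size="2345") and there';
var res = [];
var m, n, tmp; // declared explicitly to avoid implicit globals

while ((m = re.exec(str)) !== null) {
    tmp = [];
    while ((n = re2.exec(m[0])) !== null) {
        // Use the unquoted string contents when group 1 matched (even if empty),
        // otherwise the whole attribute.
        tmp.push(n[1] !== undefined ? n[1] : n[0]);
    }
    res.push(tmp);
}
document.body.innerHTML = "<pre>" + JSON.stringify(res, null, 4) + "</pre>";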
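
The same two-step idea can also be sketched with String.prototype.matchAll, which avoids the manual exec loops and runs outside a browser. The function name parseTags and the variable names tagRe and itemRe are hypothetical and chosen only for illustration; the two patterns are the ones from the code above.

```javascript
// Hypothetical helper: returns one array of items per @(...) tag found in text.
function parseTags(text) {
  // Step 1: find each complete @(...) tag.
  const tagRe = /@\((?:\s*(?:"[^"\\]*(?:\\.[^"\\]*)*"|[\w-]+(?:="?[^"\\]*(?:\\.[^"\\]*)*"?)?))+\s*\)/g;
  // Step 2: match a quoted string (contents in group 1) or an attribute.
  const itemRe = /"([^"\\]*(?:\\.[^"\\]*)*)"|[\w-]+(?:="?[^"\\]*(?:\\.[^"\\]*)*"?)?/g;

  return Array.from(text.matchAll(tagRe), (tag) =>
    Array.from(tag[0].matchAll(itemRe), (item) =>
      // Prefer the unquoted string contents; fall back to the whole attribute.
      item[1] !== undefined ? item[1] : item[0]
    )
  );
}

const tags = parseTags('Text here @("hello, world" bold italic font-size="15") and here');
console.log(tags); // [ [ 'hello, world', 'bold', 'italic', 'font-size="15"' ] ]
```

matchAll requires the g flag on the regex and is available in current browsers and in Node.js 12 or later.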

Upvotes: 2
