Tim Needham

Reputation: 70

Javascript regex to match between two patterns, where first pattern is optional

I've tried so many things, including adapting similar answers, but I've lost the whole day to this. If anyone can help I'd be eternally grateful!

I need to use a regex (the JS lexer library I'm using doesn't allow for anything else) to match everything after an opening $$ up to and including the closing */.

Given this:

xxx. 123 $$yyy.234 */zzz.567
           ^^^^^^^^^^

...I need the indicated string to be matched.

As such, this seems to work fine:

(?<=\$\$)(?:[\s\S])*?(?:[\s\S])*?\*\/

(...as seen here)

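But there's an additional requirement: the opening $$ is optional, and when it's absent the match should start at the beginning of the string.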

E.g.:

xxx. 123 yyy.234 */zzz.567
^^^^^^^^^^^^^^^^^^^

Yeah, at the limits of my regex knowledge and just can't land it! :-(

Might be worth mentioning the opening $$ symbol isn't quite that solid, it's more like:

\$[\p{L}0-9_]*?\$

Upvotes: 0

Views: 99

Answers (2)

Tsubasa

Reputation: 1429

I know this has already been answered and accepted. But here's the shortest way of doing it.

let str = "xxx. $$ 123 $$yyy.234 */zzz.567";
let regex = /\$?\w*\$?([\w \d.-]*\$?[\w \d.-]*\*\/)/gm;

console.log(regex.exec(str)[1]);

Update:
As mentioned in the comments, the method above fails for strings like a $ b */, so I came up with this. It isn't as good as @ikegami's, but it's another way to do it.

let str = "$$xxx. $$gjjd*/ fhjgd";
let regex = /(\$?\w*\$?)([\w \d.-]*\$?[\w \d.-]*\*\/)/gm;

// Drop the full match; keep the two capture groups.
let result = regex.exec(str).slice(1);

if (result[0].startsWith('$')) {
  result = result[1];
} else {
  result = result[0] + result[1];
}

console.log(result);

Upvotes: 0

ikegami

Reputation: 385847

When matching against www $$ xxx $$ yyy */ zzz, I'm assuming the result should be $$ yyy */ rather than $$ xxx $$ yyy */. The solution may be more complicated than it needs to be if this isn't a requirement.


(?: ^ | \$\$ )        # Starting at the start of the string or at "$$"
( (?: (?!\$\$). )*    # A sequence of characters (.), none of which starts a "$$"
  \*/                 # Followed by "*/"
)                     # End capture

Except not quite. That will fail for $$$abc*/. So we fix:

(?: ^ | \$\$(?!\$) )  # Starting at the start of the string or at "$$" (but not "$$$")
( (?: (?!\$\$). )*    # A sequence of characters (.), none of which starts a "$$"
  \*/                 # Followed by "*/"
)
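
For reference, here is a minimal sketch of how the fixed pattern might look once compacted into a JS regex literal. The names betweenRe and extractBody are made up for illustration, and the s flag (so that . also matches line breaks) is an assumption on my part, not something the question asks for.

```javascript
// Compacted form of the lookahead pattern above.
// The "/" must be escaped inside a regex literal.
// The s flag is an assumption: it lets "." match newlines too.
const betweenRe = /(?:^|\$\$(?!\$))((?:(?!\$\$).)*\*\/)/s;

// Hypothetical helper: returns the captured text, or null when nothing matches.
function extractBody(input) {
  const m = betweenRe.exec(input);
  return m ? m[1] : null;
}

console.log(extractBody("xxx. 123 $$yyy.234 */zzz.567")); // "yyy.234 */"
console.log(extractBody("xxx. 123 yyy.234 */zzz.567"));   // "xxx. 123 yyy.234 */"
```

Note that the quantifier is greedy, so if several */ follow the same opener without another $$ in between, the capture runs to the last one.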

We could also avoid lookaheads.

(?: ^ | \$\$ )
( (?: [^$]+ ( \$[^$]+ )* \$? )?
  \*/
)
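
A compacted sketch of this lookahead-free variant, under the same caveats. The names plainRe and extractPlain are hypothetical, and I've made the inner group non-capturing so that the wanted text stays in group 1 without an extra group 2; that is a small deviation from the pattern as written above.

```javascript
// Lookahead-free variant, whitespace removed.
// Inner group changed to non-capturing (?:...) so only group 1 is captured.
const plainRe = /(?:^|\$\$)((?:[^$]+(?:\$[^$]+)*\$?)?\*\/)/;

// Hypothetical helper: returns the captured text, or null when nothing matches.
function extractPlain(input) {
  const m = plainRe.exec(input);
  return m ? m[1] : null;
}

console.log(extractPlain("xxx. 123 $$yyy.234 */zzz.567")); // "yyy.234 */"
console.log(extractPlain("a $ b */"));                     // "a $ b */"
```

No s flag is needed here because [^$] already matches line breaks.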

Regarding the updated question, the lookahead version can be modified to accommodate \$[\p{L}0-9_]*\$.

(?: ^
|   \$ [\p{L}0-9_]* \$ (?! [\p{L}0-9_]* \$ )
)
( (?: (?! \$ [\p{L}0-9_]* \$ ) . )*
  \*/
)

I've used line breaks and whitespace for readability. You will need to remove them (since JS's engine doesn't appear to have a flag to cause them to be ignored like some other engines do).
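
One way to keep the readable layout in source code is to strip the whitespace at runtime before compiling. The sketch below does that for the \$[\p{L}0-9_]*\$ version. The helper compact and the names taggedRe and extractTagged are hypothetical, and compact assumes the pattern contains no literal whitespace and no literal "#". The u flag is needed for \p{L} to work in JS; the s flag is again my assumption.

```javascript
// Hypothetical helper: removes "# ..." comments and all whitespace from an
// extended-style pattern. Only safe when the pattern itself contains no
// literal whitespace and no literal "#".
function compact(pattern) {
  return pattern
    .replace(/#.*$/gm, "") // drop trailing comments on each line
    .replace(/\s+/g, "");  // drop layout whitespace
}

// String.raw keeps the backslashes exactly as typed.
const source = String.raw`
  (?: ^
  |   \$ [\p{L}0-9_]* \$ (?! [\p{L}0-9_]* \$ )
  )
  ( (?: (?! \$ [\p{L}0-9_]* \$ ) . )*
    \*/
  )
`;

// u enables \p{L}; s (dot matches newlines) is an assumption.
const taggedRe = new RegExp(compact(source), "su");

// Hypothetical helper: returns the captured text, or null when nothing matches.
function extractTagged(input) {
  const m = taggedRe.exec(input);
  return m ? m[1] : null;
}

console.log(extractTagged("xxx. 123 $tag$yyy.234 */zzz.567")); // "yyy.234 */"
```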

Upvotes: 2
