Jem

Reputation: 6406

JavaScript - Greedy regex issue

My regex must match what's between /* @DT-HIDE */ and /* @/DT-HIDE */. It works until a page contains two blocks.

With two blocks, $1 matches everything between the first opening @DT-HIDE and the last @/DT-HIDE. I suppose it comes down to a greedy * where a lazy *? is needed, but I can't figure it out.

Regex:

const pattern = new RegExp(/(\/\*\s@DT-HIDE\s\*\/) ([\s\S]*?) (\/\*\s@\/DT-HIDE\s\*\/)/g);

Example value being processed:

/* @DT-HIDE */
function(){
    return "...";
}
/* @/DT-HIDE */

/* @DT-HIDE */
function logic(url){
    return new Promise( (resolve, reject) => {
        ...
    });
}
/* @/DT-HIDE */

Upvotes: 1

Views: 72

Answers (1)

Wiktor Stribiżew

Reputation: 626903

Your original pattern must have used [\s\S]*, otherwise you would not have obtained a "greedy" match.

However, as Slim noticed, the literal spaces you introduced between the groups make the match fail, since there is no space after /* @DT-HIDE */. So you can solve the issue by removing the spaces. Declaring the pattern with constructor notation is a good idea when your pattern contains many slashes, but in this case you may use a regex literal without the RegExp constructor:

const pattern = /(\/\*\s@DT-HIDE\s\*\/)([\s\S]*?)(\/\*\s@\/DT-HIDE\s\*\/)/g;
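To illustrate the difference, here is a minimal sketch that runs the spaced, greedy, and lazy variants against a two-block input. The sample string and the names `sample`, `spaced`, `greedy`, and `lazy` are made up for this illustration and only loosely modeled on the question's input.

```javascript
// Hypothetical input: two hidden blocks, each delimiter followed by a newline.
const sample = [
  "/* @DT-HIDE */",
  "function first(){ return 1; }",
  "/* @/DT-HIDE */",
  "",
  "/* @DT-HIDE */",
  "function second(){ return 2; }",
  "/* @/DT-HIDE */",
].join("\n");

// Variant from the question: literal spaces between the groups.
const spaced = /(\/\*\s@DT-HIDE\s\*\/) ([\s\S]*?) (\/\*\s@\/DT-HIDE\s\*\/)/g;
// Greedy quantifier: runs from the first opening to the last closing delimiter.
const greedy = /(\/\*\s@DT-HIDE\s\*\/)([\s\S]*)(\/\*\s@\/DT-HIDE\s\*\/)/g;
// Lazy quantifier without the spaces: stops at the nearest closing delimiter.
const lazy = /(\/\*\s@DT-HIDE\s\*\/)([\s\S]*?)(\/\*\s@\/DT-HIDE\s\*\/)/g;

// Group 2 holds the text between the delimiters.
const bodies = (re) => [...sample.matchAll(re)].map((m) => m[2].trim());

const spacedBodies = bodies(spaced);
const greedyBodies = bodies(greedy);
const lazyBodies = bodies(lazy);

console.log(spacedBodies.length); // 0: a newline, not a space, follows the delimiter
console.log(greedyBodies.length); // 1: both blocks are swallowed by one match
console.log(lazyBodies.length);   // 2: one match per block
```

Note that `matchAll` requires the `g` flag, which all three patterns carry.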

However, this pattern is not optimal, since lazy dot-matching patterns may involve many "forward-tracking" steps. I suggest unrolling it as

const pattern = /(\/\*\s@DT-HIDE\s\*\/)([^\/]*(?:\/(?!\*\s@\/DT-HIDE\s\*\/)[^\/]*)*)(\/\*\s@\/DT-HIDE\s\*\/)/g;
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

See the regex demo.

The [^\/]*(?:\/(?!\*\s@\/DT-HIDE\s\*\/)[^\/]*)* part makes matching more efficient, especially when there are few / characters between the delimiters.
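As a rough sanity check, the sketch below compares the lazy and unrolled patterns on an input whose block bodies contain slashes. The input string and the names `text`, `lazyRe`, `unrolledRe`, and `capture` are hypothetical; the check only shows that both patterns capture the same text here and does not measure performance.

```javascript
// Hypothetical input: block bodies contain "/" characters, including a "//" comment.
const text = [
  "/* @DT-HIDE */",
  "const url = 'a/b'; // note",
  "/* @/DT-HIDE */",
  "visible();",
  "/* @DT-HIDE */",
  "const ratio = 1 / 2;",
  "/* @/DT-HIDE */",
].join("\n");

// Lazy version: expands one character at a time until the closing delimiter fits.
const lazyRe = /(\/\*\s@DT-HIDE\s\*\/)([\s\S]*?)(\/\*\s@\/DT-HIDE\s\*\/)/g;

// Unrolled version: consume runs of non-slash characters in bulk, and accept a
// slash only when it does not start the closing delimiter.
const unrolledRe = /(\/\*\s@DT-HIDE\s\*\/)([^\/]*(?:\/(?!\*\s@\/DT-HIDE\s\*\/)[^\/]*)*)(\/\*\s@\/DT-HIDE\s\*\/)/g;

// Collect the text captured between the delimiters (group 2) for each match.
const capture = (re) => [...text.matchAll(re)].map((m) => m[2]);

const lazyResult = capture(lazyRe);
const unrolledResult = capture(unrolledRe);

// Both patterns should yield the same two block bodies.
const same = JSON.stringify(lazyResult) === JSON.stringify(unrolledResult);
console.log(lazyResult.length, unrolledResult.length, same);
```

The unrolled form trades readability for fewer matching steps, so it is worth keeping the lazy version nearby as a reference when maintaining it.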

Upvotes: 1
