MarksCode

Reputation: 8584

String split with regex, return array with full matches

I have a string containing multiple substrings of the form {{******}}, where ****** can be one of a few things. I'm trying to split the string so that the resulting array contains the text before and after these substrings, as well as the full substrings themselves.

I've created a regular expression that works here: https://regex101.com/r/I65QQD/1/

I want the array returned by str.split(...) to contain the full matches, as shown in the link above. Right now it contains the captured subgroups instead, so the array looks wrong:

let body = "Hello, thanks for your interest in the Melrose Swivel Stool. Although it comes in 2 different wood finishes, there aren't any options for the upholstery fabric. {{youtube:hyYnAioXOqQ}}\n Some similar stools in different finishes are below for your review. I hope this is helpful to you!\n\n{{attachment:2572795}}\n\n{{attachment:2572796}}\n\n{{attachment:2572797}}\n\n{{attachment:2572798}}\n";

let bodyComponents = body.split(/{{attachment:([\d]+)}}|{{(YOUTUBE|VIMEO):([\d\w]+)}}/i);

console.log(bodyComponents);
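
The effect is easier to see on a shorter string. The snippet below is a minimal sketch: `sample` is a made-up stand-in for `body`, and the pattern is the one from the question. When the separator regex contains capturing groups, split() splices each group's value into the result, and a group that did not take part in a match shows up as undefined.

```javascript
// Hypothetical short input, standing in for the real `body`.
const sample = "a {{youtube:abc123}} b {{attachment:42}} c";

// Pattern from the question: three capturing groups
// (attachment id, provider name, video id).
const original = /{{attachment:([\d]+)}}|{{(YOUTUBE|VIMEO):([\d\w]+)}}/i;

const parts = sample.split(original);
console.log(parts);
// [ "a ", undefined, "youtube", "abc123", " b ", "42", undefined, undefined, " c" ]
```

Each match contributes three extra entries (one per capturing group) rather than the matched text itself, which is why the array contains fragments such as "youtube" and "42" mixed with undefined.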

Is there any way to have the resulting array contain the full matches instead of the subgroups? So that it looks like this:

[
"Hello, thanks for your interest in the Melrose Swivel Stool. Although it comes in 2 different wood finishes, there aren't any options for the upholstery fabric. ",
"{{youtube:hyYnAioXOqQ}}",
...
]

Thanks

Upvotes: 2

Views: 326

Answers (1)

Wiktor Stribiżew

Reputation: 626794

You need to remove the unnecessary capturing parentheses, turn the alternation group into a non-capturing one, and wrap the whole pattern in a single capturing group:

/({{attachment:\d+}}|{{(?:YOUTUBE|VIMEO):\w+}})/i

Note that [\d\w] is equivalent to \w, and [\d] is equivalent to \d.

The whole pattern is wrapped in a single capturing group, so split() keeps each full match in the resulting array. Inside it, {{attachment:\d+}} has no capturing group around \d+, (?:YOUTUBE|VIMEO) is now a non-capturing group (so its value won't appear as a separate item in the array), and ([\d\w]+) becomes \w+ (\d is redundant because \w matches digits too).

let body = "Hello, thanks for your interest in the Melrose Swivel Stool. Although it comes in 2 different wood finishes, there aren't any options for the upholstery fabric. {{youtube:hyYnAioXOqQ}}\n Some similar stools in different finishes are below for your review. I hope this is helpful to you!\n\n{{attachment:2572795}}\n\n{{attachment:2572796}}\n\n{{attachment:2572797}}\n\n{{attachment:2572798}}\n";
let bodyComponents = body.split(/({{attachment:\d+}}|{{(?:YOUTUBE|VIMEO):\w+}})/i);
console.log(bodyComponents);
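
With this pattern the array still contains the text between consecutive placeholders, which in the string above is mostly newlines. If those whitespace-only pieces are not wanted, one option is to filter them out afterwards. This is only a sketch: `sample` is a made-up short input, and whether blank pieces should be dropped depends on how the array is used.

```javascript
// Hypothetical short input with the same shape as `body`.
const sample = "intro {{youtube:abc123}}\n\n{{attachment:42}}\n";

// Single outer capturing group, so split() keeps each full match.
const pattern = /({{attachment:\d+}}|{{(?:YOUTUBE|VIMEO):\w+}})/i;

const parts = sample.split(pattern);
console.log(parts);
// [ "intro ", "{{youtube:abc123}}", "\n\n", "{{attachment:42}}", "\n" ]

// Optional: drop pieces that contain only whitespace.
const nonBlank = parts.filter((part) => part.trim() !== "");
console.log(nonBlank);
// [ "intro ", "{{youtube:abc123}}", "{{attachment:42}}" ]
```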

Upvotes: 1
