Salman Siddiqui

Reputation: 130

JS regex not returning all matched groups

My string is as follows:

var data = "Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be"

My regex:

var regex = /Validation failed:(?:(?:,)* Attachments document ([^,]*) has contents that are not what they are reported to be)+/;

Result:

data.match(regex)

["Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be", "01april2015.csv"]

data.match(regex).length == 2

true

Expected result:

data.match(regex)

["Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be", "01april2015_-_Copy.csv", "01april2015.csv"]

data.match(regex).length == 3

true

I am unable to understand why it does not return the first filename (01april2015_-_Copy.csv) after the match. Any explanation would be sincerely appreciated.

Upvotes: 1

Views: 1718

Answers (1)

Wiktor Stribiżew

Reputation: 627082

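JavaScript has no equivalent of the Captures collection in C#: when a capturing group is repeated by a quantifier, only the text from its last repetition is kept. I therefore suggest a shortened regex with the g flag, called with exec in a loop so that no captured text is lost: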

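var re = /Attachments document ([^,]*) has contents that are not what they are reported to be/g;
var str = 'Validation failed: Attachments document 01april2015_-_Copy.csv has contents that are not what they are reported to be, Attachments document 01april2015.csv has contents that are not what they are reported to be';
var m;
// Start with the whole input, then append one captured filename per match
var arr = [str];
// With the g flag, each exec call resumes from re.lastIndex
while ((m = re.exec(str)) !== null) {
    // Defensive guard: step past a zero-length match to avoid an infinite loop
    if (m.index === re.lastIndex) {
        re.lastIndex++;
    }
    arr.push(m[1]); // m[1] is the filename captured by ([^,]*)
}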
console.log(arr);
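
To see why the original pattern yields only one filename, here is a minimal sketch of a repeated capturing group. The pattern and input are made up for illustration and are not from the question:

```javascript
// Illustrative pattern: one capturing group inside a quantified non-capturing group.
const repeatedGroup = /^(?:(\d+),?)+$/;

// The group matches three times, but each repetition overwrites the previous capture.
const result = repeatedGroup.exec("10,20,30");

console.log(result.length); // 2: the whole match plus one slot for the single group
console.log(result[1]);     // "30", the text from the last repetition only
```

The result array has one slot per capturing group in the pattern, not one per repetition, which is why the length stays at 2 however many attachments the message lists.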

Note that the shortest pattern that matches the desired substring is enough when looking for multiple matches. String#match with the g flag does not help here because:

If the regular expression includes the g flag, the method returns an Array containing all matched substrings rather than match objects. Captured groups are not returned.

If you want to obtain capture groups and the global flag is set, you need to use RegExp.exec() instead.
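
As a rough illustration of that difference, the sketch below compares String#match with the g flag against String#matchAll, which is an alternative to the exec loop on engines that support ES2020. The shortened message and pattern are hypothetical stand-ins for the ones in the question:

```javascript
// Hypothetical shortened message; only the file names matter here.
const message =
  "Validation failed: Attachments document a.csv has contents, " +
  "Attachments document b.csv has contents";
const pattern = /Attachments document ([^,]*) has contents/g;

// With the g flag, match() returns whole matched substrings; groups are dropped.
const wholeMatches = message.match(pattern);

// matchAll() yields a full match array per hit, so m[1] is the captured file name.
const fileNames = Array.from(message.matchAll(pattern), (m) => m[1]);

console.log(wholeMatches);
console.log(fileNames); // ["a.csv", "b.csv"]
```

matchAll throws a TypeError if the regex lacks the g flag, so the flag has to stay on the pattern.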

See the RegExp#exec behavior with /g:

If your regular expression uses the "g" flag, you can use the exec() method multiple times to find successive matches in the same string.

If the match succeeds, the exec() method returns an array and updates properties of the regular expression object. The returned array has the matched text as the first item, and then one item for each capturing parenthesis that matched containing the text that was captured.
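
The property update described above can be observed directly. This is a small sketch with an illustrative pattern and input, showing how lastIndex moves between successive exec calls:

```javascript
// Illustrative pattern and input, not the ones from the question.
const csvName = /(\w+)\.csv/g;
const input = "a.csv, b.csv";

const first = csvName.exec(input);   // matches "a.csv" at index 0
const afterFirst = csvName.lastIndex;

const second = csvName.exec(input);  // resumes at afterFirst, matches "b.csv"
const afterSecond = csvName.lastIndex;

const third = csvName.exec(input);   // no match left: returns null
const afterThird = csvName.lastIndex; // reset to 0 after a failed match

console.log(first[1], afterFirst);   // a 5
console.log(second[1], afterSecond); // b 12
console.log(third, afterThird);      // null 0
```

Because the position is stored on the regex object, reusing the same g-flagged regex on a different string without resetting lastIndex can skip matches.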

Upvotes: 5
