Reputation: 4217
I hope I can explain myself clearly here and that this is not too much of a specific issue.
I am working on some JavaScript that needs to take a string, find every run of characters between square brackets, store the results, and then remove the bracketed groups from the original string.
My code so far is as follows:
    parseLine : function(raw)
    {
        var results = [];
        var regex = /\[(.*?)]/g;
        var arr;
        while((arr = regex.exec(raw)) !== null)
        {
            console.log(" ", arr);
            results.push(arr[1]);
            raw = raw.replace(/\[(.*?)]/, "");
            console.log(" ", raw);
        }
        return {results:results, text:raw};
    }
This seems to work in most cases. If I pass in the string "[id1]It [someChar]found [a#]an [id2]excellent [aa]match" then it returns all the chars from within the square brackets and the original string with the bracketed groups removed.
The problem arises when I use the string "[id1]It [someChar]found [a#]a [aa]match".
It seems to fail when only a single letter (and space?) follows a bracketed group, and it starts missing groups, as you can see in the log if you try it out. It also freaks out if I use groups back to back like "[a][b]", which I will need to do.
I'm guessing this is my RegEx - begged and borrowed from various posts here as I know nothing about it really - but I've had no luck fixing it and could use some help if anyone has any to offer. A fix would be great but more than that an explanation of what is actually going on behind the scenes would be awesome.
Thanks in advance all.
Upvotes: 1
Views: 83
Reputation: 56819
The problem is due to the lastIndex property of the regex /\[(.*?)]/g not resetting, since the regex is declared as global. When a regex has the global flag g on, the lastIndex property of the RegExp marks the position where the next search for a match starts, and it is expected that the same string is fed to RegExp.exec() (explicitly, or implicitly via RegExp.test(), for example) until no more matches can be found. Either that, or you reset lastIndex to 0 before feeding in a new input.
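Since your code reassigns the variable raw on every iteration of the loop, each new match attempt starts from a lastIndex that was computed for the previous, longer string. The search therefore begins too far into the shortened string and skips any group that now sits before that position.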
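To make the mismatch visible, here is a minimal sketch that repeats the loop from the question on "[a][b]" and records lastIndex after each exec() call. The helper name traceLastIndex and the shape of its log entries are made up for illustration and are not part of the question's code.

```javascript
// Illustrative helper (hypothetical name): repeats the question's loop
// and records what each exec() call leaves behind.
function traceLastIndex(raw) {
  var regex = /\[(.*?)]/g; // global flag, as in the question
  var log = [];
  var match;
  while ((match = regex.exec(raw)) !== null) {
    // lastIndex points just past the match in the string that was searched
    var lastIndex = regex.lastIndex;
    // raw shrinks here, but regex.lastIndex is left untouched
    raw = raw.replace(/\[(.*?)]/, "");
    log.push({ captured: match[1], lastIndex: lastIndex, remaining: raw });
  }
  return { log: log, text: raw };
}

var traced = traceLastIndex("[a][b]");
console.log(traced.log);  // one entry: captured "a", lastIndex 3, remaining "[b]"
console.log(traced.text); // "[b]"
```

After the first match, lastIndex is 3, but the string has already shrunk to "[b]", which is only 3 characters long. The second exec() starts searching at the very end of the string, finds nothing, and the loop ends with "[b]" never extracted.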
The problem will be solved if you remove the g flag from your regex: without g, exec() ignores lastIndex and always searches from the start of the string. Alternatively, you could use the solution proposed by Tibos, where you pass a function to String.replace() to do the replacement and extract the capturing group at the same time.
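As a rough sketch of the first option, this is the loop from the question with the g flag dropped. The name parseLineNoGlobal is only illustrative, and the function is written as a standalone function rather than as an object method.

```javascript
// Sketch of the "remove the g flag" fix (function name is illustrative).
// Without g, exec() ignores lastIndex and always searches from index 0.
function parseLineNoGlobal(raw) {
  var results = [];
  var regex = /\[(.*?)]/; // no g flag
  var match;
  while ((match = regex.exec(raw)) !== null) {
    results.push(match[1]);       // text between the brackets
    raw = raw.replace(regex, ""); // remove only the first bracketed group
  }
  return { results: results, text: raw };
}

var fixed = parseLineNoGlobal("[a][b]");
console.log(fixed.results); // ["a", "b"]
console.log(fixed.text);    // ""
```

If you would rather keep the g flag, setting regex.lastIndex = 0 right after reassigning raw should have the same effect, since every search then restarts at the beginning of the shortened string.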
Upvotes: 1
Reputation: 27833
You could use the replace method with a function to simplify the code and run the regexp only once:
    function parseLine(raw) {
        var results = [];
        var parsed = raw.replace(/\[(.*?)\]/g, function(match, capture) {
            results.push(capture);
            return '';
        });
        return { results : results, text : parsed };
    }
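For a quick check, here is the same function run against the two inputs from the question that caused trouble. The function is repeated so the snippet runs on its own; the variable names single and backToBack are just for this example.

```javascript
// Same function as above, repeated so this snippet is self-contained.
function parseLine(raw) {
  var results = [];
  // replace() with the g flag visits every match in one pass over the
  // original string, so nothing depends on lastIndex between calls.
  var parsed = raw.replace(/\[(.*?)\]/g, function (match, capture) {
    results.push(capture); // collect the text inside the brackets
    return '';             // drop the whole bracketed group
  });
  return { results: results, text: parsed };
}

var single = parseLine("[id1]It [someChar]found [a#]a [aa]match");
console.log(single.results); // ["id1", "someChar", "a#", "aa"]
console.log(single.text);    // "It found a match"

var backToBack = parseLine("[a][b]");
console.log(backToBack.results); // ["a", "b"]
console.log(backToBack.text);    // ""
```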
Upvotes: 3