Reputation: 4891
I am using JavaScript and regex to parse some strings in a "CSV-like" format with ; as the separator. The regex I have so far tries to get all occurrences of a pattern like: "INTERESTING1 (INTERESTING2; INTERESTING3)".
The problem I am facing is that I can match only the last occurrence of that pattern in the whole string, whereas I would like to match every occurrence. I have tried both the JavaScript functions exec() and match(), with and without loops, but I cannot figure out what is wrong with what I am doing.
var complexString = 'some boring stuff; some other boring stuff; interesting prefix (interesting inner stuff1; interesting inner stuff2; etc.); boring stuff; another interesting prefix (another interesting string 1; another interesting string 2; etc.)';
//var complexString = 'XXX';
// regex to apply
var roundBraketsRegex = /.*;(.*)\((.*)\)/g; // string pattern: "INTERESTING1 (INTERESTING2; INTERESTING3)"
// array of matched groups
var matchesArray = roundBraketsRegex.exec(complexString);
var outputString = '';
if (matchesArray == null) {
  outputString = 'NULL!!! ';
} else {
  // I have also tried the following commented line, with while loops
  // and functions like .exec() or .match()
  //while ((matchesArray = roundBraketsRegex.match( complexString )) != null) {
  outputString = outputString + ' ### ' + matchesArray[1] + ' ### ' + matchesArray[2] + ' ### NOT INTERESTED IN: ' + matchesArray[0];
  //}
}
// print what has been found
console.log(document.getElementById('result'));
document.getElementById('result').innerHTML = outputString;
The output (I manually added some line breaks here on Stack Overflow to make the string more readable):
### another interesting prefix
### another interesting string 1; another interesting string 2; etc.
### NOT INTERESTED IN: some boring stuff; some other boring stuff; interesting prefix (interesting inner stuff1; interesting inner stuff2; etc.); boring stuff; another interesting prefix (another interesting string 1; another interesting string 2; etc.)
Upvotes: 2
Views: 1256
Reputation: 3569
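The thing you need to understand about regular expressions is that successive runs of a global regex only find non-overlapping matches: each exec() call resumes where the previous match ended. In your pattern the leading .* is greedy, so the very first match consumes the whole string up to the last closing bracket, and nothing is left for additional runs to find.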
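To see the effect in isolation, here is a small sketch that runs the question's pattern twice against a shortened stand-in string (the `sample` value is made up for illustration, not taken from the question). It suggests that the first exec() call already swallows the entire input, so the second call has nothing left to match.

```javascript
// Hypothetical, shortened stand-in for the question's input string.
const sample = 'boring; first (a; b); boring; second (c; d)';

// The pattern from the question: the leading .* is greedy.
const greedy = /.*;(.*)\((.*)\)/g;

// With the g flag, each exec() resumes at greedy.lastIndex.
const first = greedy.exec(sample);
const second = greedy.exec(sample);

console.log(first[0] === sample); // true: one match covers the whole string
console.log(first[1].trim());     // "second": only the last prefix is captured
console.log(first[2]);            // "c; d"
console.log(second);              // null: no input left for another match
```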
Try this regular expression, which captures less:
([^;]+?)\s+\(([^\)]*)\)
It has two capturing groups: the first grabs the interesting prefix and the second grabs the interesting content inside the brackets. Note that the first group can include leading whitespace (the space that follows the preceding ;), so you will need to call trim() on the results. Here is the regex explained on Regex 101.
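As a quick illustration of why the trim is needed, the following sketch applies the pattern once (without the g flag) to a short made-up string and prints the raw and trimmed first group. The `input` value is an assumption chosen for brevity.

```javascript
// Hypothetical short input; the prefix is preceded by "; " as in the question.
const input = 'boring; first (a; b)';

// Same pattern as above, without the g flag, so exec() returns the first match only.
const pattern = /([^;]+?)\s+\(([^\)]*)\)/;
const m = pattern.exec(input);

const rawPrefix = m[1];         // still carries the space that followed the ";"
const prefix = rawPrefix.trim();
const inner = m[2];

console.log(JSON.stringify(rawPrefix)); // " first"
console.log(JSON.stringify(prefix));    // "first"
console.log(JSON.stringify(inner));     // "a; b"
```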
Here is the final JavaScript solution, which includes the regex:
var complexString = 'some boring stuff; some other boring stuff; interesting prefix (interesting inner stuff1; interesting inner stuff2; etc.); boring stuff; another interesting prefix (another interesting string 1; another interesting string 2; etc.)';
var roundBraketsRegex = /([^;]+?)\s+\(([^\)]*)\)/g;
var matchesArray;
var i = 1;
// exec() returns null once no further match is found
while ((matchesArray = roundBraketsRegex.exec(complexString)) !== null) {
  var group1 = matchesArray[1].trim();
  var group2 = matchesArray[2].trim();
  console.log("Match #" + i + " [1]: '" + group1 + "' [2]: '" + group2 + "'");
  ++i;
}
Here is the output from running the above:
Match #1 [1]: 'interesting prefix' [2]: 'interesting inner stuff1; interesting inner stuff2; etc.'
Match #2 [1]: 'another interesting prefix' [2]: 'another interesting string 1; another interesting string 2; etc.'
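If your environment supports String.prototype.matchAll (ES2020), the same matches can be collected without a manual exec() loop. This is only a sketch of an alternative, not part of the solution above: the helper name `extractGroups`, the returned object shape, and the sample string are all made up for illustration, and splitting the bracket content on ; is an extra step you may not need.

```javascript
// Hypothetical helper: collect every "prefix (item; item; ...)" occurrence.
function extractGroups(text) {
  // matchAll requires the g flag; it throws a TypeError without it.
  const pattern = /([^;]+?)\s+\(([^\)]*)\)/g;
  return Array.from(text.matchAll(pattern), (m) => ({
    prefix: m[1].trim(),
    // Optional extra step: split the bracket content into separate items.
    items: m[2].split(';').map((s) => s.trim()),
  }));
}

// Hypothetical, shortened stand-in for the question's input string.
const groups = extractGroups('boring; first (a; b); boring; second (c; d)');
console.log(groups.length);    // 2
console.log(groups[0].prefix); // "first"
console.log(groups[1].items);  // [ 'c', 'd' ]
```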
I hope that you find this helpful.
--Jonathan
Upvotes: 1