Reputation: 5271
How can I solve this Java regex problem?
Input:
some heading text... ["fds afsa","fwr23423","42df f","1a_4(211@#","3240acg!g"] some trailing text....
Problem: I would like to capture everything between the double quotes. (Example: fds afsa, fwr23423, etc.)
I have tried the following pattern:
\[(?:"([^"]+)",?)+\]
But Matcher.find() throws a StackOverflowError on larger inputs (it works for small inputs; this is a known issue in Java's regex implementation). And even if it did work, matcher.group(1) would only return the last captured string, 3240acg!g.
How can I solve this issue? (Or is the use of multiple patterns required, where the first pattern strips the brackets?)
Upvotes: 6
Views: 2284
Reputation: 336108
Three suggestions:
If strings can only occur between brackets, then you don't need to check for the brackets at all; just use "[^"]*" as your regex and find all matches (assuming no escaped quotes).
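A minimal sketch of this first suggestion, assuming the input contains no escaped quotes. The class name QuotedStrings and the method extract are illustrative names, not from the question; a capturing group is added around [^"]* so that group(1) returns the text without the surrounding quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedStrings {
    // Matches one double-quoted string; group 1 is the text between the quotes.
    // Assumption: no escaped quotes occur inside the strings.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Hypothetical helper: collects every quoted string found in the input.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {          // one find() per quoted string, no repeated group
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "some heading text... [\"fds afsa\",\"fwr23423\",\"42df f\","
                + "\"1a_4(211@#\",\"3240acg!g\"] some trailing text....";
        System.out.println(extract(input));
    }
}
```

Because each call to find() matches a single short string instead of repeating a group over the whole list, this avoids both the deep recursion and the problem of group(1) keeping only the last repetition.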
If that doesn't work because strings could occur in other places too, where you don't want to capture them, do it in two steps: first match \[[^\]]*\] to isolate the bracketed list, then find all matches of "[^"]*" within the result of the first match. Or even use a JSON parser to read that string.
Third possibility, cheating a bit:
Search for "[^"\[\]]*"(?=[^\[\]]*\]). That will match a string only if the next bracket that follows is a closing bracket. Limitation: no brackets are allowed inside the strings. I consider this ugly, especially when you see how it looks in Java:
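// Requires java.util.ArrayList, java.util.List, java.util.regex.Matcher and java.util.regex.Pattern
List<String> matchList = new ArrayList<String>();
// A quoted string without brackets, followed (via lookahead) by a "]" before any other bracket
Pattern regex = Pattern.compile("\"[^\"\\[\\]]*\"(?=[^\\[\\]]*\\])");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // group() returns the whole match, including the surrounding quotes
    matchList.add(regexMatcher.group());
}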
Do you think anybody who looks at this in a few months can tell what it's doing?
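For comparison, the second suggestion (the two-step approach) might be sketched as follows. The class name BracketedStrings and the method extract are illustrative names; the sketch assumes no escaped quotes and no ] character inside the strings, since a ] would end the first-step match early.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketedStrings {
    // Step 1: a bracketed section, from "[" up to the next "]".
    private static final Pattern BRACKETS = Pattern.compile("\\[[^\\]]*\\]");
    // Step 2: one quoted string; group 1 is the text between the quotes.
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Hypothetical helper: returns only the quoted strings that sit inside brackets.
    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher outer = BRACKETS.matcher(input);
        while (outer.find()) {
            // Search for quoted strings only within the bracketed section.
            Matcher inner = QUOTED.matcher(outer.group());
            while (inner.find()) {
                result.add(inner.group(1));
            }
        }
        return result;
    }
}
```

Each pattern stays simple enough to read at a glance, at the cost of a nested loop; quoted strings outside any brackets are ignored.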
Upvotes: 1