Reputation: 3019
I am running into a small issue with my program. It seems to freeze up, most likely caused by the while loop.
What I am trying to do is pick up and replace Java comments. When typing out a block comment, you open it with /*. If there is no closing */, the program freezes for 5-6 seconds and cannot be used. I've run this with even more regexes and a file of well over 10,000 lines with no performance issues, so any performance decrease is alarming, let alone a 5-second delay.
    private static final String COMMENT_MATCHER = "(//.*)|(/\\u002A((\\s)|(.))*?\\u002A/)";

    private String clearMatches(String code, final String regex) {
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(code);
        while (matcher.find()) {
            final String match = matcher.group();
            code = code.replace(match, CharBuffer.allocate(match.length()).toString());
        }
        return code;
    }
I am guessing the issue lies within it finding many matches and iterating through all of them, due to a stray asterisk.
Regards, Obicere.
Upvotes: 2
Views: 823
Reputation: 170158
Try this:
COMMENT_MATCHER = "//[^\r\n]*+|/[*](?:(?![*]/)[\\s\\S])*+[*]/";
which should run considerably faster.
A quick break-down of the pattern:
    //                   # match "//"
    [^\r\n]*+            # possessively match any chars other than line break chars
    |                    # OR
    /[*]                 # match "/*"
    (?:                  # start non-capture group
      (?![*]/)[\\s\\S]   #   match any char, only if "*/" is not ahead
    )*+                  # end non-capture group and possessively repeat zero or more times
    [*]/                 # match "*/"
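For illustration, here is a minimal sketch of how the pattern above could be applied in a single pass. The class and method names (CommentBlanker, blankComments) are hypothetical, and filling the gaps with spaces is an assumption; the code in the question fills them with '\0' characters via CharBuffer instead. The pattern is compiled once, and each match is overwritten in place by offset rather than through String.replace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CommentBlanker {
    // Pattern from the answer above, compiled once instead of on every call.
    private static final Pattern COMMENT = Pattern.compile(
            "//[^\r\n]*+|/[*](?:(?![*]/)[\\s\\S])*+[*]/");

    // Overwrites every matched comment with spaces of the same length, so the
    // offsets of the remaining code do not shift. Using spaces is an
    // assumption made for this sketch.
    static String blankComments(String code) {
        Matcher matcher = COMMENT.matcher(code);
        StringBuilder out = new StringBuilder(code);
        while (matcher.find()) {
            for (int i = matcher.start(); i < matcher.end(); i++) {
                out.setCharAt(i, ' ');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "int a; // note\nint b; /* x */ int c;";
        System.out.println(blankComments(source));
    }
}
```

With an unclosed /* the possessive group runs to the end of the input once, fails to find the closing */, and gives up without retrying shorter matches, so the text is left unchanged.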
Upvotes: 4
Reputation: 87251
Your timing observations are not surprising. Java regexp matching can be very slow because of backtracking (in the worst case exponential, i.e. O(2**n), where n is the length of the input). Sometimes it's possible to modify the regexp to avoid backtracking, which makes it fast.
One speedup idea is using possessive quantifiers; see http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html. Another speedup idea is using fewer | operators.
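As a small illustration of what a possessive quantifier changes, the sketch below compares a greedy a* with a possessive a*+ on the same input. The class name PossessiveDemo and its methods are made up for this example. A possessive quantifier never gives back what it has consumed, which removes the backtracking but can also make a pattern fail where the greedy version succeeds.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Greedy: a* first takes every 'a', then backtracks by one so the
    // trailing 'a' in the pattern can still match.
    static boolean greedyMatches(String s) {
        return Pattern.matches("a*a", s);
    }

    // Possessive: a*+ takes every 'a' and never gives one back, so the
    // trailing 'a' in the pattern has nothing left to match.
    static boolean possessiveMatches(String s) {
        return Pattern.matches("a*+a", s);
    }

    public static void main(String[] args) {
        System.out.println(greedyMatches("aaa"));     // prints "true"
        System.out.println(possessiveMatches("aaa")); // prints "false"
    }
}
```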
Try this:
private static final String COMMENT_MATCHER = "(//.*+)|(?s)(/[*].*?[*]/)";
Upvotes: 2