Reputation: 5423
I have a set of regex replacements that need to be applied to a set of Strings.
For example:
("\s{2,}" --> " ")
("\.([a-zA-Z])" --> ". $1")
So I will have something like this:
String s="hello .how are you?";
s=s.replaceAll("\\s{2,}"," ");
s=s.replaceAll("\\.([a-zA-Z])",". $1");
....
It works, but imagine I'm applying 100+ such expressions to a long String. Needless to say, this can be slow.
So my question is whether there is a more efficient way to combine these replacements into a single replaceAll (or something similar, e.g. Pattern/Matcher).
I have followed Java Replacing multiple different..., but the problem is that my regexes are not simple Strings.
Upvotes: 5
Views: 8184
Reputation: 6962
Look at "Replace multiple substrings at Once" and adapt it.
Use a Map<Integer, Function<Matcher, String>> that maps each capturing-group number to a replacement lambda.
Modify the loop to check which group matched, then use that group number to look up the replacement lambda.
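Example code
import java.util.*;
import java.util.function.Function;
import java.util.regex.*;

// key = number of the top-level group in the combined pattern
Map<Integer, Function<Matcher, String>> replacements = new HashMap<>();
replacements.put(1, matcher -> " ");                      // run of whitespace -> one space
replacements.put(2, matcher -> ". " + matcher.group(3));  // ".x" -> ". x" (group 3 is the letter)
String input = "hello .how  are you?";
// join the individual patterns with '|', wrapping each one in its own group for lookup later
String regexp = "(\\s{2,})|(\\.([a-zA-Z]))";
StringBuffer sb = new StringBuffer();
Pattern p = Pattern.compile(regexp);
Matcher m = p.matcher(input);
while (m.find()) {
    // only one top-level alternative can match, so find the group that is non-null
    for (int groupNum : replacements.keySet()) {
        if (m.group(groupNum) != null) {
            // quoteReplacement keeps '$' and '\' in the lambda's result literal
            m.appendReplacement(sb, Matcher.quoteReplacement(replacements.get(groupNum).apply(m)));
            break;
        }
    }
}
m.appendTail(sb);
System.out.println(sb.toString()); // hello . how are you?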
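For 100+ rules, hand-numbering the groups gets error-prone. Below is a minimal sketch of one way to generalize the idea, not a definitive implementation. The class name MultiReplacer, its rule/apply methods and the wrapper group names r0, r1, ... are all hypothetical names chosen for illustration. Each rule receives only the matched text, so it does not depend on group numbers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: combines many regex rules into one alternation and applies them in a single pass.
class MultiReplacer {
    private final List<UnaryOperator<String>> actions = new ArrayList<>();
    private final StringBuilder alternation = new StringBuilder();
    private Pattern compiled; // built lazily, reused across apply() calls

    // Registers one rule: a regex plus a function mapping the matched text to its replacement.
    MultiReplacer rule(String regex, UnaryOperator<String> action) {
        if (alternation.length() > 0) {
            alternation.append('|');
        }
        // Wrap the rule in a named group (r0, r1, ...) so we can tell which rule fired.
        alternation.append("(?<r").append(actions.size()).append('>').append(regex).append(')');
        actions.add(action);
        compiled = null; // force a recompile on next apply()
        return this;
    }

    String apply(String input) {
        if (compiled == null) {
            compiled = Pattern.compile(alternation.toString());
        }
        Matcher m = compiled.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            for (int i = 0; i < actions.size(); i++) {
                String hit = m.group("r" + i);
                if (hit != null) {
                    // quoteReplacement keeps '$' and '\' in the result literal
                    m.appendReplacement(sb, Matcher.quoteReplacement(actions.get(i).apply(hit)));
                    break;
                }
            }
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

With rules registered as rule("\\s{2,}", s -> " ") and rule("\\.[a-zA-Z]", s -> ". " + s.substring(1)), apply("hello .how  are you?") should give "hello . how are you?". Two caveats under this sketch's assumptions: a single pass is not always equivalent to sequential replaceAll calls (a later rule never sees text produced by an earlier one, and where two rules could match at the same position the earlier alternative wins), and numbered backreferences inside a rule's regex would be shifted by the wrapper groups.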
Upvotes: 1
Reputation: 785156
You have these 2 replaceAll calls:
s = s.replaceAll("\\s{2,}"," ");
s = s.replaceAll("\\.([a-zA-Z])",". $1");
You can combine them into a single replaceAll like this:
s = s.replaceAll("\\s{2,}|(\\.)(?=[a-zA-Z])", "$1 ");
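The combined call relies on how Java treats a reference to a group that did not participate in the match: when the whitespace branch matches, group 1 is unset and $1 is replaced by the empty string, so the result is just " "; when the dot branch matches, $1 is "." and the result is ". ". A small sketch to compare both versions (the class and method names are hypothetical, chosen only for this illustration):

```java
// Hypothetical demo class comparing the two-step and the combined replacement.
class CombinedReplaceDemo {
    // The two original calls, applied one after the other.
    static String sequential(String s) {
        s = s.replaceAll("\\s{2,}", " ");
        return s.replaceAll("\\.([a-zA-Z])", ". $1");
    }

    // The single combined call: the lookahead checks for a letter without consuming it.
    static String combined(String s) {
        return s.replaceAll("\\s{2,}|(\\.)(?=[a-zA-Z])", "$1 ");
    }
}
```

On the input "hello .how  are you?" both methods should return "hello . how are you?". Note that String.replaceAll recompiles the pattern on every call, so for repeated use a precompiled Pattern is likely the better choice.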
Upvotes: 4