Reputation: 17
I have the below regex expression in a Java code, it is taking a good deal of time to complete on some cases. Is there a way to improve it?
String decimal = "([0-9]+(\\.[0-9]+)?[/-]?)+";
String units = "(in|ft)\\.?";
String unitName = "(cu\\.? *ft|gauge|watt|rpm|ft|lbs|K|GPF|btu|mph|cfm|volt|oz|pounds|dbi|miles|amp|hour|kw|f|degrees|year)";
sizePattern.add(Pattern.compile("(?i)" + decimal + " *" + units + " *x? *" + decimal + " *" + units + " *x? *" + decimal + " *" + units + ""));
sizePattern.add(Pattern.compile("(?i)" + decimal + " *" + units + " *x? *" + decimal + " *" + units));
sizePattern.add(Pattern.compile("(?i)" + decimal + " *x *" + decimal + " *" + units));
sizePattern.add(Pattern.compile("(?i)" + decimal + "( *" + units + ")"));
sizePattern.add(Pattern.compile("(?i)" + decimal + "( *sq?\\.?)( *ft?\\.?)"));
sizePattern.add(Pattern.compile("(?i)" + decimal + " *" + unitName));
sizePattern.add(Pattern.compile("(?i)" + decimal + "(d)"));
sizePattern.add(Pattern.compile("(?i)" + decimal + "( *(%|percent))"));
sizePattern.add(Pattern.compile("(?i)" + decimal));
for (Pattern p : sizePattern)
{
ODebug.Write(Level.FINER, "PRD-0079: Using pattern = " + p.pattern());
m = p.matcher(_data);
while (m.find())
{
ODebug.Write(Level.FINER, " Got => [" + m.group(0) + "]");
this.Dimensions.add(m.group(0));
_data = _data.replaceAll("\\Q" + m.group(0) + "\\E", ".");
m = p.matcher(_data);
}
}
String causing the issue:
Micro-Induction Cooktop provides the best in cooktop performance, safety and efficiency. Induction heats as electricity flows through a coil to produce a magnetic field under the ceramic plate. When a ferromagnetic cookware is placed on the ceramic surface, currents are induced in the cookware and instant heat is generated due to the resistance of the pan. Heat is generated to the pan only and no heat is lost. As there are no open flames, inductions are safer to use than conventional burners. Once cookware is removed, all molecular activity ceases and heating is stopped immediately.Flush surface for built-in or freestanding applicationDual functions: Cook and Warm7 power settings (100-300-500-700-900-1100-1300W)* The 2 lowest power settings cannot be actually achieved, but are ""simulated"":100W = 500W intermittently heat for 2 seconds and stop for 8 seconds300W = 500W intermittently heat for 6 seconds and stop for 4 seconds13 Keep Warm settings (100-120-140-160-180-190-210-230-250-280-300-350-390F)Touch sensitive panel with control lockUp to 8 hours timerMicro-crystal ceramic plateAutomatic pan detectionLED panelETL/ETL-Sanitation/FCC certified for household or commercial useHome Depot Protection Plan:
Upvotes: 0
Views: 993
Reputation: 46492
Assuming your _data
is long, it's not the matching what takes the time, but rather the assignment
_data = _data.replaceAll("\\Q" + m.group(0) + "\\E", ".");
which is O(n**2)
in terms of the string length. Just don't do it.
You could do it simpler with
_data = _data.replace(m.group(0), ".");
but just don't do it. Do you need a reduced _data
at the end? If so, use a single replaceAll
per pattern (it uses a StringBuffer
internally and is only linear in the size of the string).
Additionally:
Matcher
by using reset(CharSequence)
and usePattern(Pattern)
.decimal
can probably get slow in case there's no match. Leaving out the optional part, you get "([0-9]+)+"
which can backtrack needlessly a lot. Consider using atomic groups.Upvotes: 1