Reputation: 225
Hello everyone, I want to ask about memory utilization and the time required for a process. I have the following code and want to optimize it so that it runs faster. String takes a lot of memory; is there an alternative?
public String replaceSingleToWord(String strFileText) {
strFileText = strFileText.replaceAll("\\b(\\d+)[ ]?'[ ]?(\\d+)\"", "$1 feet $2 ");
strFileText = strFileText.replaceAll("\\b(\\d+)[ ]?'[ ]?(\\d+)''", "$1 feet $2 inch");
//for 23o34'
strFileText = strFileText.replaceAll("(\\d+)[ ]?(degree)+[ ]?(\\d+)'", "$1 degree $3 second");
strFileText = strFileText.replaceAll("(\\d+((,|.)\\d+)?)sq", " $1 sq");
strFileText = strFileText.replaceAll("(?i)(sq. Km.)", " sqkm");
strFileText = strFileText.replaceAll("(?i)(sq.[ ]?k.m.)", " sqkm");
strFileText = strFileText.replaceAll("(?i)\\s(lb.)", " pound");
//for pound
strFileText = strFileText.replaceAll("(?i)\\s(am|is|are|was|were)\\s?:", "$1 ");
return strFileText;
}
I think it will take a lot of memory and time, and I just want to reduce the complexity. What changes do I need to make to reduce the processing time and memory? Is there an alternative to the replaceAll function? How can I minimize this code so that it runs faster and with lower memory utilization? Thank you in advance.
Upvotes: 1
Views: 1637
Reputation: 109547
The regex patterns can be improved in spots: `[,.]` or ` ?` (instead of `[ ]?`).
Use compiled static final Patterns outside the functions.
private static final Pattern PAT = Pattern.compile("...");
StringBuffer sb = new StringBuffer();
Matcher m = PAT.matcher(strFileText);
while (m.find()) {
m.appendReplacement(sb, "...");
}
m.appendTail(sb);
strFileText = sb.toString();
This can be optimized by first testing if (m.find()) before creating a new StringBuffer.
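As a minimal sketch of the two suggestions above (a precompiled pattern plus a find() check before allocating the buffer), assuming the feet/inch rule from the question; the class name FeetInchReplacer and the method replace are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FeetInchReplacer {
    // Compiled once when the class is loaded, not on every call.
    private static final Pattern FEET_INCH =
            Pattern.compile("\\b(\\d+) ?' ?(\\d+)''");

    static String replace(String text) {
        Matcher m = FEET_INCH.matcher(text);
        if (!m.find()) {
            return text; // nothing to replace, so no buffer is allocated
        }
        StringBuffer sb = new StringBuffer(text.length() + 16);
        do {
            // $1 and $2 refer to the captured groups, as in replaceAll
            m.appendReplacement(sb, "$1 feet $2 inch");
        } while (m.find());
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replace("a board of 5'10'' length"));
    }
}
```

When the input contains no match, the original String is returned unchanged and no copy is made.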
Upvotes: 0
Reputation: 46392
Use precompiled Pattern and a loop just like Joop Eggen suggested. Group your expressions together. For example, the first two can be written like
`"\\b(\\d++) ?' ?(\\d+)(?:''|\")"`
You can go much further at the expense of readability loss. A single expression for all your replacements is possible, too.
`"\\b(\\d++) ?(?:' ?(?:(\\d+)(?:''|\")|degree ?(\\d++)|...)"`
Then you need to branch on conditions like `group(2) == null`. This gets very hard to maintain, but with a single loop and a cleverly written regex you'll win the race. :D
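A rough sketch of the grouping idea for just the first two rules of the question. Unlike the pattern quoted above, it captures the terminator as a third group so the loop can branch on it; the class name FeetReplacer is hypothetical, and the outputs mirror the replacement strings in the question (the `"` rule adds no "inch").

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FeetReplacer {
    // One pattern for both rules; group 3 records which terminator matched.
    private static final Pattern FEET =
            Pattern.compile("\\b(\\d++) ?' ?(\\d+)(''|\")");

    static String replace(String text) {
        Matcher m = FEET.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Branch on the matched terminator instead of running two passes
            String suffix = m.group(3).equals("''") ? "inch" : "";
            String replacement = m.group(1) + " feet " + m.group(2) + " " + suffix;
            // quoteReplacement keeps '$' and '\' in the text from being interpreted
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

The text is scanned once for both rules, at the cost of a branch inside the loop.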
What would the regex be for words like can't -> cannot, shouldn't -> should not, etc.?
It depends on how exact you want to be. The most trivial way is s.replaceAll("\\Bn't\\b", " not"). The above optimizations apply, so don't ever use replaceAll when performance matters.
A general solution could go like this:
Pattern SHORTENED_WORD_PATTERN =
Pattern.compile("\\b(ca|should|wo|must|might)(n't)\\b");
String getReplacement(String trunk) {
switch (trunk) { // needs Java 7
case "wo": return "will not";
case "ca": return "cannot";
default: return trunk + " not";
}
}
... relevant part of the replacer loop (see [replaceAll][])
while (matcher.find()) {
matcher.appendReplacement(result, getReplacement(matcher.group(1)));
}
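Put together, the fragments above might look like the following self-contained sketch. The class name ContractionReplacer and the method expand are hypothetical; the pattern and the switch come from the snippets above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContractionReplacer {
    private static final Pattern SHORTENED_WORD_PATTERN =
            Pattern.compile("\\b(ca|should|wo|must|might)(n't)\\b");

    private static String getReplacement(String trunk) {
        switch (trunk) { // switch on String needs Java 7
            case "wo": return "will not";
            case "ca": return "cannot";
            default: return trunk + " not";
        }
    }

    static String expand(String text) {
        Matcher matcher = SHORTENED_WORD_PATTERN.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            // group(1) is the trunk, e.g. "ca" for "can't"
            matcher.appendReplacement(result, getReplacement(matcher.group(1)));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

Words whose trunk is not listed in the pattern are left untouched, so the list has to be extended for every contraction that should be handled.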
what should i do in case of strFileText = strFileText.replace("á", "a"); strFileText = strFileText.replace("’", "\'"); strFileText = strFileText.replace("â€Â", "\'"); strFileText = strFileText.replace("ó", "o"); strFileText = strFileText.replace("é", "e"); strFileText = strFileText.replace("á", "a"); strFileText = strFileText.replace("ç", "c"); strFileText = strFileText.replace("ú", "u"); if i want to write this in one line or other way replaceEach() is better for that case
If you are going for efficiency, note that all the above strings start with the same character Ã
. A single regex could like á|’"|...
is much slower than Ã(ƒÂƒÃ‚¡|¢Â€Â™"|...)
(unless the regex engine can optimize it itself, which is currently not the case).
So write a regex where all common prefixes are extracted and use
String getReplacement(String match) {
switch (match) { // needs Java 7
case "á": return "a";
case "’"": return "\\";
...
default: throw new IllegalArgumentException("Unexpected: " + match);
}
}
and
while (matcher.find()) {
matcher.appendReplacement(result, getReplacement(matcher.group()));
}
Maybe a HashMap might be faster than the switch above.
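A sketch of the HashMap variant, under the loudly stated assumption that the text has already been decoded correctly, so the keys are real accented characters rather than the garbled sequences from the comment. The class name AccentReplacer and the exact mapping are hypothetical, and whether this beats the switch would have to be measured.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AccentReplacer {
    // Assumed mapping: accented letters and the typographic apostrophe.
    private static final Map<String, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put("\u00e1", "a");
        REPLACEMENTS.put("\u00f3", "o");
        REPLACEMENTS.put("\u00e9", "e");
        REPLACEMENTS.put("\u00e7", "c");
        REPLACEMENTS.put("\u00fa", "u");
        REPLACEMENTS.put("\u2019", "'");
    }

    // A character class covering exactly the keys of the map
    private static final Pattern SPECIAL =
            Pattern.compile("[\u00e1\u00f3\u00e9\u00e7\u00fa\u2019]");

    static String replace(String text) {
        Matcher matcher = SPECIAL.matcher(text);
        StringBuffer result = new StringBuffer();
        while (matcher.find()) {
            String replacement = REPLACEMENTS.get(matcher.group());
            if (replacement == null) {
                throw new IllegalArgumentException("Unexpected: " + matcher.group());
            }
            matcher.appendReplacement(result, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(result);
        return result.toString();
    }
}
```

Adding a new character then means one more map entry plus one more character in the class, with no change to the loop.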
Upvotes: 1
Reputation: 22241
Optimization methods:
- Use Pattern.compile() for each replace. Create a class, make the patterns fields, and compile the patterns only once. That way you will save a lot of time, since regex compilation takes place each time you call replaceAll() and it is a very costly operation.
- Instead of (\\d+) use (\\d+?).
- [...] (lb. -> pound)?
- [...] the sqkm or feet replaces.
- Use a StringBuilder; then use appendReplacement to process your text (Matcher.appendReplacement accepts a StringBuilder only since Java 9; earlier versions need a StringBuffer).
Moreover, a dot in many of your replaces is unescaped. A dot matches any character. Use \\. instead.
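To illustrate the point about the unescaped dot, here is a small sketch comparing the sq. Km. pattern from the question with an escaped version. The class name DotEscapeDemo and both method names are hypothetical.

```java
import java.util.regex.Pattern;

class DotEscapeDemo {
    // As in the question: each dot matches ANY character.
    static final Pattern UNESCAPED = Pattern.compile("(?i)sq. Km.");
    // Escaped: each dot matches only a literal '.'.
    static final Pattern ESCAPED = Pattern.compile("(?i)sq\\. Km\\.");

    static String withUnescaped(String s) {
        return UNESCAPED.matcher(s).replaceAll(" sqkm");
    }

    static String withEscaped(String s) {
        return ESCAPED.matcher(s).replaceAll(" sqkm");
    }

    public static void main(String[] args) {
        String input = "sqa kmb"; // not an area unit at all
        System.out.println(withUnescaped(input)); // rewritten, because '.' matched 'a' and 'b'
        System.out.println(withEscaped(input));   // left unchanged
    }
}
```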
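Class idea:
import java.util.regex.Pattern;

class RegexProcessor {
    // Compiled once per instance and reused by every call to process()
    private final Pattern feet1rep = Pattern.compile("\\b(\\d+)[ ]?'[ ]?(\\d+)\"");
    // ...
    public String process(String org) {
        String mod = feet1rep.matcher(org).replaceAll("$1 feet $2 ");
        // ...
        return mod;
    }
}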
Upvotes: 3
Reputation: 2670
The StringBuffer and StringBuilder classes are used when many modifications need to be made to strings of characters.
Unlike Strings, objects of type StringBuffer and StringBuilder can be modified over and over again without leaving behind a lot of new unused objects.
The StringBuilder class was introduced in Java 5, and the main difference between StringBuffer and StringBuilder is that StringBuilder's methods are not thread-safe (not synchronized).
It is recommended to use StringBuilder whenever possible because it is faster than StringBuffer. However, if thread safety is necessary, the best option is StringBuffer.
public class Test{
public static void main(String args[]){
StringBuffer sBuffer = new StringBuffer(" test");
sBuffer.append(" String Buffer");
System.out.println(sBuffer);
}
}
public class StringBuilderDemo {
public static void main(String[] args) {
String palindrome = "Dot saw I was Tod";
StringBuilder sb = new StringBuilder(palindrome);
sb.reverse(); // reverse it
System.out.println(sb);
}
}
So, according to your needs, you can select one of them.
Reference http://docs.oracle.com/javase/tutorial/java/data/buffers.html
Upvotes: 1