Reputation: 5398
I have created an application to process log files but am having some bottle neck when the amount of files = ~20
The issue comes from a particular method which takes on average a second or so to complete roughly and as you can imagime this isn't practical when it needs to be done > 50 times
private String getIdFromLine(String line){
String[] values = line.split("\t");
String newLine = substringBetween(values[4], "Some String : ", "Value=");
String[] split = newLine.split(" ");
return split[1].substring(4, split[1].length());
}
private String substringBetween(String str, String open, String close) {
if (str == null || open == null || close == null) {
return null;
}
int start = str.indexOf(open);
if (start != -1) {
int end = str.indexOf(close, start + open.length());
if (end != -1) {
return str.substring(start + open.length(), end);
}
}
return null;
}
A line comes from the reading of a file which is very efficient so I don't feel a need to post that code unless someone asks.
Is there anyway to improve perofmrance of this at all?
Thanks for your time
Upvotes: 1
Views: 544
Reputation: 34900
One of the main problems in this code is the "split
" method.
For example this one:
private String getIdFromLine3(String line) {
int t_index = -1;
for (int i = 0; i < 3; i++) {
t_index = line.indexOf("\t", t_index+1);
if (t_index == -1) return null;
}
//String[] values = line.split("\t");
String newLine = substringBetween(line.substring(t_index + 1), "Some String : ", "Value=");
// String[] split = newLine.split(" ");
int p_index = newLine.indexOf(" ");
if (p_index == -1) return null;
int p_index2 = newLine.indexOf(" ", p_index+1);
if (p_index2 == -1) return null;
String split = newLine.substring(p_index+1, p_index2);
// return split[1].substring(4, split[1].length());
return split.substring(4, split.length());
}
UPD: It could be 3 times faster.
Upvotes: 1
Reputation: 536
Could you try the regex anyway and post results please just for comparison:
Pattern p = Pattern.compile("(Some String : )(.*?)(Value=)"); //remove first and last group if not needed (adjust m.group(x) to match
@Test
public void test2(){
String str = "Long java line with Some String : and some object with Value=154345 ";
System.out.println(substringBetween(str));
}
private String substringBetween(String str) {
Matcher m = p.matcher(str);
if(m.find(2)){
return m.group(2);
}else{
return null;
}
}
If this is faster find a regex that combines both functions
Upvotes: 0
Reputation: 3762
A few things are likely problematic:
Whether or not you realized, you are using regular expressions. The argument to String.split()
is a treated as a regex. Using String.indexOf()
will almost certainly be a faster way to find the particular portion of the String that you want. As HRgiger points out, Guava's splitter is a good choice because it does just that.
You're allocating a bunch of stuff you don't need. Depending on how long your lines are, you could be creating a ton of extra String
s and String[]
s that you don't need (and the garbage collecting them). Another reason to avoid String.split()
.
I also recommend using String.startsWith()
and String.endsWith()
rather that all of this stuff that you're doing with the indexOf()
if only for the fact that it'd be easier to read.
Upvotes: 3
Reputation: 3337
I would recommend to use the VisualVM to find the bottle neck before oprimisation.
If you need performance in your application, you will need profiling anyways.
As optimisation i would make an custom loop to replace yours substringBetween
method and get rid of multiple indexOf
calls
Upvotes: 0