Reputation: 191
I am reading a file to parse later on. The file is not likely to exceed an MB in size, so this is perhaps not a crucial question for me at this stage. But for best practise reasons, I'd like to know when is the optimum time to perform an operation.
Example:
Using a method I've pasted from http://www.dzone.com/snippets/java-read-file-string, I am reading a buffer into a string. I would now like to remove all whitespace. My method is currently this:
private String listRaw;
public boolean readList(String filePath) throws java.io.IOException {
StringBuffer fileData = new StringBuffer(1024);
BufferedReader reader = new BufferedReader(
new FileReader(filePath));
char[] buf = new char[1024];
int numRead=0;
while((numRead=reader.read(buf)) != -1){
String readData = String.valueOf(buf, 0, numRead);
fileData.append(readData);
buf = new char[1024];
}
reader.close();
listRaw = fileData.toString().replaceAll("\\s","");
return true;
}
So, I remove all whitespace from the string at the time I store it - in it's entirety - to a class variable.
To me, this means less processing but more memory usage. Would I be better off applying the replaceAll()
operation on the readData
variable as I append it to fileData
for best practise reasons? Using more processing but avoiding passing superfluous whitespace around.
I imagine this has little impact for a small file like the one I am working on, but what if it's a 200MB log file?
Is it entirely case-dependant, or is there a consensus I'd do better to follow?
Thanks for the input everybody. I'm sure you've helped to aim my mindset in the right direction for writing Java.
I've updated my code to take into consideration the points raised. Including the suggestion by Don Roby that at some point, I may want to keep spaces. Hopefully things read better now!
private String listRaw;
public boolean readList(String filePath) throws java.io.IOException {
StringBuilder fileData = new StringBuilder(51200);
BufferedReader reader = new BufferedReader(new FileReader(filePath));
char[] buf = new char[51200];
boolean spaced = false;
while(reader.read(buf) != -1){
for(int i=0;i<buf.length;i++) {
char c = buf[i];
if (c != '\t' && c != '\r' && c != '\n') {
if (c == ' ') {
if (spaced) {
continue;
}
spaced = true;
} else {
spaced = false;
}
fileData.append(c);
}
}
}
reader.close();
listRaw = fileData.toString().trim();
return true;
}
Upvotes: 0
Views: 113
Reputation: 692181
You'd better create and apply the regexp replacement only once, at the end. But you would gain much more by
To avoid an unnecessary long temporary String creation, you could read char by char, and only append the char to the StringBuilder if it's not a whitespace. In the end, the StringBuilder would contain only the good characters, and you wouldn't need any replaceAll()
call.
Upvotes: 6
Reputation: 41145
Depending somewhat on the parse you're going to do, you may well be better off not removing the spaces in a separate step at all, and just ignore them during the parse.
It's also reasonably rare to want to remove all whitespace. Are you sure you don't want to just replace multiple spaces with single spaces?
Upvotes: 0
Reputation: 81724
THere are actually several very significant inefficiencies in this code, and you'd have to fix them before worrying about the relatively less important issue you've raised.
First, don't create a new buf
object on each iteration of the loop -- use the same one! There's no problem with doing so -- the new data overwrites the old, and you save on object allocation (which is one of the more expensive operations you can do.)
Second, similarly, don't create a String
to call append()
-- use the form of append
that takes a char
array and an offset (0, in this case) and length (numRead
, in this case.) Again, you create one less object per loop iteration.
Finally, to come to the question you actually asked: doing it in the loop would create a String object per iteration, but with the tuning we've just done, you're creating zero objects per iterataion -- so removing the whitespace at the end of the loop is the clear winner!
Upvotes: 3