John David Ravenscroft
John David Ravenscroft

Reputation: 191

Optimum time to perform an operation: within, or after loop

I am reading a file to parse later on. The file is not likely to exceed an MB in size, so this is perhaps not a crucial question for me at this stage. But for best practise reasons, I'd like to know when is the optimum time to perform an operation.

Example:

Using a method I've pasted from http://www.dzone.com/snippets/java-read-file-string, I am reading a buffer into a string. I would now like to remove all whitespace. My method is currently this:

private String listRaw;

public boolean readList(String filePath) throws java.io.IOException {
    StringBuffer fileData = new StringBuffer(1024);
    BufferedReader reader = new BufferedReader(
            new FileReader(filePath));
    char[] buf = new char[1024];
    int numRead=0;
    while((numRead=reader.read(buf)) != -1){
        String readData = String.valueOf(buf, 0, numRead);
        fileData.append(readData);
        buf = new char[1024];
    }
    reader.close();
    listRaw = fileData.toString().replaceAll("\\s","");
    return true;
}

So, I remove all whitespace from the string at the time I store it - in it's entirety - to a class variable.

To me, this means less processing but more memory usage. Would I be better off applying the replaceAll() operation on the readData variable as I append it to fileData for best practise reasons? Using more processing but avoiding passing superfluous whitespace around.

I imagine this has little impact for a small file like the one I am working on, but what if it's a 200MB log file?

Is it entirely case-dependant, or is there a consensus I'd do better to follow?


Thanks for the input everybody. I'm sure you've helped to aim my mindset in the right direction for writing Java.

I've updated my code to take into consideration the points raised. Including the suggestion by Don Roby that at some point, I may want to keep spaces. Hopefully things read better now!

private String listRaw;

public boolean readList(String filePath) throws java.io.IOException {
    StringBuilder fileData = new StringBuilder(51200);
    BufferedReader reader = new BufferedReader(new FileReader(filePath));
    char[] buf = new char[51200];
    boolean spaced = false;
    while(reader.read(buf) != -1){
        for(int i=0;i<buf.length;i++) {
            char c = buf[i];
            if (c != '\t' && c != '\r' && c != '\n') {
                if (c == ' ') {
                    if (spaced) {
                        continue;
                    }
                    spaced = true;
                } else {
                    spaced = false;
                }

                fileData.append(c);
            }
        }
    }
    reader.close();
    listRaw = fileData.toString().trim();
    return true;
}

Upvotes: 0

Views: 113

Answers (3)

JB Nizet
JB Nizet

Reputation: 692181

You'd better create and apply the regexp replacement only once, at the end. But you would gain much more by

  • initializing the StringBuilder with a reasonable size
  • avoiding the creation of a String inside the loop, and append the read characters directly to the StringBuilder
  • avoiding the instantiation of a new char buffer, for nothing, at each iteration.

To avoid an unnecessary long temporary String creation, you could read char by char, and only append the char to the StringBuilder if it's not a whitespace. In the end, the StringBuilder would contain only the good characters, and you wouldn't need any replaceAll() call.

Upvotes: 6

Don Roby
Don Roby

Reputation: 41145

Depending somewhat on the parse you're going to do, you may well be better off not removing the spaces in a separate step at all, and just ignore them during the parse.

It's also reasonably rare to want to remove all whitespace. Are you sure you don't want to just replace multiple spaces with single spaces?

Upvotes: 0

Ernest Friedman-Hill
Ernest Friedman-Hill

Reputation: 81724

THere are actually several very significant inefficiencies in this code, and you'd have to fix them before worrying about the relatively less important issue you've raised.

First, don't create a new buf object on each iteration of the loop -- use the same one! There's no problem with doing so -- the new data overwrites the old, and you save on object allocation (which is one of the more expensive operations you can do.)

Second, similarly, don't create a String to call append() -- use the form of append that takes a char array and an offset (0, in this case) and length (numRead, in this case.) Again, you create one less object per loop iteration.

Finally, to come to the question you actually asked: doing it in the loop would create a String object per iteration, but with the tuning we've just done, you're creating zero objects per iterataion -- so removing the whitespace at the end of the loop is the clear winner!

Upvotes: 3

Related Questions