Reputation: 43
I have a compressed file. First I need to decompress it, then read its contents line by line and process each line by splitting out two fields, using one of them as the key and encrypting the other. Some of the code is as follows:
    try (GZIPInputStream stream = new GZIPInputStream(new ByteArrayInputStream(event.getBody()));
         BufferedReader br = new BufferedReader(new InputStreamReader(stream))) {
        String line;
        StringBuilder builder = new StringBuilder();
        while ((line = br.readLine()) != null) {
            builder.append(line);
            this.handleLine(builder);
            builder.setLength(0);
            builder.trimToSize();
        }
    } catch (Exception e) {
        // ignore
    }
Is it appropriate to use StringBuilder like this? A line of data looks like aaa|bbb|ccc|ddd|eee|fff|ggg|hhh. What I want to know is how to correctly use String and StringBuilder in a loop over this extremely large amount of data.
Upvotes: 0
Views: 108
Reputation: 308259
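For handling many individual items in a loop there are basically two rules related to memory management, and breaking either one is a possible source of trouble:
1. Don't hold on to data from earlier iterations that you no longer need.
2. Don't create more garbage per iteration than necessary.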
Violating #1 would mean that your total memory usage would increase throughout the loop and thus create an upper limit to how many items you can handle.
Violating #2 would "only" cause more garbage collection pauses and not cause your application to fail (i.e. it'd slow down, but still work).
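To make the first kind of trouble (memory usage that grows throughout the loop) concrete, here is a minimal sketch. The class and method names are made up for illustration and are not from the question. Both methods count non-empty lines, but the first keeps every line reachable until the end, while the second lets each line become garbage as soon as its iteration is over.

```java
import java.util.ArrayList;
import java.util.List;

final class RetentionDemo {
    // Keeps a reference to every line, so memory use grows with the input size.
    static int countRetaining(Iterable<String> lines) {
        List<String> seen = new ArrayList<>();
        for (String line : lines) {
            if (!line.isEmpty()) {
                seen.add(line); // still reachable after this iteration
            }
        }
        return seen.size();
    }

    // Keeps only a counter; each line can be collected after its iteration.
    static int countStreaming(Iterable<String> lines) {
        int count = 0;
        for (String line : lines) {
            if (!line.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("aaa|bbb", "", "ccc|ddd");
        System.out.println(countRetaining(lines));
        System.out.println(countStreaming(lines));
    }
}
```

The results are identical; only the amount of memory that stays reachable during the loop differs.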
If you actually need the StringBuilder (as indicated by your comment) then you should get rid of the trimToSize() call (as Stephen C correctly commented), because it will basically force the StringBuilder to re-allocate space for the content of line in each iteration (effectively gaining you very, very little over just plain re-creating the StringBuilder in each iteration).
The only drawback of removing that call is that the memory used by the StringBuilder will never be reduced until the loop has finished. As long as there are no extreme outliers in line length in that file, that is probably not a problem.
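A minimal sketch of the loop with the builder reused and trimToSize() removed might look like this. The handleLine body is not shown in the question, so the version here is a hypothetical stand-in, and the class name and the in-memory list of lines are assumptions made to keep the example self-contained.

```java
import java.util.List;

final class BuilderReuse {
    // Hypothetical stand-in for the asker's handleLine(StringBuilder).
    static int handleLine(StringBuilder sb) {
        return sb.length();
    }

    /** Processes every line with one shared builder and returns its final capacity. */
    static int processAll(Iterable<String> lines) {
        StringBuilder builder = new StringBuilder();
        for (String line : lines) {
            builder.append(line);
            handleLine(builder);
            // Resets the length to zero but keeps the backing array,
            // so the next append can reuse the already allocated space.
            builder.setLength(0);
        }
        return builder.capacity();
    }

    public static void main(String[] args) {
        int capacity = processAll(List.of("aaa|bbb|ccc", "a considerably longer line of data", "x"));
        System.out.println("capacity after loop: " + capacity);
    }
}
```

The capacity after the loop is at least the length of the longest line seen, which is the "memory never reduced until the loop has finished" trade-off described above.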
As an additional side-note: you mention that String.split is too inefficient for you. A major source of that inefficiency can be that it may need to re-compile the regular expression on every call. If you pre-compile the pattern outside of the loop using Pattern.compile and then call Pattern.split() inside the loop, then that might already be much quicker.
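As a rough sketch of that suggestion, assuming the delimiter is the pipe character from the sample line in the question (the class and method names are invented for the example):

```java
import java.util.regex.Pattern;

final class PipeSplitter {
    // Compiled once, outside any loop. "\\|" matches a literal pipe character.
    private static final Pattern PIPE = Pattern.compile("\\|");

    // Pattern.split accepts any CharSequence, so a StringBuilder can be passed directly.
    static String[] fields(CharSequence line) {
        return PIPE.split(line);
    }

    public static void main(String[] args) {
        StringBuilder line = new StringBuilder("aaa|bbb|ccc|ddd|eee|fff|ggg|hhh");
        for (String field : fields(line)) {
            System.out.println(field);
        }
    }
}
```

Whether this is measurably faster for your data is something to confirm with a benchmark on a realistic file.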
Upvotes: 4