Jason Yang

Reputation: 130

Java - GC a large string

I have a method to read and parse an extremely long XML file. The file is read into a string, which is then parsed by a different class. However, this causes Java to use a large amount of memory (~500 MB). Normally the program runs at around 30 MB, but when parse() is called it increases to 500 MB. When parse() finishes, however, the memory usage doesn't drop back to 30 MB; it stays at 500 MB.

I've tried setting s = null and calling System.gc() but the memory usage still stays at 500 MB.

public void parse(){
        try {
            System.out.println("parsing data...");
            String path = dir + "/data.xml";
            InputStream i = new FileInputStream(path);
            BufferedReader reader = new BufferedReader(new InputStreamReader(i));
            String line;
            String s = "";
            while ((line = reader.readLine()) != null){
                s += line + "\n";
            }

            ... parse ...

        } catch (FileNotFoundException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
}

Any ideas?

Thanks.

Upvotes: 3

Views: 1101

Answers (4)

RoHaN

Reputation: 1355

You should keep in mind that calling System.gc() will not necessarily run garbage collection; it only suggests that the JVM run the collector, and the JVM is free to ignore the request. It is also better to use a StringBuilder, to reduce the number of String objects you create in memory: a StringBuilder only creates a String when you call toString() on it.
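As a rough sketch of what the question's read loop might look like with a StringBuilder (the class and method names here are illustrative, and a StringReader stands in for the file):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class StringBuilderRead {
    // Accumulates the lines in one mutable buffer instead of creating a
    // new String on every += concatenation.
    static String readAll(BufferedReader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append('\n');
        }
        return sb.toString(); // the only String built from the buffer
    }

    public static void main(String[] args) throws IOException {
        // StringReader stands in for the FileInputStream in the question.
        String s = readAll(new BufferedReader(new StringReader("a\nb")));
        System.out.println(s); // prints "a" and "b" on separate lines
    }
}
```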

Upvotes: 0

user207421

Reputation: 311039

The 500MB is caused by parsing, so it has nothing to do with the string, or the BufferedReader either. It is the DOM of the parsed XML. Release that and your memory usage will revert.

But why read the entire file into a string? This is a waste of time and space. Just parse the input directly from the file.
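For illustration, a minimal sketch of parsing straight from the file with the JDK's DOM parser (the question doesn't show which parser class is used, so DocumentBuilder is an assumption, and the temp file here stands in for dir + "/data.xml"):

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;

public class DirectParse {
    // The parser streams the file's bytes itself, so no intermediate
    // String copy of the whole document is ever built.
    static String rootTag(File xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(xml);
        return doc.getDocumentElement().getTagName();
    }

    public static void main(String[] args) throws Exception {
        // Temp file stands in for the data.xml path from the question.
        Path tmp = Files.createTempFile("data", ".xml");
        Files.writeString(tmp, "<root><item/></root>");
        System.out.println(rootTag(tmp.toFile())); // prints "root"
    }
}
```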

Upvotes: 0

CharithJ

Reputation: 47570

Solution for your memory leak

You should close the BufferedReader at the end in order to close the stream and release any system resources associated with it. You can close both the InputStream and the BufferedReader; however, closing the BufferedReader closes its underlying stream as well.

Generally it's better to do this in a finally block.

finally 
{
   reader.close(); // also closes the underlying InputStream
}

Better approach try-with-resources Statement

try (BufferedReader br = new BufferedReader(new FileReader(path))) 
{
        return br.readLine();
}

Bonus Note

Use a StringBuilder instead of concatenating strings

String does not allow appending in place. Each append/concatenation on a String creates a new object and returns it. This is because String is immutable - it cannot change its internal state.

StringBuilder, on the other hand, is mutable. When you call append(), it modifies an internal char array rather than creating a new String object.

Thus it is more memory efficient to use a StringBuilder when you want to append many strings.

Upvotes: 2

Oly

Reputation: 2479

Just a note: a try-with-resources block will help you a lot with IO objects like those readers.

try(InputStream i = new FileInputStream(path);
    BufferedReader reader = new BufferedReader(new InputStreamReader(i))) {
    //your reading here
}

This will make sure these objects are disposed of by calling close() on them, regardless of how your method block exits (success, exception...). Closing these objects may also help to free up some memory.

The thing that's probably causing the big slowdown, and the blowup in memory usage, though, is your string concatenation. Calling s += line + "\n" is fine for a single concatenation, but the + operator has to create a new String instance each time and copy over the characters from the strings being concatenated. The StringBuilder class was designed for exactly this purpose. :)

Upvotes: 0
