Joshua

Reputation: 125

VisualVM heap profile shows int[] arrays taking a huge amount of memory

I have a Spring Boot program running on a server under OpenJDK (jdk1.8). It consumes about 200 to 300 million records from Kafka each day and writes them to CSV files. Less than 2 hours after startup it is using more than 6GB of memory. So I inspected the heap with jmap -histo and found that int[] arrays use 2.6GB and byte[] arrays use 1.3GB. But I defined neither int[] nor byte[] in my project. I'm using spring kafka (org.springframework.kafka, version 2.3.3) to consume Kafka messages and opencsv (com.opencsv, version 4.6) to write the CSV files.

Does anyone know the reason?

Below is part of my code:

public <T> Boolean parseDataToFile(String filePath, List<T> data) throws IOException, CsvDataTypeMismatchException, CsvRequiredFieldEmptyException {
    if (data == null || data.size() <= 0) {
        return false;
    }
    File file = new File(filePath);
    // Create the parent directory
    boolean mkdirs = file.getParentFile().mkdirs();
    Writer writer = null;
    try {
        writer = new FileWriter(filePath, true);
        StatefulBeanToCsv beanToCsv = new StatefulBeanToCsvBuilder(writer).withThrowExceptions(false).withSeparator(',').build();
        beanToCsv.write(data);
        return true;
    } finally {
        if (writer != null) {
                writer.flush();
                writer.close();
        }
    }
}

Addition: in the Instances view, most of them (more than 90%) are unused (their retained size is 0), so they can be GCed? But why aren't they? What data is in these int[] and byte[] arrays?

Upvotes: 3

Views: 2778

Answers (2)

Joshua

Reputation: 125

Thanks to @AlBlue and @JurajMartinka. I analysed the byte[] and int[] arrays and found part of the answer. There are 166921 int[] instances, using 2.6GB of memory (57.3%). Most of them have no references. Some of them are used by Kafka, and others by the Spring Boot loader.

On the other hand, there are 39400 byte[] instances, using 1.3GB of memory (28.8%). Most of them hold Kafka data; some belong to other referenced dependencies.

The memory actually used by live objects is not large. MAT (the Eclipse Memory Analyzer Tool) shows that only 43.6M is occupied.
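One way to cross-check the live-object figure is to take a dump that contains only reachable objects. From the command line, jmap -histo:live <pid> restricts the histogram in that way. The sketch below does something similar from inside the JVM with HotSpotDiagnosticMXBean.dumpHeap, assuming a HotSpot-based JVM such as OpenJDK. The class name LiveHeapDump and the file naming are hypothetical, chosen only for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

public class LiveHeapDump {

    /**
     * Writes a heap dump of the current JVM to a new .hprof file in the given
     * directory. When liveOnly is true, only reachable objects are written.
     */
    public static Path dump(Path directory, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean diagnostics =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // dumpHeap refuses to overwrite an existing file, so use a fresh name.
        Path target = directory.resolve("heap-" + System.nanoTime() + ".hprof");
        diagnostics.dumpHeap(target.toString(), liveOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path all = dump(dir, false);
        Path live = dump(dir, true);
        System.out.println("all objects:  " + Files.size(all) + " bytes");
        System.out.println("live objects: " + Files.size(live) + " bytes");
    }
}
```

If the live-only dump is much smaller than the full one, the difference is mostly garbage that has not been collected yet, which would be consistent with the many unreferenced arrays seen above.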

Still, many questions remain to be answered, such as when GC runs and where netty is used.
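On the question of when GC runs: in general the JVM collects garbage when it decides it needs the space, not as soon as an object becomes unreachable, so unreferenced arrays can stay in the heap for a while. The sketch below illustrates this under simple assumptions. It allocates short-lived int[] buffers, drops them, and prints the used heap before and after requesting a collection. GarbageDemo, usedHeap and churn are hypothetical names, and System.gc() is only a request that the JVM may ignore, so the numbers will vary between runs and collectors.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class GarbageDemo {

    /** Returns the currently used heap in bytes, as reported by the JVM. */
    public static long usedHeap() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Allocates short-lived int[] buffers and drops every reference to them. */
    public static long churn(int buffers, int length) {
        long checksum = 0;
        for (int i = 0; i < buffers; i++) {
            int[] scratch = new int[length]; // unreachable after this iteration
            scratch[0] = i;
            checksum += scratch[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = usedHeap();
        churn(64, 256 * 1024); // about 64 MB of int[] allocated in total
        long afterChurn = usedHeap();
        System.gc(); // a request only; the JVM decides whether to collect
        long afterGc = usedHeap();
        System.out.println("used before churn: " + before);
        System.out.println("used after churn:  " + afterChurn);
        System.out.println("used after gc:     " + afterGc);
    }
}
```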

Upvotes: 3

AlBlue

Reputation: 24040

The occurrence of byte or char arrays in a program is typically due to Strings used by the code. You should be able to see if they are being referenced by String objects by looking at the memory dominators in MAT.
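Which array type backs a String depends on the JDK version: in OpenJDK 8 a String holds a char[], while from JDK 9 onwards (compact strings) it holds a byte[]. A small sketch to check this on a given JVM is shown here. It reads the declared type of the private field named value, which is an OpenJDK implementation detail and not a public API, so treat the field name as an assumption. StringBacking is a hypothetical class name.

```java
import java.lang.reflect.Field;

public class StringBacking {

    /** Returns the declared type of the private field that holds a String's characters. */
    public static Class<?> backingArrayType() throws NoSuchFieldException {
        // "value" is an OpenJDK implementation detail, not part of the public API.
        Field value = String.class.getDeclaredField("value");
        return value.getType();
    }

    public static void main(String[] args) throws NoSuchFieldException {
        System.out.println(System.getProperty("java.version")
                + " -> String is backed by " + backingArrayType().getSimpleName());
    }
}
```

On a JDK 8 runtime such as the one in the question, large byte[] totals are therefore more likely to come from I/O buffers or message payloads than from String objects themselves.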

The use of int arrays is far less common in general code in the JDK so you would need to find where they are being dominated from to find out.

However note that in both cases, what you are directly using in your code is not relevant; the usage is most likely from your dependencies under the covers. Whether it’s the JVM classes or some other cache mechanism is likely to depend on where the objects are being referred from, so the next stage is to use the tooling to find that out.

Upvotes: 2
