Reputation: 982
Using a 1 GB Java heap (`-Xmx1g`), I store data in many large byte arrays. I get an `OutOfMemoryError` quite some time before 1 GB of data is stored. At that time there is still quite a lot of free heap, as calculated by `rt.maxMemory() - rt.totalMemory() + rt.freeMemory()` (with `rt = Runtime.getRuntime()`):
| byte array size | approx. data that can be stored | approx. free heap shown |
|---|---|---|
| 2^18 (262144) | 800 MB | 270 MB |
| 2^17 (131072) | 930 MB | 140 MB |
| 2^16 (65536) | 997 MB | 72 MB |
| 2^15 (32768) | 1032 MB | 36 MB |
Why is the heap size calculation off for large byte arrays, and can I do something to mend it?
Note: with byte arrays of size 2^19 or larger, a different effect occurs (Java byte array of 1 MB or more takes up twice the RAM) - let's focus this question on 2^18-sized byte arrays.
Run on the 64-Bit Server VM of AdoptOpenJDK 11.0.11, both on Windows (`java -cp .\lib\* -Xmx1g tryit.Main`) and on Debian (`java -cp .:./lib/* -Xmx1g tryit.Main`):
```java
package tryit;

public class Main {
    public static void main(String[] args) throws Exception {
        byte[][] array = new byte[1000000][];
        long freeAtStart = free();
        System.out.println("Free at start: " + freeAtStart);
        int chunkSize = 2 << 17; // 2 << 17 == 2^18 == 262144
        System.out.println("Chunk size   : " + chunkSize);
        for (int n = 0; n < 1000000; n++) {
            if (n % 50 == 0) {
                long currentFree = free();
                // Cast to long to avoid int overflow of n * chunkSize for large n.
                System.out.printf("%d: stored %d / allocated %d / free %d%n",
                        n, (long) n * chunkSize, freeAtStart - currentFree, currentFree);
            }
            array[n] = new byte[chunkSize];
        }
    }

    static long free() throws Exception {
        System.gc();       // Called just in case - there should not be anything to garbage collect.
        Thread.sleep(100); // Give the GC some time to work.
        Runtime rt = Runtime.getRuntime();
        return rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
    }
}
```
And finally the (shortened) output of the four runs, plus a 2^19 run for comparison:

```
2^15:
Free at start: 1068751960 / Chunk size: 32768
31500: stored 1032192000 / allocated 1032933912 / free 35818048

2^16:
Free at start: 1068751960 / Chunk size: 65536
15200: stored 996147200 / allocated 996627400 / free 72124560

2^17:
Free at start: 1068751960 / Chunk size: 131072
7100: stored 930611200 / allocated 930960032 / free 137791928

2^18:
Free at start: 1068751960 / Chunk size: 262144
3050: stored 799539200 / allocated 799823160 / free 268928800

2^19 (humongous objects - allocation size is twice the stored size):
Free at start: 1068751960 / Chunk size: 524288
1000: stored 524288000 / allocated 1048811120 / free 19940840
```
Upvotes: 2
Views: 1360
Reputation: 21435
As described in the linked answer (Java byte array of 1 MB or more takes up twice the RAM) and in the G1 garbage collector documentation, the G1 garbage collector divides the heap into regions of 1 MByte (2^20 bytes) each. For a 1 GB heap that gives 1024 regions (probably a few less due to administrative overhead).
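You can verify the region size your JVM actually chose by dumping the final flag values (standard HotSpot options; replace `grep` with `findstr` on Windows):

```
java -Xmx1g -XX:+PrintFlagsFinal -version | grep G1HeapRegionSize
```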
Naively you would expect that a region of 2^20 bytes can hold 4 byte arrays of 2^18 bytes each - but unfortunately this is not the case. Byte arrays are objects, and objects have a hidden object header (see https://stackoverflow.com/a/50509263 for an explanation).
So the effective size of a `byte[262144]` is not 262144 bytes but 262160 bytes (depending on the JVM and the maximum heap size it could be even more), which means that each region can hold only 3 byte arrays of length 262144.
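You can inspect the header overhead yourself with the JOL (Java Object Layout) library - a sketch assuming the `org.openjdk.jol:jol-core` artifact is on the classpath:

```java
import org.openjdk.jol.info.ClassLayout;

public class ArrayLayout {
    public static void main(String[] args) {
        // Prints the header layout and total instance size of the array;
        // on a typical 64-bit HotSpot JVM this reports 262160 bytes.
        System.out.println(ClassLayout.parseInstance(new byte[262144]).toPrintable());
    }
}
```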
Combining 3 byte arrays per region with 1024 regions gives you a maximum of 3072 byte arrays of 262144 bytes for a 1 GB heap - about 805 MB of payload - which matches nicely with your numbers.
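The same arithmetic as a runnable sketch (the 16-byte array header is an assumption for a typical 64-bit HotSpot JVM; see the linked answer on object headers):

```java
public class RegionMath {
    public static void main(String[] args) {
        long regionSize = 1L << 20;                // 1 MB G1 region
        long headerSize = 16;                      // assumed array header on 64-bit HotSpot
        long effective  = (1L << 18) + headerSize; // 262160 bytes per byte[262144]
        long perRegion  = regionSize / effective;  // only 3 arrays fit, not 4
        long regions    = (1L << 30) / regionSize; // 1024 regions in a 1 GB heap
        // 3 * 1024 * 262144 = 805306368 bytes, i.e. roughly the 800 MB observed
        System.out.println(perRegion * regions * (1L << 18));
    }
}
```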
What you can do about it:

- Increase the region size (e.g. `-XX:G1HeapRegionSize=4M`, see the example run below) - a 4MB region can hold 15 byte arrays of length 262144, whereas 4 regions of 1MB can hold only 12 byte arrays of length 262144.

Note: this post uses 2^20 to mean two to the power of twenty, which is not the same as the Java expression `2^20` (that is bitwise XOR), but rather `1<<20`.
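For example, re-running the test program from the question with a larger region size (classpath taken from the Debian command line above):

```
java -cp .:./lib/* -Xmx1g -XX:G1HeapRegionSize=4M tryit.Main
```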
Upvotes: 5