Reputation: 42575
It is a well-known limitation that the common way to store data in byte[] arrays is limited to 2^31 bytes (2 GB), because array indices are ints.
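To make the limit concrete, here is a minimal illustration; the exact maximum array length is VM-dependent and usually a few elements below Integer.MAX_VALUE:

```java
public class ArrayLimitDemo {
    public static void main(String[] args) {
        // Array sizes are int, so a 4 GB request does not even compile:
        // byte[] big = new byte[4L * 1024 * 1024 * 1024]; // error: lossy conversion from long to int

        // The practical ceiling is just below Integer.MAX_VALUE and needs a large heap (e.g. -Xmx3g):
        byte[] almostMax = new byte[Integer.MAX_VALUE - 8];
        System.out.println("Allocated " + almostMax.length + " bytes");
    }
}
```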
There are plenty of Java bug reports and Java specification requests that address this issue. Some of them were filed at the beginning of this century!
However, all related entries I found were closed and/or marked as duplicates. In the meantime, every consumer PC has enough memory that this issue becomes more and more important.
Therefore I am asking myself:
What is the official (Java) way to handle large in-memory data? E.g. storing 4 GB in RAM.
If there is no official solution, what is the common solution used by the community?
Note: I do not consider saving the data to temporary files a solution. Servers with more than 100 GB of RAM are not uncommon...
Upvotes: 4
Views: 4787
Reputation: 42575
As the existing good answers by Andremoniy and Kostiantyn are not what I had in mind, I investigated the topic a bit deeper.
What I originally had in mind was a library or a code snippet of a Java class that handles all the magic internally (e.g. chunking the data into multiple byte arrays). But as questions asking for library recommendations are immediately closed because of some stupid rule, I could not write that in my question.
This is the collection of what I found regarding existing solutions:
Provides one-dimensional arrays with 64-bit indices. It uses multiple byte[] arrays to store the data internally. It seems to have been developed for storing large arrays of numbers, as it provides methods for sorting the values in the array.
Negative: Accessing a BigArrays seems to be a bit complicated, as there is no adaptation to InputStream or OutputStream.
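If stream access to such a chunked array is needed, a thin adapter can be written by hand. The sketch below is mine, not part of any library mentioned here: it assumes the data already lives in a byte[][] and exposes it through a plain InputStream.

```java
import java.io.InputStream;

// Hypothetical adapter: exposes a chunked byte[][] as an InputStream.
class ChunkedByteArrayInputStream extends InputStream {
    private final byte[][] chunks;
    private int chunkIndex = 0; // which inner array is being read
    private int offset = 0;     // position inside the current inner array

    ChunkedByteArrayInputStream(byte[][] chunks) {
        this.chunks = chunks;
    }

    @Override
    public int read() {
        // Skip exhausted (or empty) chunks.
        while (chunkIndex < chunks.length && offset >= chunks[chunkIndex].length) {
            chunkIndex++;
            offset = 0;
        }
        if (chunkIndex >= chunks.length) {
            return -1; // end of data
        }
        return chunks[chunkIndex][offset++] & 0xFF; // return the byte as an unsigned value
    }
}
```

An OutputStream counterpart works the same way in reverse, appending into the current chunk and moving on to the next one when it is full.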
The elasticsearch project contains classes that allow handling of large byte (and other primitive) arrays. The important classes are located in the package org.elasticsearch.common.util.
The disadvantage is that these classes are only available as part of the elasticsearch core library, which is pretty large and additionally has a lot of dependencies. However, as it uses the Apache 2.0 license, extracting and repackaging the necessary classes seems to be a reasonable way to go.
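Usage is roughly along the following lines. The exact class and method names (BigArrays, ByteArray, newByteArray, get/set) differ between elasticsearch versions, so treat this as an unverified sketch:

```java
import org.elasticsearch.common.util.BigArrays;
import org.elasticsearch.common.util.ByteArray;

public class BigArraysSketch {
    public static void main(String[] args) {
        long size = 4L * 1024 * 1024 * 1024; // 4 GB, addressed with a long index

        // NON_RECYCLING_INSTANCE is the simplest factory outside of a running node.
        ByteArray bytes = BigArrays.NON_RECYCLING_INSTANCE.newByteArray(size);
        try {
            bytes.set(3_000_000_000L, (byte) 42);          // write beyond the 2 GB limit
            System.out.println(bytes.get(3_000_000_000L)); // read it back
        } finally {
            bytes.close(); // the array is a Releasable and should be closed
        }
    }
}
```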
I found a very interesting hint that Sun planned back in 2009 to provide a class named BigByteBuffer for Java NIO.2. In 2010 Oracle bought Sun, and now, 8 years later, we still have neither a BigByteBuffer nor byte[] arrays with 64-bit indices...
Upvotes: 1
Reputation: 1906
Java, as a general-purpose language, has neither specific instruments for handling large in-memory data out of the box nor any special official recommendations for it so far.
You have the following options while using Java to work with as much memory as possible under a single JVM:
Each approach has its own drawbacks and advantages in terms of read/write speed, footprint, durability, maintainability, etc. At the same time, the right fit depends on the nature of the objects being stored in memory, their lifecycle, access scheme, etc.
So the desired choice must be worked out by strictly matching it against the particular requirements/use cases.
Upvotes: 2
Reputation: 34900
There is no such thing as an "official" way. I have never come across anything about this problem in the official Java Language Specification.
But generally speaking, you can always represent such a big array as an array of arrays, i.e. byte[][]. In this case each element of the top-level array will describe a "page" of your storage. This will allow you to store theoretically 2^31 × 2^31 = 2^62 bytes.
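A minimal sketch of this paging idea could look like the following. The class and constant names are made up for illustration, the page size is fixed, and bounds checking is omitted:

```java
public class PagedByteStore {
    private static final int PAGE_SIZE = 1 << 30; // 1 GiB per page, safely below the array size limit

    private final byte[][] pages;
    private final long length;

    public PagedByteStore(long length) {
        this.length = length;
        int pageCount = (int) ((length + PAGE_SIZE - 1) / PAGE_SIZE);
        pages = new byte[pageCount][];
        for (int i = 0; i < pageCount; i++) {
            long remaining = length - (long) i * PAGE_SIZE;
            pages[i] = new byte[(int) Math.min(PAGE_SIZE, remaining)];
        }
    }

    public byte get(long index) {
        return pages[(int) (index / PAGE_SIZE)][(int) (index % PAGE_SIZE)];
    }

    public void set(long index, byte value) {
        pages[(int) (index / PAGE_SIZE)][(int) (index % PAGE_SIZE)] = value;
    }

    public long length() {
        return length;
    }
}
```

Allocating a 4 GB store then becomes new PagedByteStore(4L * 1024 * 1024 * 1024), at the cost of one extra array dereference per access compared to a flat byte[].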
Upvotes: 3