Reputation: 5281
What is the most space efficient way of storing an array of 1s and 0s in Java?
Upvotes: 0
Views: 2437
Reputation: 9091
The answer is going to be different depending on whether you know the size of the array in advance, and how sparse or random the data is.
For starters, if you're looking for the most efficient storage, you're going to want to compress the data instead of storing the raw 0s and 1s. One typically good compression algorithm is Huffman Coding, although it is not always the "best" particularly if the data is random. You can find an implementation here.
Going back to the original question and assuming you want to keep the raw values; the most efficient storage will depend on whether you know the size of the array in advance. If it is a fixed size, you can create a number of byte
primitives. Each of these would take exactly 1 byte, plus the overhead for the object in which they are stored. You could reduce the number of variables by using short
, int
, or long
as needed to group 2, 4, or 8 bytes together. If you're including these variables as a member of a class with other variables, it may make a difference which type you use, as the object itself takes 8 bytes for overhead, and the size will always be a multiple of 8 bytes; so any variables short of that will be padded to a multiple 8-bytes.
If you need an arbitrary size array (which incurs its own 12-byte overhead, 8 for the object and 4 for the array length) the answer would continue to be an array of byte[]
with each of the 8 bits mapped to your 1's and 0's. However, the JVM allocates memory in 8-byte chunks so 1 to 4 bytes will take the the memory footprint of an int
so a byte[]
array will ultimately match the memory footprint of a short[]
, or int[]
and there's no real need to allocate an array in smaller sizes than 32 bits (as long as you ensure you use all the bits efficiently. A long[]
would always end up taking an extra 8 bytes over the other integer array types, due to the 12-byte object overhead and 16-byte allocation roundoff.
At the end of the day, however, readability/usability probably trumps memory usage. A BitSet
stores values as long[]
under the hood and has friendlier access methods and is probably the best choice here to minimize (not exactly, but good enough for practical purposes) memory footprint.
A boolean[]
would likely be the fastest to deal with CPU wise but would take 8x the memory as a primitive integer type.
Upvotes: 3
Reputation: 1090
I think the most most space efficient way to store a binary matrix, if you have most '0's then '1's, is using a sparse matrix. In a sparse matrix you need represent only the '1' values using a crossed-list structure.
You can find some implementations on GitHub.
Upvotes: 2
Reputation: 3932
I hope you have gone through BitSet
. I think thats the efficient way of storing Bits.
https://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html
but you should treat 0 as false and 1 as true
Upvotes: 4