user3639557
user3639557

Reputation: 5281

Efficient way to store an array of only 0s and 1s in Java

What is the most space efficient way of storing an array of 1s and 0s in Java?

Upvotes: 0

Views: 2437

Answers (3)

Daniel Widdis
Daniel Widdis

Reputation: 9091

The answer is going to be different depending on whether you know the size of the array in advance, and how sparse or random the data is.

For starters, if you're looking for the most efficient storage, you're going to want to compress the data instead of storing the raw 0s and 1s. One typically good compression algorithm is Huffman Coding, although it is not always the "best" particularly if the data is random. You can find an implementation here.

Going back to the original question and assuming you want to keep the raw values; the most efficient storage will depend on whether you know the size of the array in advance. If it is a fixed size, you can create a number of byte primitives. Each of these would take exactly 1 byte, plus the overhead for the object in which they are stored. You could reduce the number of variables by using short, int, or long as needed to group 2, 4, or 8 bytes together. If you're including these variables as a member of a class with other variables, it may make a difference which type you use, as the object itself takes 8 bytes for overhead, and the size will always be a multiple of 8 bytes; so any variables short of that will be padded to a multiple 8-bytes.

If you need an arbitrary size array (which incurs its own 12-byte overhead, 8 for the object and 4 for the array length) the answer would continue to be an array of byte[] with each of the 8 bits mapped to your 1's and 0's. However, the JVM allocates memory in 8-byte chunks so 1 to 4 bytes will take the the memory footprint of an int so a byte[] array will ultimately match the memory footprint of a short[], or int[] and there's no real need to allocate an array in smaller sizes than 32 bits (as long as you ensure you use all the bits efficiently. A long[] would always end up taking an extra 8 bytes over the other integer array types, due to the 12-byte object overhead and 16-byte allocation roundoff.

At the end of the day, however, readability/usability probably trumps memory usage. A BitSet stores values as long[] under the hood and has friendlier access methods and is probably the best choice here to minimize (not exactly, but good enough for practical purposes) memory footprint.

A boolean[] would likely be the fastest to deal with CPU wise but would take 8x the memory as a primitive integer type.

Upvotes: 3

pedrohreis
pedrohreis

Reputation: 1090

I think the most most space efficient way to store a binary matrix, if you have most '0's then '1's, is using a sparse matrix. In a sparse matrix you need represent only the '1' values using a crossed-list structure.

You can find some implementations on GitHub.

Upvotes: 2

Raghu K Nair
Raghu K Nair

Reputation: 3932

I hope you have gone through BitSet. I think thats the efficient way of storing Bits. https://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html

but you should treat 0 as false and 1 as true

Upvotes: 4

Related Questions