Horse SMith
Horse SMith

Reputation: 1035

efficiently iterate over the not set values in a massive bit array in Java

I got a massive bit array saved as a byte array, which represent all signed int values (4,294,967,295).

byte[] bitArray = byte[536870912];

Each byte in the array represents 8 numbers, one for each bit. That means byte[0] stores 1, 2, 3, 4, 5, 6, 7, 8, and byte[1] stores 9, 10, 11, 12, 13, 14, 15, 16 etc.

I use this to store a huge table, where I can either set numbers to either true or false (0 or 1). I got some rather efficient methods to check if a bit is set and to set a bit (only using bitwise operators).

Now I need to iterate this table over and over to find the bits that are set to 0. Of course it would be rather efficient to only store the numbers I want to iterate over, so I don't need to check them all each time, but there are so many numbers that storing them in an ArrayList uses a lot of memory.

How can I efficiently iterate multiple times over the not set values in the bit array?

Upvotes: 1

Views: 1054

Answers (1)

Peter Lawrey
Peter Lawrey

Reputation: 533750

How can I efficiently iterate over this bit array?

One way to do this is to use a BitSet. This will both scan a long[] examining 64-bits at a time but it's underlying methods are turned into intrinsics. i.e. single machine code instructions which are likely to be faster than anything you can write in Java.

If you really want to write this yourself, I suggest you look at how BitSet works and copy it's code. (Or use BitSet)

I suggest you have a look at the methods Long's numberOfLeadingZeros(long) numberOfTrailingZeros(long) bitCount(long)

An intrinsic is a method the JVM "recognizes" and replaces with specialised machine code instruction(s) This can make it much faster than copying the code and running the same code in Java.

How can I efficiently iterate multiple times over the not set values in the bit array?

In BitSet it uses the following loop

public int nextSetBit(int fromIndex) {
    if (fromIndex < 0)
        throw new IndexOutOfBoundsException("fromIndex < 0: " + fromIndex);

    checkInvariants();

    int u = wordIndex(fromIndex);
    if (u >= wordsInUse)
        return -1;

    long word = words[u] & (WORD_MASK << fromIndex);

    while (true) {
        if (word != 0)
            return (u * BITS_PER_WORD) + Long.numberOfTrailingZeros(word);
        if (++u == wordsInUse)
            return -1;
        word = words[u];
    }
}

Note: this is examining 64-bits in each iteration.

Upvotes: 5

Related Questions