myborobudur
myborobudur

Reputation: 4435

Improvement of Algorithm: Counting set bits in Byte-Arrays

We store knowledge in byte arrays as bits. Counting the number of set bits is pretty slow. Any suggestion to improve the algorithm is welcome:

public static int countSetBits(byte[] array) {
    int setBits = 0;

    if (array != null) {
        for (int byteIndex = 0; byteIndex < array.length; byteIndex++) {
            for (int bitIndex = 0; bitIndex < 7; bitIndex++) {
                if (getBit(bitIndex, array[byteIndex])) {
                    setBits++;
                }
            }
        }
    }
    return setBits;
}
public static boolean getBit(int index, final byte b) {
    byte t = setBit(index, (byte) 0);
    return (b & t) > 0;
}
public static byte setBit(int index, final byte b) {
    return (byte) ((1 << index) | b);
}

To count the bits of a byte array of length of 156'564 takes 300 ms, that's too much!

Upvotes: 0

Views: 4840

Answers (5)

Jinxmcg
Jinxmcg

Reputation: 1970

By far the fastest way is counting bits set, in "parallel", method is called Hamming weight and is implemented in Integer.bitCount(int i) as far as I know.

Upvotes: 0

Vinod Jayachandran
Vinod Jayachandran

Reputation: 3898

As per my understaning,

1 Byte = 8 Bits

So if Byte Array size = n , then isn't total number of bits = n*8 ?

Please correct me if my understanding is wrong

Thanks Vinod

Upvotes: -1

jlordo
jlordo

Reputation: 37813

I would use:

    byte[] yourByteArray = ...
    BitSet bitset = BitSet.valueOf(yourByteArray);  // java.util.BitSet
    int setBits = bitset.cardinality();

I don't know if it's faster, but I think it will be faster than what you have. Let me know?

Your method would look like

 public static int countSetBits(byte[] array) {
     return BitSet.valueOf(array).cardinality();
 }

You say:

We store knowledge in byte arrays as bits.

I would recommend to use a BitSet for that. It gives you convenient methods, and you seem to be interested in bits, not bytes, so it is a much more appropriate data type compared to a byte[]. (Internally it uses a long[]).

Upvotes: 1

Denys S&#233;guret
Denys S&#233;guret

Reputation: 382170

I would use the same global loop but instead of looping inside each byte I would simply use a (precomputed) array of size 256 mapping bytes to their bit count. That would probably be very efficient.

If you need even more speed, then you should separately maintain the count and increment it and decrement it when setting bits (but that would mean a big additional burden on those operations so I'm not sure it's applicable for you).

Another solution would be based on BitSet implementation : it uses an array of long (and not bytes) and here's how it counts :

658        int sum = 0;
659        for (int i = 0; i < wordsInUse; i++)
660            sum += Long.bitCount(words[i]);
661        return sum;

Upvotes: 2

Arkku
Arkku

Reputation: 42139

Try Integer.bitcount to obtain the number of bits set in each byte. It will be more efficient if you can switch from a byte array to an int array. If this is not possible, you could also construct a look-up table for all 256 bytes to quickly look up the count rather than iterating over individual bits.

And if it's always the whole array's count you're interested in, you could wrap the array in a class that stores the count in a separate integer whenever the array changes. (edit: Or, indeed, as noted in comments, use java.util.BitSet.)

Upvotes: 5

Related Questions