gmnnn
gmnnn

Reputation: 339

Java BitSet and byte[] usage

I have this application where I should use BitSet class heavily and write to a file bit by bit. I know I can't write bits to a file, so first I convert the BitSet object to byte array and write as byte array. But the problem is since BitSet class indexed from right to left, when I convert the BitSet object to byte array and write to a file, it writes backwards.

For example this is my BitSet object:

10100100

and BitSet.get(0) gives false, and BitSet.get(7) gives true. I want to write this to file like:

00100101

so first bit will be 0 and last bit will be 1.

My convert method:

public static byte[] toByteArray(BitSet bits) 
{
    byte[] bytes = new byte[(bits.length() + 7) / 8];       
    for (int i = 0; i < bits.length(); i++) {
        if (bits.get(i)) {
            bytes[bytes.length - i / 8 - 1] |= 1 << (i % 8);
        }
    }
    return bytes;
}

My write method:

    FileOutputStream fos = new FileOutputStream(filePath);
    fos.write(BitOperations.toByteArray(cBitSet));
    fos.close();

Is this intended to be like this or am I doing something wrong? Thank you.

Upvotes: 12

Views: 6997

Answers (3)

fge
fge

Reputation: 121712

BitSet has several problems:

  • the length of the byte array it provides on output, using .toByteArray(), depends on the uppermost bit set to 1 (0 if no bit set, 1 if the last bit set is < 8, 2 if < 16 etc -- in essence, indexOf(highestBitSet) + 7) / 8);
  • as such, you cannot rely on it for computing a bit mask of fixed length.

Consider using a wrapper over ByteBuffer instead. Sample code below.

Note: this uses "static factory methods" for construction, so you will need to use either of BitFlags.withByteLength() or BitFlags.withBitLength() to create a new instance. You can, of course, devise your own methods for that or just make the constructor public. To get the underlying array, call .toByteArray().

public final class BitFlags
{
    private final int nrBytes;
    private final ByteBuffer buf;

    private BitFlags(final int nrBytes)
    {
        if (nrBytes < 1)
            throw new IllegalArgumentException("need at least one byte");
        this.nrBytes = nrBytes;
        buf = ByteBuffer.allocate(nrBytes);
    }

    public static BitFlags withByteLength(final int nrBytes)
    {
        return new BitFlags(nrBytes);
    }

    public static BitFlags withBitLength(final int nrBits)
    {
        return new BitFlags((nrBits - 1) / 8 + 1);
    }

    public void setBit(final int bitOffset)
    {
        if (bitOffset < 0)
            throw new IllegalArgumentException();

        final int byteToSet = bitOffset / 8;
        if (byteToSet > nrBytes)
            throw new IllegalArgumentException();

        final int offset = bitOffset % 8;
        byte b = buf.get(byteToSet);
        b |= 1 << offset;
        buf.put(byteToSet, b);
    }

    public void unsetBit(final int bitOffset)
    {
        if (bitOffset < 0)
            throw new IllegalArgumentException();

        final int byteToSet = bitOffset / 8;
        if (byteToSet > nrBytes)
            throw new IllegalArgumentException();

        final int offset = bitOffset % 8;
        byte b = buf.get(byteToSet);
        b &= ~(1 << offset);
        buf.put(byteToSet, b);
    }

    public byte[] toByteArray()
    {
        return buf.array();
    }
}

Upvotes: 7

Patricia Shanahan
Patricia Shanahan

Reputation: 26185

BitSet implements Serializable. If you only need to be able to restore the BitSet in Java, and don't need to otherwise examine its state in the file, you should just tell it to save itself to the file.

If you wish to write it to a file that contains other, non-serialized data, you can write it to a ByteArrayOutputStream and retrieve the byte[] from that. However, you will probably get better performance writing directly to the file.

Upvotes: 6

Ilmari Karonen
Ilmari Karonen

Reputation: 50328

That looks reasonable to me. It won't be very fast, but it should work. If you want it to write out the bits in the opposite order, just reverse the indexing and the shift:

byte[] bytes = new byte[(bits.length() + 7) / 8];       
for (int i = 0; i < bits.length(); i++) {
    if (bits.get(i)) {
        bytes[i / 8] |= 1 << (7 - i % 8);
    }
}

or even:

        bytes[i / 8] |= 128 >> (i % 8);

If your bitset is fairly sparse (or possibly even if it isn't), only iterating over the 1 bits might be more efficient:

byte[] bytes = new byte[(bits.length() + 7) / 8];
for ( int i = bits.nextSetBit(0); i >= 0; i = bits.nextSetBit(i+1) ) {
    bytes[i / 8] |= 128 >> (i % 8);
}

If you need more speed for dense bitsets, you could try using the standard BitSet.toByteArray() method and then use bit-twiddling tricks to reverse the bits in the individual bytes:

byte[] bytes = bits.toByteArray();
for ( int i = 0; i < bytes.length; i++ ) {
    byte b = bytes[i];
    b = ((b & 0x0F) << 4) | ((b & 0xF0) >> 4);
    b = ((b & 0x33) << 2) | ((b & 0xCC) >> 2);
    b = ((b & 0x55) << 1) | ((b & 0xAA) >> 1);
    bytes[i] = b;
}

Upvotes: 2

Related Questions