Igor Tupitsyn
Igor Tupitsyn

Reputation: 1193

Java. Extracting integers from bits in a byte array not fitting the byte boundary

I have the following array of bytes: 01010110 01110100 00100101 01001011

These bytes are broken into two groups to encode seven integers. I know that the first group consists of 3 values 4 bits each (0101 0110 0111) that represent numbers 5,6,7. The second group consists of 4 values 5 bits each (01000 01001 01010 01011), which represent integers 8,9,10, and 11.

To extract the integers, I am currently using the following approach. Convert the array into a binary string:

public static String byteArrayToBinaryString(byte[] byteArray)
    {
    String[] arrayOfStrings = new String[byteArray.length];

    for(int i=0; i<byteArray.length; i++)
    {
        arrayOfStrings[i] = byteToBinaryString(byteArray[i]);
    }

    String bitsetString = "";
    for(String testArrayStringElement : arrayOfStrings)
    {
        bitsetString += testArrayStringElement;
    }

    return bitsetString;
}

// Taken from here: http://helpdesk.objects.com.au/java/converting-large-byte-array-to-binary-string
public static String byteToBinaryString(byte byteIn) 
{
    StringBuilder sb = new StringBuilder("00000000");

    for (int bit = 0; bit < 8; bit++) 
    {
        if (((byteIn >> bit) & 1) > 0) 
        {
            sb.setCharAt(7 - bit, '1');
        }
    }

    return sb.toString();
}

Then, I split the binary string into 2 substrings: 12 characters and 20 characters. Then I split each substring into new substrings, each of which has length that equals the number of bits. Then I convert each sub-substring into an integer.

It works but a byte array representing thousands of integers takes 30 seconds to a minute to extract.

I am a bit at a loss here. How do I do this using bitwise operators?

Thanks a lot!

Upvotes: 3

Views: 2126

Answers (2)

Margaret Bloom
Margaret Bloom

Reputation: 44066

I assume you have an understanding of the basic bit operations and how to express them in Java.


Use a pencil to draw a synthetic picture of the problem

 byte 0     byte 1     byte 2     byte 3
01010110   01110100   00100101   01001011
\__/\__/   \__/\______/\___/\______/\___/
 a   b      c     d      e     f      g

To extract a, b and c we need to do the following

   a          b          c

 byte 0     byte 0     byte 1
01010110   01010110   01110100
\.  \.     ||||||||   \.  \.  
  '\  '\   XXXX||||     '\  '\
0.. 0101   0.. 0110   0.. 0111

  Shift       And       Shift

In Java

int a = byteArray[0] >>> 4, b = byteArray[0] & 0xf, c = byteArray[1] >>> 4;

The other values d, e, f and g are computed similarly but some of them require to read two bytes from the array (d and f actually).

          d                      e

   byte 1    byte 2            byte 2
  01110100  00100101          00100101
  ||||\\\\  |                 |\\\\\
  XXXX \\\\ |                 X \\\\\
        \\\\|                    \\\\\
  0..   01000                    01001

To compute d we need to isolate the least four bits of byte 1 with byteArray[1] & 0xf then make space for the bit from byte 2 with (byteArray[1] & 0xf) << 1, extract that bit with byteArray[1] >>> 7 and finally merge together the result.

int d = (byteArray[1] & 0xf) << 1 | byteArray[2] >>> 7;
int e = (byteArray[2] & 0x7c) >>> 2;
int f = (byteArray[2] & 0x3) << 3 | byteArray[3] >>> 5;
int g = byteArray[3] & 0x1f;

When you are comfortable with handling bits operations you may consider generalizing the function that extract the integers.

I made function int extract(byte[] bits, int[] sizes, int[] res), that given an array of bytes bits, an array of sizes sizes, where the even indices hold the size of the integers to extract in bits and the odd indices the number of integers to extract, and an output array res large enough to hold all the integers in output, extracts from bits all the integers expressed by sizes.
It returns the number of integers extracted.

For example the original problem can be solved as

int res[] = new int[8];
byte bits[] = new byte[]{0x56, 0x74, 0x25, 0x4b};

//Extract 3 integers of 4 bits and 4 integers of 5 bits
int ints = BitsExtractor.extract(bits, new int[]{4, 3,  5, 4}, res);

public class BitsExtractor
{
    public static int extract(byte[] bits, int[] sizes, int[] res)
    {

        int currentByte = 0;            //Index into the bits array
        int intProduced = 0;            //Number of ints produced so far
        int bitsLeftInByte = 8;         //How many bits left in the current byte
        int howManyInts = 0;            //Number of integers to extract 

        //Scan the sizes array two items at a time
        for (int currentSize = 0; currentSize < sizes.length - 1; currentSize += 2)
        {
            //Size, in bits, of the integers to extract
            int intSize = sizes[currentSize];

            howManyInts += sizes[currentSize+1];

            int temp = 0;                   //Temporary value of an integer
            int sizeLeft = intSize;         //How many bits left to extract 


            //Do until we have enough integer or we exhaust the bits array
            while (intProduced < howManyInts && currentByte <= bits.length)
            {
                //How many bit we can extract from the current byte
                int bitSize = Math.min(sizeLeft, bitsLeftInByte);               //sizeLeft <= bitsLeftInByte ? sizeLeft : bitsLeftInByte;
                //The value to mask out the number of bit extracted from
                //The current byte (e.g. for 3 it is 7)
                int byteMask = (1 << bitSize) - 1;
                //Extract the new bits (Note that we extract starting from the
                //RIGHT so we need to consider the bits left in the byte)
                int newBits = (bits[currentByte] >>> (bitsLeftInByte - bitSize)) & byteMask;

                //Create the new temporary value of the current integer by
                //inserting the bits in the lowest positions
                temp = temp << bitSize | newBits;

                //"Remove" the bits processed from the byte
                bitsLeftInByte -= bitSize;

                //Is the byte has been exhausted, move to the next
                if (bitsLeftInByte == 0)
                {
                    bitsLeftInByte = 8;
                    currentByte++;
                }

                //"Remove" the bits processed from the size
                sizeLeft -= bitSize;

                //If we have extracted all the bits, save the integer
                if (sizeLeft == 0)
                {
                    res[intProduced++] = temp;
                    temp = 0;
                    sizeLeft = intSize;
                }
            }
        }

        return intProduced;

    }
}

Upvotes: 8

vlatkozelka
vlatkozelka

Reputation: 999

Well I did the first group , the second can be done in similar fashion

public static void main(String args[]) {
        //an example 32 bits like your example
        byte[] bytes = new byte[4];
        bytes[0] = 31;//0001 1111
        bytes[1] = 54;//0011 0110
        bytes[2] = 67;
        bytes[3] = 19;
        //System.out.println(bytes[0]);
        int x = 0;
        int j = -1; // the byte number
        int k = 0; // the bit number in that byte
        int n = 0; // the place of the bit in the integer we are trying to read
        for (int i = 0; i < 32; i++) {

            if (i < 12) { //first group 
                if (i % 8 == 0) {
                    j++;
                    k = 0;

                }
                if (i % 4 == 0) {

                    x = 0;
                    n = 0;
                }

                byte bit = (byte) ((bytes[j] & (1 << (7 - k))) >> (7 - k));
                System.out.println("j is :" + j + " k is :" + k + "  " + bit);
                x = x | bit << (3 - n);
                if ((i + 1) % 4 == 0) {
                    System.out.println(x);
                }
                k++;
                n++;

            } else {

            }

        }

    }

It's a bit tricky because you are trying to encode an integer on less than what java allocates (8 bits). So I had to take each bit and "construct" the int from them

To get each bit

byte bit = (byte) ((bytes[j] & (1 << (7 - k))) >> (7 - k));

this takes the byte we are at and does And operation. For example I want the 3rd bit of the 1st byte, I do

bytes[0] &  1 << (7 - 3)

but this gives me an integer encoded over 8 bits, so I still have to shift it to get that single bit with >> (7 - 3)

Then I just Or it with x (the int we are trying to decode). All while putting it at the right position with << (3 - n) . 3 because your integer is encoded over 4 bits

Try running the code and reading the output.

I am honestly not sure if this is the best way, but I believe it's at least faster than dealing with Strings

Upvotes: 1

Related Questions