Johan
Johan

Reputation: 76597

How to turn every bit into a byte

I have the following code to turn a bit into a byte.

__device__ UINT64 bitToByte(const UINT8 input) {
    UINT64 b = ((0x8040201008040201ULL * input) >> 7) & 0x0101010101010101ULL; 
    //reverse the byte order <<-- this step is missing
    return b;
}

However the bytes are in the wrong order, the endianness is reversed. On the CPU I can simply to a bswap reg,reg to fix this, but what do I do on the GPU?

Alternatively, what similar trick can I use so that the bytes are put the right way round, i.e. the Most Significant bit goes to the Most Significant Byte, such that I don't need a bswap trick.

Upvotes: 5

Views: 522

Answers (3)

Alain Merigot
Alain Merigot

Reputation: 11547

To reverse byte order, the bit extraction can be done with the same trick, but by swapping the coefficients that perform the shift in the multiplication. However, to avoid clashes in the multiplication, it must be done in two steps, for the even and odd bits. This way, 2 bytes are free to hold the result of the every multiplication which is sufficient to ensure integrity of the result.

__device__ UINT64 bitToByte(const UINT8 input) {
  UINT64 b = ( ((0x0002000800200080ULL * input) >> 7) & 0x0001000100010001ULL) 
          |  ( ((0x0100040010004000ULL * input) >> 7) & 0x0100010001000100ULL);

    return b;
}

As spotted in the comments, to optimize, the shifts can be factorized.

__device__ UINT64 bitToByte(const UINT8 input) {
  UINT64 b =  ( ((0x0002000800200080ULL * input) & 0x0080008000800080ULL) 
              | ((0x0100040010004000ULL * input) & 0x8000800080008000ULL) )
                >> 7 ;
    return b;
}

Upvotes: 2

Johan
Johan

Reputation: 76597

Thanks to @tera, here is the answer:

//Expand every bit into a byte
__device__ static UINT64 Add012(const UINT8 input) {
    const UINT64 b = ((0x8040201008040201ULL * input) >> 7) & 0x0101010101010101ULL; //extract every bit into a byte
    //unfortunatly this returns the wrong byte order
    UINT32* const b2 = (UINT32*)&b;
    UINT64 Result;
    UINT32* const Result2 = (UINT32*)&Result;
    Result2[0] = __byte_perm(b2[0]/*LSB*/, b2[1], 0x4567);  //swap the bytes around, the MSB's go into the LSB in reverse order
    Result2[1] = __byte_perm(b2[0]/*LSB*/, b2[1], 0x0123);  //and the LSB -> MSB reversed.
    return Result;
}

The __byte_perm replaces the bswap instruction.

Alternatively the input can be reversed using the __brev (bit-reverse) intrinsic:

//Expand every bit into a byte
__device__ static UINT64 Add012(const UINT8 input) {
    const UINT32 reversed = (__brev(input) >> 24);
    return ((0x8040201008040201ULL * reversed) >> 7) & 0x0101010101010101ULL; //extract every bit into a byte
}

The second version looks easier.

Upvotes: 3

Giovanni Cerretani
Giovanni Cerretani

Reputation: 1724

Instead of reverse the result, you can reverse input, with any of the tricks explained here. For example, using the approach of this answer:

static UINT8 lookup[16] = {
    0x0, 0x8, 0x4, 0xc, 0x2, 0xa, 0x6, 0xe,
    0x1, 0x9, 0x5, 0xd, 0x3, 0xb, 0x7, 0xf, };

UINT8 reverse(UINT8 n) {
    return (lookup[n & 0xF] << 4) | lookup[n >> 4];
}

__device__ UINT64 bitToByte(const UINT8 input) {
    UINT64 b = ((0x8040201008040201ULL * reverse(input)) >> 7) & 0x0101010101010101ULL; 
    return b;
}

Upvotes: 2

Related Questions