user3262424
user3262424

Reputation: 7489

Packing 4 Integers as ONE BYTE?

I have four integers {a, b, c, d} that can have the following range of values:

a - {0 or 1} (1 bit)

b - {0 or 1} (1 bit)

c - {0, 1, 2, ..., 7} (3 bits)

d - {0, 1, 2, ..., 7} (3 bits)

at first, I would like to pack them into a one byte that can be then written to a binary file. later, I would like to unpack that one byte and get from it to a tuple of the form (a, b, c, d).

I know how to read/write a byte to a binary file in python. But how do I do the bits packing/unpacking?

Upvotes: 17

Views: 3031

Answers (5)

RARE Kpop Manifesto
RARE Kpop Manifesto

Reputation: 2915

nested arithmetic form (NAF) :

p = 8 * (8 * (2 * a + b) + c) + d                  # "a" at MSB
p =               a + (b + (c + d * 8) * 2) * 2    # "a" at LSB

nested shifting form (NSF) :

p = ( ( ( ( (a << 1) + b) << 3) + c) << 3) + d     # "a" at MSB
p = ( ( ( ( (d << 3) + c) << 1) + b) << 1) + a     # "a" at LSB

Personally I think NAF is much cleaner (even if not the fastest), because it keeps all the digit groups on the same side and all the scaling factors on the other regardless of endian-ness, all while minimizing the layers of nesting required, and total arithmetic ops.

"NAF" and "NSF" aren't formal terms - it's just a colloquial way of describing them


So packing it would be chr(p) or just sprintf("%c", p)

as for decoding, using the "a" at MSB approach, it's basically splitting out a byte's octal codes back out to its components ("a" at LSB is the same decoding process in little-endian) :

     2a+b c  d   a  b
        \  \/     \/
======================
62      \0 76     00
63      \0 77     00
64      \1 00     01
65      \1 01     01

126     \1 76     01
127     \1 77     01
128     \2 00     10
129     \2 01     10

190     \2 76     10
191     \2 77     10
192     \3 00     11
193     \3 01     11

254     \3 76     11
255     \3 77     11
0       \0 00     00
1       \0 01     00

2a+b is really useful when it comes to encoding UTF-8 :

\0 ## - all the ASCII digits and arithmetic operators, most of the ASCII linguistic punctuation symbols, and nearly all the ASCII [[:cntrl:]] bytes (sans \177 DEL)

\1 ## - all the ASCII letters, plus misc [[:punct:]]


\2 ## - trailing/"continuation" bytes for Unicode code points U+0080 - U+10FFFFF

\3 ## - all the leading byte for Unicode code points U+0080 - U+10FFFFF

— sans \300 xC0 - \301 xC1 and \365 xF5 - \377 xFF

Upvotes: 0

Sven Marnach
Sven Marnach

Reputation: 602715

Use shift and bitwise OR, then convert to a character to get a "byte":

x = chr(a | (b << 1) | (c << 2) | (d << 5))

To unpack this byte again, first convert to an integer, then shift and use bitwise AND:

i = ord(x)
a = i & 1
b = (i >> 1) & 1
c = (i >> 2) & 7
d = (i >> 5) & 7

Explanation: Initially, you have

0000000a
0000000b
00000ccc
00000ddd

The left-shifts give you

0000000a
000000b0
000ccc00
ddd00000

The bitwise OR results in

dddcccba

Converting to a character will convert this to a single byte.

Unpacking: The four different right-shifts result in

dddcccba
0dddcccb
00dddccc
00000ddd

Masking (bitwise AND) with 1 (0b00000001) or 7 (0b00000111) results in

0000000a
0000000b
00000ccc
00000ddd

again.

Upvotes: 32

Scott Griffiths
Scott Griffiths

Reputation: 21925

If you need to this kind of thing a lot then bit shifting can become tedious and error prone. There are third-party libraries that can help - I wrote one called bitstring:

To pack and convert to a byte:

x = bitstring.pack('2*uint:1, 2*uint:3', a, b, c, d).bytes

and to unpack:

a, b, c, d = bitstring.BitArray(bytes=x).unpack('2*uint:1, 2*uint:3')

This is probably overkill for your example, but it's helpful when things get more complicated.

Upvotes: 5

sjr
sjr

Reputation: 9895

def encode(a, b, c, d):
  return a | b << 1 | c << 2 | d << 5

def decode(x):
  return x & 1, (x >> 1) & 1, (x >> 2) & 7, (x >> 5) & 7

Upvotes: 10

Stephen Waits
Stephen Waits

Reputation: 501

Pretty simple. Mask (for range), shift them into place, and or them together.

packed = ((a & 1) << 7) | ((b & 1) << 6) | ((c & 7) << 3) | (d & 7)

a = (packed >> 7) & 1
b = (packed >> 6) & 1
c = (packed >> 3) & 7
d = packed & 7

Upvotes: 4

Related Questions