Dave Rove

Reputation: 953

Python bit list to byte list

I have a long one-dimensional list of integer 1s and 0s, representing 8-bit binary bytes. What is a neat way to create a new list from it, containing the integer byte values?

Being familiar with C, but new to Python, I've coded it the way I'd do it in C: an elaborate structure that loops through each bit. However, I'm aware that the whole point of Python over C is that such things can usually be done compactly and elegantly, and that I should learn how to do that. Maybe using a list comprehension?

This works, but suggestions for a more "Pythonic" way would be appreciated:

#!/usr/bin/env python2
bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]
bytes = []
byt = ""
for bit in bits:
  byt += str(bit)
  if len(byt) == 8:
    bytes += [int(byt, 2)]
    byt = ""
print bytes

$ bits-to-bytes.py
[149, 107, 231]

Upvotes: 4

Views: 4989

Answers (3)

mmj

Reputation: 5780

Since you start from a numeric list, you might want to avoid string manipulation. Here are a couple of methods:

  • dividing the original list into 8-bit chunks and computing the decimal value of each byte (assuming the number of bits is a multiple of 8); thanks to Padraic Cunningham for the nice way of dividing a sequence into groups of 8 elements;

    bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]
    [sum(b*2**x for b,x in zip(byte[::-1],range(8))) for byte in zip(*([iter(bits)]*8))]
    
  • using bitwise operators (probably more efficient); if the number of bits is not a multiple of 8, the code works as if the bit sequence were padded with 0s on the left (padding on the left often makes more sense than padding on the right, because it preserves the numerical value of the original sequence of binary digits)

    bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]
    n = sum(b*2**x for b,x in zip(bits[::-1],range(len(bits)))) # value of the binary number represented by 'bits'
    # n = int(''.join(map(str,bits)),2) # another way of finding n by means of string manipulation
    [(n>>(8*p))&255 for p in range(len(bits)//8-(len(bits)%8==0),-1,-1)]
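
As a rough illustration of the bitwise idea applied one byte at a time, here is a minimal sketch that builds each byte with shift and OR instead of powers of two. It assumes Python 3 and a bit count that is a multiple of 8; the function name bits_to_bytes is made up for this example and does not come from the snippets above.

```python
def bits_to_bytes(bits):
    """Pack a flat list of 0/1 integers into byte values, most significant bit first."""
    if len(bits) % 8:
        # Assumption for this sketch: refuse partial bytes instead of padding them.
        raise ValueError("number of bits must be a multiple of 8")
    result = []
    for start in range(0, len(bits), 8):
        value = 0
        for bit in bits[start:start + 8]:
            # Shift the bits collected so far one place left, then append the new bit.
            value = (value << 1) | bit
        result.append(value)
    return result


bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]
print(bits_to_bytes(bits))  # [149, 107, 231]
```

Unlike the second method above, this version rejects a trailing partial byte rather than padding it, which is a design choice for the sketch, not a requirement.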
    

Upvotes: 0

Padraic Cunningham

Reputation: 180401

You can slice the list into chunks of 8 elements and map the elements of each chunk to str:

[int("".join(map(str, bits[i:i+8])), 2) for i in range(0, len(bits), 8)]

You could split it into two steps, mapping and joining only once:

mapped = "".join(map(str, bits))
[int(mapped[i:i+8], 2) for i in range(0, len(mapped), 8)]

Or using iter and borrowing from the grouper recipe in itertools:

it = iter(map(str, bits))
[int("".join(sli), 2) for sli in zip(*iter([it] * 8))]

iter(map(str, bits)) maps the contents of bits to str and creates an iterator; zip(*iter([it] * 8)) collects the elements into groups of 8.
Each tuple that zip produces consumes eight elements from the shared iterator, so we always get sequential groups. It is the same logic as the slicing in the first snippet; we just avoid the need to slice.
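
To make the grouping trick a bit more concrete, here is a small sketch (assuming Python 3; the names refs and groups are only for illustration) showing that the list holds eight references to one iterator, which is why zip hands back consecutive runs of eight:

```python
bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]

it = iter(bits)
refs = [it] * 8  # eight references to the *same* iterator object, not eight copies
print(all(r is it for r in refs))  # True

# zip takes one item from each argument in turn. Because all eight arguments
# are the same iterator, each output tuple holds the next eight consecutive bits.
groups = list(zip(*refs))
print(groups[0])    # (1, 0, 0, 1, 0, 1, 0, 1)
print(len(groups))  # 3
```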

As Sven commented, for lists whose length is not a multiple of 8 you will lose data using zip, just as in your original code. You can adapt the grouper recipe I linked to handle those cases:

from itertools import zip_longest # izip_longest python2

bits = [1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1,1,0]
it = iter(map(str, bits))

print([int("".join(sli), 2) for sli in zip_longest(*iter([it] * 8), fillvalue="")])
# [149, 107, 231, 2]; using just zip would give [149, 107, 231]

The fillvalue="" means the incomplete final group is padded with empty strings, so we can still call int("".join(sli), 2) and get the correct output, as above, where we are left with 1,0 after taking 3 chunks of 8.
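
The choice of fill value decides how the leftover bits are read. As a hedged sketch (Python 3; the helper name to_bytes is invented for this comparison), an empty string leaves the trailing bits as a short binary number, while "0" pads them on the right to a full byte:

```python
from itertools import zip_longest  # izip_longest on Python 2

bits = [1,0,0,1,0,1,0,1, 0,1,1,0,1,0,1,1, 1,1,1,0,0,1,1,1, 1,0]


def to_bytes(bits, fillvalue):
    """Group bits in eights, filling an incomplete last group with fillvalue."""
    it = iter(map(str, bits))
    return [int("".join(group), 2)
            for group in zip_longest(*[it] * 8, fillvalue=fillvalue)]


print(to_bytes(bits, ""))   # [149, 107, 231, 2]    trailing "10" is read as-is
print(to_bytes(bits, "0"))  # [149, 107, 231, 128]  trailing "10" becomes "10000000"
```

Which result is right depends on what the trailing bits are supposed to mean in your data.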

In your own code, bytes += [int(byt, 2)] could simply become bytes.append(int(byt, 2)).
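
For reference, a sketch of the original loop with that change applied, written for Python 3. Renaming the list to byte_values is an extra suggestion added here so that the built-in bytes is not shadowed; it was not part of the question.

```python
bits = [1,0,0,1,0,1,0,1,0,1,1,0,1,0,1,1,1,1,1,0,0,1,1,1]

byte_values = []  # hypothetical name, chosen to avoid shadowing the built-in `bytes`
byt = ""
for bit in bits:
    byt += str(bit)
    if len(byt) == 8:
        byte_values.append(int(byt, 2))  # append instead of += [ ... ]
        byt = ""

print(byte_values)  # [149, 107, 231]
```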

Upvotes: 4

Cyphase

Reputation: 12012

Padraic's solution is good; here's another way to do it:

from itertools import izip_longest


def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # Taken from itertools recipes
    # https://docs.python.org/2/library/itertools.html#recipes
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return izip_longest(fillvalue=fillvalue, *args)

bits = [1, 0, 0, 1, 0, 1, 0, 1,
        0, 1, 1, 0, 1, 0, 1, 1,
        1, 1, 1, 0, 0, 1, 1, 1]

byte_strings = (''.join(bit_group) for bit_group in grouper(map(str, bits), 8))
bytes = [int(byte_string, 2) for byte_string in byte_strings]

print bytes  # [149, 107, 231]

Upvotes: 1
