Reputation: 165242
Consider a file that contains binary data represented as bytes:
with open('foo', 'rb') as f:
    bs = f.read()
print(bs)
# b'\x00\x01\x00\x01\x00\x01'...
Each byte can only have the value 0 or 1.
What is the most performant way to take a group of 32 such bytes, each standing for one bit, and parse them into a 32-bit integer? The struct module is probably what I need, but I couldn't find an obvious way to do this with it.
Alternative methods that convert the bytes to characters and then parse the integer from a bit string, e.g. int('01010101...', 2), aren't as fast as I need for my use case.
Upvotes: 0
Views: 281
Reputation: 165242
Consider the 32-bit test number 101010...:
b = b'\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00'
print(0b10101010101010101010101010101010)
# 2863311530
Map bytes to string, then parse the int:
s = ''.join(map(lambda x: chr(x+48), b))
i = int(s, 2)
print(i)
# 2863311530
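A possible variant of the string approach, offered as a sketch rather than a measured improvement: since int() also accepts a bytes object when a base is given, the per-byte lambda can be replaced by a single bytes.translate call. The names _TO_ASCII and bits_to_ints are hypothetical, and the code assumes every byte is 0 or 1, the most significant bit comes first, and any trailing partial group is ignored.

```python
# Translation table: byte 0x00 -> ASCII '0', byte 0x01 -> ASCII '1'.
# (_TO_ASCII and bits_to_ints are illustrative names, not from the question.)
_TO_ASCII = bytes.maketrans(b'\x00\x01', b'01')


def bits_to_ints(data, width=32):
    """Yield one integer per `width` bytes of 0/1 values, MSB first.

    A trailing group shorter than `width` is skipped.
    """
    for start in range(0, len(data) - width + 1, width):
        chunk = data[start:start + width]
        # int() accepts bytes containing ASCII digits when a base is given.
        yield int(chunk.translate(_TO_ASCII), 2)


b = b'\x01\x00' * 16
print(list(bits_to_ints(b)))
# [2863311530]
```

Whether this is actually faster than the lambda version depends on the interpreter and the input size, so it is worth timing on real data.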
Iterate over the bytes and build the integer using bitshifts:
idx = 0
tmp = 0
for bit in b:
    tmp <<= 1
    tmp |= bit
    idx += 1
    if idx == 32:
        print(tmp)
        idx = 0
        tmp = 0
# 2863311530
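Since the question is about speed, the two approaches above can be compared with timeit. This is only a rough harness: the function names via_string and via_shifts and the call count are my own choices, and no timings are shown because they vary by machine and Python version.

```python
import timeit

b = b'\x01\x00' * 16  # the same 32-byte test input as above


def via_string(data):
    # Map each 0/1 byte to the character '0'/'1', then parse as base 2.
    return int(''.join(map(lambda x: chr(x + 48), data)), 2)


def via_shifts(data):
    # Shift the accumulator left and OR in each bit, MSB first.
    tmp = 0
    for bit in data:
        tmp = (tmp << 1) | bit
    return tmp


# Both approaches should agree before being timed.
assert via_string(b) == via_shifts(b) == 2863311530

for fn in (via_string, via_shifts):
    elapsed = timeit.timeit(lambda: fn(b), number=10_000)
    print(f'{fn.__name__}: {elapsed:.4f} s for 10,000 calls')
```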
Upvotes: 2