Yuval Adam

Reputation: 165242

Reading 'binary' bytes from a file in Python

Consider a file that contains binary data represented as bytes:

with open('foo', 'rb') as f:
    bs = f.read()
    print(bs)
    # b'\x00\x01\x00\x01\x00\x01'...

Each byte holds either 0 or 1, so every byte effectively represents a single bit.

What is the most performant way to take a group of 32 such bytes and parse them into a single 32-bit integer? The struct module is probably what I need, but I couldn't find an obvious way to do this with it.

Alternatives that convert the bytes to characters and then parse the resulting bit string, e.g. int('01010101...', 2), are not fast enough for my use case.

Upvotes: 0

Views: 281

Answers (1)

Yuval Adam

Reputation: 165242

Workaround Solutions

Consider the 32-bit test value 1010...10, stored as one byte per bit:

b = b'\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00\x01\x00'
print(0b10101010101010101010101010101010)
# 2863311530

Map bytes to string, then parse the int:

s = ''.join(map(lambda x: chr(x+48), b))
i = int(s, 2)
print(i)
# 2863311530
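A possible variant of the string approach, sketched here as an assumption rather than a measured improvement: bytes.translate does the 0/1 to '0'/'1' mapping in a single C-level call, so it avoids calling a Python lambda per byte. The names TABLE and bits_to_int are made up for this illustration, and it is worth timing against the real data before relying on it.

```python
# Translation table: byte value 0 -> b'0', byte value 1 -> b'1'.
# TABLE and bits_to_int are illustrative names, not part of any library.
TABLE = bytes.maketrans(b'\x00\x01', b'01')

def bits_to_int(chunk: bytes) -> int:
    """Parse a run of 0/1 bytes (most significant bit first) as an integer."""
    return int(chunk.translate(TABLE).decode('ascii'), 2)

b = b'\x01\x00' * 16   # same 32-byte test value as above
print(bits_to_int(b))  # 2863311530
```

Any byte other than 0 or 1 is left untranslated, so int() will most likely raise ValueError on malformed input instead of silently producing a wrong number.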

Iterate over the bytes and build the integer using bitshifts:

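idx = 0  # number of bits consumed in the current group
tmp = 0  # integer being assembled
for bit in b:
    tmp = (tmp << 1) | bit  # shift in the next bit, most significant first
    idx += 1
    if idx == 32:  # a full 32-bit group is complete
        print(tmp)
        idx = 0
        tmp = 0
# 2863311530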
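The same bit-shifting idea can be wrapped in a generator so that a whole file's worth of 0/1 bytes is consumed group by group. This is only a minimal sketch: iter_words and its width parameter are hypothetical names, and it assumes that a trailing incomplete group should simply be dropped.

```python
def iter_words(data: bytes, width: int = 32):
    """Yield one integer per complete group of `width` 0/1 bytes."""
    view = memoryview(data)  # slicing a memoryview avoids copying each group
    for start in range(0, len(view) - width + 1, width):
        value = 0
        for bit in view[start:start + width]:
            value = (value << 1) | bit  # most significant bit first
        yield value

# Two groups: the test value from above, followed by the value 1.
data = b'\x01\x00' * 16 + b'\x00' * 31 + b'\x01'
print(list(iter_words(data)))  # [2863311530, 1]
```

The inner loop is still pure Python, so this mainly tidies the bookkeeping and is not expected to be faster per group.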

Upvotes: 2
