user966939

Reputation: 719

Fastest way to pack and unpack binary data into list

I am writing a script that reads 32 bytes of data from each of several thousand files. The 32 bytes consist of 8 pairs of 16-bit integers, and I want to unpack them into Python integers to build a list of averages. I would then like to print the list as a hex string (packed the same way it was unpacked), along with the list object itself, to the user running the script.

My current code looks like this, and it's slower than I'd like it to be (even considering the heavy I/O load):

import os
import sys
import struct
import binascii

def list_str(list):
    return str(list)

def list_s16be_hex(list):
    i = 0
    bytes = b""
    while i < len(list):
        bytes += struct.pack(">h", list[i])
        i += 1
    return binascii.hexlify(bytes).decode("ascii")

def main():
    averages = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
    root = os.path.dirname(__file__)
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            with open(os.path.join(dirpath, filename), "rb") as f:
                f.seek(0x10)
                tmp = f.read(32)

            i = 0
            while i < 32:
                averages[i//2] = (averages[i//2] + struct.unpack(">h", tmp[i:i+2])[0]) // 2
                i += 2

    print("Updated averages (hex): " + list_s16be_hex(averages))
    print("Updated averages (list): " + list_str(averages))

    return 0

if __name__=="__main__":
    main()

Is there a more efficient way of doing this?

Upvotes: 1

Views: 1440

Answers (1)

fpbhb

Reputation: 1519

You can unpack all 16 integers at once with struct.unpack(">16h", tmp), which should be significantly faster for the computational part. Beyond that, I'd expect your program's runtime to be dominated by I/O, which you can check by measuring its runtime without the average computation. There is not much you can do about the I/O itself.
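As a rough sketch of the single-call approach, the snippet below unpacks all 16 values in one go and uses the same format to pack them back for the hex output. The names S16BE_X16 and update_averages are made up for illustration, the sample chunk stands in for the 32 bytes read from a file, and the averaging rule is copied from the question as-is. Precompiling the format with struct.Struct is optional and only saves the repeated format-string lookup.

```python
import binascii
import struct

# Precompiled format: 16 big-endian signed 16-bit integers (32 bytes total).
S16BE_X16 = struct.Struct(">16h")

def update_averages(averages, chunk):
    """Fold one 32-byte chunk into the averages, using the question's rule."""
    values = S16BE_X16.unpack(chunk)  # one call instead of 16
    return [(a + v) // 2 for a, v in zip(averages, values)]

def list_s16be_hex(values):
    """Pack 16 integers back to big-endian 16-bit form and hex-encode them."""
    return binascii.hexlify(S16BE_X16.pack(*values)).decode("ascii")

# Hypothetical sample data standing in for f.read(32).
chunk = struct.pack(">16h", *range(0, 32, 2))
averages = update_averages([0] * 16, chunk)
print("Updated averages (hex): " + list_s16be_hex(averages))
print("Updated averages (list): " + str(averages))
```

In the original loop, the body after f.read(32) would then reduce to a single call such as averages = update_averages(averages, tmp), assuming every file really has at least 0x10 + 32 bytes; a shorter read makes unpack raise struct.error.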

Upvotes: 2
