Can someone explain Python struct unpacking?

Question

I have a binary file made from C structs that I want to parse in Python. I know the exact format and layout of the binary but I am confused on how to use Python Struct unpacking to read this data.

Would I have to traverse the whole binary unpacking a certain number of bytes at a time based on what the members of the struct are?

C File Format:

typedef struct {
  int data1;
  int data2;
  int data4;
} datanums;

typedef struct {
  datanums numbers;
  char *name;
 } personal_data;

Lets say the binary file had personal_data structs repeatedly after another.

abarnert · Accepted Answer

Assuming the layout is a static binary structure that can be described by a simple struct pattern, and the file is just that structure repeated over and over again, then yes, "traverse the whole binary unpacking a certain number of bytes at a time" is exactly what you'd do.

For example:

record = struct.Struct('>HB10cL')

with open('myfile.bin', 'rb') as f:
    while True:
        buf = f.read(record.size)
        if not buf:
            break
        yield record.unpack(buf)

If you're worried about the efficiency of only reading 17 bytes at a time and you want to wrap that up by buffering 8K at a time or something… well, first make sure it's an actual problem worth optimizing; then, if it is, loop over unpack_from instead of unpack. Something like this (untested, top-of-my-head code):

buf, offset = b'', 0
with open('myfile.bin', 'rb') as f:
    if len(buf) < record.size:
        buf, offset = buf[offset:] + f.read(8192), 0
        if not buf:
            break
    yield record.unpack_from(buf, offset)
    offset += record.size

Or, even simpler, as long as the file isn't too big for your vmsize, just mmap the whole thing and unpack_from on the mmap itself:

with open('myfile.bin', 'rb') as f:
    with mmap.mmap(f, 0, access=mmap.ACCESS_READ) as m:
        for offset in range(0, m.size(), record.size):
            yield record.unpack_from(m, offset)

Can someone explain Python struct unpacking?

Answers (2)

Related Questions