Simd

Reputation: 21343

How to read a simple binary file

I have a binary file that consists of consecutive binary subsequences of fixed and equal length. Each subsequence can be unpacked into the same number of values. I know the length of each subsequence and the binary format of the values.

How can I work through the binary file, chopping out the subsequences, unpacking their content and writing them out as CSV as I go?

I know how to write CSV output. My problem is reading from the file and unpacking the records. This is my non-working code:

import csv
import sys
import struct
writer = csv.writer(sys.stdout, delimiter=',', quoting=csv.QUOTE_NONE,escapechar='\\')  
?  rows = sys.stdin. ?
?  header = id, time ....
? write the header with csv
i = 0
for row in rows:
    unpacked_row = unpack('QqqqddiBIBcsbshlshhlQB',row)
    writer.writerow(unpacked_row)
    i += 1

Possible solution, based on the question "Reading binary file in Python and looping over each byte" and Ignacio's answer:

First calculate the record size: chunksize = struct.calcsize(fmt), where fmt is the format string passed to unpack.

def bytes_from_file(filename, chunksize=8192):
    with open(filename, "rb") as f:
        while True:
            chunk = f.read(chunksize)
            if chunk:
                yield chunk
            else:
                break

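# example: pass the record size so that each chunk holds exactly one record
fmt = 'QqqqddiBIBcsbshlshhlQB'
for chunk in bytes_from_file('filename', struct.calcsize(fmt)):
    if len(chunk) == struct.calcsize(fmt):  # skip a truncated final record
        row = struct.unpack(fmt, chunk)
        writer.writerow(row)  # write out row as csv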

Upvotes: 1

Views: 169

Answers (2)

jfs

Reputation: 414795

You could use struct.Struct to unpack values from a file:

#!/usr/bin/env python
import csv
import sys
from struct import Struct

record = Struct('QqqqddiBIBcsbshlshhlQB')
with open('input_filename', 'rb') as file:
    writer = csv.writer(sys.stdout, quoting=csv.QUOTE_NONE, escapechar='\\')
    while True:
        buf = file.read(record.size)
        if len(buf) != record.size: 
            break
        writer.writerow(record.unpack_from(buf))
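
One thing worth checking before running the loop above: without a prefix character, struct uses native byte order, sizes and alignment, so record.size may include padding bytes and can differ between platforms. The sketch below compares the native size with the packed little-endian size for the format string from the question. The '<' prefix is only an assumption about how the file was written; compare both numbers against the record length you already know.

```python
import struct

FMT = 'QqqqddiBIBcsbshlshhlQB'  # format string from the question

# No prefix: native sizes plus alignment padding (platform dependent).
native_size = struct.calcsize(FMT)

# '<' prefix: standard sizes, little-endian, no padding (assumed layout).
packed_size = struct.calcsize('<' + FMT)

print('native:', native_size)
print('packed little-endian:', packed_size)
```

If neither number matches the known record length, the format string probably needs a different prefix or explicit pad bytes ('x').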

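You could also write the while-loop as follows (note that, unlike the loop above, this raises struct.error if the file ends with a partial record):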

from functools import partial

for buf in iter(partial(file.read, record.size), b''):
    writer.writerow(record.unpack_from(buf))
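
To try the approach without a real data file, here is a minimal self-contained sketch. The three-field format '<Qdb', the sample rows and the header names are made up for illustration, and in-memory buffers stand in for the input file and stdout.

```python
import csv
import io
from functools import partial
from struct import Struct

# Hypothetical record: unsigned 64-bit id, double, signed byte.
record = Struct('<Qdb')

# Build a small in-memory "file" holding three packed records.
rows_in = [(1, 0.5, -1), (2, 1.5, 0), (3, 2.5, 7)]
binary_file = io.BytesIO(b''.join(record.pack(*row) for row in rows_in))

out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
writer.writerow(['id', 'time', 'flag'])  # illustrative header names

# Read exactly one record's worth of bytes per iteration until EOF.
for buf in iter(partial(binary_file.read, record.size), b''):
    writer.writerow(record.unpack(buf))

print(out.getvalue())
```

Swapping io.BytesIO for open(..., 'rb') and io.StringIO for sys.stdout gives the same loop as in the answer.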

Upvotes: 1

Ignacio Vazquez-Abrams

Reputation: 799280

You need to calculate the size of the structure (Hint: struct.calcsize()) and read some multiple of that from the file at a time. You cannot directly iterate over the input as you can with a text file, since there is no delimiter as such.
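
A rough sketch of that idea, reading several records per call and splitting them with struct.iter_unpack (available from Python 3.4). The helper name iter_records, its records_per_read parameter and the two-field format are hypothetical, and the code assumes a buffered file object where a short read means end of file.

```python
import io
import struct


def iter_records(fileobj, fmt, records_per_read=1024):
    """Yield unpacked tuples, reading a whole number of records per call."""
    size = struct.calcsize(fmt)
    while True:
        block = fileobj.read(size * records_per_read)
        # Drop any trailing partial record so the length is a multiple of size.
        usable = len(block) - len(block) % size
        if usable == 0:
            break
        yield from struct.iter_unpack(fmt, block[:usable])
        if len(block) < size * records_per_read:
            break  # short read: end of file reached


# Usage with a made-up two-field format and an in-memory file.
fmt = '<Ih'
data = b''.join(struct.pack(fmt, i, -i) for i in range(5)) + b'\x00'  # one stray byte
records = list(iter_records(io.BytesIO(data), fmt, records_per_read=2))
print(records)
```

Reading in larger blocks mainly reduces the number of read calls; the stray trailing byte in the example is silently ignored, which may or may not be what you want for a real file.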

Upvotes: 1
