Reputation: 2518
I'm attempting to read and parse a binary file with Python.
The issue is that the data in the file can be in little-endian or big-endian format, as well as 32- or 64-bit values. In the file header there are a few bytes that specify the data format and size. Let's assume that I've read these in and I know the format and size, and I try to construct a format string as follows:
import struct

if bitOrder == 1:    # little-endian format
    strData = '<'
elif bitOrder == 2:  # big-endian format
    strData = '>'
if dataSize == 1:    # 32-bit data
    strLen = 'L'
elif dataSize == 2:  # 64-bit data
    strLen = 'q'
strFormat = strData + strLen
struct.unpack(strFormat, buf)
When I do this I get the error "struct.error: unpack requires a string argument of length 2", yet if I write struct.unpack('<L', buf) I get the expected result.
In an interactive shell, type(strFormat) returns <type 'str'> and len(strFormat) returns 2.
So, being relatively new to Python, I have the following questions:
Isn't str the same as a string? If not, how do I convert between the two?
How would I correctly construct the format string for use in an unpack function?
------ edit ------ to address comments:
At this time I'm using Python 2.7 due to constraints of other projects.
I'm trying to avoid posting my code (it's several hundred lines long); however, here is an interactive Python session (run from inside Emacs, if that matters) that shows the behaviour I'm experiencing:
Python 2.7.5 (default, Jun 17 2014, 18:11:42)
[GCC 4.8.2 20140120 (Red Hat 4.8.2-16)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> >>> >>> >>>
>>> import array
>>> import struct
>>> header = array.array('B',[0x7f, 0x45, 0x4c, 0x46, 0x02, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00,0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00,0x3e, 0x00, 0x01, 0x00, 0x00, 0x00, 0x40, 0x04, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x70, 0x11, 0x00, 0x00, 0x00,0x00, 0x00, 0x00, 0x00,0x00, 0x00, 0x00, 0x40, 0x00, 0x38, 0x00, 0x09, 0x00, 0x40, 0x00, 0x1e, 0x00, 0x1b, 0x00])
>>> entry = header[24:32]
>>> phoff = header[32:40]
>>> shoff = header[40:48]
>>> strData = '<'
>>> strLen = 'H'
>>> strFormat = strData + strLen
>>> print strFormat
<H
>>> type(strFormat)
<type 'str'>
>>> len(strFormat)
2
>>> temp = struct.unpack(strFormat, entry)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
struct.error: unpack requires a string argument of length 2
>>>
Fixed typos in the original code.
Reputation: 30151
Going by the interactive session, your problem would appear to be this:
temp = struct.unpack(strFormat, entry)
Earlier, you said:
entry = header[24:32]
entry is 8 bytes long, but strFormat says it should be 2 bytes long. That's what struct is complaining about.
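To see the mismatch directly, you can compare struct.calcsize() against the buffer length. The sketch below only uses the 8 bytes that header[24:32] holds in your session; the variable names are mine, and reading the field as an unsigned 64-bit value ('<Q') is an assumption about what that field contains.

```python
import struct

# The 8 bytes found at header[24:32] in the interactive session.
entry = bytes(bytearray([0x40, 0x04, 0x40, 0x00, 0x00, 0x00, 0x00, 0x00]))

fmt_short = '<H'  # little-endian unsigned 16-bit
fmt_quad = '<Q'   # little-endian unsigned 64-bit (assumed field type)

size_short = struct.calcsize(fmt_short)  # bytes '<H' expects
size_quad = struct.calcsize(fmt_quad)    # bytes '<Q' expects

# '<H' wants 2 bytes but the buffer has 8, so unpack raises struct.error.
try:
    struct.unpack(fmt_short, entry)
    mismatch_error = None
except struct.error as exc:
    mismatch_error = exc

# With a format whose size matches the buffer, unpack succeeds.
(entry_point,) = struct.unpack(fmt_quad, entry)
```

The string concatenation in your code was never the problem: strFormat is an ordinary str either way. Only the size implied by the format and the length of the buffer have to agree.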
It should also be a bytes object (str under 2.x), not an array.array.
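Putting both points together, here is a minimal sketch of one way to build the format and slice the right number of bytes. The helpers make_format and unpack_word are hypothetical names, not part of struct; the 1/2 codes follow the convention in your question, and I have used 'Q' (unsigned) for the 64-bit case, which is an assumption on my part.

```python
import array
import struct

# Mappings follow the question's convention (1/2 codes read from the header).
BYTE_ORDER = {1: '<', 2: '>'}  # 1 = little-endian, 2 = big-endian
WORD_CODE = {1: 'L', 2: 'Q'}   # 1 = 32-bit, 2 = 64-bit (unsigned, assumed)


def make_format(bit_order, data_size):
    """Build a struct format string such as '<L' or '>Q' (hypothetical helper)."""
    return BYTE_ORDER[bit_order] + WORD_CODE[data_size]


def unpack_word(buf, offset, fmt):
    """Unpack one value at offset, slicing exactly calcsize(fmt) bytes."""
    size = struct.calcsize(fmt)
    # bytes(bytearray(...)) yields a str on 2.7 and bytes on 3.x.
    chunk = bytes(bytearray(buf[offset:offset + size]))
    return struct.unpack(fmt, chunk)[0]


# Stand-in header: 24 padding bytes, then the 8 bytes seen at header[24:32].
header = array.array('B', [0] * 24 + [0x40, 0x04, 0x40, 0, 0, 0, 0, 0])

fmt = make_format(1, 2)  # little-endian, 64-bit
entry_point = unpack_word(header, 24, fmt)
```

Because the format starts with '<' or '>', struct uses standard sizes, so 'L' is always 4 bytes and 'Q' always 8 regardless of platform. Letting calcsize() decide the slice length keeps the buffer and the format in step when the word size changes.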