Reputation: 257
I'm attempting to use Python's struct module to decode some binary headers from a GPS system. I have two types of header, long and short, and I have an example of reading each one below:
import struct
import binascii
packed_data_short = binascii.unhexlify('aa44132845013b078575e40c')
packed_data_long = binascii.unhexlify('aa44121ca20200603400000079783b07bea9bd0c00000000cc5dfa33')
print packed_data_short
print len(packed_data_short)
sS = struct.Struct('c c c B H H L')
unpacked_data_short = sS.unpack(packed_data_short)
print 'Unpacked Values:', unpacked_data_short
print ''
print packed_data_long
print len(packed_data_long)
sL = struct.Struct('c c c B H c b H H b c H L L H H')
unpacked_data_long = sL.unpack(packed_data_long)
print 'Unpacked Values:', unpacked_data_long
In both cases I get the length I am expecting - 12 bytes for a short header and 28 bytes for a long one. In addition all the fields appear correctly and (to the best of my knowledge with old data) are sensible values. All good so far.
I move this across onto another computer (running a different version of Python - 2.7.6 as opposed to 2.7.11) and I get different struct lengths using calcsize, and get errors when I pass it data of the length that I calculated and that the other version is content with. Now the short header is expecting 16 bytes and the long one 36 bytes.
If I pass the larger amount it is asking for, most of the fields are fine until the "L" fields. In the long example the first one is as expected, but the second one, which should just be 0, is not correct, and consequently the two fields after it are also incorrect. Looking at the number of bytes the function wants, I noticed that it is 4 extra for each of the "L"s, and indeed just running struct.calcsize('L') I get 8 for the length in 2.7.6 and 4 for 2.7.11. This at least narrows down where the problem is, but I don't understand why it is happening.
At present I'm updating the second computer to Python 2.7.11 (will update once I have it), but I can't find anything in the struct documentation which would suggest there has been a change to this. Is there anything I have clearly missed or is this simply a version problem?
The documentation I have been referring to is here.
EDIT: Further to comment regarding OS - one is a 64 bit version of Windows 7 (the one which works as expected), the second is a 64 bit version of Ubuntu 14.04.
Upvotes: 3
Views: 1323
Reputation: 133929
This is not a bug; see the struct documentation:
Note
By default, the result of packing a given C struct includes pad bytes in order to maintain proper alignment for the C types involved; similarly, alignment is taken into account when unpacking. This behavior is chosen so that the bytes of a packed struct correspond exactly to the layout in memory of the corresponding C struct. To handle platform-independent data formats or omit implicit pad bytes, use standard size and alignment instead of native size and alignment: see Byte Order, Size, and Alignment for details.
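As a rough illustration of the padding that note describes, the sketch below compares native and standard sizes for a small format and for the short header from the question. The variable names are mine, and the native results depend on the platform the code runs on.

```python
import struct

# Native mode (no prefix): sizes and alignment follow the local C compiler,
# so pad bytes are inserted before the unsigned long.
native_pair = struct.calcsize('cL')

# Standard mode ('<' prefix): fixed sizes and no pad bytes, so 1 + 4 bytes.
standard_pair = struct.calcsize('<cL')

# The short header from the question, in both modes.
native_short = struct.calcsize('c c c B H H L')
standard_short = struct.calcsize('<c c c B H H L')

print('cL native: %d, standard: %d' % (native_pair, standard_pair))
print('short header native: %d, standard: %d' % (native_short, standard_short))
```

On a platform where unsigned long is 8 bytes, the native short header comes out at 16 bytes, which matches what the question reports on the Ubuntu machine.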
To decode the data from that GPS device, you need to use < or > in your format string, as described in 7.3.2.1. Byte Order, Size, and Alignment. Since you got it working on the other machine, I presume the data is in little-endian format, and it would work portably if you used
sS = struct.Struct('<cccBHHL')
sL = struct.Struct('<cccBHcbHHbcHLLHH')
whose sizes are always
>>> sS.size
12
>>> sL.size
28
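For completeness, here is a minimal sketch that unpacks the two sample headers from the question with these little-endian formats. It is written to run on Python 2.7 and 3; the names short_fields and long_fields are mine, and I am not assuming anything about what the individual fields mean.

```python
import binascii
import struct

# Sample headers copied from the question.
packed_data_short = binascii.unhexlify('aa44132845013b078575e40c')
packed_data_long = binascii.unhexlify(
    'aa44121ca20200603400000079783b07bea9bd0c00000000cc5dfa33')

# '<' selects little-endian byte order with standard sizes and no padding.
sS = struct.Struct('<cccBHHL')
sL = struct.Struct('<cccBHcbHHbcHLLHH')

short_fields = sS.unpack(packed_data_short)
long_fields = sL.unpack(packed_data_long)

print(short_fields)
print(long_fields)
```

With these formats the second "L" of the long header unpacks as 0 on any platform, which is the value the question expects there.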
Why did they differ? The original computer you're using is either a Windows machine or a 32-bit machine, and the remote machine is a 64-bit *nix. In native sizes, L means the C compiler's unsigned long type. On 32-bit Unixes and on all Windows versions, this is 32 bits wide.
On 64-bit Unixes the standard ABI on x86-64 is LP64, which means that long and pointers are 64 bits wide. However, Windows uses LLP64; only long long (and pointers) are 64 bits wide there; the reason is that lots of code, and even the Windows API itself, has long relied on long being exactly 32 bits.
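If you want to check which data model a given interpreter was built for, one way (a sketch, not the only option) is to ask ctypes for the C type sizes and compare them with what struct reports in native mode. The variable names here are mine.

```python
import ctypes
import struct

# Sizes of the C types as seen by this Python build.
long_bytes = ctypes.sizeof(ctypes.c_ulong)           # 8 on LP64, 4 on LLP64 and 32-bit
long_long_bytes = ctypes.sizeof(ctypes.c_ulonglong)  # unsigned long long
pointer_bytes = ctypes.sizeof(ctypes.c_void_p)       # void *

# struct's native 'L' is the same C unsigned long, so the sizes agree.
native_L = struct.calcsize('L')

print('unsigned long: %d bytes' % long_bytes)
print('unsigned long long: %d bytes' % long_long_bytes)
print('pointer: %d bytes' % pointer_bytes)
print('struct native L: %d bytes' % native_L)
```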
With the < flag present, L and I are both always guaranteed to be 32 bits wide. There was no problem with the other field specifiers because their sizes are the same on all x86 platforms and operating systems.
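To see this for the specifiers used in the question, a quick sketch like the following prints the native and standard size of each code side by side. The dictionary names are mine, and the native column will vary with the machine.

```python
import struct

# Format codes that appear in the question's two header formats, plus I.
codes = ['c', 'b', 'B', 'H', 'I', 'L']

# Native sizes follow the local C compiler; '<' sizes are fixed by the struct module.
native_sizes = {code: struct.calcsize(code) for code in codes}
standard_sizes = {code: struct.calcsize('<' + code) for code in codes}

for code in codes:
    print('%s: native %d, standard %d'
          % (code, native_sizes[code], standard_sizes[code]))
```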
Upvotes: 6