Gabriel Brunheira

Reputation: 3

struct.unpack() requires wrong length from bytes object with specific format pattern

I'm trying to decode a bytes object with the 'BQ' format (i.e., unsigned char + unsigned long long) on Python 3.6.2. The object should be 9 bytes long, but struct.unpack raises an error asking for more bytes:

In [96]: struct.unpack('BQ',bytesObj)
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-96-667267f631a1> in <module>()
----> 1 struct.unpack('BQ',bytesObj)

error: unpack requires a bytes object of length 16

When I change the order of the format specifier to 'QB', it doesn't complain about the length, although it's supposed to be the same:

In [97]: struct.unpack('QB',bytesObj)
Out[97]: (35184770581765, 0)

It gets even stranger when I replace 'B' with 'f', which should increase the required length by 3 bytes, but the error stays the same:

In [98]: struct.unpack('fQ',bytesObj)
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-98-c3792c78fd43> in <module>()
----> 1 struct.unpack('fQ',bytesObj)

error: unpack requires a bytes object of length 16

In [99]: struct.unpack('Qf',bytesObj)
---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-99-78065617d606> in <module>()
----> 1 struct.unpack('Qf',bytesObj)

error: unpack requires a bytes object of length 12

No matter which format I put before 'Q', I always get the same error asking for a length of 16. It seems to work fine only when no format precedes 'Q'.

Am I missing something?

Upvotes: 0

Views: 1442

Answers (1)

AS Mackay

Reputation: 2847

The jump from 9 to 16 bytes happens because Python adds padding bytes so that the elements of the struct are aligned on the same boundaries a C compiler would use.

There is an explanation for this in the struct module documentation (section 7.3 of the Python 2 library reference).

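In native mode (the default), the q format elements (long long) and Q format elements (unsigned long long) are forced to START on 8-byte boundaries on typical 64-bit platforms. Padding bytes are added AFTER any elements that come BEFORE q/Q to ensure this. No padding is added after the last element, which is why 'QB' needs only 9 bytes and 'Qf' only 12.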

Running the following code shows this in action:

# Python 3 syntax (print is a function), matching the question's Python 3.6.2
from struct import calcsize, pack, unpack

print("QB: " + str(calcsize('QB')))
bytesObj = pack('QB', 1, 2)
print(unpack('QB', bytesObj))

print("BQ: " + str(calcsize('BQ')))
bytesObj = pack('BQ', 1, 2)
print(unpack('BQ', bytesObj))

print("qB: " + str(calcsize('qB')))
bytesObj = pack('qB', 1, 2)
print(unpack('qB', bytesObj))

print("Bq: " + str(calcsize('Bq')))
bytesObj = pack('Bq', 1, 2)
print(unpack('Bq', bytesObj))

print("Qf: " + str(calcsize('Qf')))
bytesObj = pack('Qf', 1, 2.0)
print(unpack('Qf', bytesObj))

print("fQ: " + str(calcsize('fQ')))
bytesObj = pack('fQ', 1.0, 2)
print(unpack('fQ', bytesObj))

On a typical 64-bit platform this gives the following output:

QB: 9
(1, 2)
BQ: 16
(1, 2)
qB: 9
(1, 2)
Bq: 16
(1, 2)
Qf: 12
(1, 2.0)
fQ: 16
(1.0, 2)
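
To see where the padding actually lands, it can help to look at the packed bytes themselves. The snippet below is only a rough sketch: it assumes a native unsigned long long of 8 bytes (true on common platforms, but not guaranteed), and the variable names are made up for illustration. It packs the same two values once in native mode ('@', the default) and once in standard mode ('='), then compares the results.

```python
import struct

# Native mode ('@' is the default): alignment padding may be inserted before Q.
native = struct.pack('@BQ', 1, 2)

# Standard mode ('='): native byte order, standard sizes, no alignment padding.
standard = struct.pack('=BQ', 1, 2)

print(len(native), native.hex())
print(len(standard), standard.hex())

# Assuming Q is 8 bytes natively, the size difference is the number of
# padding bytes that were inserted between the B and the Q.
padding = len(native) - len(standard)
print("padding bytes:", padding)
```

On the platform that produced the output above, the first line should report 16 bytes, with zero bytes sitting between the leading 01 and the 8-byte value; the second line always reports 9.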

Hope this helps.

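(Edit): Also, as pointed out by the OP, this default behavior can be overridden: starting the format string with a byte order character such as '=', '<', '>' or '!' switches to standard sizes with no alignment padding.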
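
As a minimal sketch of that override, the example below unpacks a 9-byte object with an explicit byte order prefix. The little-endian choice ('<') and the sample values are assumptions for illustration only; the byte order of the data in the question is not known, so '>' or '=' may be the right prefix there.

```python
import struct

# Build a 9-byte payload: 1 byte (B) followed by 8 bytes (Q), no padding.
# The values are invented for this example.
payload = struct.pack('<BQ', 7, 35184770581765)

# With a byte order prefix, the required size is the plain sum 1 + 8.
size = struct.calcsize('<BQ')
print(size)

# Unpacking with the same prefix accepts the 9-byte object.
flag, value = struct.unpack('<BQ', payload)
print(flag, value)
```

The same prefix must be used for packing and unpacking, since it fixes both the byte order and the absence of padding.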

Upvotes: 1
