bugpulver

Reputation: 35

python3 struct.pack with strings as parameters

I'm wondering about Python 3's struct.pack behaviour, but maybe I missed something.

I'm sending a .jpg file via UDP. Btw: socket.sendto() and sendall() throw "Python IOError: [Errno 90] Message too long" when I try to send the whole file (~200 kB) at once, so I send the file in pieces of 1024 bytes. No problem, I'm just wondering why I can't find anything about this size limitation in the Python docs.

Anyway, my main issue is: I need struct.pack to put some information at the beginning of each piece, namely 2 fixed-size strings.

But when I do

chunk = struct.pack("!3c4cI", bytes("JPG", "utf-8"), bytes(dev_id, "utf-8"), i)

it fails with "struct.error: pack expected 8 items for packing (got 3)"

so I have to write

chunk = struct.pack("!3c4ci", b"J", b"P", b"G", 
    bytes(dev_id[0:1], "utf-8"),
    bytes(dev_id[1:2], "utf-8"),
    bytes(dev_id[2:3], "utf-8"),
    bytes(dev_id[3:4], "utf-8"), i)

to make it work. Why is that!?

Upvotes: 2

Views: 7320

Answers (1)

struct.pack requires each item to be passed as a separate argument, and a repeat count such as 3c means three separate items, not one 3-byte value. Additionally, Python distinguishes between char and byte, even though in C they are synonyms; that is why the c format needs bytes values of length 1 instead of integers in the range 0..255.
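To make the item counting concrete, here is a minimal sketch of how the c format behaves. The variable names (packed, error_message, int_rejected) are only illustrative and do not come from the question.

```python
import struct

# "3c" is a repeat count: three separate one-byte items, not one 3-byte string.
packed = struct.pack("!3c", b"J", b"P", b"G")
print(packed)  # b'JPG'

# One 3-byte value for "3c" fails, because only 1 of the 3 expected items is given.
try:
    struct.pack("!3c", b"JPG")
    error_message = None
except struct.error as exc:
    error_message = str(exc)
print(error_message)

# An integer is not accepted for "c" either; it must be a bytes object of length 1.
try:
    struct.pack("!c", 74)
    int_rejected = False
except struct.error:
    int_rejected = True
print(int_rejected)  # True
```

This is the same mismatch as in the question: "!3c4cI" describes 3 + 4 + 1 = 8 items, but only 3 arguments were passed.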

However, struct also supports the s format specifier, where s stands for a bytes string of the given length, passed as a single argument:

>>> dev_id, i = 'R2D2', 42
>>> struct.pack("!3s4sI", b"JPG", dev_id.encode(), i)
b'JPGR2D2\x00\x00\x00*'
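Applied to the chunked transfer from the question, the s format could be used roughly like this. This is only a sketch: the helper names make_chunk and parse_chunk, the HEADER constant and the sample payload are made up for illustration, and it assumes dev_id encodes to exactly 4 bytes (s pads shorter values with null bytes and truncates longer ones).

```python
import struct

# Header layout: 3-byte tag, 4-byte device id, 4-byte unsigned chunk index,
# in network byte order ("!"), so no padding is inserted between fields.
HEADER = struct.Struct("!3s4sI")

def make_chunk(dev_id: str, index: int, payload: bytes) -> bytes:
    """Prefix one piece of the file with the fixed-size header."""
    return HEADER.pack(b"JPG", dev_id.encode(), index) + payload

def parse_chunk(chunk: bytes):
    """Split a received chunk back into header fields and payload."""
    tag, dev, index = HEADER.unpack_from(chunk)
    return tag, dev.decode(), index, chunk[HEADER.size:]

chunk = make_chunk("R2D2", 42, b"\xff\xd8")
print(HEADER.size)         # 11
print(parse_chunk(chunk))  # (b'JPG', 'R2D2', 42, b'\xff\xd8')
```

Compiling the format once with struct.Struct also gives the receiver the header size, so it knows where the payload starts.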

Alternatively, if you're using at least Python 3.5, then thanks to PEP 448 -- Additional Unpacking Generalizations you can use the B (unsigned byte) format in conjunction with the splat operator * like this:

>>> struct.pack("!3B4BI", *b"JPG", *dev_id.encode(), i)

What * does here is unpack every byte of the given bytes value, as an integer in the range 0..255, into a separate argument; if dev_id.encode() results in 4 UTF-8 bytes, a total of 8 arguments is passed to struct.pack. And unlike c, B accepts a single byte value as an integer.
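A quick way to check the B variant is to compare it with the s variant. The names with_b and with_s are only illustrative, and the sketch assumes dev_id is ASCII so that it encodes to exactly 4 bytes.

```python
import struct

dev_id, i = "R2D2", 42

# Iterating over a bytes object yields integers, which is what "B" expects.
print(list(b"JPG"))  # [74, 80, 71]

# 3 + 4 unpacked byte values plus the index: 8 arguments in total.
with_b = struct.pack("!3B4BI", *b"JPG", *dev_id.encode(), i)

# The same header built with the "s" format from 3 arguments.
with_s = struct.pack("!3s4sI", b"JPG", dev_id.encode(), i)

print(with_b == with_s)  # True
```

If dev_id contained non-ASCII characters, the encoded length could differ from 4; the B version would then raise struct.error because the argument count no longer matches the format.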


P.S. Notice that, for shorter code, I used b'JPG' directly instead of calling bytes('JPG', 'UTF-8'), and likewise called .encode() on the string, which encodes to UTF-8 by default.

Upvotes: 3
