Wei

Reputation: 341

Strange byte length when using Python 3 struct.pack()

I am writing a Python 3 script that writes some numbers to a file in binary form, and I found something quite strange. For example, the following code writes an unsigned short and a float to the file tmp:

import struct
with open('tmp', "wb") as f:
  id1 = 1
  i = 0.5785536878880112
  fmt = "Hf"
  data = struct.pack(fmt, id1, i)
  f.write(data)
  print("no. of bytes:%d"%struct.calcsize(fmt))

According to the docs, "H" (unsigned short) is 2 bytes and "f" (float) is 4 bytes, so I'd expect a 6-byte file. However, the output is 8 bytes:

01 00 00 00 18 1c 14 3f

as indicated by

struct.calcsize(fmt)

which reports a size of 8 bytes for "Hf".

If I pack the two values separately, e.g.

data = struct.pack('H', id1)
f.write(data)
data = struct.pack('f', i)
f.write(data)

then the output is an expected 6-byte file:

01 00 18 1c 14 3f

What is happening here?

Upvotes: 2

Views: 1249

Answers (2)

Alastair McCormack

Reputation: 27704

According to the documentation, specifying the byte order removes any padding:

No padding is added when using non-native size and alignment, e.g. with ‘<’, ‘>’, ‘=’, and ‘!’.

Therefore, assuming you require little-endian packing, the following gives the required output:

>>> struct.pack('<Hf', id1, i)
b'\x01\x00\x18\x1c\x14?'

Note the <. The last byte, 0x3f, is the ASCII code for ?, which is why the repr shows ? rather than \x3f.
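To make the difference concrete, here is a minimal sketch comparing the sizes with and without a byte-order prefix. The variable names are only illustrative, and the native size is platform-dependent (8 on common platforms, as in the question).

```python
import struct

# Native mode (no prefix, same as '@') follows the C compiler's alignment
# rules, so this size depends on the platform; it is typically 8.
native_size = struct.calcsize('Hf')

# Standard-size modes never insert padding: 2 bytes (H) + 4 bytes (f).
sizes = {prefix: struct.calcsize(prefix + 'Hf') for prefix in '<>=!'}

# Little-endian packing of the values from the question.
packed = struct.pack('<Hf', 1, 0.5785536878880112)

print(native_size)
print(sizes)
print(packed.hex())
```

If you want the machine's own byte order but no padding, '=' is the prefix to use instead of '<'.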

Upvotes: 5

Tim Pietzcker

Reputation: 336088

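In native mode (the default), struct.pack() aligns each value the way the C compiler would, so on typical platforms a 4-byte float starts at an offset divisible by 4 and two pad bytes are inserted after the 2-byte short. If you pack the values one at a time, as in your second example, each call starts at offset 0, so no padding is ever inserted.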

As the docs you linked to say:

By default, C numbers are represented in the machine’s native format and byte order, and properly aligned by skipping pad bytes if necessary (according to the rules used by the C compiler).
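As a rough illustration of where the pad bytes go, the sketch below requests them explicitly with the 'x' pad code in standard-size mode, which reproduces the 8-byte layout from the question. It also assumes a typical platform where a float is 4 bytes and a short is 2; the variable names are only illustrative.

```python
import struct

# Standard size, no automatic alignment: the two pad bytes are requested
# explicitly with '2x', giving the same layout native mode produced.
explicit = struct.pack('<H2xf', 1, 0.5785536878880112)
print(explicit.hex())

# Native mode only pads between fields, never at the end, so putting the
# float first leaves no gap before the short on typical platforms.
reordered_size = struct.calcsize('fH')
print(reordered_size)
```

Reordering the fields is therefore another way to avoid the padding while staying in native mode, at the cost of a different file layout.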

Upvotes: 1
