user3222338

Reputation: 65

How do struct.unpack and struct.pack work?

I'm currently trying to learn how to parse a PPM file. I did the following in a Python interpreter:

>>> x = open('file.ppm')
>>> x.readline()
'P6\n'
>>> x.readline()
'2 3\n'
>>> x.readline()
'255\n'
>>> x.readline()
'\n'
>>> x.readline()
'\x174R\x03\xd7\x1e\xb5e!-\xcd(D\\oL\x01'

I understand the basic structure of a PPM file. However, I'm most curious about the last line: the byte string that holds the pixels' colour information. The file above should parse back into

P6
2 3
255
10 23 52 82 3 215 30 181 101 33 45 205 40 68 92 111 76 1

By using struct.pack('B', x), I can see that the integers get packed into the bytes above. However, I'm not sure how to reverse this process using struct.unpack. Most importantly, I'm not sure where one encoded value ends and the next begins, since they all appear on the same line and don't seem to have the same length either (?).

I also tried to pack the whole line by doing struct.pack('I','\x174R\x03\xd7\x1e\xb5e!-\xcd(D\\oL\x01'). I don't understand why it could not directly convert the byte encoding into integers.

How can I use struct.unpack(...) to parse the bytes back into integers? Also, what is happening as those values are being packed/unpacked?

Upvotes: 3

Views: 3362

Answers (1)

falsetru

Reputation: 369074

The first three lines are plain text and were not produced by struct.pack, so use them directly (call strip or rstrip if you want to remove the surrounding whitespace).

>>> 'P6\n'
'P6\n'
>>> 'P6\n'.rstrip()
'P6'

For the last bytes:

>>> import struct
>>> b = b'\x174R\x03\xd7\x1e\xb5e!-\xcd(D\\oL\x01'
>>> struct.unpack('%dB' % len(b), b)  # 'B' = one unsigned byte, repeated len(b) times
(23, 52, 82, 3, 215, 30, 181, 101, 33, 45, 205, 40, 68, 92, 111, 76, 1)

Alternatively, use bytearray (in Python 3.x, bytes works as well); iterating over it yields ints:

>>> list(bytearray(b))
[23, 52, 82, 3, 215, 30, 181, 101, 33, 45, 205, 40, 68, 92, 111, 76, 1]
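As for what happens during packing and unpacking: each value under format 'B' occupies exactly one byte. The pieces only look uneven because the repr prints printable ASCII bytes as characters ('4', 'R') and everything else as \xNN escapes. struct.pack goes from integers to bytes, so handing it a string with format 'I' fails; 'I' also describes a single 4-byte unsigned int rather than a run of bytes. A minimal sketch of the round trip (the variable names are my own, not from the question):

```python
import struct

# Pixel bytes as returned by the last readline() in the question (17 bytes).
data = b'\x174R\x03\xd7\x1e\xb5e!-\xcd(D\\oL\x01'

# Every element is one byte, regardless of how repr() displays it.
assert len(data) == 17

# 'B' = one unsigned byte (0-255); the count prefix repeats the code.
values = struct.unpack('%dB' % len(data), data)

# Packing the same integers reproduces the original bytes.
repacked = struct.pack('%dB' % len(values), *values)
assert repacked == data

# 'I' consumes exactly 4 bytes and yields ONE integer ('<' = little-endian).
first_word = struct.unpack('<I', data[:4])[0]
print(values[:4], first_word)
```

So unpack needs a format string whose total size matches the length of the bytes you give it, which is why the format is built from len(data).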

NOTE: As Martijn Pieters commented, you should open the file in binary mode when you're dealing with binary data.

f = open('file.ppm', 'rb') # b: binary mode
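Binary mode also suggests reading the pixel data by byte count rather than by line. A 2 x 3 image with 3 samples per pixel has 18 bytes, and it appears that the '\n' returned by the fourth readline() in the question is really the first sample (byte value 10), which would explain the leading 10 in the expected output. A rough sketch, assuming a header with one field per line and no comments; read_ppm_p6 is a hypothetical helper, and io.BytesIO stands in for the opened file:

```python
import io
import struct

def read_ppm_p6(stream):
    """Read a simple P6 file (no header comments); hypothetical helper."""
    magic = stream.readline().strip()
    if magic != b'P6':
        raise ValueError('not a P6 file')
    width, height = (int(tok) for tok in stream.readline().split())
    maxval = int(stream.readline())
    if maxval > 255:
        raise ValueError('this sketch handles one byte per sample only')
    count = width * height * 3          # 3 samples (R, G, B) per pixel
    raw = stream.read(count)            # read by size, not by line
    if len(raw) != count:
        raise ValueError('truncated pixel data')
    samples = struct.unpack('%dB' % count, raw)
    return width, height, maxval, samples

# Stand-in for open('file.ppm', 'rb'), rebuilt from the bytes in the question.
header = b'P6\n2 3\n255\n'
pixels = b'\n\x174R\x03\xd7\x1e\xb5e!-\xcd(D\\oL\x01'
width, height, maxval, samples = read_ppm_p6(io.BytesIO(header + pixels))
print(samples)
```

Reading a fixed number of bytes avoids splitting the pixel data wherever a sample happens to equal 10.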

Upvotes: 1
