Reputation: 1063
What I am really doing is creating a BMP file from a JPEG using Python. The BMP file has header data containing info such as the file size and the height and width of the image, so basically I want to read a JPEG file, get its width and height, calculate the size of the new BMP file and store it in the header.
Let's say the new size of the BMP file is 40000 bytes, whose hex value is 0x9c40. As there are 4 bytes available to store this in the header, we can write it as 0x00009c40. In BMP header data the LSB is written first and the MSB last, so I have to write 0x409c0000 in the file.
My Problems:-
I was able to do this in C, but I am totally lost on how to do so in Python.
For example, if I have i=40000 and use str=hex(i)[2:], I get the hex value. With some more code I was able to add the extra zeros and then reverse the byte order. Now how do I write this '409c0000' data to the file as hex?
The header size is 54 bytes for a BMP file, so is there another way to just store the data in a string like str='00ffcf4f...' (up to 54 bytes), convert the whole str to hex at once and write it to the file?
My friend told me to use unhexlify from binascii. By doing unhexlify('fffcff') I get '\xff\xfc\xff', which is what I want, but when I try unhexlify('3000') I get '0\x00', which is not what I want. It is the same for any value containing 3, 4, 5, 6 or 7. Is this the right way to do it?
Upvotes: 1
Views: 4590
Reputation: 1121784
You are not writing hex, you are writing binary data. Hexadecimal is a helpful notation when dealing with binary data, but don't confuse the notation with the value.
Use the struct module to pack integer data into binary structures, the same way C would.
binascii.unhexlify is also a good choice, provided you already have the data in a string using hex notation. The output is correct, but Python's representation of the binary data only uses hex escapes for bytes outside the printable ASCII range.
Thus fffcff correctly becomes \xff\xfc\xff, representing 3 bytes in hex escape notation, and 3000 is \x30\x00. But \x30 is the '0' character in ASCII, so the Python representation for that byte simply uses that ASCII character, as that is the most common way to interpret bytes.
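A quick way to convince yourself that only the display differs is to compare the bytes directly. This is a minimal sketch, not part of the original answer; it is written so that it should run on both Python 2 and Python 3 (on Python 3 the printed representation carries a `b` prefix), and the variable names are purely illustrative.

```python
import binascii

# Convert a hex-notation string into the raw bytes it describes.
data = binascii.unhexlify('3000')

# The two spellings below denote the very same two bytes;
# '0' is just how the byte 0x30 is displayed.
same = (data == b'\x30\x00') and (data == b'0\x00')

# Looking at the numeric byte values removes the ambiguity.
values = list(bytearray(data))

print(repr(data))
print(same)
print(values)  # the byte values 48 (0x30) and 0
```

The same call works on a longer hex string, so a complete header written out in hex notation could be converted in one step.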
Packing the integer value 40000 using struct.pack() as an unsigned integer (little endian) then becomes:
>>> import struct
>>> struct.pack('<I', 40000)
'@\x9c\x00\x00'
where the 0x40 byte is represented by the ASCII character for that byte, the @ glyph.
If this is confusing, you can always go the other way and use the binascii.hexlify() function (https://docs.python.org/2/library/binascii.html#binascii.hexlify) to create a hexadecimal representation for yourself, just to debug the output:
>>> import binascii
>>> binascii.hexlify(struct.pack('<I', 40000))
'409c0000'
and you'll see that the @ byte is still the right hex value.
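The same approach extends to the whole 54-byte header mentioned in the question. The following is a rough sketch rather than a definitive implementation: it assumes an uncompressed 24-bit image with the common 14-byte file header followed by a 40-byte info header, and `build_bmp_header` is a hypothetical helper name, not something from the question or from any library.

```python
import struct

def build_bmp_header(width, height):
    """Return a 54-byte header for an uncompressed 24-bit BMP (illustrative helper)."""
    bits_per_pixel = 24
    # Each pixel row is padded to a multiple of 4 bytes.
    row_size = (width * bits_per_pixel + 31) // 32 * 4
    image_size = row_size * height
    header_size = 14 + 40
    file_size = header_size + image_size

    # File header: signature, file size, two reserved fields, offset to pixel data.
    # '<' selects little-endian byte order with no padding between fields.
    file_header = struct.pack('<2sIHHI', b'BM', file_size, 0, 0, header_size)

    # Info header: its own size, width, height, planes, bits per pixel,
    # compression (0 = none), image size, x/y resolution, palette fields.
    info_header = struct.pack('<IiiHHIIiiII',
                              40, width, height, 1, bits_per_pixel,
                              0, image_size, 0, 0, 0, 0)
    return file_header + info_header

header = build_bmp_header(100, 100)
print(len(header))
```

To write it out, open the file in binary mode, for example `open('out.bmp', 'wb')`, and pass `header` followed by the pixel rows to `write()`.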
Upvotes: 6