Reputation: 954
Let me start off by admitting I am not a computer scientist, so if this is a dumb question, I apologize in advance. I am trying to figure out a binary file format using a hex editor. I can read ints and chars fine (using numpy in Python), but I am having issues with the floats: they don't appear to be IEEE 754 binary32, and when I use numpy with dtype 'f4' to read this chunk of memory, it returns an incorrect value. I have tried switching the endianness, to no avail. Any insight into what format these numbers are in would be useful, but more importantly, how would I read them in Python (assuming they are in a byte string)? Each example below gives the known decimal value on the first line, the hex bytes shown in the editor on the second, and the same bytes in binary on the third.
250
00 00 7a 43
00000000 00000000 01111010 01000011
-250
00 00 7a c3
00000000 00000000 01111010 11000011
0
00 00 00 00
00000000 00000000 00000000 00000000
200
00 00 48 43
00000000 00000000 01001000 01000011
250.1
9a 19 7a 43
10011010 00011001 01111010 01000011
Upvotes: 0
Views: 2393
Reputation: 24052
The values you're showing indicate that your data is stored little-endian. The 32-bit IEEE 754 encoding of 250.0, for example, is 437a0000 in hex. The corresponding little-endian byte sequence is:
00 00 7a 43
This is exactly what you've been seeing.
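A quick way to check this yourself is to pack 250.0 in both byte orders and compare the result with what the hex editor shows. This is only a minimal sketch; the variable names are illustrative:

```python
import struct

# Encode 250.0 as a 32-bit IEEE 754 float in both byte orders.
big = struct.pack('>f', 250.0)     # most significant byte first
little = struct.pack('<f', 250.0)  # least significant byte first

print(big.hex())     # 437a0000
print(little.hex())  # 00007a43, the byte order seen in the file
```

The little-endian result is simply the big-endian bytes reversed.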
I think the problem is that Python usually uses double precision for its float type, i.e. 64-bit floating-point values, so a 32-bit value has to be unpacked explicitly as such.
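To illustrate the size difference, here is a small sketch using struct's standard-size format codes ('f' for 32-bit, 'd' for 64-bit). It also shows why a value like 250.1 will not read back exactly: it is rounded when stored in 32 bits and then widened to a Python float. The names below are illustrative only:

```python
import struct

# With an explicit byte order, struct uses standard sizes.
size_single = struct.calcsize('<f')  # 4 bytes, IEEE 754 binary32
size_double = struct.calcsize('<d')  # 8 bytes, IEEE 754 binary64

# Round-trip 250.1 through 32 bits; the result is a Python float (64-bit)
# holding the nearest binary32 value, which is close to but not equal to 250.1.
round_tripped = struct.unpack('<f', struct.pack('<f', 250.1))[0]

print(size_single, size_double)
print(round_tripped)
```

So a small discrepancy in the last digits of 250.1 is expected and does not mean the bytes were decoded incorrectly.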
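One way to correctly unpack these values is with struct. The example below feeds the bytes in big-endian order with '>f'; for bytes taken straight from your file, use '<f' instead: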
>>> import struct
>>> struct.unpack('>f', '\x43\x7a\x00\x00')[0]
250.0
>>>
The above works for Python 2. In Python 3, you need to pass a bytes-like object, such as:
>>> struct.unpack('>f', b'\x43\x7a\x00\x00')[0]
250.0
>>>
or:
>>> struct.unpack('>f', bytes([0x43, 0x7a, 0x00, 0x00]))[0]
250.0
>>>
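Applied to the byte sequences from the question exactly as they appear in the file, a minimal Python 3 sketch might look like the following. It assumes the bytes are little-endian binary32, as argued above; the `samples` and `decoded` names are just for illustration:

```python
import struct

# Hex bytes copied from the question, paired with the values they should hold.
samples = {
    "00 00 7a 43": 250.0,
    "00 00 7a c3": -250.0,
    "00 00 00 00": 0.0,
    "00 00 48 43": 200.0,
    "9a 19 7a 43": 250.1,
}

decoded = {}
for hex_bytes, expected in samples.items():
    raw = bytes.fromhex(hex_bytes)       # fromhex ignores the spaces
    (value,) = struct.unpack('<f', raw)  # '<f' = little-endian 32-bit float
    decoded[hex_bytes] = value
    print(hex_bytes, '->', value, '(expected about', expected, ')')
```

Since the question already uses numpy, the equivalent there should be `np.frombuffer(raw, dtype='<f4')`, which reads the same four bytes as a little-endian 32-bit float.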
For more info, see the struct documentation for Python 2 or Python 3. This seems to be exactly what you're looking for.
Upvotes: 1