Reputation: 23
I have a raw data file that represents voltages recorded by a device. I want to convert the binary file into a file of numbers that can be plotted.
The raw data is little-endian and each sample is 3 bytes (24 bits). When I view the file with a text editor, I see strange, unreadable characters, as shown below:
</DataInfo>
ò ì ê ì ð ô ù þ ý ù ø ÷ õ ò ï î î ï ò ô ò ï î î ï í é ç ë ò ø ú ü þ
That makes sense, because the data is still binary. So I used the command line to produce a hex dump that looks like this:
000007F0 2F 46 50 3E 0A 3C 2F 44 61 74 61 49 6E 66 6F 3E /FP>.</DataInfo>
00000800 0D 0A 0D 0A F2 08 00 EC 08 00 EA 08 00 EC 08 00 ....ò..ì..ê..ì..
00000810 F0 08 00 F4 08 00 F9 08 00 FE 08 00 00 09 00 FD ð..ô..ù..þ.....ý
00000820 08 00 F9 08 00 F8 08 00 F7 08 00 F5 08 00 F2 08 ..ù..ø..÷..õ..ò.
My issue is that when I convert the hexadecimal values to decimal numbers, the numbers are far too large to be correct, and I can't figure out what went wrong.
I don't know much about binary files and I'm not even sure where to look, so any guidance is greatly appreciated!
FYI I have some python programming skills and the file I am working with can be seen here: https://drive.google.com/file/d/1WZ6OBPLKIqrxw1GsvG776jqD8U08CMD9/view?usp=sharing
Upvotes: 2
Views: 892
Reputation: 371
It looks like there is a header which ends with 0D 0A 0D 0A, so the data starts at byte 0x804.
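Rather than hard-coding 0x804, the offset can be found by searching for the end of the header. This is only a sketch: it assumes the header always closes with `</DataInfo>` followed by 0D 0A 0D 0A, as in the hex dump, and `find_data_offset` is a hypothetical helper name, not part of any library.

```python
def find_data_offset(raw: bytes, marker: bytes = b"</DataInfo>\r\n\r\n") -> int:
    """Return the index of the first byte after the header terminator.

    Assumes the header ends with the closing tag plus CR LF CR LF,
    which is what the hex dump in the question shows.
    """
    idx = raw.find(marker)
    if idx < 0:
        raise ValueError("header terminator not found")
    return idx + len(marker)


# Small in-memory stand-in for the real file: a fake header plus two samples.
header = b"<DataInfo>example</DataInfo>"
sample = header + b"\r\n\r\n" + bytes.fromhex("F20800EC0800")

offset = find_data_offset(sample)
payload = sample[offset:]  # only the 3-byte samples remain
```

For the real file, read it with `open(path, "rb").read()` and pass the result in; with the layout shown in the question the function should return 0x804.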
You can use this trick to parse it, taken from the question "NumPy: 3-byte, 6-byte types (aka uint24, uint48)":
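import numpy as np

# Four 3-byte little-endian samples taken from the hex dump
a = "F2 08 00 EC 08 00 EA 08 00 EC 08 00".replace(" ", "")
a = bytes(bytearray.fromhex(a))
a = np.frombuffer(a, dtype='<u1')
# One zero-filled 32-bit slot per sample; the top byte stays 0
e = np.zeros(a.size // 3, np.dtype('<u4'))
# Copy byte i of every 3-byte sample into byte i of every 4-byte slot
for i in range(3):
    e.view(dtype='<u1')[i::4] = a.view(dtype='<u1')[i::3]
print(e)
# Outputs [2290 2284 2282 2284]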
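The same idea can be wrapped in a function that decodes a whole payload at once. This is a sketch under assumptions: `decode_int24_le` is a hypothetical name, and treating the samples as signed (two's complement) is a guess based on the data being voltages; pass `signed=False` if the device stores unsigned values.

```python
import numpy as np


def decode_int24_le(payload: bytes, signed: bool = True) -> np.ndarray:
    """Decode packed little-endian 24-bit samples into an int32 array."""
    usable = len(payload) - len(payload) % 3  # ignore a trailing partial sample
    raw = np.frombuffer(payload[:usable], dtype=np.uint8).reshape(-1, 3)
    # Byte 0 is the least significant byte, byte 2 the most significant.
    values = (raw[:, 0].astype(np.int32)
              | (raw[:, 1].astype(np.int32) << 8)
              | (raw[:, 2].astype(np.int32) << 16))
    if signed:
        # Assumption: two's complement, so values with bit 23 set are negative.
        values[values >= (1 << 23)] -= (1 << 24)
    return values


demo = bytes.fromhex("F20800EC0800EA0800EC0800")
values = decode_int24_le(demo)
print(values)  # [2290 2284 2282 2284]
```

If the three bytes are read in the order they appear instead, F2 08 00 becomes 0xF20800 = 15861760 rather than 0x0008F2 = 2290, which may be why the numbers in the question looked far too large. The resulting array can be written out for plotting with `np.savetxt`.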
Upvotes: 2