Reputation: 9
Ok. I have spent the last 14 hours trying to figure this out. I have a binary file with the following contents (there is much more; this is a truncated version). I wish to convert this to a readable string format.
^@^P<9A>^@^@^A^@^@И^@^@^A^@^@Κ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^F<9A>^@^@^@^@^@^@^C^@FQ]U:^@^M^@ ^B^@^E^@^@^@`ESC^B^@d^@^@^@^T^R^B^@^E^@^@^@^@^@^@^@^T^R^B^@^@^@^@^@^@^@^@^@^K^B^@^@^@^@^@^C^@HQ]U:^@^S^@^@^@(^@^@^@V^@^@2^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^C^@HQ]U:^@^V^@<8C>I^B^@^E^@^@^@O^B^@ ^@^@^@O^B^@^E^@^@^@^@^@^@^@O^B^@^@^@^@^@^@^@^@^@^RK^B^@^@^@^@^@^C^@HQ]U:^@^Y^@0^A^@d^@^@^@1^A^@<96>^@^@^@L0^A^@d^@^@^@^@^@^@ ^@71^A^@^@^@^@^@^@^@^@^@0^A^@^@^@^@^@^C^@=Q]U:^@"^@<92>T^@^@2^@^@^@CN^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^AT^@^@ ^@^@^@^@^C^@FQ]U:^@(^@$^M^A^@ ^@^@^@^G^A^@2^@^@^@^O^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@R^L^A^@^@^@^@^@^C^@=Q]U:^@.^@<85>^B ^@^@^G^@^@g^B^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@<85>^B^@^@^@^@^@^@^C^@HQ]U:^@4^@^CH^@^@^Y^@^@^@G^@^@d^@^@^@ H^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^CH^@^@^@^@^@^@^C^@HQ]U:^@O^@^M^@^@<89>^@^@^@^G^M^@^@^A^@^@^P^N^@^@^@^@^@^@^@^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^M^@^@^@^@^@^@^C^@HQ]U:^@R^@^B^@^@^A^@^@^B^@^@<8C>0^B^@^B^@^@^A^@^@^@^@^@^@^B^@^@^@^@^@^@^@^@^@^@^B^@^@^@^@^@^@^C^@HQ]U:^@d^@F^A^@ ^@^@^@^TJ^A^@ ^@^@^@<98>M^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@<8A>G^A^@^@^@^@^@^C^@HQ]U:^@y^@j;^@^@^A^@^@^@=;^@^@d^@^@^@(<^@^@^C^@^@^@^@^@^@P<^@^@^@^@^@^@^@^@^@^@=;^@^@^@^@^@^@^C^@FQ]U:^@<88>^@&^@^@^A^@^@^@&^@^@d^@^@^@'^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@&^@^@ ^@^@^@^@^C^@FQ]U:^@<94>^@^H^@^@^@^@^@^H^@^@d^@^@^@ ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^H^@^@^@^@^@^@^C^@HQ]U:^@<9A>^@w^@^@^A^@^@^@\^@^@^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Z^@^@^@^@^@^@^C^@HQ]U:^@<9D>^@^A^B^@ ^@^@^@^A^B^@^A^@^@^@^A^@ ^@^@^@^@^@^@^@"^A^@^@^@^@^@^@^@^@^@^A^@^@^@^@^@^C^@HQ]U:^@^@4I^@^@^A^@^@^@DH^@^@^A^@^@^@^]B^@^@<9E>^@^@^@^@^@^@^@I^@^@^@^@^@^@^@^@^@^@MI^@^@^@^@^@^@^C^@FQ]U:^@^@y^@^@^A^@^@^@^Xy^@^@^A^@^@^@]a^@^@^C^@^@^@^@^@^@^@Px^@^@^@^@^@^@^@^@^@^@wy^@^@^@^@^@^@^C^@HQ]U: 
^@^@V^^@^@^T^@^@^@^^@^@^A^@^@^@ZU^@^@e^@^@^@^@^@^@^@$^^@^@^@^@^@^@^@^@^@^@^^@^@^@^@^@^@^C^@DQ]U:^@^@DESC^@^@^A^@^@XESC^@ ^@^A^@^@^@<84>^\^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@<80>ESC^@^@^@^@^@^@^C^@HQ]U:^@^@ESC^@^@2^@^@^@ESC^@^@d^@^@^@ESC^@^@^@^@^@^@^@^@^@ESC^@^@^@^@^@^@^@^@^@^@ESC^@^@^@^@^@^@^C^@HQ]U:^@^@<8B>-^A^@^@^@^@@-^A^@<^@^@^@,^A^@@^@^@^@^@^@^@^@@-^A^@^@^@^@^@^@^@^@^@@-^A^@^@^@^@^@^C^@HQ]U:^@^@<86>^A^@^@U^@^@<86>^A^@^@@<9C>^@^@<90>^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ <86>^A^@^@^@^@^@^@^C^@FQ]U:^@^G^A^T^A^@ ^@^@^@Y^A^@Q^@^@^@^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^P^A^@^@^@^@^@^C^@HQ]U:^@^S^A^^^B^@^A^@^@^@<80>2^B^@ ^@^@^@^O^B^@n^@^@^@^@^@^@^@^^B^@^@^@^@^@^@^@^@^@^_^B^@^@^@^@^@^C^@DQ]U:^@^V^A4^A^@^@^@^@^P!^A^@K^@^@^@8D^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@*^A^@^@^@^@^@^C^@?Q]U:^@.^Aw^F^@^@^A^@^@^@h^F^@^@^O^@^@b^G^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@w^F^@^@^@^@^@^@^C^@HQ]U:^@1^A^A^@^A^@^@^@^A^@^\^B^@^@X^O^B^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^A^@^@^@^@^@^C^@HQ]U:^@4^A^X^F^@^@^G^@^@x^E^@^@^Z^D^@^@@^F^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^X^F^@^@^@^@^@^@^C^@FQ]U:^@=^A^L^F^@^A^@^@^@\^F^@^G^@^@^@X^F^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@S^F^@^@^@^@^@^C^@=Q]U:^@O^A^P!^A^@^@^@^@^@^@^@^@^@^A^@^@^@^H^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ "^A^@^@^@^@^@^C^@BQ]U:^@R^AX^@^@^Y^@^@^@^@^@^E^@^@^@x^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^C^@HQ]U:^@U^A^R^Q^@^@^A^@^@^@^P^@^@2^@^@^@^P^@^@^A^@^@^@^@^@^@^@^P^@^@^@^@^@^@^@^@^@^@^H^Q^@^@^@^@^@^@^C^@@Q]U:^@^^An^A^@^A^@^@^@pM ^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@hn^A^@^@^@^@^@^C^@HQ]U:^@p^A<9D>^A^@^B^@^@^@d<90>^A^@^E^@^@^@<90>^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^R<9C>^A^@^@^@^@^@^C^@HQ]U:^@s^A^A^@^Y^@^@^@ȩ^A^@^T^@^@^@а^A^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ ^@^@^@y^A^@^@^@^@^@^C^@HQ]U:^@|^A<8E>^@^@^A^@^@M<9E>^@^@d^@^@^@<^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
I have the template definition for this file as follows -
HEADER (total 8 bytes)
- Transcode: Short, 2 bytes
- Timestamp: Long, 4 bytes
- Message Length: Short, 2 bytes
DATA (total 50 bytes)
- Security Token: Short, 2 bytes
- Last Traded Price: Long, 4 bytes
- Best Buy Quantity: Long, 4 bytes
- Best Buy Price: Long, 4 bytes
- Best Sell Quantity: Long, 4 bytes
- Best Sell Price: Long, 4 bytes
- Total Traded Quantity: Long, 4 bytes
- Average Traded Price: Long, 4 bytes
- Open Price: Long, 4 bytes
- High Price: Long, 4 bytes
- Low Price: Long, 4 bytes
- Close Price: Long, 4 bytes
- Filler: Long, 4 bytes (blank)
I tried Perl's pack, unpack, and ord, reading byte by byte, and stripping out the "^@" characters to make sense of what remains, which seems to be hex code, but I am not able to make this readable as ASCII strings via Perl. I also tried raw mode, encoding, and decoding, and searched Stack Overflow thoroughly. There were a few similar problems, but none of the posters shared the template needed to decode the data. I have the template but still can't figure it out.
There is something basic that I am missing but can't really point out. I would really appreciate it if someone could show me, step by step with code, how this conversion is supposed to be done.
Have never done this before ...
$ xxd 1.bin
0000000: 0300 3b51 5d55 3a00 0700 f87f 0000 0100 ..;Q]U:.........
0000010: 0000 587f 0000 0100 0000 6b67 0000 0100 ..X.......kg....
0000020: 0000 0000 0000 587f 0000 0000 0000 0000 ......X.........
0000030: 0000 e880 0000 0000 0000 0300 4851 5d55 ............HQ]U
0000040: 3a00 0a00 109a 0000 f401 0000 d098 0000 :...............
0000050: f401 0000 ce9a 0000 0000 0000 0000 0000 ................
0000060: 0000 0000 0000 0000 0000 0000 069a 0000 ................
0000070: 0000 0000 0300 4651 5d55 3a00 0d00 a80a ......FQ]U:.....
0000080: 0200 0500 0000 601b 0200 6400 0000 1412 ......`...d.....
0000090: 0200 0500 0000 0000 0000 1412 0200 0000 ................
00000a0: 0000 0000 0000 ac0b 0200 0000 0000 0300 ................
00000b0: 4851 5d55 3a00 1300 f8f2 0000 2800 0000 HQ]U:.......(...
00000c0: 56c2 0000 3200 0000 fbf9 0000 0000 0000 V...2...........
00000d0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
00000e0: a3f2 0000 0000 0000 0300 4851 5d55 3a00 ..........HQ]U:.
00000f0: 1600 8c49 0200 0500 0000 cc4f 0200 0a00 ...I.......O....
0000100: 0000 cc4f 0200 0500 0000 0000 0000 cc4f ...O...........O
0000110: 0200 0000 0000 0000 0000 124b 0200 0000 ...........K....
Still doesn't make much sense.
Upvotes: 0
Views: 2427
Reputation: 44
The biggest issue I see is that there's more to unpacking binary data than just knowing "short" or "long".
For numeric values, you need to specify whether or not the byte data is in little or big endian order. You also need to know whether or not you're dealing with signed or unsigned values.
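To make the byte-order point concrete, here is a small sketch that decodes the first 8 bytes of the xxd dump in the question (the header) both ways. The hex string is copied from that dump; the variable names are mine, and the choice of unsigned formats is an assumption, not something the template states.

```perl
use strict;
use warnings;

# First 8 bytes of the xxd dump in the question (the header).
my $raw_header = pack("H*", "03003b515d553a00");

# Little endian, unsigned: v = 16-bit, V = 32-bit.
my @le = unpack("v V v", $raw_header);

# Big endian, unsigned: n = 16-bit, N = 32-bit.
my @be = unpack("n N n", $raw_header);

printf "little endian: transcode=%d timestamp=%d length=%d\n", @le;
printf "big endian:    transcode=%d timestamp=%d length=%d\n", @be;
```

Read as little endian, the message length comes out as 58, which equals the 8 + 50 bytes of the template; read as big endian it comes out as 14848. That suggests, but does not prove, that the file is little endian. If some fields turn out to be signed, Perl 5.10 and later accept `s<` and `l<` for signed little-endian shorts and longs.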
For the script below, I'm just going to assume everything is little endian and unsigned. That may be wrong, but it's up to you to tweak the pack templates once you find out. In case you need a link, try http://perldoc.perl.org/functions/pack.html
I haven't tested the script on my machine, so pardon any errors, but it is roughly how I would go about what you're trying to do.
#!/usr/bin/perl
use strict;
use warnings;

# Open in raw mode so no newline or encoding translation touches the bytes.
open my $in, '<:raw', "path/to/file.ext" or die "Cannot open file: $!";
# Read the 8-byte header: short, long, short.
(read($in, my $raw_header, 8) // 0) == 8 or die "Short read on header";
my @header = unpack("v V v", $raw_header); #unpack header into array
# Read the 50-byte data block: one short followed by twelve longs.
(read($in, my $raw_data, 50) // 0) == 50 or die "Short read on data";
my @data = unpack("v V12", $raw_data); #same as "vVVVVVVVVVVVV"; assumes little endian and unsigned
print join("\n", @header, @data), "\n"; #print all the values in order on their own lines
close $in;
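The script above reads only the first record. A sketch of how it might be extended to every record, with the values stored under field names, is below. It assumes each record is a fixed 58 bytes (8-byte header plus 50-byte data block) and that everything is little endian and unsigned. The helper names `decode_record` and `read_records` and the hash keys are made up for illustration, and the sample bytes are the first 58 bytes of the xxd dump in the question.

```perl
use strict;
use warnings;

# Field names taken from the template in the question, in order.
my @FIELDS = qw(
    transcode timestamp message_length security_token last_traded_price
    best_buy_qty best_buy_price best_sell_qty best_sell_price
    total_traded_qty avg_traded_price open_price high_price low_price
    close_price filler
);
my $RECORD_SIZE = 58;    # 8-byte header + 50-byte data block (assumed fixed)

# Decode one 58-byte record into a hash ref (little endian, unsigned assumed).
sub decode_record {
    my ($raw) = @_;
    my %rec;
    @rec{@FIELDS} = unpack("v V v v V12", $raw);
    return \%rec;
}

# Read fixed-size records until the handle runs out.
# A trailing partial record, if any, is silently dropped.
sub read_records {
    my ($fh) = @_;
    my @records;
    while ((read($fh, my $raw, $RECORD_SIZE) // 0) == $RECORD_SIZE) {
        push @records, decode_record($raw);
    }
    return @records;
}

# Demo: the first 58 bytes of the xxd dump, read through an in-memory filehandle.
my $sample = pack("H*", join "",
    "03003b515d553a000700f87f00000100",
    "0000587f0000010000006b6700000100",
    "000000000000587f0000000000000000",
    "0000e880000000000000");
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my @records = read_records($fh);
close $fh;

for my $rec (@records) {
    print join(", ", map { "$_=$rec->{$_}" } @FIELDS), "\n";
}
```

For a real file, replace the in-memory handle with `open my $fh, '<:raw', $path`. If the message length field ever differs from 58, the fixed-size assumption no longer holds, and the loop would need to read the header first and use its length to size the data read.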
Upvotes: 1