Reputation: 23
I'm building a script that reads 381 bytes from a file and attempts to decode the input. I am interested in 348 of those bytes, which I am labelling "presets". Each 3-byte chunk of the presets ByteString decodes into a single Int16, and "values" below are the 116 Int16s I am interested in...
decodeFile :: FilePath -> IO [Maybe PresetValue]
decodeFile filename =
  do h <- openFile (dir ++ filename) ReadMode
     header <- h `BL.hGet` 32
     presets <- h `BL.hGet` 348
     f7 <- h `BL.hGet` 1
     let values = Bin.runGet getPresets presets
     hClose h
     return values

getPresets = do
  empty <- Bin.isEmpty
  if empty
    then return []
    else do p <- getAndDecodeTriple
            ps <- getPresets
            return (p:ps)

getAndDecodeTriple = do
  b1 <- Bin.getWord8
  b2 <- Bin.getWord8
  b3 <- Bin.getWord8
  return $ decode (b1,b2,b3)
The problem I am having is decoding a 3-byte chunk, even though I know how it was encoded in C++.
Here is the C++ encoding:
void SysexReader::sx_encode(int val, char* dest)
{
    char encode;
    // Encode Byte 1 (4 bits of payload)
    encode = 0x40 | ((val >> 12) & 0x000F);
    *dest++ = encode;
    // Encode Byte 2 (6 bits of payload)
    encode = (val >> 6) & 0x003F;
    *dest++ = encode;
    // Encode Byte 3 (6 bits of payload)
    encode = val & 0x003F;
    *dest = encode;
}
Here is the C++ encoding translated to Haskell...
type Encoding a = (a,a,a)
type PresetValue = Int16

encode :: Integral a => PresetValue -> Encoding a
encode val =
  let f = fromIntegral
  in (f $ enc1 val, f $ enc2 val, f $ enc3 val)
  where
    enc1 = or40 . and000F . (flip shiftR 12)
      where and000F = (0x000F .&.)
            or40 = (0x40 .|.)
    enc2 = enc3 . flip shiftR 6
    enc3 = (0x003F .&.)
My attempt at decoding uses the fact that I have the encoding procedure and that I know a PresetValue can only be in the range 0 to 127:
-- (3 Sysex Bytes) -> (Preset Value) --
-------------------------------------------------------
decode :: Integral a => (a,a,a) -> Maybe PresetValue
decode encoded =
  case match of
    [value] -> Just value
    [] -> Nothing --error "encode not surjective"
    many -> error "encode not injective"
  where
    match = filter (\x -> encode x == encoded) [0..127]
Unfortunately I can't decode all values, as you can see from the 116-entry list below containing Nothing in many places.
[Just 14,Just 84,Just 97,Just 117,Just 114,Just 117,Just 115,Just 32,Just 73,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Nothing,Nothing,Just 0,Nothing,Nothing,Nothing,Just 0,Nothing,Nothing,Just 0,Just 0,Nothing,Nothing,Just 0,Just 1,Nothing,Just 0,Nothing,Nothing,Just 0,Just 0,Just 0,Just 1,
Just 0,Just 0,Nothing,Just 5,Just 0,Just 1,Just 0,Just 0,Just 0,Nothing,Nothing,
Just 3,Just 2,Just 0,Just 0,Nothing,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Nothing,Nothing,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Just 0,Nothing]
What am I doing wrong? I feel like it must be the types I am using to represent each chunk from the incoming file. Or maybe I'm losing information using fromIntegral.
I've been a developer for a while and have never posted a question on here and always fought through for an answer, but I'm really lost on this one. Thanks.
Upvotes: 1
Views: 239
Reputation: 50819
It might be better to use openBinaryFile in place of openFile. This shouldn't make a difference here, since I believe hGet ignores whether files have been opened in text or binary mode, but it's good practice.
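As a rough illustration of the binary-mode suggestion, here is a minimal sketch using withBinaryFile, which opens the file in binary mode and closes the handle even if reading fails. The helper name readSections is hypothetical, strict ByteStrings are used instead of the lazy ones in the question, and the section sizes (32, 348, 1) are simply the ones quoted in the question.

```haskell
import qualified Data.ByteString as BS
import System.IO (IOMode (ReadMode), withBinaryFile)

-- Section sizes taken from the question: 32 + 348 + 1 = 381 bytes.
headerLen, presetsLen :: Int
headerLen = 32
presetsLen = 348

-- Hypothetical helper: returns (header, presets, trailing byte).
-- Strict hGet reads the bytes before the handle is closed.
readSections :: FilePath -> IO (BS.ByteString, BS.ByteString, BS.ByteString)
readSections path =
  withBinaryFile path ReadMode $ \h -> do
    header  <- BS.hGet h headerLen
    presets <- BS.hGet h presetsLen
    f7      <- BS.hGet h 1
    return (header, presets, f7)
```

If the file is shorter than expected, hGet just returns fewer bytes, so checking the lengths of the three results is a cheap sanity test.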
It would also be better to use a Word16 in place of your Int16. The C code takes an int and packs 16 bits of it (4 + 6 + 6) into the three bytes, so the decoded 16-bit value is most naturally treated as unsigned. Again, if you really are only dealing with presets in the range [0..127] it shouldn't matter, but it seems like good practice.
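To make the Int16 versus Word16 point concrete, here is a small sketch of how the same 16 bits read under each type. The names asSigned and examples are illustrative only.

```haskell
import Data.Int (Int16)
import Data.Word (Word16)

-- Reinterpret the same 16 bits as a signed value.
asSigned :: Word16 -> Int16
asSigned = fromIntegral

-- Below 0x8000 both types agree; from 0x8000 upward Int16 wraps negative.
examples :: [(Word16, Int16)]
examples = [(w, asSigned w) | w <- [0, 127, 0x7fff, 0x8000, 0xffff]]
-- [(0,0),(127,127),(32767,32767),(32768,-32768),(65535,-1)]
```

For values in [0..127] the two types are indistinguishable, which is why the choice only matters if larger values can appear in the file.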
There's nothing obviously wrong with your code that I can see, but it's pretty much impossible to duplicate your problem without access to the input file. I might suggest using a better implementation of decode:
decode :: (Word8, Word8, Word8) -> Maybe PresetValue
decode (a,b,c)
  | 0x40 <= a && a <= 0x4f
    && b <= 0x3f && c <= 0x3f
  = Just $ (fromIntegral a .&. 0xf) `shiftL` 12 .|. fromIntegral b `shiftL` 6 .|. fromIntegral c
decode _ = Nothing
which handles all possible encoded preset values from 0 to 65535. If you still get Nothing values in your decode, then the encoded file is probably corrupt.
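One way to gain confidence in a direct decoder like this is to check that it inverts the encoder for every 16-bit value. The following is a sketch under assumptions: encode16 is my transcription of sx_encode from the question restricted to Word16, and the names encode16, decode16 and roundTrips are hypothetical.

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Word (Word16, Word8)

-- Mirrors sx_encode from the question: 4 + 6 + 6 payload bits,
-- with 0x40 OR-ed into the first byte.
encode16 :: Word16 -> (Word8, Word8, Word8)
encode16 v =
  ( 0x40 .|. fromIntegral ((v `shiftR` 12) .&. 0x0f)
  , fromIntegral ((v `shiftR` 6) .&. 0x3f)
  , fromIntegral (v .&. 0x3f)
  )

-- Direct decoder: validate the three bytes, then reassemble the payload.
decode16 :: (Word8, Word8, Word8) -> Maybe Word16
decode16 (a, b, c)
  | 0x40 <= a && a <= 0x4f && b <= 0x3f && c <= 0x3f =
      Just ((fromIntegral a .&. 0xf) `shiftL` 12 .|. fromIntegral b `shiftL` 6 .|. fromIntegral c)
  | otherwise = Nothing

-- True when decoding undoes encoding for all 65536 values.
roundTrips :: Bool
roundTrips = all (\v -> decode16 (encode16 v) == Just v) [minBound .. maxBound]
```

For example, decode16 (0x40, 0x01, 0x14) gives Just 84, and decode16 (0x40, 0x02, 0x00) gives Just 128. The second triple is well-formed, yet a search restricted to [0..127], like the one in the question, would report it as Nothing.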
It looks like the first bad value is at offset 19, corresponding to bytes 57-59 (0x39-0x3B) of the presets, or accounting for the 32-byte header, bytes 89-91 (0x59-0x5B) of the file. It might be helpful to open the file in a hex editor and see what three bytes are at that offset that are giving you trouble.
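If a hex editor isn't handy, the same inspection can be sketched in Haskell: split the presets bytes into triples and list the ones that fail the validity test. The function names below (validTriple, triples, badTriples, describe) are hypothetical, and the validity test simply repeats the guards from decode above.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word8)
import Text.Printf (printf)

-- Same validity test as the guards in decode above.
validTriple :: (Word8, Word8, Word8) -> Bool
validTriple (a, b, c) = 0x40 <= a && a <= 0x4f && b <= 0x3f && c <= 0x3f

-- Split the presets section into (byte offset, triple) pairs.
-- A trailing chunk shorter than three bytes is ignored.
triples :: BS.ByteString -> [(Int, (Word8, Word8, Word8))]
triples bs =
  [ (i, (BS.index bs i, BS.index bs (i + 1), BS.index bs (i + 2)))
  | i <- [0, 3 .. BS.length bs - 3]
  ]

-- Keep only the triples that would decode to Nothing.
badTriples :: BS.ByteString -> [(Int, (Word8, Word8, Word8))]
badTriples = filter (not . validTriple . snd) . triples

-- Render one entry; headerLen is the size of the header preceding the presets.
describe :: Int -> (Int, (Word8, Word8, Word8)) -> String
describe headerLen (off, (a, b, c)) =
  printf "presets offset %d (file offset 0x%02x): %02x %02x %02x" off (off + headerLen) a b c
```

Given the 348 presets bytes as a strict ByteString named presets, mapM_ (putStrLn . describe 32) (badTriples presets) would print one line per undecodable triple, showing the raw bytes to compare against what the encoder could have produced.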
Upvotes: 1