Reputation: 6778
I want to use a lazy Bytestring
to represent a stream of bits. I need to be able to take arbitrary slices of bits from this stream efficiently. For example, I might have a ByteString
of length 10, and I'd like slice a new ByteString
consisting of bits 24-36 from the original ByteString
.
The problem is that ByteStrings
are arrays of Word8
s, so taking ranges that are not multiples of 8 is difficult. The best I've been able to come up with is this, using Data.Binary
and Data.Binary.Bits
. Note that get32BitRange
is specifically for ranges <= 32.
get32BitRange :: Int -> Int -> ByteString -> ByteString
get32BitRange lo hi = runPut . putWord32be
. runGet (runBitGet . block $ word8 (8 - (lo `quot` 8)) *> word32be len)
. drop offset
where len = hi - lo
lo' = lo `div` 8
offset = fromIntegral lo' - 1
The algorithm is:
Word8
containing the bits I wantByteString
up to that indexWord8
, so skip thoseWord32
Word32
into a ByteString
It looks more than a little ugly, is there a more efficient way to grab arbitrary slices of bits from a ByteString
?
EDIT: here is a more efficient version
get32BitRange :: Int -> Int -> ByteString -> Word32
get32BitRange lo hi = runGet get
where get = runBitGet . block $ byteString byteOff *> word8 bitOff *> word32be len
len = hi - lo
(byteOff, bitOff) = lo `quotRem` 8
Upvotes: 2
Views: 928
Reputation: 6778
I'm going to mark this as resolved. This is what I ended up using:
get32BitRange :: Int -> Int -> ByteString -> Word32
get32BitRange lo hi = assert (lo < hi) $
runGet (runBitGet bitGet)
where bitGet = block $ byteString byteOff
*> word8 bitOff
*> word32be len
len = hi - lo
(byteOff, bitOff) = lo `quotRem` 8
Upvotes: 1
Reputation: 38893
I think other solutions are way better but you can use the Internal module to get at the underlying structure: http://hackage.haskell.org/packages/archive/bytestring/0.10.2.0/doc/html/src/Data-ByteString-Internal.html#ByteString
data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8) -- payload
{-# UNPACK #-} !Int -- offset
{-# UNPACK #-} !Int -- length
Then you can use standard pointer tools to generate ByteString
s pointing exactly where you want, by manipulating the ForeignPtr
directly...
Upvotes: 2
Reputation: 29100
You can't make this efficient with ByteString
as your API type, because it doesn't carry the information that the "bits" you want really start at some offset into the first byte.
Best bet is to make a wrapper type:
data BitStream =
BitStream {
info :: ByteString,
-- values from 0-7: ignore all bits in the first byte up to
-- but not including this offset
firstBitOffset :: !Int,to but not including this offset
-- values from 0-7: ignore all bits in the last byte after
-- but not including this offset
lastBitOffset :: !Int
}
Then you can design a bit-based API around this.
Upvotes: 1