Abhiroop Sarkar

Reputation: 2311

Why do the Data.Binary instances for ByteString add the length of the bytestring as a prefix?

Looking at the Binary instances for the various ByteString types, we find that put always writes the length of the bytestring before the bytes themselves. For example, see https://hackage.haskell.org/package/binary-0.8.8.0/docs/src/Data.Binary.Class.html#put

Taking an example

instance Binary B.ByteString where
    put bs = put (B.length bs) -- Why this??
             <> putByteString bs
    get    = get >>= getByteString

Is there any particular reason for doing this? And is the only way to write a ByteString without the length prefix to create our own newtype wrapper and give it a Binary instance?

Upvotes: 2

Views: 105

Answers (1)

willeM_ Van Onsem

Reputation: 476493

Is there any particular reason for doing this?

The idea of get and put is that you can combine several objects. For example you can write:

write_func :: ByteString -> Char -> Put
write_func some_bytestring some_char = do
    put some_bytestring
    put some_char

You then want to define a function that can read the data back, and evidently the two functions together should act as the identity: if the writer writes a certain ByteString and a certain Char, the reader should return that same ByteString and Char.

The reader function should look similar to:

read_fun :: Get (ByteString, Char)
read_fun = do
    bs <- get
    c <- get
    return (bs, c)

But the problem is: where does the ByteString end? The byte that encodes the Char (for example 'A') could just as well be part of the ByteString. You thus need to indicate somehow where the ByteString ends. This can be done by saving the length first, or by writing a marker at the end. In the case of a marker, you will need to "escape" the bytestring so that it cannot contain the marker itself.

Either way, you need some mechanism that specifies where the ByteString ends.
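As a minimal sketch of that round trip with the length-prefixed instances (the names writePair, readPair, roundTrip and encodedSize are made up for illustration and are not part of the binary package):

```haskell
import Data.Binary (get, put)
import Data.Binary.Get (Get, runGet)
import Data.Binary.Put (Put, runPut)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.ByteString.Lazy as BL
import Data.Int (Int64)

-- Writer: the Binary instance for ByteString writes the length first,
-- then the raw bytes; the Char follows immediately after.
writePair :: B.ByteString -> Char -> Put
writePair bs c = do
    put bs
    put c

-- Reader: the ByteString instance reads the length first, so it knows
-- exactly how many bytes belong to the ByteString before the Char starts.
readPair :: Get (B.ByteString, Char)
readPair = do
    bs <- get
    c <- get
    return (bs, c)

-- Writing and then reading should give back the original pair.
roundTrip :: B.ByteString -> Char -> (B.ByteString, Char)
roundTrip bs c = runGet readPair (runPut (writePair bs c))

-- Number of bytes the writer produces for a given pair.
encodedSize :: B.ByteString -> Char -> Int64
encodedSize bs c = BL.length (runPut (writePair bs c))

main :: IO ()
main = do
    print (roundTrip (BC.pack "helloA") 'A')
    print (encodedSize (BC.pack "hello") 'A')
```

Note that the bytestring "helloA" itself ends in 'A'; without the length prefix the reader would have no way to tell that trailing 'A' apart from the Char that follows. With binary-0.8.x, where an Int is serialized as a 64-bit value, the encoding of "hello" and 'A' should take 8 bytes for the length, 5 for the contents and 1 for the character.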

And is the only way to write a ByteString without the length prefix to create our own newtype wrapper and give it a Binary instance?

No; the function you need already appears in the instance definition. If you want to write a ByteString without the length prefix, you can use putByteString :: ByteString -> Put:

write_func :: ByteString -> Char -> Put
write_func some_bytestring some_char = do
    putByteString some_bytestring
    put some_char

When reading the ByteString back, however, you will need to determine by some other means how many bytes to read.
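One possible way to read such data back, assuming the reader already knows the length from elsewhere (a fixed-size field, say), is to pass it to getByteString from Data.Binary.Get. The names writeRaw, readRaw and rawRoundTrip below are hypothetical, and this is only a sketch of the idea:

```haskell
import Data.Binary (get, put)
import Data.Binary.Get (Get, getByteString, runGet)
import Data.Binary.Put (Put, putByteString, runPut)
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- Writer: raw bytes with no length prefix, followed by the Char.
writeRaw :: B.ByteString -> Char -> Put
writeRaw bs c = do
    putByteString bs
    put c

-- Reader: the caller must supply the number of bytes to read,
-- because nothing in the stream says where the ByteString ends.
readRaw :: Int -> Get (B.ByteString, Char)
readRaw n = do
    bs <- getByteString n
    c <- get
    return (bs, c)

-- Round trip that works only because both sides agree on the length.
rawRoundTrip :: B.ByteString -> Char -> (B.ByteString, Char)
rawRoundTrip bs c = runGet (readRaw (B.length bs)) (runPut (writeRaw bs c))

main :: IO ()
main = print (rawRoundTrip (BC.pack "hello") 'A')
```

If the ByteString is the last thing in the stream, another option is getRemainingLazyByteString from Data.Binary.Get, which consumes all remaining input as a lazy ByteString.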

Upvotes: 3
