Deserialize binary encoded enum

Question

I receive checksums as part of a binary protocol, which are represented in Rust in the following form:

enum Crc {
    Crc16([u8; 16]),
    Crc32([u8; 32]),
    Crc64([u8; 64]),
}

I receive the enum variants encoded as byte arrays: a leading u8 discriminant followed by the checksum bytes. bincode cannot handle u8 variant discriminants (see internal discussion) in conjunction with other fixed-size types, and serde only supports arrays up to length 32, so I would like to implement a Deserializer myself. I know I have to deserialize the array manually, using a sequence visitor invoked through Deserializer::deserialize_tuple, but how can I handle the different variants?

Anders Evensen · Accepted Answer

There is no need to implement a custom Deserializer for this. You can still use bincode's Deserializer as long as you provide a custom Deserialize implementation to indicate the data you expect to receive.

Essentially, there are two separate pieces of data you expect: the variant discriminant (as a byte) and the checksum bytes (of varying size, depending on the variant). You can model this as a 2-tuple of data: (discriminant: u8, checksum_bytes: [u8; LEN]) for some LEN that is either 16, 32, or 64.
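
To make the layout concrete, here is a minimal std-only sketch of how one such frame could be built. The helper name encode_frame is hypothetical, and the sketch assumes the discriminant byte is immediately followed by the raw checksum bytes with no length prefix or padding, as the question describes.

```rust
/// Builds one wire frame: a single discriminant byte followed directly by
/// the checksum bytes. `encode_frame` is a hypothetical helper used only
/// for illustration; it is not part of serde or bincode.
fn encode_frame<const LEN: usize>(discriminant: u8, checksum_bytes: &[u8; LEN]) -> Vec<u8> {
    let mut frame = Vec::with_capacity(1 + LEN);
    // First element of the 2-tuple: the discriminant.
    frame.push(discriminant);
    // Second element: the fixed-size checksum, written without a length prefix.
    frame.extend_from_slice(checksum_bytes);
    frame
}
```

A frame for a 32-byte checksum is therefore 33 bytes long: one discriminant byte plus the payload.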

We can treat each piece separately:

Variant

You're right that bincode can't, by default, treat variant discriminants as u8s. But that doesn't mean we can't define a Deserialize implementation to treat them as u8s ourselves.

use std::fmt;

use serde::de::{self, Deserialize, Deserializer, SeqAccess, Unexpected, Visitor};

enum Variant {
    Crc16,
    Crc32,
    Crc64,
}

impl<'de> Deserialize<'de> for Variant {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct VariantVisitor;

        impl<'de> Visitor<'de> for VariantVisitor {
            type Value = Variant;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("a single discriminant byte, either a 0, 1, or 2")
            }

            fn visit_u8<E>(self, byte: u8) -> Result<Self::Value, E>
            where
                E: de::Error,
            {
                // Map the raw discriminant byte to its variant.
                match byte {
                    0 => Ok(Variant::Crc16),
                    1 => Ok(Variant::Crc32),
                    2 => Ok(Variant::Crc64),
                    _ => Err(E::invalid_value(Unexpected::Unsigned(byte.into()), &self)),
                }
            }
        }

        // Ask the deserializer for exactly one u8.
        deserializer.deserialize_u8(VariantVisitor)
    }
}

Now we have defined our own variant type (as opposed to the one derived by serde_derive) that deserializes a single byte as the variant. We can treat this Variant type as the first type in our 2-tuple.
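
As a side note, the byte-to-variant mapping can also be written without serde as a TryFrom<u8> implementation. This is only a sketch: the choice of u8 as the error type is an assumption made here for brevity, and the 0/1/2 numbering is the same one used in the visitor above.

```rust
use std::convert::TryFrom;

#[derive(Debug, PartialEq)]
enum Variant {
    Crc16,
    Crc32,
    Crc64,
}

impl TryFrom<u8> for Variant {
    // Assumption for this sketch: an unrecognised byte is handed back as the error.
    type Error = u8;

    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0 => Ok(Variant::Crc16),
            1 => Ok(Variant::Crc32),
            2 => Ok(Variant::Crc64),
            other => Err(other),
        }
    }
}
```

With such an impl in place, visit_u8 could delegate to Variant::try_from(byte) and map the returned byte into E::invalid_value, keeping the numbering in a single place.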

Checksum Bytes

As you mentioned, serde only provides implementations for arrays up to length 32. We can use those implementations for the 16 and 32 byte arrays, but we'll need to define our own type to deserialize an array of 64 bytes:

struct Crc64([u8; 64]);

impl<'de> Deserialize<'de> for Crc64 {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct Crc64Visitor;

        impl<'de> Visitor<'de> for Crc64Visitor {
            type Value = Crc64;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("an array of length 64")
            }

            fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
            where
                A: SeqAccess<'de>,
            {
                let mut value = [0; 64];
                // Pull the bytes one at a time; a short input reports the index where it ended.
                for i in 0..64 {
                    value[i] = seq
                        .next_element()?
                        .ok_or_else(|| de::Error::invalid_length(i, &self))?;
                }
                Ok(Crc64(value))
            }
        }

        // A fixed-size tuple of 64 elements, so no length prefix is expected.
        deserializer.deserialize_tuple(64, Crc64Visitor)
    }
}

This deserializes an array of 64 bytes into a Crc64 struct. The approach is very similar to the one serde itself uses for arrays of up to 32 elements.
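
The element-by-element loop can be pictured without serde as filling a fixed-size array from an iterator. The sketch below is std-only; fill_array is a hypothetical helper, and the usize it returns on failure plays the role of the index passed to invalid_length.

```rust
/// Pulls exactly `N` bytes from `bytes` into an array. If the source runs
/// out early, returns the index at which it ended. `fill_array` is a
/// hypothetical helper for illustration only.
fn fill_array<I, const N: usize>(mut bytes: I) -> Result<[u8; N], usize>
where
    I: Iterator<Item = u8>,
{
    let mut value = [0u8; N];
    for (i, slot) in value.iter_mut().enumerate() {
        // `None` from the iterator corresponds to `next_element` returning `None`.
        *slot = bytes.next().ok_or(i)?;
    }
    Ok(value)
}
```

Because the length is a const generic parameter, the same function covers 16, 32, and 64 bytes, for example fill_array::<_, 64>(source).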

Putting it Together

Now we have every part we need: we can deserialize the variant discriminant, and we can deserialize the byte array for any of the required sizes. The last step is to tell serde that we expect a 2-tuple, and then deserialize that data in the visitor.

impl<'de> Deserialize<'de> for Crc {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct CrcVisitor;

        impl<'de> Visitor<'de> for CrcVisitor {
            type Value = Crc;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("a single discriminant byte followed by a checksum byte array")
            }

            fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
            where
                A: SeqAccess<'de>,
            {
                // The first element is the discriminant; it selects the type of the second.
                match seq
                    .next_element::<Variant>()?
                    .ok_or_else(|| de::Error::invalid_length(0, &self))?
                {
                    Variant::Crc16 => Ok(Crc::Crc16(
                        seq.next_element()?
                            .ok_or_else(|| de::Error::invalid_length(1, &self))?,
                    )),
                    Variant::Crc32 => Ok(Crc::Crc32(
                        seq.next_element()?
                            .ok_or_else(|| de::Error::invalid_length(1, &self))?,
                    )),
                    // 64 bytes exceeds serde's built-in array impls, so go through the Crc64 wrapper.
                    Variant::Crc64 => Ok(Crc::Crc64(
                        seq.next_element::<Crc64>()?
                            .ok_or_else(|| de::Error::invalid_length(1, &self))?
                            .0,
                    )),
                }
            }
        }

        deserializer.deserialize_tuple(2, CrcVisitor)
    }
}

This first deserializes the variant into our Variant type, and then, based on the discriminant found, deserializes the remaining bytes into the correct length. Note that we had to use our Crc64 type to deserialize the 64-byte array.
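
For comparison, the same dispatch can be sketched by hand over a byte slice, without serde or bincode. The names decode_crc, checksum, and DecodeError are hypothetical. Unlike the visitor above, this sketch assumes the slice holds exactly one frame and rejects trailing bytes.

```rust
use std::convert::TryInto;

#[derive(Debug, PartialEq)]
enum Crc {
    Crc16([u8; 16]),
    Crc32([u8; 32]),
    Crc64([u8; 64]),
}

// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum DecodeError {
    Empty,
    UnknownDiscriminant(u8),
    WrongLength { expected: usize, found: usize },
}

/// Converts the payload into an array of exactly `N` bytes.
fn checksum<const N: usize>(rest: &[u8]) -> Result<[u8; N], DecodeError> {
    rest.try_into().map_err(|_| DecodeError::WrongLength {
        expected: N,
        found: rest.len(),
    })
}

/// Reads the discriminant byte, then decodes the rest at the matching length.
fn decode_crc(frame: &[u8]) -> Result<Crc, DecodeError> {
    let (&discriminant, rest) = frame.split_first().ok_or(DecodeError::Empty)?;
    match discriminant {
        0 => checksum::<16>(rest).map(Crc::Crc16),
        1 => checksum::<32>(rest).map(Crc::Crc32),
        2 => checksum::<64>(rest).map(Crc::Crc64),
        other => Err(DecodeError::UnknownDiscriminant(other)),
    }
}
```

The serde version has the advantage that Crc can be nested inside larger derived structs and read from any compatible Deserializer, which a hand-written slice decoder like this does not give you.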

The full code is available at this playground. Note that you won't actually be able to test this with bincode on the playground, because bincode is not available as a dependency there, but it should work correctly with bincode version 1.3.3, which is the version I tested on.
