Reputation: 765
I'm trying to deserialise a binary format (OpenType) which consists of data in multiple tables (binary structs). I would like to be able to deserialise the tables independently (because of how they're stored in the top-level file structure; imagine them being in separate files, so they have to be deserialised separately), but sometimes there are dependencies between them.
A simple example is the loca table, which contains an array of either 16-bit or 32-bit offsets, depending on the value of the indexToLocFormat field in the head table. As a more complex example, these loca table offsets are in turn used as offsets into the binary data of the glyf table to locate elements. So I need to get indexToLocFormat and loca: Vec<u32> "into" the deserializer somehow.
Obviously I need to implement Deserialize myself and write visitors, and I've got my head around doing that. When there are dependencies from a table to a subtable, I've also been able to work that out using deserialize_seed inside the table's visitor. But I don't know how to apply that to pass information between tables.
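For reference, the head/loca dependency described above can be written down without serde at all. This is only a minimal sketch using the standard library; decode_loca is a hypothetical helper name, and it assumes the big-endian layout OpenType uses, where the short format stores each offset divided by two.

```rust
/// Decode a raw `loca` table into byte offsets.
/// Hypothetical helper for illustration; not part of serde.
/// `is_32_bit` stands in for the flag derived from `indexToLocFormat`.
fn decode_loca(bytes: &[u8], is_32_bit: bool) -> Option<Vec<u32>> {
    if is_32_bit {
        // Long format: big-endian u32 values are the offsets themselves.
        if bytes.len() % 4 != 0 {
            return None;
        }
        Some(
            bytes
                .chunks_exact(4)
                .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
        )
    } else {
        // Short format: big-endian u16 values store the offset divided by 2.
        if bytes.len() % 2 != 0 {
            return None;
        }
        Some(
            bytes
                .chunks_exact(2)
                .map(|c| u32::from(u16::from_be_bytes([c[0], c[1]])) * 2)
                .collect(),
        )
    }
}
```

The same bytes decode differently depending on the flag, which is exactly why the flag has to reach the code that reads the loca table.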
I think I need to store what is essentially configuration information (the value of indexToLocFormat, the array of offsets) when constructing my deserializer object:
pub struct Deserializer<'de> {
    input: &'de [u8],
    ptr: usize,
    locaShortVersion: Option<bool>,
    glyfOffsets: Option<Vec<u32>>,
    ...
}
The problem is that I don't know how to retrieve that information when I'm inside the Visitor impl for the struct; I don't know how to get at the deserializer object at all, let alone how to type things so that I get at my Deserializer object with the configuration fields, not just a generic serde::de::Deserializer:
impl<'de> Visitor<'de> for LocaVisitor {
    type Value = Vec<u32>;
    fn expecting(&self, formatter: &mut std::fmt::Formatter) -> std::fmt::Result {
        write!(formatter, "A loca table")
    }
    fn visit_seq<A: SeqAccess<'de>>(self, mut seq: A) -> Result<Self::Value, A::Error> {
        let locaShortVersion = /* what goes here? */;
        if locaShortVersion {
            // Short format: read u16 values and widen them to u32.
            Ok(seq.next_element::<Vec<u16>>()?
                .ok_or_else(|| serde::de::Error::custom("Oops"))?
                .into_iter()
                .map(|x| x as u32)
                .collect())
        } else {
            // Long format: the values are already u32.
            Ok(seq.next_element::<Vec<u32>>()?
                .ok_or_else(|| serde::de::Error::custom("Oops"))?)
        }
    }
}
(terrible code here; if you're wondering why I'm writing Yet Another OpenType Parsing Crate, it's because I want to both read and write font files.)
Upvotes: 3
Views: 383
Reputation: 765
Actually, I think I've got it. The trick is to do the deserialization in stages. Rather than calling the deserializer module's from_bytes function (which wraps the struct creation and the T::deserialize call), do this instead:
use serde::de::DeserializeSeed; // Having this trait in scope is also key
let mut de = Deserializer::from_bytes(&binary_loca_table);
let ssd = SomeSpecialistDeserializer { ... configuration goes here .. };
let loca_table: Vec<u32> = ssd.deserialize(&mut de).unwrap();
In this case, I use a LocaDeserializer defined like so:
use serde::de::{DeserializeSeed, SeqAccess, Visitor};
use std::fmt;

// The field is public so the seed can be built from outside this module.
pub struct LocaDeserializer { pub locaIs32Bit: bool }
impl<'de> DeserializeSeed<'de> for LocaDeserializer {
    type Value = Vec<u32>;
    fn deserialize<D>(self, deserializer: D) -> std::result::Result<Self::Value, D::Error>
    where
        D: serde::de::Deserializer<'de>,
    {
        // The visitor carries the configuration copied from the seed.
        struct LocaDeserializerVisitor {
            locaIs32Bit: bool,
        }
        impl<'de> Visitor<'de> for LocaDeserializerVisitor {
            type Value = Vec<u32>;
            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                write!(formatter, "a loca table")
            }
            fn visit_seq<A>(self, mut seq: A) -> std::result::Result<Vec<u32>, A::Error>
            where
                A: SeqAccess<'de>,
            {
                // This assumes the custom Deserializer hands over the whole
                // offset array as a single sequence element.
                if self.locaIs32Bit {
                    // Long format: the offsets are stored as-is.
                    Ok(seq
                        .next_element::<Vec<u32>>()?
                        .ok_or_else(|| serde::de::Error::custom("Expecting a 32 bit glyph offset"))?)
                } else {
                    // Short format: each u16 stores the offset divided by two.
                    Ok(seq
                        .next_element::<Vec<u16>>()?
                        .ok_or_else(|| serde::de::Error::custom("Expecting a 16 bit glyph offset"))?
                        .iter()
                        .map(|x| (*x as u32) * 2)
                        .collect())
                }
            }
        }
        deserializer.deserialize_seq(LocaDeserializerVisitor {
            locaIs32Bit: self.locaIs32Bit,
        })
    }
}
And now:
fn loca_de() {
    let binary_loca = vec![
        0x00, 0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x1a,
    ];
    let mut de = Deserializer::from_bytes(&binary_loca);
    let cs: loca::LocaDeserializer = loca::LocaDeserializer { locaIs32Bit: false };
    let floca: Vec<u32> = cs.deserialize(&mut de).unwrap();
    println!("{:?}", floca);
    // [2, 0, 2, 0, 0, 52]
    let mut de = Deserializer::from_bytes(&binary_loca);
    let cs: loca::LocaDeserializer = loca::LocaDeserializer { locaIs32Bit: true };
    let floca: Vec<u32> = cs.deserialize(&mut de).unwrap();
    println!("{:?}", floca);
    // [65536, 65536, 26]
}
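The seed needs its configuration from somewhere, and the decoded offsets in turn configure the next stage. As a rough, serde-free sketch of those two hand-offs: loca_is_32_bit and glyph_bytes are hypothetical helper names, the offset 50 is the position of indexToLocFormat in the head table layout, and glyph i is taken to span loca[i]..loca[i + 1] in the glyf data.

```rust
/// Byte offset of the `indexToLocFormat` field inside the `head` table.
const INDEX_TO_LOC_FORMAT_OFFSET: usize = 50;

/// Stage 1: read the flag that configures the loca stage.
/// Hypothetical helper. Returns Some(true) for 32-bit offsets,
/// Some(false) for 16-bit offsets, and None if the table is
/// truncated or the field holds a value other than 0 or 1.
fn loca_is_32_bit(head: &[u8]) -> Option<bool> {
    let field = head.get(INDEX_TO_LOC_FORMAT_OFFSET..INDEX_TO_LOC_FORMAT_OFFSET + 2)?;
    match i16::from_be_bytes([field[0], field[1]]) {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Stage 3: use the decoded loca offsets to slice one glyph out of `glyf`.
/// Hypothetical helper. Returns None if the index or the offsets
/// fall outside the data.
fn glyph_bytes<'a>(glyf: &'a [u8], loca: &[u32], index: usize) -> Option<&'a [u8]> {
    let start = *loca.get(index)? as usize;
    let end = *loca.get(index.checked_add(1)?)? as usize;
    // `get` also rejects start > end and end > glyf.len().
    glyf.get(start..end)
}
```

The boolean from the first function would go into LocaDeserializer, and the Vec<u32> it produces would be handed to whatever deserializes the glyf table, in the same way.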
Serde is very nice - once you have got your head around it.
Upvotes: 0