mycognosist

Reputation: 143

Enforce strict ordering when deserializing JSON with serde

I want to deserialize a string of JSON data into a struct with multiple fields and return an error if the ordering of the serialized data does not match the order of the fields in the struct.

I have read through the serde documentation, including the section on custom serialization, but cannot find a solution. I imagine it might be possible to enforce strict ordering by implementing Deserializer with field name checks but I'm not entirely sure about this.

An example following the format of the serde_json docs:

#[derive(Serialize, Deserialize)]
struct Person {
    name: String,
    age: u8,
    phones: Vec<String>,
}

let correct_order = r#"
    {
        "name": "John Doe",
        "age": 43,
        "phones": [
            "+44 1234567",
            "+44 2345678"
        ]
    }"#;

// this deserializes correctly (no error)
let p: Person = serde_json::from_str(correct_order)?;

let incorrect_order = r#"
    {
        "age": 43,
        "phones": [
            "+44 1234567",
            "+44 2345678"
        ],
        "name": "John Doe"
    }"#;

// how to ensure this returns an error? (data fields out of order)
let p2: Person = serde_json::from_str(incorrect_order)?;

Upvotes: 6

Views: 4054

Answers (1)

Anders Evensen

Reputation: 881

You can do this by providing a custom Deserialize implementation.

For JSON, struct deserialization goes through Visitor::visit_map(). A derived implementation (#[derive(Deserialize)]) accepts the fields in whatever order they appear in the input. To enforce strict ordering, we write the visitor by hand so that it expects each key in turn and returns an error as soon as a different key shows up.

use serde::{
    de,
    de::{Deserialize, Deserializer, MapAccess, Visitor},
};
use std::fmt;

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
    phones: Vec<String>,
}

impl<'de> Deserialize<'de> for Person {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Some boilerplate logic for deserializing the fields.
        enum Field {
            Name,
            Age,
            Phones,
        }

        impl<'de> Deserialize<'de> for Field {
            fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
            where
                D: Deserializer<'de>,
            {
                struct FieldVisitor;

                impl<'de> Visitor<'de> for FieldVisitor {
                    type Value = Field;

                    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                        formatter.write_str("name, age, or phones")
                    }

                    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
                    where
                        E: de::Error,
                    {
                        match v {
                            "name" => Ok(Field::Name),
                            "age" => Ok(Field::Age),
                            "phones" => Ok(Field::Phones),
                            _ => Err(E::unknown_field(v, FIELDS)),
                        }
                    }
                }

                deserializer.deserialize_identifier(FieldVisitor)
            }
        }

        // Logic for actually deserializing the struct itself.
        struct PersonVisitor;

        impl<'de> Visitor<'de> for PersonVisitor {
            type Value = Person;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("struct Person with fields in order of name, age, and phones")
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: MapAccess<'de>,
            {
                // Deserialize name.
                let name = match map.next_key()? {
                    Some(Field::Name) => Ok(map.next_value()?),
                    Some(_) => Err(de::Error::missing_field("name")),
                    None => Err(de::Error::invalid_length(0, &self)),
                }?;

                // Deserialize age.
                let age = match map.next_key()? {
                    Some(Field::Age) => Ok(map.next_value()?),
                    Some(_) => Err(de::Error::missing_field("age")),
                    None => Err(de::Error::invalid_length(1, &self)),
                }?;

                // Deserialize phones.
                let phones = match map.next_key()? {
                    Some(Field::Phones) => Ok(map.next_value()?),
                    Some(_) => Err(de::Error::missing_field("phones")),
                    None => Err(de::Error::invalid_length(2, &self)),
                }?;

                Ok(Person { name, age, phones })
            }
        }

        const FIELDS: &[&str] = &["name", "age", "phones"];
        deserializer.deserialize_struct("Person", FIELDS, PersonVisitor)
    }
}

There's a lot of boilerplate here (that is normally hidden behind #[derive(Deserialize)]):

  • First we define an internal enum Field to deserialize the struct's field names, with its own Deserialize implementation. This is the standard implementation; we just write it out by hand here.
  • Then we define a PersonVisitor to provide our Visitor trait implementation. Its visit_map() is where the ordering of the fields is enforced.

You can see that this now works as expected. The following code:

fn main() {
    let correct_order = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;

    // this deserializes correctly (no error)
    let p: serde_json::Result<Person> = serde_json::from_str(correct_order);
    dbg!(p);

    let incorrect_order = r#"
        {
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ],
            "name": "John Doe"
        }"#;

    // this now returns an error (data fields out of order)
    let p2: serde_json::Result<Person> = serde_json::from_str(incorrect_order);
    dbg!(p2);
}

prints this output:

[src/main.rs:114] p = Ok(
    Person {
        name: "John Doe",
        age: 43,
        phones: [
            "+44 1234567",
            "+44 2345678",
        ],
    },
)
[src/main.rs:128] p2 = Err(
    Error("missing field `name`", line: 3, column: 17),
)
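
Stripped of the serde traits, the check in visit_map() amounts to walking the incoming keys in lockstep with the expected field list and failing at the first mismatch. Here is a minimal std-only sketch of that idea. check_order and OrderError are hypothetical names for illustration, not part of serde, and the slice of keys stands in for the successive map.next_key() calls. Unlike the visitor above, this sketch also rejects extra keys after the last expected field.

```rust
/// Where the incoming key order diverged from the expected order.
/// (Hypothetical type for illustration; not part of serde.)
#[derive(Debug, PartialEq)]
enum OrderError {
    /// A different key appeared where `expected` should have been.
    Unexpected {
        position: usize,
        expected: &'static str,
        found: String,
    },
    /// The input ended before `expected` was seen.
    Missing {
        position: usize,
        expected: &'static str,
    },
    /// An extra key followed the last expected field.
    Trailing { position: usize, found: String },
}

/// Compare `keys` (in input order) against `expected`, position by position,
/// and return the first divergence.
fn check_order(keys: &[&str], expected: &[&'static str]) -> Result<(), OrderError> {
    for (position, &want) in expected.iter().enumerate() {
        match keys.get(position) {
            // Key matches the field expected at this position: keep going.
            Some(&found) if found == want => {}
            // Some other key showed up first.
            Some(&found) => {
                return Err(OrderError::Unexpected {
                    position,
                    expected: want,
                    found: found.to_string(),
                })
            }
            // Input ran out before all expected fields were seen.
            None => {
                return Err(OrderError::Missing {
                    position,
                    expected: want,
                })
            }
        }
    }
    // Anything left over after the expected fields is also an error.
    match keys.get(expected.len()) {
        Some(&found) => Err(OrderError::Trailing {
            position: expected.len(),
            found: found.to_string(),
        }),
        None => Ok(()),
    }
}
```

With the expected list &["name", "age", "phones"], the keys &["name", "age", "phones"] pass, while &["age", "phones", "name"] fail at position 0 with OrderError::Unexpected, which mirrors the "missing field `name`" error reported for the first key in the output above.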

Upvotes: 1
