Peter Bašista

Reputation: 907

Pydantic 2: JSON serialize a set to a sorted list

Context

In Pydantic 2, fields of type set are already JSON-serialized to lists. However, the order of items in these lists is not defined. More specifically, the items appear in the internal iteration order of the original set.

Unfortunately, even when two sets contain the same items, their internal ordering might still be different. Consequently, serializing sets without explicitly ordering them produces nondeterministic results.
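
To illustrate the point, here is a minimal sketch using only the standard library (no Pydantic involved). The values 0 and 8 are arbitrary picks for the demonstration. Whether the two unsorted dumps actually differ depends on the interpreter, which is exactly the problem; only the sorted dumps are guaranteed to match.

```python
import json

# Build two equal sets, inserting the same items in a different order.
a = set()
for item in (0, 8):
    a.add(item)

b = set()
for item in (8, 0):
    b.add(item)

# The sets compare equal regardless of insertion order.
print(a == b)

# Iteration order is an implementation detail, so these two strings
# may or may not be identical.
plain_a = json.dumps(list(a))
plain_b = json.dumps(list(b))
print(plain_a, plain_b)

# Sorting first gives a deterministic result for both sets.
sorted_a = json.dumps(sorted(a))
sorted_b = json.dumps(sorted(b))
print(sorted_a, sorted_b)
```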

Question

I am looking for a way to configure a particular Pydantic 2 model so that every field of type set is JSON-serialized to a sorted list before the outcome is converted to a string. I would like to avoid defining a custom set type or adding a custom type annotation to every such attribute. The solution should be more generic, because the number of attributes that need this kind of handling is large. Moreover, they might also be defined in subclasses of that particular model.

Is there a reasonably simple way to achieve this?

Options

It seems to me that using a model serializer is the most straightforward way to do it. But at the same time it seems cumbersome to me to loop through all the attributes, check their type, call the serialization routines for the items and then sort the results into a list. If possible, I would like to avoid that and leverage Pydantic's knowledge about the attribute types in some way.

In Pydantic 1, the desired effect could be achieved by using the json_encoders parameter of the configuration and defining a custom serialization function for all attributes of type set. However, in Pydantic 2 this option has been removed due to "performance overhead and implementation complexity". It seems understandable.

I am not necessarily looking for a way to mimic the behavior of Pydantic 1. If there is a way to achieve a similar effect using primary or recommended Pydantic 2 features, I would prefer to use it.

It seems that at some point of JSON serialization, Pydantic is converting the sets to lists anyway. In an ideal case, I would like to somehow tap into this conversion and merely call sorted on its outcome.

Upvotes: 3

Views: 2144

Answers (1)

Icarus

Reputation: 1864

Another approach is probably more cumbersome than what you hoped for, and than the model_serializer you proposed, but it targets only explicitly selected attributes:

In Pydantic 2 (2.6, to be precise), a set can be serialized as a sorted list with the @field_serializer decorator (source: Pydantic documentation > Functional serializers). Here is the example given in the referenced documentation:

from typing import Set

from pydantic import BaseModel, field_serializer

class StudentModel(BaseModel):
    name: str = 'Jane'
    courses: Set[str]

    @field_serializer('courses', when_used='json')
    def serialize_courses_in_order(courses: Set[str]):
        return sorted(courses)

student = StudentModel(courses={'Math', 'Chemistry', 'English'})
print(student.model_dump_json())
#> {"name":"Jane","courses":["Chemistry","English","Math"]}

The attribute courses is serialized as a sorted list.

You can also apply a field_serializer to multiple attributes:

class AnotherModel(BaseModel):
    a: Set[str]
    b: Set[str]

    @field_serializer('a', 'b', when_used='json')
    def serialize_sets(set_of_str: Set[str]):
        return sorted(set_of_str)
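
If listing every attribute by name is not an option, the model_serializer route mentioned in the question can be kept fairly short by using wrap mode, so Pydantic still does the per-item serialization. The following is only a sketch under some assumptions: SortedSetsModel, Student and GradStudent are hypothetical names, only top-level fields are handled (not sets nested inside other containers), field aliases are not used, and the serialized items of each set must be mutually comparable.

```python
from typing import Any, Dict, Set

from pydantic import BaseModel, SerializerFunctionWrapHandler, model_serializer


class SortedSetsModel(BaseModel):
    """Hypothetical base class: sorts top-level set fields in JSON output."""

    @model_serializer(mode='wrap', when_used='json')
    def sort_set_fields(self, handler: SerializerFunctionWrapHandler) -> Dict[str, Any]:
        # Let Pydantic serialize the model first; in JSON mode the sets
        # have already been converted to (unordered) lists at this point.
        data = handler(self)
        for name in type(self).model_fields:
            # Look at the original attribute to find out which fields were sets.
            if name in data and isinstance(getattr(self, name), (set, frozenset)):
                data[name] = sorted(data[name])
        return data


class Student(SortedSetsModel):
    name: str = 'Jane'
    courses: Set[str]


class GradStudent(Student):
    # A set field declared in a subclass is picked up as well.
    labs: Set[int] = set()


student = GradStudent(courses={'Math', 'Chemistry', 'English'}, labs={3, 1, 2})
print(student.model_dump_json())
#> {"name":"Jane","courses":["Chemistry","English","Math"],"labs":[1,2,3]}
```

Because of when_used='json', model_dump() in Python mode still returns real sets; only the JSON output is affected. Subclasses inherit the serializer, so they do not need any extra decorators.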

Upvotes: 1
