Christian Sauer

Reputation: 10909

Attr: Deserialize deeply nested json?

I have a deeply nested JSON structure like this:

json_data = """{
    "title": "...",
    "links": [
        {
            "href": "string",
            "method": {
                "method": "string"
            },
            "rel": "string"
        }
    ]
}"""

My classes:

import enum
import json
import typing as ty

import attr


class HttpMethod(enum.Enum):
    GET = 0
    HEAD = 1
    POST = 2
    PUT = 3
    DELETE = 4
    CONNECT = 5
    OPTIONS = 6
    TRACE = 7
    PATCH = 8

@attr.s(frozen=True, auto_attribs=True)
class LinkInfo:
    href: str = attr.ib()
    link: str = attr.ib(converter=ensure_cls)
    method: HttpMethod = attr.ib(converter=lambda x: x.name)

@attr.s(frozen=True, auto_attribs=True)
class AnalysisTemplateGetModel:
    title: str = attr.ib()
    links: ty.List[LinkInfo] = attr.ib(default=attr.Factory(list))

and I want to deserialize the JSON into these attrs classes:

js = AnalysisTemplateGetModel(**json.loads(json_data))

but the links field of js still contains plain dictionaries, not LinkInfo objects:

str(js)
AnalysisTemplateGetModel(..., links=[{'href': 'string', 'method': {'method': 'string'}, 'rel': 'string'}])

Upvotes: 3

Views: 626

Answers (1)

Darkdragon84

Reputation: 901

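I have edited your code examples to make them consistent with each other. I am also using the newer attrs API on Python 3.10: frozen and field were introduced in attrs 20.1.0, and importing them from the attrs package, as done below, needs attrs>=21.3.0.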

Assume the following JSON string as input:

json_data = """{
    "title": "my_title",
    "links": [
        {
            "href": "some_href",
            "method": "POST",
            "rel": "some_rel"
        },
        {
            "href": "another_href",
            "method": "GET",
            "rel": "another_rel"
        }
    ]
}"""

You can achieve nested deserialization by adding converters to the fields that are not native JSON data types:

from collections.abc import Iterable
import json
import typing as ty
import enum
from attrs import frozen, field, Factory


class HttpMethod(enum.Enum):
    GET = 0
    HEAD = 1
    POST = 2
    PUT = 3
    DELETE = 4
    CONNECT = 5
    OPTIONS = 6
    TRACE = 7
    PATCH = 8


def ensure_http_method(data: str | HttpMethod) -> HttpMethod:
    if isinstance(data, str):
        data = HttpMethod[data]
    return data


@frozen
class LinkInfo:
    href: str
    rel: str
    method: HttpMethod = field(converter=ensure_http_method)


def ensure_list_of_linkinfos(
    iterable: Iterable[ty.Dict[str, ty.Any] | LinkInfo]
) -> ty.List[LinkInfo]:
    return [
        link_info if isinstance(link_info, LinkInfo) else LinkInfo(**link_info)
        for link_info in iterable
    ]


@frozen
class AnalysisTemplateGetModel:
    title: str
    links: ty.List[LinkInfo] = field(
        default=Factory(list), converter=ensure_list_of_linkinfos
    )

analysis_template = AnalysisTemplateGetModel(**json.loads(json_data))
print(analysis_template)

which will give

AnalysisTemplateGetModel(title='my_title', links=[LinkInfo(href='some_href', rel='some_rel', method=<HttpMethod.POST: 2>), LinkInfo(href='another_href', rel='another_rel', method=<HttpMethod.GET: 0>)])
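
A side note on the ensure_http_method converter above: HttpMethod[data] looks a member up by its name, whereas HttpMethod(data) would look it up by its value. Here is a minimal, stdlib-only sketch of the difference, using a shortened copy of the enum. The helper name parse_method is hypothetical and only there for illustration.

```python
import enum


class HttpMethod(enum.Enum):
    # Shortened copy of the enum above, for illustration only.
    GET = 0
    POST = 2


def parse_method(data):
    """Return an HttpMethod for either an HttpMethod or a member name."""
    if isinstance(data, HttpMethod):
        return data
    # Square brackets look up by name and raise KeyError for unknown names.
    return HttpMethod[data]


by_name = parse_method("POST")   # lookup by name
by_value = HttpMethod(2)         # lookup by value
same_member = by_name is by_value
```

Both lookups return the same member, so same_member is True. Note that the name lookup is case-sensitive, so "post" raises a KeyError.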

However, let me point you to the excellent sister project of attrs, cattrs, which solves exactly that problem of (de)serializing nested attrs classes.

With cattrs you can drop the converters from your classes and just write:


import json
import typing as ty
import enum

from attrs import frozen
from cattrs import Converter

class HttpMethod(enum.Enum):
    GET = 0
    HEAD = 1
    POST = 2
    PUT = 3
    DELETE = 4
    CONNECT = 5
    OPTIONS = 6
    TRACE = 7
    PATCH = 8


@frozen
class LinkInfo:
    href: str
    rel: str
    method: HttpMethod

@frozen
class AnalysisTemplateGetModel:
    title: str
    links: ty.List[LinkInfo]


converter = Converter()
# This hook is necessary because cattrs structures enums by value by default.
# Here the JSON contains the member names instead, so we look them up by name.
converter.register_structure_hook(enum.Enum, lambda string, cl: cl[string])

print(converter.structure(json.loads(json_data), AnalysisTemplateGetModel))

This also gives

AnalysisTemplateGetModel(title='my_title', links=[LinkInfo(href='some_href', rel='some_rel', method=<HttpMethod.POST: 2>), LinkInfo(href='another_href', rel='another_rel', method=<HttpMethod.GET: 0>)])

This version is much more concise and less invasive: you don't have to add anything to your classes to make (de)serialization work, which is the beauty of cattrs.
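
To give a rough idea of what such a library automates, here is a stdlib-only sketch of recursive structuring driven by type hints. It is not how cattrs is implemented. I have swapped dataclasses in for attrs so the snippet runs without third-party packages, and the names structure and TemplateModel are hypothetical stand-ins.

```python
import dataclasses
import enum
import typing as ty


class HttpMethod(enum.Enum):
    GET = 0
    POST = 2


@dataclasses.dataclass(frozen=True)
class LinkInfo:
    href: str
    rel: str
    method: HttpMethod


@dataclasses.dataclass(frozen=True)
class TemplateModel:
    title: str
    links: ty.List[LinkInfo]


def structure(data, cls):
    """Recursively build `cls` from JSON-like data (illustrative only)."""
    if ty.get_origin(cls) is list:
        # ty.List[X]: structure every element as X.
        (item_type,) = ty.get_args(cls)
        return [structure(item, item_type) for item in data]
    if isinstance(cls, type) and issubclass(cls, enum.Enum):
        # Enum members are looked up by name, as in the hook above.
        return cls[data]
    if dataclasses.is_dataclass(cls):
        # Structure each field according to its annotated type.
        hints = ty.get_type_hints(cls)
        return cls(**{key: structure(value, hints[key]) for key, value in data.items()})
    # Plain JSON types (str, int, ...) pass through unchanged.
    return data


raw = {
    "title": "my_title",
    "links": [{"href": "some_href", "method": "POST", "rel": "some_rel"}],
}
model = structure(raw, TemplateModel)
```

The sketch only handles lists, enums and nested classes. A real library also covers unions, optionals, dicts, validation and the reverse direction, which is why I would reach for cattrs rather than maintain something like this.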

Upvotes: 1
