kerzane

Reputation: 465

Python dataclasses: omit field from asdict

I've started making heavy use of the Python dataclasses module and find it very useful. I especially like the flags that can be set on each field to toggle compare, init, and so on.

I often find, however, that there is a field I want to omit from the asdict output of the class. In some situations this is possible with the dict_factory argument, but sometimes a field causes asdict to raise an exception before dict_factory gets a chance to omit it.
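A minimal sketch of the failure described above, assuming the problematic field holds a threading.Lock (the Worker class, drop_lock and try_asdict are made-up names for illustration). asdict copies every field value with copy.deepcopy before it hands the (name, value) pairs to dict_factory, so the filter never runs:

```python
import threading
from dataclasses import asdict, dataclass, field


@dataclass
class Worker:
    name: str
    # A lock cannot be deep-copied, so asdict() fails on this field.
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)


def drop_lock(pairs):
    # Hypothetical dict_factory that tries to omit the "lock" field.
    return {k: v for k, v in pairs if k != "lock"}


def try_asdict(obj):
    # Return the dict, or a description of the error that asdict raised.
    try:
        return asdict(obj, dict_factory=drop_lock)
    except TypeError as exc:
        return f"TypeError: {exc}"


outcome = try_asdict(Worker("w1"))
print(outcome)  # a TypeError message; drop_lock is never reached
```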

Can anyone else suggest a clean way to do this? Would it not be a useful additional flag to add to the dataclasses module?

Upvotes: 16

Views: 9415

Answers (3)

Ilias Dzhabbarov

Reputation: 11

You could write your own asdict function based on the source code of dataclasses in the Python 3.12 standard library.

I marked the parts that changed.

import copy
import types

from dataclasses import Field, fields

def field_filter(f: Field):
    return f.init and f.repr

# This file is essentially an extract from the
# `dataclasses.py` standard library module.

# I made some modifications to exclude `init=False` and `repr=False` fields
# from the `asdict` function.

# --------------------------------------------------------------------
# Atomic immutable types which don't require any recursive handling and for which deepcopy
# returns the same object. We can provide a fast-path for these types in asdict and astuple.
_ATOMIC_TYPES = frozenset({
    # Common JSON Serializable types
    types.NoneType,
    bool,
    int,
    float,
    str,
    # Other common types
    complex,
    bytes,
    # Other types that are also unaffected by deepcopy
    types.EllipsisType,
    types.NotImplementedType,
    types.CodeType,
    types.BuiltinFunctionType,
    types.FunctionType,
    type,
    range,
    property,
})



# The name of an attribute on the class where we store the Field
# objects.  Also used to check if a class is a Data Class.
_FIELDS = '__dataclass_fields__'


def _is_dataclass_instance(obj):
    """Returns True if obj is an instance of a dataclass."""
    return hasattr(type(obj), _FIELDS)

def asdict(obj, *, dict_factory=dict):
    """Custom function which returns the fields of a dataclass instance as a new
    dictionary mapping field names to field values.

    Fields with `init=False` or `repr=False` are skipped.

    Example usage::

      @dataclass
      class C:
          x: int
          y: int
          z: int = field(init=False)
          d: int = field(default=0, repr=False)

      c = C(1, 2)
      assert asdict(c) == {'x': 1, 'y': 2}

    If given, 'dict_factory' will be used instead of built-in dict.
    The function applies recursively to field values that are
    dataclass instances. This will also look into built-in containers:
    tuples, lists, and dicts. Other objects are copied with 'copy.deepcopy()'.
    """
    if not _is_dataclass_instance(obj):
        raise TypeError("asdict() should be called on dataclass instances")
    return _asdict_inner(obj, dict_factory)





def _asdict_inner(obj, dict_factory):
    if type(obj) in _ATOMIC_TYPES:
        return obj
    elif _is_dataclass_instance(obj):
        # fast path for the common case
        if dict_factory is dict:
            return {
                f.name: _asdict_inner(getattr(obj, f.name), dict)
                for f in fields(obj)
# --------------CHANGED-----------------------------------------------
                if field_filter(f)
# --------------------------------------------------------------------
            }
        else:
            result = []
            for f in fields(obj):
# --------------CHANGED-----------------------------------------------
                # Filter before recursing, so an excluded field's value is
                # never read or copied (and cannot raise).
                if not field_filter(f):
                    continue
# --------------------------------------------------------------------
                value = _asdict_inner(getattr(obj, f.name), dict_factory)
                result.append((f.name, value))
            return dict_factory(result)
    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        # obj is a namedtuple.  Recurse into it, but the returned
        # object is another namedtuple of the same type.  This is
        # similar to how other list- or tuple-derived classes are
        # treated (see below), but we just need to create them
        # differently because a namedtuple's __init__ needs to be
        # called differently (see bpo-34363).

        # I'm not using namedtuple's _asdict()
        # method, because:
        # - it does not recurse in to the namedtuple fields and
        #   convert them to dicts (using dict_factory).
        # - I don't actually want to return a dict here.  The main
        #   use case here is json.dumps, and it handles converting
        #   namedtuples to lists.  Admittedly we're losing some
        #   information here when we produce a json list instead of a
        #   dict.  Note that if we returned dicts here instead of
        #   namedtuples, we could no longer call asdict() on a data
        #   structure where a namedtuple was used as a dict key.

        return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
    elif isinstance(obj, (list, tuple)):
        # Assume we can create an object of this type by passing in a
        # generator (which is not true for namedtuples, handled
        # above).
        return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
    elif isinstance(obj, dict):
        if hasattr(type(obj), 'default_factory'):
            # obj is a defaultdict, which has a different constructor from
            # dict as it requires the default_factory as its first arg.
            result = type(obj)(getattr(obj, 'default_factory'))
            for k, v in obj.items():
                result[_asdict_inner(k, dict_factory)] = _asdict_inner(v, dict_factory)
            return result
        return type(obj)((_asdict_inner(k, dict_factory),
                          _asdict_inner(v, dict_factory))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

This has worked very well for me.

Upvotes: 1

Victor Di

Reputation: 1198

I ended up defining dict_factory in the dataclass as a staticmethod and then passing it to asdict(). I found this more straightforward than messing with metadata.

from typing import Optional, Tuple
from dataclasses import asdict, dataclass

@dataclass
class Space:
    size: Optional[int] = None
    dtype: Optional[str] = None
    shape: Optional[Tuple[int, ...]] = None

    @staticmethod
    def dict_factory(x):
        exclude_fields = ("shape", )
        return {k: v for (k, v) in x if ((v is not None) and (k not in exclude_fields))}


s1 = Space(size=2)
s1_dict = asdict(s1, dict_factory=Space.dict_factory)
print(s1_dict)
# {'size': 2}

s2 = Space(dtype='int', shape=(2, 5))
s2_dict = asdict(s2, dict_factory=Space.dict_factory)
print(s2_dict)
# {'dtype': 'int'}
# no "shape" key, because it is excluded in dict_factory of the class.

Upvotes: 8

Mateus Terra

Reputation: 179

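You can add custom metadata to a field, like field(metadata={"include_in_dict": False}), and check it before anything else when building the dict, skipping the field if needed. Note that dict_factory itself only receives (name, value) pairs, so field_ has to come from a loop over fields(obj).

if not field_.metadata.get("include_in_dict", True):
    continue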
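A rough sketch of how the metadata check could be wired up, assuming an opt-out convention where a field is included unless it is flagged with include_in_dict False. The helper asdict_skipping and the Job class are hypothetical names, not part of the dataclasses module. Unlike the real asdict, this version only recurses into directly nested dataclass instances, not into lists, tuples or dicts that contain them.

```python
import copy
import threading
from dataclasses import dataclass, field, fields, is_dataclass


def asdict_skipping(obj):
    """Hypothetical helper: build a dict, honouring the metadata flag."""
    result = {}
    for f in fields(obj):
        # Check the flag first, so an excluded value is never read or copied.
        if not f.metadata.get("include_in_dict", True):
            continue
        value = getattr(obj, f.name)
        if is_dataclass(value) and not isinstance(value, type):
            # Directly nested dataclass instance: apply the same rule.
            result[f.name] = asdict_skipping(value)
        else:
            result[f.name] = copy.deepcopy(value)
    return result


@dataclass
class Job:
    name: str
    retries: int = 3
    # deepcopy would fail on a lock, so it is flagged for exclusion.
    lock: threading.Lock = field(
        default_factory=threading.Lock,
        metadata={"include_in_dict": False},
    )


job_dict = asdict_skipping(Job("backup"))
print(job_dict)  # {'name': 'backup', 'retries': 3}
```

Because the flag is checked before the value is touched, a field that cannot be deep-copied no longer raises.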

Upvotes: 3
