yetixhunting

Reputation: 41

Descriptor Protocol with custom __str__() method in Python?

UPDATED Question: In Python, can I create a custom data structure that is a dict, but which I get and set as a set, and for which I can create a custom __str__ representation?

I want a class attribute that is stored as a dict[str, list[str]] but that users of the class read and assign as if it were a set[str]. I also want it to print like a dict, with custom formatting.

Attempted Solution: I implemented a descriptor, but I haven't figured out how to customize __str__ with it, so I suspect a descriptor is not the right tool here.

class TreatDictLikeSet():  # The Descriptor I wish existed
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, type=None) -> object:
        my_dict = obj.__dict__.get(self.name) or {}
        return [e for v in my_dict.values() for e in v]

    def __set__(self, obj, value) -> None:
        value = ...<rules to insert set values into a dict>...
        obj.__dict__[self.name] = value


class Foo():
    my_dict = TreatDictLikeSet()

Upvotes: 1

Views: 135

Answers (1)

Daniil Fajnberg

Reputation: 18388

If all you want is the behavior "assign set, but get dict", I am not sure you need to deal with descriptors at all.

Seems like a simple property would do just fine:

class Foo:
    _my_set: set[str]

    @property
    def my_dict(self) -> dict[str, list[str]]:
        return {f"key_{i}": [value] for i, value in enumerate(self._my_set)}

    @my_dict.setter
    def my_dict(self, value: set[str]) -> None:
        self._my_set = value


foo = Foo()
foo.my_dict = {'a', 'b', 'c'}
print(foo.my_dict)  # e.g. {'key_0': ['a'], 'key_1': ['c'], 'key_2': ['b']} (set order is arbitrary)
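Two details of the property approach are easy to miss: the getter builds a new dict on every access, and nothing exists to read until the first assignment. The sketch below illustrates both. The `__init__` and the defensive `set(value)` copy are my additions for illustration, not part of the snippet above.

```python
class Foo:
    def __init__(self) -> None:
        # Assumption: start empty so reading my_dict before assigning works.
        self._my_set: set[str] = set()

    @property
    def my_dict(self) -> dict[str, list[str]]:
        # Built from scratch on every access.
        return {f"key_{i}": [value] for i, value in enumerate(self._my_set)}

    @my_dict.setter
    def my_dict(self, value: set[str]) -> None:
        # Copy, so later changes to the caller's set do not leak in.
        self._my_set = set(value)


foo = Foo()
print(foo.my_dict)  # {}
foo.my_dict = {"a"}
snapshot = foo.my_dict
snapshot["extra"] = ["b"]  # only changes the temporary dict
print(foo.my_dict)  # {'key_0': ['a']}
```

So mutating the returned dict never affects the stored set. Changes have to go through the setter.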

Update

If you want something that behaves like a standard collection class (e.g. a set), a good starting point is usually the collections.abc module.

For example, you could subclass MutableSet, implement its abstract methods (__contains__, __iter__, __len__, add, and discard), and also implement your own __init__ and __str__ methods for it:

from collections.abc import Iterable, Iterator, MutableSet
from typing import TypeVar

T = TypeVar("T")


class SetButAlsoDictOfLists(MutableSet[T]):
    _data: dict[str, list[T]]

    def __init__(self, values: Iterable[T] = ()) -> None:
        self._data = {}
        for value in values:
            self.add(value)

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        return any(value in list_ for list_ in self._data.values())

    def __iter__(self) -> Iterator[T]:
        return (list_[0] for list_ in self._data.values())

    def __len__(self) -> int:
        return len(self._data)

    def add(self, value: T) -> None:
        self._data[f"key_{value}"] = [value]

    def discard(self, value: T) -> None:
        # discard must not raise if the value is absent (unlike remove)
        self._data.pop(f"key_{value}", None)

As you wished, the underlying data structure is a dictionary of lists. I just implemented some arbitrary rule for creating the dictionary keys here for demonstration purposes.

As @Blckknght pointed out in a comment, using a different data structure underneath means that the running time of operations can be very different. Specifically, the way I implemented __contains__ here is O(n), as opposed to O(1) on average for actual sets, because it loops over the entire values view of the dict instead of hashing the value and looking it up directly.

On the other hand, although deletion would in principle be just as expensive, this particular key rule makes discard O(1) on average, as with a real set, because the key can be computed directly from the value. Note that the inherited remove method still checks membership via __contains__ first, so it remains O(n) here.
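The same key rule could also be used to speed up membership tests. Here is a minimal sketch, assuming every value maps to its own key (true for strings under the `key_{value}` rule, but not in general: `1` and `"1"` would share a key). The class name `KeyLookupSet` and the helper `_key` are hypothetical.

```python
from collections.abc import Iterable, Iterator, MutableSet
from typing import TypeVar

T = TypeVar("T")


class KeyLookupSet(MutableSet[T]):
    """Hypothetical variant that tests membership through the derived key."""

    def __init__(self, values: Iterable[T] = ()) -> None:
        self._data: dict[str, list[T]] = {}
        for value in values:
            self.add(value)

    @staticmethod
    def _key(value: object) -> str:
        # Same arbitrary key rule as above.
        return f"key_{value}"

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        # One dict lookup plus a scan of a one-element list.
        return value in self._data.get(self._key(value), [])

    def __iter__(self) -> Iterator[T]:
        return (list_[0] for list_ in self._data.values())

    def __len__(self) -> int:
        return len(self._data)

    def add(self, value: T) -> None:
        self._data[self._key(value)] = [value]

    def discard(self, value: T) -> None:
        self._data.pop(self._key(value), None)


s = KeyLookupSet(["a", "b"])
print("a" in s, "z" in s)  # True False
s.remove("a")
print(s)  # {'key_b': ['b']}
```

With this version, the inherited remove also becomes cheap, since its membership check no longer scans all values.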

You could of course store the values in an actual set alongside the dictionary, making these operations efficient again, but this costs additional memory because every value is then referenced from two containers.
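A rough sketch of that trade-off, restricted to str values so that equal values always produce the same key. The class name `IndexedDictSet` and the attribute `_members` are made up for this example.

```python
from collections.abc import Iterable, Iterator, MutableSet


class IndexedDictSet(MutableSet[str]):
    """Hypothetical variant: dict of lists plus a real set as an index."""

    def __init__(self, values: Iterable[str] = ()) -> None:
        self._data: dict[str, list[str]] = {}
        self._members: set[str] = set()  # second reference to every value
        for value in values:
            self.add(value)

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        # Hash lookup in the companion set instead of scanning the dict.
        return value in self._members

    def __iter__(self) -> Iterator[str]:
        return iter(self._members)

    def __len__(self) -> int:
        return len(self._members)

    def add(self, value: str) -> None:
        if value not in self._members:
            self._members.add(value)
            self._data[f"key_{value}"] = [value]

    def discard(self, value: str) -> None:
        if value in self._members:
            self._members.remove(value)
            del self._data[f"key_{value}"]


s = IndexedDictSet(["a", "b", "a"])
print(len(s))  # 2
s.remove("a")
print(s)  # {'key_b': ['b']}
```

Both containers must be updated together in add and discard. That is the price for fast lookups.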

Either way, you can use this class as a regular (mutable) set now, but its string representation is that of the underlying dictionary:

obj = SetButAlsoDictOfLists({"a", "b", "d"})
print(obj.isdisjoint(["x", "y"]))  # True
obj.add("c")
obj.remove("d")
print(obj)  # e.g. {'key_b': ['b'], 'key_a': ['a'], 'key_c': ['c']} (order of 'a' and 'b' may vary)

Now if you still want that descriptor magic for some reason, you can just write one that uses such a class under the hood, i.e. one that initializes a new object in its __set__ method and returns it in its __get__ method:

from collections.abc import Iterable
from typing import Generic, TypeVar

# ... import SetButAlsoDictOfLists

_T = TypeVar("_T")


class Descriptor(Generic[_T]):
    name: str

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(
        self,
        instance: object,
        owner: type | None = None,
    ) -> SetButAlsoDictOfLists[_T]:
        return instance.__dict__.get(self.name, SetButAlsoDictOfLists())

    def __set__(self, instance: object, value: Iterable[_T]) -> None:
        instance.__dict__[self.name] = SetButAlsoDictOfLists(value)

And use it like this:

class Foo:
    my_cool_set = Descriptor[str]()


foo = Foo()
print(foo.my_cool_set)  # {}
foo.my_cool_set = {"a", "b"}
print(foo.my_cool_set)  # e.g. {'key_b': ['b'], 'key_a': ['a']} (order may vary)
foo.my_cool_set |= ["b", "c"]
print(foo.my_cool_set)  # e.g. {'key_b': ['b'], 'key_a': ['a'], 'key_c': ['c']}
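One caveat with the descriptor above: before the first assignment, __get__ hands out a fresh, unstored object each time, so something like foo.my_cool_set.add("a") on a new instance would be silently lost. Accessing the attribute on the class itself (Foo.my_cool_set) would also fail, because instance is None in that case. Below is a hedged sketch of one way around both issues. `StoringDescriptor` and `DictBackedSet` are hypothetical names, and the set class is a compact str-only stand-in for SetButAlsoDictOfLists so that the example is self-contained.

```python
from collections.abc import Iterable, Iterator, MutableSet


class DictBackedSet(MutableSet[str]):
    """Compact stand-in for SetButAlsoDictOfLists, limited to str values."""

    def __init__(self, values: Iterable[str] = ()) -> None:
        self._data: dict[str, list[str]] = {}
        for value in values:
            self.add(value)

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        return any(value in list_ for list_ in self._data.values())

    def __iter__(self) -> Iterator[str]:
        return (list_[0] for list_ in self._data.values())

    def __len__(self) -> int:
        return len(self._data)

    def add(self, value: str) -> None:
        self._data[f"key_{value}"] = [value]

    def discard(self, value: str) -> None:
        self._data.pop(f"key_{value}", None)


class StoringDescriptor:
    """Hypothetical descriptor that keeps the default it hands out."""

    name: str

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        try:
            return instance.__dict__[self.name]
        except KeyError:
            # Store the fresh set so that later mutations are not lost.
            new_set = instance.__dict__[self.name] = DictBackedSet()
            return new_set

    def __set__(self, instance, value: Iterable[str]) -> None:
        instance.__dict__[self.name] = DictBackedSet(value)


class Foo:
    my_cool_set = StoringDescriptor()


foo = Foo()
foo.my_cool_set.add("a")  # mutates the stored default
print(foo.my_cool_set)  # {'key_a': ['a']}
```

Because the descriptor defines __set__, it is a data descriptor and takes precedence over the entry of the same name in the instance __dict__, so storing the object under self.name is safe.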

Upvotes: 0
