Reputation: 41
UPDATED Question: In Python, can I create a custom data structure that is a dict, but which I get and set as a set, and for which I can create a custom __str__ representation?
I want a class attribute that is structurally a dict[str, list[str]], but which the user interface (for lack of a better term) treats like a set[str]. And I want to print it like a dict with custom formatting.
Attempted Solution:
I implemented a Descriptor, but I haven't figured out how to customize the __str__, so I'm thinking a Descriptor is not actually what I should be trying.
class TreatDictLikeSet():  # The Descriptor I wish existed
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, type=None) -> object:
        my_dict = obj.__dict__.get(self.name) or {}
        return [e for v in my_dict.values() for e in v]

    def __set__(self, obj, value) -> None:
        value = ...<rules to insert set values into a dict>...
        obj.__dict__[self.name] = value

class Foo():
    my_dict = TreatDictLikeSet()
Upvotes: 1
Views: 135
Reputation: 18388
If all you want is the behavior "assign set, but get dict", I am not sure you need to deal with descriptors at all.
It seems like a simple property would do just fine:
class Foo:
    _my_set: set[str]

    @property
    def my_dict(self) -> dict[str, list[str]]:
        return {f"key_{i}": [value] for i, value in enumerate(self._my_set)}

    @my_dict.setter
    def my_dict(self, value: set[str]) -> None:
        self._my_set = value

foo = Foo()
foo.my_dict = {'a', 'b', 'c'}
# Set iteration order is arbitrary, so the key/value pairing may differ.
print(f"{foo.my_dict}")  # e.g. {'key_0': ['a'], 'key_1': ['c'], 'key_2': ['b']}
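The question actually asks for the opposite direction (store a dict, but get and set it as a set), and the same property pattern covers that too. Here is a minimal sketch of it. The class name Inventory, the attribute names and the "group by first letter" rule are all made up for illustration, since the real insertion rules are not given in the question:

```python
class Inventory:
    """Hypothetical example class; the grouping rule below is made up."""

    def __init__(self) -> None:
        self._groups: dict[str, list[str]] = {}

    @property
    def names(self) -> set[str]:
        # Flatten every list in the dict into one set for the caller.
        return {name for group in self._groups.values() for name in group}

    @names.setter
    def names(self, values: set[str]) -> None:
        groups: dict[str, list[str]] = {}
        # Assumed rule: group by first character; sorted() keeps output stable.
        for name in sorted(values):
            groups.setdefault(name[:1], []).append(name)
        self._groups = groups

    def __str__(self) -> str:
        # Custom formatting: one "key: v1, v2" line per dictionary entry.
        return "\n".join(
            f"{key}: {', '.join(group)}" for key, group in self._groups.items()
        )


inv = Inventory()
inv.names = {"apple", "avocado", "banana"}
print(inv.names == {"apple", "avocado", "banana"})  # True
print(inv)
# a: apple, avocado
# b: banana
```

Note that the getter builds a fresh set on every access, so mutating the returned set (e.g. inv.names.add("x")) does not change the stored dict.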
If you want something that behaves like a standard collection class (e.g. a set), a good starting point is usually the collections.abc module.
For example, you could subclass MutableSet, implement its abstract methods (__contains__, __iter__, __len__, add, and discard), and also implement your own __init__ and __str__ methods for it:
from collections.abc import Iterable, Iterator, MutableSet
from typing import TypeVar

T = TypeVar("T")

class SetButAlsoDictOfLists(MutableSet[T]):
    _data: dict[str, list[T]]

    def __init__(self, values: Iterable[T] = ()) -> None:
        self._data = {}
        for value in values:
            self.add(value)

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        return any(value in list_ for list_ in self._data.values())

    def __iter__(self) -> Iterator[T]:
        return (list_[0] for list_ in self._data.values())

    def __len__(self) -> int:
        return len(self._data)

    def add(self, value: T) -> None:
        self._data[f"key_{value}"] = [value]

    def discard(self, value: T) -> None:
        # discard() must not raise for a missing value (unlike remove()).
        self._data.pop(f"key_{value}", None)
As you wished, the underlying data structure is a dictionary of lists. I just implemented some arbitrary rule for creating the dictionary keys here for demonstration purposes.
As @Blckknght pointed out in a comment, using a different data structure underneath means that the runtime of operations can be very different. Specifically, the way I implemented __contains__ here runs in O(n), as opposed to O(1) on average with actual sets. This is because I am looping over the entire values view of the dict to find a value, instead of just hashing it and looking it up as a set would.
On the other hand, although removing an arbitrary value would in principle be just as expensive, with this specific key scheme removal (discard) stays as efficient as with a real set, because the dictionary key can be computed directly from the value.
You could of course store the values in an actual set alongside the dictionary, which would make membership tests efficient again, but at the cost of extra memory, since every value is then referenced from both containers.
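To make that trade-off concrete, here is a rough sketch of such a variant. The class name IndexedSetDict and the _index attribute are hypothetical, and the sketch assumes the values are hashable and that distinct values have distinct string forms (otherwise two values would collide on the same key):

```python
from collections.abc import Iterable, Iterator, MutableSet
from typing import TypeVar

T = TypeVar("T")


class IndexedSetDict(MutableSet[T]):
    """Hypothetical variant: dict of lists plus a set used as a membership index."""

    def __init__(self, values: Iterable[T] = ()) -> None:
        self._data: dict[str, list[T]] = {}
        self._index: set[T] = set()
        for value in values:
            self.add(value)

    def __str__(self) -> str:
        return str(self._data)

    def __contains__(self, value: object) -> bool:
        # Hash lookup in the index instead of scanning every list.
        return value in self._index

    def __iter__(self) -> Iterator[T]:
        # Iterate the dict so the order matches the printed representation.
        return (v for list_ in self._data.values() for v in list_)

    def __len__(self) -> int:
        return len(self._index)

    def add(self, value: T) -> None:
        if value not in self._index:
            self._index.add(value)
            self._data[f"key_{value}"] = [value]

    def discard(self, value: T) -> None:
        if value in self._index:
            self._index.discard(value)
            self._data.pop(f"key_{value}", None)


s = IndexedSetDict(["a", "b"])
s.add("c")
s.discard("a")
s.discard("missing")  # no error, as expected of discard()
print("b" in s, "a" in s)  # True False
print(s)  # {'key_b': ['b'], 'key_c': ['c']}
```

Both containers have to be updated together in add and discard; that bookkeeping is the price for the faster lookup.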
Either way, you can use this class as a regular (mutable) set now, but its string representation is that of the underlying dictionary:
obj = SetButAlsoDictOfLists({"a", "b", "d"})
print(obj.isdisjoint(["x", "y"])) # True
obj.add("c")
obj.remove("d")
print(obj) # {'key_b': ['b'], 'key_a': ['a'], 'key_c': ['c']}
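Methods like isdisjoint and remove in the snippet above are not written by hand; MutableSet supplies them as mixin methods on top of the five abstract ones. A stripped-down illustration, using a hypothetical list-backed TinySet class so that the block runs on its own:

```python
from collections.abc import Iterator, MutableSet


class TinySet(MutableSet[str]):
    """Hypothetical minimal set; only the five abstract methods are defined."""

    def __init__(self) -> None:
        self._items: list[str] = []

    def __contains__(self, value: object) -> bool:
        return value in self._items

    def __iter__(self) -> Iterator[str]:
        return iter(self._items)

    def __len__(self) -> int:
        return len(self._items)

    def add(self, value: str) -> None:
        if value not in self._items:
            self._items.append(value)

    def discard(self, value: str) -> None:
        if value in self._items:
            self._items.remove(value)


s = TinySet()
s |= ["a", "b"]            # inherited __ior__ calls add() for each element
s.remove("a")              # inherited remove() uses __contains__ and discard()
print(s == {"b"})          # True (inherited __eq__ compares with any set)
print(s.isdisjoint("xy"))  # True
try:
    s.remove("zzz")
except KeyError:
    print("remove() raises for missing values, discard() does not")
```

Operators that build a new set (such as | or &) additionally call the class with an iterable, which is one reason why the __init__ of SetButAlsoDictOfLists accepts one.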
Now if you still want that descriptor magic for some reason, you can just write one that uses such a class under the hood, i.e. one that initializes a new object in its __set__ method and returns it in its __get__ method:
from collections.abc import Iterable
from typing import Generic, TypeVar
# ... import SetButAlsoDictOfLists

_T = TypeVar("_T")

class Descriptor(Generic[_T]):
    name: str

    def __set_name__(self, owner: type, name: str) -> None:
        self.name = name

    def __get__(
        self,
        instance: object,
        owner: type | None = None,
    ) -> SetButAlsoDictOfLists[_T]:
        # Store the default on first access so that in-place changes
        # (e.g. instance.attr.add(...)) are not lost.
        return instance.__dict__.setdefault(self.name, SetButAlsoDictOfLists())

    def __set__(self, instance: object, value: Iterable[_T]) -> None:
        instance.__dict__[self.name] = SetButAlsoDictOfLists(value)
And use it like this:
class Foo:
    my_cool_set = Descriptor[str]()

foo = Foo()
print(foo.my_cool_set) # {}
foo.my_cool_set = {"a", "b"}
print(foo.my_cool_set) # {'key_b': ['b'], 'key_a': ['a']}
foo.my_cool_set |= ["b", "c"]
print(foo.my_cool_set) # {'key_b': ['b'], 'key_a': ['a'], 'key_c': ['c']}
Upvotes: 0