Reputation: 1230
I have a fairly basic question about dataclasses. If I have an event dict that I pass as data to a dataclass, is it a good use of dataclasses in general to use the class to parse out the data I need? Or to use it to handle conditionals in order to return the right data based on the data that was passed in?
@dataclass
class Event:
    data: dict[str,str]

    def type(self):
        return self.data["detail"]["eventName"]
I've just started using dataclasses and have been looking at how crap my code usually is, because it's written in a rush with little thought given to open-closed extensibility or abstraction. So I'm trying to get my head around the composition root, and when and where things should be coupled and when they shouldn't. I.e. as opposed to:
@dataclass
class Event:
    type: str

Event(e["detail"]["eventName"])
It makes a lot of sense to me to have something like
@dataclass
class Event:
    type: str
    name: str
    id: int
But what if the path into the dict to access the id changes or, more relevantly, what if the path is different depending on the type? That is actually one of the problems I've run into. If you create class EventType1 and class EventType2, I'd still need to try EventType1 first to see if its constructor works, since the paths might work, and then move on to EventType2. It seems like I'm missing something and I'm replacing bad design with bad design, since you'd need a class for every possible event type. What are people's thoughts? Is this a bad use of dataclasses? Should all the indexing be pulled out of the class and done somewhere else?
EDIT ---
I've decided to add a more concrete example of my issue. Here I've used EventData to abstract the concept of event formats, so that data1 and data2 become irrelevant because I'm actually consuming EventData. But I still have to know, based on the format of the incoming event, which event-type constructor to use (GitPush or PullRequest in this toy example). I could extend this interface to many events and the code consuming EventData wouldn't care.
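from abc import ABC, abstractmethod
from dataclasses import dataclass

class Event(ABC):
    """ implement me """

    # declared as properties so the interface matches the subclasses below
    @property
    @abstractmethod
    def type(self) -> str:
        pass

    @property
    @abstractmethod
    def repository(self) -> str:
        pass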
# event type 1
class GitPush(Event):
    """ implemented supported event type """

    def __init__(self, data):
        self.data = data

    @property
    def type(self) -> str:
        return self.data["detail"]["eventName"]

    @property
    def repository(self) -> str:
        return self.data["detail"]["additionalEventData"]["repositoryName"]
# event type 2
class PullRequest(Event):
    """ implemented supported event type """

    def __init__(self, data):
        self.data = data

    @property
    def type(self) -> str:
        return self.data["detail"]["eventName"]

    @property
    def repository(self) -> str:
        return self.data["detail"]["requestParameters"]["targets"][0]["repositoryName"]
@dataclass
class EventData:
    data: Event

    @property
    def type(self):
        return self.data.type

    @property
    def repository(self):
        return self.data.repository
event = GitPush(data1)
data = EventData(event)
print(data.repository)
event = PullRequest(data2)
data = EventData(event)
print(data.repository)
I was, and still am, confused about the idea of EventType: whether it can or should be abstracted, or whether this is just a point of extensibility where new events simply have to meet the implementation requirements.
event = EventType(some_event)
data = EventData(event)
print(data.repository)
I can't think of any other way to do it other than using a conditional:
if some_event["detail"]["eventName"] == "Type1":
    e = Type1(some_event)
if some_event["detail"]["eventName"] == "Type2":
    e = Type2(some_event)
data = EventData(e)
print(data.repository)
Upvotes: 2
Views: 1237
Reputation: 70277
I'll share with you a pattern that I've used a lot in this exact situation. The problem is that we've got some untrusted data in a JSON-like format and we want to store it in nice data structures. It's a good instinct to get the messy business out of the way early, so that the rest of the program can assume the shape of the data is good (see "Parse, don't validate", which is an excellent article on the topic).
Here's what I've done in the past.
from dataclasses import dataclass

@dataclass
class Event:
    type: str
    name: str
    id: int

    @classmethod
    def from_json(cls, data):
        # a classmethod receives the class as its first argument
        return cls(
            type=data["detail"]["eventName"],
            name=data["name"],
            id=data["id"],
        )
You've got a dataclass. It's a real, genuine @dataclass and anyone who feels so inclined can construct instances of it directly. But the intended entrypoint is the factory function from_json, which takes your dictionary and parses it into an Event object. If the data format changes, you only need to change that one function.
And if there's any validation you should do (maybe IDs have to be nonnegative or something, or names have a maximum length), then you can do that in from_json as well and throw an exception on bad input. Obviously you'd want to document this behavior, but it can still be checked in this one place, at the endpoint of your application that touches whatever API you're talking to.
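As a rough sketch of what that validation could look like: the rules here (nonnegative integer IDs, names of at most 64 characters) and the choice of ValueError are made up for illustration, so swap in whatever your API actually guarantees.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Event:
    type: str
    name: str
    id: int

    @classmethod
    def from_json(cls, data: dict[str, Any]) -> "Event":
        # Pull out the raw values first; a missing key or wrongly shaped
        # payload is reported as a single, predictable exception type.
        try:
            event_type = data["detail"]["eventName"]
            name = data["name"]
            event_id = data["id"]
        except (KeyError, TypeError) as exc:
            raise ValueError(f"malformed event payload: {exc!r}") from exc

        # Hypothetical rules, for illustration only.
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(event_id, bool) or not isinstance(event_id, int) or event_id < 0:
            raise ValueError(f"invalid id: {event_id!r}")
        if not isinstance(name, str) or len(name) > 64:
            raise ValueError(f"invalid name: {name!r}")

        return cls(type=event_type, name=name, id=event_id)
```

Everything past from_json can then take an Event and trust that its fields have already been checked.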
If you've got multiple different event types, then you can have a parent (abstract) class Event and all of its (concrete) subclasses, and the parent class can provide a @classmethod which constructs an instance of the appropriate subclass based on the shape of the data. Still one entrypoint.
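Here's a minimal sketch of that idea using the two event types from the question. The eventName values "GitPush" and "CreatePullRequest", the _parse helper, and the lookup table are all assumptions of mine; use whatever strings and structure your real payloads have.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


@dataclass
class Event(ABC):
    type: str
    repository: str

    @classmethod
    @abstractmethod
    def _parse(cls, data: dict[str, Any]) -> "Event":
        """Build an instance from a payload already known to be of this type."""

    @classmethod
    def from_json(cls, data: dict[str, Any]) -> "Event":
        event_name = data["detail"]["eventName"]
        # Built at call time, so the subclasses defined below are visible.
        # The keys are assumed eventName values, not taken from a real API.
        parsers = {"GitPush": GitPush, "CreatePullRequest": PullRequest}
        if event_name not in parsers:
            raise ValueError(f"unsupported event type: {event_name!r}")
        return parsers[event_name]._parse(data)


@dataclass
class GitPush(Event):
    @classmethod
    def _parse(cls, data: dict[str, Any]) -> "GitPush":
        detail = data["detail"]
        return cls(
            type=detail["eventName"],
            repository=detail["additionalEventData"]["repositoryName"],
        )


@dataclass
class PullRequest(Event):
    @classmethod
    def _parse(cls, data: dict[str, Any]) -> "PullRequest":
        detail = data["detail"]
        return cls(
            type=detail["eventName"],
            repository=detail["requestParameters"]["targets"][0]["repositoryName"],
        )
```

The calling code only ever writes Event.from_json(some_event) and reads .repository. The conditional on the event name still exists, but it lives in exactly one place, and supporting a new event type means adding one subclass and one table entry.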
Upvotes: 3