Reputation: 199
I've been writing code to parse and extract information from messages sent by a bot. There are only a few different kinds of messages, but each one of them contains wildly different kinds of information I'm interested in, and I'm struggling with finding the best way to deal with them as objects in my code.
If I were using Haskell, I'd simply create a type Message and define a tailored constructor for each kind of message:

    data Message = Greeting Foo Bar | Warning Yadda Yadda Yadda | ...

It's a very nice and clean way to both have them all under the same type and be able to tell the message kinds apart easily.
How would one go about designing object classes to that effect in an OOP-friendly (or better, Pythonic) way? I've thought of two approaches, namely:
Defining a base class Message and subclassing it for each kind of message. Pros: conceptually clean. Cons: lots of boilerplate code, and it doesn't really make the code very readable or the relationship between the different message classes clear.
Defining a universal class Message, which represents every message type. It would have an attribute .type to differentiate between message kinds, and its __init__ method would set the attributes appropriate to the message type. Pros: simple to code, practical. Cons: it seems bad practice to have the class's attributes be so unpredictable, and it generally feels wrong.
However, I'm not completely satisfied with either. I realize this is just a small programme, but I'm using it as an opportunity to learn more about abstractions and software architecture. Can someone show me the way?
Upvotes: 1
Views: 262
Reputation: 1122312
For a message class design, I'd use dataclasses to minimise the boilerplate. You get to focus entirely on the fields:
    from dataclasses import dataclass

    class Message:
        # common message methods
        ...

    @dataclass
    class Greeting(Message):
        foo: str
        bar: int

    @dataclass
    class Warning(Message):
        yadda: list[str]
There usually isn't much more you need for a simple project. You could add a @classmethod factory to the Message base class to help generate specific message types, and Message could be a @dataclass itself too, if there are common attributes shared between the different types.
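As a rough sketch of that idea, the base class below is itself a dataclass with one shared attribute and a classmethod factory. The sender field, the from_parts name and the convention of matching kind against the lowercased class name are all made up for illustration; pick whatever lookup suits your bot's messages.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical attribute shared by every message kind.
    sender: str

    @classmethod
    def from_parts(cls, kind: str, sender: str, **fields) -> "Message":
        """Build the direct subclass whose lowercased name equals `kind`."""
        for subclass in cls.__subclasses__():
            if subclass.__name__.lower() == kind:
                return subclass(sender=sender, **fields)
        raise ValueError(f"unknown message kind: {kind!r}")


@dataclass
class Greeting(Message):
    foo: str
    bar: int


@dataclass
class Warning(Message):
    yadda: list[str]


greeting = Message.from_parts("greeting", sender="bot", foo="spam", bar=42)
print(greeting)  # Greeting(sender='bot', foo='spam', bar=42)
```

Because the inherited sender field comes first in the generated __init__, every subclass accepts it without repeating the declaration.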
That said, once you start to factor in serialisation and deserialisation requirements, using a type field that is an enum can be helpful.
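A minimal, standard-library-only sketch of what such a field buys you: the enum value travels with the serialised data and selects the class on the way back in. The to_dict and from_dict helpers and the _CLASSES mapping are illustrative names, not part of any library.

```python
import enum
from dataclasses import asdict, dataclass, field


class MessageType(enum.Enum):
    greeting = "greeting"
    warning = "warning"


@dataclass
class Greeting:
    foo: str
    bar: int
    # Fixed per class, so it is excluded from __init__.
    type: MessageType = field(default=MessageType.greeting, init=False)


@dataclass
class Warning:
    yadda: list[str]
    type: MessageType = field(default=MessageType.warning, init=False)


# Hypothetical lookup table from enum member to message class.
_CLASSES = {MessageType.greeting: Greeting, MessageType.warning: Warning}


def to_dict(message) -> dict:
    """Turn a message into a JSON-friendly dict, storing the enum's value."""
    data = asdict(message)
    data["type"] = message.type.value
    return data


def from_dict(data: dict):
    """Pick the class from the "type" key and pass the remaining keys on."""
    fields = dict(data)
    cls = _CLASSES[MessageType(fields.pop("type"))]
    return cls(**fields)


restored = from_dict({"type": "greeting", "foo": "spam", "bar": 42})
print(isinstance(restored, Greeting))  # True
```

An unknown "type" string fails early with a ValueError from the enum lookup, rather than producing a half-initialised object.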
To illustrate that point: for a current RESTful API project that includes automated OpenAPI 3.1 documentation, we are using Marshmallow to handle translation from and to JSON, marshmallow-dataclass to avoid having to repeat ourselves when defining the schema and validation, and marshmallow-oneofschema to reflect a polymorphic schema for a hierarchy of classes that differ by their type, much like your Message example.
Using 3rd-party libraries then constrains your options, so I used metaprogramming (mostly class.__init_subclass__ and Generic type annotations) to make it possible to concisely define such a polymorphic type hierarchy that's keyed on an enum.
Your message type would be expressed like this:
    import enum
    from dataclasses import dataclass

    # PolymorphicType is our own helper class (not shown here)

    class MessageType(enum.Enum):
        greeting = "greeting"
        warning = "warning"
        # ...

    @dataclass
    class _BaseMessage(PolymorphicType[MessageType]):
        type: MessageType
        # ...

    @dataclass
    class Greeting(_BaseMessage, type_key=MessageType.greeting):
        foo: str
        bar: int

    @dataclass
    class Warning(_BaseMessage, type_key=MessageType.warning):
        yadda: list[str]

    MessageSchema = _BaseMessage.OneOfSchema("MessageSchema")
After that, messages are loaded from JSON using MessageSchema.load(), which produces a specific instance based on the "type" key in the dictionary, e.g.
    message = MessageSchema.load({"type": "greeting", "foo": "spam", "bar": 42})
    isinstance(message, Greeting)  # True
Conversely, MessageSchema.dump() gets you suitable JSON output regardless of the input type:
    message = Warning(["42", "117"])  # yadda is declared as list[str]
    MessageSchema.dump(message)  # {"type": "warning", "yadda": ["42", "117"]}
It is the use of an enum here that makes the integration work best; PolymorphicType is the custom class that handles most of the heavy lifting to make the _BaseMessage.OneOfSchema() call at the end work. You don't have to use metaprogramming to achieve that last part, but for us it removed most of the marshmallow-oneofschema boilerplate.
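The real PolymorphicType is not shown in this answer. Purely to illustrate the mechanism, here is a standard-library-only stand-in, deliberately named PolymorphicBase so it isn't mistaken for the real class. It assumes nothing more than that __init_subclass__ records each subclass under its type_key; the Generic annotations and everything marshmallow-related are left out.

```python
import enum
from dataclasses import dataclass
from typing import Any, ClassVar


class MessageType(enum.Enum):
    greeting = "greeting"
    warning = "warning"


class PolymorphicBase:
    """Hypothetical stand-in: registers subclasses by their type_key."""

    _registry: ClassVar[dict[Any, type]] = {}

    def __init_subclass__(cls, type_key=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if type_key is not None:
            cls._registry[type_key] = cls
            # Plain class attribute (not a dataclass field) naming the kind.
            cls.type = type_key

    @classmethod
    def load(cls, data: dict):
        """Look up the class for data["type"] and build it from the rest."""
        fields = dict(data)
        key = MessageType(fields.pop("type"))  # tied to MessageType for brevity
        return cls._registry[key](**fields)


@dataclass
class Greeting(PolymorphicBase, type_key=MessageType.greeting):
    foo: str
    bar: int


@dataclass
class Warning(PolymorphicBase, type_key=MessageType.warning):
    yadda: list[str]


message = PolymorphicBase.load({"type": "greeting", "foo": "spam", "bar": 42})
print(isinstance(message, Greeting))  # True
```

The class keyword argument keeps the mapping next to each class definition, so adding a message kind means adding one class and one enum member, with no separate registry to update.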
Plus, we get OpenAPI schemas that reflect each specific message type, which documentation tools like Redocly know how to process:
    components:
      schemas:
        Message:
          oneOf:
            - $ref: '#/components/schemas/Greeting'
            - $ref: '#/components/schemas/Warning'
          discriminator:
            propertyName: type
            mapping:
              greeting: '#/components/schemas/Greeting'
              warning: '#/components/schemas/Warning'
        Greeting:
          type: object
          properties:
            type:
              type: string
              default: greeting
            foo:
              type: string
            bar:
              type: integer
        Warning:
          type: object
          properties:
            type:
              type: string
              default: warning
            yadda:
              type: array
              items:
                type: string
Upvotes: 6