Reputation: 147
Short version: How can one serialize a class (a class reference, i.e. not an instance) that is a member of an object (see the example below)?
Long version:
I have been using the answer to this question in my work: How can I ignore a member when serializing an object with PyYAML?
So, my current implementation is this:
from copy import copy

import yaml


class SecretYamlObject(yaml.YAMLObject):
    """Helper class for YAML serialization.

    Source: https://stackoverflow.com/questions/22773612/how-can-i-ignore-a-member-when-serializing-an-object-with-pyyaml
    """
    hidden_fields = []  # attribute names to leave out of the dump

    def __init__(self, *args, **kwargs):
        # Default behavior, so subclasses only need to define __setstate__.
        self.__setstate__(kwargs)

    @classmethod
    def to_yaml(cls, dumper, data):
        # Dump a shallow copy so the original keeps its hidden attributes.
        new_data = copy(data)
        for item in cls.hidden_fields:
            if item in new_data.__dict__:
                del new_data.__dict__[item]
        return dumper.represent_yaml_object(
            cls.yaml_tag, new_data, cls, flow_style=cls.yaml_flow_style)
So far, this has been working fine for me because until now I have only needed to hide loggers:
import logging


class EventManager(SecretYamlObject):
    yaml_tag = u"!EventManager"
    hidden_fields = ["logger"]

    def __setstate__(self, kw):  # For (de)serialization
        self.logger = logging.getLogger(__name__)
        self.listeners = kw.get("listeners", {})
        # ...

    def __init__(self, *args, **kwargs):
        self.__setstate__(kwargs)
However, a different problem appears when I try to serialize non-trivial objects. If Q derives directly from object, everything works, but if it derives from yaml.YAMLObject, dumping fails with "can't pickle int objects". See this example:
class Q(SecretYamlObject):  # works fine if I just use "object"
    pass


class A(SecretYamlObject):
    yaml_tag = u"!Aobj"
    my_q = Q

    def __init__(self, oth_q):
        self.att = "att"
        self.oth_q = oth_q


class B(SecretYamlObject):
    yaml_tag = u"!Bobj"
    my_q = Q
    hidden_fields = ["my_q"]

    def __init__(self, oth_q):
        self.att = "att"
        self.oth_q = oth_q


class C(SecretYamlObject):
    yaml_tag = u"!Cobj"
    my_q = Q
    hidden_fields = ["my_q"]

    def __init__(self, *args, **kwargs):
        self.__setstate__(kwargs)

    def __setstate__(self, kw):
        self.att = "att"
        self.my_q = Q
        self.oth_q = kw.get("oth_q", None)


a = A(Q)
a2 = yaml.load(yaml.dump(a))

b = B(Q)
b2 = yaml.load(yaml.dump(b))

c = C(my_q=Q)
c2 = yaml.load(yaml.dump(c))
c2.my_q
c2.oth_q
A and B give "can't pickle int objects" errors, while C doesn't initialize oth_q (because the dump contains no information about it).
Question: How can I preserve the information about which class reference is held?
(I need to hold the class reference to be able to create objects of that type. An alternative to this might work too.)
Upvotes: 0
Views: 1749
Reputation: 76792
When loading dumped YAML, you normally don't need to preserve the information about which class needs to be instantiated. That is what the tag information, stored in the file as !XObj, is for.
If you hide a reference to an object of a certain class, by not dumping the attribute that refers to it, and then run into problems instantiating that object (because you don't know its class) when loading, you are doing something wrong. In that case you should hide the internals of the referenced object, not the attribute that references the object. You could e.g. dump the referenced object using !XObj null.
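A minimal sketch of that idea, assuming hypothetical classes Q and Holder that are not part of the original code: Q overrides to_yaml so that only its tag and a null-like scalar are written, and from_yaml rebuilds the object from defaults. Registering on yaml.SafeLoader and yaml.SafeDumper is a choice made for this sketch, not something the question's code does.

```python
import yaml


class Q(yaml.YAMLObject):
    """Hypothetical class whose internals should stay out of the YAML."""
    yaml_tag = "!Q"
    yaml_loader = yaml.SafeLoader  # register the constructor on the safe loader
    yaml_dumper = yaml.SafeDumper  # register the representer on the safe dumper

    def __init__(self, secret="default"):
        self.secret = secret

    @classmethod
    def to_yaml(cls, dumper, data):
        # Write only the tag plus a null-like scalar; no attributes are dumped.
        return dumper.represent_scalar(cls.yaml_tag, "null")

    @classmethod
    def from_yaml(cls, loader, node):
        # The tag selected this class; the internals come from the defaults.
        return cls()


class Holder(yaml.YAMLObject):
    """Hypothetical owner that keeps a reference to a Q instance."""
    yaml_tag = "!Holder"
    yaml_loader = yaml.SafeLoader
    yaml_dumper = yaml.SafeDumper

    def __init__(self, q):
        self.att = "att"
        self.q = q


text = yaml.safe_dump(Holder(Q(secret="hunter2")))
restored = yaml.safe_load(text)
print(text)
print(type(restored.q).__name__, restored.q.secret)
```

The attribute q stays in the dump, so the loader still learns from the tag which class to instantiate, while the value of secret never reaches the file.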
By hiding the internals, you will have the appropriate tag, pointing to the right class to create an object from, when loading. You'll have to decide what your program does with the internals of that object, based on the limited null information.
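If the attribute really has to hold the class itself rather than an instance, one possible alternative along the same lines is to dump the class as a tagged name and resolve that name when loading. The sketch below is only an illustration under assumptions: the !classref tag, the CLASS_REGISTRY dict, the referable decorator, and the RefDumper/RefLoader subclasses are all hypothetical names, and a plain dict stands in for the state an object's __setstate__ would receive.

```python
import yaml

# Hypothetical registry: the only classes that may be referenced by name.
CLASS_REGISTRY = {}


def referable(cls):
    """Decorator (hypothetical) that makes a class loadable by its name."""
    CLASS_REGISTRY[cls.__name__] = cls
    return cls


class RefDumper(yaml.SafeDumper):
    """Separate dumper so the global SafeDumper is left untouched."""


class RefLoader(yaml.SafeLoader):
    """Separate loader so the global SafeLoader is left untouched."""


def represent_class(dumper, cls):
    # Dump a class object as a tagged scalar that carries only its name.
    return dumper.represent_scalar("!classref", cls.__name__)


def construct_class(loader, node):
    # Look the name up in the registry; unknown names raise KeyError.
    return CLASS_REGISTRY[loader.construct_scalar(node)]


# A multi-representer on `type` also matches classes with a custom metaclass.
RefDumper.add_multi_representer(type, represent_class)
RefLoader.add_constructor("!classref", construct_class)


@referable
class Q:
    pass


state = {"att": "att", "my_q": Q}
text = yaml.dump(state, Dumper=RefDumper)
restored = yaml.load(text, Loader=RefLoader)
new_q = restored["my_q"]()  # the loaded value is the class, so it can be called
```

Because the lookup goes through an explicit registry, the YAML input can only name classes that the program has chosen to expose.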
Warning: you should seriously reconsider using yaml.YAMLObject in the way you do. You are using load(), which is documented as unsafe, and if you cannot guarantee 100% control of your YAML input, now and at any time in the future, you might lose the content of your drive, the secrecy of the objects you try to hide, or worse. You should be using safe_load(), or move away from using a library like PyYAML, which defaults to being unsafe.
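As a rough sketch of what switching to safe_load() could look like, assuming a simplified EventManager (the constructor signature and the listener data here are made up for illustration): the class registers itself on yaml.SafeLoader and yaml.SafeDumper through the yaml_loader and yaml_dumper class attributes, and the safe loader then refuses tags it has no constructor for.

```python
import yaml


class EventManager(yaml.YAMLObject):
    """Simplified stand-in for the question's class, registered as safe."""
    yaml_tag = "!EventManager"
    yaml_loader = yaml.SafeLoader  # make the tag known to safe_load()
    yaml_dumper = yaml.SafeDumper  # make the class known to safe_dump()

    def __init__(self, listeners=None):
        self.listeners = listeners or {}


text = yaml.safe_dump(EventManager({"on_start": ["handler_a"]}))
restored = yaml.safe_load(text)

# A document that asks the loader to call an arbitrary Python function.
malicious = "!!python/object/apply:os.system ['echo unsafe']"
try:
    yaml.safe_load(malicious)
    rejected = False
except yaml.YAMLError:
    # The safe loader has no constructor for this tag, so it raises.
    rejected = True
```

Only tags that were explicitly registered on the safe loader, such as !EventManager here, can produce objects; everything else is rejected instead of executed.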
Upvotes: 1