Reputation: 3535
I want to store an instance of my own class in a spacy.Doc and save it with doc.to_disk, as follows:
from spacy.tokens import Doc
from spacy.vocab import Vocab
from dataclasses import dataclass
@dataclass
class Foo:
    a: int

doc = Doc(Vocab(), [])
doc.user_data["foo"] = Foo(1)
doc.to_disk("/tmp/fooo")
But this code raises an error:
TypeError: can not serialize 'Foo' object
What should I do?
Upvotes: 1
Views: 348
Reputation: 1181
Per this thread, you could try the following workaround, which clears the custom data before the Doc is serialized:
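def remove_unserializable_results(doc):
    # Drop everything stored in user_data so nothing unserializable remains.
    doc.user_data = {}
    # Reset custom extension attributes on the Doc, skipping the accessor methods.
    for x in dir(doc._):
        if x in ['get', 'set', 'has']:
            continue
        setattr(doc._, x, None)
    # Do the same for every token.
    for token in doc:
        for x in dir(token._):
            if x in ['get', 'set', 'has']:
                continue
            setattr(token._, x, None)
    return doc

# nlp is assumed to be an already loaded spaCy pipeline.
# Passing the function directly is the spaCy v2 form of add_pipe.
nlp.add_pipe(remove_unserializable_results, last=True)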
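Note that the workaround above discards the data rather than saving it. If the goal is to keep the contents of Foo, one alternative is to store a plain dict instead of the dataclass instance, since the serializer behind to_disk handles basic types such as dicts, lists, strings and numbers. The following is only a minimal sketch under that assumption: the helper names foo_to_user_data and foo_from_user_data are hypothetical, and the spaCy calls are left as comments so the snippet runs without spaCy installed.

```python
from dataclasses import dataclass, asdict


@dataclass
class Foo:
    a: int


def foo_to_user_data(foo: Foo) -> dict:
    """Hypothetical helper: turn a Foo into a plain dict of basic types."""
    return asdict(foo)


def foo_from_user_data(data: dict) -> Foo:
    """Hypothetical helper: rebuild a Foo from the stored dict."""
    return Foo(**data)


payload = foo_to_user_data(Foo(1))

# With spaCy available, the dict would be stored and saved like this:
#   doc = Doc(Vocab(), [])
#   doc.user_data["foo"] = payload
#   doc.to_disk("/tmp/fooo")
# and read back with:
#   loaded = Doc(Vocab()).from_disk("/tmp/fooo")
#   payload = loaded.user_data["foo"]

restored = foo_from_user_data(payload)
```

The conversion has to be done by hand in both directions, so this only suits classes whose fields are themselves basic types or nested dataclasses.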
Upvotes: 1