Reputation: 232
The latest spaCy documentation gives the following example at this link:
https://spacy.io/usage/embeddings-transformers
import spacy
from spacy.tokens import Doc
from spacy_transformers import TransformerData
from thinc.api import set_gpu_allocator, require_gpu
def custom_annotation_setter(docs, trf_data):
    doc_data = list(trf_data.doc_data)
    for doc, data in zip(docs, doc_data):
        doc._.custom_attr = data
nlp = spacy.load("en_core_web_trf")
nlp.get_pipe("transformer").set_extra_annotations = custom_annotation_setter
doc = nlp("This is a text")
assert isinstance(doc._.custom_attr, TransformerData)
print(doc._.custom_attr.tensors)
This code throws an exception when it tries to process the test text:
AttributeError: [E047] Can't assign a value to unregistered extension attribute 'custom_attr'. Did you forget to call the set_extension method?
I set the extension using:
Doc.set_extension('custom_attr', default=True)
My question is: should the Transformer component register this custom extension itself (as the example code implies), or is this just a bug in the example?
Upvotes: 1
Views: 2205
Reputation: 15623
Your code runs without error for me if I set the extension before your function definition.
import spacy
from spacy.tokens import Doc
from spacy_transformers import TransformerData
from thinc.api import set_gpu_allocator, require_gpu
Doc.set_extension('custom_attr', default=True)
def custom_annotation_setter(docs, trf_data):
    doc_data = list(trf_data.doc_data)
    for doc, data in zip(docs, doc_data):
        doc._.custom_attr = data
nlp = spacy.load("en_core_web_trf")
nlp.get_pipe("transformer").set_extra_annotations = custom_annotation_setter
doc = nlp("This is a text")
assert isinstance(doc._.custom_attr, TransformerData)
print(doc._.custom_attr.tensors)
Maybe you called set_extension in the wrong place?
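To isolate the registration issue from the transformer pipeline, here is a minimal sketch that needs no model download. It assumes that custom attributes always have to be registered by user code before anything assigns to them, which is how I understand spaCy's extension mechanism. The blank pipeline, the dictionary value, and the name unregistered_attr are placeholders of mine and are not part of the documentation example.

```python
import spacy
from spacy.tokens import Doc

# Register the extension once, before any Doc is assigned to.
# The has_extension guard avoids an error when this code is re-run
# (for example in a notebook), since registering the same name twice
# without force=True raises.
if not Doc.has_extension("custom_attr"):
    Doc.set_extension("custom_attr", default=None)

# Blank English pipeline, used here only to avoid downloading a model.
nlp = spacy.blank("en")
doc = nlp("This is a text")

# Assignment succeeds because the attribute is registered.
doc._.custom_attr = {"example": 1}
print(doc._.custom_attr)

# Assigning to a name that was never registered raises AttributeError (E047).
try:
    doc._.unregistered_attr = 1  # hypothetical, deliberately unregistered
except AttributeError as err:
    print(err)
```

The same ordering applies to the transformer example: the set_extension call only has to run before nlp(...) invokes the annotation setter, so placing it right after the imports is the simplest choice.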
Upvotes: 1