Reputation: 3054
The spaCy documentation shows how to update an NER model with additional training examples, but it trains on entity character offsets. How can I do the same using the BILUO scheme? My training examples contain, for each sentence, the list of tokens and their respective BILUO tags.
Upvotes: 0
Views: 475
Reputation: 10139
Thanks for your question. From the documentation:
The spacy.gold module also exposes two helper functions to convert offsets to BILUO tags, and BILUO tags to entity offsets.
So you can convert your BILUO tags to entity offsets like this:
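import spacy
from spacy.gold import offsets_from_biluo_tags  # spaCy v2.x module

nlp = spacy.blank('en')  # assumed here; any loaded pipeline works, only the tokenizer is used
doc = nlp('I like London.')
tags = ['O', 'O', 'U-LOC', 'O']  # one tag per token in doc
entities = offsets_from_biluo_tags(doc, tags)  # [(7, 13, 'LOC')]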
For each sentence, pair the text with its entities as (text, {'entities': entities}) to build the TRAIN_DATA list, then proceed with the training code from the documentation.
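Since your data is already tokenized, here is a rough sketch of building TRAIN_DATA straight from token and tag lists. It is plain Python rather than spaCy's own helper, and it assumes each sentence's text is simply its tokens joined by single spaces. The name `biluo_to_offsets` and the sample sentences are made up for illustration.

```python
def biluo_to_offsets(tokens, tags):
    """Convert per-token BILUO tags to (start_char, end_char, label) tuples.

    Assumption: the sentence text is the tokens joined by single spaces.
    """
    entities = []
    position = 0   # character offset of the current token
    start = None   # start offset of the entity currently being built
    for token, tag in zip(tokens, tags):
        end = position + len(token)
        prefix, _, label = tag.partition("-")
        if prefix == "U":                      # single-token entity
            entities.append((position, end, label))
        elif prefix == "B":                    # first token of an entity
            start = position
        elif prefix == "L" and start is not None:  # last token of an entity
            entities.append((start, end, label))
            start = None
        # "I" (inside) and "O" (outside) need no action
        position = end + 1                     # skip the joining space
    return entities


# Hypothetical sample data: (tokens, BILUO tags) per sentence
SENTENCES = [
    (["I", "like", "London", "."], ["O", "O", "U-LOC", "O"]),
    (["She", "moved", "to", "New", "York", "City"],
     ["O", "O", "O", "B-LOC", "I-LOC", "L-LOC"]),
]

# Same shape as the TRAIN_DATA used in the spaCy v2 training examples
TRAIN_DATA = [
    (" ".join(tokens), {"entities": biluo_to_offsets(tokens, tags)})
    for tokens, tags in SENTENCES
]
```

If you would rather rely on spaCy itself, you should be able to build the doc from your own tokens with Doc(nlp.vocab, words=tokens) and pass it to offsets_from_biluo_tags, so the tags line up with your tokenization instead of the tokenizer's.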
Hope it helps :)
Upvotes: 0