Shrinidhi Narasimhan

Reputation: 143

BILOU Tagging scheme for multi-word entities in Spacy's NER

I am building a custom NER model with spaCy to recognize new entity types beyond those covered by spaCy's built-in NER. My training data needs to be tagged and added using spacy.Example, and I am using the BILOU scheme. My question is about entities that span more than three words. For example:

Housing Development Finance Corporation reported heavy losses in the past quarter.

I want to tag Housing Development Finance Corporation as a single entity using the BILOU scheme, something like:

'Housing'     B-Entity
'Development' I-Entity
'Finance'     I-Entity
'Corporation' L-Entity

Is this tagging correct? How will the model interpret the order within each entity? Any guidance would be much appreciated.

Upvotes: 2

Views: 468

Answers (1)

Albin Sidås

Reputation: 365

Your tagging is correct. Every token outside an entity is marked with O, and a single-token entity would get U instead of B/I/L.
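As a rough sketch of how the full sentence would be labeled, the snippet below assigns BILOU tags from token-index spans. The function name `bilou_tags` and the `(start, end, label)` span format are made up for this illustration and are not part of spaCy; if you have character offsets instead, spaCy v3 has a helper, `spacy.training.offsets_to_biluo_tags`, for that case.

```python
def bilou_tags(tokens, entities):
    """Return one BILOU tag per token.

    tokens:   list of token strings
    entities: list of (start, end, label) token indices, end exclusive
              (hypothetical format chosen for this sketch)
    """
    tags = ["O"] * len(tokens)  # everything outside an entity is O
    for start, end, label in entities:
        if end - start == 1:
            tags[start] = f"U-{label}"  # single-token entity
        else:
            tags[start] = f"B-{label}"  # first token
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"  # inner tokens, any number of them
            tags[end - 1] = f"L-{label}"  # last token
    return tags


sentence = ("Housing Development Finance Corporation reported "
            "heavy losses in the past quarter .")
tokens = sentence.split()
tags = bilou_tags(tokens, [(0, 4, "Entity")])

for token, tag in zip(tokens, tags):
    print(f"{token:<12} {tag}")
```

An entity of any length follows the same pattern: one B, as many I tags as needed, and one L.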

Word order within an entity matters when matching it against a previously seen entity of the same name. For example:

'Housing'     B-Entity
'Development' I-Entity
'Finance'     I-Entity
'Corporation' L-Entity

and

'Housing'     B-Entity
'Finance'     I-Entity
'Development' I-Entity
'Corporation' L-Entity

will not be linked as the same entity. If you do want them linked, you could look into a classification model that maps the entities you find to your previously known entities, and work from there.
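To make the difference concrete, here is a minimal sketch that decodes BILOU tags back into entity strings and compares the two orderings. The helpers `decode_bilou` and `order_insensitive_key` are hypothetical names for this example. The sorted-token key is only a crude stand-in for the classification step suggested above, not a classifier and not something spaCy does for you.

```python
def decode_bilou(tokens, tags):
    """Collect (entity_text, label) pairs from parallel token/tag lists."""
    entities, current = [], []
    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "U":
            entities.append((token, label))
            current = []
        elif prefix == "B":
            current = [token]  # start a new multi-token entity
        elif prefix == "I" and current:
            current.append(token)
        elif prefix == "L" and current:
            current.append(token)
            entities.append((" ".join(current), label))
            current = []
        else:
            current = []  # O tag or malformed sequence: drop partial entity
    return entities


def order_insensitive_key(text):
    """Assumed normalization: lowercase tokens, ignoring their order."""
    return tuple(sorted(text.lower().split()))


tags = ["B-Entity", "I-Entity", "I-Entity", "L-Entity"]
first = decode_bilou(["Housing", "Development", "Finance", "Corporation"], tags)
second = decode_bilou(["Housing", "Finance", "Development", "Corporation"], tags)

# The decoded strings differ, so a plain string comparison does not link them.
print(first[0][0] == second[0][0])
# Ignoring token order makes the two mentions share a key.
print(order_insensitive_key(first[0][0]) == order_insensitive_key(second[0][0]))
```

A key like this will also merge names that genuinely differ only by word order, so treat it as a starting point before moving to a proper classifier or entity linker.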

Upvotes: 1
