Reputation: 2845
I need to do some NER. I've found the DeepPavlov library, which does this.
Here is an example from the docs:
from deeppavlov import configs, build_model
ner_model = build_model(configs.ner.ner_ontonotes, download=True)
ner_model(['Bob Ross lived in Florida'])
>>> [[['Bob', 'Ross', 'lived', 'in', 'Florida']], [['B-PERSON', 'I-PERSON', 'O', 'O', 'B-GPE']]]
I don't understand what all those tags mean. As I understood from the documentation, they are in the CoNLL 2003 NER Task format.
Can somebody point me to a list of the possible tags, with descriptions, for the CoNLL 2003 NER Task?
Upvotes: 3
Views: 1412
Reputation: 2704
For the NER task there are some common types of entities used as tags, for example persons (PERSON), geopolitical entities (GPE), and many others.
Furthermore, to distinguish adjacent entities with the same tag, many applications use the BIO tagging scheme. Here B denotes the beginning of an entity, I stands for "inside" and is used for all words comprising the entity except the first one, and O ("outside") means the token does not belong to any entity.
So in the example above, B-PERSON means that a person name begins with the token Bob, and the next tag, I-PERSON, says that Ross belongs to the same entity as the previous token. Then comes O, which means that lived doesn't belong to any entity; the same goes for in. Finally, B-GPE marks Florida as the beginning of a geopolitical entity (GPE).
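To make the scheme concrete, here is a minimal sketch of how BIO tags can be grouped back into whole entities. It is plain Python and does not use DeepPavlov itself; the helper name bio_to_entities is hypothetical, and treating a stray I- tag as the start of a new entity is an assumption of this sketch, not something the library prescribes.

```python
def bio_to_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs."""
    entities = []
    current_tokens, current_type = [], None

    for token, tag in zip(tokens, tags):
        # "B-PERSON" -> ("B", "-", "PERSON"); "O" -> ("O", "", "")
        prefix, _, label = tag.partition("-")

        if prefix == "B" or (prefix == "I" and label != current_type):
            # A new entity starts here. Assumption: an I- tag that does not
            # continue the current entity is treated like a B- tag.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [token], label
        elif prefix == "I":
            # Continuation of the entity opened by the preceding B- tag.
            current_tokens.append(token)
        else:
            # "O": the token is outside any entity, so close the open one.
            if current_tokens:
                entities.append((" ".join(current_tokens), current_type))
            current_tokens, current_type = [], None

    # Flush an entity that runs to the end of the sentence.
    if current_tokens:
        entities.append((" ".join(current_tokens), current_type))
    return entities


tokens = ['Bob', 'Ross', 'lived', 'in', 'Florida']
tags = ['B-PERSON', 'I-PERSON', 'O', 'O', 'B-GPE']
entities = bio_to_entities(tokens, tags)
print(entities)  # [('Bob Ross', 'PERSON'), ('Florida', 'GPE')]
```

The B prefix is what keeps two neighbouring entities of the same type apart: with tags ['B-GPE', 'B-GPE'] the function returns two separate GPE entities rather than one merged span.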
Please let me know if this was helpful enough.
Upvotes: 5