rominf

Reputation: 2845

What is the list of possible tags with a description of CoNLL 2003 NER Task?

I need to do some NER. I've found the DeepPavlov library, which does this.

Here is an example from docs:

from deeppavlov import configs, build_model

ner_model = build_model(configs.ner.ner_ontonotes, download=True)
ner_model(['Bob Ross lived in Florida'])
# Output: [[['Bob', 'Ross', 'lived', 'in', 'Florida']], [['B-PERSON', 'I-PERSON', 'O', 'O', 'B-GPE']]]

I don't understand what all those tags mean. As I understand from the documentation, they are in the CoNLL 2003 NER Task format.

Can somebody point me to the list of possible tags, with descriptions, for the CoNLL 2003 NER Task?

Upvotes: 3

Views: 1412

Answers (1)

com

Reputation: 2704

For the NER task, there are some common entity types used as tags:

  • persons (PER)
  • organizations (ORG)
  • monetary values (MONEY)
  • geopolitical entities, i.e. countries, cities, states (GPE)

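and many others. The exact set depends on the dataset the model was trained on: CoNLL 2003 itself defines only four entity types (PER, LOC, ORG and MISC), whereas tags such as PERSON and GPE in the example above come from the OntoNotes tag set, which is what the ner_ontonotes config uses.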

Furthermore, to distinguish adjacent entities with the same tag, many applications use the BIO tagging scheme. Here B denotes the beginning of an entity, I stands for "inside" and is used for every word of the entity except the first one, and O means the word is outside any entity.

So in the example above, B-PERSON means that a person name begins with the token Bob, and the next tag, I-PERSON, says that Ross belongs to the same entity as the previous token. Then comes O, which means that lived doesn't belong to any entity; the same goes for in. Finally, Florida is the beginning of a geopolitical entity (GPE).
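To make the scheme concrete, here is a minimal sketch of how BIO tags could be grouped back into whole entities. The function name bio_to_entities is made up for illustration and is not part of DeepPavlov. It assumes tags look like B-TYPE, I-TYPE or O, and it treats an I- tag that does not continue the current entity as the start of a new one.

```python
def bio_to_entities(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, entity_type) pairs.

    Hypothetical helper for illustration; not a DeepPavlov function.
    """
    entities = []
    current_tokens, current_type = [], None

    def flush():
        # Store the entity collected so far, if there is one.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        # "B-PERSON" -> ("B", "-", "PERSON"); "O" -> ("O", "", "")
        prefix, _, entity_type = tag.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A new entity starts here (a stray I- tag is treated like B-).
            flush()
            current_tokens, current_type = [token], entity_type
        elif prefix == "I":
            # Continuation of the current entity.
            current_tokens.append(token)
        else:
            # "O": the token is outside any entity.
            flush()
            current_tokens, current_type = [], None

    flush()
    return entities


tokens = ['Bob', 'Ross', 'lived', 'in', 'Florida']
tags = ['B-PERSON', 'I-PERSON', 'O', 'O', 'B-GPE']
print(bio_to_entities(tokens, tags))
```

For the sentence from the question this yields one PERSON entity, "Bob Ross", and one GPE entity, "Florida". Two adjacent B- tags of the same type come out as two separate entities, which is exactly the case the B prefix exists for.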

Please let me know if this was helpful enough.

Upvotes: 5
