Pierre A

Reputation: 15

Is there a way to have a reference term in addition to a label with Doccano?

Hi, I would like to know whether something like the following example is possible in Doccano:

Say we have a sentence like this: "MS is an IT company". I want to label some words in it, for example MS (Microsoft). MS should be labelled as a Company (imagine I have an entity type named Company), but I also want to record that MS stands for Microsoft.

Is there a way to do that with Doccano?

Thanks

Upvotes: 0

Views: 114

Answers (1)

louis_guitton

Reputation: 5737

Doccano supports:

  • Sequence Labelling, good for Named Entity Recognition (NER)
  • Text Classification, good e.g. for Sentiment Analysis
  • Sequence To Sequence, good for Machine Translation

What you're describing sounds a little like Entity Linking. You can see from Doccano's roadmap in its docs that Entity Linking is part of the plans, but not yet available.

For now, I suggest framing this as a NER problem, with different entities for MS (Microsoft) and MS (other). If you have too many entities to choose from, the labelling could become complicated, but then you could break up the dataset into smaller, entity-focussed datasets. For example, you could take only the documents with MS in them and label each mention as one of the few candidate meanings.
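
As a rough sketch of that workaround: one option is to encode both pieces of information in the label name itself and split them apart after export. Everything here is an assumption for illustration, not Doccano's API: the "Type:Reference" naming convention, the helper names, and the record layout (a dict with the text and a list of [start, end, label] spans), which you would adapt to whatever your export actually contains.

```python
import re

# Hypothetical naming convention for labels: "<EntityType>:<Reference>",
# e.g. "Company:Microsoft". This is not a Doccano feature, just a convention.
SEPARATOR = ":"


def split_label(label):
    """Split 'Company:Microsoft' into ('Company', 'Microsoft').

    A label without a separator yields (label, None).
    """
    entity_type, _, reference = label.partition(SEPARATOR)
    return entity_type, reference or None


def resolve_spans(record):
    """Expand each [start, end, label] span into mention, type and reference."""
    resolved = []
    for start, end, label in record["label"]:
        entity_type, reference = split_label(label)
        resolved.append({
            "mention": record["text"][start:end],
            "type": entity_type,
            "reference": reference,
        })
    return resolved


def mentions(docs, term):
    """Keep only the documents that contain `term` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(term)}\b")
    return [doc for doc in docs if pattern.search(doc["text"])]


# Assumed record layout; adjust the keys to match your own export.
record = {
    "text": "MS is an IT company",
    "label": [[0, 2, "Company:Microsoft"]],
}
print(resolve_spans(record))
# [{'mention': 'MS', 'type': 'Company', 'reference': 'Microsoft'}]
```

The `mentions` helper covers the last suggestion: filter the corpus down to documents containing MS, then annotate that smaller set with only the handful of labels that make sense for it.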

Upvotes: 0
