Reputation: 15
Hi, I would like to know whether something like the following is possible in Doccano.
Let's say we have a sentence like this: "MS is an IT company". I want to label some words in this sentence, for example MS (Microsoft). MS should be labelled as a Company (imagine that I have an entity named Company), but I also want to record that MS stands for Microsoft.
Is there a way to do that with Doccano?
Thanks
Upvotes: 0
Views: 114
Reputation: 5737
Doccano supports:
Sequence Labelling (good for Named Entity Recognition, NER)
Text Classification (good e.g. for Sentiment Analysis)
Sequence To Sequence (good for Machine Translation)
What you're describing sounds a little like Entity Linking. You can see from Doccano's roadmap in its docs that Entity Linking is part of the plans, but not yet available.
For now, I suggest framing this as an NER problem, with a separate entity label for each meaning, such as MS (Microsoft) and MS (other). If you have too many entities to choose from, the labelling could become complicated, but then you could break up the dataset into smaller entity-focussed datasets. For example, you could select only the documents with MS in them and label each mention as one of the few synonyms.
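As a rough illustration of that last step, here is a minimal Python sketch that keeps only the documents mentioning MS as a whole word and writes them out as JSON Lines. The helper names (mentions, filter_documents, to_jsonl) and the sample sentences are made up for this example, and the one-object-per-line layout with a "text" field is an assumption about what your Doccano version accepts on import, so check its import documentation first.

```python
import json
import re


def mentions(text, term="MS"):
    """Return (start, end) character offsets of whole-word matches of term."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    return [(m.start(), m.end()) for m in pattern.finditer(text)]


def filter_documents(docs, term="MS"):
    """Keep only the documents that mention term at least once."""
    return [d for d in docs if mentions(d, term)]


def to_jsonl(docs):
    """Serialize documents as one JSON object per line with a "text" field.

    The "text" key is an assumption about the import format.
    """
    return "\n".join(json.dumps({"text": d}) for d in docs)


# Made-up sample documents for illustration only.
docs = [
    "MS is an IT company",
    "The HTMS protocol is unrelated",
    "She was diagnosed with MS last year",
]

subset = filter_documents(docs)
print(to_jsonl(subset))
```

The word-boundary match skips strings where MS is only part of a longer token, and the offsets returned by mentions could help you spot each mention quickly when assigning MS (Microsoft) or MS (other) in the smaller dataset.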
Upvotes: 0