Will Roberts

Reputation: 37

Stanford CoreNLP Classifier: NER training context

In Stanford's CoreNLP Classifier, all of the training examples I have seen include context words, labeled O, that one does not want recognized as entities. For example, below "certain" and "before" are not recognized as Assets:

certain 	O	O
Apple   	ASSET	ASSET
products	ASSET	ASSET
macOS   	ASSET	ASSET
before  	O	O
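For reference, a tab-separated training file like the one above is normally paired with a properties file when training CRFClassifier. A minimal sketch, where the file names asset-train.tsv and asset-model.ser.gz are hypothetical and the feature flags follow the defaults suggested in the Stanford NER FAQ (the map line tells the trainer that column 0 is the word and column 1 is the gold label):

```
# hypothetical file names; adjust to your own paths
trainFile = asset-train.tsv
serializeTo = asset-model.ser.gz
map = word=0,answer=1

# feature flags commonly suggested for CRFClassifier training
useClassFeature = true
useWord = true
useNGrams = true
noMidNGrams = true
maxNGramLeng = 6
usePrev = true
useNext = true
useSequences = true
usePrevSequences = true
maxLeft = 1
useTypeSeqs = true
useTypeSeqs2 = true
useTypeySequences = true
wordShape = chris2useLC
useDisjunctive = true
```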

1) Do I need words that provide context like "certain" and "before"?

2) Does order matter? Could I, rather than the order "certain, Apple, products, macOS, before" do "before, certain, Apple, macOS, products"?

3) If context is necessary, once I have added enough training data, could I just add more Assets without context?

Upvotes: 0

Views: 60

Answers (1)

sophros

Reputation: 16660

Ad 1. Context is helpful if your classification is context-dependent.

Ad 2. Yes, order matters. Under the hood, the Stanford CoreNLP Classifier uses a CRF (Conditional Random Field), a sequence model that takes the order of words into account when classifying.

Ad 3. See pt. 1 - necessity depends on your problem and your data. You could reuse previous contexts and check whether that improves or degrades classification accuracy.
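To make pt. 2 concrete: CRFClassifier computes its features in Java, but the general idea of a CRF feature extractor can be sketched in a few lines of Python. This toy function (names and exact features are illustrative, not Stanford's actual feature set) builds features for each token from its neighbors, so reordering the sentence changes the features the model sees:

```python
def token_features(tokens, i):
    """Toy CRF-style features for token i: the word plus its neighbors.

    Because 'prev' and 'next' depend on position, shuffling the
    sentence changes the features - which is why word order matters
    to a sequence model like a CRF.
    """
    feats = {"word": tokens[i].lower()}
    feats["prev"] = tokens[i - 1].lower() if i > 0 else "<START>"
    feats["next"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "<END>"
    return feats

sent = ["certain", "Apple", "products", "macOS", "before"]
print(token_features(sent, 1))
# {'word': 'apple', 'prev': 'certain', 'next': 'products'}
```

Swapping the sentence to "before, certain, Apple, macOS, products" would give "Apple" different prev/next features, so the two orderings are genuinely different training signals.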

Upvotes: 1
