Reputation: 125
Is there a way to load a specific classifier in StanfordCoreNLP? Three classifiers are loaded by default, and the third one does not reliably return the NER tag, which causes inconsistent results in my app. I would like to know whether loading just english.all.3class is good enough for basic named entity tagging, and what the relevance of the other two models in the following list is.
edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz
edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz
edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz
Upvotes: 0
Views: 1591
Reputation: 8739
Yes, you should be fine if you specify the model's path inside the models jar (for example, edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz).
Basically, you can set "ner.model" to a comma-separated list of the serialized CRFs you wish to use. If you want to exclude one of them, supplying just the two models you want will work fine.
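As a minimal sketch of that setting, the snippet below builds the "ner.model" value from the classpath paths listed in the question. The class name NerModelConfig and the helper nerProperties are hypothetical names used only for illustration. The snippet just builds a java.util.Properties object; creating the pipeline itself needs the CoreNLP and models jars on the classpath, so that step is left as a comment.

```java
import java.util.Properties;

// Hypothetical helper class for illustration; it is not part of CoreNLP.
class NerModelConfig {
    // Classpath locations inside the CoreNLP models jar, as listed in the question.
    static final String ALL_3CLASS =
        "edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz";
    static final String MUC_7CLASS =
        "edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz";

    // Builds pipeline properties with "ner.model" set to a comma-separated list of models.
    static Properties nerProperties(String... modelPaths) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
        props.setProperty("ner.model", String.join(",", modelPaths));
        return props;
    }

    public static void main(String[] args) {
        // Keep two of the three default models and leave out the third.
        Properties props = nerProperties(ALL_3CLASS, MUC_7CLASS);
        System.out.println(props.getProperty("ner.model"));
        // With the CoreNLP jars on the classpath, the pipeline would then be created with:
        // StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    }
}
```

Passing a single path, such as nerProperties(NerModelConfig.ALL_3CLASS), gives the one-model configuration asked about in the question.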
And to provide some clarity, the three models have been trained on different data sets.
The all.3class model is trained on 7 data sources that are tagged with (person, organization, location, none).
The muc.7class model is trained on data from the MUC-7 Named Entity Task, and includes (date, location, money, organization, percent, person, time). More info:
https://catalog.ldc.upenn.edu/LDC2001T02
http://www-nlpir.nist.gov/related_projects/muc/proceedings/ne_task.html
The conll.4class model is trained on data from the CoNLL 2003 NER corpus, and includes (person, organization, location, misc).
http://www.cnts.ua.ac.be/conll2003/ner/
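To decide whether one model is enough, it can help to compare the entity types you need against the label sets above. The sketch below is a hypothetical helper (NerLabelCoverage and modelsCovering are made-up names, not CoreNLP APIs) that hard-codes the label lists from this answer, written in lower case as they appear here and leaving out the "none" background label, and reports which models cover a given set of labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of CoreNLP: label sets copied from the answer above.
class NerLabelCoverage {
    static final Map<String, Set<String>> LABELS = new LinkedHashMap<>();
    static {
        LABELS.put("english.all.3class",
            labels("person", "organization", "location"));
        LABELS.put("english.muc.7class",
            labels("date", "location", "money", "organization", "percent", "person", "time"));
        LABELS.put("english.conll.4class",
            labels("person", "organization", "location", "misc"));
    }

    static Set<String> labels(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    // Returns the models (in insertion order) whose label set contains every needed label.
    static List<String> modelsCovering(Set<String> needed) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : LABELS.entrySet()) {
            if (entry.getValue().containsAll(needed)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // All three models cover person and location; only muc.7class covers date.
        System.out.println(modelsCovering(labels("person", "location")));
        System.out.println(modelsCovering(labels("date")));
    }
}
```

If everything you need is person, organization, or location, the 3-class model appears in the result, which matches the point that it is sufficient for basic tagging.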
Upvotes: 0
Reputation: 125
I found the answer after some research: a specific model can be loaded by setting ner.model. What I am still wondering is whether we can refer to the model already packaged in the StanfordCoreNLP library jar, instead of keeping a duplicate copy of the model in the project working directory for this purpose.
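import java.util.Properties;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

// Load only the 3-class model (here a copy kept in the project working directory).
Properties configuration = new Properties();
configuration.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
configuration.setProperty("ner.model", "english.all.3class.distsim.crf.ser.gz");
StanfordCoreNLP coreNLP = new StanfordCoreNLP(configuration);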
Upvotes: 4