Partha Bisoi

Reputation: 173

Configuring a separate models jar in Stanford NLP

I have implemented logic that uses Stanford NLP to extract the location from an English sentence. I was using the following jars: stanford-corenlp-3.2.0.jar and stanford-corenlp-3.2.0-models.jar.

The logic I wrote is as follows:

    public static edu.stanford.nlp.pipeline.StanfordCoreNLP snlp;

    /**
     * @see ServletContextListener#contextInitialized(ServletContextEvent)
     */
    public void contextInitialized(ServletContextEvent arg0) {
        Properties props = new Properties();
        props.put("annotators", "tokenize,ssplit,pos,lemma,parse,ner,dcoref");
        // Assign to the static field; declaring a local variable here would
        // shadow it and leave the shared field null.
        snlp = new StanfordCoreNLP(props);
    }

However, because of a case-sensitivity issue, I was advised to use stanford-corenlp-caseless-2015-04-20-models.jar instead of stanford-corenlp-3.2.0.jar. With the code above, the models jar that is loaded by default is stanford-corenlp-3.2.0-models.jar.

I now want to configure the pipeline to use the following models jar instead: stanford-corenlp-caseless-2015-04-20-models.jar. Please guide me on how to configure it in Java code.

I tried Gabor's solution, but I got the following exception:

SEVERE: Exception sending context initialized event to listener instance of class servlets.NLP_initializer
java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:493)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:260)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:127)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:123)
    at servlets.NLP_initializer.contextInitialized(NLP_initializer.java:34)
    at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4887)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5381)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
    at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1559)
    at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1549)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:749)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:283)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:247)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:78)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:62)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:491)
    ... 14 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:419)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:744)
    ... 19 more

Upvotes: 0

Views: 1714

Answers (1)

Gabor Angeli

Reputation: 5749

See http://nlp.stanford.edu/software/corenlp.shtml#caseless

Copying from the documentation:

It is possible to run StanfordCoreNLP with tagger, parser, and NER models that ignore capitalization. In order to do this, download the caseless models package. Be sure to include the path to the case insensitive models jar in the -cp classpath flag as well. Then, set properties which point to these models as follows:

-pos.model edu/stanford/nlp/models/pos-tagger/english-caseless-left3words-distsim.tagger

-parse.model edu/stanford/nlp/models/lexparser/englishPCFG.caseless.ser.gz

-ner.model edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.muc.7class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.conll.4class.caseless.distsim.crf.ser.gz

In your code, these paths can be set with:

    props.put("pos.model", "edu/stanford/nlp/models/pos-tagger/english-caseless-left3words-distsim.tagger");
    props.put("parse.model", "edu/stanford/nlp/models/lexparser/englishPCFG.caseless.ser.gz");
    props.put("ner.model", "edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.muc.7class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.conll.4class.caseless.distsim.crf.ser.gz");
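Putting it together, here is a rough sketch of how the properties might be assembled and sanity-checked before the pipeline is built. The class name `CaselessModelConfig` and the helper `missingModels` are hypothetical names for illustration, not part of CoreNLP. The check assumes the models are loaded as classpath resources, and it splits the `ner.model` value on whitespace or commas because I am not certain which separator a given CoreNLP version expects. The `StanfordCoreNLP` call is left as a comment so the sketch compiles without the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper class; only java.util is needed to compile it.
class CaselessModelConfig {

    static final String POS_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-caseless-left3words-distsim.tagger";
    static final String PARSE_MODEL =
        "edu/stanford/nlp/models/lexparser/englishPCFG.caseless.ser.gz";
    static final String NER_MODELS =
        "edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz "
        + "edu/stanford/nlp/models/ner/english.muc.7class.caseless.distsim.crf.ser.gz "
        + "edu/stanford/nlp/models/ner/english.conll.4class.caseless.distsim.crf.ser.gz";

    /** Builds the pipeline properties with the caseless model paths set. */
    static Properties caselessProperties() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,parse,ner,dcoref");
        props.setProperty("pos.model", POS_MODEL);
        props.setProperty("parse.model", PARSE_MODEL);
        props.setProperty("ner.model", NER_MODELS);
        return props;
    }

    /** Returns the configured model paths that are not visible on the classpath. */
    static List<String> missingModels(Properties props) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = CaselessModelConfig.class.getClassLoader();
        for (String key : new String[] {"pos.model", "parse.model", "ner.model"}) {
            String value = props.getProperty(key, "").trim();
            // Assumption: several models may be separated by whitespace or commas.
            for (String path : value.split("[,\\s]+")) {
                if (!path.isEmpty() && loader.getResource(path) == null) {
                    missing.add(path);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = caselessProperties();
        for (String path : missingModels(props)) {
            System.out.println("Not on classpath: " + path);
        }
        // With CoreNLP and the caseless models jar on the classpath:
        // StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    }
}
```

If `missingModels` reports a path, the jar containing that model is probably not on the classpath of the running application (for a servlet, typically `WEB-INF/lib`), which would be consistent with an "Unable to resolve ... as either class path, filename or URL" error.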

Upvotes: 2
