Vin Shahrdar

Reputation: 1231

Stanford.NLP for .NET not loading models

I am trying to run the sample code provided here for Stanford.NLP for .NET.

I installed the package via Nuget, downloaded the CoreNLP zip archive, and extracted stanford-corenlp-3.7.0-models.jar. After extracting, I located the "models" directory in stanford-corenlp-full-2016-10-31\edu\stanford\nlp\models.

Here is the code that I am trying to run:

    public static void Test1()
    {
        // Path to the folder with models extracted from `stanford-corenlp-3.6.0-models.jar`
        var jarRoot = @"..\..\..\stanford-corenlp-full-2016-10-31\edu\stanford\nlp\models\";

        // Text for processing
        var text = "Kosgi Santosh sent an email to Stanford University. He didn't get a reply.";

        // Annotation pipeline configuration
        var props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, ner,dcoref");
        props.setProperty("ner.useSUTime", "0");

        // We should change current directory, so StanfordCoreNLP could find all the model files automatically
        var curDir = Environment.CurrentDirectory;
        Directory.SetCurrentDirectory(jarRoot);
        var pipeline = new StanfordCoreNLP(props);
        Directory.SetCurrentDirectory(curDir);

        // Annotation
        var annotation = new Annotation(text);
        pipeline.annotate(annotation);

        // Result - Pretty Print
        using (var stream = new ByteArrayOutputStream())
        {
            pipeline.prettyPrint(annotation, new PrintWriter(stream));
            Console.WriteLine(stream.toString());
            stream.close();
        }
    }

I get the following error when I run the code:

A first chance exception of type 'java.lang.RuntimeException' occurred in stanford-corenlp-3.6.0.dll
An unhandled exception of type 'java.lang.RuntimeException' occurred in stanford-corenlp-3.6.0.dll
Additional information: edu.stanford.nlp.io.RuntimeIOException: Error while loading a tagger model (probably missing model file)

What am I doing wrong? I really want to get this working. :(

Upvotes: 3

Views: 3058

Answers (3)

Sergey Tihon

Reputation: 12913

Mikael Kristensen's answer is correct. The stanford-corenlp-full-*.zip archive contains the file stanford-corenlp-3.7.0-models.jar, which holds the models (a jar is itself a zip archive). In the Java world, you add this jar to the class path, and CoreNLP automatically resolves the models' locations inside the archive.

CoreNLP has a file, DefaultPaths.java, that specifies the default path to each model file. So when you instantiate StanfordCoreNLP with a Properties object that does not specify model locations, you must make sure the models can be found at their default paths (relative to Environment.CurrentDirectory).

The simplest way to guarantee that files exist at a path like Environment.CurrentDirectory + "edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz" is to unzip the jar archive to a folder and temporarily change the current directory to that unzipped folder.

var jarRoot = "nlp.stanford.edu/stanford-corenlp-full-2016-10-31/jar-modules/";
...
var curDir = Environment.CurrentDirectory;
Directory.SetCurrentDirectory(jarRoot);
var pipeline = new StanfordCoreNLP(props);
Directory.SetCurrentDirectory(curDir);

The other way is to specify the paths to all models that your pipeline needs (which ones depends on the list of annotators). This option is more complicated because you have to find the correct property keys and specify a path for every model used. But it may be useful if you want to minimize the size of your deployment package.

var props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, depparse");
props.put("ner.model",
          "edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz");
props.put("ner.applyNumericClassifiers", "false");
var pipeline = new StanfordCoreNLP(props);

Upvotes: 5

Mikael Kristensen

Reputation: 260

I think you have gotten the path for the model wrong. It should point to the jar root folder.

Try this path instead:

var jarRoot = @"..\..\..\stanford-corenlp-full-2016-10-31";
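
A quick way to tell whether a candidate jarRoot is the right folder is to check that a model file exists at its default relative path underneath it. The following is only a minimal sketch: the class and method names are hypothetical, and the tagger path is an assumption based on the layout of the English models jar, so adjust it to whatever you actually see after unzipping.

```csharp
using System;
using System.IO;

public static class ModelPathCheck
{
    // ASSUMPTION: relative location of the default English POS tagger
    // inside the unzipped models jar. Verify against your own extraction.
    public const string DefaultTaggerPath =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    // Returns true if the tagger file is found under jarRoot at the
    // assumed default relative path, i.e. jarRoot is the folder that
    // directly contains the "edu" directory.
    public static bool LooksLikeJarRoot(string jarRoot)
    {
        var relative = DefaultTaggerPath.Replace('/', Path.DirectorySeparatorChar);
        var fullPath = Path.GetFullPath(Path.Combine(jarRoot, relative));
        return File.Exists(fullPath);
    }
}
```

If ModelPathCheck.LooksLikeJarRoot(jarRoot) returns false before you call Directory.SetCurrentDirectory(jarRoot), the "missing model file" error is likely to follow, because jarRoot points too deep (for example, at the models folder itself) or at a folder where the jar was never unzipped.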

Upvotes: 2

Damien974

Reputation: 21

I had the same problem. To fix it, use stanford-corenlp-3.6.0-models.jar instead. The version of the NuGet package must exactly match the version of the CoreNLP library (currently 3.6.0).

Upvotes: 2
