Reputation: 875
I'm trying to deploy stanford-corenlp-3.2.0-models.jar, but my host says the jar is too big.
If I just want to use the POS tagger, which jar can I use instead?
Or how can I split the jar?
Upvotes: 0
Views: 419
Reputation: 781
You can limit the pipeline to the annotators you need by setting the annotators property in a Properties object, as below:
Properties props1 = new Properties();
props1.put("annotators", "tokenize, cleanxml,ssplit, pos");
Sample Java code:
package parserOnly;

import java.io.*;
import java.util.*;

import edu.stanford.nlp.io.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.semgraph.SemanticGraph;
import edu.stanford.nlp.semgraph.SemanticGraphCoreAnnotations;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.util.*;

public class ParserOnly {

    public static void main(String[] args) throws IOException {
        // args[0]: input file, args[1]: text output file, args[2]: XML output file (all optional)
        PrintWriter out;
        if (args.length > 1) {
            out = new PrintWriter(args[1]);
        } else {
            out = new PrintWriter(System.out);
        }
        PrintWriter xmlOut = null;
        if (args.length > 2) {
            xmlOut = new PrintWriter(args[2]);
        }

        // Run only tokenization, XML cleanup, sentence splitting and POS tagging.
        Properties props1 = new Properties();
        props1.put("annotators", "tokenize, cleanxml, ssplit, pos");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props1);

        Annotation annotation;
        if (args.length > 0) {
            annotation = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
        } else {
            annotation = new Annotation("Kosgi Santosh sent an email to Stanford University. He didn't get a reply.");
        }

        pipeline.annotate(annotation);
        pipeline.prettyPrint(annotation, out);
        if (xmlOut != null) {
            pipeline.xmlPrint(annotation, xmlOut);
        }

        // An Annotation is a Map and you can get and use the various analyses individually.
        // For instance, this gets the tokens of the first sentence in the text.
        out.println();
        // The toString() method on an Annotation just prints the text of the Annotation
        // But you can see what is in it with other methods like toShorterString()
        out.println("The top level annotation");
        out.println(annotation.toShorterString());

        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        if (sentences != null && sentences.size() > 0) {
            ArrayCoreMap sentence = (ArrayCoreMap) sentences.get(0);
            out.println("The first sentence is:");
            out.println(sentence.toShorterString());
            // The parse tree is not available because the parse annotator is not in the pipeline.
            // Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
            out.println();
            out.println("The first sentence tokens are:");
            for (CoreMap token : sentence.get(CoreAnnotations.TokensAnnotation.class)) {
                ArrayCoreMap aToken = (ArrayCoreMap) token;
                out.println(aToken.toShorterString());
            }
            /* out.println("The first sentence parse tree is:");
            tree.pennPrint(out);
            out.println("The first sentence basic dependencies are:");
            System.out.println(sentence.get(SemanticGraphCoreAnnotations.BasicDependenciesAnnotation.class).toString("plain"));
            out.println("The first sentence collapsed, CC-processed dependencies are:");
            SemanticGraph graph = sentence.get(SemanticGraphCoreAnnotations.CollapsedCCProcessedDependenciesAnnotation.class);
            System.out.println(graph.toString("plain"));*/
        }

        // A PrintWriter created without auto-flush buffers its output, so flush before exiting.
        out.flush();
    }
}
Upvotes: 0
Reputation: 197
If you just need the POS tagger, you can download the standalone Stanford POS Tagger, which is much lighter (about 35 MB), from here: http://nlp.stanford.edu/software/tagger.shtml
Upvotes: 1
Reputation: 9450
You just need to read up on how to use the jar command. A jar file is just a variant of a zip file. You can expand its contents with jar -xf stanford-corenlp-3.2.0-models.jar, pick out what you need, and then put it into a new, smaller jar file.
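If you would rather not unpack and repack by hand, the same idea can be sketched in plain Java with java.util.jar: copy only the entries under one path prefix into a new jar. This is a minimal sketch, not the official way to slim the models jar. The class name JarSlimmer, the method copyEntries and the prefix in the usage line are made up for illustration; list the real entry names with jar -tf stanford-corenlp-3.2.0-models.jar before choosing a prefix.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper: writes a smaller jar that holds only the entries under a given prefix.
public class JarSlimmer {

    /** Copies every non-directory entry whose name starts with prefix; returns how many were copied. */
    public static int copyEntries(Path source, Path target, String prefix) throws IOException {
        int copied = 0;
        try (JarFile in = new JarFile(source.toFile());
             OutputStream fileOut = Files.newOutputStream(target);
             JarOutputStream out = new JarOutputStream(fileOut)) {
            byte[] buffer = new byte[8192];
            Enumeration<JarEntry> entries = in.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().startsWith(prefix)) {
                    continue; // skip everything outside the prefix
                }
                // Use a fresh entry so size and CRC are recomputed for the new archive.
                out.putNextEntry(new JarEntry(entry.getName()));
                try (InputStream data = in.getInputStream(entry)) {
                    int n;
                    while ((n = data.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                }
                out.closeEntry();
                copied++;
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: java JarSlimmer <source.jar> <target.jar> <entry-prefix>");
            return;
        }
        int copied = copyEntries(Paths.get(args[0]), Paths.get(args[1]), args[2]);
        System.out.println("Copied " + copied + " entries");
    }
}
```

Usage would look something like java JarSlimmer stanford-corenlp-3.2.0-models.jar pos-models.jar edu/stanford/nlp/models/pos-tagger/ where the prefix is an assumption about how the models jar is laid out. The sketch does not copy the manifest, which should not matter for a jar that only carries model files.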
Upvotes: 1