thetna

Reputation: 7143

Stanford CoreNLP: Splitting sentences from text

I am new to Stanford CoreNLP. I would like to use it to split text into sentences in English, German, and French. Which class does this? Thanks in advance.

Upvotes: 5

Views: 11728

Answers (4)

Ali Yeşilkanat

Reputation: 607

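    import java.util.List;
    import java.util.Properties;
    import edu.stanford.nlp.ling.CoreAnnotations;
    import edu.stanford.nlp.pipeline.StanfordCoreNLP;
    import edu.stanford.nlp.util.CoreMap;

    // Sentence splitting only needs the tokenize and ssplit annotators.
    Properties properties = new Properties();
    properties.setProperty("annotators", "tokenize, ssplit");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(properties);
    // SENTENCES is a String constant that contains the text to split.
    List<CoreMap> sentences = pipeline.process(SENTENCES)
            .get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        System.out.println(sentence.toString());
    }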

Upvotes: 0

Christopher Manning

Reputation: 9450

For the lower-level classes that handle this, you can look at the tokenizer documentation. At the CoreNLP level, you can just use the annotators "tokenize,ssplit".

Upvotes: 8

Joe K

Reputation: 18434

Have you looked at the documentation on the main Stanford NLP page? About halfway down, it provides an example of almost exactly what you're looking for. The example splits the text not only into sentences, but also into words.

Upvotes: 3

Kumar Vivek Mitra

Reputation: 33544

Why not use BreakIterator from the java.text package to split text into sentences, lines, words, characters, etc.?

See this link:

http://docs.oracle.com/javase/6/docs/api/java/text/BreakIterator.html
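
A minimal sketch of the BreakIterator approach, using only the JDK. The class name SentenceSplitter and the helper splitSentences are made up for illustration; the Locale argument is where you would pass Locale.GERMAN or Locale.FRENCH instead of Locale.ENGLISH.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
class SentenceSplitter {

    // Splits text into sentences using the JDK's boundary rules for the given locale.
    static List<String> splitSentences(String text, Locale locale) {
        BreakIterator iterator = BreakIterator.getSentenceInstance(locale);
        iterator.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = iterator.first();
        // next() returns the following boundary, or DONE when the text is exhausted.
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "This is one sentence. Here is another one! Is this the third?";
        for (String sentence : splitSentences(text, Locale.ENGLISH)) {
            System.out.println(sentence);
        }
    }
}
```

Note that BreakIterator works from generic punctuation rules, so it may break after abbreviations such as "Dr." when they are followed by a capitalized word. If that matters for your text, the CoreNLP "tokenize,ssplit" route from the other answers is likely the better fit.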

Upvotes: 1
