Naveen

Reputation: 791

Stanford NLP tokenizer

How can I tokenize a string in a Java class using the Stanford Parser?

I can only find examples of DocumentPreprocessor and PTBTokenizer that read text from an external file:

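    import java.io.FileReader;
    import java.util.List;
    import edu.stanford.nlp.ling.CoreLabel;
    import edu.stanford.nlp.ling.HasWord;
    import edu.stanford.nlp.process.CoreLabelTokenFactory;
    import edu.stanford.nlp.process.DocumentPreprocessor;
    import edu.stanford.nlp.process.PTBTokenizer;

    // option #1: By sentence
    DocumentPreprocessor dp = new DocumentPreprocessor("hello.txt");
    for (List<HasWord> sentence : dp) {
        System.out.println(sentence);
    }

    // option #2: By token
    PTBTokenizer<CoreLabel> ptbt = new PTBTokenizer<>(new FileReader("hello.txt"),
            new CoreLabelTokenFactory(), "");
    while (ptbt.hasNext()) {
        CoreLabel label = ptbt.next();
        System.out.println(label);
    }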

Thanks.

Upvotes: 6

Views: 5470

Answers (1)

CapelliC

Reputation: 60004

The PTBTokenizer constructor takes a java.io.Reader, so you can wrap your text in a java.io.StringReader and tokenize it directly, with no file involved.
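A minimal sketch of that idea, reusing the same three-argument PTBTokenizer constructor as in the question and assuming the Stanford NLP jar is on the classpath. The class name StringTokenizeExample and the helper tokenize are hypothetical, chosen only for illustration:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.process.CoreLabelTokenFactory;
import edu.stanford.nlp.process.PTBTokenizer;

// Hypothetical example class; not part of the Stanford library.
public class StringTokenizeExample {

    // Tokenizes an in-memory string and returns the token texts in order.
    static List<String> tokenize(String text) {
        // StringReader is a java.io.Reader, so it can stand in for the FileReader.
        PTBTokenizer<CoreLabel> ptbt = new PTBTokenizer<>(
                new StringReader(text), new CoreLabelTokenFactory(), "");
        List<String> words = new ArrayList<>();
        while (ptbt.hasNext()) {
            words.add(ptbt.next().word());
        }
        return words;
    }

    public static void main(String[] args) {
        for (String word : tokenize("Hello world. This is a test.")) {
            System.out.println(word);
        }
    }
}
```

The empty options string is carried over from the question's code. If you need sentences rather than tokens, DocumentPreprocessor also has a constructor that accepts a Reader, so the same StringReader approach should apply there.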

Upvotes: 6
