HHH

Reputation: 6485

How to use Chunker Class in OpenNLP?

The ChunkerME class in OpenNLP has a chunk() method that takes two String[] arguments. The first is the actual terms (tokens) and the second is the tags produced by part-of-speech tagging.

I have a tagged string in the format Sir_NNP Arthur_NNP Conan_NNP... and I'd like to chunk it using the ChunkerME class. However, the chunker does not accept this string as is. The OpenNLP command line, on the other hand, has a command (opennlp ChunkerME en-chunker.bin) that directly accepts a tagged sentence and returns a chunked sentence.

How can I do the same thing from code as the command line does?

Upvotes: 1

Views: 142

Answers (1)

Bob Rivers

Reputation: 5493

This type of string ("Sir_NNP Arthur_NNP Conan_NNP...") is a POSSample sentence. When you run the ChunkerME tool through the CLI, it parses each input line into a POSSample, which holds the tokens and the tags as separate arrays:

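// Runs inside the CLI's loop over input lines (hence the "continue").
POSSample posSample;
try {
   posSample = POSSample.parse(line);
} catch (InvalidFormatException e) {
   logger.warn("Invalid format: {}", line, e);
   continue;
}
// "chunker" is assumed to be a ChunkerME you have already created from your model.
String[] chunks = chunker.chunk(posSample.getSentence(), posSample.getTags());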

I'm not sure whether the command-line tool itself can be called directly from your code, but you can use the POSSample class to parse your string and then pass the resulting token and tag arrays to the chunker.
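If it helps to see what that parsing step amounts to, here is a minimal sketch in plain Java with no OpenNLP dependency. It assumes the input is whitespace-separated word_TAG pairs and that the tag follows the last underscore; the class name TaggedSentenceSplitter and its split method are made up for this illustration and are not part of OpenNLP.

```java
import java.util.Arrays;

// Hypothetical helper, not part of OpenNLP: splits "word_TAG word_TAG ..."
// into a token array and a tag array.
public class TaggedSentenceSplitter {

    /** Returns a two-element array: index 0 holds the tokens, index 1 the tags. */
    public static String[][] split(String tagged) {
        String trimmed = tagged.trim();
        if (trimmed.isEmpty()) {
            return new String[][] { new String[0], new String[0] };
        }
        String[] pairs = trimmed.split("\\s+");
        String[] tokens = new String[pairs.length];
        String[] tags = new String[pairs.length];
        for (int i = 0; i < pairs.length; i++) {
            // Assumption: the tag is whatever follows the LAST underscore,
            // so a token that itself contains '_' is kept intact.
            int sep = pairs[i].lastIndexOf('_');
            if (sep < 0) {
                throw new IllegalArgumentException("Missing tag in: " + pairs[i]);
            }
            tokens[i] = pairs[i].substring(0, sep);
            tags[i] = pairs[i].substring(sep + 1);
        }
        return new String[][] { tokens, tags };
    }

    public static void main(String[] args) {
        String[][] parts = split("Sir_NNP Arthur_NNP Conan_NNP");
        System.out.println(Arrays.toString(parts[0])); // tokens
        System.out.println(Arrays.toString(parts[1])); // tags
    }
}
```

With OpenNLP on the classpath, the two arrays would then go to the chunker as chunk(parts[0], parts[1]), tokens first and tags second. In practice POSSample.parse already does this splitting for you, so the helper is only meant to show what the arrays look like.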

Upvotes: 0
