Reputation: 1688
I want to use the GraphAware NLP package to automatically perform NLP feature extraction on Dutch texts in Neo4j.
For this purpose I wanted to use OpenNLP, as it should support Dutch. The installation went well and I can annotate English texts, but for Dutch texts the following error is thrown:
Neo.ClientError.Procedure.ProcedureCallFailed
Failed to invoke procedure `ga.nlp.annotate`: Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unsupported language : nl
I called the OpenNLP text processor using:
MATCH (n:News)
CALL ga.nlp.annotate({text:n.text, id: n.uuid, textProcessor: "com.graphaware.nlp.processor.opennlp.OpenNLPTextProcessor", pipeline: "tokenizer"}) YIELD result
MERGE (n)-[:HAS_ANNOTATED_TEXT]->(result)
RETURN n, result
So it successfully detects that the fragment is Dutch, but it cannot annotate it.
As a solution I tried manually downloading the Dutch models, but I don't know how to load them and connect them in a pipeline. It also seems odd that they would not be included by default.
Upvotes: 1
Views: 172