Reputation: 21
I need some help, please. I am working on an NER project in NetBeans 8.0.2.
I need to extract person names and place names from any Arabic document and label each one as a person or a place. I have looked at all the Stanford tools (the POS tagger, the parser, and Stanford NER) and tried each of them; the tagger works fine for me.
But I had problems with the parser, especially with this line of code
LexicalizedParser lp = LexicalizedParser.loadModel(grammar, options);
from ParserDemo: no output comes up. Do I need to run the parser first to tokenize the document and then use the POS tagger, or can I just use the POS tagger with some extra logic (such as an if statement that combines all adjacent NNP tokens, and the same for places)?
Upvotes: 1
Views: 877
Reputation: 8739
First of all, at the moment we do not have any Arabic NER models.
Second, here are some steps for running the Stanford parser on Arabic text.
Get the Stanford parser: http://nlp.stanford.edu/software/lex-parser.shtml
Compile ParserDemo.java; you need the jars in the stanford-parser-full-2015-04-20 directory on the classpath to compile.
I ran this command at the command line from inside the stanford-parser-full-2015-04-20 directory (do the analogous thing in NetBeans; on Windows the classpath separator is ; rather than :):
java -cp ".:*" ParserDemo edu/stanford/nlp/models/lexparser/arabicFactored.ser.gz data/arabic-onesent-utf8.txt
You should get a proper parse of the Arabic example sentence.
So when you run ParserDemo in NetBeans, make sure you pass "edu/stanford/nlp/models/lexparser/arabicFactored.ser.gz" as the first argument to ParserDemo, so that it knows to load the Arabic model.
For this input:
و نشر العدل من خلال قضاء مستقل
I get this output:
(ROOT
  (S (CC و)
    (VP (VBD نشر)
      (NP (DTNN العدل))
      (PP (IN من)
        (NP (NN خلال)
          (NP (NN قضاء) (JJ مستقل)))))
    (PUNC .)))
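If you want to work with that bracketed output in your own code, a rough sketch like the one below pulls out the (tag, word) pairs. It is plain Java with no Stanford dependency; the class name ParseLeaves and its leaves method are made up for this illustration, and it assumes every leaf is printed as "(TAG word)" with no spaces or parentheses inside the word. The parser API has its own tree classes, which are probably the better choice once the model loads for you.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Stanford parser.
public class ParseLeaves {
    // Matches an innermost "(TAG word)" pair: a tag, whitespace, then a
    // token that contains no whitespace and no parentheses.
    private static final Pattern LEAF =
            Pattern.compile("\\(([^\\s()]+)\\s+([^\\s()]+)\\)");

    /** Returns {tag, word} pairs in the order they appear in the parse. */
    public static List<String[]> leaves(String parse) {
        List<String[]> out = new ArrayList<>();
        Matcher m = LEAF.matcher(parse);
        while (m.find()) {
            out.add(new String[] { m.group(1), m.group(2) });
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Read the bracketed parse from standard input as UTF-8,
        // so Arabic text survives regardless of the platform default.
        StringBuilder text = new StringBuilder();
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        for (String line = in.readLine(); line != null; line = in.readLine()) {
            text.append(line).append('\n');
        }
        for (String[] leaf : leaves(text.toString())) {
            System.out.println(leaf[0] + "\t" + leaf[1]);
        }
    }
}
```

Phrase labels such as ROOT, S and VP are skipped because they are followed by another opening parenthesis rather than a word.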
I am happy to help further; please let me know if you need any more info.
FYI here is some more info on the Arabic parser:
http://nlp.stanford.edu/software/parser-arabic-faq.shtml
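On the idea from the question of merging adjacent proper-noun tokens from the tagger output: a minimal sketch of that heuristic might look like the following. It is only an illustration under several assumptions: the tagged tokens arrive as word, separator, tag (the separator is passed in because it depends on how the tagger is configured), proper-noun tags end in "NNP", and the sample tokens in main are hypothetical transliterations rather than real tagger output. ProperNounGrouper is a made-up name, not a Stanford class. Note that this only finds candidate names; it cannot tell a person from a place.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any Stanford tool.
public class ProperNounGrouper {
    /**
     * Joins runs of adjacent proper-noun tokens into candidate names.
     * tagged: tokens in word + sep + TAG form; sep: the tag separator.
     */
    public static List<String> group(String[] tagged, String sep) {
        List<String> names = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String token : tagged) {
            // Split on the last separator so words containing it still work.
            int cut = token.lastIndexOf(sep);
            String word = cut < 0 ? token : token.substring(0, cut);
            String tag = cut < 0 ? "" : token.substring(cut + sep.length());
            // Assumption: proper-noun tags end in "NNP" (e.g. NNP, DTNNP).
            if (tag.endsWith("NNP")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(word);
            } else if (current.length() > 0) {
                // A non-proper-noun token closes the current run.
                names.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            names.add(current.toString());
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical, transliterated sample; not produced by a real run.
        String[] tagged = {
            "zara/VBD", "Ahmad/NNP", "Ali/NNP", "madinat/NN", "Dimashq/NNP"
        };
        System.out.println(group(tagged, "/")); // prints [Ahmad Ali, Dimashq]
    }
}
```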
Upvotes: 1