Fadi AbuFarha

Reputation: 21

Named Entity Recognition for Arabic documents

I need your help, please. I am working on an NER project in NetBeans 8.0.2.

I need to extract person names and place names from any Arabic document and label each one as a person or a place. I have looked at the Stanford tools (the POS tagger, the parser, and Stanford NER) and tried them all. The tagger works fine for me.

But I had problems with the parser, especially at this line of code from ParserDemo:

LexicalizedParser lp = LexicalizedParser.loadModel(grammar, options);

No output comes up. Do I need to run the parser first to tokenize the document and then use the POS tagger, or can I use the POS tagger alone with some post-processing (for example, an if statement that combines adjacent NNP tokens into one name, and the same for places)?
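To make the second option concrete, here is a minimal sketch of the grouping idea in plain Java, with no Stanford classes involved. It assumes the tagger output is available as whitespace-separated word/TAG tokens, that the separator character is known (it is passed in as a parameter), and that NNP and DTNNP are the proper-noun tags in use. The class name, method name, and sample sentence are hypothetical and hand-written, not real tagger output.

```java
import java.util.ArrayList;
import java.util.List;

class ProperNounGrouper {

    /**
     * Groups runs of adjacent proper-noun tokens into candidate names.
     * Assumption: each token looks like word<separator>TAG, and proper
     * nouns are tagged NNP or DTNNP.
     */
    public static List<String> groupProperNouns(String tagged, char separator) {
        List<String> candidates = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String token : tagged.trim().split("\\s+")) {
            // Split on the last separator so words containing it stay intact.
            int cut = token.lastIndexOf(separator);
            String word = cut > 0 ? token.substring(0, cut) : token;
            String tag = cut > 0 ? token.substring(cut + 1) : "";
            if (tag.equals("NNP") || tag.equals("DTNNP")) {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(word);
            } else if (current.length() > 0) {
                // A non-proper-noun token closes the current run.
                candidates.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            candidates.add(current.toString());
        }
        return candidates;
    }

    public static void main(String[] args) {
        // Hand-written sample in word/TAG form, for illustration only.
        String tagged = "سافر/VBD محمد/NNP علي/NNP إلى/IN القاهرة/DTNNP";
        for (String candidate : groupProperNouns(tagged, '/')) {
            System.out.println(candidate);
        }
    }
}
```

Note that this only finds candidate spans. It cannot tell a person from a place, and a person name directly followed by a place name would be merged into one span, so some extra resource (a gazetteer, for instance) would still be needed for the categorization step.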

Upvotes: 1

Views: 877

Answers (1)

StanfordNLPHelp

Reputation: 8739

First of all, at the moment we do not have any Arabic NER models.

Secondly, I'll post some steps for running the Stanford parser on Arabic text.

  1. Get the Stanford parser: http://nlp.stanford.edu/software/lex-parser.shtml

  2. Compile ParserDemo.java; you need the jars in the stanford-parser-full-2015-04-20 directory on the classpath to compile.

  3. I ran this command from the command line while in the stanford-parser-full-2015-04-20 directory (do the analogous thing in NetBeans):

java -cp ".:*" ParserDemo edu/stanford/nlp/models/lexparser/arabicFactored.ser.gz data/arabic-onesent-utf8.txt

You should get a proper parse of the Arabic example sentence.

So when you run ParserDemo in NetBeans, make sure you provide "edu/stanford/nlp/models/lexparser/arabicFactored.ser.gz" as the first argument to ParserDemo, so that it knows to load the Arabic model.

For this input:

و نشر العدل من خلال قضاء مستقل 

I get this output:

(ROOT
  (S (CC و)
    (VP (VBD نشر)
      (NP (DTNN العدل))
      (PP (IN من)
        (NP (NN خلال)
          (NP (NN قضاء) (JJ مستقل)))))
    (PUNC .)))
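Since the parse already carries a POS tag for every word, the tag/word pairs can be read straight off bracketed output like the one above. The following is a minimal sketch in plain Java that works on the printed text only, assuming each word appears as a "(TAG word)" pair with no parentheses inside the word. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParseLeaves {

    // Matches "(TAG word)" where neither part contains whitespace or parentheses.
    private static final Pattern PRETERMINAL =
            Pattern.compile("\\(([^\\s()]+)\\s+([^\\s()]+)\\)");

    /** Returns {tag, word} pairs in the order they appear in the bracketed tree. */
    public static List<String[]> taggedWords(String tree) {
        List<String[]> pairs = new ArrayList<>();
        Matcher m = PRETERMINAL.matcher(tree);
        while (m.find()) {
            pairs.add(new String[] { m.group(1), m.group(2) });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // The parse shown above, joined into a single string.
        String tree = "(ROOT (S (CC و) (VP (VBD نشر) (NP (DTNN العدل)) "
                + "(PP (IN من) (NP (NN خلال) (NP (NN قضاء) (JJ مستقل))))) (PUNC .)))";
        for (String[] pair : taggedWords(tree)) {
            System.out.println(pair[0] + "\t" + pair[1]);
        }
    }
}
```

If you are working inside Java with the Tree object returned by the parser, its taggedYield() method should give you the same tagged words directly, so this text-level approach is mainly useful for saved output.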

I am happy to help you further; please let me know if you need any more info.

FYI here is some more info on the Arabic parser:

http://nlp.stanford.edu/software/parser-arabic-faq.shtml

Upvotes: 1
