Amanda
Amanda

Reputation: 79

Extracting nouns from Noun Phase in NLP

Could anyone please tell me how to extract only the nouns from the following output:

I have tokenized and parsed the string "Give me the review of movie" based on a given grammar using following procedure:-

sent=nltk.word_tokenize(msg)
parser=nltk.ChartParser(grammar)
trees=parser.nbest_parse(sent)
for tree in trees:
    print tree
tokens=find_all_NP(tree)
tokens1=nltk.word_tokenize(tokens[0])
print tokens1

and obtained the following output:

>>> 
(S
  (VP (V Give) (Det me))
  (NP (Det the) (N review) (PP (P of) (N movie))))
(S
  (VP (V Give) (Det me))
  (NP (Det the) (N review) (NP (PP (P of) (N movie)))))
['the', 'review', 'of', 'movie']
>>> 

Now I would like to only obtain the nouns. How do I do that?

Upvotes: 6

Views: 6182

Answers (1)

Joe
Joe

Reputation: 3059

You don't need to use a full parser to get nouns. You can simply use a tagger. One function you can use is nltk.tag.pos_tag(). This will return a list of tuples with the word and part of speech. You'll be able to iterate over the tuples and find words tagged with 'NN' or 'NNS' for noun or plural noun.

NLTK has a how to document for how to use their taggers. It can be found here: https://nltk.googlecode.com/svn/trunk/doc/howto/tag.html and here is a link to the chapter in the NLTK book about using taggers: https://nltk.googlecode.com/svn/trunk/doc/book/ch05.html

There are many code examples in each of those places.

Upvotes: 6

Related Questions