Dohun

Reputation: 507

Creating my own named entity using NLTK in Python

I am studying NLTK using a book named Natural Language Processing with Python Cookbook.

Here is the code from the book, which comes with no explanation at all.

import nltk

grammar = r"NAMED-ENTITY: {<NNP>+}"
cp = nltk.RegexpParser(grammar)

samplestrings = [
    "Microsoft Azure is a cloud service",
    "Bill Gates announces Satya Nadella as new CEO of Microsoft"
]

def demo(samplestrings):
    for s in samplestrings:
        words = nltk.word_tokenize(s)
        tagged = nltk.pos_tag(words)
        # chunks = nltk.ne_chunk(tagged)
        chunks = cp.parse(tagged)
        print(nltk.tree2conllstr(chunks))
        print(chunks)

demo(samplestrings)

So I am stuck with the first line.

What does the line grammar = r"NAMED-ENTITY: {<NNP>+}" do?

Does it mean that if there is at least one NNP (one or more in a row), then those tagged words are a named entity?

Thanks for the answer

Upvotes: 1

Views: 155

Answers (1)

thorntonc

Reputation: 2126

In this example, they use a regex chunk parser to group each run of one or more consecutive proper nouns into a chunk labeled NAMED-ENTITY.

cp = nltk.RegexpParser(r"NAMED-ENTITY: {<NNP>+}")

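NNP is the Penn Treebank part-of-speech tag for singular proper nouns. In the grammar, <NNP> matches one token with that tag, + means "one or more in a row", and the braces { } mark the matched sequence as a chunk. So yes: any sequence of at least one consecutive NNP token becomes a NAMED-ENTITY chunk.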
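To make the grouping concrete, here is a plain-Python sketch of roughly what the {<NNP>+} rule selects. It is an illustration only, not how NLTK implements RegexpParser. The (word, tag) pairs are assigned by hand for the second sample sentence, so the real output of nltk.pos_tag may differ, and nnp_runs is a hypothetical helper name.

```python
from itertools import groupby

# Hand-assigned (word, tag) pairs (an assumption for illustration);
# in the question's code these would come from nltk.pos_tag.
tagged = [
    ("Bill", "NNP"), ("Gates", "NNP"), ("announces", "VBZ"),
    ("Satya", "NNP"), ("Nadella", "NNP"), ("as", "IN"),
    ("new", "JJ"), ("CEO", "NNP"), ("of", "IN"), ("Microsoft", "NNP"),
]

def nnp_runs(tagged_tokens):
    """Return each maximal run of consecutive NNP-tagged words as one string."""
    runs = []
    # groupby splits the sequence into alternating NNP / non-NNP stretches.
    for is_nnp, group in groupby(tagged_tokens, key=lambda pair: pair[1] == "NNP"):
        if is_nnp:
            runs.append(" ".join(word for word, _ in group))
    return runs

print(nnp_runs(tagged))
# ['Bill Gates', 'Satya Nadella', 'CEO', 'Microsoft']
```

Note that with these tags "CEO" is also picked up: the rule only looks at the tags, so anything tagged NNP is treated as a named entity whether or not it really is one.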

Upvotes: 1
