user617861

Reputation: 73

Named Entity Recognition from a personal gazetteer using Python

I am trying to do named entity recognition in Python using NLTK. I want to extract my own list of skills: I have the list and would like to search for those skills in requisitions and tag them. I noticed that NLTK has an NER tagger for predefined types such as Person, Location, etc. Is there an external gazetteer tagger in Python that I can use? Any idea how to do this in a more sophisticated way than a plain search for the terms (which are sometimes multi-word)?

Thanks, Assaf

Upvotes: 4

Views: 2337

Answers (2)

Savino Sguera

Reputation: 3572

Have a look at RegexpTagger and possibly RegexpParser; I think that's exactly what you are looking for.

You can create your own POS tags, i.e. map skills to a tag, and then easily define a grammar.

Some sample code for the tagger is in this pdf.
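
As a rough illustration of the idea (my own minimal sketch, not the code from the PDF), you could map each word of your skills to a custom tag with NLTK's `RegexpTagger` and then group adjacent tagged tokens with `RegexpParser`. The skill words, the `SKILL`/`O` tag names, the `SKILL_CHUNK` label and the `extract_skills` helper are all assumptions made up for the example, and the whitespace tokenization is deliberately naive.

```python
from nltk import RegexpParser, RegexpTagger

# Hypothetical gazetteer: every word that can be part of a skill gets the
# custom SKILL tag; the catch-all "O" tag keeps every other token tagged.
patterns = [
    (r"(?i)^(python|java|sql|machine|learning)$", "SKILL"),
    (r".*", "O"),
]
tagger = RegexpTagger(patterns)

# Grammar: one or more adjacent SKILL tokens form a single chunk,
# which is how multi-word skills are picked up.
chunker = RegexpParser("SKILL_CHUNK: {<SKILL>+}")


def extract_skills(text):
    """Return the skill phrases found in text, in order of appearance."""
    tokens = text.split()  # naive whitespace tokenization, for illustration only
    tree = chunker.parse(tagger.tag(tokens))
    return [
        " ".join(word for word, _tag in subtree.leaves())
        for subtree in tree.subtrees(lambda t: t.label() == "SKILL_CHUNK")
    ]


print(extract_skills("We need machine learning and Python experience"))
```

Note that this chunks any run of adjacent SKILL tokens, so two different skills written next to each other would be merged into one phrase; a stricter grammar or an exact phrase lookup would be needed to avoid that.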

Upvotes: 1

nflacco

Reputation: 5082

I haven't used NLTK much recently, but if you already have the words that you know are skills, you don't need to do NER, just a text search.

Maybe use Lucene or some other search library to find the text, and then annotate it? That's a lot of work, but if you are working with a lot of data that might be OK. Alternatively, you could hack together a regex search, which will be slower but will probably work OK for smaller amounts of data and will be much easier to implement.
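
For the regex route, something along these lines might be enough; treat it as a sketch using only the standard `re` module. The function names `build_skill_pattern` and `tag_skills` and the sample skill list are hypothetical. Longer skills are tried first so that a multi-word term wins over a shorter one it contains, and lookarounds are used instead of `\b` so that terms ending in punctuation, like "C++", still match.

```python
import re


def build_skill_pattern(skills):
    """Compile one case-insensitive pattern that matches any listed skill."""
    # Longest first, so "machine learning" is preferred over "machine".
    ordered = sorted(skills, key=len, reverse=True)
    # Escape each word and allow any run of whitespace between the words
    # of a multi-word skill.
    alternatives = "|".join(
        r"\s+".join(re.escape(word) for word in skill.split())
        for skill in ordered
    )
    # The lookarounds stop "Java" from matching inside "JavaScript".
    return re.compile(r"(?<!\w)(?:" + alternatives + r")(?!\w)", re.IGNORECASE)


def tag_skills(text, pattern):
    """Return (start, end, matched_text) for every skill found in text."""
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


skills = ["Python", "machine learning", "C++", "Java"]  # hypothetical list
pattern = build_skill_pattern(skills)
print(tag_skills("Machine learning, python and C++ required", pattern))
```

The character offsets make it easy to annotate the original text afterwards. With a very large skill list a single alternation gets slow, which is where a search library starts to pay off.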

Upvotes: 1
