Reputation: 11
I am building a parser for job descriptions and need to extract the details of each project separately. I have tried NLTK and the Stanford tools, but the results are not accurate. Can anyone suggest a module that gives better results?
Upvotes: 1
Views: 1653
Reputation: 17339
A full working example with spaCy:
import spacy

# Load the small pre-trained English pipeline
nlp = spacy.load('en_core_web_sm')
text = u"Software Engineer job in San Francisco, California, USA"
doc = nlp(text)
# Keep only location-like entities (geopolitical entities and other locations)
for ent in doc.ents:
    if ent.label_ in ['GPE', 'LOC']:
        print(ent.text, ent.start_char, ent.end_char, ent.label_)
Output:
San Francisco 25 38 GPE
California 40 50 GPE
USA 52 55 GPE
Note that you can train the model further by giving it more labeled examples; see the spaCy documentation.
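To make "labeled examples" concrete, here is a minimal sketch of one common way to represent them: each example pairs a text with a list of (start, end, label) character offsets, the same kind of offsets printed above. The helper `label_phrases`, the `PHRASES` lookup table, and the `PROJECT` label are all hypothetical names made up for this illustration, and the actual training call is left out because it differs between spaCy versions.

```python
# Hypothetical helper (not part of spaCy): builds one labeled example
# by looking up known phrases in the text and recording their offsets.
def label_phrases(text, phrases):
    """Return (text, {"entities": [(start, end, label), ...]})."""
    entities = []
    for phrase, label in phrases.items():
        start = text.find(phrase)
        while start != -1:
            # end offset is exclusive, matching ent.start_char / ent.end_char
            entities.append((start, start + len(phrase), label))
            start = text.find(phrase, start + len(phrase))
    entities.sort()  # order the spans by their start offset
    return text, {"entities": entities}


# Illustrative lookup table; the phrases and the PROJECT label are assumptions.
PHRASES = {"San Francisco": "GPE", "billing dashboard": "PROJECT"}

example = label_phrases(
    "Built a billing dashboard for a client in San Francisco", PHRASES
)
print(example)
```

A phrase lookup like this only bootstraps annotations for phrases you already know; examples labeled by hand are still needed for the model to learn anything beyond that list.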
Upvotes: 0
Reputation: 141
Use these commands to install spaCy and download the small English model:
pip install -U spacy
python -m spacy download en_core_web_sm
Then you can either tag your own dataset and train the model on it, or simply use the pre-trained model as it is.
Upvotes: 2