user1598732

Reputation: 41

How to do named entity recognition with OpenNLP for the Albanian language?

I am trying out OpenNLP for the Albanian language: I want to build models for person, location and organisation entity recognition in Albanian. I am building the corpus myself, but I need an OpenNLP expert to clear up the following doubts:

1- Should I build a separate corpus for each model, e.g. for the ner-person model build a corpus where only person tags are present?
2- Is it possible to label person, location and organization in the same corpus and use it to train a single model able to extract all three entity types?
3- Is there a resource where I can find more on the algorithm used by the OpenNLP Name Finder module?

Thanks in advance for any reply; I really need your support for my thesis.

Upvotes: 4

Views: 1135

Answers (1)

Mark Giaconia

Reputation: 3953

1- Should I build a separate corpus for each model, e.g. for the ner-person model build a corpus where only person tags are present?
IMO yes. However, it is possible to have one model contain multiple name types. If you keep them separate, you can more easily update and iteratively improve the model for a given name type, especially if the models are large.

2- Is it possible to label person, location and organization in the same corpus and use it to train a single model able to extract all three entity types?
Yes, it is possible, but if you plan to build on each name type and refine the models, keeping them separate has been easier for me.
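One way to keep both options open is to annotate a single corpus with all entity types and derive the per-type corpora from it. The sketch below assumes the OpenNLP name finder training format (one whitespace-tokenized sentence per line, entities wrapped as `<START:type> ... <END>`). The class `CorpusSplitter`, the method `keepOnly`, and the sample sentence are made up for illustration; they are not part of OpenNLP.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CorpusSplitter {
    // Matches one annotated span such as "<START:person> Ismail Kadare <END>".
    // Group 1 is the entity type, group 2 is the entity's tokens.
    private static final Pattern SPAN =
        Pattern.compile("<START:([^>\\s]+)>\\s+(.*?)\\s+<END>");

    /**
     * Keeps the tags of keepType and strips the tags (but not the tokens)
     * of every other entity type on the line.
     */
    public static String keepOnly(String line, String keepType) {
        Matcher m = SPAN.matcher(line);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement =
                m.group(1).equals(keepType) ? m.group(0) : m.group(2);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence annotated with two entity types.
        String line =
            "<START:person> Ismail Kadare <END> erdhi nga <START:location> Tirana <END> .";
        System.out.println(keepOnly(line, "person"));
        System.out.println(keepOnly(line, "location"));
    }
}
```

Running `keepOnly` over every line of the combined corpus, once per type, yields one person-only, one location-only and one organization-only file. Each file can then be used to train its own model (for example with OpenNLP's `TokenNameFinderTrainer` tool), while the combined file remains available for a single multi-type model.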

3- Is there a resource where I can find more on the algorithm used by the OpenNLP Name Finder module?
The best way to do this is to pull down the source and step through the code with some real data. It is based on Maximum Entropy.
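As a rough illustration of what "based on Maximum Entropy" means, the sketch below scores each outcome for one token as the exponential of the summed weights of its active features, then normalizes the scores into probabilities. It is not OpenNLP's code: the class `MaxentSketch`, the feature names and every weight are hypothetical, and the outcome labels only assume the name finder's start/continue/other style of encoding.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MaxentSketch {
    /**
     * p(outcome | features) = exp(sum of the outcome's weights for the
     * active features) / sum of that quantity over all outcomes.
     */
    public static Map<String, Double> classify(
            Map<String, Map<String, Double>> weights, List<String> activeFeatures) {
        Map<String, Double> probs = new LinkedHashMap<>();
        double normalizer = 0.0;
        for (Map.Entry<String, Map<String, Double>> e : weights.entrySet()) {
            double sum = 0.0;
            for (String feature : activeFeatures) {
                sum += e.getValue().getOrDefault(feature, 0.0);
            }
            double score = Math.exp(sum);
            probs.put(e.getKey(), score);
            normalizer += score;
        }
        for (Map.Entry<String, Double> e : probs.entrySet()) {
            e.setValue(e.getValue() / normalizer);
        }
        return probs;
    }

    /** Hypothetical weights; a real model learns these from the corpus. */
    public static Map<String, Map<String, Double>> demoWeights() {
        Map<String, Map<String, Double>> w = new LinkedHashMap<>();
        w.put("person-start", weightsFor(1.5, 1.0));
        w.put("person-cont", weightsFor(0.2, -1.0));
        w.put("other", weightsFor(-0.5, -0.3));
        return w;
    }

    private static Map<String, Double> weightsFor(double capitalized, double afterTitle) {
        Map<String, Double> m = new LinkedHashMap<>();
        m.put("w=capitalized", capitalized); // current token starts with a capital
        m.put("prev=zoti", afterTitle);      // previous token is the title "zoti"
        return m;
    }

    public static void main(String[] args) {
        List<String> features = Arrays.asList("w=capitalized", "prev=zoti");
        System.out.println(classify(demoWeights(), features));
    }
}
```

In a trained model the weights come from the annotated corpus and the per-token probabilities are combined over the whole sentence; stepping through the source with real data, as suggested above, shows which features the name finder actually generates.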

Upvotes: -1
