Reputation: 145
I want to train a model for extracting person names (as part of an NER system), but I want the model to be caseless, i.e. it should not take letter case into account and should treat uppercase and lowercase letters the same, because my text is noisy.
Is there a parameter in the training step to do that, or some other way?
Upvotes: 2
Views: 561
Reputation: 718
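If you must use OpenNLP, I suppose you could train new models on caseless training data. Simply take whatever existing training data is available (with appropriate annotations, etc.) and lowercase all the content before training a new model. You would then also need to lowercase your input text at prediction time, so that it matches what the model was trained on.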
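A minimal sketch of that lowercasing step, assuming the training data is in OpenNLP's name-finder format, where entities are wrapped in `<START:person> ... <END>` tags, one sentence per line. The function name `lowercase_preserving_tags` is made up for illustration. The one thing to watch is that the annotation tags themselves are left untouched, since a blind lowercase of the whole file would turn `<START:person>` into `<start:person>`.

```python
import re

# Matches OpenNLP name-finder annotation tags such as <START:person>, <START> and <END>.
TAG_PATTERN = re.compile(r"(<START(?::[^>\s]+)?>|<END>)")


def lowercase_preserving_tags(line: str) -> str:
    """Lowercase the text of one training line but keep annotation tags as they are."""
    # Splitting on a pattern with one capturing group keeps the tags in the result list.
    parts = TAG_PATTERN.split(line)
    return "".join(
        part if TAG_PATTERN.fullmatch(part) else part.lower()
        for part in parts
    )


if __name__ == "__main__":
    sample = "<START:person> Pierre Vinken <END> , 61 years old , will join IBM ."
    print(lowercase_preserving_tags(sample))
```

You would run this over every line of the training file and then train the name finder on the result as usual.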
Or, if you can use Stanford NER instead of OpenNLP, you can just use Stanford NER's pre-trained caseless English models: http://nlp.stanford.edu/software/CRF-NER.shtml#Models
Whichever way you go, keep in mind that accuracy will likely be lower with caseless models, since capitalization is a strong cue for names.
Upvotes: 2