Reputation: 145
I am trying to build my own NER classifier with my own tags. I tried training a model using the instructions at http://nlp.stanford.edu/software/crf-faq.shtml#j, but the problem is that I do not have much training data. So I was wondering whether there is a way to add our own tags to existing classifiers such as english.all.3class.distsim.crf.ser or english.all.7class.distsim.crf.ser, so that I can train the classifier for my own tags.
Please help me in this regard. Thank you in advance.
Upvotes: 0
Views: 1800
Reputation: 480
You can use any tags you like (e.g. PERSON) by replacing the default ones (e.g. PERS) in the .tsv training file. The classifier learns whatever tags the .tsv file supplies, and a model trained this way then annotates text with those custom tags.
Take part of the jane-austen-emma-ch1.tsv file (from http://nlp.stanford.edu/software/ner-example/jane-austen-emma-ch1.tsv) and substitute your own custom tags for training, as follows. I have used two tags, PERSON and ADJECTIVE:
CHAPTER O
I O
Emma PERSON
Woodhouse PERSON
, O
handsome ADJECTIVE
, O
clever ADJECTIVE
, O
and O
rich ADJECTIVE
, O
with O
a O
comfortable ADJECTIVE
Now you can feed this .tsv file to the classifier (set its file name as the trainFile in the .prop file) and generate the model as shown below:
java -cp "stanford-ner.jar:slf4j-api.jar" edu.stanford.nlp.ie.crf.CRFClassifier -prop ner.prop
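For reference, here is a minimal sketch of what ner.prop might contain, modeled on the austen.prop example in the Stanford NER FAQ. The training file name custom-tags.tsv is a placeholder I made up for the .tsv shown above; ner-model.ser.gz is the model name loaded in the test command. The feature flags are the FAQ's starting set, not values tuned for this data.

```properties
# Training data: one token per line, word and tag separated by a tab.
# "custom-tags.tsv" is a hypothetical name for the file shown above.
trainFile = custom-tags.tsv
# Where the trained model is written.
serializeTo = ner-model.ser.gz
# Column layout of the training file: column 0 is the word, column 1 is the tag.
map = word=0,answer=1

# Feature flags copied from the FAQ example; adjust them for your own data.
useClassFeature=true
useWord=true
useNGrams=true
noMidNGrams=true
maxNGramLeng=6
usePrev=true
useNext=true
useSequences=true
usePrevSequences=true
maxLeft=1
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC
useDisjunctive=true
```

Note that a model trained this way knows only the tags present in the training file; it does not inherit the tags of the pre-built English models.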
Now, let's test the model on a text file and see how it annotates it. Take the following text file (toBeAnnotated.txt):
CHAPTER O
I Emma Woodhouse, handsome, clever and rich, with a comfortable home and happy disposition, seemed to unite some of the best blessings
Running the following command annotates the above text file:
java -mx600m -cp "stanford-ner.jar:slf4j-api.jar" edu.stanford.nlp.ie.crf.CRFClassifier -loadClassifier ner-model.ser.gz -textFile toBeAnnotated.txt -outputFormat inlineXML 2> /dev/null
The output I got is as follows (I have added newlines for clarity):
I <PERSON>Emma Woodhouse</PERSON>,
<ADJECTIVE>handsome</ADJECTIVE>, <ADJECTIVE>clever</ADJECTIVE>
and <ADJECTIVE>rich</ADJECTIVE>, with a <ADJECTIVE>comfortable</ADJECTIVE>
home and happy <ADJECTIVE>disposition</ADJECTIVE>,
seemed to unite some of the best blessings
Upvotes: 1