Reputation: 3340
The Stanford CoreNLP library is packaged with models to recognize Time, Location, Organization, Person, Money, Percent, and Dates. Are there any other general-use models available from other groups that recognize additional things?
Also, if we were to train a new model to recognize just band names (for instance), could we run it in addition to the packaged ones, or would we have to train the new model to recognize Time, Location, Organization, Person, Money, Percent, Dates, and Bands all together? The documentation does say the existing models themselves cannot be extended.
Upvotes: 1
Views: 461
Reputation: 8739
You can definitely train a separate CRFClassifier, or write RegexNER rules, to recognize band names and run that alongside the packaged NER taggers. Your module can focus exclusively on band names.
I would recommend RegexNER for band names. Here is the link:
http://nlp.stanford.edu/software/regexner/
Basically, you create a mapping file with band names, or regular expressions matching band names, each followed by a tab and the label to assign. The standard pipeline then tags text using those custom rules.
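If you already have a list of band names, a small script can generate the mapping file for you. The following is only a sketch in Python: the helper names `mapping_line` and `build_mapping`, the `BAND` label, and the sample band list are illustrative assumptions, not part of CoreNLP. It assumes the tab-separated "pattern, then label" layout described in the RegexNER documentation, and that escaping each token with Python's `re.escape` yields a pattern Java's regex engine reads the same way.

```python
import re


def mapping_line(name, label="BAND"):
    """Return one mapping line: token patterns, a tab, then the label."""
    # RegexNER matches each whitespace-separated pattern against one token,
    # so escape every token separately and rejoin with single spaces.
    tokens = [re.escape(tok) for tok in name.split()]
    return " ".join(tokens) + "\t" + label


def build_mapping(names, label="BAND"):
    """Build the mapping file text, one line per non-blank band name."""
    lines = [mapping_line(n, label) for n in names if n.strip()]
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical sample data; replace with your own list of band names.
    bands = ["Led Zeppelin", "Pink Floyd", "R.E.M."]
    with open("my-band-regexes.txt", "w", encoding="utf-8") as fh:
        fh.write(build_mapping(bands))
```

The output file name matches the one used in the sample command. Note that a name only matches if it is split into tokens the same way CoreNLP's tokenizer splits the input text, so names containing punctuation may need hand-tuning.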
Here is a sample command:
java -mx1g -cp "*:." edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators "tokenize,ssplit,pos,lemma,ner,regexner" -file sample_text.txt -regexner.mapping my-band-regexes.txt
Upvotes: 3