Alex Donovan

Reputation: 45

Stanford CoreNLP Remove NUMBER entity

I'm trying out Stanford CoreNLP with a custom NER dictionary map file, and it works fairly well. However, I keep getting default Stanford NER entities such as DATE and NUMBER, which my custom NER dictionary does not contain. Is it possible to switch them off?

Example: Toyota Altis 2.0 (found in custom NER dictionary map file)

Stanford result: Toyota:NER=ORGANIZATION, Altis:NER=VEHICLE, 2.0:NER=NUMBER

My expected result: Toyota:NER=ORGANIZATION, Altis:NER=VEHICLE, 2.0:NER=VEHICLE

Is there a property I can set to stop it from producing DATE and NUMBER entities?

Thanks in advance!

Upvotes: 0

Views: 251

Answers (1)

Alex Donovan

Reputation: 45

I managed to solve the issue. To stop numeric and date-related entities from being produced, set the following properties in your Java code:

props.put("ner.useSUTime", "false"); // do not load Stanford's default SUTime models
props.put("ner.applyNumericClassifiers", "false"); // do not apply Stanford's numeric classifiers
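
For context, here is a minimal sketch of where those two lines might sit. The class name `NerPropsSketch` and the annotator list are assumptions for illustration, not taken from my actual project. The block only builds the `Properties` object, so it runs without CoreNLP on the classpath; the comment in `main` shows where it would be handed to the pipeline.

```java
import java.util.Properties;

// Hypothetical helper class, named here only for illustration.
final class NerPropsSketch {

    // Builds pipeline properties with the two NER switches turned off.
    static Properties build() {
        Properties props = new Properties();
        // Assumed annotator list; use whatever your pipeline already runs.
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
        // Do not load the SUTime models (date and time expressions).
        props.setProperty("ner.useSUTime", "false");
        // Do not apply the numeric classifiers (NUMBER and similar labels).
        props.setProperty("ner.applyNumericClassifiers", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // With CoreNLP on the classpath, the properties would be passed on as:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        props.list(System.out);
    }
}
```

Any settings for the custom dictionary map file would go into the same `Properties` object before the pipeline is created.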

Thanks for viewing.

Upvotes: 1
