Pradeep Raje

Reputation: 345

Stanford CoreNLP Server: Reduce memory footprint

I am accessing the CoreNLP Server from a Python script running in Jupyter Lab. I am using the full annotator suite to extract quotes from newspaper articles.

request_params={'annotators': "tokenize,ssplit,pos,lemma,ner,depparse,coref,quote",...

I have allocated 4GB, double the recommended 2GB, yet the quote annotator fails to load. Windows Task Manager shows memory utilization above 94% for long periods.

Where can I get a list of options that I can tune to improve memory use?

Upvotes: 1

Views: 218

Answers (1)

StanfordNLPHelp

Reputation: 8739

The coreference models are probably the main culprit. If you don't care about quote attribution, you can set -quote.attributeQuotes false and drop coref from the annotator list; the quote annotator will still find quotes, but they won't be attributed to speakers.
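A minimal sketch of the slimmed-down request, assuming a CoreNLP server on localhost:9000 (the URL and helper function are illustrative, not from the question):

```python
import json
from urllib import parse, request

# Assumed server location -- adjust for your setup.
CORENLP_URL = "http://localhost:9000"

# coref removed from the annotator list and attribution disabled,
# so the large coreference models are never loaded.
request_params = {
    "annotators": "tokenize,ssplit,pos,lemma,ner,depparse,quote",
    "quote.attributeQuotes": "false",
    "outputFormat": "json",
}

def extract_quotes(text: str) -> list:
    """POST text to the CoreNLP server and return the quote spans."""
    url = CORENLP_URL + "/?properties=" + parse.quote(json.dumps(request_params))
    req = request.Request(url, data=text.encode("utf-8"))
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8")).get("quotes", [])
```

Each quote entry in the server's JSON response still carries the quote text and character offsets; only the speaker field will be absent.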

I'm not sure of the exact amount, but you should be fine in the 6GB-8GB range for the entire pipeline shown in your question; the models do take up a lot of memory. I don't think the options you set in your comment ("useSUTime", "applyNumericClassifiers") affect the memory footprint at all.
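The heap size is set when the server is launched, via Java's -mx flag. A launch command along these lines (the port and timeout values are examples, run from your CoreNLP distribution directory):

```shell
# -mx6g raises the JVM max heap to 6 GB for the server process.
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -port 9000 -timeout 60000
```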

Upvotes: 1
