Reputation: 345
I am accessing the CoreNLP Server from a Python script running in JupyterLab. I am using the full annotator suite to extract quotes from newspaper articles.
request_params={'annotators': "tokenize,ssplit,pos,lemma,ner,depparse,coref,quote",...
Although 2 GB is the recommended allocation, I have allocated 4 GB, and yet the quote annotator fails to load. Windows Task Manager shows memory utilization above 94% for long periods.
Where can I get a list of options that I can tune to improve memory use?
Upvotes: 1
Views: 218
Reputation: 8739
The coreference models are probably the main culprit. If you don't care about quote attribution, you can set -quote.attributeQuotes false and drop coref from the pipeline; you will lose quote attributions, but the memory footprint should drop substantially.
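A minimal sketch of what the trimmed request could look like from Python, assuming the server is running on the default port 9000 (the property names match those in the question and answer; the URL and port are assumptions):

```python
import json
import urllib.parse

# Drop "coref" from the annotator list and disable quote attribution
# via the quote.attributeQuotes property mentioned above.
properties = {
    "annotators": "tokenize,ssplit,pos,lemma,ner,depparse,quote",
    "quote.attributeQuotes": "false",
    "outputFormat": "json",
}

# The CoreNLP server accepts pipeline properties as a JSON string
# in the URL query; localhost:9000 is the server's default address.
url = "http://localhost:9000/?" + urllib.parse.urlencode(
    {"properties": json.dumps(properties)}
)

# With a running server you would then POST the article text, e.g.:
#   import requests
#   resp = requests.post(url, data=article_text.encode("utf-8"))
```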
I'm not sure of the exact amount, but I think you should be fine in the 6 GB to 8 GB range for running the entire pipeline in your question. The models used do take up a lot of memory. I don't think the options you have set in your comment ("useSUTime", "applyNumericClassifiers") will affect the memory footprint at all.
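If you launch the server yourself, the heap size is set with the JVM -mx flag. A sketch of a launch command, assuming the CoreNLP jars are in the current directory and the default port 9000:

```shell
# Start the CoreNLP server with a 6 GB heap; run from the
# directory containing the CoreNLP jars.
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer \
    -port 9000 -timeout 15000
```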
Upvotes: 1