DMAR

Reputation: 240

PySpark SystemML writing/reading from /tmp

I am running a Flask app with a SystemML SVM component through PySpark. The app runs for about a day, then begins to error out whenever the SVM is used to make a prediction. The error that is thrown is:

Caused by: java.io.FileNotFoundException: /tmp/systemml/_p1_10.101.38.73/cache/cache000005058.dat (No such file or directory)

I believe what is happening is that SystemML is writing to /tmp/, which is eventually cleared out by the container I am using. Then, when the app goes to predict, SystemML attempts to read this file and errors out. Am I correct in that guess? What's the best way to solve this? Is there a way to tell SystemML where to write its cache?

Thanks for any help you can give!

Upvotes: 0

Views: 70

Answers (1)

mboehm7

Reputation: 115

Yes, in SystemML master (and upcoming releases) you can simply call ml.setConfigProperty("localtmpdir", "/tmp2/systemml") to modify this cache directory, or any other configuration property. However, the Python MLContext API in SystemML 0.14 did not expose this yet.
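For example, with a recent SystemML build you could redirect the cache before running any script. A minimal sketch (the property name "localtmpdir" is the one above; the path /tmp2/systemml is just a placeholder for any directory your container does not clear):

    from pyspark.sql import SparkSession
    from systemml import MLContext

    spark = SparkSession.builder.appName("svm-app").getOrCreate()
    ml = MLContext(spark)

    # Point SystemML's local buffer-pool cache away from /tmp so the
    # container cannot delete the cache*.dat files between predictions.
    ml.setConfigProperty("localtmpdir", "/tmp2/systemml")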

As a workaround, you can alternatively put a SystemML-config.xml file (see systemml/conf for templates) with your custom configuration into the directory where the SystemML.jar is installed.
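Based on the template in systemml/conf, a minimal SystemML-config.xml that overrides only the temp directory might look like this (a sketch; adjust the path to a location your container preserves):

    <root>
       <!-- local fs tmp working directory (buffer-pool cache files) -->
       <localtmpdir>/tmp2/systemml</localtmpdir>
    </root>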

Upvotes: 1
