Reputation: 53
I am trying to measure the memory usage of my scikit-learn models with the memory-profiler Python module. However, when I profile memory during training, the model achieves noticeably worse accuracy than when I train it without profiling. This is the code I use to measure it:
from sklearn import svm
from sklearn.metrics import accuracy_score
from memory_profiler import memory_usage

clf = svm.SVC()
mem = memory_usage((clf.fit, (X_train, y_train)), max_usage=True)
pred = clf.predict(X_test)
print(accuracy_score(y_test, pred))
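As a cross-check that doesn't involve memory-profiler at all, Python's built-in tracemalloc module can report the peak allocation during fit; if the accuracy still differs, profiling itself is unlikely to be the cause. A minimal sketch (the iris dataset and default SVC settings are illustrative, not the asker's actual data):

```python
import tracemalloc
from sklearn import svm
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Small illustrative dataset with a fixed split for reproducibility
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = svm.SVC()
tracemalloc.start()
clf.fit(X_train, y_train)                   # train while tracing allocations
_, peak = tracemalloc.get_traced_memory()   # peak bytes allocated since start()
tracemalloc.stop()

pred = clf.predict(X_test)
print(f"peak memory: {peak / 1024:.1f} KiB")
print(f"accuracy: {accuracy_score(y_test, pred):.3f}")
```

Note that tracemalloc only tracks allocations made through Python's allocator, so it will undercount memory allocated directly in C extensions, but it is enough to tell whether measurement perturbs the model.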
Does anyone have experience with this? What could be the cause?
Upvotes: 2
Views: 1717
Reputation: 1149
If you're happy with a statistical analysis, you could give Austin a try. It is a statistical profiler that can also profile memory usage. A really nice feature is that you don't need to instrument your code: you can run your application in production and attach Austin to it with minimal performance impact.
Austin is available for easy installation from many package providers, like Homebrew, Chocolatey, Snap Store and the Debian official repository. There are also binary releases available from GitHub.
Have a look at the README to see how to use it. In general, things are as easy as just
austin -m yourpythonapp
If you want to attach to a running process, get its PID and do
sudo austin -mp <pid>
Upvotes: 0