Using Numba with scikit-learn

Question

Has anyone succeeded in speeding up scikit-learn models using numba and jit compilaition. The specific models I am looking at are regression models such as Logistic Regressions.

I am able to use numba to optimize the functions I write using sklearn models, but the model functions themselves are not affected by this and are not optimized, thus not providing a notable increase in speed. Is there are way to optimize the sklearn functions?

Any info about this would be much appreciated.

Andreus · Accepted Answer

Scikit-learn makes heavy use of numpy, most of which is written in C and already compiled (hence not eligible for JIT optimization).

Further, the LogisticRegression model is essentially LinearSVC with the appropriate loss function. I could be slightly wrong about that, but in any case, it uses LIBLINEAR to do the solving, which is again a compiled C library.

The makers of scikit-learn also make heavy use of one of the python-to-compiled systems, Pyrex I think, which again results in optimized machine compiled code ineligible for JIT compilation.

Using Numba with scikit-learn

Answers (2)

Related Questions