Reputation: 31
I'm having a little trouble turning my Python script into an executable. Its size is too big for me to distribute to my client.
Well, the problem is that I use only a few lines of sklearn code, yet it results in a total of 240 MB inside my distribution directory. I know that using only one feature doesn't mean the rest of the package isn't needed. But I'm searching for a way to reduce this size, or else an alternative to the KMeans class from a more lightweight machine-learning package for Python.
If needed, the parts of the code that use this feature are:
from sklearn.cluster import KMeans
...
# clus just holds an instance of KMeans
clus = KMeans(n_clusters = _numBlocks, random_state = 1, n_jobs = 1)
# and here, I just call its method
_hourmap = clus.fit_predict(Load2Clus)
...
Upvotes: 3
Views: 185
Reputation: 13529
Well, kmeans is a very simple algorithm and just a tiny part of sklearn, as you recognise. I'd avoid using sklearn if you are constrained on memory and that is the only part of the whole package that you use. You also may not need numpy, scipy and possibly other packages unless you're using them elsewhere in your code.
Your options are:
The kmeans package from here, which wraps a C implementation of KMeans.
Other things to consider for reducing the size of your library archive are given here, including:
Which of these will suit you best depends on your program.
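To give an idea of how small k-means can be, here is a minimal sketch of the standard Lloyd iteration using only the standard library. It is not sklearn's implementation: the function name `kmeans_fit_predict` and its parameters are made up for illustration, it starts from randomly chosen input points rather than k-means++, and it returns a plain list of labels, so the clusters may not match what sklearn gives you.

```python
import random


def kmeans_fit_predict(points, n_clusters, n_iter=100, seed=1):
    """Plain Lloyd's algorithm; returns one cluster label per input point.

    Hypothetical helper for illustration, not part of any library.
    """
    rng = random.Random(seed)
    points = [tuple(float(x) for x in p) for p in points]
    # Initial centres: randomly chosen input points (not k-means++)
    centers = rng.sample(points, n_clusters)
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centre by squared Euclidean distance
        new_labels = [
            min(range(n_clusters),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centre to the mean of its members
        for k in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centre
                centers[k] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels
```

Assuming `Load2Clus` is a sequence of numeric rows, the call in your code would become something like `_hourmap = kmeans_fit_predict(Load2Clus, _numBlocks)`. It will be slower than sklearn on large inputs, but it adds nothing to the size of the distribution.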
Upvotes: 1