Reputation: 25
I want to evaluate an SGDClassifier on the MNIST dataset using sklearn.model_selection.cross_val_score.
It took about 6 minutes with 3-fold cross-validation.
How can I speed this up using the full power of my system (everything from the CPU to the graphics card, etc.)?
By the way, I was monitoring CPU usage, and it only reached 54%.
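from sklearn.datasets import fetch_openml
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score
# Load MNIST as NumPy arrays; the labels arrive as strings ('0'..'9'), so cast them
mnist = fetch_openml('mnist_784', as_frame=False)
X, y = mnist['data'], mnist['target'].astype(int)
# The first 60,000 samples are the conventional training split
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]
# Binary "is it a 5?" targets (not used below)
y_train_5 = (y_train == 5)
y_test_5 = (y_test == 5)
sgd_clf = SGDClassifier(random_state=42)
# cross_val_score refits a clone per fold, so this fit is not required for it
sgd_clf.fit(X_train, y_train)
cross_val_score(sgd_clf, X_train, y_train, cv=3, scoring='accuracy')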
Upvotes: 2
Views: 3795
Reputation: 60370
From the docs:
n_jobs : int or None, optional (default=None)
The number of CPUs to use to do the computation. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors.
That is, you can use all your available cores with
cross_val_score(sgd_clf, X_train, y_train, cv=3, scoring='accuracy', n_jobs=-1)
or specify some other value, n_jobs=k, if using all the cores makes your machine slow or unresponsive.
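As a rough illustration of the n_jobs options above, here is a minimal sketch. It assumes a small synthetic dataset from make_classification as a stand-in for MNIST (so it runs in seconds), and the variable names are mine, not from the question. The last variant uses the joblib.parallel_backend context that the docs excerpt mentions.

```python
import numpy as np
from joblib import parallel_backend
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import cross_val_score

# Assumption: a small synthetic 3-class dataset stands in for MNIST
X_demo, y_demo = make_classification(
    n_samples=600, n_features=20, n_informative=10,
    n_classes=3, random_state=42,
)

clf = SGDClassifier(random_state=42)

# Default (n_jobs=None): the folds are fitted one after another
scores_serial = cross_val_score(clf, X_demo, y_demo, cv=3, scoring='accuracy')

# n_jobs=-1: the folds are dispatched to all available cores
scores_parallel = cross_val_score(
    clf, X_demo, y_demo, cv=3, scoring='accuracy', n_jobs=-1
)

# n_jobs left unset inside a parallel_backend context picks up the context's n_jobs
with parallel_backend('loky', n_jobs=2):
    scores_context = cross_val_score(clf, X_demo, y_demo, cv=3, scoring='accuracy')

print(scores_serial, scores_parallel, scores_context)
```

Note that cross_val_score parallelises over the folds, so with cv=3 there are only three fits to hand out, however many cores the machine has.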
This will employ more CPU cores; as far as I know, there is no functionality in scikit-learn to offload computations to the GPU.
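Still on the CPU side, SGDClassifier accepts an n_jobs parameter of its own, which parallelises the one-versus-all binary fits for a multi-class target such as the ten MNIST digits. A minimal sketch, again assuming a small synthetic dataset rather than MNIST; whether it actually helps in your case is something to measure.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Assumption: a small synthetic 3-class dataset stands in for MNIST
X_demo, y_demo = make_classification(
    n_samples=600, n_features=20, n_informative=10,
    n_classes=3, random_state=42,
)

# Here n_jobs controls the one-versus-all fits (one binary problem per class)
ova_clf = SGDClassifier(random_state=42, n_jobs=-1)
ova_clf.fit(X_demo, y_demo)

# One row of coefficients per class, one column per feature
n_classes, n_features = ova_clf.coef_.shape
train_accuracy = ova_clf.score(X_demo, y_demo)
print(n_classes, n_features, train_accuracy)
```

If you set n_jobs on both the estimator and cross_val_score, the two levels of parallelism may compete for the same cores, so it is probably worth timing each setting separately.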
Upvotes: 5