Reputation: 353
When I run SVM, I get different results on every run, even with a fixed random_state=42.
I have 10 classes and a dataset of 200 examples. The dimensions of my dataset are dim_dataset=(200,2048).
Here is my code:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn import svm
import random
random.seed(42)
def shuffle_data(x,y):
    idx = np.random.permutation(len(x))
    x_data= x[idx]
    y_labels=y[idx]
    return x_data,y_labels
d,l=shuffle_data(dataset,true_labels) # dim_d=(200,2048) , dim_l=(200,)
X_train, X_test, y_train, y_test = train_test_split(d, l, test_size=0.30, random_state=42)
# hist intersection kernel
gramMatrix = histogramIntersection(X_train, X_train)
clf_gram = svm.SVC(kernel='precomputed', random_state=42).fit(gramMatrix, y_train)
predictMatrix = histogramIntersection(X_test, X_train)
SVMResults = clf_gram.predict(predictMatrix)
correct = sum(1.0 * (SVMResults == y_test))
accuracy = correct / len(y_test)
print("SVM (Histogram Intersection): " + str(accuracy) + " (" + str(int(correct)) + "/" + str(len(y_test)) + ")")
# libsvm linear kernel
clf_linear_kernel = svm.SVC(kernel='linear', random_state=42).fit(X_train, y_train)
predicted_linear = clf_linear_kernel.predict(X_test)
correct_linear_libsvm = sum(1.0 * (predicted_linear == y_test))
accuracy_linear_libsvm = correct_linear_libsvm / len(y_test)
print("SVM (linear kernel libsvm): " + str(accuracy_linear_libsvm) + " (" + str(int(correct_linear_libsvm)) + "/" + str(len(y_test)) + ")")
# liblinear linear kernel
clf_linear_kernel_liblinear = LinearSVC(random_state=42).fit(X_train, y_train)
predicted_linear_liblinear = clf_linear_kernel_liblinear.predict(X_test)
correct_linear_liblinear = sum(1.0 * (predicted_linear_liblinear == y_test))
accuracy_linear_liblinear = correct_linear_liblinear / len(y_test)
print("SVM (linear kernel liblinear): " + str(accuracy_linear_liblinear) + " (" + str(
int(correct_linear_liblinear)) + "/" + str(len(y_test)) + ")")
What's wrong with my code?
Upvotes: 2
Views: 5556
Reputation: 36599
Use numpy.random.seed() instead of plain random.seed(), like this:
np.random.seed(42)
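Scikit-learn uses NumPy internally to generate random numbers, so calling only random.seed() does not affect NumPy's generator, which stays unseeded. In your code, np.random.permutation inside shuffle_data therefore produces a different ordering on every run, so train_test_split receives differently ordered data each time even though its random_state is fixed.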
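To illustrate the difference, here is a minimal sketch. It seeds each generator twice and checks whether np.random.permutation repeats. The seed parameter added to shuffle_data and the toy arrays are assumptions for illustration, not part of the original code; passing a local np.random.RandomState is one possible way to avoid relying on global state.

```python
import random
import numpy as np

# Seeding Python's random module leaves NumPy's global generator untouched,
# so two permutations drawn after the same random.seed() call will differ.
random.seed(42)
a = np.random.permutation(200)
random.seed(42)
b = np.random.permutation(200)
same_with_python_seed = np.array_equal(a, b)

# Seeding NumPy's global generator makes the permutation repeatable.
np.random.seed(42)
c = np.random.permutation(200)
np.random.seed(42)
d = np.random.permutation(200)
same_with_numpy_seed = np.array_equal(c, d)


def shuffle_data(x, y, seed=None):
    # Hypothetical variant of the question's helper: a local generator
    # keeps the shuffle reproducible without touching global state.
    rng = np.random.RandomState(seed)
    idx = rng.permutation(len(x))
    return x[idx], y[idx]


# Toy stand-ins for the real dataset and labels (assumed shapes only).
x = np.arange(20).reshape(10, 2)
y = np.arange(10)
x1, y1 = shuffle_data(x, y, seed=42)
x2, y2 = shuffle_data(x, y, seed=42)

print("random.seed only, same permutation:", same_with_python_seed)
print("np.random.seed, same permutation:", same_with_numpy_seed)
print("local RandomState, same shuffle:", np.array_equal(y1, y2))
```

With either approach, the data reaching train_test_split is in the same order on every run, so the fixed random_state values in the rest of the script can do their job.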
Upvotes: 3