Reputation: 197
I am trying to run k-fold cross-validation, but for some reason it gets stuck at the following line and never terminates: accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
I can't understand what the problem is or how to fix it.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Churn_Modelling.csv')
X = dataset.iloc[:, 3:13].values
y = dataset.iloc[:, 13].values
# Encoding categorical data
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
labelencoder_X_1 = LabelEncoder()
X[:, 1] = labelencoder_X_1.fit_transform(X[:, 1])
labelencoder_X_2 = LabelEncoder()
X[:, 2] = labelencoder_X_2.fit_transform(X[:, 2])
onehotencoder = OneHotEncoder(categorical_features = [1])
X = onehotencoder.fit_transform(X).toarray()
X = X[:,1:]
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)
# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
import keras
from keras.models import Sequential #Required to initialize the ANN
from keras.layers import Dense #Build layers of ANN
from keras.layers import Dropout
# Evaluating the ANN
import keras
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from keras.models import Sequential #Required to initialize the ANN
from keras.layers import Dense #Build layers of ANN
def build_classifier(): # Builds the architecture, or the classifier
    classifier = Sequential()
    classifier.add(Dense(activation = 'relu', input_dim = 11, units = 6, kernel_initializer = 'uniform'))# add layers
    classifier.add(Dense(activation = 'relu', units = 6, kernel_initializer = 'uniform'))# add layers
    classifier.add(Dense(activation = 'sigmoid', units = 1, kernel_initializer = 'uniform'))
    classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])
    return classifier
classifier = KerasClassifier(build_fn = build_classifier, batch_size = 10, nb_epoch = 100)
accuracies = cross_val_score(estimator = classifier, X = X_train, y = y_train, cv = 10, n_jobs = -1)
mean = accuracies.mean()
variance = accuracies.std()
Edit
I'm on Windows 10, using Anaconda with Python 3.6.
Dataset : Drive Link for dataset
It works perfectly when I set n_jobs = 1, but not when n_jobs = -1.
Upvotes: 1
Views: 1200
Reputation: 8801
Since you have set n_jobs = -1, all the CPUs are being utilised, as per the documentation mentioned here. However, you must understand that utilising all the CPUs does not necessarily lead to a reduction in execution time.
You can check out a similar issue with GridSearchCV and parallelization here in this answer.
Also, as mentioned by @ncfith, there is currently no solution for this problem.
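Until then, the workaround you already found (n_jobs = 1) is the practical option. Below is a minimal sketch of that sequential run. It is not your exact setup: the synthetic data from make_classification and the LogisticRegression estimator are stand-ins I am assuming so that the snippet runs on its own without Keras or your CSV file, and the names X_demo, y_demo, demo_classifier and demo_accuracies are hypothetical. With your code you would pass your KerasClassifier, X_train and y_train instead.

```python
import os

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (assumption): 11 features, like the scaled X_train in the question.
X_demo, y_demo = make_classification(n_samples=500, n_features=11, random_state=0)

# Stand-in estimator (assumption); the question uses a KerasClassifier here.
demo_classifier = LogisticRegression(max_iter=1000)

# n_jobs = -1 would request one worker per CPU core reported here.
print("CPU cores visible to Python:", os.cpu_count())

# n_jobs = 1 evaluates the 10 folds one after another in the current process,
# so no worker processes are spawned.
demo_accuracies = cross_val_score(estimator=demo_classifier, X=X_demo, y=y_demo, cv=10, n_jobs=1)

print("mean accuracy:", demo_accuracies.mean())
print("std of accuracy:", demo_accuracies.std())
```

The result is one accuracy score per fold, the same as with n_jobs = -1; only the folds are fitted sequentially, so expect the run to take longer with a 100-epoch network.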
Upvotes: 1