Reputation: 1702
I need to train multiple deep models in parallel and average their results. My job hangs forever after the computation on GPU 0 finishes.
import tensorflow as tf

def model_train(self, params):
    from nn_arch import nn_models
    X, y, gpu_no = params
    print("GPU NO ", gpu_no)
    # Pin this model's ops to the GPU index passed in via params
    with tf.device('/gpu:' + str(gpu_no)):
        model1 = nn_models.lenet5()
        early_callback = CustomCallback()
        model1.fit(X, y, batch_size=256, validation_split=0.2,
                   callbacks=[early_callback], verbose=1, epochs=1)
    return model1
And my main method is below; in this case I have 2 GPUs:
def main(self, X_train, y_train, X_test, y_test):
    random_buckets = self.get_random()
    X = [X_train[random_buckets[k]] for k in sorted(random_buckets)]
    y = [y_train[random_buckets[j]] for j in sorted(random_buckets)]
    # One (X, y, gpu_no) tuple per GPU
    params = zip(X, y, [0, 1])
    models = pool1.map(self.model_train, params)
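Note that pool1 is not defined in the snippet above. A minimal sketch of how it might be created, assuming Python's multiprocessing (the 'spawn' start method, the worker count, and the name pool1 are my assumptions, not from the original post):

import multiprocessing as mp

if __name__ == '__main__':
    # Assumption: a 'spawn' context gives each worker a fresh interpreter
    # and its own CUDA context; forked workers that inherit the parent's
    # TensorFlow state are a common cause of jobs that hang.
    ctx = mp.get_context('spawn')
    pool1 = ctx.Pool(processes=2)  # one worker per GPU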
How do I train multiple models in parallel with Keras (a data-parallel approach)?
Upvotes: 3
Views: 5254
Reputation: 481
Before compiling the model in Keras, add this line:
model = make_parallel(model, 2)
where 2 is the number of GPUs available.
The make_parallel function is available in the file linked below. Just import it into your code, and your model will be trained on multiple GPUs.
https://github.com/kuza55/keras-extras/blob/master/utils/multi_gpu.py
make_parallel is a simple function that replicates your model onto each GPU, slices each input batch across the replicas, and concatenates their outputs back together on the CPU, so every gradient update still sees the full batch.
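A minimal usage sketch, assuming multi_gpu.py from the linked repo sits next to your script (the toy model and random data below are placeholders, not from the original post):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from multi_gpu import make_parallel  # from kuza55/keras-extras

# Placeholder data and model; substitute your own
X_train = np.random.rand(1024, 784).astype('float32')
y_train = np.random.rand(1024, 10).astype('float32')
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))

# Replicate the model across 2 GPUs, then compile and fit as usual;
# each batch of 256 is split into two slices of 128, one per GPU
model = make_parallel(model, 2)
model.compile(optimizer='sgd', loss='categorical_crossentropy')
model.fit(X_train, y_train, batch_size=256, epochs=1)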
Upvotes: 4