Reputation: 31
I am running an LSTM model on an AWS g3.8xlarge instance, which has 2 GPUs, and I am using tf.distribute.MirroredStrategy()
so that training runs on both GPUs. However, training is actually slower than without it. Does anyone know how to solve this?
I am using:
My code is:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, RepeatVector, TimeDistributed
from tensorflow.compat.v1.keras.layers import CuDNNLSTM
from tensorflow.keras.optimizers import SGD
import tensorflow.keras.backend as K
import tensorflow as tf
def lstm_model(timesteps, features, neurons_1, dropout, learning, momentum, decay, init):
    # Build and compile the model inside the MirroredStrategy scope so its
    # variables are mirrored across both GPUs
    distribute = tf.distribute.MirroredStrategy()
    with distribute.scope():
        model = Sequential()
        model.add(CuDNNLSTM(neurons_1, return_sequences=False,
                            input_shape=(timesteps, features),
                            kernel_initializer=init))
        model.add(Dropout(dropout))
        model.add(Dense(1))
        sgd = SGD(lr=learning, momentum=momentum, decay=decay, nesterov=False)
        # tilted_loss is my custom quantile (pinball) loss, defined elsewhere
        model.compile(loss=lambda y, f: tilted_loss(0.5, y, f), optimizer=sgd)
    return model
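For reference, this is roughly how I build and fit the model. The data shapes, batch size, epoch count, and the stand-in tilted_loss (just the usual pinball/quantile loss so the snippet runs end to end) are placeholders, not my exact setup:

import numpy as np
import tensorflow as tf

# Stand-in for my tilted_loss; my real implementation lives elsewhere
def tilted_loss(q, y, f):
    e = y - f
    return tf.reduce_mean(tf.maximum(q * e, (q - 1) * e), axis=-1)

timesteps, features = 30, 10  # placeholder shapes, not my real data
X_train = np.random.rand(1024, timesteps, features).astype("float32")
y_train = np.random.rand(1024, 1).astype("float32")

model = lstm_model(timesteps, features, neurons_1=64, dropout=0.2,
                   learning=0.01, momentum=0.9, decay=0.0,
                   init="glorot_uniform")

# MirroredStrategy splits each batch across the 2 GPUs, so 128 here is the
# global batch size (each GPU processes 64 samples per step)
model.fit(X_train, y_train, epochs=10, batch_size=128, verbose=1)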
Upvotes: 2
Views: 719