Reputation: 635
I'm trying to get a feel for which input parameters might work well for my data, so I'm training multiple models and comparing the results. For some reason, some of the models start with a loss of 0 even though their accuracy is very low.
My code looks something like:
first = keras.models.Sequential()
first.add(keras.layers.LSTM(100, activation='tanh', input_shape=(X.shape[1], 1), return_sequences=True))
first.add(keras.layers.Dense(Y.shape[2], activation='softmax'))
first.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['categorical_accuracy'])
models["first"] = first
and I have a series of code blocks like that, changing a few parameters in each one. Even before I call fit, the loss is already 0 (or 2.79e-7) for some of them, but not with any real consistency. Sometimes the first model has the issue, sometimes the third. I've tried re-generating the data and the models when that happens, but doing so seems to raise the odds of the loss == 0 issue. The input data is of the form [[[5],[7],[3]...]...], and the labels had the same shape before I ran them through keras.utils.to_categorical, so they are now one-hot encoded. I'm fairly certain the inputs and labels are correct, since any given model works correctly most of the time.
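For reference, a quick shape check like the following (with made-up toy values; `np.eye` indexing is equivalent to `keras.utils.to_categorical` here) confirms that per-timestep integer labels become a 3-D one-hot array, which is what the `Dense(Y.shape[2])` output layer expects:

```python
import numpy as np

# Toy data mirroring the question's format (values are illustrative):
# inputs like [[[5],[7],[3]], ...] and one integer label per timestep.
X = np.array([[[5], [7], [3]],
              [[2], [4], [6]]])        # shape (2, 3, 1)
labels = np.array([[0, 2, 1],
                   [1, 0, 2]])         # shape (2, 3)

# One-hot encode; equivalent to keras.utils.to_categorical(labels, 3):
Y = np.eye(3)[labels]                  # shape (2, 3, 3)
```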
Any suggestions would be useful at this point.
EDIT: This seems to happen only when running on the GPU. When I force TensorFlow to run only on the CPU there are no issues. Any idea what causes this on the GPU?
Upvotes: 1
Views: 626
Reputation: 478
Try setting your random seeds for reproducibility:
random.seed(10)
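Note that `random.seed` only seeds Python's stdlib generator; Keras draws from NumPy's RNG (and TensorFlow's) as well. A fuller sketch, with an arbitrary seed value:

```python
import os
import random
import numpy as np

# Seed every RNG source the stack draws from, not just Python's stdlib
# (the seed value 10 is arbitrary):
os.environ["PYTHONHASHSEED"] = "10"
random.seed(10)       # Python stdlib RNG
np.random.seed(10)    # NumPy RNG (used e.g. for weight initialization)
# tf.random.set_seed(10)  # TensorFlow's graph-level seed (TF 2.x)
```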
Also, if the model is stateless, the cell states are reset after each sequence. But if you're using stateful = True, you will need to reset the states after each training run.
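A minimal sketch of that reset, assuming the TF 2.x Keras API (layer sizes and shapes here are arbitrary; a stateful LSTM needs a fixed batch shape):

```python
import numpy as np
from tensorflow import keras

# Stateful LSTM: cell state is carried across batches, so the batch
# shape must be fixed and states must be cleared manually between runs.
model = keras.models.Sequential([
    keras.Input(batch_shape=(4, 3, 1)),
    keras.layers.LSTM(16, activation='tanh', stateful=True,
                      return_sequences=True),
    keras.layers.Dense(8, activation='softmax'),
])
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

X = np.random.rand(4, 3, 1).astype('float32')
Y = np.eye(8)[np.random.randint(0, 8, (4, 3))].astype('float32')

for run in range(2):
    model.fit(X, Y, batch_size=4, epochs=1, shuffle=False, verbose=0)
    # Clear the carried-over cell state between independent runs;
    # the method is reset_states() in tf.keras 2.x (reset_state()
    # in newer Keras versions).
    for layer in model.layers:
        if hasattr(layer, "reset_states"):
            layer.reset_states()
        elif hasattr(layer, "reset_state"):
            layer.reset_state()
```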
Upvotes: 1