Reputation: 10033
I have a series of dependent X variables that I want to use to predict Y, where the X columns vary slightly between each series.
if train == 'A':
    pred = df[['X1','X2','X3','X4','X5','X6','X7','X8','Y']]
elif train == 'B':
    pred = df[['X1','X2','X3','X4','X5','X6','X7','X8','Y']]
elif train == 'C':
    pred = df[['X1','X2','X3b','X4','X5','X6','X7','X8','Y']]
elif train == 'D':
    pred = df[['X1','X2','X3b','X4','X5','X6','X7','X8','Y']]
elif train == 'E':
    pred = df[['X1','X2b','X3b','X4','X5','X6','X7','X8','Y']]
And then I try to predict y for all vectors above, passing pred for 'A', 'B', 'C', 'D', 'E', sequentially:
....
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

X = pred.drop(axis=1, columns=['Y'])
# normalize data
X = X.astype('float32') / 255.
y = pred['Y']
# normalize data
y = y.astype('float32') / 255.
network = Sequential()
network.add(Dense(8, input_shape=(8,), activation='relu'))
network.add(Dense(6, activation='relu'))
network.add(Dense(6, activation='relu'))
network.add(Dense(4, activation='relu'))
network.add(Dense(1, activation='relu'))
network.compile('adam', loss='mse', metrics=['mae'])
network.fit(X, y, epochs=2)
y_hat = network.predict(X, verbose=1)
Most of the time, however, the predictions fail, and y_hat comes out as an array of zeros:
[[0.]
[0.]
[0.]
[0.]
[0.]
[0.]
[0.]
..]]
except when train == 'D', when the network predicts a meaningful array, like so:
[[0.00562879]
[0.00741612]
[0.0066563 ]
[0.00720819]
[0.00596035]
[0.00612469]
[0.00808392]
....]]
Any ideas on why my Keras model prediction works only for the 'D' variables, always on the fourth iteration, and how to fix this? It makes no sense to me.
PS: If I run the model only for 'D', it does not work either. So it seems the model 'catches on' after the fourth iteration... so I assume it has to do with some initialization process.
Upvotes: 1
Views: 36
Reputation: 7666
Maybe switching ReLU to LeakyReLU will resolve the issue.
From Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, Aurélien Géron, 2019:
Unfortunately, the ReLU activation function is not perfect. It suffers from a problem known as the dying ReLUs: during training, some neurons effectively “die,” meaning they stop outputting anything other than 0. In some cases, you may find that half of your network’s neurons are dead, especially if you used a large learning rate. A neuron dies when its weights get tweaked in such a way that the weighted sum of its inputs are negative for all instances in the training set. When this happens, it just keeps outputting zeros, and Gradient Descent does not affect it anymore because the gradient of the ReLU function is zero when its input is negative.
You can use LeakyReLU like this:
from tensorflow.keras.layers import LeakyReLU  # for leaky ReLU

model.add(Dense(8, input_shape=(8,)))
model.add(LeakyReLU(alpha=0.05))  # small negative slope keeps gradients alive
model.add(Dense(5))
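The difference between the two activations can be sketched in plain NumPy (the alpha value and sample inputs are illustrative):

```python
import numpy as np

def relu(x):
    # negative inputs collapse to 0, so their gradient is 0 as well
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.05):
    # negative inputs keep a small slope alpha instead of flattening to zero
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.0])
print(relu(x))        # [ 0.  0.  0.  1.]
print(leaky_relu(x))  # [-0.1  -0.025  0.  1.]
```

Because the leaky variant never has a zero gradient for negative inputs, a neuron can recover during training instead of staying "dead".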
Upvotes: 1