8-Bit Borges

Reputation: 10033

Keras predict only working for one set of variables

I have several series of X variables that I want to use to predict Y, where the X columns vary slightly between series.

    if train=='A':
        pred = df[['X1','X2','X3','X4','X5','X6','X7','X8','Y']]
    
    elif train=='B':
        pred = df[['X1','X2','X3','X4','X5','X6','X7','X8','Y']]
    
    elif train=='C':
        pred = df[['X1','X2','X3b','X4','X5','X6','X7','X8','Y']]
    
    elif train=='D':
        pred = df[['X1','X2','X3b','X4','X5','X6','X7','X8','Y']]
    
    elif train == 'E':
        pred = df[['X1','X2b','X3b','X4','X5','X6','X7','X8','Y']]
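
For reference, the selection above can also be written as a single dictionary lookup (a sketch, assuming train only ever takes these five values):

    # map each train set to its column list, then select once
    columns = {
        'A': ['X1', 'X2',  'X3',  'X4', 'X5', 'X6', 'X7', 'X8', 'Y'],
        'B': ['X1', 'X2',  'X3',  'X4', 'X5', 'X6', 'X7', 'X8', 'Y'],
        'C': ['X1', 'X2',  'X3b', 'X4', 'X5', 'X6', 'X7', 'X8', 'Y'],
        'D': ['X1', 'X2',  'X3b', 'X4', 'X5', 'X6', 'X7', 'X8', 'Y'],
        'E': ['X1', 'X2b', 'X3b', 'X4', 'X5', 'X6', 'X7', 'X8', 'Y'],
    }
    pred = df[columns[train]]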

I then try to predict Y for each of the sets above, passing pred for 'A', 'B', 'C', 'D', 'E' sequentially:

    ....

    X = pred.drop(axis=1, columns=['Y'])
    # normalize data
    X = X.astype('float32') / 255.
    y = pred['Y']
    # normalize data
    y = y.astype('float32') / 255.

    network = Sequential()

    network.add(Dense(8, input_shape=(8,), activation='relu'))
    network.add(Dense(6, activation='relu'))
    network.add(Dense(6, activation='relu'))
    network.add(Dense(4, activation='relu'))
    network.add(Dense(1, activation='relu'))

    network.compile('adam', loss='mse', metrics=['mae'])

    network.fit(X, y, epochs=2)

    y_hat = network.predict(X, verbose=1)

Most of the time, however, the prediction fails and y_hat comes out as an array of zeros:

[[0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 [0.]
 ..]]

except when train == 'D', when the network predicts a meaningful array, like so:

[[0.00562879]
 [0.00741612]
 [0.0066563 ]
 [0.00720819]
 [0.00596035]
 [0.00612469]
 [0.00808392]
 ....]]

Any ideas on why my Keras model prediction only works for the 'D' variables, always on the fourth iteration, and how to fix this? It makes no sense to me.

PS:

If I run the model only for 'D', it does not work either. So it seems the model 'catches on' after the fourth iteration, which makes me assume it has to do with some initialization process.

Upvotes: 1

Views: 36

Answers (1)

meTchaikovsky

Reputation: 7666

Maybe switching ReLU to LeakyReLU will resolve the issue.

From Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, Aurélien Géron, 2019:

Unfortunately, the ReLU activation function is not perfect. It suffers from a problem known as the dying ReLUs: during training, some neurons effectively “die,” meaning they stop outputting anything other than 0. In some cases, you may find that half of your network’s neurons are dead, especially if you used a large learning rate. A neuron dies when its weights get tweaked in such a way that the weighted sum of its inputs are negative for all instances in the training set. When this happens, it just keeps outputting zeros, and Gradient Descent does not affect it anymore because the gradient of the ReLU function is zero when its input is negative.

You can use LeakyReLU like this:

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, LeakyReLU  # LeakyReLU is added as its own layer

    model = Sequential()
    model.add(Dense(8, input_shape=(8,)))   # no activation here
    model.add(LeakyReLU(alpha=0.05))        # leaky relu applied as a separate layer
    model.add(Dense(5))
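
Applied to the network in the question, a minimal sketch could look like this (same layer sizes as in the question; the alpha value is just an example, and the linear output layer is an assumption for the regression target):

    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense, LeakyReLU

    # same layer sizes as the original network, each ReLU replaced by a LeakyReLU layer
    network = Sequential()
    network.add(Dense(8, input_shape=(8,)))
    network.add(LeakyReLU(alpha=0.05))
    network.add(Dense(6))
    network.add(LeakyReLU(alpha=0.05))
    network.add(Dense(6))
    network.add(LeakyReLU(alpha=0.05))
    network.add(Dense(4))
    network.add(LeakyReLU(alpha=0.05))
    network.add(Dense(1))  # linear output, assumed here for the regression target

    network.compile('adam', loss='mse', metrics=['mae'])

Because LeakyReLU keeps a small gradient for negative inputs, neurons can recover even if their weighted sums become negative, which is what the dying-ReLU passage above describes.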

Upvotes: 1
