philgun

Reputation: 189

Neural network randomly fails to learn on some runs with the same dataset

I have successfully created a supervised sequential model. The input is 4-dimensional and the output is 1-dimensional. The dataset is scaled with a MinMaxScaler. I have 5 hidden layers with 12 neurons each. All the kernel initializers for the input, hidden, and output layers are he_normal, and the activation is ReLU for all layers, including the input and output layers.

I have 500 samples and a batch size of 128.

Sometimes the loss function (MSE) does not improve at all, see below:

Epoch 1/2000
14/14 [==============================] - 0s 10ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862
Epoch 2/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862
Epoch 3/2000
14/14 [==============================] - 0s 3ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862
Epoch 4/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862
Epoch 5/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862
Epoch 6/2000
14/14 [==============================] - 0s 3ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862
Epoch 7/2000
14/14 [==============================] - 0s 3ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862
Epoch 8/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.6348 - mape: 77.9929 - mse: 0.6348 - mae: 0.7799 - val_mse: 0.5862 - val_mae: 0.7431 - val_mape: 74.3140 - val_loss: 0.5862

Then I stop the run (Ctrl+Z), re-execute the script, and it becomes:

Epoch 1/2000
14/14 [==============================] - 0s 10ms/step - loss: 0.2896 - mae: 0.4896 - mse: 0.2896 - mape: 48.9583 - val_mape: 49.4280 - val_loss: 0.2901 - val_mse: 0.2901 - val_mae: 0.4943
Epoch 2/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.2425 - mae: 0.4380 - mse: 0.2425 - mape: 43.8044 - val_mape: 43.7562 - val_loss: 0.2393 - val_mse: 0.2393 - val_mae: 0.4376
Epoch 3/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.2016 - mae: 0.3890 - mse: 0.2016 - mape: 38.9016 - val_mape: 38.2766 - val_loss: 0.1962 - val_mse: 0.1962 - val_mae: 0.3828
Epoch 4/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.1680 - mae: 0.3478 - mse: 0.1680 - mape: 34.7794 - val_mape: 33.7354 - val_loss: 0.1616 - val_mse: 0.1616 - val_mae: 0.3374
Epoch 5/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.1413 - mae: 0.3156 - mse: 0.1413 - mape: 31.5616 - val_mape: 30.1136 - val_loss: 0.1346 - val_mse: 0.1346 - val_mae: 0.3011
Epoch 6/2000
14/14 [==============================] - 0s 3ms/step - loss: 0.1207 - mae: 0.2890 - mse: 0.1207 - mape: 28.9029 - val_mape: 27.5114 - val_loss: 0.1143 - val_mse: 0.1143 - val_mae: 0.2751
Epoch 7/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.1061 - mae: 0.2693 - mse: 0.1061 - mape: 26.9261 - val_mape: 25.3930 - val_loss: 0.0994 - val_mse: 0.0994 - val_mae: 0.2539
Epoch 8/2000
14/14 [==============================] - 0s 2ms/step - loss: 0.0957 - mae: 0.2548 - mse: 0.0957 - mape: 25.4783 - val_mape: 23.8949 - val_loss: 0.0892 - val_mse: 0.0892 - val_mae: 0.2389

I wonder what happened during the first run. Why, with the same dataset, did the model learn on the second run but not on the first? Is there anything I can do programmatically to ensure that the model learns on every execution of the script? I want to include the model in an adaptive sampling algorithm that executes the training phase more than once in one go.

Any help would be much appreciated!

Cheers, PG

PS: Below are the code snippet, a data sample, and the network architecture.

Data Samples (4 inputs [Q_in,Tamb,T_in,H_drop] 1 output [eff_rcv])

Q_in,Tamb,T_in,H_drop,eff_rcv
609496059.800000,271.792985,807.218964,35.445493,0.870245
783459291.300000,314.275101,828.283384,31.161965,0.923094
391056216.100000,307.686423,816.411201,39.310067,0.748878
289120690.100000,292.437028,828.729067,29.114747,0.812813
508971245.000000,284.898844,812.819974,38.611096,0.817931
.
.
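For completeness, the DataFrame df used in the snippet below could be loaded from such a CSV roughly as follows; the file name samples.csv is an assumption for illustration, not from the original post:

import pandas as pd

# Load the raw samples; "samples.csv" is a placeholder for the actual file name
df = pd.read_csv("samples.csv")   # columns: Q_in, Tamb, T_in, H_drop, eff_rcv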

Code Snippet

#Imports needed by the snippet below
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras import initializers, backend

#Problem dimensions
inputdim = 4
outputdim = 1

#Training settings taken from the description above
activation = 'relu'
batch_size = 128
epochs = 2000
opt = 'adam'   #placeholder - the actual optimizer is not shown in the original snippet

#Split the raw data into input (X) and output (y)
X_raw = df[df.columns[0:inputdim]].to_numpy()

#Convert the output to a 2D array
y_raw = df[df.columns[-1]].to_numpy()
y_raw = y_raw.reshape(-1,1)

#Separate scalers for input and output
mm_x = MinMaxScaler()
mm_y = MinMaxScaler()

#Scaling the data
X_scaled = mm_x.fit_transform(X_raw)
y_scaled = mm_y.fit_transform(y_raw)

#Split into train/test - 85% training, 15% testing
Xtrain,Xtest,ytrain,ytest = train_test_split(X_scaled,y_scaled,test_size=0.15)

######################  BUILD MODEL ############################

#Number of neurons in each hidden layer
num_neurons = 3*inputdim

#Neural network architecture: five hidden layers of equal width
network_layout = []
for i in range(5):
    network_layout.append(num_neurons)

#Building the neural network
model = Sequential()

#Adding the input layer and the first hidden layer
model.add(Dense(network_layout[0],
                name="Input",
                input_dim=inputdim,
                kernel_initializer=initializers.RandomNormal(),
                bias_initializer=initializers.Zeros(),
                use_bias=True,
                activation=activation))

#Adding the rest of the hidden layers
for numneurons in network_layout[1:]:
    model.add(Dense(numneurons,
                    kernel_initializer=initializers.RandomNormal(),
                    bias_initializer=initializers.Zeros(),
                    activation=activation))

#Adding the output layer
model.add(Dense(outputdim,
                name="Output",
                kernel_initializer=initializers.RandomNormal(),
                bias_initializer=initializers.Zeros(),
                activation="relu"))

#Large epsilon to avoid division by tiny values in the MAPE metric
backend.set_epsilon(1)

#Compiling the model
model.compile(optimizer=opt,loss='mse',metrics=['mse','mae','mape'])
model.summary()

#Training the model
history = model.fit(x=Xtrain,y=ytrain,validation_data=(Xtest,ytest),batch_size=batch_size,epochs=epochs)

Upvotes: 0

Views: 46

Answers (1)

drops

Reputation: 1604

You could set the random seed before training your model. That way, you can ensure that the outcome will be the same on each run of the script.

import tensorflow as tf

tf.keras.backend.clear_session()
tf.random.set_seed(1)
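Since the question mentions retraining inside an adaptive sampling loop, here is a minimal sketch of how the reset and seeding could be applied before every retraining. build_model, n_iterations, and the 'adam' optimizer are placeholders for illustration, and Xtrain/ytrain/Xtest/ytest are the arrays from the question's snippet:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def build_model():
    # Minimal stand-in for the model defined in the question's snippet
    m = Sequential([Dense(12, activation='relu', input_dim=4),
                    Dense(1, activation='relu')])
    m.compile(optimizer='adam', loss='mse')
    return m

n_iterations = 3   # placeholder for the number of adaptive sampling iterations

for i in range(n_iterations):
    # Reset Keras state and fix the seed so every retraining starts from the same point
    tf.keras.backend.clear_session()
    tf.random.set_seed(1)
    model = build_model()
    model.fit(Xtrain, ytrain,
              validation_data=(Xtest, ytest),
              batch_size=128, epochs=2000)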

Concerning why it was not improving in the first snippet: it is possible that you ran into a local minimum you couldn't get out of.

Upvotes: 1
