Mandallaz

Reputation: 55

Training a neural network, I can't figure out my learning curves

I am cross-validating a model with 5 splits. For each split, I plotted the loss and the val_loss over the epochs.

I get something like this: [Training plots, by split]

I find these plots puzzling:

  1. Only the first one looks "normal" to me.
  2. The loss values are extremely low.
  3. I don't see how the val_loss can be consistently lower than the training loss (plots 4 and 5).

How I cross-validate:

from sklearn.model_selection import KFold


def cv(X, y, model, n_splits=5, epochs=5, batch_size=1024,
       random_state=42, verbose=0):
    # kf = KFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    kf = KFold(n_splits=n_splits, shuffle=False)  # random_state only applies when shuffle=True
    histories = []
    for train_idx, test_idx in kf.split(X):
        X_train = X.iloc[train_idx].to_numpy()
        y_train = y.iloc[train_idx]['Target'].to_numpy()
        X_test = X.iloc[test_idx].to_numpy()
        y_test = y.iloc[test_idx]['Target'].to_numpy()

        h = model.fit(X_train, y_train,
                      epochs=epochs,
                      batch_size=batch_size,
                      validation_data=(X_test, y_test),
                      verbose=verbose)
        histories.append(h)
    return histories

The model:

import inspect

import tensorflow as tf
from tensorflow.keras.activations import swish
from tensorflow.keras.layers import Activation, Dense, Input
from tensorflow.keras.models import Model
from tensorflow.keras.utils import get_custom_objects


def model_8(input_dim) -> tf.keras.Model:
    # Register 'swish' so it can be referenced by name in Dense layers.
    get_custom_objects().update({'swish': Activation(swish)})

    inputs = Input(shape=(input_dim,))

    # Eleven identical hidden layers: hl_1 .. hl_11.
    x = inputs
    for i in range(1, 12):
        x = Dense(200, activation='swish', name=f'hl_{i}')(x)

    output = Dense(1, activation='sigmoid', name='output')(x)

    model = Model(inputs=inputs, outputs=output)
    model.compile(loss='mean_squared_error',
                  optimizer='adam')
    # Name the model after the enclosing function.
    model._name = inspect.stack()[0][3]
    return model
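
For reference, a minimal sketch of how I wire these two pieces together (assuming, as in the code above, that X is a pandas DataFrame of features and y a DataFrame with a 'Target' column):

# Usage sketch: X and y are assumed to be the DataFrames used above.
model = model_8(input_dim=X.shape[1])
histories = cv(X, y, model, n_splits=5, epochs=5)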

The plot function:

import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator


def plots_(models_cv_histories, n_splits, save=False):
    """Plot the learning curves of each trained model, one subplot per split.

    Arguments:
        models_cv_histories {array} -- array of (model, histories) pairs
    """
    nb_models = len(models_cv_histories)
    fig, axes = plt.subplots(nrows=nb_models,
                             ncols=n_splits, figsize=(12, 5))

    for row_index, cv_model in enumerate(models_cv_histories):
        hist = cv_model[1]

        for col_index, split_ in enumerate(hist):
            loss = split_.history['loss']
            epochs = split_.epoch
            val_loss = split_.history['val_loss']
            model_name = split_.model.name

            if nb_models > 1:
                ax = axes[row_index][col_index]
            else:
                ax = axes[col_index]

            ax.set_title(model_name + ' split ' + str(col_index))
            ax.plot(epochs, loss, color="r", label="loss")
            ax.plot(epochs, val_loss, color="g", label="val_loss")
            ax.set_xlabel("epochs")
            ax.xaxis.set_major_locator(MaxNLocator(integer=True))
            ax.set_ylabel("loss")
            ax.legend(loc="upper right")

    fig.tight_layout()
    if save:
        plt.savefig("plots/test.png")
    plt.show()
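
And a sketch of how I call it; the second element of each pair is the list returned by cv, which is what the cv_model[1] indexing above expects:

# Usage sketch: plots_ takes (model, histories) pairs, per cv_model[1] above.
plots_([(model, histories)], n_splits=5)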

Could you give me some hints?

Upvotes: 0

Views: 58

Answers (1)

Mandallaz

Reputation: 55

I got the answer here: the fit function keeps training the same model across splits, so each split starts from the weights learned on the previous ones. I should have re-created (re-initialized) the model for each split.
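
A minimal sketch of the fix, building a fresh model inside the loop instead of reusing one instance (the name cv_fixed is mine; it assumes a model factory like model_8 above):

def cv_fixed(X, y, model_fn, input_dim, n_splits=5, epochs=5,
             batch_size=1024, verbose=0):
    kf = KFold(n_splits=n_splits, shuffle=False)
    histories = []
    for train_idx, test_idx in kf.split(X):
        # Build a fresh, freshly initialized model for every split.
        model = model_fn(input_dim)
        h = model.fit(X.iloc[train_idx].to_numpy(),
                      y.iloc[train_idx]['Target'].to_numpy(),
                      epochs=epochs,
                      batch_size=batch_size,
                      validation_data=(X.iloc[test_idx].to_numpy(),
                                       y.iloc[test_idx]['Target'].to_numpy()),
                      verbose=verbose)
        histories.append(h)
    return histories

# e.g. histories = cv_fixed(X, y, model_8, input_dim=X.shape[1])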

Upvotes: 1
