Jun Liu

Reputation: 67

Is my training data set too complex for my neural network?

I am new to machine learning and Stack Overflow. I am trying to interpret two graphs from my regression model.

[Plot: training error and validation error from my machine learning model]

My case is similar to this question: Very large loss values when training multiple regression model in Keras, but my MSE and RMSE are very high.

Is my model underfitting? If so, what can I do to solve this problem?

Here is the neural network I used for this regression problem:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    model = keras.Sequential([
        layers.Dense(128, activation=tf.nn.relu, input_shape=[len(train_dataset.keys())]),
        layers.Dense(64, activation=tf.nn.relu),
        layers.Dense(1)
    ])
    optimizer = tf.keras.optimizers.RMSprop(0.001)

    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model

As for my data set, I have 500 samples, 10 features, and 1 target.

Upvotes: 4

Views: 1948

Answers (3)

Aman khan Roohani

Reputation: 159

As mentioned in the existing answer by @omoshiroiii, your model in fact seems to be overfitting; that's why the RMSE and MSE are so high.

Your model has learned the detail and noise in the training data to the extent that it is now hurting its performance on new data.

One solution is therefore to randomly drop some of the nodes during training so that the model cannot rely on any of them too heavily.
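A minimal sketch of what that could look like in Keras (the 0.2 rate and layer sizes here are just illustrative starting points, not values from your question):

import tensorflow as tf
from tensorflow.keras import layers

# Dropout between the dense layers zeroes a random 20% of activations
# on each training step, so the model cannot rely on any single node.
model = tf.keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=[10]),  # 10 features
    layers.Dropout(0.2),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)
])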

Upvotes: 1

Hossein Karamzadeh

Reputation: 115

Well, I think your model is overfitting.

There are several approaches that can help you:

1. Reduce the network's capacity, which you can do by removing layers or reducing the number of elements in the hidden layers.

2. Add dropout layers, which will randomly remove certain features by setting them to zero.

3. Regularization.

A brief explanation of each:

- Reduce the network's capacity:

Some models have a large number of trainable parameters. The higher this number, the more easily the model can memorize the target for each training sample, which is obviously not ideal for generalizing to new data. By lowering the capacity of the network, you force it to learn only the patterns that matter, i.e. the ones that minimize the loss. But remember: reducing the network's capacity too much will lead to underfitting.
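For instance, a lower-capacity version of the model from the question might look like the sketch below (the layer width of 32 is just a guess to illustrate the idea; you would need to tune it):

import tensorflow as tf
from tensorflow.keras import layers

# Fewer and narrower hidden layers mean fewer trainable parameters,
# leaving the model less room to memorize noise in only 500 samples.
def build_small_model():
    model = tf.keras.Sequential([
        layers.Dense(32, activation='relu', input_shape=[10]),
        layers.Dense(1)
    ])
    model.compile(loss='mean_squared_error',
                  optimizer=tf.keras.optimizers.RMSprop(0.001),
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model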

- Regularization:

This page can help you a lot: https://towardsdatascience.com/handling-overfitting-in-deep-learning-models-c760ee047c6e
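As a rough sketch, weight regularization in Keras is typically added per layer through kernel_regularizer; the 0.001 factor below is just an illustrative value:

import tensorflow as tf
from tensorflow.keras import layers, regularizers

# L2 regularization adds 0.001 * sum(w^2) of each layer's weights to
# the loss, penalizing large weights and discouraging memorization.
model = tf.keras.Sequential([
    layers.Dense(128, activation='relu',
                 kernel_regularizer=regularizers.l2(0.001),
                 input_shape=[10]),
    layers.Dense(64, activation='relu',
                 kernel_regularizer=regularizers.l2(0.001)),
    layers.Dense(1)
])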

- Dropout layer:

You can add a layer like this:

model.add(layers.Dropout(0.5))

This is a dropout layer with a 50% chance of setting inputs to zero.

For more details you can see this page:

https://machinelearningmastery.com/how-to-reduce-overfitting-with-dropout-regularization-in-keras/

Upvotes: 2

omoshiroiii

Reputation: 693

Quite the opposite: it looks like your model is over-fitting. When you have low error rates on your training set, it means your model has learned from the data well and can infer results accurately. If your validation error is high afterwards, however, the information learned from your training data is not being applied successfully to new data. This is because your model has 'fit' your training data too closely, and has only learned how to predict well when it is based on that data.

To solve this, we can introduce common techniques for reducing over-fitting. A very common one is to use Dropout layers. These randomly remove some of the nodes so that the model cannot correlate with them too heavily, thereby reducing dependency on those nodes and 'learning' more through the other nodes too. I've included an example that you can test below; try playing with the rate and other techniques to see what works best. And as a side note: are you sure you need that many nodes in your dense layers? That seems like quite a lot for your data set, and it may be contributing to the over-fitting as a result too.

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def build_model():
    model = keras.Sequential([
        layers.Dense(128, activation=tf.nn.relu, input_shape=[len(train_dataset.keys())]),
        layers.Dropout(0.2),  # randomly zeroes 20% of activations during training
        layers.Dense(64, activation=tf.nn.relu),
        layers.Dense(1)
    ])
    optimizer = tf.keras.optimizers.RMSprop(0.001)

    model.compile(loss='mean_squared_error',
                  optimizer=optimizer,
                  metrics=['mean_absolute_error', 'mean_squared_error'])
    return model

Upvotes: 6
