Reputation: 917
I have written a small neural network for classifying cars and non-cars images. I need help with avoiding over-fitting. The model is shown below:
model = Sequential()
model.add(Conv2D(8, 3, 3, input_shape=X.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(16, 3, 3, input_shape=X.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(256))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
I am using generators:
generator = ImageDataGenerator( featurewise_center=True,
samplewise_center=False,
featurewise_std_normalization=False,
samplewise_std_normalization=False,
zca_whitening=False,
rotation_range=20.,
width_shift_range=0.4,
height_shift_range=0.4,
shear_range=0.2,
zoom_range=0.2,
channel_shift_range=0.1,
fill_mode='nearest',
horizontal_flip=True,
vertical_flip=False,
rescale=1.2,
preprocessing_function=None)
Ultimately, training acc is 98% whereas valid acc is 70%. Can you suggest something?
Upvotes: 2
Views: 669
Reputation: 491
I would comment, but I am too new to the site to comment. I agree with Miriam, overfitting is simply saying "believing the training data too much". What is happening in a neural net is essentially a function that outputs a classification (since you are doing classification vs regression). It means that you have a line and everything under a line is of a class, and everything above is another. By increasing the nodes/layer and number of layers total, you are allowing your neural net to represent a more complex function. So by adding more layers/nodes you will always get a better score on your training set, but not necessarily on other data. Imagine a bunch of points in a line, but they are not directly on the line. and there are some outliers. maybe the proper function to represent it would be a straight line, but a huge neural network might fit the points perfectly with some crazy complex function. When adding new points, the line would give a better classification since the neural network is trying fitting your training data so closely. If you are overfitting, I would say the first place to look is the complexity of your neural network.
Upvotes: 0
Reputation: 19664
I would suggest to try to reduce the size of the layers, as this may be the reason for the overfitting (having too many parameters to train).
For example, this layer model.add(Dense(256))
might be too large. You can try to replace the 256 with something in the range 50-70, see how it works, and continue from there. You may also try to decrease the size\amount of convolutional layers.
Upvotes: 2
Reputation: 40526
So I could see at least two techniques:
dropout
.valid
set in your training
set. You might try to increase the value of train - valid
split and check if the loss values are closer to each other. Upvotes: 1