Reputation: 789
I am working with text sequences with a sequence length between 1 and 3. The labels are a "score". I have over 5 million samples. My network looks like this (Keras):
from keras.models import Sequential
from keras.layers import Embedding, BatchNormalization, Dense, Flatten

model = Sequential()
# Embedding output is 3D (batch, 3, 128); the Dense layers below are applied
# to the last axis, i.e. independently per timestep, until the Flatten.
model.add(Embedding(word_count, 128, input_length=3))
model.add(BatchNormalization())
model.add(Dense(128, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(512, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(1024, activation='relu'))
model.add(Flatten())
model.add(Dense(1, activation='linear'))
I have tried many different network shapes and configurations, including with and without Dropout and BatchNorm, but my loss always looks like this:
[loss plot]
I am using a batch size of 1024 and the Adam optimiser.
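In code terms, the compile/fit call is wired up roughly like this (loss_fn and the array names are placeholders; only the optimiser and batch size are fixed):

model.compile(optimizer='adam', loss=loss_fn)  # loss_fn: a regression loss, not specified above
model.fit(X_train, y_train,
          batch_size=1024,
          validation_data=(X_test, y_test),
          epochs=20)  # epoch count arbitrary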
As far as I can tell, there are no differences between the training and testing datasets with regard to pre-processing etc.
Any suggestions on how I can diagnose this?
Upvotes: 3
Views: 4393
Reputation: 789
I found the problem. I was shuffling the test data between epochs, when I meant to shuffle only the training data. Thank you for your comments.
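For anyone hitting the same thing: when shuffling manually, it is easy to permute the inputs and labels independently, or to touch the held-out set by accident. A minimal sketch of a per-epoch shuffle that only touches the training split (the array names X_train, y_train, X_test and y_test are placeholders, not from my actual code):

import numpy as np

for epoch in range(10):
    # One shared permutation keeps inputs and labels aligned.
    idx = np.random.permutation(len(X_train))
    model.fit(X_train[idx], y_train[idx],
              batch_size=1024, epochs=1,
              validation_data=(X_test, y_test))  # test set is never reordered

If a single model.fit call runs all epochs, Keras already does this for you: shuffle=True (the default) reshuffles only the training data each epoch and never touches validation_data.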
Upvotes: 2
Reputation: 13
First of all, you should split your dataset:
model.fit(X, Y, validation_split=0.1, epochs=100, batch_size=10)
Then you can see how the validation loss changes relative to the training loss.
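With a validation split, model.fit returns a History object whose history dict records the per-epoch training and validation loss, so you can compare the two curves directly. Note that validation_split takes the last 10% of the samples as given, before any shuffling, so the data should not be ordered by score. A minimal sketch:

history = model.fit(X, Y, validation_split=0.1, epochs=100, batch_size=10)

# 'loss' is the training loss, 'val_loss' the validation loss, one entry per epoch.
print(history.history['loss'])
print(history.history['val_loss'])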
Upvotes: 1