Ankit Kotharkar

Reputation: 31

CNN regression model gives similar output for all inputs

I'm trying to build a CNN regression model. The input data consists of satellite images with 5 spectral bands (256x256x5), captured yearly over 10 years and stacked along the channel axis to form a 256x256x50 array.
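For reference, here is a minimal sketch of how such a stack could be assembled (the yearly_images list is a hypothetical stand-in for the real imagery):

import numpy as np

# Hypothetical: one 256x256x5 array per year, for 10 years
yearly_images = [np.random.rand(256, 256, 5).astype("float32") for _ in range(10)]

# Stack along the channel axis: 10 years x 5 bands = 50 channels
stacked = np.concatenate(yearly_images, axis=-1)
print(stacked.shape)  # (256, 256, 50)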

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.regularizers import l2

img_size = 256
channels = 50
input_shape = (img_size, img_size, channels)
chanDim = 1  # axis passed to BatchNormalization (note: channels-last data keeps channels on axis -1)
reg = l2(0.0005)
init = 'he_normal'

model = models.Sequential()
model.add(layers.Conv2D(64, (7, 7),strides=(2,2),padding='valid',
                        kernel_initializer=init,
                        kernel_regularizer=reg, input_shape=input_shape))
model.add(layers.Activation('gelu'))
model.add(layers.BatchNormalization(axis=chanDim))

model.add(layers.Conv2D(32, (3, 3),padding='valid',
                        kernel_initializer=init,
                        kernel_regularizer=reg))
model.add(layers.Activation('gelu'))
model.add(layers.BatchNormalization(axis=chanDim))

model.add(layers.Conv2D(64, (3, 3),padding='valid',
                        kernel_initializer=init,
                        kernel_regularizer=reg))
model.add(layers.Activation('relu'))
model.add(layers.BatchNormalization(axis=chanDim))
model.add(layers.Dropout(0.25))

model.add(layers.Conv2D(64, (3, 3),padding='valid',
                        kernel_initializer=init,
                        kernel_regularizer=reg))
model.add(layers.Activation('relu'))
model.add(layers.BatchNormalization(axis=chanDim))
model.add(layers.Conv2D(128, (3, 3),padding='valid',
                        kernel_initializer=init,
                        kernel_regularizer=reg))
model.add(layers.Activation('relu'))
model.add(layers.BatchNormalization(axis=chanDim))
model.add(layers.Dropout(0.25))

model.add(layers.Conv2D(128, (3, 3),padding='valid',
                        kernel_initializer=init,
                        kernel_regularizer=reg))
model.add(layers.Activation('relu'))
model.add(layers.BatchNormalization(axis=chanDim))

# model.add(layers.Conv2D(512, (3, 3),padding='valid',
#                         kernel_initializer=init,
#                         kernel_regularizer=reg))
# model.add(layers.Activation('relu'))
# model.add(layers.BatchNormalization(axis=chanDim))
# model.add(layers.Dropout(0.25))

model.add(layers.Flatten())
model.add(layers.Dense(128, activation='gelu'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dropout(.5))
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dense(1, activation='relu'))
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-7), loss='mae')

Training log:

Epoch 1/30
8/8 [==============================] - 208s 26s/step - loss: 1.3836 - val_loss: 1.3476
Epoch 2/30
8/8 [==============================] - 81s 11s/step - loss: 1.3826 - val_loss: 1.3476
Epoch 3/30
8/8 [==============================] - 61s 8s/step - loss: 1.3863 - val_loss: 1.3476
Epoch 4/30
8/8 [==============================] - 60s 8s/step - loss: 1.3837 - val_loss: 1.3476
Epoch 5/30
8/8 [==============================] - 61s 8s/step - loss: 1.3785 - val_loss: 1.3476
Epoch 6/30
8/8 [==============================] - 60s 8s/step - loss: 1.3863 - val_loss: 1.3476
Epoch 7/30
8/8 [==============================] - 60s 8s/step - loss: 1.3869 - val_loss: 1.3476
Epoch 8/30
8/8 [==============================] - 60s 8s/step - loss: 1.3665 - val_loss: 1.3476
Epoch 9/30
8/8 [==============================] - 60s 8s/step - loss: 1.3060 - val_loss: 1.3476
Epoch 10/30
8/8 [==============================] - 61s 8s/step - loss: 1.2391 - val_loss: 1.3443
Epoch 11/30
8/8 [==============================] - 60s 8s/step - loss: 1.1757 - val_loss: 1.2622
Epoch 12/30
8/8 [==============================] - 61s 8s/step - loss: 1.1277 - val_loss: 1.1432
Epoch 13/30
8/8 [==============================] - 60s 8s/step - loss: 1.0967 - val_loss: 1.0280
Epoch 14/30
8/8 [==============================] - 60s 8s/step - loss: 1.0408 - val_loss: 0.9306
Epoch 15/30
8/8 [==============================] - 61s 8s/step - loss: 1.0423 - val_loss: 0.8529
Epoch 16/30
8/8 [==============================] - 60s 8s/step - loss: 1.0277 - val_loss: 0.7910
Epoch 17/30
8/8 [==============================] - 61s 8s/step - loss: 1.0800 - val_loss: 0.7385
Epoch 18/30
8/8 [==============================] - 61s 8s/step - loss: 0.9982 - val_loss: 0.6957
Epoch 19/30
8/8 [==============================] - 62s 8s/step - loss: 1.0466 - val_loss: 0.6648
Epoch 20/30
8/8 [==============================] - 61s 8s/step - loss: 1.0755 - val_loss: 0.6431
Epoch 21/30
8/8 [==============================] - 61s 8s/step - loss: 0.9773 - val_loss: 0.6270
Epoch 22/30
8/8 [==============================] - 61s 8s/step - loss: 0.9878 - val_loss: 0.6173
Epoch 23/30
8/8 [==============================] - 62s 8s/step - loss: 0.9546 - val_loss: 0.6107
Epoch 24/30
8/8 [==============================] - 62s 8s/step - loss: 0.9736 - val_loss: 0.6066
Epoch 25/30
8/8 [==============================] - 62s 8s/step - loss: 0.9398 - val_loss: 0.6051
Epoch 26/30
8/8 [==============================] - 61s 8s/step - loss: 0.9513 - val_loss: 0.6064
Epoch 27/30
8/8 [==============================] - 61s 8s/step - loss: 0.9850 - val_loss: 0.6085
Epoch 28/30
8/8 [==============================] - 61s 8s/step - loss: 0.9534 - val_loss: 0.6120
<tensorflow.python.keras.callbacks.History at 0x7f7e8049b630>

But predictions[:10] and expected_values[:10] are:

predictions[:10]:

[[0.75141275]
 [0.9683605 ]
 [1.0075892 ]
 [0.9710504 ]
 [1.0537224 ]
 [0.95761603]
 [0.8781187 ]
 [0.9666001 ]
 [1.0071822 ]
 [0.8568193 ]]

expected_values[:10]:

[0.96850154 0.98255504 0.88197998 0.7692161  0.9462668  0.81489973
 0.99938562 0.93442511 0.98891429 0.97386952]

Evaluation scores are:

[actual vs. prediction plot]

Any ideas?

Upvotes: 0

Views: 537

Answers (2)

DankMasterDan

Reputation: 2133

It could also be that the problem is simply very hard to learn. I've had the same symptom: for about 6 hours the network produced identical outputs in every batch (which happens because predicting the 'average' answer is the easiest way to minimize the loss), and then it finally started learning:

[training loss plot showing the network eventually starting to learn]

Some things I plan on doing to make the learning start earlier:

  1. adjusting the learning rate during training (see the sketch below)
  2. using features from earlier in the network
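For the first point, a minimal sketch using Keras's built-in ReduceLROnPlateau callback (x_train, y_train, x_val, y_val are placeholders for your own data):

from tensorflow.keras.callbacks import ReduceLROnPlateau

# Halve the learning rate whenever validation loss stops improving for 3 epochs
lr_schedule = ReduceLROnPlateau(monitor='val_loss', factor=0.5,
                                patience=3, min_lr=1e-8, verbose=1)

# Hypothetical training call; substitute your own arrays/generators
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=30, callbacks=[lr_schedule])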

Upvotes: 0

Ankit Kotharkar

Reputation: 31

Someone suggested treating this time-series data as video and therefore using Conv3D instead of Conv2D, and it works: the model no longer predicts the same output for every input. The input data should then be shaped [10, 256, 256, 5], i.e. [years, image height, image width, channels/bands].
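A minimal sketch of what such a Conv3D model could look like (filter counts and kernel sizes here are illustrative assumptions, not the exact architecture used):

from tensorflow.keras import layers, models

# Input: [years, height, width, bands] = [10, 256, 256, 5]
model3d = models.Sequential([
    layers.Conv3D(32, (3, 3, 3), activation='relu',
                  input_shape=(10, 256, 256, 5)),
    layers.MaxPooling3D(pool_size=(1, 2, 2)),  # pool spatially, keep the time axis
    layers.Conv3D(64, (3, 3, 3), activation='relu'),
    layers.MaxPooling3D(pool_size=(1, 2, 2)),
    layers.GlobalAveragePooling3D(),
    layers.Dense(64, activation='relu'),
    layers.Dense(1)  # linear output for regression
])
model3d.compile(optimizer='adam', loss='mae')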

Upvotes: 2
