Sourodip Kundu

Reputation: 89

3D CNN model throws a negative dimension error (dimension issue)

I am creating a 3D CNN model with Height = 128, Width = 128, Channels = 3. The code for the 3D CNN:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def get_model(width=128, height=128, depth=3):
  """
  Build a 3D convolutional neural network
  """
  inputs = tf.keras.Input((width, height, depth, 1))

  x = layers.Conv3D(filters=64, kernel_size=3, activation="relu")(inputs)
  x = layers.MaxPool3D(pool_size=2)(x)
  x = layers.BatchNormalization()(x)

  x = layers.Conv3D(filters=128, kernel_size=3, activation="relu")(x)
  x = layers.MaxPool3D(pool_size=2)(x)
  x = layers.BatchNormalization()(x)

  x = layers.Conv3D(filters=256, kernel_size=3, activation="relu")(x)
  x = layers.MaxPool3D(pool_size=2)(x)
  x = layers.BatchNormalization()(x)

  x = layers.GlobalAveragePooling3D()(x)
  x = layers.Dense(units=512, activation="relu")(x)

  x = layers.Dropout(0.3)(x)

  outputs = layers.Dense(units=4, activation='softmax')(x)

  model= keras.Model(inputs, outputs, name="3DCNN")
  return model

After defining the model function, building the model throws a ValueError:

ValueError: Negative dimension size caused by subtracting 2 from 1 for '{{node max_pooling3d_5/MaxPool3D}} = MaxPool3D[T=DT_FLOAT, data_format="NDHWC", ksize=[1, 2, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 2, 1]](Placeholder)' with input shapes: [?,126,126,1,64].

Code for building the model:

model = get_model(width=128, height=128, depth=3)
model.summary()

Full error:

   InvalidArgumentError                      Traceback (most recent call last)
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs, op_def)
       1879   try:
    -> 1880     c_op = pywrap_tf_session.TF_FinishOperation(op_desc)
       1881   except errors.InvalidArgumentError as e:
    
    InvalidArgumentError: Negative dimension size caused by subtracting 2 from 1 for '{{node max_pooling3d_5/MaxPool3D}} = MaxPool3D[T=DT_FLOAT, data_format="NDHWC", ksize=[1, 2, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 2, 1]](Placeholder)' with input shapes: [?,126,126,1,64].
    
    During handling of the above exception, another exception occurred:
    
    ValueError                                Traceback (most recent call last)
    14 frames
    /usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in _create_c_op(graph, node_def, inputs, control_inputs, op_def)
       1881   except errors.InvalidArgumentError as e:
       1882     # Convert to ValueError for backwards compatibility.
    -> 1883     raise ValueError(str(e))
       1884 
       1885   return c_op
    
    ValueError: Negative dimension size caused by subtracting 2 from 1 for '{{node max_pooling3d_5/MaxPool3D}} = MaxPool3D[T=DT_FLOAT, data_format="NDHWC", ksize=[1, 2, 2, 2, 1], padding="VALID", strides=[1, 2, 2, 2, 1]](Placeholder)' with input shapes: [?,126,126,1,64].

What does this error mean? Is something wrong with my dimensions?

Thanks in advance!

Upvotes: 0

Views: 155

Answers (1)

Kaveh

Reputation: 4960

Unless you specify the data_format argument, a Conv3D layer interprets the input shape as:

batch_shape + (conv_dim1, conv_dim2, conv_dim3, channels)

Which you have specified as:

batch_shape + (width=128, height=128, depth=3, channels=1)

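So your data has a spatial shape of (128,128,3) and 1 channel.

The convolution is applied over the first 3 dimensions, which are (128,128,3). With kernel_size=3 and the default "valid" padding, each of those dimensions shrinks by 2, so after the first convolution the 3rd dimension (the one you specified as depth=3) shrinks to 1 and the shape becomes (126,126,1). The next layer (MaxPool3D with pool_size=2) then cannot pool over a dimension of size 1, which is exactly the "subtracting 2 from 1" in the error message. So either make the depth dimension larger or change the kernel_size parameter. For example, the input shape could be (128,128,128,1), or the kernel_size could be something like (3,3,1). Note that with kernel_size=(3,3,1) alone the depth of 3 still drops to 1 at the first pooling layer and the second pooling layer fails in the same way, so the pooling along that axis has to change too, for example pool_size=(2,2,1).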
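The shape arithmetic above can be checked without TensorFlow. The following is a minimal sketch in plain Python, assuming "valid" padding, a convolution stride of 1, and a pooling stride equal to the pool size (the Keras defaults used in the question). The helper names valid_size and trace_shapes are made up for this illustration and are not part of any library.

```python
def valid_size(size, window, stride):
    """Output length along one axis with "valid" padding."""
    return (size - window) // stride + 1


def trace_shapes(input_dims, kernel, pool, blocks=3):
    """Follow the three spatial axes through `blocks` rounds of
    Conv3D (stride 1) followed by MaxPool3D (stride == pool size).

    Returns a list of (layer_name, dims) pairs and raises ValueError
    as soon as a window is larger than the axis it is applied to.
    """
    dims = tuple(input_dims)
    history = [("input", dims)]
    for i in range(1, blocks + 1):
        steps = (
            (f"conv{i}", kernel, (1, 1, 1)),
            (f"pool{i}", pool, pool),
        )
        for name, window, stride in steps:
            for d, w in zip(dims, window):
                if w > d:
                    raise ValueError(
                        f"{name}: window {w} does not fit axis of size {d} "
                        f"(shape before this layer: {dims})"
                    )
            dims = tuple(
                valid_size(d, w, s) for d, w, s in zip(dims, window, stride)
            )
            history.append((name, dims))
    return history


if __name__ == "__main__":
    # Setup from the question: depth=3 collapses to 1 after the first conv.
    try:
        trace_shapes((128, 128, 3), kernel=(3, 3, 3), pool=(2, 2, 2))
    except ValueError as err:
        print("original setup fails:", err)

    # Fix 1: a larger depth dimension.
    print(trace_shapes((128, 128, 128), kernel=(3, 3, 3), pool=(2, 2, 2))[-1])

    # Fix 2: leave the depth axis alone in both the conv and the pooling.
    print(trace_shapes((128, 128, 3), kernel=(3, 3, 1), pool=(2, 2, 1))[-1])
```

In the failing case, the shape reported just before pool1 is (126, 126, 1), which matches the [?,126,126,1,64] input shape in the error message.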

P.S.: If you have an RGB image, the number of channels is 3 and the last dimension should be set to 3. 3D images have an additional dimension, depth, which is different from the channel dimension. So:

  • 3D Image RGB: (width, height, depth, 3)
  • 3D Image Grayscale: (width, height, depth, 1)
  • 2D Image RGB: (width, height, 3)
  • 2D Image Grayscale: (width, height, 1)
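If the data in the question are ordinary 128x128 RGB images (the question mentions "Channels = 3"), the "2D Image RGB" case from the list applies. The following is a rough sketch of what the same architecture might look like with 2D layers, assuming TensorFlow 2.x is installed. The function name get_model_2d and the model name are invented for this example, and this is only one possible way to restructure the network.

```python
import tensorflow as tf
from tensorflow.keras import layers


def get_model_2d(width=128, height=128, channels=3, num_classes=4):
    """Hypothetical 2D counterpart of the question's model for RGB images."""
    # The last axis holds the 3 colour channels, so no Conv3D is needed.
    inputs = tf.keras.Input(shape=(width, height, channels))

    x = inputs
    for filters in (64, 128, 256):
        x = layers.Conv2D(filters=filters, kernel_size=3, activation="relu")(x)
        x = layers.MaxPool2D(pool_size=2)(x)
        x = layers.BatchNormalization()(x)

    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(units=512, activation="relu")(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(units=num_classes, activation="softmax")(x)

    return tf.keras.Model(inputs, outputs, name="rgb_2d_cnn")


if __name__ == "__main__":
    model = get_model_2d()
    model.summary()
```

Here the convolutions only slide over width and height, so the channel axis of size 3 is never shrunk by a kernel or a pooling window.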

Upvotes: 1
