Reputation: 13
I am trying to build a convolutional neural network with an output matrix. The input shape is (100,100,4) and the output shape is (2,125).
Here is the summary of my current model:
Layer (type)                   Output Shape            Param #
input_63 (InputLayer)          (None, 100, 100, 4)     0
conv2d_44 (Conv2D)             (None, 100, 100, 25)    2525
max_pooling2d_38 (MaxPooling   (None, 50, 50, 25)      0
flatten_38 (Flatten)           (None, 62500)           0
dense_47 (Dense)               (None, 10)              625010
dense_48 (Dense)               (None, 250)             2750
reshape_63 (Reshape)           (None, 2, 125)          0
Total params: 630,285
Trainable params: 630,285
Non-trainable params: 0
I thought this should be fine, but when I tried to fit the model I got this error:
ValueError: Error when checking target: expected reshape_62 to have shape (2, 1) but got array with shape (2, 125)
Here's the code I used:
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Reshape

# xs and ys are presumably the shapes of the input and target arrays
batch_size = 100
input_layer = Input(shape=(xs[1], xs[2], xs[3]))
conv1 = Conv2D(filters=25, kernel_size=5, padding="same", activation="relu", data_format='channels_last')(input_layer)
pool1 = MaxPooling2D(pool_size=(2, 2), padding="same")(conv1)
flat = Flatten()(pool1)
hidden1 = Dense(10, activation='relu')(flat)
output_layer = Dense(ys[1] * ys[2], activation='softmax')(hidden1)
output_reshape = Reshape((2, 125))(output_layer)
model = Model(inputs=input_layer, outputs=output_reshape)
model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', sample_weight_mode='temporal')
# The start of the fit call was lost in the post; x_train is an assumed name for the input array
model.fit(x_train, y_train, batch_size=batch_size, epochs=3)
I have been looking up how the Reshape layer works, but I still couldn't figure it out. Any help would be most appreciated.
Upvotes: 1
Views: 390
Reputation: 86650
This is happening because you are using 'sparse_categorical_crossentropy'.
"Sparse" means that the loss does not expect an entire one-hot array, just the coordinate of the hot spot. Instead of a regular (None, 2, 125) tensor, it expects just (None, 2, 1), indicating which of the 125 classes is the correct one.
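To make the shape difference concrete, here is a small NumPy-only sketch (no Keras involved). The array names and the batch size of 3 are made up for illustration. It builds the same labels in both encodings and checks that the cross-entropy computed from either one agrees.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_rows, n_classes = 3, 2, 125  # batch size of 3 is arbitrary

# Sparse encoding: one class index per row, shape (3, 2, 1)
sparse_targets = rng.integers(0, n_classes, size=(n_samples, n_rows, 1))

# Dense (one-hot) encoding of the same labels, shape (3, 2, 125)
dense_targets = np.eye(n_classes)[sparse_targets[..., 0]]

# Fake model output: a probability distribution over the 125 classes per row
logits = rng.normal(size=(n_samples, n_rows, n_classes))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)

# Cross-entropy from the one-hot targets
dense_loss = -(dense_targets * np.log(probs)).sum(axis=-1)

# Cross-entropy from the index targets: pick the probability of the true class
picked = np.take_along_axis(probs, sparse_targets, axis=-1)[..., 0]
sparse_loss = -np.log(picked)

print(dense_targets.shape, sparse_targets.shape)
print(np.allclose(dense_loss, sparse_loss))
```

Both encodings carry the same information; the sparse one just stores a single index where the dense one stores 125 numbers.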
To fix this, either start using a sparse y_train, or replace your loss with 'categorical_crossentropy'.
I believe a sparse y_train can be obtained with sparse_y_train = numpy.argmax(y_train, axis=-1). If this model does not give you memory problems, you don't need to go sparse.
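A minimal sketch of that conversion, using a made-up one-hot y_train since the real data is not shown. The helper name to_sparse_targets is hypothetical. Note that argmax alone returns shape (N, 2); the error message in the question asks for (2, 1) per sample, so the sketch also appends a trailing axis, which is an assumption about what this Keras version accepts.

```python
import numpy as np

def to_sparse_targets(y_onehot):
    """Convert one-hot targets (N, rows, classes) to class indices (N, rows, 1)."""
    indices = np.argmax(y_onehot, axis=-1)   # shape (N, rows)
    return np.expand_dims(indices, axis=-1)  # shape (N, rows, 1); trailing axis is an assumption

# Stand-in for the real y_train: 4 samples, 2 rows, 125 classes
labels = np.array([[0, 124], [7, 7], [56, 3], [99, 100]])
y_train = np.eye(125)[labels]                # shape (4, 2, 125)

sparse_y_train = to_sparse_targets(y_train)
print(sparse_y_train.shape)
```

This only makes sense if each row of y_train really is one-hot. If the rows hold soft probabilities, argmax discards information and 'categorical_crossentropy' is the better fit.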
Upvotes: 1